Abstract

A facility |$f$| is said to influence a user |$u$| if |$f$| is one of the |$k$| closest facilities of the user |$u$|⁠. This is because users usually prefer to visit/use nearby facilities. The influence set of a facility |$f$| is the set of users influenced by |$f$|⁠. The computation of the influence set has gained extensive research attention in the past decade. Note that the definition of influence set assumes that each user behaves the same, i.e. each user prefers the |$k$| closest facilities where |$k$| is a constant. However, in real-world scenarios, different users may prefer different values of |$k$| depending on, for example, the density of facilities around them, the mode of transport available to them etc. Therefore, assuming a constant value of |$k$| for each user may not be able to capture the essence of influence effectively. In this paper, we compute the influence of a facility |$f$| using the query logs that contain the |$k$|-nearest neighbors queries issued in the past and essentially represent the preferred value of |$k$| for each user. Specifically, a facility |$f$| has an impact on a user |$u$| if |$f$| is one of the facilities in the answer set of the query issued by the user. The set of such users is called impact set to avoid ambiguity with influence set. To the best of our knowledge, we are the first to study the problem of impact set computation using query logs. Although existing techniques can be extended to compute the impact set, these techniques suffer from several serious limitations. We carefully analyze these limitations and propose an algorithm to answer impact set queries for 2D location data. The proposed algorithm uses a novel access order and several non-trivial observations to address these limitations. We conduct an extensive experimental study on real and synthetic data sets and demonstrate that our algorithm significantly outperforms existing algorithms in terms of both CPU cost and I/O cost.

You do not currently have access to this article.