Is there a way to get sampled results over a geographical search? #90

Open
CyprienGottstein opened this issue May 20, 2019 · 3 comments

@CyprienGottstein

Hello,

As said in the title, I would like to know whether it is possible to get sampled results from a geographical search (a circle, in my case).

When I perform a geographical search, there are cases where I get 500k to potentially millions of objects. I believe I don't need all of them for my purposes; a sample should do the trick.

For this to work, the sample needs to be distributed (uniformly in an ideal world, semi-uniformly in a realistic one) over the area covered by the circle search.

Is this possible? I didn't dig too deep into the backpressure code and such before coming here. Since it looks a bit complex, I'd like to know if it's even possible before foolishly losing time :)

Thank you very much for your work, very nice job.

Best regards.

@davidmoten
Owner

Interesting question. The R-tree is an index based on rectangles. You can infer the tree structure by looking at the R-tree split diagram in README.md. Every non-leaf node has up to maxChildren children. To sample 1000 from a non-leaf node you might concatenate samples of 1000/numChildren from each child, obtained recursively. If you want to play with the effectiveness of this, use https://github.com/davidmoten/rtree2, which is a simplification of rtree (without a reactive API).
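The recursive idea above could be sketched roughly as follows. This is only an illustration on hypothetical stand-in types (SampleTreeSketch and SketchNode are invented here and are not rtree or rtree2 classes). It assumes entries are plain {x, y} points, and it differs slightly from a strict 1000/numChildren split: when a child returns fewer entries than its share, the shortfall is passed on to the children that follow. It does not filter by the search circle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in types for illustration; these are NOT the rtree classes.
final class SampleTreeSketch {

    /** A node is either a leaf holding entries or a non-leaf holding child nodes. */
    static final class SketchNode {
        final List<SketchNode> children = new ArrayList<>(); // empty for a leaf
        final List<double[]> entries = new ArrayList<>();    // {x, y} points, empty for a non-leaf

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Returns at most n entries from the subtree, spreading the quota across the children. */
    static List<double[]> sample(SketchNode node, int n, Random random) {
        List<double[]> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        if (node.isLeaf()) {
            // pick up to n entries of the leaf at random
            List<double[]> copy = new ArrayList<>(node.entries);
            Collections.shuffle(copy, random);
            result.addAll(copy.subList(0, Math.min(n, copy.size())));
            return result;
        }
        int remaining = n;
        int childrenLeft = node.children.size();
        for (SketchNode child : node.children) {
            // ceiling division: share what is still needed among the children not yet visited
            int quota = (remaining + childrenLeft - 1) / childrenLeft;
            List<double[]> fromChild = sample(child, quota, random);
            result.addAll(fromChild);
            remaining -= fromChild.size();
            childrenLeft--;
            if (remaining <= 0) {
                break;
            }
        }
        return result;
    }
}
```

Because sibling subtrees cover different rectangles, taking a similar number of entries from each child tends to spread the sample over space, although dense and sparse regions end up equally represented rather than proportionally.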

@davidmoten
Owner

Note that you can traverse the internal structure via the RTree.root() method, which returns a Node. Every Node is either a NonLeaf or a Leaf, and every node has an mbr() (minimum bounding rectangle). So you can run your own experiments using only the public API.
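When walking the tree by hand for a circle search, one natural step is to skip any subtree whose minimum bounding rectangle cannot touch the circle. A minimal sketch of that test is below. It assumes planar coordinates and takes the rectangle corners as plain doubles; MbrCircleSketch and its method are hypothetical helpers and not part of the rtree API.

```java
// Hypothetical helper for illustration; NOT part of the rtree API.
final class MbrCircleSketch {

    /**
     * True if the axis-aligned rectangle [x1, x2] x [y1, y2] shares at least one
     * point with the circle of the given centre and radius (planar coordinates).
     */
    static boolean intersects(double x1, double y1, double x2, double y2,
                              double cx, double cy, double radius) {
        // clamp the circle centre to the rectangle to find the rectangle's closest point
        double nearestX = Math.max(x1, Math.min(cx, x2));
        double nearestY = Math.max(y1, Math.min(cy, y2));
        double dx = cx - nearestX;
        double dy = cy - nearestY;
        // compare squared distances to avoid a square root
        return dx * dx + dy * dy <= radius * radius;
    }
}
```

A traversal would call this with the corner values of each node's bounding rectangle and descend only into children for which it returns true.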

@plokhotnyuk

@CyprienGottstein a couple of years ago we used this library for a big index (up to 1M) of geo circles. The main idea and formulas are described here.

P.S. I suppose you can use a spherical model of the Earth with the mean radius and the Haversine formula, which gives about ±0.3% accuracy in distance calculations compared with Vincenty’s formulae on an oblate spheroid model. But please double-check whether that is acceptable for you...
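For reference, the Haversine distance on a spherical Earth could be computed along these lines. This is a sketch, not code from the linked write-up: the class and method names are made up, and the mean radius value of 6371.0088 km is an assumption (the commonly quoted mean Earth radius).

```java
// Hypothetical helper for illustration; names and the radius constant are assumptions.
final class HaversineSketch {

    /** Assumed mean Earth radius in kilometres. */
    static final double MEAN_EARTH_RADIUS_KM = 6371.0088;

    /** Great-circle distance in kilometres between two points given in degrees. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);
        // a = sin^2(dPhi/2) + cos(phi1) * cos(phi2) * sin^2(dLambda/2)
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        // guard against rounding pushing sqrt(a) slightly above 1
        double c = 2 * Math.asin(Math.min(1.0, Math.sqrt(a)));
        return MEAN_EARTH_RADIUS_KM * c;
    }
}
```

A point could then be treated as inside a geo circle when distanceKm from the circle's centre is at most the circle's radius in kilometres.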
