Long coordinates #72

eklinger · 2017-11-21T07:56:23Z

Hi,
I would love to use the RTree as an interval tree but would need int or long coordinates rather than float. Is there any reason it doesn't support long type? Thanks!

ConradGroth · 2017-12-15T12:23:07Z

Hi David,

we are currently evaluating your RTree implementation and found the API very nice, especially the rx java return types and the immutability, which lets us query the tree under parallel writes. Unfortunately the Rectangle internally uses floats, but we need double precision. Is there a reason for using floats? If it’s because of the memory consumption, is it maybe possible to provide another Rectangle with double precision?

Hi Conrad
Thanks for your interest in the project. I'm glad the reactive and immutability aspects are proving attractive.

The choice of float over double was made to reduce memory consumption, yes. Can you describe your use case to me a little so that I can understand why you need double precision? I'm a little surprised that you might need it because you can use the float representation as a first cut if you like in terms of searching and then refine as you like but including the full precision values in the value of the Entry (as opposed to the geometry). Are you referring to the Entry geometry requiring double precision or a search rectangle (or both)?

By the way, I much prefer having this sort of conversation in an issue on the github site so it turns up on searches for other people and so others can participate in the discussion if they want. Of course if you want to retain some confidentiality in this matter that's ok too.

Cheers
Dave

Thanks for you quick response.

I cannot tell you all the details about our data model, but one axis of our rectangles is the time in milliseconds and float reduces that to 24 usable bits (for the mantissa), which means a maximum of 2 hours between the lowest and the highest value in milliseconds precision.

I already found the issue #72 which is exactly what we need. I will add some information there too, to let other interested users participate in the discussion.

Regards
Conrad

davidmoten · 2017-12-15T19:33:46Z

@eklinger @ConradGroth I'm happy to explore supporting other primitive types explicitly like long/double/integer. I'll have a look at the code base to see what's involved.

In the meantime, there are workarounds that may still give you what you need. When you add items to the r-tree you may not lose precision in a significant way. 2^23 bits still gives a grid of 8 million x 8 million to play with. Is that grid size insufficient for your use cases?

Just use the r-tree as a coarse search index and then refine with full resolution. Here's an example,

class Thing {
    long x1, y1, x2, y2;
    String content;
}
//build
RTree<Thing, Rectangle> tree = RTree.create();
tree = tree.add(
    new Thing(x1, y1, x2, y2, content), 
    Geometries.rectangle(floor(x1), floor(y1), ceil(x2), ceil(y2)));

//search
long xx1,xx2,yy1,yy2;
tree
    .search(Geometries.rectangle(floor(xx1), floor(yy1), ceil(xx2), ceil(yy2))) 
    .filter(entry -> gte(entry.value().x1, xx1) 
                    && lte(entry.value().x2, xx2) 
                    && gte(entry.value().y1, yy1) 
                    && lte(entry.value().y2, yy2))
  ...

I have not written these functions:

floor(x) returns largest float less than or equal to the long value x
ciel(x) returns smallest float greater than or equal to the long value x
gte(a, b) returns true if a is greater than or equal to b where a is float and b long
lte(a, b) returns true if a is less than or equal to b where a is float and b long

Most likely, BigDecimal would be used to implement these functions.

davidmoten · 2017-12-15T20:28:56Z

Here are the functions:

    private static final MathContext FLOOR = new MathContext(7, RoundingMode.FLOOR);
    private static final MathContext CEILING = new MathContext(7, RoundingMode.CEILING);

    public static float floor(long x) {
        return new BigDecimal(x).round(FLOOR).floatValue();
    }

    public static float ceil(long x) {
        return new BigDecimal(x).round(CEILING).floatValue();
    }

    public static boolean gte(float a, long b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) >=0;
    }
    ...

davidmoten · 2017-12-16T07:48:58Z

I cannot tell you all the details about our data model, but one axis of our rectangles is the time in milliseconds and float reduces that to 24 usable bits (for the mantissa), which means a maximum of 2 hours between the lowest and the highest value in milliseconds precision.

@ConradGroth just thought I'd mention that the full range of long values represents an enormous range of time. If you map a realistic sub-range of time (say 20 years?) on to float you may get sufficient precision especially with the technique above. A quick estimate tells me that you could end up with 10 seconds between neighbouring float values for a 20 year range.

Mapping your axes to more compact ranges is a good idea also because euclidean distance is used to split rectangles by the algorithm and you want the axes equally weighted in that distance measurement if possible.

davidmoten · 2017-12-20T23:05:42Z

I've just about completed double support for all geometries including flatbuffers serialization.

I'm not planning on supporting axes of other types because of the plethora of combinations which would involve an explosion of types (RectangleFloatDouble, RectangleDoubleFloat, RectangleLongFloat, RectangleFloatLong, Circle*,Line*, Point*, etc..). Let me know if double doesn't suit your needs.

There are breaking changes in the API but are reasonably small (all geometries return their coordinates as double now regardless of how they are stored (float or double)). When you want a Geometry instance that supports double precision just pass double parameters to the Geometries creation method:

// create a double precision rectangle
Rectangle r = Geometries.rectangle(1.0, 1.0, 2.0, 2.0);

// create a single precision circle
Rectangle r = Geometries.circle(1.0f, 1.0f, 5.0f);

An interesting side-effect is that performance has improved (on my i5 laptop at least) with the switch to using double for all calculations. I'll add some details on perf improvements soon.

I've released 0.8.1beta to Maven Central if you'd like to trial @eklinger @ConradGroth .

ConradGroth · 2018-01-05T13:05:08Z

Thanks for your answers. Indeed the internal usage of float is no problem, if you do an exact filtering after searching. Also the methods floor and ceil are not really necessary, as we can rely on the rounding which is done by the JVM during conversion from double/long to float, because the coordinates of the search rectangle are rounded the same way.

davidmoten added the question label Dec 15, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Long coordinates #72

Long coordinates #72

eklinger commented Nov 21, 2017

ConradGroth commented Dec 15, 2017 •

edited

Loading

davidmoten commented Dec 15, 2017 •

edited

Loading

davidmoten commented Dec 15, 2017

davidmoten commented Dec 16, 2017 •

edited

Loading

davidmoten commented Dec 20, 2017

ConradGroth commented Jan 5, 2018 •

edited

Loading

Long coordinates #72

Long coordinates #72

Comments

eklinger commented Nov 21, 2017

ConradGroth commented Dec 15, 2017 • edited Loading

davidmoten commented Dec 15, 2017 • edited Loading

davidmoten commented Dec 15, 2017

davidmoten commented Dec 16, 2017 • edited Loading

davidmoten commented Dec 20, 2017

ConradGroth commented Jan 5, 2018 • edited Loading

ConradGroth commented Dec 15, 2017 •

edited

Loading

davidmoten commented Dec 15, 2017 •

edited

Loading

davidmoten commented Dec 16, 2017 •

edited

Loading

ConradGroth commented Jan 5, 2018 •

edited

Loading