Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

floor and ceiling for Sorted Arrays #804

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ali-ghanbari
Copy link

If I am not mistaking, JDK standard library and the current implementation of Apache Commons Lang do not have functions for binary searching for floor and ceiling in sorted arrays.

This PR adds efficient implementation for floor and ceiling inside the class ArrayUtils (Java's Arrays class has binarySearch methods, so I thought this class would be a good place for these methods).
The methods handle various Java types.
A few test cases are also included.

ali-ghanbari added 3 commits September 14, 2021 23:24
@garydgregory
Copy link
Member

garydgregory commented Sep 15, 2021

I have no idea what this new code does as there are ZERO Javadocs, so there is nothing to review from my POV. There is also a lot of code duplication but I am not sure if lambdas and primitives would mix in an effective manner here.

@ali-ghanbari
Copy link
Author

I have no idea what this new code does as there are ZERO Javadocs, so there is nothing to review from my POV.

Thank you for your feedback @garydgregory :)

I will add Javadoc asap.

This PR adds two methods floor (which finds the index of the largest element in the array that is smaller than or equal to the given value) and ceiling (which finds the index of the smallest element in the array that is larger than or equal to the given value). These are sorted-array counterparts of floor and ceiling functions in Java's TreeSet and TreeMap.

There is also a lot of code duplication but I am not sure if lambdas and primitives would mix is an effective manner here.

At the time I was writing the code, I could not find a better way to write overloads of the functions for different primitive Java types. Any suggestions is greatly appreciated.

@garydgregory
Copy link
Member

What happens if the array is not sorted? Is it possible for the code to cause an infinite loop?

@ali-ghanbari
Copy link
Author

ali-ghanbari commented Sep 15, 2021

Hi @garydgregory ,

I have added Javadoc and refined a code a little bit to make it similar to the rest of the codebase (e.g., I avoided using return -1 and instead return the INDEX_NOT_FOUND).

What happens if the array is not sorted? Is it possible for the code to cause an infinite loop?

This algorithm is based on binary search, so we assume the array will be sorted (at least within the specified range) prior to passing it to the method.
As I have mentioned in the Javadoc, the behavior of the method is undefined when you pass an unsorted array to the method.

Thanks

@coveralls
Copy link

coveralls commented Sep 16, 2021

Coverage Status

Coverage decreased (-1.5%) to 93.501% when pulling 20b240b on ali-ghanbari:adding-floor-and-ceiling into d268741 on apache:master.

@garydgregory
Copy link
Member

Hi @kinow, @chtompki, and all:

I am pondering on this PR again: Since floor and ceiling are mathematical concepts, I would think it belongs in Commons Math, which already has functions to implement these concepts: https://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/analysis/UnivariateFunction.html

I do not know how easy it is to couple the above CM functions with arrays, but it's a question for the user's mailing list ;-)

Thoughts?

@ali-ghanbari
Copy link
Author

ali-ghanbari commented Nov 19, 2021

[...] Since floor and ceiling are mathematical concepts, I would think it belongs in Commons Math [...]

Thanks, @garydgregory , for reviewing my PR.
While waiting for @kinow and @chtompki , and others, I just wanted to add that floor and ceiling are generic array (binary) search methods, and in my opinion they should be come with normal binary search.
Yes, they might be seen as multivariate functions, if we had variants only for searching numerical arrays, but floor and ceiling can be used for searching any array as long as a comparator is defined for their elements.

Many thanks

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again @ali-ghanbari 👋

Code looks good, docs and tests well written too. Thanks! 👍

@garydgregory,

I am pondering on this PR again: Since floor and ceiling are mathematical concepts, I would think it belongs in Commons Math, which already has functions to implement these concepts: https://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/analysis/UnivariateFunction.html

Good point on whether it belongs to Lang or to another component. I haven't been following commons-math, but I thought it was also being split into smaller modules. Maybe commons-math or another of those math related modules would make more sense.

Or maybe commons-collections? The JDK has a NavigableSet 1 which has methods for floor and ceiling for a sorted set. Commons Collections has the AbstractNavigableSetDecorator class that also implements those methods 2.

I think if I were to search for these functions/methods, I would probably look for them in the collections component, and if not found then the lang or math-related components, in this order.

Bruno

Footnotes

  1. https://docs.oracle.com/javase/7/docs/api/java/util/NavigableSet.html

  2. https://github.com/apache/commons-collections/blob/0b365e4c1833bf55648ca34b9dffd966c31cc69b/src/main/java/org/apache/commons/collections4/set/AbstractNavigableSetDecorator.java#L70-L78

@garydgregory
Copy link
Member

Hi @kinow

I'd prefer to avoid feature creep in Lang in the form of a "non-java.lang" domain coming in which would open the door to request for perhaps other "mathy" requests: "Well, if floor and ceiling are in there, why not foo and bar?" Somewhat in the same vein as we are looking at Commons Text for text-based functionality instead of Lang. IOW, focus.

Since we are talking about arrays here, Commons Collections does not feel like a good fit.

@ali-ghanbari
Copy link
Author

Hi @kinow ! Thanks for your constructive comments and thanks for reviewing the code :)

@ali-ghanbari
Copy link
Author

I understand @garydgregory 's concern about keeping ArrayUtils's cohesion high.
So, I am suggesting the following. Please kindly let me know what you think.
I am planning to create a new class named SortedArrays which might have other methods in it in the future.
For the time being, I want to add an enum (as a subclass), named BinarySearchStrategy, with three elements ANY, FIRST, and LAST.
BinarySearchStrategy implements an interface with many variants of the methods binarySearch, floor, and ceiling.
JDK's Arrays.binarySearch return's any index matching a search key, but with FIRST and LAST we extend this functionality to return first/last index matching a search key (these two strategies make more sense in a sorted array with duplicates).
We can have similar strategies for floor and ceiling.
I think this will make addition of these new features more manageable.

I also suggest adding this class to: Commons Lang or Commons Collections for we are essentially extending java.util.Arrays.

As soon as we decide on the right module for this, I will raise my PR :) Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants