-
-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce more efficient implementations of str and join #545
Introduce more efficient implementations of str and join #545
Conversation
fa1632b
to
9b7a6cc
Compare
Could you please share benchmarks (if you still have your repl open, by any chance)? Just wonder how much performance it gives. |
I was mostly measuring the allocations improvements across multiple Metabase endpoints. This PR focuses mostly on allocations, the execution speed is not impacted as much. Here's one of the concrete queries: (crit/quick-bench
(format-dsl {:select [[[:min :recent_views.user_id] :user_id]
:model
:model_id
[[:max [:coalesce :d.view_count :t.view_count]] :cnt]
[:%max.timestamp :max_ts]]
:group-by [:model :model_id]
:where [:and
[:= :context "view"]
[:in :model #{"dashboard" "table"}]]
:order-by [[:max_ts :desc] [:model :desc]]
:limit 10
:left-join [[:report_dashboard :d]
[:and
[:= :model "dashboard"]
[:= :d.id :model_id]]
[:metabase_table :t]
[:and
[:= :model "table"]
[:= :t.id :model_id]]]}))
;; Before:
Evaluation count : 13122 in 6 samples of 2187 calls.
Execution time mean : 50.880818 µs
Execution time std-deviation : 11.514117 µs
Execution time lower quantile : 44.925380 µs ( 2.5%)
Execution time upper quantile : 70.466794 µs (97.5%)
Overhead used : 1.879403 ns
Allocations per call: 118,560b
---
;; After:
Evaluation count : 13188 in 6 samples of 2198 calls.
Execution time mean : 45.462514 µs
Execution time std-deviation : 53.006990 ns
Execution time lower quantile : 45.418067 µs ( 2.5%)
Execution time upper quantile : 45.542755 µs (97.5%)
Overhead used : 1.879403 ns
Allocations per call: 105,072b I've made further improvements locally, but I don't want to dump everything into one PR to accommodate for easier review. |
A couple of counterpoints:
|
|
I'll note that the caching is built into HoneySQL -- but it is opt-in and you have to add the I like the idea of speeding up
|
I understand what you mean, but I really really dislike "util" deps in other libraries, and I don't think it's a good idea to multiplicate those. Maybe, at some point, when the list of "improved core functions" is big enough, it would make sense to move those into a separate library and depend upon. Until then, there is a a big value in HoneySQL having no dependencies, and keeping it this way is valuable. Adding two utility functions is a small price to pay for that. I agree. I didn't know which to pick between the two; glad you have a strong stance on it. |
b3338f1
to
98a7fd2
Compare
Updated WRT to comment 2. by @seancorfield. Also added more |
98a7fd2
to
846123c
Compare
Thanks. That makes the diff a lot more readable as well as having a more intuitive arg order. |
Can you explain why this line is present? clj-kondo flags it as an error (unresolved symbol): {:tag String} |
Signed-off-by: Sean Corfield <sean@corfield.org>
I copied it from |
Given that most of the purpose of HoneySQL is to mash strings together so that the user doesn't have to, it's natural that string mashing is responsible for most of its running time and generated allocations. Clojure's base library functions are not the most efficient for this kind of work:
clojure.core/str
performs badly when provided more than 1 argument. It resorts to varargs and inefficient sequence iteration.clojure.string/join
also performs inefficient seq-based iteration of the provided collection.This PR proposes adding a new namespace that contains optimized implementations of those two functions. As a bonus, the new
join
also has a transducer-accepting arity. Since it is a common pattern to string/join something after some action is performed on every element, the transducer arity becomes useful in many situations.The new functions only have optimized bodies for
:clj
language selector and delegate to base functions everywhere else.In the
honey.sql
namespace,clojure.core/str
is replaced withhoney.sql.util/str
in the namespace form.join
is referred directly, so everywhere in the codestr/join
is replaced withjoin
and also modified to use transducer arity where necessary.Obviously, I'm open to restructure the solution or dial back some changes if this is too much in one go.