Performance optimizations #18
This looks good to me. I'll leave it up to @4kimov to merge this who has a deeper knowledge across our language implementations.
I can understand that it's a big PR and that you might have doubts about the merge. I've added comments to explain the changes. Given the performance gain, I think it's important to look into it.
    throw new InvalidArgumentException('Alphabet must contain unique characters');
}

$minLengthLimit = 255;
if (
    !is_int($minLength) ||
The type is already enforced by the parameter's type declaration, so the is_int() check is redundant.
$inRangeNumbers = array_filter($numbers, fn($n) => $n >= 0 && $n <= self::maxValue());
if (count($inRangeNumbers) != count($numbers)) {
    throw new InvalidArgumentException(
        'Encoding supports numbers between 0 and ' . self::maxValue(),
    );
Instead of creating a new array, the exception is thrown directly when there is an invalid number.
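A minimal sketch of that early-exit check, written as a standalone function for illustration. The name assertInRange and the $max parameter are hypothetical; the PR itself compares against self::maxValue().

```php
<?php
// Hypothetical standalone version of the range check. A plain foreach
// throws on the first invalid number and allocates no intermediate array.
function assertInRange(array $numbers, int $max = PHP_INT_MAX): void
{
    foreach ($numbers as $n) {
        if ($n < 0 || $n > $max) {
            throw new InvalidArgumentException(
                'Encoding supports numbers between 0 and ' . $max
            );
        }
    }
}
```

Compared with array_filter plus two count() calls, the loop stops at the first offending value, which matters most for long inputs with an early invalid number.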
$alphabetChars = str_split($this->alphabet);
foreach (str_split($id) as $c) {
    if (!in_array($c, $alphabetChars)) {
        return $ret;
    }
This split operation is replaced by a more efficient regex.
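The replacement pattern is not shown in this thread. As one possible shape only, the sketch below assumes a character class built from the alphabet with preg_quote(); the function name containsOnlyAlphabetChars is hypothetical, and the alphabet is assumed to be non-empty.

```php
<?php
// Hypothetical helper, not the PR's code: true when every byte of $id
// belongs to $alphabet. One preg_match() call replaces str_split() plus
// an in_array() lookup per character.
function containsOnlyAlphabetChars(string $id, string $alphabet): bool
{
    // preg_quote() escapes regex metacharacters (including "-", "]" and "^"),
    // so the alphabet can be placed inside a character class.
    $pattern = '/\A[' . preg_quote($alphabet, '/') . ']*\z/';

    return preg_match($pattern, $id) === 1;
}
```

The match is byte-based, the same as the str_split() version it stands in for.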
for ($i = 0, $j = strlen($alphabet) - 1; $j > 0; $i++, $j--) {
    $r = ($i * $j + ord($alphabet[$i]) + ord($alphabet[$j])) % strlen($alphabet);
    [$alphabet[$i], $alphabet[$r]] = [$alphabet[$r], $alphabet[$i]];
Manipulation of individual characters of the string, instead of using an array of chars.
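For reference, the loop above wrapped into a self-contained function. The name shuffleAlphabet is hypothetical, and the swap uses a temporary variable instead of the destructuring assignment in the diff; the index arithmetic is copied from the diff.

```php
<?php
// Deterministic shuffle done directly on the string: bytes are read and
// written through string offsets, so no array of chars is built.
function shuffleAlphabet(string $alphabet): string
{
    $length = strlen($alphabet);
    for ($i = 0, $j = $length - 1; $j > 0; $i++, $j--) {
        $r = ($i * $j + ord($alphabet[$i]) + ord($alphabet[$j])) % $length;

        // Swap the bytes at positions $i and $r.
        $tmp = $alphabet[$i];
        $alphabet[$i] = $alphabet[$r];
        $alphabet[$r] = $tmp;
    }

    return $alphabet;
}
```

This assumes a single-byte alphabet, since string offsets address bytes.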
$id = [];
$chars = str_split($alphabet);

$result = $num;
Reuse and modify the variable $num instead of creating a new one.
    array_unshift($id, $chars[$this->math->intval($this->math->mod($result, count($chars)))]);
    $result = $this->math->divide($result, count($chars));
} while ($this->math->greaterThan($result, 0));
    $id = $alphabet[$this->math->intval($this->math->mod($num, strlen($alphabet)))] . $id;
Prepending to the string gives the same result as using array_unshift on an array.
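A simplified sketch of the string-based conversion, assuming native integers instead of the $this->math abstraction used in the diff. The function name toId and the plain int arithmetic are assumptions made for illustration.

```php
<?php
// Converts $num to a string in the base given by the alphabet length.
// Each digit is prepended to $id, which mirrors array_unshift() on an
// array without needing a final implode().
function toId(int $num, string $alphabet): string
{
    $id = '';
    $base = strlen($alphabet);

    do {
        $id = $alphabet[$num % $base] . $id;
        $num = intdiv($num, $base);
    } while ($num > 0);

    return $id;
}
```

Here $num is modified in place, in line with the earlier comment about reusing the variable.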
Well, to say that these optimizations are welcome would be an understatement. Thank you for taking the time! A few basic questions as I look at this:
After multiple reviews, I don't see any breaking change in this PR. In #17, there is something negligible with the
Regexes operate on bytes; Unicode characters are split into bytes and not considered as a single character. But this was already the case with [...]. There is no other restriction on the characters accepted by the alphabet. The alphabet is not used as part of the regex, and if it were, I would have used [...].
use function to enable compiler optimization of some native functions.
preg_match('/(.).*\1/', $alphabet) to validate there is no duplicate char, faster than splitting into an array.
$string[$char] [...]
[...] (array_reduce, array_filter), they are slower than foreach.
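A small sketch of the duplicate check using the pattern quoted above. The wrapper name hasDuplicateChars is hypothetical.

```php
<?php
// True when any byte occurs more than once in $alphabet.
// (.) captures one byte, .* skips ahead, and \1 requires the same byte again.
function hasDuplicateChars(string $alphabet): bool
{
    return preg_match('/(.).*\1/', $alphabet) === 1;
}
```

Without the s modifier, "." does not match a newline, so this form assumes the alphabet contains no newline characters.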
Benchmark
PHPBench code
In phpbench.json
In tests/SqidsBench.php
Before
After