forked from facebookincubator/velox
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Handle unescaped UTF-8 characters in Presto url_extract_* UDFs (facebookincubator#11535)

Summary: Presto Java accepts UTF-8 characters that are not control or whitespace characters anywhere in a URL where a %-escaped character can appear. This change modifies Velox's URIParser to do the same. Velox's URIParser previously produced incorrect results when any non-ASCII character appeared anywhere in the URL; this has been fixed as well.

To facilitate this, I modified the tryGetCharLength helper function in UTF8Utils to take an int32_t reference, which it populates with the code point if the UTF-8 character is valid. The function was already calculating this value and throwing it away. Returning it avoids an additional call that would repeat those steps, and is consistent with the Airlift function on which it is based.

Differential Revision: D65927918
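The helper change described in the summary can be illustrated with a minimal sketch. This is not the code from the commit: `decodeUtf8Sketch` and `isAllowedUnescapedNonAscii` are hypothetical names, the `-1` error convention is an assumption, and the whitespace and control ranges are an illustrative approximation rather than the exact rule Presto or Velox applies. It only shows the idea of returning the character length while also handing back the code point through an `int32_t` reference, so the caller does not have to decode the bytes a second time.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not Velox's actual API). Decodes the UTF-8 sequence at
// the start of `input`. On success, returns its length in bytes (1 to 4) and
// stores the decoded value in `codePoint`. On failure, returns -1 and leaves
// `codePoint` untouched.
inline int32_t decodeUtf8Sketch(std::string_view input, int32_t& codePoint) {
  if (input.empty()) {
    return -1;
  }
  const auto b0 = static_cast<unsigned char>(input[0]);
  if (b0 < 0x80) {
    codePoint = b0;  // Plain ASCII byte.
    return 1;
  }

  int32_t length = 0;
  int32_t value = 0;
  int32_t minValue = 0;  // Smallest value that needs this many bytes.
  if ((b0 & 0xE0) == 0xC0) {
    length = 2; value = b0 & 0x1F; minValue = 0x80;
  } else if ((b0 & 0xF0) == 0xE0) {
    length = 3; value = b0 & 0x0F; minValue = 0x800;
  } else if ((b0 & 0xF8) == 0xF0) {
    length = 4; value = b0 & 0x07; minValue = 0x10000;
  } else {
    return -1;  // Stray continuation byte or invalid lead byte.
  }
  if (input.size() < static_cast<std::size_t>(length)) {
    return -1;  // Truncated sequence.
  }
  for (int32_t i = 1; i < length; ++i) {
    const auto b = static_cast<unsigned char>(input[i]);
    if ((b & 0xC0) != 0x80) {
      return -1;  // Not a continuation byte.
    }
    value = (value << 6) | (b & 0x3F);
  }
  // Reject overlong encodings, surrogates, and values beyond U+10FFFF.
  if (value < minValue || value > 0x10FFFF ||
      (value >= 0xD800 && value <= 0xDFFF)) {
    return -1;
  }
  codePoint = value;
  return length;
}

// Hypothetical predicate: whether a decoded non-ASCII code point may appear
// unescaped. The ranges below are an assumed approximation of "not a control
// or whitespace character", chosen for illustration only.
inline bool isAllowedUnescapedNonAscii(int32_t cp) {
  if (cp < 0x80) {
    return false;  // ASCII is left to the ordinary URI grammar.
  }
  if (cp <= 0x9F) {
    return false;  // C1 control characters.
  }
  switch (cp) {
    case 0xA0: case 0x1680: case 0x2028: case 0x2029:
    case 0x202F: case 0x205F: case 0x3000:
      return false;  // Assorted Unicode space and line separators.
    default:
      return !(cp >= 0x2000 && cp <= 0x200A);  // En quad through hair space.
  }
}
```

A parser built along these lines would call `decodeUtf8Sketch` once per non-ASCII lead byte, advance by the returned length, and pass the populated code point straight to the predicate.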
1 parent 4a0d2c5, commit 18ee41a
Showing 8 changed files with 153 additions and 51 deletions.