Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: LIKE operator in JSON_MATCH #7842

Closed
Eywek opened this issue Nov 30, 2021 · 3 comments
Closed

feature request: LIKE operator in JSON_MATCH #7842

Eywek opened this issue Nov 30, 2021 · 3 comments

Comments

@Eywek
Copy link

Eywek commented Nov 30, 2021

Hello,

I was wondering if it was planned to add the LIKE operator to JSON_MATCH ? I’m currently using

REGEXP_LIKE(JSONEXTRACTSCALAR("labels", '$.demande_intention', 'STRING'), 'terminal')

but it’s very slow (even with small number of scanned documents (21). And I was thinking that maybe having it directly with JSON_MATCH could speed-up this operation?

JSON_MATCH("labels", 'demande_intention LIKE ''terminal''')

edit: the labels column already have a JSON index

Thank you

cc @atris

@richardstartin
Copy link
Member

Hi @Eywek how slow is "very slow"?

Is there a natural way for you to filter the JSON documents before applying the JSONEXTRACTSCALAR and REGEXP_LIKE functions? Without an index structured to support these functions, they are naturally quite expensive.

Have you considered extracting $.demande_intention into a text column during ingestion so you can create a text index on it, instead of storing it embedded in JSON?

@Eywek
Copy link
Author

Eywek commented Nov 30, 2021

how slow is "very slow"?

~600-800ms for 21 documents (vs 22-40ms without this filter)

Is there a natural way for you to filter the JSON documents before applying the JSONEXTRACTSCALAR and REGEXP_LIKE functions? Without an index structured to support these functions, they are naturally quite expensive.

I'm using a JSON index on the column and operations like JSON_MATCH(labels, 'demande_intention = ''foo''') are fast (30-40ms)

Have you considered extracting $.demande_intention into a text column during ingestion so you can create a text index on it, instead of storing it embedded in JSON?

Yep I know it would be the best way to do it but since the labels column contains dynamic data, we don't want to add each field of the JSON object to the schema

@Jackie-Jiang
Copy link
Contributor

Supported added in #12568

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants