Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mysql rewrite query not applying use index in subquery (MySQL 5.7) #704

Closed
shubham26sep opened this issue Sep 1, 2020 · 8 comments
Closed

Comments

@shubham26sep
Copy link

Summary: use_index not applying on subquery

Hi there.
I am using django-mysql==3.2.0 and trying to implement a query (which contains a LEFT OUTER JOIN with group by subquery ). I need to apply use_index in both outer and inner subquery as well.
After applying django-mysql use_index method on django queryset, following sql_query is formed:

SELECT a.id
FROM a
INNER JOIN b
ON ( a.b_id = b.id )
LEFT OUTER JOIN (SELECT b.name AS summarize_lookup_id,
Count(a.id) AS cnt
FROM a
INNER JOIN b
ON ( a.b_id = b.id )
WHERE ( a.is_tombstoned = 0
AND a.client_id = 277
AND a.name = 'Open A'
AND b.is_active = 1
AND ( /QueryRewrite':index=A USE A_index/
1 )
AND NOT ( b.name IS NULL ) )
GROUP BY b.name
ORDER BY NULL) FirstSummarizeAggr
ON ( b.name = FirstSummarizeAggr.summarize_lookup_id )
LEFT OUTER JOIN task
ON ( a.task_id = task.id )
WHERE ( a.is_tombstoned = 0
AND a.client_id = 277
AND a.name = 'Open A'
AND b.is_active = 1
AND And (/QueryRewrite':index=A USE A_index/1) )
ORDER BY task.name ASC,
a.company ASC
LIMIT 10;

In above query, QueryRewrite':index=A USE A_index, this is added in both group by subquery and outer query, but during execution mysql added USE INDEX (A_index) two times in outer query (given below) instead of adding one USE INDEX in group by query. How can this be solved. Please let me know asap.

SELECT
A.id
FROM
A USE INDEX (A_index) USE INDEX (A_index)
INNER JOIN
B
ON ( A.B_id = B.id )
LEFT OUTER JOIN
(
SELECT
B.name AS summarize_lookup_id,
COUNT(A.id) AS cnt
FROM
A
INNER JOIN
B
ON ( A.B_id = B.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
AND NOT (B.name IS NULL)
)
GROUP BY
B.name
ORDER BY
NULL
)
FirstSummarizeAggr
ON ( B.name = FirstSummarizeAggr.summarize_lookup_id )
LEFT OUTER JOIN
Task
ON ( A.task_id = Task.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
)
ORDER BY
Task.name ASC,
A.company ASC LIMIT 10;

@adamchainz
Copy link
Owner

You have comments coming through like /QueryRewrite':index=A USE A_index/. These are rewritten by database instrumentaiton.

Please check you have followed the complete installation guide at https://django-mysql.readthedocs.io/en/latest/installation.html , plus set DJANGO_MYSQL_REWRITE_QUERIES = True as per https://django-mysql.readthedocs.io/en/latest/queryset_extensions.html#query-hints .

If your issue persists please give more information and format your post properly.

Thanks,

Adam

@shubham26sep
Copy link
Author

shubham26sep commented Sep 1, 2020

Yes Adam, I have followed the complete installation guide given in https://django-mysql.readthedocs.io/en/latest/installation.html

and DJANGO_MYSQL_REWRITE_QUERIES = True is also set in settings.

Regarding this /QueryRewrite':index=A USE A_index/, this has been added through in django queryset through the below function use_index() in django-mysql package.

def use_index(self, *index_names, **kwargs):
kwargs["hint"] = "USE"
return self._index_hint(*index_names, **kwargs)

@requires_query_rewrite
def _index_hint(self, *index_names, **kwargs):
    hint = kwargs.pop("hint")
    table_name = kwargs.pop("table_name", None)
    for_ = kwargs.pop("for_", None)
    if kwargs:
        raise ValueError(
            "{}_index accepts only 'for_' and 'table_name' as keyword "
            "arguments".format(hint.lower())
        )
    if hint != "USE" and not len(index_names):
        raise ValueError(
            "{}_index requires at least one index name".format(hint.lower())
        )

    if table_name is None:
        table_name = self.model._meta.db_table

    if for_ in ("JOIN", "ORDER BY", "GROUP BY"):
        for_bit = "FOR {} ".format(for_)
    elif for_ is None:
        for_bit = ""
    else:
        raise ValueError(
            "for_ must be one of: None, 'JOIN', 'ORDER BY', " "'GROUP BY'"
        )

    if len(index_names) == 0:
        indexes = "NONE"
    else:
        indexes = "`" + "`,`".join(index_names) + "`"

    hint = (
        "/*QueryRewrite':index=`{table_name}` {hint} {for_bit}{indexes}*/1"
    ).format(table_name=table_name, hint=hint, for_bit=for_bit, indexes=indexes)
    return self.extra(where=[hint])

Please let me know if any other information is required.

@adamchainz
Copy link
Owner

Please check you're on the latest version.

For whatever reason the database instrumentation is not getting installed: https://github.com/adamchainz/django-mysql/blob/master/src/django_mysql/apps.py#L20

There was a bug until version 3.7.1: https://github.com/adamchainz/django-mysql/blob/master/HISTORY.rst#371-2020-06-24

Or you may have missed the step to include Django-MySQL in your INSTALLED_APPS

@shubham26sep
Copy link
Author

shubham26sep commented Sep 1, 2020

Adam, database instrumentation is getting applied and latest version is installed. And also, django_mysql is present in INSTALLED_APPS. There is no problem with configuration.

The only problem is mysql is not applying USE INDEX correctly. It is applying both USE INDEX in outer query..as given below.

SELECT
A.id
FROM
A USE INDEX (A_index) USE INDEX (A_index)
INNER JOIN
B
ON ( A.B_id = B.id )
LEFT OUTER JOIN
(
SELECT
B.name AS summarize_lookup_id,
COUNT(A.id) AS cnt
FROM
A
INNER JOIN
B
ON ( A.B_id = B.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
AND NOT (B.name IS NULL)
)
GROUP BY
B.name
ORDER BY
NULL
)
FirstSummarizeAggr
ON ( B.name = FirstSummarizeAggr.summarize_lookup_id )
LEFT OUTER JOIN
Task
ON ( A.task_id = Task.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
)
ORDER BY
Task.name ASC,
A.company ASC LIMIT 10;

Expected Output is:

SELECT
A.id
FROM
A USE INDEX (A_index)
INNER JOIN
B
ON ( A.B_id = B.id )
LEFT OUTER JOIN
(
SELECT
B.name AS summarize_lookup_id,
COUNT(A.id) AS cnt
FROM
A USE INDEX (A_index)
INNER JOIN
B
ON ( A.B_id = B.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
AND NOT (B.name IS NULL)
)
GROUP BY
B.name
ORDER BY
NULL
)
FirstSummarizeAggr
ON ( B.name = FirstSummarizeAggr.summarize_lookup_id )
LEFT OUTER JOIN
Task
ON ( A.task_id = Task.id )
WHERE
(
A.is_tombstoned = 0
AND A.client_id = 277
AND A.name = 'Open A'
AND B.is_active = 1
AND
(1)
)
ORDER BY
Task.name ASC,
A.company ASC LIMIT 10;

@adamchainz
Copy link
Owner

Aha, now I see. I missed the paragraph between SQL queries in your first post because it was not well formatted. I scrolled past it as part of the SQL query. Please use the code highlighting feature of markdown to make things clear: https://guides.github.com/features/mastering-markdown/#GitHub-flavored-markdown

SELECT
    "like this"

As for your issue - I think this is what PR #286 is intended to fix. It's a bit of a scary PR opened 4 years ago that mucks aronud more with the ORM internals, and I never had time to look into. It needs a rebase and more explanation, but unfortunately the author has fallen silent. If you think you can figure it out, I'm open to seeing a new version of it. As I wrote in this comment, knowing it works would be useful:

#286 (comment)

Another option is work around to rewrite your query. It's quite probable you can get a performant version of your subquery using prefetch_related() paired with Prefetch and the queryset representing the aggregation.

Otherwise I'm afraid it will have to be "one day" that I will get around to looking at PR #286, or you can hire me as a consultant to fix it.

Thanks,

Adam

@shubham26sep
Copy link
Author

Adam, can you please look into it and let me know what changes are needed in package to solve this issue.

@shubham26sep
Copy link
Author

shubham26sep commented Sep 2, 2020

Adam, I tried with above PR also, but same issue persists. USE INDEX is getting added in outer query.

And, how can I use prefetch_related() paired with Prefetch for group by subquery.

@adamchainz
Copy link
Owner

Adam, I tried with above PR also, but same issue persists. USE INDEX is getting added in outer query.

Thanks for letting me know. I don't have a solution then for now.

And, how can I use prefetch_related() paired with Prefetch for group by subquery.

Can you show me the queryset rather than the query? I'm not sure how you're joining a subquery to a table with the ORM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants