Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPS: Amélioration de l'autocomplete avec recherche plein texte #5365

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

tonial
Copy link
Contributor

@tonial tonial commented Jan 10, 2025

🤔 Pourquoi ?

https://www.notion.so/gip-inclusion/Am-liorations-de-l-autocomplete-Ajout-d-un-b-n-ficiaire-14b5f321b6048096bc87ea7a6bd27c50?pvs=4

Alternative à #5363

Un peu plus rapide que l'autre version avec une gestion différente : il faut taper les noms en entier (Jean ne matche pas sur Jeanne) et donc connaître l'orthographe exacte.

Niveau performances :
Juste avec la recherche plein text -> ~1 seconde
Avec la colonne GeneratedField SearchVectorField, sans index -> 0.25s
Avec la colonne + index -> 0.04 s

Le 2nd commit utilise un dictionnaire custom (simple + unaccent) pour ne pas avoir à rendre unaccent immutable
performance identiques, donc c'est un peu à nous de choisir ce qu'on préfère :)

🍰 Comment ?

Décrivez en quelques mots la solution retenue et mise en oeuvre, les difficultés ou problèmes rencontrés. Attirez l'attention sur les décisions d'architecture ou de conception importantes.

🚨 À vérifier

  • Mettre à jour le CHANGELOG_breaking_changes.md ?

🏝️ Comment tester

Les instructions pour reproduire le problème, les profils de test, le parcours spécifique à utiliser, etc. Si vous disposez d'une recette jetable, mettre l'URL pour tester dans cette partie.

💻 Captures d'écran

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 261fa63 to e65eb52 Compare January 10, 2025 14:53
Comment on lines 12 to 13
migrations.RunSQL("ALTER FUNCTION unaccent(text) IMMUTABLE"),
migrations.RunSQL("ALTER FUNCTION unaccent(regdictionary, text) IMMUTABLE"),
Copy link
Contributor

@xavfernandez xavfernandez Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

En voyant ces 2 RunSQL, je me demande du coup si faire des:

CREATE TEXT SEARCH CONFIGURATION unaccent_simple (COPY=simple);
ALTER TEXT SEARCH CONFIGURATION unaccent_simple ALTER MAPPING FOR hword, hword_part, word WITH unaccent,simple;

à la place ne serait pas plus propre ?

(et si l'index

            GinIndex(
                SearchVector("first_name", "last_name", config="unaccent_simple"),
                name="full_name_search_idx",
            ),

est bien créable sans unaccent immutable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

J'avais oublié de pusher avec le unaccent_simple.

J'ai donc 2 version dans la PR à présent (pour que tout le monde puisse tester facilement)

  • 1er commit : unaccent en immutable
  • 2e commit : dictionnaire unaccent_simple, sans passer unaccent en immutable
    (il faut penser à faire un ./manage.py migrate users 0019 à chaque fois qu'on change de commit)

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from e65eb52 to 249a180 Compare January 10, 2025 19:36
@tonial tonial self-assigned this Jan 10, 2025
@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch 2 times, most recently from 1668581 to 1bd5d3b Compare January 10, 2025 22:15
Comment on lines 14 to 25
migrations.AddField(
model_name="user",
name="full_name",
field=models.GeneratedField(
db_persist=True,
expression=django.contrib.postgres.search.SearchVector(
"first_name", "last_name", config="simple_unaccent"
),
output_field=django.contrib.postgres.search.SearchVectorField(),
verbose_name="nom complet utilisé pour rechercher un utilisateur",
),
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cette migration prend 15s à exécuter, attention au moment où on la lance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Je la lancerais un soir tard, ou un matin tôt

Copy link
Contributor

@francoisfreitag francoisfreitag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Les perfs sont top ! :)

francoisfreitag

This comment was marked as outdated.

itou/users/models.py Outdated Show resolved Hide resolved
@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 1bd5d3b to b153447 Compare January 13, 2025 12:15
@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from b153447 to 124e8d0 Compare January 13, 2025 12:15
Copy link
Contributor

@xavfernandez xavfernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beaucoup de snapshots modifiés sans raison et j'imagine que c'est ce qui déplaît aux tests :)

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 124e8d0 to 0d6e1d5 Compare January 13, 2025 12:37
@tonial
Copy link
Contributor Author

tonial commented Jan 13, 2025

J'ai fait l'erreur de mettre à jour le nom du champ avec un replace dans vscode qui en a profiter pour sournoisement reformater les fichiers en retirant les espaces les lignes avec que des espaces...

@tonial tonial requested a review from xavfernandez January 13, 2025 14:00
Comment on lines 9 to 10
("auth", "0012_alter_user_first_name_max_length"),
("cities", "0001_initial"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ces deux dépendances me semblent étonnantes ?

Copy link
Contributor Author

@tonial tonial Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Je pensais que c'était parce que cities_0001 contenait BtreeGinExtension(), mais en fait je ne le vois nulle part dans notre code, donc je me demande comment on a activé les indexes GIN 🤔
Est-ce qu'on ne devrait pas ajouter BtreeGinExtension() dans cette migration pour être safe ?

En tout cas J'ai retiré les dépendances peu utiles (on verra si les tests plantent)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Je pense que notre utilisation reste dans le natif des index Gin et ne nécessite pas d'extension :)

@@ -231,7 +238,8 @@ class Meta(AbstractUser.Meta):
models.Index(
OpClass(Upper("email"), name="text_pattern_ops"),
name="users_user_email_upper",
)
),
GinIndex("full_name_search_vector", name="full_name_search_idx"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GinIndex("full_name_search_vector", name="full_name_search_idx"),
GinIndex("full_name_search_vector", name="users_user_full_name_search_idx"),

pour coller aux noms des autres index

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

La longueur max c'est 30 caractères, j'ai pu ajouter users_user devant, mais pour cela j'ai retiré search (et vector n'y est pas non plus)

ajax:{
delay: 250 // wait 250 milliseconds before triggering the request
}
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ajax:{
delay: 250 // wait 250 milliseconds before triggering the request
}
},
},
ajax:{
delay: 250 // wait 250 milliseconds before triggering the request
}

Le ajax doit être au même niveau que la clef language je pense 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢 j'étais pourtant sur d'être au bon niveau d'indentation

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from dbf9c56 to 4aca0fd Compare January 13, 2025 22:42
@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 4aca0fd to 3b80f59 Compare January 14, 2025 08:18
@@ -19,8 +19,11 @@ htmx.onLoad((target) => {
Enregistrer un nouveau bénéficiaire
</a>
</div>
`),
`),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`),
`),

Copy link
Contributor Author

@tonial tonial Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

J'ai aligné sur noResults pourtant

Est-ce que je ne devrais pas plutot décaler tout le

d'un cran vers la droite ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oui, effectivement.
Idéalement, on ne mixerait pas des indentations de 4 et de 2...

},
ajax: {
delay: 250 // wait 250 milliseconds before triggering the request
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
},

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 3b80f59 to de70cc6 Compare January 14, 2025 08:40
Copy link
Contributor

@francoisfreitag francoisfreitag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from de70cc6 to 9f8ff9b Compare January 14, 2025 09:04
Use psql full-text with a generated search vector field and an index on
it
@tonial tonial force-pushed the alaurent/gps_autocomplete_full_text_search branch from 9f8ff9b to d51194a Compare January 14, 2025 14:06
@tonial tonial added this pull request to the merge queue Jan 14, 2025
Merged via the queue into master with commit f98ff88 Jan 14, 2025
9 checks passed
@tonial tonial deleted the alaurent/gps_autocomplete_full_text_search branch January 14, 2025 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants