Using Cerebras Llama-3.1-70B LLM for Recursive Vector Disambiguation
Recursive Vector Disambiguation (RVD) is a technique for improving semantic vector search: an LLM such as Cerebras Llama-3.1-70B prioritizes the terms in the query, and similar vectors are then matched in the order of the prioritized terms.
The terms may come from the query itself or be generated by Cerebras Llama-3.1-70B to better capture the intent of the query.
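The term-prioritization step could be sketched roughly as follows. This is a minimal illustration, not the project's actual prompt or code: the prompt wording, the JSON reply format, and the names PROMPT_TEMPLATE, prioritize_terms, and complete are all assumptions. The LLM call is passed in as a plain callable (prompt in, text out), so any client for Cerebras Llama-3.1-70B can be plugged in.

```python
import json

# Hypothetical prompt; the real project may word this differently.
PROMPT_TEMPLATE = (
    "List the key terms that capture the intent of the search query below, "
    "most important first. You may add terms that are not in the query if "
    "they express its intent better. Reply with a JSON array of strings only.\n"
    "Query: {query}"
)


def prioritize_terms(query, complete):
    """Ask an LLM for prioritized terms.

    `complete` is any callable that takes a prompt string and returns the
    model's text reply (an assumed interface, not a specific SDK call).
    Falls back to the raw query if the reply is not a usable JSON list.
    """
    reply = complete(PROMPT_TEMPLATE.format(query=query))
    try:
        terms = json.loads(reply)
    except json.JSONDecodeError:
        return [query]
    if not isinstance(terms, list):
        return [query]
    cleaned = [t.strip() for t in terms if isinstance(t, str) and t.strip()]
    return cleaned or [query]


if __name__ == "__main__":
    # Stand-in for a real LLM call, used only to show the data flow.
    def fake_llm(prompt):
        return '["wishing good luck", "break a leg"]'

    print(prioritize_terms("go break a leg", fake_llm))
```

Falling back to the original query keeps the search usable even when the model's reply cannot be parsed.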
So, you first retrieve similar matches for the highest-priority term and then rerank that list of vectors based on the remaining terms. This is intended to produce higher-quality top matches.
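The retrieve-then-rerank flow described above might look like the sketch below. It rests on several assumptions: embed is a toy bag-of-words stand-in for a real embedding model, the prioritized terms are supplied directly instead of coming from the LLM, and halving the shortlist at each step is just one way to narrow the candidates. The names rvd_search, shortlist_size, and top_k are illustrative only.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rvd_search(prioritized_terms, documents, shortlist_size=3, top_k=1):
    """Narrow the candidates term by term, highest priority first.

    The first term selects a shortlist from all documents; each later term
    reranks the surviving candidates and shrinks the list (assumed policy:
    halve it, but never go below top_k).
    """
    candidates = list(documents)
    keep = shortlist_size
    for term in prioritized_terms:
        query_vec = embed(term)
        candidates = sorted(
            candidates,
            key=lambda doc: cosine(query_vec, embed(doc)),
            reverse=True,
        )[:keep]
        keep = max(top_k, keep // 2)
    return candidates[:top_k]


if __name__ == "__main__":
    docs = [
        "italian restaurant with outdoor seating",
        "italian restaurant with indoor seating only",
        "mexican restaurant with outdoor seating",
        "history of italian art",
    ]
    print(rvd_search(["italian restaurant", "outdoor seating"], docs))
```

Because Python's sort is stable, candidates that tie on a lower-priority term keep the order established by the higher-priority terms.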
For example, without RVD, the query 'go break a leg' might match sentences about literal leg injuries instead of the intended meaning of wishing someone good luck.
Similarly, the query 'Italian food serving restaurant having outdoor seating' might wrongly match sentences that merely mention Italy or restaurants, even though the user has specifically constrained the search to the mentioned terms. A lot depends on the training objective of the embedding model. RVD tries to improve this search experience.