We have a few queries and a list of documents. For each query, we submit a list of {query, document} pairs, and the model scores each pair so that the documents most relevant to the query can be ranked first.
Bumblebee does not yet implement the :for_sequence_classification architecture for this use case.
Conclusion
The results are the same.
The conclusion is that Elixir performs at least as well as Python. Note that we had to supply a tokenizer explicitly for this cross-encoder (we load bert-base-uncased in the Elixir code below).
Queries and documents
I asked ChatGPT to generate queries and documents to test the result of the reranking. The input:
queries=["What is the capital of France?",# Factual query"What is the capital of Germany?",# Similar but different query"What is the Eiffel Tower?",# Entity-related query"What is not the capital of France?",# Negation query"Where is London located?",# Geographical location query]documents=[# Highly relevant"Paris is the capital of France.",# Directly answers the question"The capital city of France is Paris.",# Rephrased but still relevant"Paris, the largest city in France, is its capital.",# Adds context but remains relevant# Somewhat relevant"The Eiffel Tower is one of Paris' most famous landmarks.",# Paris-related but not about the capital"Paris is known for its museums and cultural landmarks.",# Related to Paris but not directly answering# Slightly misleading but still connected"London is a global city, and the capital of the UK.",# Related to capitals, but not relevant to the query"France is a country in Europe with Paris as a key tourist destination.",# Mention of Paris but indirect relevance# Irrelevant"Berlin is the capital of Germany.",# Different country and unrelated"Spain's capital, Madrid, is famous for its architecture.",# Completely irrelevant country"New York City is one of the most populous cities in the USA.",# Entirely unrelated to the query# Misleading/Confusing information"The capital of France was once London during World War II.",# Incorrect or misleading historical claim"Paris is the capital of Italy.",# Deliberately false statement# Edge cases (minimal or abstract information)"France is a country.",# Vague, doesn't address the capital at all"There are many capital cities in Europe.",# General statement about capitals without specifics"Paris is a word with many meanings.",# Abstract, contextless statement"The sky in Paris is often overcast.",# Completely off-topic but mentions Paris]
Python running the model
I consider Python as the "source of truth". In a Jupyter notebook:
from sentence_transformers import CrossEncoder

def rerank(query, documents, first=3, model_name="cross-encoder/ms-marco-MiniLM-L-6-v2"):
    # Load the cross-encoder model
    model = CrossEncoder(model_name)
    # Prepare input pairs: (query, document)
    input_pairs = [[query, doc] for doc in documents]
    # Compute the scores
    scores = model.predict(input_pairs)
    # Sort the documents by score and keep only the top 'first' results
    reranked_results = [x for _, x in sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)[:first]]
    reranked_scores = sorted(scores, reverse=True)[:first]
    return reranked_results, reranked_scores
import time

start = time.time()
first = 3  # Number of top results to keep
for query in queries:
    reranked, scores = rerank(query, documents, first)
    print(f"Query: {query}")
    for doc, score in zip(reranked, scores):
        print(f"Score: {score:.4f} - {doc}")
print(f"time: {time.time()-start:.4f}")
Sequential Python Results
Query: What is the capital of France?
Score: 8.9898 - The capital city of France is Paris.
Score: 8.5007 - Paris is the capital of France.
Score: 7.7378 - Paris, the largest city in France, is its capital.
Query: What is the capital of Germany?
Score: 8.6557 - Berlin is the capital of Germany.
Score: -3.3820 - There are many capital cities in Europe.
Score: -3.4038 - Paris is the capital of Italy.
Query: What is the Eiffel Tower?
Score: 9.3518 - The Eiffel Tower is one of Paris' most famous landmarks.
Score: -8.0965 - Paris is known for its museums and cultural landmarks.
Score: -8.2520 - Paris is the capital of Italy.
Query: What is not the capital of France?
Score: 6.3721 - The capital city of France is Paris.
Score: 5.8387 - Paris is the capital of France.
Score: 5.2583 - Paris, the largest city in France, is its capital.
Query: Where is London located?
Score: 5.5550 - London is a global city, and the capital of the UK.
Score: -1.8151 - The capital of France was once London during World War II.
Score: -7.4113 - Paris, the largest city in France, is its capital.
time: 3.3038
Concurrent run
start = time.time()
first = 3  # Number of top results to keep
rerank_concurrent(queries, documents, first)
print(f"time: {time.time()-start:.4f}")
Concurrent Python results
... same results; only the order of arrival may differ
time: 1.2592
Elixir running the code
The Elixir code is based on this issue: elixir-nx/bumblebee#251
Sequential run
defmodule CrossEncoder do
  @first 3

  def load_model do
    repo = {:hf, "cross-encoder/ms-marco-MiniLM-L-6-v2"}
    tokenizer = {:hf, "bert-base-uncased"}
    {:ok, model_info} = Bumblebee.load_model(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(tokenizer)
    {model_info, tokenizer}
  end

  def rerank(documents, query) do
    # Prepare input pairs for the cross-encoder
    {model_info, tokenizer} = load_model()

    input_pairs =
      Bumblebee.apply_tokenizer(tokenizer, Enum.map(documents, fn doc -> {query, doc} end))

    # Run the cross-encoder in batches
    outputs = Axon.predict(model_info.model, model_info.params, input_pairs)

    # Combine scores with the original documents and sort
    Enum.zip(documents, outputs.logits |> Nx.to_flat_list())
    |> Enum.sort_by(fn {_, score} -> score end, :desc)
    |> Enum.map(fn {doc, score} -> {Float.round(score, 4), doc} end)
    |> Enum.take(@first)
    |> IO.inspect()
  end
end
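The driver that produced the output below is not preserved in the post. A minimal sketch, assuming :timer.tc wraps a plain Enum.map over the queries and that the leading 2857 is the elapsed time converted to milliseconds:

{time, results} =
  :timer.tc(fn ->
    Enum.map(queries, fn query ->
      # rerank/2 prints and returns the top results for one query
      {query, CrossEncoder.rerank(documents, query)}
    end)
  end)

# :timer.tc/1 returns microseconds; the 2857 in the output below
# suggests a conversion to milliseconds, e.g. div(time, 1000)
{div(time, 1000), results}

Note that rerank/2 reloads the model and tokenizer on every call; hoisting load_model/0 out of the loop is an obvious optimization, especially for the concurrent version.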
{2857,[{"What is the capital of France?",[{8.9898,"The capital city of France is Paris."},{8.5007,"Paris is the capital of France."},{7.7378,"Paris, the largest city in France, is its capital."}]},{"What is the capital of Germany?",[{8.6557,"Berlin is the capital of Germany."},{-3.382,"There are many capital cities in Europe."},{-3.4038,"Paris is the capital of Italy."}]},{"What is the Eiffel Tower?",[{9.3518,"The Eiffel Tower is one of Paris' most famous landmarks."},{-8.0965,"Paris is known for its museums and cultural landmarks."},{-8.252,"Paris is the capital of Italy."}]},{"What is not the capital of France?",[{6.3721,"The capital city of France is Paris."},{5.8387,"Paris is the capital of France."},{5.2583,"Paris, the largest city in France, is its capital."}]},{"Where is London located?",[{5.555,"London is a global city, and the capital of the UK."},{-1.8151,"The capital of France was once London during World War II."},{-7.4113,"Paris, the largest city in France, is its capital."}]}]}