update tutorials
ZiyiXia committed Nov 15, 2024
1 parent c9cfa7c commit 2aa9204
Showing 4 changed files with 85 additions and 15 deletions.
File renamed without changes.
100 changes: 85 additions & 15 deletions Tutorials/4_Evaluation/4.4.2_BEIR.ipynb
@@ -42,7 +42,40 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Use BEIR"
"## 1. Evaluate using BEIR"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"BEIR contains 18 datasets which can be downloaded from the [link](https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/), while 4 of them are private datasets that need appropriate licences. If you want to access to those 4 datasets, take a look at their [wiki](https://github.com/beir-cellar/beir/wiki/Datasets-available) for more information. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"| Dataset Name | Type | Queries | Documents | Avg. Docs/Q | Public | \n",
"| ---------| :-----------: | ---------| --------- | ------| :------------:| \n",
"| ``msmarco`` | `Train` `Dev` `Test` | 6,980 | 8.84M | 1.1 | Yes | \n",
"| ``trec-covid``| `Test` | 50| 171K| 493.5 | Yes | \n",
"| ``nfcorpus`` | `Train` `Dev` `Test` | 323 | 3.6K | 38.2 | Yes |\n",
"| ``bioasq``| `Train` `Test` | 500 | 14.91M | 8.05 | No | \n",
"| ``nq``| `Train` `Test` | 3,452 | 2.68M | 1.2 | Yes | \n",
"| ``hotpotqa``| `Train` `Dev` `Test` | 7,405 | 5.23M | 2.0 | Yes |\n",
"| ``fiqa`` | `Train` `Dev` `Test` | 648 | 57K | 2.6 | Yes | \n",
"| ``signal1m`` | `Test` | 97 | 2.86M | 19.6 | No |\n",
"| ``trec-news`` | `Test` | 57 | 595K | 19.6 | No |\n",
"| ``arguana`` | `Test` | 1,406 | 8.67K | 1.0 | Yes |\n",
"| ``webis-touche2020``| `Test` | 49 | 382K | 49.2 | Yes |\n",
"| ``cqadupstack``| `Test` | 13,145 | 457K | 1.4 | Yes |\n",
"| ``quora``| `Dev` `Test` | 10,000 | 523K | 1.6 | Yes | \n",
"| ``dbpedia-entity``| `Dev` `Test` | 400 | 4.63M | 38.2 | Yes | \n",
"| ``scidocs``| `Test` | 1,000 | 25K | 4.9 | Yes | \n",
"| ``fever``| `Train` `Dev` `Test` | 6,666 | 5.42M | 1.2| Yes | \n",
"| ``climate-fever``| `Test` | 1,535 | 5.42M | 3.0 | Yes |\n",
"| ``scifact``| `Train` `Test` | 300 | 5K | 1.1 | Yes |"
]
},
{
@@ -52,6 +85,13 @@
"### 1.1 Load Dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First prepare the logging setup."
]
},
{
"cell_type": "code",
"execution_count": 12,
@@ -66,6 +106,13 @@
" handlers=[LoggingHandler()])"
]
},
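{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above is partially collapsed in this diff; a typical full setup (a sketch following BEIR's standard examples, not necessarily the exact hidden code) looks like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"from beir import LoggingHandler\n",
"\n",
"# route BEIR's log messages to stdout with timestamps\n",
"logging.basicConfig(format='%(asctime)s - %(message)s',\n",
"                    datefmt='%Y-%m-%d %H:%M:%S',\n",
"                    level=logging.INFO,\n",
"                    handlers=[LoggingHandler()])"
]
},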
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this demo, we choose the `arguana` dataset for a quick demonstration."
]
},
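{
"cell_type": "markdown",
"metadata": {},
"source": [
"The actual loading cell below is collapsed in this diff; here is a minimal sketch of what downloading and loading `arguana` typically looks like with BEIR's `util.download_and_unzip` and `GenericDataLoader` (the `datasets` output directory is an assumption):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from beir import util\n",
"from beir.datasets.data_loader import GenericDataLoader\n",
"\n",
"# download and unzip one of the public BEIR datasets\n",
"dataset = \"arguana\"\n",
"url = f\"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip\"\n",
"data_path = util.download_and_unzip(url, \"datasets\")\n",
"\n",
"# corpus: doc_id -> {\"title\": ..., \"text\": ...}; queries: query_id -> text;\n",
"# qrels: query_id -> {doc_id: relevance_score}\n",
"corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=\"test\")"
]
},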
{
"cell_type": "code",
"execution_count": null,
@@ -140,6 +187,13 @@
"### 1.2 Evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we load `bge-base-en-v1.5` from huggingface and evaluate its performance on arguana."
]
},
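{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of that evaluation with BEIR's exact-search dense retriever (the `batch_size` and `score_function` values are illustrative assumptions, not necessarily what the collapsed cell below uses):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from beir.retrieval import models\n",
"from beir.retrieval.evaluation import EvaluateRetrieval\n",
"from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES\n",
"\n",
"# wrap the Hugging Face model with BEIR's dense exact-search interface\n",
"model = DRES(models.SentenceBERT(\"BAAI/bge-base-en-v1.5\"), batch_size=128)\n",
"retriever = EvaluateRetrieval(model, score_function=\"cos_sim\")\n",
"\n",
"# retrieve top documents for each query, then score them against the qrels\n",
"results = retriever.retrieve(corpus, queries)\n",
"ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)"
]
},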
{
"cell_type": "code",
"execution_count": null,
@@ -248,7 +302,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluate using FlagEmbedding"
"## 2. Evaluate using FlagEmbedding"
]
},
{
@@ -267,7 +321,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -290,7 +344,8 @@
" --eval_metrics ndcg_at_10 recall_at_100 \n",
" --ignore_identical_ids True \n",
" --embedder_name_or_path BAAI/bge-base-en-v1.5 \n",
" --devices cuda:7\n",
" --embedder_batch_size 1024\n",
" --devices cuda:4\n",
"\"\"\".replace('\\n','')\n",
"\n",
"sys.argv = arguments.split()"
@@ -305,9 +360,24 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Split 'dev' not found in the dataset. Removing it from the list.\n",
"ignore_identical_ids is set to True. This means that the search results will not contain identical ids. Note: Dataset such as MIRACL should NOT set this to True.\n",
"pre tokenize: 100%|██████████| 9/9 [00:00<00:00, 16.19it/s]\n",
"You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n",
"Inference Embeddings: 100%|██████████| 9/9 [00:11<00:00, 1.27s/it]\n",
"pre tokenize: 100%|██████████| 2/2 [00:00<00:00, 19.54it/s]\n",
"Inference Embeddings: 100%|██████████| 2/2 [00:02<00:00, 1.29s/it]\n",
"Searching: 100%|██████████| 44/44 [00:00<00:00, 208.73it/s]\n"
]
}
],
"source": [
"from transformers import HfArgumentParser\n",
"\n",
@@ -343,7 +413,7 @@
},
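{
"cell_type": "markdown",
"metadata": {},
"source": [
"The rest of the cell above is collapsed; here is a sketch of the parse-and-run step, assuming FlagEmbedding's BEIR evaluation interface (`BEIREvalArgs`, `BEIREvalModelArgs`, `BEIREvalRunner`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import HfArgumentParser\n",
"from FlagEmbedding.evaluation.beir import BEIREvalArgs, BEIREvalModelArgs, BEIREvalRunner\n",
"\n",
"# parse the CLI-style arguments defined above into the two dataclasses\n",
"parser = HfArgumentParser((BEIREvalArgs, BEIREvalModelArgs))\n",
"eval_args, model_args = parser.parse_args_into_dataclasses()\n",
"\n",
"# run retrieval and evaluation end to end\n",
"runner = BEIREvalRunner(eval_args=eval_args, model_args=model_args)\n",
"runner.run()"
]
},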
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {},
"outputs": [
{
Expand All @@ -352,16 +422,16 @@
"text": [
"{\n",
" \"arguana-test\": {\n",
" \"ndcg_at_10\": 0.6361,\n",
" \"ndcg_at_100\": 0.66057,\n",
" \"map_at_10\": 0.55766,\n",
" \"map_at_100\": 0.56337,\n",
" \"recall_at_10\": 0.88407,\n",
" \"ndcg_at_10\": 0.63668,\n",
" \"ndcg_at_100\": 0.66075,\n",
" \"map_at_10\": 0.55801,\n",
" \"map_at_100\": 0.56358,\n",
" \"recall_at_10\": 0.88549,\n",
" \"recall_at_100\": 0.99147,\n",
" \"precision_at_10\": 0.08841,\n",
" \"precision_at_10\": 0.08855,\n",
" \"precision_at_100\": 0.00991,\n",
" \"mrr_at_10\": 0.55766,\n",
" \"mrr_at_100\": 0.56337\n",
" \"mrr_at_10\": 0.55809,\n",
" \"mrr_at_100\": 0.56366\n",
" }\n",
"}\n"
]
