This is the official repository of "IR From Bag-of-words to BERT and Beyond through Practical Experiments", a CIKM 2021 full-day tutorial with PyTerrier and OpenNIR search toolkits.
This is an updated edition of our previous ECIR 2021 Tutorial
- Part 1: Classical IR: indexing, retrieval and evaluation
- Part 2: Modern Retrieval Architectures: PyTerrier data model and operators, towards re-rankers and learning-to-rank
- Part 3: Contemporary Retrieval Architectures: Neural re-rankers such as BERT, EPIC, T5 and neural inverted index augmentation such as DeepCT and Doc2Query
- Part 4: Recent Advances beyond the classical inverted index: learned sparse retrieval, dense retrieval, nearest neighbour search
If you participated in either run of this tutorial, please complete the Exit Feedback Quiz: https://forms.office.com/r/RiYSAxAKhk
Our tutorial will run twice on Friday 5th November 2021, with live lab sessions.
Suggested viewing:
- Europe, Africa, West/South/Central Asia: Run 1
- North/South America: Run 2
- East Asia and Oceania: Run 1 (parts 1 & 2), followed by Run 2 (parts 3 & 4)
Run 1: Aligned with Europe, Africa, and West/South/Central Asia, partially accessible to East Asia/Oceania
London (GMT) | Content | Live | Johannesburg (GMT+2) | Delhi (GMT+5:30) | Beijing (GMT+8) | Brisbane (GMT+10) |
---|---|---|---|---|---|---|
0900-1000 | part 1 slides | ✅ | 1100-1200 | 1430-1530 | 1700-1800 | 1900-2000 |
1000-1030 | part 1 lab | ✅ | 1200-1230 | 1530-1600 | 1800-1830 | 2000-2030 |
1030-1100 | break | ☕ | 1230-1300 | 1600-1630 | 1830-1900 | 2030-2100 |
1100-1200 | part 2 slides | ✅ | 1300-1400 | 1630-1730 | 1900-2000 | 2100-2200 |
1200-1230 | part 2 lab | ✅ | 1400-1430 | 1730-1800 | 2000-2030 | 2200-2230 |
1230-1330 | break | ☕ | 1430-1530 | 1800-1900 | ||
1330-1430 | part 3 slides | ✅ | 1530-1630 | 1900-2000 | ||
1430-1500 | part 3 lab | ✅ | 1630-1700 | 2000-2030 | ||
1500-1530 | break | ☕ | 1700-1730 | 2030-2100 | ||
1530-1630 | part 4 slides | ✅ | 1730-1830 | 2100-2200 | ||
1630-1700 | part 4 lab | ✅ | 1830-1900 | 2200-2230 |
Run 2: Aligned with North/South America, partially accessible to East Asia/Oceania
LA (GMT-7) | Content | Live | New York (GMT-4) | São Paulo (GMT-3) | Beijing (GMT+8) | Brisbane (GMT+10) |
---|---|---|---|---|---|---|
1100-1200 | part 1 slides | 🎥 | 1400-1500 | 1500-1600 | ||
1200-1230 | part 1 lab | ✅ | 1500-1530 | 1600-1630 | ||
1230-1300 | break | ☕ | 1530-1600 | 1630-1700 | ||
1300-1400 | part 2 slides | 🎥 | 1600-1700 | 1700-1800 | ||
1400-1430 | part 2 lab | ✅ | 1700-1730 | 1800-1830 | ||
1430-1530 | break | ☕ | 1730-1830 | 1830-1930 | ||
1530-1630 | part 3 slides | 🎥 | 1830-1930 | 1930-2030 | 0630-0730 (6 Nov) | 0830-0930 (6 Nov) |
1630-1700 | part 3 lab | ✅ | 1930-2000 | 2030-2100 | 0730-0800 (6 Nov) | 0930-1000 (6 Nov) |
1700-1730 | break | ☕ | 2000-2030 | 2100-2130 | 0800-0830 (6 Nov) | 1000-1030 (6 Nov) |
1730-1830 | part 4 slides | 🎥 | 2030-2130 | 2130-2230 | 0830-0930 (6 Nov) | 1030-1030 (6 Nov) |
1830-1900 | part 4 lab | ✅ | 2130-2200 | 2230-2300 | 0930-1000 (6 Nov) | 1030-1200 (6 Nov) |
We are thankful to Luca and Eugene for supporting additional live labs, and for CIKM 2021 volunteers, for being able to support this schedule.
- Sean MacAvaney, University of Glasgow, UK
- Craig Macdonald, University of Glasgow, UK
- Nicola Tonellotto, University of Pisa, IT
Contributors to Live Labs:
- Luca Soldaini, Amazon, US
- Eugene Yang, Johns Hopkins University, US
- PyTerrier: [Github] [Documentation]
- OpenNIR: [Github] [Documentation]
- PyTerrier_ColBERT: [Github]
- PyTerrier_T5: [Github]
- PyTerrier_doc2query: [Github]
- PyTerrier_DeepCT: [Github]
- PyTerrier_ANCE: [Github]
- PyTerrier_DeepImpact [Github]
If you make using of any of these slides, notebooks, or additional PyTerrier plugins, please cite our tutorial abstract:
@inproceedings{cikm2021-tut-bow2b,
author = {MacAvaney, Sean and Macdonald, Craig and Tonellotto, Nicola},
title = {IR From Bag-of-words to BERT and Beyond through Practical Experiments: A CIKM 2021 tutorial with PyTerrier and OpenNIR},
booktitle = {Proceedings of CIKM 2021},
year = {2021}
}