Fupi is a serverless multilingual semantic search system based on LanceDB.
Once upon a time, a giraffe calf was orphaned during a severe drought and was saved thanks to the kindness and efforts of the local community and a knowledgeable animal whisperer. Named Fupi, he became so attached to his rescuer that he visited him frequently long after he had recuperated and been set free.
Today we use complex machine learning technologies thanks to the knowledge, persistence and efforts of many people. Just like the small Fupi, we should always be thankful to them for their goodwill and contributions!
- 1. Multilingual (Cross-Language) Semantic Search:
  ability to search using one language in texts of another language
- 2. Usability in serverless or scale-to-zero applications for low operational costs
- 3. Adaptability to different cloud environments or on-premise systems
- 4. No dependency on AI as a service (AIaaS) for production of embeddings
- 5. No dependency on software as a service (SaaS) for storage of embeddings
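The first objective can be illustrated with a minimal sketch of cross-language retrieval by cosine similarity. This is not Fupi's actual code: the three-dimensional vectors are invented stand-ins for the embeddings a multilingual model would produce, and the names `cosine_similarity`, `search`, `DOCUMENTS` and `QUERY_VECTOR` are hypothetical. It shows only the underlying idea, which is that texts in different languages with similar meaning should land close together in the embedding space.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, documents, top_k=2):
    """Rank (text, vector) pairs by similarity to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors invented for illustration only; a real system would obtain
# them from a multilingual embedding model and keep them in a vector store.
DOCUMENTS = [
    ("Die Giraffe ist das höchste Landtier.", [0.9, 0.1, 0.0]),
    ("Le serveur redémarre à minuit.", [0.0, 0.2, 0.9]),
    ("La jirafa come hojas de acacia.", [0.8, 0.3, 0.1]),
]

# Hypothetical embedding of the English query "giraffe".
QUERY_VECTOR = [1.0, 0.2, 0.0]

results = search(QUERY_VECTOR, DOCUMENTS)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the German and Spanish sentences about giraffes rank above the unrelated French sentence, even though the query is in English.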
@misc{bge-m3,
title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
year={2024},
eprint={2402.03216},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{fan2020englishcentric,
title={Beyond English-Centric Multilingual Machine Translation},
author={Angela Fan and Shruti Bhosale and Holger Schwenk and Zhiyi Ma and Ahmed El-Kishky and Siddharth Goyal and Mandeep Baines and Onur Celebi and Guillaume Wenzek and Vishrav Chaudhary and Naman Goyal and Tom Birch and Vitaliy Liptchinsky and Sergey Edunov and Edouard Grave and Michael Auli and Armand Joulin},
year={2020},
eprint={2010.11125},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
This program is licensed under the terms of the Apache License 2.0.
Dimitar D. Mitov, 2024,
Adam Fauzi, 2024