This repository includes all of the necessary files for a comprehensive database of foods in the United States.
It covers both grocery store foods and restaurant foods, and also includes MySQL files, images of the foods, and scripts. The scripts are mainly for scraping the web for images of the foods; the scraping is multithreaded to make it faster.
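As a rough illustration of the threaded scraping idea, here is a minimal sketch of downloading many images in parallel with a thread pool. It is not the repository's actual script: the names `fetch_image` and `fetch_all`, the worker count, and the timeout are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional
import urllib.request


def fetch_image(url: str, timeout: float = 10.0) -> bytes:
    """Download one image and return its raw bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_image,
    max_workers: int = 8,  # assumed value; tune for your network and the host
) -> Dict[str, Optional[bytes]]:
    """Fetch every URL on a thread pool; failed downloads map to None."""
    results: Dict[str, Optional[bytes]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all downloads up front so they overlap while waiting on I/O.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # A single bad URL should not abort the whole batch.
                results[url] = None
    return results
```

Threads suit this kind of work because each download spends most of its time waiting on the network, so several requests can be in flight at once. Passing a different `fetch` callable makes it easy to try the function without touching the network.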
For more information on this project, here is the link to the paper on IEEE Xplore.
The paper above was published in the 2022 International Conference on Computational Science and Computational Intelligence (CSCI). If you use this dataset, please cite it with:
@INPROCEEDINGS{10216759, author={Whalen, Lexington and Turner-McGrievy, Brie and McGrievy, Matthew and Hester, Andrew and Valafar, Homayoun}, booktitle={2022 International Conference on Computational Science and Computational Intelligence (CSCI)}, title={On Creating a Comprehensive Food Database}, year={2022}, volume={}, number={}, pages={1610-1614}, keywords={Costs;Databases;Scientific computing;Soft sensors;Eating disorders;Fats;Reliability;MySQL;Table Design;USDA;Food Database}, doi={10.1109/CSCI58124.2022.00288}}
The USDA food database can be quite confusing to deal with. It has many tables whose relations are not well defined, which makes it hard to use in any real project. We tackle this problem by creating a simple database that contains all the foods present in the USDA database, along with images of the foods.
Due to Git storage limits, we have decided to move the files for the project onto the file storage platform MEGA. The link to zip files of the database files and images can be found here.
Here is an example website using this database. It is relatively simple, but should give a general idea of how the database can be used in a project.
Furthermore, we would like to thank @jpoles1 for creating a great tutorial on how to potentially use this database. See their posts here.
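To give a general idea of how a project might query the database, here is a small sketch of a name search that also returns an image path when one exists. Everything about the schema is an assumption: the `foods` and `images` tables, their columns, and the sample rows are hypothetical placeholders, not the repository's real layout. The sketch uses Python's built-in `sqlite3` with an in-memory database so that it runs without a server; with MySQL the SQL would be similar, though drivers typically use `%s` rather than `?` as the parameter placeholder.

```python
import sqlite3

# Hypothetical schema: the real table and column names may differ.
SCHEMA = """
CREATE TABLE foods (id INTEGER PRIMARY KEY, name TEXT, calories REAL);
CREATE TABLE images (food_id INTEGER, path TEXT);
"""


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO foods VALUES (?, ?, ?)",
        [(1, "Apple pie", 100.0), (2, "Apple juice", 50.0)],
    )
    # Only one of the two foods gets an image.
    conn.execute("INSERT INTO images VALUES (?, ?)", (1, "images/apple_pie.jpg"))
    return conn


def search_foods(conn: sqlite3.Connection, term: str):
    """Return (name, calories, image_path) rows whose name contains term."""
    query = """
        SELECT f.name, f.calories, i.path
        FROM foods AS f
        LEFT JOIN images AS i ON i.food_id = f.id
        WHERE f.name LIKE ?
        ORDER BY f.name
    """
    # A parameterized query keeps user input out of the SQL text.
    return conn.execute(query, (f"%{term}%",)).fetchall()


if __name__ == "__main__":
    for row in search_foods(demo_connection(), "Apple"):
        print(row)
```

The `LEFT JOIN` matters here because not every food has an image: foods without one still appear in the results, with `None` as the image path.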
- 96 Restaurants
- 193,369 food items
- 105,077 image files
- 17,619 Brands
- 991,665 Foods
- 322,401 image files
- 50,254 Foods
- 7 data types:
- SR Legacy Food
- Sample Food
- Sub Sample Food
- Foundation Food
- Agricultural Acquisition Food
- Survey FNDDS Food
- 26,028 image files
- 205,000 foods
- The number of images is smaller than the number of foods because many foods do not have an image of their own.
The Comprehensive Food Database team would like to especially thank those users who have provided more data or tools related to the project to the community.
- Username: jpoles1
- GitHub: https://github.com/jpoles1
- Contribution: Noticed an issue with LFS storage, added Canadian food data, and provided helpful scripts and tutorials. Please check out his blog here, where you can find helpful tips about the food database and other interesting posts.
- If this code or any other code I have written has helped you, feel free to make a donation at https://www.buymeacoffee.com/whalenlexn.