This project implements a speech-to-text system using the Google Cloud Speech-to-Text API and folder monitoring with the Chokidar library. The system monitors the "files" folder for MP3 files, converts the audio content to text, and saves the result to the MongoDB database. After conversion, the MP3 file is moved to the "files-processed" folder.
- Folder Monitoring: The Chokidar library is used to monitor the "files" folder for new additions. When a new MP3 file is detected, the system triggers an event.
- Speech-to-Text Conversion: The Google Cloud Speech-to-Text API is used to convert the content of the MP3 file to text. The converted text is stored in a variable.
- Database Storage: The converted text and relevant information from the MP3 file (name, creation date, etc.) are saved to the MongoDB database.
- Move File to Processed Folder: After conversion and storage, the original MP3 file is moved to the "files-processed" folder.
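The four steps above can be sketched with Node's standard library alone. This is a minimal illustration, not the project's actual `index.js`: the function names, the record fields, and the recognition settings (`MP3`, 16000 Hz, `en-US`) are assumptions, and the Speech-to-Text call and the MongoDB write are passed in as the hypothetical `transcribe` and `save` functions so the flow can run without credentials.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Only MP3 files should trigger the pipeline.
function isMp3(filePath) {
  return path.extname(filePath).toLowerCase() === '.mp3';
}

// Request body for a Speech-to-Text recognize call.
// Assumption: encoding, sample rate and language are illustrative values.
function buildRecognizeRequest(audioBytes) {
  return {
    audio: { content: audioBytes.toString('base64') },
    config: { encoding: 'MP3', sampleRateHertz: 16000, languageCode: 'en-US' },
  };
}

// Document to store: transcript plus basic file metadata (hypothetical schema).
function buildRecord(filePath, stats, transcript) {
  return {
    fileName: path.basename(filePath),
    createdAt: stats.birthtime,
    sizeBytes: stats.size,
    transcript,
    processedAt: new Date(),
  };
}

// Move the file; rename works when both folders are on the same filesystem.
async function moveToProcessed(filePath, processedDir) {
  await fs.mkdir(processedDir, { recursive: true });
  const target = path.join(processedDir, path.basename(filePath));
  await fs.rename(filePath, target);
  return target;
}

// Handle one detected file. `transcribe` and `save` are injected by the caller.
async function processFile(filePath, { transcribe, save, processedDir }) {
  if (!isMp3(filePath)) return null;
  const [audioBytes, stats] = await Promise.all([
    fs.readFile(filePath),
    fs.stat(filePath),
  ]);
  const transcript = await transcribe(buildRecognizeRequest(audioBytes));
  const record = buildRecord(filePath, stats, transcript);
  await save(record); // store first, so a failed write leaves the file in place
  await moveToProcessed(filePath, processedDir);
  return record;
}
```

In a real setup, `processFile` would typically be called from Chokidar's `add` event (`chokidar.watch('files').on('add', ...)`), `transcribe` would wrap the Speech client's `recognize` method, and `save` would wrap a MongoDB `insertOne` call. Check the encoding and sample rate against the API version and the audio you actually use.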
This project relies on several open source projects:
- Docker - Platform for developing, shipping, and running applications using containerization;
- Node.js - JavaScript runtime built on Chrome’s V8 JavaScript engine;
- google-cloud/speech - Node.js client library for the Google Cloud Speech-to-Text API, which transcribes audio into text using Google's machine learning technology;
- fs-js - A native module for effectively working with files built on top of Node's famous fs module;
- Chokidar - File-watching library built on the Node.js core fs module. It normalizes the events it receives from fs.watch and fs.watchFile, often verifying them by checking file stats and/or directory contents;
- MongoDB - Source-available, cross-platform, document-oriented NoSQL database that stores JSON-like documents with optional schemas.
This project requires Node.js v20+ to run.
Install the dependencies and devDependencies and start the server.
yarn
Replace DB_URL, DB_NAME, DB_COLLECTION, DB_ROOT_USERNAME and DB_ROOT_PASSWORD in the docker-compose file with your database connection details.
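To catch a missing setting early, the application can validate these values at startup. The sketch below assumes the docker-compose file passes them to the Node.js process as environment variables with the same names. `loadDbConfig` and the shape of the object it returns are hypothetical.

```javascript
// Names taken from the setup step above; all are treated as required here.
const REQUIRED_VARS = [
  'DB_URL',
  'DB_NAME',
  'DB_COLLECTION',
  'DB_ROOT_USERNAME',
  'DB_ROOT_PASSWORD',
];

// Read the database settings from an environment object (process.env by default)
// and fail fast with a clear message if any of them is missing or empty.
function loadDbConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing database settings: ${missing.join(', ')}`);
  }
  return {
    url: env.DB_URL,
    dbName: env.DB_NAME,
    collection: env.DB_COLLECTION,
    username: env.DB_ROOT_USERNAME,
    password: env.DB_ROOT_PASSWORD,
  };
}
```

Calling `loadDbConfig()` once at the top of the entry point turns a half-configured container into an immediate, readable error instead of a connection failure later on.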
Access your Google Cloud account, generate your key.json file and place it in the root of the project.
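A quick sanity check on the key file can save a confusing authentication error later. The sketch below assumes `key.json` is a standard service account key, which contains `type`, `project_id`, `client_email` and `private_key` fields. `checkKeyFile` is a hypothetical helper, not part of any Google library.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Fields present in a service account key file.
const KEY_FIELDS = ['type', 'project_id', 'client_email', 'private_key'];

// Verify that the key file exists, is valid JSON, and has the expected fields.
// Returns the absolute path so it can be handed to the client library.
function checkKeyFile(keyPath = 'key.json') {
  const absolutePath = path.resolve(keyPath);
  const key = JSON.parse(fs.readFileSync(absolutePath, 'utf8'));
  const missing = KEY_FIELDS.filter((field) => !key[field]);
  if (missing.length > 0) {
    throw new Error(`key.json is missing fields: ${missing.join(', ')}`);
  }
  return absolutePath;
}
```

The returned path can then be given to the Speech client through its `keyFilename` option, or exported as the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.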
Replace files/example.mp3 with the actual MP3 file path in index.js.
docker-compose up
MIT. Free Software, Hell Yeah!