playlist_tag,
playlist_url,
video_url,
channel_id,
channel_url,
video_title,
video_length,
video_publish_date,
video_rating,
video_views,
video_author,
video_keywords,
video_description,
video_transcript
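
The column list above can be turned into a CSV header with Python's standard `csv` module. This is a minimal sketch, assuming one row per video; the `write_rows` helper and the sample values are hypothetical and only illustrate the layout.

```python
import csv
import io

# Column order taken from the list above.
COLUMNS = [
    "playlist_tag", "playlist_url", "video_url", "channel_id", "channel_url",
    "video_title", "video_length", "video_publish_date", "video_rating",
    "video_views", "video_author", "video_keywords", "video_description",
    "video_transcript",
]


def write_rows(rows, handle):
    """Write video records (dicts keyed by column name) as CSV to an open handle."""
    # restval="" fills any column missing from a record with an empty string.
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical record, used only to show the shape of a row.
sample = {
    "video_title": "Example title",
    "video_length": 600,
    "video_transcript": "Hello, and welcome.",
}
buffer = io.StringIO()
write_rows([sample], buffer)
```

Writing to a real file works the same way: open it with `newline=""` and pass the handle instead of the in-memory buffer.
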
- Identify the data source (videos from the Healthy Gamer YouTube channel)
- Use pytube + OpenAI Whisper to transcribe the videos
- Save the data into a CSV file with metadata
- Conduct Analysis on the Dataset
- Build a report on the dataset
- Design a RAG-based chatbot
- Implement the knowledge base data store (Milvus, FAISS, ChromaDB, ...) + embeddings using an open-source model.
- Use the Hugging Face Transformers library + LlamaIndex (Llama 3 8B, Phi-3 mini 128k, Llama mini, Mistral 7B)
- Technical tests
- Working Tests
- Preliminary deployment on the web.
- Gather opinions to improve the web deployment.
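
The retrieval half of the RAG design in the steps above can be sketched without committing to a vector store. This is a toy illustration, not the project's implementation: it splits a transcript into overlapping word chunks and ranks them against a question by cosine similarity over bag-of-words counts, which stands in for the learned embeddings and the vector store the plan calls for. The names `chunk_text`, `embed`, `cosine`, and `retrieve`, the chunk sizes, and the sample transcript are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=8, overlap=2):
    """Split text into chunks of `size` words; neighbours share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (linear scan, no index)."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical transcript text, used only to exercise the functions.
transcript = (
    "Sleep problems often start with stress during the day. "
    "A regular bedtime routine helps the brain wind down at night. "
    "Meditation practice can reduce anxiety and improve focus over time."
)
chunks = chunk_text(transcript)
top = retrieve("How does meditation reduce anxiety?", chunks, k=2)
```

In a fuller build, `embed` would call an embedding model and the linear scan in `retrieve` would become a query against the chosen vector store; the retrieved chunks would then be placed in the prompt sent to the language model.
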