Lluminating-Language

Official Repository for the Summer Project Lluminating Language offered by BCS

Project Objective: To learn about the wonderful world of Natural Language Processing (NLP) from the ground up and finally build our own custom open-source RAG-infused chatbot (we may look at fine-tuning as well).

Week 0

To jog your memory, here are some resources for brushing up on the basics.

  1. Building a Neural Network from Scratch using only Numpy

  2. Neural Networks from Ground Up

  3. Git :

    1. https://rogerdudler.github.io/git-guide/
    2. https://github.com/firstcontributions/first-contributions
  4. Markdown

  5. LaTeX:

    1. https://www.overleaf.com/learn/latex/Learn_LaTeX_in_30_minutes
    2. https://latex-tutorial.com/
  6. Basic Python:

    1. https://dabeaz-course.github.io/practical-python/
    2. https://automatetheboringstuff.com/

NOTE: Do NOT try to finish every resource in full or to master every single concept. Get comfortable with things like printing, conditionals, loops, functions, and importing libraries, and you should be good to go :)
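
As a rough gauge of the level meant here, the short sketch below touches each of those basics once. The function names `circle_area` and `describe` are made up for illustration and are not part of any resource listed above.

```python
import math  # importing a library (here, from the standard library)


def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2


def describe(radius):
    """Use a conditional to label a circle by its area."""
    if circle_area(radius) > 10:
        return "large"
    return "small"


# A loop over a list, printing one line per radius.
for r in [1, 2, 3]:
    print(r, round(circle_area(r), 2), describe(r))
```

If you can read this and predict roughly what it prints, you are likely ready for Week 1.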

Week 1

The first week will be dedicated to understanding the basics of NLP and the tools that we will be using throughout the project.

Task: Build a sentiment analysis model on the IMDB dataset, trying and testing different models and techniques along the way.

Resources:

  1. EDA :

    1. https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
    2. https://www.youtube.com/watch?v=-o3AxdVcUtQ
  2. Pre-Processing :

    1. Tokenization
    2. Stemming and Lemmatization:
      1. https://www.ibm.com/topics/stemming-lemmatization
      2. https://www.youtube.com/watch?v=HHAilAC3cXw
  3. Feature Extraction :

    1. https://www.geeksforgeeks.org/ml-one-hot-encoding/
    2. https://neptune.ai/blog/vectorization-techniques-in-nlp-guide
  4. Model Selection :

    1. https://towardsdatascience.com/top-machine-learning-algorithms-for-classification-2197870ff501
  5. Evaluation:

    1. https://www.analyticsvidhya.com/blog/2021/07/metrics-to-evaluate-your-classification-model-to-take-the-right-decisions/
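
To see how the stages above fit together, here is a minimal sketch of the whole flow (tokenization, word-count features, a classifier, and an accuracy check) using only the Python standard library. It rests on a few assumptions: the handful of reviews in `TRAIN` and `TEST` are invented toy data standing in for the IMDB dataset, and multinomial Naive Bayes with add-one smoothing is just one possible model choice, not one prescribed by this project. For the actual task you would likely reach for a library and the full dataset.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews standing in for the IMDB dataset
# (label 1 = positive, 0 = negative).
TRAIN = [
    ("a wonderful and moving film", 1),
    ("great acting and a great story", 1),
    ("i loved this wonderful movie", 1),
    ("a boring and terrible film", 0),
    ("awful acting and a dull story", 0),
    ("i hated this terrible movie", 0),
]
TEST = [
    ("a great and wonderful story", 1),
    ("a dull and awful movie", 0),
]


def tokenize(text):
    """Pre-processing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Feature extraction + training: count word occurrences per class."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(examples, model):
    """Evaluation: fraction of examples classified correctly."""
    correct = sum(predict(text, model) == label for text, label in examples)
    return correct / len(examples)


model = train_naive_bayes(TRAIN)
print("accuracy on toy test set:", accuracy(TEST, model))
```

Swapping in stemming or lemmatization would only change `tokenize`, and trying a different classifier would only change `train_naive_bayes` and `predict`, which makes it easy to compare techniques as the task asks.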

Mentors:

  1. Udbhav Agarwal
  2. Arin Dhariwal
  3. Himanshu Shekhar
  4. Shreya Gupta