This repository contains 4 projects and 1 final project.

Text Author Identification
• The goal of this project is to identify the author of a given piece of text based on its perplexity.
• Uses n-grams with Laplace smoothing and interpolation to handle out-of-vocabulary words.
• Also predicts the likeliest next word using the trained model.
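The perplexity comparison described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes a bigram model with add-one (Laplace) smoothing only and leaves interpolation out. The function names (`train_bigram`, `perplexity`, `identify_author`) and the two toy corpora are hypothetical.

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count unigrams and bigrams for one author's training text."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams, vocab_size):
    """Perplexity of `tokens` under an add-one smoothed bigram model."""
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = 0.0
    for prev, word in pairs:
        # Add-one smoothing: unseen bigrams still get a non-zero probability.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))

def identify_author(text_tokens, corpora):
    """Pick the author whose model gives the text the lowest perplexity."""
    # Shared vocabulary across all authors plus the test text (an assumption).
    vocab = {w for toks in corpora.values() for w in toks} | set(text_tokens)
    scores = {}
    for author, toks in corpora.items():
        uni, bi = train_bigram(toks)
        scores[author] = perplexity(text_tokens, uni, bi, len(vocab))
    return min(scores, key=scores.get), scores

# Toy corpora, invented for illustration only.
corpora = {
    "author_a": "the sea was calm and the sea was wide".split(),
    "author_b": "the market was loud and the market was crowded".split(),
}
best, scores = identify_author("the sea was wide".split(), corpora)
print(best, scores)
```

Lower perplexity means the model is less "surprised" by the text, so the author with the minimum score is chosen.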

Sentiment Analysis
• Uses regular expressions to extract the actual words from the text.
• Performs POS tagging using the NLTK library.
• Generates features for these words, including word activeness and pleasantness scores.
• Predicts and evaluates the sentiment.
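A rough sketch of the word-extraction and feature steps is shown below. It is only an illustration under stated assumptions: the `AFFECT` lexicon, its values, and the `NEUTRAL` fallback are invented, the regular expression is one possible choice, and the POS-tagging step is omitted to keep the example free of external dependencies.

```python
import re

# Hypothetical toy lexicon: word -> (activeness, pleasantness). Values are invented.
AFFECT = {
    "love": (2.4, 3.0),
    "great": (2.0, 2.8),
    "boring": (1.2, 1.1),
    "hate": (2.6, 1.0),
}
NEUTRAL = (2.0, 2.0)  # assumed fallback for words missing from the lexicon

# Alphabetic words, optionally with one internal apostrophe (e.g. "it's").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")

def extract_words(text):
    """Return lowercase words, dropping punctuation and digits."""
    return [w.lower() for w in WORD_RE.findall(text)]

def features(text):
    """Average activeness and pleasantness over the words in `text`."""
    words = extract_words(text)
    scores = [AFFECT.get(w, NEUTRAL) for w in words]
    n = max(len(scores), 1)  # avoid division by zero on empty input
    return {
        "n_words": len(words),
        "mean_activeness": sum(a for a, _ in scores) / n,
        "mean_pleasantness": sum(p for _, p in scores) / n,
    }

print(extract_words("I love this, it's great!!!"))
print(features("I hate this"))
```

A feature dictionary like this could then be passed to any classifier for the prediction and evaluation step.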

POS Tagging with HMM
• Uses various types of n-grams to generate feature vectors.
• Trains a logistic regression model on these features.
• Implements the Viterbi algorithm to find the most probable sequence of tags.
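The decoding step can be illustrated with a small log-space Viterbi sketch. This is not the project's implementation: it assumes hand-set start, transition, and emission tables in place of the probabilities the trained model would supply, and the tag set and numbers are invented for the example. It also assumes a non-empty input sentence.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` (log-space Viterbi)."""
    def lg(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-probability of any tag path ending in t at position i.
    best = [{t: lg(start_p[t]) + lg(emit_p[t].get(words[0], 0.0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Choose the previous tag that maximises the path score into t.
            prev, score = max(
                ((p, best[i - 1][p] + lg(trans_p[p][t])) for p in tags),
                key=lambda x: x[1],
            )
            best[i][t] = score + lg(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev
    # Follow back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy model, invented for illustration only.
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "barks": 0.1, "walk": 0.4},
    "VERB": {"barks": 0.6, "walk": 0.4},
}
print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
```

Working in log space avoids numerical underflow on long sentences, since products of many small probabilities become sums.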

Hypernym/Hyponym Relations
• This project determines the hypernym/hyponym relation between two words.
• Uses a simple rule-based chunker to identify noun phrases in POS-tagged sentences.
• Uses Hearst patterns to extract the relations.
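As an illustration of the chunk-then-match idea, the sketch below handles a single Hearst pattern, "NP such as NP, NP and NP". It rests on several assumptions: the input is already POS-tagged with Penn Treebank tags (hand-tagged here), the chunking rule is an optional determiner, then adjectives, then one or more nouns, and the function names are hypothetical. The project's actual rules and pattern set may differ.

```python
def chunk_noun_phrases(tagged):
    """Collapse runs of (DT)? JJ* NN+ into single (phrase, "NP") tokens."""
    out, i = [], 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "DT":
            j += 1
        while j < len(tagged) and tagged[j][1].startswith("JJ"):
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1].startswith("NN"):
            k += 1
        if k > j:  # found at least one noun, so emit a noun phrase
            start = i + 1 if tagged[i][1] == "DT" else i  # drop the determiner
            out.append((" ".join(w for w, _ in tagged[start:k]), "NP"))
            i = k
        else:
            out.append(tagged[i])
            i += 1
    return out

def hearst_such_as(chunks):
    """Return (hyponym, hypernym) pairs for 'NP such as NP (, NP)* ((and|or) NP)?'."""
    pairs = []
    for i in range(len(chunks) - 2):
        if (chunks[i][1] == "NP"
                and chunks[i + 1][0].lower() == "such"
                and chunks[i + 2][0].lower() == "as"):
            hypernym = chunks[i][0]
            j = i + 3
            while j < len(chunks):
                word, tag = chunks[j]
                if tag == "NP":
                    pairs.append((word, hypernym))
                elif word.lower() not in {",", "and", "or"}:
                    break  # the list of hyponyms has ended
                j += 1
    return pairs

# Hand-tagged example sentence (an assumption; a tagger would normally supply this).
tagged = [("fruits", "NNS"), ("such", "JJ"), ("as", "IN"), ("apples", "NNS"),
          (",", ","), ("ripe", "JJ"), ("pears", "NNS"), ("and", "CC"),
          ("plums", "NNS")]
print(hearst_such_as(chunk_noun_phrases(tagged)))
```

Other Hearst patterns (for example "NP and other NP" or "NP including NP") could be added as further matching functions over the same chunked token list.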

Final Project: Machine Translation. This system translates Hindi sentences into English. We use an RNN with LSTM units for this purpose. We then add attention at the global level, and finally apply Byte-Pair Encoding (BPE) to improve performance on unknown/out-of-vocabulary words.
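To show why Byte-Pair Encoding helps with out-of-vocabulary words, here is a minimal sketch of the merge-learning loop. It is a generic illustration, not the project's code: the function names, the `</w>` end-of-word marker, and the toy English word list are assumptions made for the example.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of training words."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges

def apply_bpe(word, merges):
    """Segment `word` (possibly unseen) by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

# Toy training words, invented for illustration only.
words = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = learn_bpe(words, 10)
print(apply_bpe("lowest", merges))
```

The word "lowest" never appears in the training list, yet it is still segmented into known subword units instead of being mapped to a single unknown token.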