TED-Talks-Analysis

An approach to kickstarting EDA on a dataset with many text, numerical, categorical, and datetime features, such as TED Talks, when domain knowledge is limited. The idea, approach, and code are generic, so they should apply to almost any dataset.

Contents:

Text Preprocessing
Creation of 200+ features, mostly on text columns, using basic NLP such as character/token counts, POS and NER tags, and sentiment
Understanding relations among columns by:
- Visualizing Correlation as Interactive Graphs (currently, unweighted)
- Feature Clustering based on Correlation
n-grams and Keyphrase extraction
A Talks Recommendation Engine
Topic Modelling and Text Clustering
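
To illustrate the feature creation and correlation-based feature clustering items above, here is a minimal sketch using only the Python standard library. It is not the notebook's actual code: the function names (`text_features`, `pearson`, `correlation_clusters`), the 0.8 threshold, and the toy rows are all assumptions made for illustration. Features are treated as nodes of an unweighted graph, with an edge wherever the absolute Pearson correlation reaches the threshold, and each connected component becomes one cluster.

```python
import math
from itertools import combinations
from statistics import mean


def text_features(text):
    """Return basic count features for one text field (hypothetical helper)."""
    return {"char_count": len(text), "token_count": len(text.split())}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_clusters(columns, threshold=0.8):
    """Group features linked by |r| >= threshold (unweighted graph components)."""
    parent = {name: name for name in columns}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in columns:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in clusters.values())


# Toy rows invented for this sketch; they are not taken from the TED dataset.
rows = [
    {"description": "a short talk", "views": 5},
    {"description": "a somewhat longer talk about data", "views": 1},
    {"description": "the longest description of them all, by a wide margin", "views": 4},
    {"description": "tiny", "views": 2},
]

# Build one numeric column per feature.
columns = {"views": [row["views"] for row in rows]}
for row in rows:
    for name, value in text_features(row["description"]).items():
        columns.setdefault(name, []).append(value)

clusters = correlation_clusters(columns, threshold=0.8)
print(clusters)
```

On these toy rows the two length features end up in one cluster and `views` stays on its own. With real data, a dataframe library and a dedicated graph library would likely be more practical than this hand-rolled version.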

Please find other input/intermediate files here

Files:

EDA_01.ipynb
LICENSE
README.md
contractions.py
most_common_english_words_1000__ef_com.txt
most_common_english_words_3000__ef_com.txt

The-Gupta/TED-Talks-Analysis
