NLP (Natural Language Processing)

What is NLP?

Natural Language Processing (NLP) is a field of artificial intelligence (AI) that focuses on enabling machines to understand, interpret and generate human language. NLP involves analyzing and processing large amounts of human language data, such as written text or speech, and extracting meaning and insights from it.

Applications of NLP

chatbots
voice assistants
sentiment analysis
language translation
text summarization

These are only a few of many applications. With the growing popularity of digital assistants and chatbots, NLP has become an essential tool for businesses to provide efficient and personalized customer service.

1) Regular Expression in NLP

Regular expressions (regex) are a pattern-matching language used to match, extract and manipulate text data in NLP. A regular expression is a sequence of literal characters and metacharacters that describes a particular pattern in a text string.
For example, regular expressions can be used to extract all email addresses or phone numbers from a text document, or to remove all punctuation marks or stop words from a piece of text.
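
As a minimal sketch of the punctuation-removal use case, the snippet below strips ASCII punctuation with Python's built-in re module. The sample sentence is made up for illustration, and building the character class from string.punctuation is one possible choice, not necessarily the approach used in the notebook.

```python
import re
import string

# Made-up sample sentence for illustration.
text = "Hello, world! NLP is fun; isn't it?"

# Build a character class that contains every ASCII punctuation mark.
# re.escape makes characters such as ] and \ safe inside the class.
punct_pattern = re.compile("[" + re.escape(string.punctuation) + "]")

# Replace each punctuation character with an empty string.
cleaned = punct_pattern.sub("", text)
print(cleaned)
```

Note that string.punctuation covers ASCII punctuation only, so other marks (for example curly quotes) would need to be added to the pattern.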
Extracting Phone Numbers

To extract 10-digit phone numbers, we use '\d' to match a single digit and {n} to repeat it exactly n times, so '\d{10}' matches 10 consecutive digits. Replace n with however many digits you want to match.
We use the findall function, which returns every piece of the text that matches our pattern.
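
A short sketch of this step, assuming a made-up sample string. The \b word boundaries are an addition for illustration: they stop the pattern from matching ten digits inside a longer run of digits.

```python
import re

# Hypothetical sample text; the numbers are invented for illustration.
text = "Call me on 9876543210 or reach the office at 1234567890."

# \d matches one digit and {10} repeats it exactly ten times.
# \b (an assumption added here) keeps matches from starting or ending
# in the middle of a longer digit sequence.
phone_pattern = r"\b\d{10}\b"

# findall returns a list with every non-overlapping match.
phone_numbers = re.findall(phone_pattern, text)
print(phone_numbers)
```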

Matching Random Pattern

Extracting Email Address

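Here we match the text against a pattern designed for email addresses: '[a-zA-Z0-9_]+@[a-zA-Z0-9_]+\.[a-zA-Z]+'
Here a-z means any character from a to z, and similarly for A-Z and 0-9. The + means one or more characters from the preceding set, and \. matches a literal dot (an unescaped . would match any character).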
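
A minimal sketch of email extraction with findall. The pattern below is a deliberately simplified one, written out with + quantifiers and an escaped dot; it is an assumption for illustration and will not cover every valid address (for example, dots in the name part or multi-level domains). The sample addresses are invented.

```python
import re

# Made-up sample text for illustration.
text = "Contact alice_01@example.com or bob@test.org for details."

# One or more allowed characters, an @ sign, one or more allowed
# characters, a literal dot, then one or more letters.
email_pattern = r"[a-zA-Z0-9_]+@[a-zA-Z0-9_]+\.[a-zA-Z]+"

emails = re.findall(email_pattern, text)
print(emails)
```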

You can view my full regular expression notes in my Jupyter notebook:
https://github.com/meet5398/NLP-Natural-Language-Processing-/blob/57611b2b14c58a205c3f93a264daa88f31acc341/regular%20expression%20in%20NLP.ipynb

2) Text Tokenization using spacy and nltk

Text tokenization involves breaking text into smaller units, or tokens, such as words or sentences. This process enables computers to analyze and understand human language.

Tokenization is a crucial step in many natural language processing tasks, including sentiment analysis, named entity recognition, and machine translation.
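
To make the idea concrete, here is a naive sketch of sentence and word tokenization using only Python's re module. It is an illustration of the concept, not how spacy or nltk tokenize internally; real tokenizers handle abbreviations, contractions and many other edge cases that these two patterns ignore.

```python
import re

# Made-up sample text for illustration.
text = "NLP is fun. Tokenization splits text into units!"

# Naive sentence tokens: split on whitespace that follows ., ! or ?
sentences = re.split(r"(?<=[.!?])\s+", text)

# Naive word tokens: runs of word characters, or single punctuation marks.
words = re.findall(r"\w+|[^\w\s]", sentences[0])

print(sentences)
print(words)
```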

Difference between spacy and nltk

spacy is an open-source software library for advanced natural language processing, written in Python and Cython. It provides a variety of tools for language understanding and processing, including named entity recognition, dependency parsing, and word vectors. Its tokenizer returns objects (a Doc made up of Token and Span objects) rather than plain strings.

nltk (Natural Language Toolkit) is a leading platform for building Python programs to work with human language data. It provides a range of tools for text processing and analysis, including tokenization, stemming, tagging, and parsing. Its tokenizers return plain Python strings.

Prerequisites

Before running the code, make sure you have the following installed:

Python 3.x
spacy library (can be installed via pip)
English language model for spacy (can be downloaded via python -m spacy download en_core_web_sm; the shorter python -m spacy download en only works on older spacy 2.x releases)
nltk library (can be installed via pip)
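
A small, optional sketch for checking the prerequisites from Python before running the notebooks. The helper name missing_packages is hypothetical; it only checks that the packages can be found for import, not that the spacy language model has been downloaded.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["spacy", "nltk"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("spacy and nltk are both available.")
```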

Some important screenshots:

With spacy, the output shows that each sentence is returned as an object.
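
A minimal sketch of the spacy behaviour, assuming spacy v3 is installed. It uses spacy.blank("en") with the rule-based sentencizer so that no language model download is needed; the notebook may load a full model instead. The sample text is invented.

```python
import spacy

# Blank English pipeline: tokenizer only, no model download required.
nlp = spacy.blank("en")
# Rule-based sentence splitter, added by name (spacy v3 style).
nlp.add_pipe("sentencizer")

doc = nlp("Tokenization is the first step. spacy returns objects.")

# Each sentence is a Span object; its text is available as .text
sentences = list(doc.sents)
for sent in sentences:
    print(type(sent).__name__, "->", sent.text)

# Each word is a Token object.
for token in doc:
    print(type(token).__name__, "->", token.text)
```

Because sentences and tokens are objects, attributes such as token.is_alpha or token.like_num are available without further parsing of the string.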

With nltk, the output shows that each sentence is returned as a string.
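
A minimal sketch of the nltk behaviour, assuming nltk is installed. It uses wordpunct_tokenize because that tokenizer is purely regex-based and needs no extra data; sent_tokenize and word_tokenize also return strings, but they require the Punkt tokenizer data to be downloaded first through nltk.download. The sample text is invented.

```python
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text for illustration.
text = "Tokenization is the first step. nltk returns strings."

# wordpunct_tokenize splits text into alphanumeric runs and punctuation runs.
tokens = wordpunct_tokenize(text)

print(tokens)
# Every token is a plain Python str, not a library-specific object.
print(type(tokens[0]).__name__)
```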

For more topics and code, you can view my full repository, where I have also uploaded 6 projects on NLP: https://github.com/meet5398/NLP-Natural-Language-Processing-
