# Part-of-Speech (POS) Tagging with NLTK
## Overview
This project demonstrates how to perform Part-of-Speech (POS) tagging on a given sentence using the Natural Language Toolkit (NLTK) library in Python. POS tagging involves labeling words in a sentence with their corresponding parts of speech, such as nouns, verbs, adjectives, etc.
## Features
- Tokenizes a given sentence into words.
- Tags each word with its part-of-speech label.
- Prints the sentence with each word and its POS tag.
## Prerequisites
- Python 3.x
- NLTK library
## Installation
1. **Clone the repository:**
```bash
git clone https://github.com/yourusername/pos-tagging-nltk.git
cd pos-tagging-nltk
-
Install the required libraries:
Make sure you have Python installed. Then, install NLTK using pip:
pip install nltk
-
Download NLTK data:
You need to download specific NLTK data files for tokenization and POS tagging. This is done automatically by the script if not already present:
import nltk nltk.download('punkt') nltk.download('averaged_perceptron_tagger')
-
Run the POS tagging script:
Navigate to the directory where the
pos_tagging.py
file is located and execute the script using Python:python pos_tagging.py
-
Input Sentence:
The script uses a predefined sentence. Modify the
sentence
variable in thepos_tagging.py
file to test with different sentences.
Here is a snippet of the POS tagging script:
import nltk
from nltk.tokenize import word_tokenize
from nltk import pos_tag
# Download NLTK data (only need to do this once)
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
# Function to perform POS tagging
def pos_tag_sentence(sentence):
tokens = word_tokenize(sentence)
tagged_tokens = pos_tag(tokens)
return tagged_tokens
# Example sentence
sentence = "The quick brown fox jumps over the lazy dog."
# Get POS tags
tagged_sentence = pos_tag_sentence(sentence)
# Print the sentence with POS tags
print("Sentence with POS tags:")
for word, tag in tagged_sentence:
print(f"{word}: {tag}")
For the example sentence "The quick brown fox jumps over the lazy dog."
, the output will be:
Sentence with POS tags:
The: DT
quick: JJ
brown: NN
fox: NN
jumps: VBZ
over: IN
the: DT
lazy: JJ
dog: NN
.: .
- DT: Determiner
- JJ: Adjective
- NN: Noun (singular or mass)
- VBZ: Verb, 3rd person singular present
- IN: Preposition or subordinating conjunction
- .: Punctuation mark
Contributions are welcome! If you have suggestions for improvements or new features, please fork the repository and create a pull request. For bug reports or feature requests, open an issue on the GitHub repository.
This project is licensed under the MIT License. See the LICENSE file for details.
For questions or feedback, please reach out to your-email@example.com.
### Instructions:
1. **Update GitHub URL:**
Replace `https://github.com/yourusername/pos-tagging-nltk.git` with the actual URL of your repository if you have one.
2. **Email Address:**
Replace `[your-email@example.com](mailto:your-email@example.com)` with your actual contact email.
3. **License:**
If you use a different license, update the license section accordingly. If you do not have a `LICENSE` file, you can remove that section or include license details directly in the `README.md`.
Save this content as `README.md` in the root directory of your project. It will provide users with the necessary information to get started with POS tagging using your script.