Using spaCy to work with text data: tokenization, normalization, and labeling, including an ethics review.

MLWorkshops/nlp-dealing-with-text-data

Dealing with Text Data Workshop

This workshop works through the following steps by example:

  1. Retrieve data
  2. Ethics checklist
  3. Tokenize
  4. Normalize
  5. Demo: Label data with doccano
  6. Convert to spaCy format
  7. Extra: Train a model
  8. References
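
As a rough illustration of steps 3, 4, and 6, here is a minimal sketch using spaCy. It is not the workshop's own solution code (that lives in the assets folder). The helper names `tokenize`, `normalize`, and `to_docbin` are hypothetical, and the sketch assumes labeled data has already been read into `(text, [(start, end, label), ...])` tuples rather than parsing a doccano export directly, since the export layout depends on the doccano version. A blank English pipeline is used so no trained model needs to be downloaded, and normalization here means only lowercasing and filtering, not lemmatization.

```python
import spacy
from spacy.tokens import DocBin

# A blank English pipeline contains only the rule-based tokenizer,
# so this sketch runs without downloading a trained model.
nlp = spacy.blank("en")


def tokenize(text):
    """Step 3 (hypothetical helper): split raw text into token strings."""
    return [token.text for token in nlp(text)]


def normalize(text):
    """Step 4 (hypothetical helper): lowercase tokens and drop
    punctuation, whitespace, and stop words."""
    return [
        token.lower_
        for token in nlp(text)
        if not (token.is_punct or token.is_space or token.is_stop)
    ]


def to_docbin(records):
    """Step 6 (hypothetical helper): convert character-offset annotations
    into spaCy's binary DocBin container.

    Assumes each record is (text, [(start, end, label), ...]).
    """
    doc_bin = DocBin()
    for text, annotations in records:
        doc = nlp.make_doc(text)
        spans = []
        for start, end, label in annotations:
            # char_span returns None when the offsets do not line up
            # with token boundaries; such annotations are skipped here.
            span = doc.char_span(start, end, label=label)
            if span is not None:
                spans.append(span)
        doc.ents = spans
        doc_bin.add(doc)
    return doc_bin


sample = "Seattle hosts an NLP workshop."
tokens = tokenize(sample)
normalized = normalize(sample)
doc_bin = to_docbin([(sample, [(0, 7, "LOC")])])
```

Calling `doc_bin.to_disk("train.spacy")` would then write the data to a file that spaCy's training command can read; the file name is only an example.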

Code and guides on dealing with text data for the Seattle Artificial Intelligence Workshops NLP series. Solutions to the exercises are in the assets folder of this repo.
