Detecting Phishing Emails with NLP and AI

This repository accompanies a blog post I wrote, which explains the choices I made while writing this code. The code trains models to detect phishing emails from their body text. I do not plan to maintain it; I uploaded it so that others can see and understand my process, and reproduce it if they wish. If you need further explanation or notice any egregious errors, open an issue or PR and I will try to address it. I apologize for the messy code: I wrote it for functionality, not maintainability.

This code requires several datasets to run. I placed the datasets I used in ../datasets/; the full list is in my blog post.
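Before running the notebook, it can help to confirm that the datasets are where the code expects them. The following is a minimal sketch, not part of the notebook: `list_dataset_files` is a hypothetical helper, and it assumes the notebook is run from the repository root so that ../datasets/ resolves one level up.

```python
from pathlib import Path

# Assumption: the working directory is the repository root, so the datasets
# directory described above sits one level up.
DATASETS_DIR = Path("..") / "datasets"


def list_dataset_files(datasets_dir: Path = DATASETS_DIR) -> list[Path]:
    """Return every file under datasets_dir, sorted by path.

    Raises FileNotFoundError with a readable message if the directory is missing.
    """
    if not datasets_dir.is_dir():
        raise FileNotFoundError(
            f"Expected datasets in {datasets_dir.resolve()}; "
            "download the datasets listed in the blog post first."
        )
    # rglob("*") walks subdirectories too; keep regular files only.
    return sorted(p for p in datasets_dir.rglob("*") if p.is_file())
```

Calling `list_dataset_files()` at the top of the notebook fails early with a clear message instead of partway through training.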

Spam wordlist

The spam wordlist (spam_wordlist.txt) was compiled from two sites: here and here
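How the notebook uses the wordlist is not described here. One plausible use is a simple count feature: how many wordlist phrases appear in an email body. The sketch below illustrates that idea under stated assumptions: the file is assumed to hold one phrase per line, and `load_wordlist` and `count_spam_hits` are hypothetical names, not functions from the notebook.

```python
import re
from pathlib import Path


def load_wordlist(path: Path) -> list[str]:
    """Read one phrase per non-empty line, lowercased (assumed file layout)."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip().lower() for line in lines if line.strip()]


def count_spam_hits(body: str, phrases: list[str]) -> int:
    """Count how many distinct phrases occur in body as whole words or phrases."""
    text = body.lower()
    hits = 0
    for phrase in phrases:
        # Lookarounds reject matches embedded in a longer word,
        # so "free" does not match inside "freebie".
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, text):
            hits += 1
    return hits
```

For example, `count_spam_hits(body, load_wordlist(Path("spam_wordlist.txt")))` yields one integer per email that could sit alongside other text features.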
