Email Datasets can be found here
Fraud Detection by finding the Person of Interest (POI)
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
A Person Of Interest identifier based on ENRON CORPUS data.
The fraud identification models were built using the Python scikit-learn machine-learning library.
CEREC and Seed corpus for coreference resolution for email threads taken from the Enron Corpus
[Incomplete] A Chrome extension that tells you whether the email you are currently drafting would be classified as spam.
A project on Extract-Transform-Load (ETL) operations performed on the emails from the infamous Enron corpus database.
Natural Language Processing (NLP) and programmatic data extraction in large scale fraud investigations.
Enron Email Analysis
📩 Modeling the Enron dataset of emails using graphs
Spam vs. non-spam text classification with a Convolutional Neural Network and word embeddings
📧 A data engineering exercise
This repository contains code for normalizing the Enron dataset.
Machine learning algorithms applied to explore the Enron email dataset and find patterns about the people involved in the scandal.
Phishing detection classifier to filter fraudulent and phishing emails.
Identifying and cleaning the outliers of the Enron Dataset.
Machine learning algorithms used to identify people possibly involved in the Enron fraud (Udacity project)
LT2212 V20 Assignment 3: Same-author classification via feed-forward neural networks: transformed email text (Enron) into a machine-readable representation and built a classifier that determines whether two texts were written by the same person.