Enron Dataset Normalization

This repository contains code for normalizing the Enron dataset, a collection of emails and other documents exchanged by employees of Enron Corporation, a major energy company that collapsed in 2001 following accounting fraud. The dataset is a valuable resource for researchers studying corporate fraud and other financial crimes.

# Files

The repository contains the following files:

Enron_Data_Normalization.ipynb: A Jupyter notebook containing the code that normalizes the Enron dataset.
requirements.txt: Lists the dependencies that must be installed to run the code.

# Instructions

To run the code, first install the dependencies listed in requirements.txt (for example, with pip install -r requirements.txt).

Then, open the Jupyter notebook and run the cells one by one.

# Dataset

The Enron dataset is not included in this repository. You can download the dataset from the following URL: https://www.cs.cmu.edu/~enron/

The code is written for a CSV version of the dataset, which is shared through Google Drive because GitHub restricts large file uploads: https://drive.google.com/file/d/1VLY0Xqhkg25FGuTIvUKiAfcGeX1fczQa/view?usp=drive_link
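
This README does not describe the notebook's normalization steps, so the following is only a minimal sketch of one plausible step: splitting each raw message into separate header and body fields. The column names (file, message), the sample row, and the function names are assumptions made for illustration, not taken from the shared CSV or the notebook.

```python
import csv
import io
from email import message_from_string
from email.utils import parsedate_to_datetime

# Illustrative sample only: the "file" and "message" column names and the
# addresses below are assumptions, not taken from the shared CSV.
SAMPLE_CSV = """file,message
user-a/sent/1,"Message-ID: <1@example.com>
Date: Mon, 14 May 2001 16:39:00 -0700 (PDT)
From: alice@example.com
To: bob@example.com
Subject: Forecast

Here is our forecast"
"""


def parse_date(value):
    """Return an ISO 8601 string for an RFC 2822 date header, or None."""
    try:
        return parsedate_to_datetime(value).isoformat()
    except (TypeError, ValueError):
        # Missing or malformed Date header.
        return None


def normalize_row(row, message_col="message"):
    """Split one raw message into separate header and body fields."""
    msg = message_from_string(row[message_col])
    return {
        "file": row["file"],
        "date": parse_date(msg["Date"]),
        "sender": msg["From"],
        "recipients": msg["To"],
        "subject": msg["Subject"],
        "body": msg.get_payload().strip(),
    }


def normalize(csv_file):
    """Read rows from an open CSV file and return a list of flat records."""
    return [normalize_row(row) for row in csv.DictReader(csv_file)]


emails = normalize(io.StringIO(SAMPLE_CSV))
print(emails[0]["sender"], emails[0]["date"])
```

To try this on the downloaded file instead of the in-memory sample, pass a file opened with newline="" to normalize, and adjust the column names to match the actual CSV header.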

Repository: RutujChheda/Enron_Emails_Dataset_Processed
