Short analysis of the domains, top-level domains, and geographical origin of mail addresses.

RobinMaas95/mail_analysis

About

A small script that analyzes a list of email addresses. It counts the occurrences of domains, top-level domains, and countries of origin (as far as these can be determined automatically; some manual work may be necessary). The countries of origin are also plotted on a world map to give a geographical impression.
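The counting step described above can be illustrated with a minimal sketch that uses only the standard library. It is not the repository's implementation: the helper `split_address` and the sample addresses are hypothetical, and the sketch assumes every address contains an `@`.

```python
from collections import Counter


def split_address(address):
    """Return (domain, tld) for one address, lower-cased (hypothetical helper)."""
    # Everything after the last "@" is treated as the domain.
    domain = address.strip().lower().rsplit("@", 1)[1]
    # The top-level domain is the label after the last dot.
    tld = domain.rsplit(".", 1)[-1]
    return domain, tld


# Made-up sample data, not taken from the repository.
addresses = [
    "anna@gmail.com",
    "ben@web.de",
    "Carla@GMAIL.com",
    "dan@uni-koeln.de",
]

pairs = [split_address(a) for a in addresses]
domain_counts = Counter(domain for domain, _ in pairs)
tld_counts = Counter(tld for _, tld in pairs)

print(domain_counts.most_common())
print(tld_counts.most_common())
```

Lower-casing before counting keeps `GMAIL.com` and `gmail.com` in the same bucket, which is usually what an occurrence overview needs.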

Example of results for a list of generated mail addresses.

Domain Overview:

Domain Provider:

Note: Because the provider feature was added later, this plot was created from newly generated data and therefore does not match the values in the other plots.

Top Level Domain Overview:

Origin Overview:

Origin World Map:

Built With

  • Python
    • geopandas
    • matplotlib
    • pandas
    • poetry
    • pycountry
    • seaborn

Getting Started

Prerequisites

  • Python 3.7.6 (probably also higher versions, but not tested)
  • Internet access is required to fetch the latitude/longitude overview by melanieshi0120. Alternatively, download this file in advance and adapt the reference in the code (COUNTRIES_LATITUDE_LONGITUDE).

Installation

Recommended: Use poetry inside the repository directory to create a new virtual environment with all necessary modules:

poetry install

Alternatively: Use your preferred method to install python packages and handle environments (pip, conda, venv, ...). See requirements.txt for a summary of all necessary modules.

Usage

Code adaptation

  • Replace example mail addresses in EMAIL_ADDRESSES_STRING (mail_data.py) with your mail addresses.
  • If necessary, replace examples in GENERIC_DOMAINS_TO_COUNTRY (mail_data.py) with domains relevant for you.
  • If necessary, extend COUNTRY_SPECIFIC_DOMAINS_TO_COUNTRY (__main__.py) with new top-level domains (PRs are welcome!)
  • If necessary, adapt the file format of the created plots via EXPORT_FILETYPE (__main__.py), e.g. pdf, png, ...
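How the two lookup tables named above might work together can be sketched as follows. This is an illustration under assumptions, not the repository's code: the dictionary shapes (full domain to country, top-level domain to country), the example entries, and the function `guess_country` are all hypothetical, and the real constants in mail_data.py and __main__.py may be structured differently.

```python
# Assumed shape: full domain -> country, for domains whose
# top-level domain says nothing about the country (hypothetical entries).
GENERIC_DOMAINS_TO_COUNTRY = {
    "example.com": "Germany",
    "example.org": "France",
}

# Assumed shape: country-code top-level domain -> country (hypothetical excerpt).
COUNTRY_SPECIFIC_DOMAINS_TO_COUNTRY = {
    "de": "Germany",
    "fr": "France",
    "nl": "Netherlands",
}


def guess_country(domain):
    """Return a country name for a domain, or None if it cannot be resolved."""
    domain = domain.lower()
    # An explicit per-domain entry takes precedence.
    if domain in GENERIC_DOMAINS_TO_COUNTRY:
        return GENERIC_DOMAINS_TO_COUNTRY[domain]
    # Otherwise fall back to the top-level domain.
    tld = domain.rsplit(".", 1)[-1]
    return COUNTRY_SPECIFIC_DOMAINS_TO_COUNTRY.get(tld)


for d in ["example.com", "uni.nl", "unknown.xyz"]:
    print(d, "->", guess_country(d))
```

Domains that resolve to `None` in a scheme like this are the ones that would need the manual work mentioned in the About section.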

Run the code

  • Activate the virtual environment (poetry shell or by selecting it in your editor)
  • Call the script (e.g. from inside the repo directory):
python src/__main__.py

Support

Reach out to the maintainer at one of the following places:

Contributing

First off, thanks for taking the time to contribute! Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make will benefit everybody else and are greatly appreciated.

Please read our contribution guidelines, and thank you for being involved!

License

This project is licensed under the MIT license.

See LICENSE for more information.