Random Data Generation and Basic Statistical Analysis

Overview

This project generates a synthetic dataset by sampling from several statistical distributions. The dataset includes values from Normal, Uniform, Exponential, and Binomial distributions, plus a column of random integers, so the shape and spread of each kind of data can be compared side by side.

The dataset is designed for educational purposes, offering a practical example of how to generate and analyze random data.

Dataset Generation

Key Features

  • Data Sources: Data is generated using Python libraries such as NumPy and Pandas.
  • Distributions:
    • Normal Distribution: Simulates continuous data with a Gaussian distribution.
    • Uniform Distribution: Provides values within a specified range.
    • Exponential Distribution: Models the waiting time between independent events that occur at a constant average rate.
    • Random Integers: Simulates discrete values.
    • Binomial Distribution: Counts the successes in a fixed number of independent binary (success/failure) trials.
  • Statistics: Descriptive statistics including mean, median, and standard deviation are computed.
  • Visualizations: Histograms are created to observe the distribution patterns.
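
As a minimal sketch of the generation step described above, the following builds a five-column DataFrame with NumPy and Pandas. The column names, sample size, seed, and every distribution parameter (mean, range, scale, number of trials, and so on) are assumptions chosen for illustration; the values used in App.py may differ.

```python
import numpy as np
import pandas as pd


def generate_dataset(n_rows: int = 1000, seed: int = 42) -> pd.DataFrame:
    """Draw n_rows samples from each distribution (all parameters are illustrative)."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducible output
    return pd.DataFrame({
        "Normal": rng.normal(loc=0.0, scale=1.0, size=n_rows),
        "Uniform": rng.uniform(low=0.0, high=1.0, size=n_rows),
        "Exponential": rng.exponential(scale=1.0, size=n_rows),
        # integers() excludes the upper bound, so values fall in 0..99
        "Random_Integers": rng.integers(low=0, high=100, size=n_rows),
        # number of successes in 10 trials with success probability 0.5
        "Binomial": rng.binomial(n=10, p=0.5, size=n_rows),
    })


df = generate_dataset()
print(df.head())
```

Passing a fixed seed keeps the output reproducible between runs, which is convenient when the dataset is used for teaching or testing.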

Tools & Technologies

  • Python: For data generation and analysis.
  • NumPy: For numerical operations and random data generation.
  • Pandas: For data manipulation and analysis.
  • Matplotlib: For plotting visualizations.
  • Seaborn: For enhanced data visualization.

Dataset Information

The generated dataset includes the following columns:

  • Normal Distribution: Values drawn from a Gaussian distribution.
  • Uniform Distribution: Values uniformly distributed between specified limits.
  • Exponential Distribution: Values following an exponential distribution.
  • Random Integers: Integer values within a specified range.
  • Binomial Distribution: Success counts from a binomial distribution, i.e. the number of successes in a fixed number of binary trials.
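
One way to check columns like these is to compare their descriptive statistics with the theoretical values. The sketch below assumes illustrative parameters (Normal with mean 5 and standard deviation 2, Exponential with scale 3, Binomial with 10 trials and success probability 0.3); none of these come from the project itself.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 50_000  # large sample so the estimates sit close to theory

df = pd.DataFrame({
    "Normal": rng.normal(5.0, 2.0, n),
    "Exponential": rng.exponential(3.0, n),
    "Binomial": rng.binomial(10, 0.3, n),
})

# Mean, median, and standard deviation per column (one row per column after .T)
stats = df.agg(["mean", "median", "std"]).T

# Theoretical means: mu for Normal, scale for Exponential, n * p for Binomial
stats["theoretical_mean"] = [5.0, 3.0, 10 * 0.3]
print(stats.round(3))
```

For the Exponential column the median falls well below the mean (in theory scale × ln 2), which is a quick numerical sign of its right skew.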

Visualizations

The project includes histograms for each type of distribution:

  • Normal Distribution Histogram: Shows the distribution of values from the Gaussian distribution.
  • Uniform Distribution Histogram: Displays the range and frequency of uniformly distributed values.
  • Exponential Distribution Histogram: Illustrates the spread of values from the exponential distribution.
  • Random Integers Histogram: Visualizes the frequency of discrete integer values.
  • Binomial Distribution Histogram: Shows how often each success count occurs across the binomial samples.
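
A hedged sketch of how such histograms could be drawn with Matplotlib is shown below. The data, bin counts, and output file name (histograms.png) are assumptions for illustration; Seaborn's histplot could be substituted for ax.hist if a smoothed density overlay is wanted.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Normal": rng.normal(0.0, 1.0, 2000),
    "Exponential": rng.exponential(1.0, 2000),
    "Binomial": rng.binomial(10, 0.5, 2000),
})

fig, axes = plt.subplots(1, len(df.columns), figsize=(12, 3.5))
for ax, column in zip(axes, df.columns):
    values = df[column]
    if pd.api.types.is_integer_dtype(values):
        # Discrete data: one bin per integer, centered on the integer
        bins = np.arange(values.min(), values.max() + 2) - 0.5
    else:
        # Continuous data: a fixed number of equal-width bins
        bins = 30
    ax.hist(values, bins=bins, edgecolor="white")
    ax.set_title(f"{column} Histogram")
    ax.set_xlabel("Value")
    ax.set_ylabel("Frequency")

fig.tight_layout()
fig.savefig("histograms.png")
```

Centering one bin on each integer avoids the uneven bars that appear when equal-width bins straddle several discrete values.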

Project Structure

How to Use the Project

  1. Run the Script: Execute App.py to generate the dataset and visualizations. Because the interface is built with Streamlit, this is typically done with: streamlit run App.py
  2. Explore Visualizations: Use the Streamlit interface to select columns and view histograms.
  3. Download Data: Use the download button to save the generated dataset as a CSV file.
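
The download step amounts to serializing the DataFrame as CSV. The sketch below shows that serialization with Pandas alone; the helper name dataframe_to_csv_bytes and the sample columns are hypothetical. In a Streamlit app, bytes like these would typically be passed to st.download_button through its data argument, though how App.py wires this up is not shown here.

```python
import io

import numpy as np
import pandas as pd


def dataframe_to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 CSV bytes without the index column."""
    return df.to_csv(index=False).encode("utf-8")


rng = np.random.default_rng(7)
df = pd.DataFrame({
    "Uniform": rng.uniform(0.0, 1.0, 5),
    "Random_Integers": rng.integers(0, 100, 5),
})

csv_bytes = dataframe_to_csv_bytes(df)

# Round trip: reading the bytes back should reproduce the same table
restored = pd.read_csv(io.BytesIO(csv_bytes))
print(restored.shape)  # (5, 2)
```

Dropping the index keeps the file limited to the generated columns, so it loads cleanly in a spreadsheet or another script.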

Requirements

  • Install the necessary Python libraries:
    pip install -r requirements.txt

Insights and Recommendations

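  • Distribution Patterns: Compare the histograms to see how each distribution shapes the data, for example the symmetric bell of the Normal against the right-skewed decay of the Exponential.
  • Data Analysis: Use the generated dataset for teaching, for testing analysis code, and as a starting point for further exploration.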

License

This project is licensed under the MIT License - see the LICENSE file for details.

Connect with Me