A comprehensive exploration of Statistics and Probability Theory concepts, with practical implementations in Python
Updated Nov 3, 2024 - Jupyter Notebook
Deep R Programming (Open-Access Textbook)
Using Python, learn statistical and probabilistic approaches to understand and gain insights from data. Learn the statistical concepts that matter most in data science and how to apply them in Python, along with NumPy and the pandas DataFrame.
This repository contains a gentle introduction to machine learning algorithms with hands-on practical examples.
Learn the core statistical concepts, followed by application of these concepts using RStudio, with a nice combination of theory and practice. Learn key statistical concepts and techniques like exploratory data analysis, correlation, regression, and inference.
The following problems showcase different statistical methods used for decision making. The purpose of this project is to experiment with and execute the statistical methods required to conduct data analysis, derive insights and inferences, and arrive at business decisions.
This portfolio features all the Data Science and Machine Learning projects I have completed for academic, self-learning and hobby purposes. Additionally, it is updated regularly.
This repository contains notes on statistics and probability for data science, from basic to advanced.
WHO LIFE EXPECTANCY: Studying the factors that affect or contribute to life expectancy and analyzing the changes over 15 years, between 2000 and 2015.
Using boxplots to investigate US hospital healthcare costs
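As a rough illustration of what a boxplot summarizes, the sketch below computes quartiles, the interquartile range, and outliers under the conventional 1.5 × IQR rule. The function name `boxplot_summary` and the cost figures are hypothetical and invented for this example; they are not taken from the repository.

```python
from statistics import quantiles

def boxplot_summary(values):
    """Return the quartiles, IQR, and 1.5 * IQR outliers of a sample."""
    # method="inclusive" interpolates linearly between sorted data points.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical per-stay costs in dollars, invented purely for illustration.
costs = [4400, 5100, 4800, 5300, 4700, 5000, 4900, 19500]
summary = boxplot_summary(costs)
print(summary)
```

Points outside the fences are the ones a boxplot would typically draw as individual markers beyond the whiskers.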
The Poisson distribution models the number of events that occur within a fixed time frame, such as a year. Since the volume of incoming calls fluctuates from year to year, this distribution helps determine whether the call data is consistent with a Poisson process or whether external factors are affecting the call volume.
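One quick check for Poisson-like behavior is the dispersion index, the ratio of sample variance to sample mean, which should be close to 1 because a Poisson distribution has equal mean and variance. This is a minimal sketch, not the repository's method: the function name `dispersion_index` and the yearly counts are assumptions made up for illustration.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Ratio of sample variance to sample mean; near 1 for Poisson-like counts."""
    m = mean(counts)
    if m <= 0:
        raise ValueError("mean of counts must be positive")
    return variance(counts) / m

# Hypothetical yearly call counts, invented purely for illustration.
yearly_calls = [8, 15, 9, 17, 11, 13, 7, 16]
index = dispersion_index(yearly_calls)
print(f"dispersion index: {index:.2f}")
```

A value well above 1 (overdispersion) would hint that something beyond a constant-rate Poisson process is driving the counts. A formal goodness-of-fit test would be the natural next step with a larger sample.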
This repository contains a collection of Jupyter Notebooks for conducting Exploratory Data Analysis (EDA) and Statistical Analysis on various datasets.
This repository includes all the assignments completed for IDS702: Modelling & Representation of Data in the Duke MIDS program.
This repo contains both problem set solutions for the Applied Regression course.
The Poisson distribution is a useful model for analyzing product defects, helping to estimate expected defect rates, their variability, and the likelihood of extreme cases. This understanding aids in enhancing quality control processes and minimizing defects.
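The three quantities mentioned here (expected rate, variability, and the likelihood of extreme cases) can be sketched with the standard library alone. The defect rate `lam = 3.0` and the threshold of 8 defects are hypothetical values chosen for illustration, and the helper names are assumptions rather than anything from the repository.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the lower sum."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

lam = 3.0                  # hypothetical average defects per batch
expected = lam             # the mean of a Poisson distribution is lam
std_dev = math.sqrt(lam)   # the variance is also lam, so the std dev is sqrt(lam)
p_extreme = poisson_tail(8, lam)  # chance of 8 or more defects in a batch

print(f"expected={expected}, std_dev={std_dev:.3f}, P(X>=8)={p_extreme:.4f}")
```

A small tail probability for a count that is actually observed would suggest the process is no longer running at the assumed rate.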
Data analytics project
This project uses statistical hypothesis testing to examine how cholesterol and fasting blood sugar levels relate to heart disease. One-sample t-tests and binomial tests are applied to assess whether these health metrics differ significantly from expected values, focusing on their association with heart disease.
A collection of Jupyter Notebooks in both English and Spanish, dedicated to data quality analysis using the R programming language
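The two tests named above can be sketched as follows, using only the standard library. All numbers are invented for illustration: the cholesterol readings, the reference value of 240, and the binomial inputs (9 of 20 patients, null proportion 0.25) are assumptions, not data from the project, and the helper names are hypothetical.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))

def binom_upper_tail(k, n, p0):
    """Exact one-sided p-value P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(
        math.comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )

# Hypothetical cholesterol readings (mg/dL), invented for illustration.
cholesterol = [250.0, 262.0, 241.0, 275.0, 268.0, 255.0, 259.0, 270.0]
t_stat = one_sample_t(cholesterol, mu0=240.0)  # 240 is an assumed reference value
dof = len(cholesterol) - 1                      # degrees of freedom for the t-test

# Hypothetical: 9 of 20 patients with elevated fasting blood sugar,
# tested against an assumed null proportion of 0.25.
p_value = binom_upper_tail(k=9, n=20, p0=0.25)

print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
print(f"one-sided binomial p-value = {p_value:.4f}")
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value. Libraries such as SciPy provide that step directly.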
Summary of Assignment Two from the first semester of the MSc in Data Analytics program. This repository contains the CA2 assignment guidelines from the college and my submission. To see all original commits and progress, please visit the original repository using the link below.