Data Scientist • Data Explorer • Problem Solver • Data Analyst • Data Visualization • Python Developer • SQL • Machine Learning
- Graduating in 2024 from IMF Smart Education --> Data Science & Business Analytics
- My main tools are Python and SQL
- Libraries: NumPy, pandas, SciPy, Matplotlib, Seaborn, Keras, TensorFlow, OpenCV, scikit-learn, statsmodels
- Strong in algorithms and data structures
- Supervised and unsupervised learning models | Reinforcement learning | Deep learning models
- Databases: MySQL, Microsoft SQL Server, MongoDB, Neo4j
- Databricks (Scala, PySpark)
- Power BI, Tableau
- I'm continuously learning --> Cloud services and DevOps
- Hobbies: Cars - Football - Working out - Coding
- How to reach me: dariodellagostino@gmail.com
-
Analysis of NYC's public transportation system. A quick and effective way to draw conclusions when working with univariate time series.
-
Implementation of the vector autoregression (VAR) statistical model to forecast a set of time-dependent variables. A thorough, detailed analysis, presented as a final master's project:
- EDA
- Split the series into training and test sets
- Stationarity test
- Transformation of the training series
- Construction of a VAR model
- Granger Causality
- Model diagnosis
- Forecast
- Inverse transformation of the forecast
- Forecast evaluation
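
Three of the steps above (the train/test split, the transformation of the training series, and the inverse transformation of the forecast) can be illustrated with a minimal sketch. It assumes first-order differencing as the transformation, which the project does not confirm, and uses a hypothetical toy series. The "forecast" here is only a stand-in built from the true differences, not the output of a VAR model, so the round trip can be checked exactly.

```python
def train_test_split_series(series, test_size):
    """Chronological split: the last `test_size` points form the test set."""
    return series[:-test_size], series[-test_size:]


def difference(series):
    """First-order differencing, a common transformation toward stationarity."""
    return [b - a for a, b in zip(series, series[1:])]


def invert_difference(last_observed, diffs):
    """Undo first-order differencing by accumulating from the last known level."""
    levels = []
    current = last_observed
    for d in diffs:
        current += d
        levels.append(current)
    return levels


# Hypothetical toy series (not data from the project).
series = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
train, test = train_test_split_series(series, test_size=2)
train_diff = difference(train)

# Stand-in for a model forecast on the differenced scale (assumption):
# the true differences over the test window.
forecast_diff = difference([train[-1]] + test)

# Bring the forecast back to the original scale before evaluating it.
forecast_levels = invert_difference(train[-1], forecast_diff)
print(forecast_levels)
```

In a real run, `forecast_diff` would come from the fitted model, and the evaluation would compare `forecast_levels` against `test`.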
-
Binary classification for bank churn analysis. A useful model for any service company, able to predict which clients are likely to leave:
- EDA
- Visualizations
- Logistic Regression
- Metrics: ROC curve and AUC; correlation matrix
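
As a rough illustration of the AUC metric listed above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive (churned) client receives a higher score than a randomly chosen negative one. The labels and scores are hypothetical, and the function `roc_auc` is written here for illustration; it is not the project's code.

```python
def roc_auc(y_true, y_score):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = churned) and predicted churn probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)
```

The pairwise loop is quadratic in the number of samples, so it is only meant for small examples; library implementations use a sort-based approach.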
-
An effective model applicable to any company. Reports of this type would normally be presented to the marketing department, but the objective of this project is to demonstrate that you do not have to be an expert to turn data into relevant information. The technologies used are:
- SQL Server: to extract the data.
- Python: to develop the script; all of the code is written in Python.
- Power BI: used for the final report, with DAX queries.
-
Price prediction is one of the most common regression problems in data analysis. This project works through it step by step, from exploration to evaluation:
- EDA
- Visualizations
- Linear Regression: Simple linear regression and Multiple linear regression
- Regularization techniques: Ridge regularization and Lasso regularization
- Metrics: mean squared error (MSE), root mean squared error (RMSE), coefficient of determination (R²)
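
The three metrics listed above follow directly from their definitions. The sketch below is a plain-Python illustration with hypothetical prices and predictions; the function names are chosen here for clarity and are not taken from the project.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: same units as the target variable."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed prices and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print(mse(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

RMSE is often the easiest of the three to communicate because it is expressed in the same units as the price itself.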
-
An in-depth analysis of a store, with queries ranging from intermediate to more advanced and detailed.
The goal is to understand sales levels, product-customer relationships, profitability, supplies, and other relevant information:
- MySQL
- Python
- Jupyter Notebook
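
To give a flavor of this kind of query, here is a minimal sketch of a revenue and profit aggregation per customer. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, and the `orders` table, its columns, and its rows are all hypothetical, not the project's schema.

```python
import sqlite3

# In-memory database as a stand-in for the store's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        customer   TEXT,
        product    TEXT,
        quantity   INTEGER,
        unit_price REAL,
        unit_cost  REAL
    );
    INSERT INTO orders VALUES
        ('Ana',  'Keyboard', 2, 30.0, 20.0),
        ('Ana',  'Mouse',    1, 15.0, 10.0),
        ('Luis', 'Keyboard', 1, 30.0, 20.0);
    """
)

# Revenue and profit per customer, highest revenue first.
query = """
    SELECT customer,
           SUM(quantity * unit_price)               AS revenue,
           SUM(quantity * (unit_price - unit_cost)) AS profit
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC;
"""
rows = conn.execute(query).fetchall()
conn.close()

for customer, revenue, profit in rows:
    print(customer, revenue, profit)
```

The `SELECT` statement itself uses only standard SQL aggregation, so the same query should carry over to MySQL with a matching table.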
-
Convolutional Neural Networks (CNN)
Convolutional neural networks (CNNs) specialize in processing images and video. They enable applications such as object detection, person identification, and autonomous driving.
- Object localization