📈 Data Scientist & Data Engineer | Master's in Statistics and Data Science @ Rutgers University | Passionate about Machine Learning & Data-Driven Insights 🌟
I'm Rohit Kulkarni, a Data Scientist and Engineer with a passion for using data to drive impactful decisions. I'm currently pursuing my Master's in Statistics & Data Science at Rutgers University. I specialize in statistical modeling, machine learning, and data engineering.
Connect with me on 📧 LinkedIn | 📧 Email
🎓 Master of Science in Statistics & Data Science at Rutgers University
💼 Data Engineer & Data Scientist at Fractal Analytics
- Developed a portfolio optimization solution for a prominent CPG company to help identify delisting opportunities for underperforming products
- Automated 10+ end-to-end ETL and CI/CD pipelines, reducing manual activities by over 40%
- Migrated 60+ notebooks from Python to PySpark, improving runtime by 85%
- Led the technical activities of the US track of the project, managing a team of 3
📊 Skills and Certifications
- Statistical Modeling | Machine Learning | Data Wrangling | Data Engineering | Cloud Computing | Data Mining
- Python | R | SQL | PySpark | Microsoft Azure | Power BI | Hadoop | Apache Spark
- Microsoft Certified Azure Data Engineer Associate (DP-203)
🚀 Projects
- Enhancing Predictive Model Reliability with Bootstrap Techniques: Applied both the standard bootstrap and the Bag of Little Bootstraps (BLB) to assess the reliability of predictive models on large datasets, using BLB to make uncertainty estimation more scalable and computationally efficient
- Optimized E-Commerce Sales Analysis with Azure ETL Pipeline: Built an ETL pipeline with Microsoft Azure and PySpark to process and analyze e-commerce sales data and surface actionable insights
- Automated ETL Pipeline for Enhanced Movie Data Insights: Developed an automated ETL pipeline using Microsoft Azure to process and analyze IMDb movie ratings data, with integration and storage for reporting
- NFL Player Evaluation: Conducted regression analysis and hypothesis testing to evaluate NFL players, establishing the significance of key factors beyond physical attributes
- Flight Price Estimation: Predicted flight prices using several regression algorithms, including XGBoost, SVR, and RandomForestRegressor, achieving a 95% accuracy score
- Customer Churn Rate Prediction: Analyzed customer retention in online food sales and used machine learning models to predict customer churn with 92% classification accuracy
More projects in my GitHub repo.