Hi there! This documentation is a quick snapshot of my projects in the data field, showcasing my skills and experience in this area.
Table of Contents:
- Data Engineering
- Python: Data Analysis and Machine Learning
- SQL
- Dashboard
- Completed Courses and Certifications
Project Link | Associated | Tools | Project Description |
---|---|---|---|
IYKRA | Python, GCP (Google Cloud Storage, BigQuery), Spark, Kafka, Looker Studio | Developed and implemented an end-to-end ETL pipeline for processing online payment transaction data. The pipeline integrated batch and streaming processing, transformed raw data using Spark, built a data warehouse using a fact and dimension model, sent notifications when fraudulent activity was detected, and created a reporting dashboard with Looker Studio. |
Project Link | Associated | Area | Library | Project Description |
---|---|---|---|---|
Predict Loan Default Customers | VIX - Home Credit Indonesia: Data Scientist | Data Wrangling, EDA, Supervised Learning - Classification | pandas, matplotlib, seaborn, scikit-learn, scipy | Predicted which customers would default or experience payment difficulties. Conducted data cleansing on raw data and analyzed over 100 features using statistical methods for feature selection. The best model, a Logistic Regression, achieved an accuracy of 87% and an AUC of 73%. Created a simulation by deploying a web application for loan approval prediction using Streamlit. |
Telco Customer Churn | FGA x Binar Academy: Data Science [Team] | Data Wrangling, EDA, Supervised Learning - Classification | pandas, matplotlib, seaborn, scikit-learn, shap | Developed a machine learning model to predict customer churn in a telecom company. The Random Forest model yielded the highest accuracy score, reaching 89%, with the most influential feature being the total day charge. A higher charge indicates a higher potential for customer churn. |
Predict Clicked Ads Customer Classification | Mini Project by Rakamin Academy | Data Wrangling, EDA, Supervised Learning - Classification | pandas, matplotlib, seaborn, scikit-learn, shap, etc. | Developed a machine learning model and experimented with various algorithms, ultimately determining that the Random Forest model achieved the best fit, with an accuracy of 96% in identifying users likely to click on advertisements. Analyzed key influential features with SHAP to enhance targeting for improved conversion rates and cost efficiency. |
Predict Customer Personality to Boost Marketing Campaign | Mini Project by Rakamin Academy | Data Wrangling, EDA, Unsupervised Learning - Clustering | pandas, matplotlib, seaborn, scikit-learn, yellowbrick | Analyzed customer characteristics of an e-grocery store by creating a clustering model using K-means. Before clustering, decomposition was performed, and the best number of clusters was determined using the inertia or distortion score. This resulted in 4 clusters based on customer behavior, considering factors such as the number of transactions, spending levels, response to campaigns, and website visit frequency. |
Investigate Hotel Business using Data Visualization | Mini Project by Rakamin Academy | Data Wrangling, EDA, Data Visualization | pandas, matplotlib, seaborn | Analyzed the performance of City Hotels and Resort Hotels, identifying the most frequently visited hotel type and exploring the relationships between booking cancellations, length of stay, and lead time through Python visualization. Identified potential causes for these patterns and provided business recommendations based on the analysis. |
Data Quality Assessment and Customer Segmentation | VIX - KPMG Australia: Data Analytic Consulting | Data Wrangling, EDA, RFM analysis | pandas, matplotlib, seaborn | Developed and optimized a bike company's market strategy by analyzing its data. Conducted a data quality assessment and identified strategies to mitigate data quality issues. Performed customer segmentation using a simple RFM (Recency, Frequency, Monetary) analysis to recommend potential new customers for targeted marketing. Visualized insights about the targeted customer demographics on a dashboard. |
Online Shoppers Purchasing Intention | Final Project - Rakamin Academy [Team] | Data Wrangling, EDA, Supervised Learning - Classification | pandas, matplotlib, seaborn, scikit-learn, shap | Built a model to predict which website visitors are likely to make a purchase. After testing several algorithms, Random Forest with hyperparameter tuning demonstrated the best performance, achieving a ROC-AUC score of 90%. Through simulation, it was projected that this model could increase the conversion rate by 58%. |
Assignment - Rakamin Academy [Team] | Data Wrangling, EDA, Unsupervised Learning - Clustering | pandas, matplotlib, seaborn, scikit-learn, yellowbrick | Developed a clustering model using LRFMC scores and the K-Means algorithm, which identified 5 customer clusters: New Users, Loyal Customers (20%), Potential Loyalists/The Champion (19%), Need Attention (18%), and Hibernating (16%). |
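
Several projects above rely on RFM (Recency, Frequency, Monetary) segmentation. The sketch below illustrates the general idea only and is not the code from those projects: it uses just the Python standard library, and the sample transactions, the function names (`rfm_table`, `rank_scores`, `rfm_scores`), the snapshot date, and the three-bin rank scoring are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((snapshot - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


def rank_scores(metric, higher_is_better=True, bins=3):
    """Score each customer from 1 (worst) to `bins` (best) by rank."""
    # Order customers from worst to best, then cut the ranking into equal bins.
    ordered = sorted(metric, key=metric.get, reverse=not higher_is_better)
    n = len(ordered)
    return {cid: 1 + (i * bins) // n for i, cid in enumerate(ordered)}


def rfm_scores(rfm, bins=3):
    """Return {customer_id: (r_score, f_score, m_score)}."""
    # A smaller recency (a more recent purchase) is better, so invert that ranking.
    r = rank_scores({c: v[0] for c, v in rfm.items()}, higher_is_better=False, bins=bins)
    f = rank_scores({c: v[1] for c, v in rfm.items()}, bins=bins)
    m = rank_scores({c: v[2] for c, v in rfm.items()}, bins=bins)
    return {c: (r[c], f[c], m[c]) for c in rfm}


if __name__ == "__main__":
    # Hypothetical sample data: (customer_id, purchase_date, amount)
    sample = [
        ("A", date(2024, 1, 5), 120.0),
        ("A", date(2024, 3, 20), 80.0),
        ("B", date(2023, 11, 2), 40.0),
        ("C", date(2024, 3, 28), 300.0),
        ("C", date(2024, 2, 14), 150.0),
        ("C", date(2024, 1, 30), 50.0),
    ]
    table = rfm_table(sample, snapshot=date(2024, 4, 1))
    for customer, scores in rfm_scores(table).items():
        print(customer, table[customer], scores)
```

Customers with high scores on all three dimensions are the natural candidates for loyalty campaigns, while low recency scores flag customers at risk of lapsing. With a real dataset, the same aggregation would more likely be done with a pandas `groupby`.
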
Project Link | Associated | Area | Tools | Project Description |
---|---|---|---|---|
Credit Card Customer Churn Analysis | VIX - BTPN Syariah: Data Engineer | Data analysis | PostgreSQL, DBeaver, Tableau for Visualization | Created tables, loaded data into the database, and designed a star schema. Subsequently, conducted data exploration to identify customer profiles and characteristics related to churn. This analysis allowed for an examination of customer behavior based on demographic information, their relationship with the bank, and transaction history. |
Maven Fuzzy Factory | Advanced SQL: MySQL Data Analysis and Business Intelligence | Data analysis | MySQL, MySQL Workbench | A course-based project analyzing the performance of an e-commerce business and answering various business questions using SQL, covering topics such as traffic, website measurement, product analysis, and user-level analysis. [Documentation is currently in progress.] |
Analyzing eCommerce Business Performance | Mini Project by Rakamin Academy | Data analysis | PostgreSQL, pgAdmin, Excel for Visualization | Evaluated the business performance of e-commerce in Brazil by analyzing the growth in annual customer activity, annual product category quality, and annual payment type usage. The analysis used datasets containing information about customers, sellers, products, and orders. |
Project Link | Associated | Tools | Project Description |
---|---|---|---|
Human Resource Attrition Analysis (EDA, data analysis) | Challenge Task - IYKRA | Tableau | Joined tables within the dataset to conduct an analysis focused on identifying the demographic traits of employees who are more likely to leave the company, as well as an exploration of the factors that influence employee attrition. |
Green Taxi Trip Monthly Report | Challenge Task - IYKRA | Looker Studio | Created a monthly performance report for taxi services. The report provided information about revenue generated from taxi trips and analyzed the busiest zones, days, and hours for passenger activity. |
Sales Report Dashboard | VIX - PT. Kimia Farma, Tbk: Big Data Analyst | Looker Studio | Created a data mart, analyzed the provided data, and generated sales reports for the company. Additionally, developed a dashboard that primarily focuses on sales data from a six-month period, including key performance indicators such as total revenue per month, total sales per branch location, total sales by product, and more. |
Indonesia Covid-19 Dashboard | Challenge by Binar Academy [Team] | Looker Studio, BigQuery | Presented updates on COVID-19 cases in Indonesia, such as new active cases, new confirmed cases, new deaths, and recoveries. |
Targeted Customer Demographic Dashboard | VIX - KPMG Australia: Data Analytic Consulting | Tableau | This dashboard provides insights into the demographics of new customers to be targeted in marketing efforts, based on the prior customer segmentation analysis. |
- Advanced SQL: MySQL Data Analysis and Business Intelligence | Udemy: Maven Analytics
- Advanced NLP with spaCy | DataCamp
- Natural Language Processing with spaCy | DataCamp
- Introduction to Natural Language Processing in Python | DataCamp
- Data Science | Binar x Kominfo RI: Digital Talent Scholarship
- Business Intelligence | Rakamin Academy
- Data Science: Machine Learning Specialization | Rakamin Academy
- Big Data Using Python | CISCO x Kominfo RI: Digital Talent Scholarship
- Data Science Math Skills | Coursera: Duke University
- Python Lanjutan | Skilvul