qvunguyen

👋 Hi there, I'm Vu Nguyen

I'm a Data Scientist.

👀 My primary interests are in Machine Learning, Deep Learning, and Artificial Intelligence. I am also well-versed in statistical modeling and data visualization.
👯 I’m eager to collaborate with professionals and enthusiasts on projects related to Data Science, Machine Learning, and AI research.
🥅 Future Goals: I aim to be at the forefront of AI-driven innovation, developing solutions that have real-world impact.
📫 How to reach me: vunguyen.career@gmail.com or LinkedIn

🛠️ Skills and Tools

Programming Languages:

Python
R
SQL
Java
JavaScript
HTML
CSS
Swift

Technology & Frameworks:

Tableau
Spreadsheets
Bootstrap

📚 Education

University of Melbourne (2020-2023)

Master of Information Technology; Major in Artificial Intelligence

University of Melbourne (2017-2020)

Bachelor of Commerce; Major in Finance and Accounting

🏅 Certifications

Google Data Analytics Professional Certificate

📂 Projects

1. iOS Geothermal App

Winner of the Endeavour Discipline Award in Computing and Information Systems, Semester 1, 2023.

Overview: The iOS Geothermal App brings academic research into real-world use by simplifying geothermal system design. It serves as a portable calculator that gives engineers, architects, and contractors recommended geothermal system design specifications based on their inputs.

The demand for energy efficiency and sustainable building practices has fueled the need for simple, accessible tools like our app, particularly among professionals in the field of architecture and engineering.

A key feature of our app is its simplicity and cost-effectiveness, making it more accessible and user-friendly compared to similar apps on the market.

2. Google Data Analytics Capstone

Overview: This project is a marketing analysis for Bellabeat, a high-tech manufacturer of health-focused products for women. As the capstone for my Google Data Analytics Professional Certificate, I analyzed smart device fitness data to understand user behavior and to guide the company's marketing strategy.

The analysis uses smart device data collected through a survey distributed via Amazon Mechanical Turk, including minute-level output for physical activity, heart rate, and sleep monitoring. The dataset reveals user behaviors, usage patterns, and preferences, which translate into actionable marketing strategies.

Key findings include users' active and inactive days, caloric burn patterns, activity levels, total distance traveled, and sleep patterns. This information led to strategic recommendations for promoting outdoor activities, emphasizing light activities, providing personalized workout recommendations, improving sleep habits, and employing social media influencers for product promotion.

3. Movie Recommendation System

Overview: The Movie Recommendation System is a hybrid model that combines collaborative filtering and content-based filtering to generate movie recommendations. It is built in Python with libraries such as Surprise and scikit-learn, and uses the MovieLens 25M dataset, which comprises 25 million ratings and one million tag applications applied to 62,000 movies by 162,000 users.

The collaborative filtering approach involves predicting a user's preference for a movie based on the preferences of similar users. Here, we apply the Singular Value Decomposition (SVD) algorithm from the Surprise library.
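To illustrate the idea behind SVD-style collaborative filtering, here is a minimal sketch that learns user and movie latent factors with stochastic gradient descent in plain Python. It is not the Surprise implementation used in the project: the toy ratings, the `train_mf` helper, and every hyperparameter value are assumptions chosen for illustration.

```python
import random

def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit biases and latent factors to (user, movie, rating) triples with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mean = sum(r for _, _, r in ratings) / len(ratings)

    # Small random latent factors and zero biases as the starting point
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}

    def predict(u, i):
        # Global mean + user bias + movie bias + factor dot product.
        # Unknown users or movies fall back to the terms that are available.
        est = mean + bu.get(u, 0.0) + bi.get(i, 0.0)
        if u in p and i in q:
            est += sum(a * b for a, b in zip(p[u], q[i]))
        return est

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict

# Hypothetical toy ratings (user, movie, rating on a 1-5 scale)
ratings = [
    ("alice", "Toy Story (1995)", 5), ("alice", "Heat (1995)", 1),
    ("alice", "Jumanji (1995)", 4), ("bob", "Toy Story (1995)", 4),
    ("bob", "Heat (1995)", 2), ("carol", "Heat (1995)", 5),
    ("carol", "Jumanji (1995)", 1), ("carol", "Toy Story (1995)", 2),
    ("dave", "Heat (1995)", 4), ("dave", "Jumanji (1995)", 2),
]

predict = train_mf(ratings)
print(round(predict("bob", "Jumanji (1995)"), 2))  # bob has not rated this movie
```

The returned `predict` function estimates a rating for any user and movie pair, including pairs that never appeared in the training data.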

The content-based filtering technique predicts a user's preference for a movie based on its features and the user's preferences for similar features. We utilize the Term Frequency-Inverse Document Frequency (TF-IDF) approach to create feature vectors for movie genres and compute cosine similarity to gauge the similarity between movies based on their genres.
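The genre-similarity step can be sketched in plain Python as follows. This is a simplified stand-in for the scikit-learn pipeline, not the project code: the four-movie sample and the helper names (`tfidf_vectors`, `cosine`, `most_similar`) are assumptions, and the genre strings follow the pipe-separated format MovieLens uses.

```python
import math
from collections import Counter

# Hypothetical four-movie sample with pipe-separated genres
movies = {
    "Toy Story (1995)": "Adventure|Animation|Children|Comedy|Fantasy",
    "Jumanji (1995)": "Adventure|Children|Fantasy",
    "Heat (1995)": "Action|Crime|Thriller",
    "GoldenEye (1995)": "Action|Adventure|Thriller",
}

def tfidf_vectors(docs):
    """Map each title to a sparse {genre: tf-idf weight} vector."""
    tokens = {title: genres.split("|") for title, genres in docs.items()}
    n_docs = len(tokens)
    doc_freq = Counter(g for gs in tokens.values() for g in set(gs))
    # Smoothed inverse document frequency: rarer genres get larger weights
    idf = {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in doc_freq.items()}
    return {
        title: {g: count * idf[g] for g, count in Counter(gs).items()}
        for title, gs in tokens.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(title, vectors, k=2):
    """Return the k movies whose genre vectors are closest to `title`."""
    scores = [
        (other, cosine(vectors[title], vec))
        for other, vec in vectors.items()
        if other != title
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

vectors = tfidf_vectors(movies)
print(most_similar("Toy Story (1995)", vectors))
```

Because the vectors are built from genres alone, two movies with no genre in common always score zero, however similar they are in other respects.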

Testing the recommendation system is straightforward and customizable. Users can set their user ID, the movie title they prefer, and the number of recommendations they desire. The model then generates a list of top recommendations based on these parameters.
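One plausible way to combine the two models behind those three parameters is sketched below: shortlist the movies most similar to the chosen title, then rank the shortlist by the user's predicted rating. How the project actually blends the two signals is not described here, so the `recommend` function, its `shortlist_size` parameter, and the lookup tables standing in for the trained models are all assumptions.

```python
def recommend(user_id, title, n, similarity, predict_rating, shortlist_size=10):
    """Shortlist movies similar to `title`, then rank them for `user_id`."""
    # Content-based step: keep the movies closest to the chosen title
    candidates = sorted(
        (m for m in similarity[title] if m != title),
        key=lambda m: similarity[title][m],
        reverse=True,
    )[:shortlist_size]
    # Collaborative step: order the shortlist by the user's predicted rating
    ranked = sorted(candidates, key=lambda m: predict_rating(user_id, m), reverse=True)
    return ranked[:n]

# Hypothetical stand-ins for the outputs of the two component models
similarity = {
    "Toy Story (1995)": {
        "Jumanji (1995)": 0.67,
        "GoldenEye (1995)": 0.17,
        "Heat (1995)": 0.0,
    }
}
predicted = {
    (1, "Jumanji (1995)"): 3.1,
    (1, "GoldenEye (1995)"): 4.2,
    (1, "Heat (1995)"): 4.8,
}

def predict_rating(user_id, movie):
    # Fall back to a neutral rating when no prediction is available
    return predicted.get((user_id, movie), 3.0)

top = recommend(1, "Toy Story (1995)", n=2, similarity=similarity,
                predict_rating=predict_rating, shortlist_size=2)
print(top)
```

In this sketch the content-based step acts as a filter and the collaborative step as the final ranker, so a highly rated but dissimilar movie such as the hypothetical "Heat (1995)" entry never reaches the shortlist.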

4. Credit Card Fraud Detection

Overview: This project applies several machine learning models to identify fraudulent credit card transactions: Logistic Regression, Random Forest, KNN, XGBoost, and LightGBM. Preprocessing handles missing values, duplicates, and outliers, and uses SMOTE to address the class imbalance in the dataset. Based on the performance metrics, the Random Forest model was the most effective at detecting fraudulent transactions.
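The core idea of SMOTE can be sketched in a few lines of plain Python: each synthetic fraud sample is placed at a random point on the segment between a real fraud sample and one of its nearest fraud neighbours. This is a simplified illustration, not the library implementation or the project code; the two-feature toy data, the `smote` helper, and its parameter values are assumptions.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority-class neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical scaled features for legitimate and fraudulent transactions
legit = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.3, 0.25),
         (0.05, 0.1), (0.25, 0.15), (0.2, 0.3), (0.1, 0.05)]
fraud = [(0.9, 0.8), (1.0, 0.7), (0.8, 0.9)]

# Oversample the fraud class until both classes are the same size
balanced_fraud = fraud + smote(fraud, n_new=len(legit) - len(fraud), k=2)
print(len(legit), len(balanced_fraud))
```

Oversampling of this kind is normally applied to the training split only, so that the evaluation data keeps its real class distribution.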
