This Git repository contains various scripts and references for the data analytics course held at UOL.
- Anaconda
- Python 3 and other package dependencies (e.g., Jupyter Notebook, pandas)
- Visual Studio Code
- An integrated development environment (IDE) for writing code in almost any programming language
- Visual Studio Code is also known as VS Code
- 💤 Important: make use of the many VS Code extensions (e.g., Python, Pylance, IntelliCode, autoDocstring, Remote Development)
- GitHub or GitLab at UOL
- Repository hosting for your source code (and possibly data)
- Overleaf
- Online LaTeX editor; the preferred way to work on a LaTeX project in a group
Use the following short video tutorials to set up your local development environment with VS Code and its extensions:
- 🎬 How to Setup Visual Studio Code for Python and Data Science | Better Data Science
- 🎬 3 Must Know VS Code Features for ML & Data Science!
- 🐸 🎬 (optional) Learn Visual Studio Code in 7min (Official Beginner Tutorial)
- 🐸 🎬 (optional) Using Git with Visual Studio Code (Official Beginner Tutorial)
- 🎬 (advanced) STOP writing bad Data Science CODE with these 10 tools in VS Code
- 🎬 (advanced) Powerful VSCode Tips And Tricks For Python Development And Design
- Sublime Text
- An optional simple text editor
- Notepad++
- Simple, lightweight, full-featured and simply amazing text editor
- PyCharm Community Edition
- An optional IDE that can be used to write Python code
- Useful examples: uol-data-analytics-examples
- Useful links to datasets: datasets-links-collection
If you decide to do your data analytics project in Python, it is strongly recommended that you work through the courses listed below.
- 🔰 Exploratory Data Analysis
- 🔰 Working with Jupyter Notebook
- 🔰 An introduction to data science using Python and Pandas with Jupyter notebooks
- Overview
- Printing, Strings, Numbers
- Logic, Loops, Lists, Tuples, and Dictionaries
- Pandas Part I
- Pandas Part II
- Pandas Time Series
- My top 25 pandas tricks (video)
- NOTE: The rest of these tutorials can be reviewed as well.
- 🚀 Micro-Courses by Kaggle (Faster Data Science Education)
- 🚀 Python + Jupyter Notebook Tutorials by BigDataAnalyticsGroup (last update: 04.2020)
- Git repository with Python Tutorials
- 🎬 Videos (in German) are also available on YouTube here
- Reproducible Data Analysis in Jupyter
- Jupyter Notebook on Full Stack Python
- Quick dive into Pandas for Data Science
- Getting started with Python and Jupyter Notebooks for data analysis
- 5 Quick and Easy Data Visualizations in Python with Code
- What Is a Data Frame? (In Python, R, and SQL)
- The Simple Yet Practical Data Cleaning Codes for pandas
- Pandas MultiIndex Tutorial
- 3 Awesome Visualization Techniques for every dataset
- Stylin’ with Pandas
- Every Complex DataFrame Manipulation, Explained & Visualized Intuitively
- 10 Useful Jupyter Notebook Extensions for a Data Scientist
- Qgrid, itables, DataTables, ipyvolume, bqplot, handcalcs
- Compare SQL and pandas: How to Write All of Your SQL Queries in Pandas
- Warm up SQL
- Data Wrangling with SQL (in some cases you will need to adapt the provided solutions to your database)
- SQL Basics
- Python, SAP HANA and Analytics
- Python Client API for machine learning in SAP HANA 2.0 (Express Edition SPS 03, Rev. 33)
- Setting up a HANA Express Python Machine Learning API Demo VM
- This is only one post from a series; for a deeper dive, please review the rest:
- SAP HANA and Machine Learning
- Visualization Tools
NOTE: Many online courses are available, and this list is only a limited overview. It is a mixture of new materials and classic ones (update/review: 19.10.2020).
- Crashkurs für maschinelles Lernen (Microsoft, in German) - https://docs.microsoft.com/de-de/learn/paths/ml-crash-course/
- Intro to TensorFlow for Deep Learning (Udacity) - https://www.udacity.com/course/intro-to-tensorflow-for-deep-learning--ud187
- TensorFlow + Tutorials - https://www.tensorflow.org/tutorials
- Machine Learning (by Andrew Ng) - https://www.coursera.org/learn/machine-learning
- Statistical Learning - https://online.stanford.edu/courses/sohs-ystatslearning-statistical-learning
- Data Science (not all courses are free) - https://www.coursera.org/browse/data-science
- Stat 451: Intro to Machine Learning (Fall 2020) (Sebastian Raschka) - https://www.youtube.com/watch?v=OgK8JFjkSto&list=PLTKMiZHVd_2KyGirGEvKlniaWeLOHhUF3&index=1
- MIT 6.S191 Introduction to Deep Learning (Spring 2020) (YouTube) - http://introtodeeplearning.com/