In this homework, you will work with two popular frameworks: (1) sklearn (short for scikit-learn) and (2) statsmodels. In addition, you will create some basic visualizations to explain your findings. Regression analysis and supervised learning are two essential skills for a data scientist, which makes them good material to prepare you for real-world work.
The homework consists of eight tasks, which are described in the hw2.ipynb notebook.
For each task, please provide both a written explanation of the steps you followed, and the corresponding code. Keep in mind that writing the explanation can help you in two ways:
- Clarifying the steps in your mind before writing the actual code
- Earning you points if the description is correct, regardless of the potential issues in your code
You are expected to solve the homework as a team of four, as specified in the course registration form. By the homework submission deadline, each team should have a single shared private GitHub repo under the epfl-ada organization, containing the Jupyter Notebook with the solution. Please follow the instructions below to create your team repo and start working on the homework:
- One team member should follow this link and create a team with exactly the same name as specified in the course registration form. Note: After creating your team, you might notice ada-2021-homework-2-<your-personal-GitHub-handle> instead of ada-2021-homework-2-<your-team-name> as the name of the automatically created repo. This is a GitHub Classroom bug (or feature), and fixing it is neither within our capacity nor our purview. That said, please don't worry: the repo that is eventually created will be named ada-2021-homework-2-<your-team-name>.
- Creating the team automatically creates a dedicated private repo. At this point, the remaining three team members should follow the same link and join the team. Make sure you are joining the correct team by checking your team members' GitHub accounts: there might be teams with similar or identical names.
- There is no simple automated way to transfer the materials for Homework 2 from the public course repository into your private team repository. To get started, we suggest that you manually pull the homework materials from the course repository to your local machine, copy them into your local team repository, and push to the remote.
- Afterwards, keep collaborating on the homework as a team in your shared private repository.
- Most importantly, don’t forget to push the final solved version by November 26th, 23:59!
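
The manual transfer suggested in the steps above could be sketched in Python as follows. This is only an illustration under assumptions: the function names `copy_homework` and `commit_and_push` and the example paths are hypothetical, and both the course repository and your team repository are assumed to be already cloned to your local machine. Running the equivalent `git` commands by hand works just as well.

```python
import shutil
import subprocess
from pathlib import Path


def copy_homework(course_hw_dir: Path, team_repo_dir: Path) -> list:
    """Copy the homework materials into the local team repository.

    Returns the sorted relative paths of the files that were copied.
    """
    # dirs_exist_ok=True allows copying into an existing, non-empty clone;
    # any ".git" entry in the source is skipped so repo metadata is not mixed.
    shutil.copytree(
        course_hw_dir,
        team_repo_dir,
        dirs_exist_ok=True,
        ignore=shutil.ignore_patterns(".git"),
    )
    return sorted(
        path.relative_to(course_hw_dir).as_posix()
        for path in course_hw_dir.rglob("*")
        if path.is_file() and ".git" not in path.relative_to(course_hw_dir).parts
    )


def commit_and_push(team_repo_dir: Path, message: str) -> None:
    """Stage, commit, and push everything (needs git and push access)."""
    for args in (["add", "."], ["commit", "-m", message], ["push"]):
        subprocess.run(["git", *args], cwd=team_repo_dir, check=True)


# Example usage (hypothetical local paths, adjust to your own clones):
# copied = copy_homework(Path("course-repo/Homework/02"), Path("team-repo"))
# commit_and_push(Path("team-repo"), "Add Homework 2 materials")
```

Files that already exist in the team repository and are not part of the homework materials (for example a README) are left untouched by the copy step.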
Submit the hw2.ipynb notebook with disclosed output for each cell. Please don't update the data folder provided in the repository, i.e., use it in read-only mode.
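
Before the final push, it may help to check that the notebook really contains stored output. The sketch below is not an official check: it assumes the standard Jupyter notebook JSON layout (a top-level `cells` list whose entries have `cell_type`, `source`, and `outputs`), and the helper name `cells_without_output` is made up for this example. It lists the code cells that have code but no stored output so that you can review them.

```python
import json
from pathlib import Path


def cells_without_output(notebook: dict) -> list:
    """Return the indices of non-empty code cells with no stored output."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells never carry outputs
        # "source" may be a single string or a list of lines; join covers both.
        source = "".join(cell.get("source", []))
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing


# Example usage (assumes hw2.ipynb is in the current directory):
# notebook = json.loads(Path("hw2.ipynb").read_text(encoding="utf-8"))
# print(cells_without_output(notebook))
```

Some cells, such as those containing only imports or assignments, legitimately produce no output, so treat the result as a list of cells to look at, not as a pass/fail verdict.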