Bayesian Statistics, course outline

Course goals

After this course the student should be able to

  1. understand the relationship between machine learning and classical econometrics.
  2. understand the principles of machine learning methods, including model selection based on cross validation.
  3. understand the main algorithms of machine learning, such as Ridge, Lasso, nearest neighbors and trees.
  4. implement the core of the algorithms in Python (R, C++, etc.) code. The aim is to understand how the algorithms work.
  5. apply the algorithms to realistic practical problems, with or without the help of statistical data packages such as scikit-learn.
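
To give an impression of what goals 2 to 4 can look like in practice, here is a minimal sketch, not taken from the course material, of a nearest-neighbor classifier whose number of neighbors is selected by cross validation. It uses only the Python standard library. The function names (`knn_predict`, `cv_error`, `make_toy_data`), the synthetic two-cluster data and the candidate values of k are all assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Squared Euclidean distance suffices: the square root does not change the ranking.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_error(X, y, k, n_folds=5, seed=0):
    """Estimate the misclassification rate of k-NN by n_folds-fold cross validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Every index ends up in exactly one fold.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        # Count the mistakes on the fold that was left out of training.
        errors += sum(knn_predict(train_X, train_y, X[i], k) != y[i] for i in fold)
    return errors / len(X)


def make_toy_data(n=100, seed=1):
    """Hypothetical data: class 0 scattered around (0, 0), class 1 around (2, 2)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


X, y = make_toy_data()
# Odd values of k avoid tied votes when there are two classes.
cv_errors = {k: cv_error(X, y, k) for k in (1, 3, 5, 7, 9, 15)}
best_k = min(cv_errors, key=cv_errors.get)
print("cross-validation error per k:", cv_errors)
print("selected k:", best_k)
```

The same folds are used for every candidate k (the seed is fixed), so the error estimates are directly comparable. A library such as scikit-learn offers the same building blocks; writing them out by hand is only meant to show how the pieces fit together.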

Course description

The last couple of decades of statistical data analysis have been characterized by an increased emphasis on algorithms that go by the name of machine learning or big data. These methods have been applied successfully to face recognition (Facebook) and to predicting consumer preferences for movies (Netflix). They are also widely used in industry to decide whether newly produced items meet given quality criteria. Finally, in medicine, algorithms can help staff to classify tumors, and so on.

In the lectures of this course we consider the principles of this algorithmic approach. We also look at practical examples, apply the algorithms in Python, and compare our outcomes with existing software such as scikit-learn.

The topics that we deal with are the following:

  1. Introduction to machine learning for the econometrician
  2. Principles of machine learning
  3. Nonparametric estimation methods
  4. Linear algorithms for prediction (Lasso, Ridge)
  5. Algorithms for classification
  6. Trees, bagging, random forests and boosting

Schedule

Here is a preliminary schedule. We rely on you to give us feedback on whether the pace of the lectures is right/too fast/too slow.

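| Week | Day | Schedule | Section                      | Lecturer |
|------+-----+----------+------------------------------+----------|
| 1    | W   | 1        | Ch 1                         | avv      |
|      | F   | 2        | 2.1-2.5                      | avv      |
| 2    | W   | 3        | 4.4                          | avv      |
|      | F   | 4        | coding                       | nvf      |
| 3    | W   | 5        | 5.1-5.3, 5.5-5.7, 6.1-6.2    | avv      |
|      | F   | 6        | coding                       | nvf      |
| 4    | F   | 7        | 7.1, 7.2, 7.4-7.6, 7.8       | avv      |
| 5    | W   | 8        | Ch 8                         | avv      |
|      | F   | 9        | coding                       | nvf      |
| 6    | F   | 10       | Ch 8                         | avv      |
|      | F   | 11       | coding                       | nvf      |
| 7    | W   | 11       | coding                       | nvf      |
|      | F   | 12       | coding/writing report        | nvf      |
| 8    | W   | 13       | coding/writing report        | nvf      |
|      | F   | 14       | No lecture/writing report    | avv/nvf  |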

The section numbers refer to the sections in DSML.

Exam and Grading

Report and due dates

See the report template directory on github.

Oral exam

At the end of the course we invite the groups to explain their work in an oral exam, and we determine a grade.

Grade

60 % report, 20 % oral exam, 20 % review.

Literature

Our primary book is Kroese et al., Data Science and Machine Learning (DSML).

There are other, non-obligatory sources that you might like to consult. I particularly liked the first for its intuitive, high-level overview of the topics and problems.

Coding

The university provides a few courses on Python and R here.

Interesting research topics

Here are some topics that you might find interesting. They may serve as inspiration for your own paper.

Online data

Contact info

  • Prof. dr. A.P. van Vuuren, a.p.van.vuuren@rug.nl
  • dr. N.D. van Foreest, n.d.van.foreest@rug.nl, coordinator