Course information

  • Title: Data Mining and Machine Learning
  • Neptun code:
  • Instructor: István Csabai
  • Semester: 3
  • Type: Lecture + Practice
  • Credit points: 4
  • Prerequisites: -

Course description

The purpose of the course is to provide theoretical and practical knowledge of modern data mining and machine learning techniques that are applicable in any quantitative field of science. The primary focus of the course is on empirical methods of data-driven research; hence it complements knowledge of model-based physics.

Topics

  • Introduction, prediction, training set
  • Toolsets of data mining and machine learning
  • Data exploration
  • Supervised learning, objective functions, classification, regression, validation
  • Regularization, model optimization
  • Decision trees, random forests
  • Support vector machines
  • Artificial neural networks
  • Deep learning
  • Image and sound processing
  • Unsupervised learning, dimensionality reduction, clustering
  • Machine learning on big data and distributed systems

Recommended readings

  • Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani: An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics), Springer, 2016
  • Trevor Hastie, Robert Tibshirani, Jerome Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville: Deep Learning (Adaptive Computation and Machine Learning series), MIT Press, 2016

Course material