Introduction to Machine Learning and Big Data

Introduction to Machine Learning and Big Data (ML I) - Course Details

Delve into the course contents and find out about the faculty members.

Single Course Price:

400.00 EUR (tax exempt)


Trent D. Buskirk, PhD (Bowling Green State University)
Prof. Frauke Kreuter (Ludwig-Maximilians-Universität, Munich & University of Maryland)

Video lecture:

Trent D. Buskirk, PhD (Bowling Green State University)
Prof. Frauke Kreuter (Ludwig-Maximilians-Universität, Munich & University of Maryland)


Course Dates:

To see all courses in the upcoming term click here.


This course is part of the Mannheim Data Science Certificate: Big Data & Machine Learning.

Book this course or the entire certificate here!

In order to book the course with alumni conditions, please get in touch with Manon Pfeifer directly.

Course Description

Short Course Description
The amount of data generated as a by-product in society is growing fast, including data from satellites, sensors, transactions, social media and smartphones, to name just a few. Such data are often referred to as “big data” and can be used to create value in areas such as health, crime prevention, commerce and fraud detection. Big Data are often used for prediction and classification tasks, both of which can be tackled with machine learning techniques. In this course we explore how Big Data concepts, processes and methods can be used within the context of survey research. Throughout the course we illustrate key concepts using specific survey research examples, including tailored survey designs and nonresponse adjustment and evaluation.

There are no formal prerequisites.
We recommend a good understanding of the material typically taught in undergraduate statistics courses, some familiarity with regression techniques, and knowledge of survey data collection at the level provided in the MDM course Fundamentals of Survey and Data Science.
While not a prerequisite, familiarity with the R software package (base R or R using RStudio) is strongly encouraged.

Course Objectives
This course covers

  • an overview of key Big Data terminology and concepts
  • an introduction to common data generating processes
  • a discussion of some primary issues with linking Big Data with Survey Data
  • issues of coverage and measurement errors within the Big Data context
  • inference versus prediction
  • general concepts from machine learning including signal detection and information extraction
  • potential pitfalls for inference from Big Data
  • key analytic techniques (e.g. classification trees, random forests, conditional forests) to process Big Data using R with example code provided

Course Composition
This is a 2 ECTS course, which runs for 4 weeks. The content of the course is broken down into 4 units:

  1. Overview of Big Data; Working with Big Data; Classical Statistical Approaches versus Statistical Machine Learning
  2. Model Evaluation/Validation; K-Means Clustering
  3. K-Nearest Neighbors; Classification and Regression Trees (CART)
  4. Random Forests

Learning and Teaching Methods
In this course, you are responsible for watching the video-recorded lectures and reading the required literature for each unit, and then “attending” mandatory weekly one-hour online meetings in which students have the chance to discuss the unit's materials with the instructor. In addition, students are encouraged to post at least one question about the materials covered in the week's videos and readings in the forum before each meeting. Just as in an on-site course, homework will be assigned and graded.

Grading will be based on:

  • 4 online quizzes (worth 20% total)
  • Participation in discussion during the weekly online meetings and submission of questions via the discussion forum that demonstrate understanding of the required readings and video lectures (worth 20%). In the first week, one question is enough, since the course has only just started.
  • 3 homework assignments (worth 60% total)


ZFU Certification and Online Dispute Resolution

ZFU Certification

The Mannheim Master of Applied Data Science & Measurement program is certified according to the regulations of the ZFU (Staatliche Zentralstelle für Fernunterricht).


Online Dispute Resolution

Online dispute resolution according to Art. 14 Sect. 1 ODR-VO: The European Commission provides a platform for online dispute resolution (ODR). You can find more information under

