Web Scraping and APIs

Web Scraping and APIs - Course Details

Delve into the course contents and find out about the faculty members.

Single Course Price:

400.00 EUR (tax exempt)

 

Instructor:

Sascha Goebel (Goethe University Frankfurt am Main)

Video lecture:

Simon Munzert, PhD (Hertie School of Governance Berlin)

 


In order to book the course with alumni conditions, please get in touch with Manon Pfeifer directly.

Course Description

Short Course Description
The short course provides a condensed overview of web technologies and techniques to collect data from the web in an automated way. To this end, students will use the statistical software R. The course introduces fundamental parts of web architecture and data transmission on the web. Furthermore, students will learn how to scrape content from static and dynamic web pages and connect to APIs from popular web services. Finally, practical and ethical issues of web data collection are discussed.

Prerequisites
Students are expected to be familiar with the statistical software R. Beyond base R, knowledge of the “tidyverse” packages, in particular dplyr, plyr, magrittr, and stringr, is helpful. If you are familiar with R but have no experience working with these packages, the best place to learn them is the primary reading, “R for Data Science”.

Course Objectives
By the end of the course, students will…

  • have an overview of state-of-the-art research that draws on web-based data collection,
  • have a basic knowledge of web technologies,
  • be able to assess the feasibility of conducting scraping projects in diverse settings,
  • be able to scrape information from static and dynamic websites as well as web APIs using R, and
  • be able to tackle current research questions with original data in their own work.

Course Composition
This 2 ECTS course runs for 4 weeks. Its content is divided into 4 units:

  1. Introduction – Web Technologies
  2. Scraping static web pages
  3. Scraping dynamic web pages and good practice
  4. Tapping APIs

Learning and Teaching Methods
In this course, you are responsible for watching the video-recorded lectures and reading the required literature for each unit before the mandatory weekly one-hour online meeting, where you can discuss that unit's materials with the instructor. You are also encouraged to post questions about the week's videos and readings to the “Place to post your questions” forum before each meeting.

Grading
Grading will be based on:

  • Participation in the weekly online meetings and questions submitted via the forum that demonstrate understanding of the required readings and video lectures (worth 10%)
  • Weekly quizzes that check factual knowledge about the course topics (worth 30% total)
  • Weekly assignments that require students to implement and practice scraping techniques in R (worth 60% total)

 

ZFU Certification and Online Dispute Resolution

ZFU Certification

The Mannheim Master of Applied Data Science & Measurement program is certified according to the regulations of the ZFU (Staatliche Zentralstelle für Fernunterricht).

 

Online Dispute Resolution

Online dispute resolution according to Art. 14 Sect. 1 ODR-VO: The European Commission provides a platform for online dispute resolution (ODR). You can find more information at http://ec.europa.eu/consumers/odr/

MANNHEIM BUSINESS SCHOOL (MBS)

Located in the heart of the German and European economy, Mannheim Business School (MBS), the umbrella organization for management education at the University of Mannheim, is considered one of the leading institutions of its kind in Germany and is consistently ranked as Germany’s #1.
