Distributed Data Processing

Module number: Q06-10
English title: Distributed Data Processing
Credit points: 6
Instructor: Weidlich

Recommended prior knowledge

Fundamentals of Database Systems (DBS I)

Mandatory prerequisites

None

Content

Data analytics is the extraction of information from data. It must cope with rapidly growing data volumes as well as increasingly complex analysis questions and methods. Performance improvements of single processing units (CPU/GPU cores) no longer keep pace with these trends, so sequential processing of data on a single machine is no longer a viable option. Instead, systems for data analytics need to embrace parallel and distributed computation and achieve scalability by increasing the number of processing units.

This lecture introduces models and methods for building distributed data processing systems. It covers foundational aspects, ranging from data models through encoding and replication schemes to notions of consistency and consensus. It also covers practical implementations of distributed data processing based on infrastructures such as Akka, Spark, Flink, and Kafka.

Required coursework for credit points and exam admission

Exercises are integrated into the lecture. Solutions to these exercises will be collected and graded. Successful completion of the exercises is a prerequisite for taking the final exam and earning the credit points.

Course components

Lecture: 4 SWS
Exercise: ** SWS
Lab course: ** SWS
Seminar: ** SWS
Practical seminar: ** SWS
Project seminar: ** SWS

Assigned specialization area

Algorithms and Models: no
Model-Based System Development: no
Data and Knowledge Management: yes
No specialization area: no

Language of instruction

German: no
English: yes

Offered for degree programs

M. Sc.: yes
M. Ed.: yes
Wirtschaftsmaster: yes

Offered in

Winter semester: yes
Summer semester: no

Frequency

Every two years