6 ECTS credits
150 h study time
Offer 1 with catalog number 4018841FNR for all students in the 1st semester at a (F) Master - specialised level.
This is an advanced course on concepts, techniques, and tools for big data processing. The goal is to introduce the foundational concepts and techniques and motivate and illustrate their usage within frameworks and tools related to big data. Students are made aware of how these concepts are situated in larger, connected research domains such as distributed programming, reactive programming, metaprogramming, and so on.
Topics of this course include properties of big data; MapReduce and other Hadoop-related technologies; distributed and resilient collection data structures in Spark; cluster computing with Spark (data locality, partitioning, and shuffling); stream processing in Flink; and blockchain.
This course is not a basic introduction to every research domain related to big data processing. Familiarity with web-related and data processing technologies is a big plus. Knowledge of (functional) programming language theory is assumed.
The computing landscape is changing drastically, with technologies like cloud computing, machine learning, and big data entering the mainstream. While most students who have completed a Bachelor's degree are already acquainted with the basics of mainstream programming languages, innovative software projects and companies today require expertise in the topics introduced in this course.
Knowledge and Insight: Students who successfully complete the course have gained knowledge of the topics, concepts, and technologies that underlie big data processing. One goal of the course is to introduce students to programming techniques used in modern, innovative software projects that tackle challenging problems. A second goal is to introduce students to the scientific challenges that arise from the specific context of the course contents (e.g., the distributed nature of software, the possibility of failure, and the scale of operations and data).
The use of knowledge and insight: Students gain hands-on experience with the concepts underlying big data processing in the practical sessions, where they are required to apply tools and develop small programs. After successful completion of the course, students are able to apply the knowledge and insights they gained in software projects that require them.
Judgement ability: Complementary to the application of the topics of this course, the student will be able to provide a critical analysis and comparison of existing languages, tools, techniques, and frameworks described in scientific literature or offered as commercial products.
Communication: Students will be able to express themselves clearly and critically, both orally and in writing, about the topics covered by the course.
Learning Skills: The course acquaints the student with an important emerging computing technology, namely big data. Within this technology, we identify a number of important topics that are sufficiently representative to provide the student with the necessary skills to absorb, classify, and disseminate knowledge of other, related concepts and techniques. Students will be capable of performing a more in-depth treatment of each of the covered topics.
The final grade is based on the following categories:
Practical Exam determines 100% of the final mark.
Within the Practical Exam category, the following assignments need to be completed:
This 6-credit variant of the course extends the 3-credit variant by requiring a more extensive project. The 6-credit project builds upon the functional data pipeline required for the 3-credit project and additionally requires solutions to one or more advanced engineering challenges, such as performance tuning, stateful stream processing, robust data architecture, or integration of additional components (blockchain, AI, etc.).
The exam consists of a practical part, which is a programming project that validates the student’s ability to apply the concepts introduced during the lectures. The implementation of the project has to be complemented with a short report in which the student explains the setup of the project, how to run it, and how concepts seen in the lectures have been applied. During the oral defense of the project, additional relevant questions may be asked about concepts and topics discussed in the theory lectures and exercise sessions. The use of GenAI for this assignment is not permitted.
This offer is part of the following study plans:
Master of Applied Sciences and Engineering: Computer Science: Artificial Intelligence
Master of Applied Sciences and Engineering: Computer Science: Multimedia
Master of Teaching in Science and Technology: computerwetenschappen (120 ECTS, Etterbeek) (only offered in Dutch)
Master of Applied Informatics: Artificial intelligence and Data Science: Big Data Technology
Master of Applied Informatics: Artificial intelligence and Data Science: Artificial Intelligence