6 ECTS credits
150 hours of study time

Offering 1 with course catalogue number 4013052ENR, for all students in the 1st and 2nd semester, at an advanced master's level.

Semester
1st and 2nd semester
Enrolment under exam contract
Not possible
Grading method
Grading (0 to 20)
2nd exam session possible
Yes
Language of instruction
English
Faculty
Faculty of Sciences and Bioengineering Sciences
Responsible department
Computer Science
Teaching team
Coen De Roover
Ahmed Zerouali (course titular)
Components and contact hours
12 contact hours: Lectures
32 contact hours: Seminars, practicals and exercises
80 contact hours: Self-study and external work forms
Content

The goal of this course is to learn how to discover interesting and actionable information about software projects by analyzing the large amounts of data stored in their repositories using data mining and machine learning algorithms. This is an advanced course about selected topics from the state of the art in mining software repositories. As such, the exact content of the course can vary each year. 

The initial lectures typically cover the following topics:
1. Software repositories and associated data: version control repositories, issue trackers, Q&A platforms

2. Software ecosystems and evolution

3. Software data analytics and inference: empirical software engineering methods, mining for idioms in snapshots, mining for change patterns in commits

4. A selection of recent success stories
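
As a small illustration of the kind of data a version control repository yields (topic 1 above), the following sketch sums the lines added and deleted per file, often called churn. It is not taken from the course material. It assumes the input is the text printed by `git log --numstat --format=`, and the function name `churn_per_file` and the sample log are hypothetical.

```python
from collections import Counter


def churn_per_file(numstat_log: str) -> Counter:
    """Sum added + deleted lines per file from `git log --numstat --format=` text."""
    churn = Counter()
    for line in numstat_log.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank lines and anything that is not a numstat row
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        churn[path] += int(added) + int(deleted)
    return churn


# Hypothetical sample; in practice the text would come from running
# `git log --numstat --format=` inside a clone of the project.
SAMPLE_LOG = (
    "3\t1\tsrc/parser.py\n"
    "10\t0\tsrc/lexer.py\n"
    "\n"
    "2\t2\tsrc/parser.py\n"
    "-\t-\tdocs/logo.png\n"
)

if __name__ == "__main__":
    for path, lines in churn_per_file(SAMPLE_LOG).most_common():
        print(path, lines)
```

Per-file churn of this kind is one candidate feature for the analyses listed under topic 3. Renamed files are not handled in this sketch.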

In the final lectures, we study recent research to understand how the mining of software repositories is evolving. For these lectures, the students will prepare a presentation about recent research results on which they will be graded. Students will also be graded on three assignments for which they need to apply and extend data mining algorithms to real-world project data.

Course material
Digital course material (Required): Digital course material on the learning platform, Canvas
Additional info

Not applicable

Learning outcomes

General competences

Goals and competences

The goals of this course are:
- Students obtain knowledge about the analysis of large amounts of software engineering data coming from different ecosystems using data mining techniques.
- Students become skilled at uncovering interesting and actionable information about software systems and projects to improve software quality.

The corresponding learning results are:

* w.r.t. knowledge:
- The student can describe the process needed to build a machine learning model able to predict defective components.
- The student can illustrate and discuss the strengths and weaknesses of the features to extract for training the model.
- The student can describe how to choose a classifier, and outline the differences between white-box and black-box techniques.
- The student can illustrate how to effectively and efficiently tune a machine learning model using search-based techniques.

* w.r.t. applying knowledge:
- The student can independently build a machine learning model to predict defective software components.
- The student can independently tune a machine learning model using search-based techniques.
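
The two outcomes above can be sketched together on toy data. This is a minimal sketch under stated assumptions, not the method taught in the course. The data set `COMPONENTS`, the two metrics (churn and complexity), the rule in `predict` and the step sizes are all hypothetical. Hill climbing stands in for the search-based techniques in general, and accuracy is used as the score only to keep the example short.

```python
# Toy data (hypothetical): (churn, complexity, is_defective) per component.
COMPONENTS = [
    (120, 15, True),
    (90, 4, True),
    (10, 22, True),
    (15, 3, False),
    (30, 8, False),
    (5, 2, False),
    (55, 6, False),
    (8, 11, False),
]


def predict(churn, complexity, t_churn, t_cplx):
    """White-box rule: flag a component when either metric reaches its threshold."""
    return churn >= t_churn or complexity >= t_cplx


def accuracy(thresholds, data):
    """Fraction of components whose predicted label matches the known label."""
    t_churn, t_cplx = thresholds
    hits = sum(predict(c, x, t_churn, t_cplx) == label for c, x, label in data)
    return hits / len(data)


def hill_climb(start, data, step=(10, 2), max_iters=100):
    """Move to the best-scoring neighbour until no neighbour improves the score."""
    current, best = start, accuracy(start, data)
    for _ in range(max_iters):
        neighbours = [
            (current[0] + dx * step[0], current[1] + dy * step[1])
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        ]
        candidate = max(neighbours, key=lambda t: accuracy(t, data))
        score = accuracy(candidate, data)
        if score <= best:
            break  # local optimum: no neighbour is strictly better
        current, best = candidate, score
    return current, best


if __name__ == "__main__":
    print(hill_climb((50, 10), COMPONENTS))
```

The sketch scores the thresholds on the same data it tunes them on, which overstates how well the rule would do on unseen components. A realistic setup would tune on one part of the data and evaluate on a held-out part.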

* w.r.t. analysing:
- The student can recognise which features should be extracted from a source code repository.
- The student can recognise whether a prediction model is effective.
- The student can recognise whether tuning a prediction model is effective and efficient.
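
One common way to judge whether a prediction model is effective is to compare its predictions with known labels using precision, recall and the F1 score. The sketch below shows these standard measures. The course may use other measures, and the function name and the example labels are hypothetical.

```python
def precision_recall_f1(actual, predicted):
    """Compute precision, recall and F1 for boolean 'defective' labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # defective, flagged
    fp = sum(1 for a, p in pairs if not a and p)    # clean, flagged
    fn = sum(1 for a, p in pairs if a and not p)    # defective, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels for eight components.
    actual = [True, True, True, True, False, False, False, False]
    predicted = [True, True, False, False, True, True, False, False]
    print(precision_recall_f1(actual, predicted))
```

Defective components are usually a minority, so precision and recall tend to say more than plain accuracy: a model that never flags anything can still score a high accuracy.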

* w.r.t. evaluating:
- The student can compare machine learning techniques and decide which one to apply.
- The student can evaluate the applicability of different search-based techniques in order to tune a prediction model.

* w.r.t. creating:
- The student can generate alternative prediction models and choose among them.
- The student is able to report on the choices they made when building a model and the rationale behind them.

 

Assessment information

The assessment consists of the following assignment categories:
Oral Exam accounts for 40% of the final mark

Practical Exam accounts for 60% of the final mark

Within the category Oral Exam, the following assignments must be completed:

  • Exam with a weighting factor of 1, and thus 40% of the total final mark.

    Explanation: 1 oral presentation that synthesizes one recent publication in the domain (40%)

Within the category Practical Exam, the following assignments must be completed:

  • Practical exam with a weighting factor of 1, and thus 60% of the total final mark.

    Explanation: 3 written assignments in which students apply software repository mining techniques to real software systems (20% each)

Additional info regarding evaluation

Students are evaluated on three programming assignments, and on an oral presentation that synthesizes one recent publication in the domain.
The assignments are mandatory and the deadlines are strict.
Failing to hand in an assignment results in a mark of "absent" for the course.
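
The weighting described above can be worked through with a short calculation. This is an illustrative sketch only. The function name `final_mark` is hypothetical, a missing assignment is represented by `None`, and any rounding applied by the administration is not specified here and therefore not modelled.

```python
def final_mark(presentation, assignments):
    """Combine marks on the 0-20 scale: presentation 40%, each assignment 20%.

    Returns None when an assignment was not handed in, standing in for
    the 'absent' outcome (an assumption about how to represent it).
    """
    if len(assignments) != 3:
        raise ValueError("expected exactly three assignment marks")
    if any(mark is None for mark in assignments):
        return None
    # Integer weights avoid accumulating floating-point error.
    return (40 * presentation + 20 * sum(assignments)) / 100


if __name__ == "__main__":
    print(final_mark(15, [12, 14, 16]))    # weighted mark on the 0-20 scale
    print(final_mark(15, [12, None, 16]))  # one assignment missing
```

For example, a presentation mark of 15 and assignment marks of 12, 14 and 16 give 0.4 × 15 + 0.2 × (12 + 14 + 16) = 14.4 out of 20.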

Permitted unsatisfactory mark
Check the supplementary Teaching and Examination Regulations of your faculty to see whether a permitted unsatisfactory mark is possible for this course unit.

Academic context

This offering is part of the following study plans:
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Artificiële Intelligentie
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Multimedia
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Software Languages and Software Engineering
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Data Management en Analytics
Master in Applied Sciences and Engineering: Computer Science: Artificial Intelligence (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Multimedia (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Data Management and Analytics (only offered in English)