Simulation environment for synthesizing data pipeline-based event logs

Process mining techniques have recently been applied to discover the structure of big data pipelines from the event logs that track their execution. However, organizations rarely share their real-world event logs, which makes it difficult to analyze the expected efficiency of a pipeline before it is actually executed. The goal of this thesis is to implement a simulation environment that generates event logs realistically representing many potential pipeline executions. The simulation will return the KPIs (key performance indicators) of the pipeline in the given scenario, as well as pipeline-based statistics (resource utilization, waiting times, bottlenecks, etc.).
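
To make the intended output concrete, the following is a minimal sketch of such a simulation, not the design the thesis prescribes. It assumes a strictly linear pipeline, one worker per step with FIFO queueing, and exponentially distributed inter-arrival and service times. The names `Event`, `simulate`, and `kpis`, the step names, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Event:
    """One event-log row: a pipeline step executed for one case."""
    case_id: int
    activity: str
    ready: float   # time the case became available to this step
    start: float   # time the step actually started processing it
    end: float     # time the step finished


def simulate(steps, n_cases, mean_interarrival, seed=0):
    """Simulate a linear pipeline (assumption: one FIFO worker per step).

    steps: list of (name, mean_service_time) tuples, in execution order.
    """
    rng = random.Random(seed)
    free_at = {name: 0.0 for name, _ in steps}  # when each worker is next idle
    log = []
    arrival = 0.0
    for case_id in range(n_cases):
        # Assumption: exponential inter-arrival times.
        arrival += rng.expovariate(1.0 / mean_interarrival)
        ready = arrival
        for name, mean_service in steps:
            start = max(ready, free_at[name])  # wait if the worker is busy
            end = start + rng.expovariate(1.0 / mean_service)
            free_at[name] = end
            log.append(Event(case_id, name, ready, start, end))
            ready = end  # the case moves on to the next step
    return log


def kpis(log):
    """Derive per-step utilization, mean waiting time, and the bottleneck."""
    makespan = max(e.end for e in log)
    totals = {}
    for e in log:
        t = totals.setdefault(e.activity, {"busy": 0.0, "wait": 0.0, "n": 0})
        t["busy"] += e.end - e.start
        t["wait"] += e.start - e.ready
        t["n"] += 1
    per_step = {
        name: {
            "utilization": t["busy"] / makespan,
            "mean_wait": t["wait"] / t["n"],
        }
        for name, t in totals.items()
    }
    # Here the bottleneck is taken to be the step with the highest utilization.
    bottleneck = max(per_step, key=lambda name: per_step[name]["utilization"])
    return {"makespan": makespan, "per_step": per_step, "bottleneck": bottleneck}


if __name__ == "__main__":
    # Hypothetical three-step pipeline; times are in arbitrary units.
    pipeline = [("ingest", 1.0), ("transform", 3.0), ("store", 0.5)]
    event_log = simulate(pipeline, n_cases=200, mean_interarrival=2.0, seed=1)
    print(kpis(event_log))
```

A realistic environment would also need branching and parallel steps, shared resources, and service-time distributions fitted to real data. The sketch only illustrates how an event log and the statistics derived from it relate to each other.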

Published Oct. 11, 2022 09:14 - Last modified Oct. 11, 2022 09:14

Supervisor(s)

Scope (credits)

60