Benchmark ML-based simulation methods

Background and project description

Simulated data play a central role in the development and evaluation of computational methods. Ideally, simulated data should mimic real-world data in their key characteristics and feature distributions. However, simplistic assumptions in simulations often lead to over-optimistic conclusions in subsequent evaluations (for example, about method performance or about rejecting the null hypothesis of a statistical test). In this thesis project, the candidate will benchmark existing and novel machine learning (ML)-based simulation methods for generating the kinds of molecular datasets that are common in biomedical science.

How will this task help in future jobs? 

Data simulation and benchmarking are useful skills, particularly when developing or evaluating novel computational methods. The thought process, resources, and skills developed through this task are therefore transferable to future jobs.

Required background 

Study programs: Data Science/Computational Science/Statistics/Informatics/Bioinformatics

Skills: A good grasp of statistics and machine learning is assumed, as are strong programming skills in Python. No biology knowledge is required.


Keywords: Machine Learning, Simulations, Benchmarking
Published Oct. 4, 2022 09:25 - Last modified Oct. 4, 2022 09:25


Scope (credits)