Specializations within the Data Science Master

The Data Science master has five specializations. Each is described below, together with some previous master projects and a link to the related research group. Have a look at these pages as well!

Specializations:

Database Integration and Semantic Web

For this specialization, please contact the research group “Analytical Solutions and Reasoning.” Available master topics are listed here. Most of our topics concern methods to access and connect information from heterogeneous data sets constructed for differing purposes, using differing models. Ontologies and knowledge graphs are relevant concepts. Some, but not all, of the topics also contain elements of machine learning.

Some recent MSc theses:

  • Improving Access to Relevant Knowledge in Large Ontologies through Best Excerpts from Text by Martin Lam. This project addresses the problem that large domain models (ontologies) can be hard for humans to navigate. There are methods to extract “excerpts” from ontologies that give an overview of a small part of the domain, but it is hard to say which part will be useful. In this thesis, natural language text is analysed together with an ontology to find the most relevant parts.
  • SPARQL Extension Ranking – Collaborative filtering for OptiqueVQS-queries by Tom Fredrik Christoffersen. The starting point was a graphical user interface for composing database queries, similar to the faceted search interfaces used in many online stores, but for graph queries. This thesis investigates methods to anticipate which filters users are most likely to add next, based on the filters already selected and a log of previous queries.
  • Frog: Functions for ontologies — An extension for the OTTR-framework by Marlen Jarholt. The starting point is OTTR, a language for domain modelling developed in our group. The thesis adds functions to the OTTR language, which requires both some theoretical work on the OTTR type system, and implementation work.

Data Science and Life Science

This specialization is focused on machine learning, most often deep learning, applied to complex sequence data. The data are typically very high-dimensional, with complex dependencies and data types, so advanced machine learning is usually required to detect any meaningful signals. The group also has research interests in causality and domain robustness. The data are typically molecular data for which some prior knowledge is available, which means that architectures can be inspired by domain knowledge. Recently, the group has also developed an interest in using machine learning to connect climate change with future health consequences. Tasks range from theoretical machine learning to more applied work. The tasks are offered by the Biomedical Informatics group.

Some recent projects:

Digital Image Processing (Analysis)

This specialization, driven by the research group Digital Signal Processing and Image Analysis, focuses on image analysis and deep learning for images, with applications in medical imaging, sonar, seismics, and remote sensing. Projects in signal processing are also possible. Some recent master projects:

Language Technology

This specialization is driven by the Language Technology Group, LTG
(https://www.mn.uio.no/ifi/english/research/groups/ltg/). Its research
areas include computational linguistics, deep learning in natural
language processing (NLP), large language models, machine translation,
evaluation methods for NLP tasks, human annotation of linguistic
datasets, semantic change detection and many others.

Some recent master projects:

  • Text-Based Prediction of Dwelling Condition by Ece Cetinoglu. This thesis used Norwegian BERT models to predict the condition score of dwellings on the real estate market, based on features extracted from the text of their listing advertisements.
  • NLP-Based Automated Conspiracy Detection for Massive Twitter Datasets by Rohullah Akbari. This thesis analyzes conspiracy theories in COVID-19-related misinformation on Twitter using the nine manually labeled conspiracy categories from the COCO dataset, a multilabel, multiclass text-based dataset.

Statistics and Machine Learning

This specialization is mainly driven by the research group in Statistics and Data Science. Research areas cover both theoretical and applied statistics, including inference for high-dimensional data, survival and event history analysis, model selection and criticism, graphical modelling, non-parametrics, machine learning, hierarchical Bayesian modelling, temporal and spatial modelling, and general methodological development motivated by applications in public health, genetics, biology, climate science and other fields. Some recent master projects:

  • The application of penalized logistic regression for fraud detection by Shuijing Liao. Building good prediction models for detecting fraudulent cases is challenging, for instance because of inappropriate measures of prediction performance. The thesis studies, in the setting of fraud detection, whether and how the prediction performance of a penalized logistic regression model may be improved by applying appropriate optimality measures in the cross-validation of the penalty parameters.
  • Electricity Demand Forecasting by Eirik Sjåvik. This thesis introduces a medium-term forecast model for electricity demand in the Nordic region, utilizing seasonal Numerical Weather Prediction temperature forecasts.

Published Sep. 12, 2023 08:07 - Last modified Dec. 15, 2023 15:46