This project will investigate the use of transformer-based neural architectures for predicting political party affiliation from parliamentary speeches, at the level of individual speeches and/or speakers. The primary dataset will be the Talk of Norway (ToN) corpus, which comprises speeches from the Norwegian parliament from 1998 to 2016. English transcripts from the European Parliament may also be used.
The focus of this project will be on the effective use of transformer models for text classification, of political speeches in particular, and on whether these models outperform previously reported results based on, e.g., SVMs and CNNs. Relevant pre-trained contextual language models include transformer models such as NorBERT and NB-BERT, as well as LSTM-based models such as NorELMo.
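
One possible way to relate the speech-level and speaker-level tasks is to classify each speech separately and then assign every speaker the party predicted most often across their speeches. The sketch below illustrates only that aggregation step and is not a method prescribed by the project: the function name `aggregate_by_speaker`, the speaker identifiers, and the party labels are all hypothetical, and the speech-level predictions are assumed to come from some classifier that is not shown here.

```python
from collections import Counter, defaultdict


def aggregate_by_speaker(speech_predictions):
    """Assign each speaker the party predicted most often for their speeches.

    speech_predictions: iterable of (speaker_id, predicted_party) pairs,
    assumed to be the output of a speech-level classifier (not shown).
    Ties go to the label that was seen first for that speaker.
    """
    votes = defaultdict(Counter)
    for speaker_id, party in speech_predictions:
        votes[speaker_id][party] += 1
    # most_common(1) returns a one-element list of (label, count) pairs.
    return {
        speaker: counts.most_common(1)[0][0]
        for speaker, counts in votes.items()
    }


# Hypothetical speech-level predictions, for illustration only.
predictions = [
    ("speaker_1", "Ap"),
    ("speaker_1", "Ap"),
    ("speaker_1", "H"),
    ("speaker_2", "FrP"),
    ("speaker_2", "FrP"),
]
speaker_labels = aggregate_by_speaker(predictions)
```

Majority voting is only one option; averaging per-class probabilities over a speaker's speeches, or classifying the concatenated speeches directly, would be reasonable alternatives to compare.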