Open semantic analysis: The case of word level semantics in Danish

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Annual report year: 2017; Research; peer-review

The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so that they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled, and unsupervised learning was implemented both for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. Using supervised learning, we test how well the semantic representations align with single-word sentiment. We find that logistic regression and large random forests perform well with Word2vec features.
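
As an illustration of the word intrusion task mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each word already has a vector representation and picks as the intruder the word with the lowest mean cosine similarity to the others. The function names `cosine` and `find_intruder` and the three-dimensional toy vectors are hypothetical, invented for this example; real Word2vec vectors would be learned from a corpus and have many more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_intruder(words, vectors):
    """Return the word with the lowest mean similarity to the other words."""
    def mean_similarity(word):
        others = [w for w in words if w != word]
        total = sum(cosine(vectors[word], vectors[o]) for o in others)
        return total / len(others)

    return min(words, key=mean_similarity)


# Toy vectors, made up for illustration only (not learned from any corpus).
toy_vectors = {
    "hund": [0.90, 0.10, 0.00],  # dog
    "kat": [0.80, 0.20, 0.10],   # cat
    "hest": [0.85, 0.15, 0.05],  # horse
    "bil": [0.10, 0.90, 0.30],   # car
}

intruder = find_intruder(["hund", "kat", "hest", "bil"], toy_vectors)
print(intruder)
```

With these toy vectors the three animal words point in nearly the same direction, so the vehicle word is selected. Other scoring rules are possible; mean cosine similarity is only one simple choice.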
Original language: English
Title of host publication: Proceedings of 8th Language and Technology Conference
Number of pages: 5
Publication date: 2017
Publication status: Published - 2017
Event: 8th Language and Technology Conference - Poznan, Poland
Duration: 17 Nov 2017 - 19 Nov 2017

Conference

Conference: 8th Language and Technology Conference
Country: Poland
City: Poznan
Period: 17/11/2017 - 19/11/2017

ID: 140583527