Open semantic analysis: The case of word level semantics in Danish

Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

Abstract

The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled, and unsupervised learning was implemented both for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. We test the semantic representations’ alignment with single word sentiment using supervised learning. We find that logistic regression and large random forests perform well with Word2vec features.
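As a rough illustration of the word intrusion task named in the abstract, the sketch below picks the word whose vector is least similar, on average, to the others in a set. It is not the authors' implementation: the three-dimensional vectors are made-up toy values rather than output from a trained Word2vec model, and the mean-cosine selection rule and the names `cosine` and `find_intruder` are assumptions introduced here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_intruder(words, vectors):
    """Return the word with the lowest mean cosine similarity to the rest.

    This selection rule is an assumption for illustration, not
    necessarily the rule used in the paper.
    """
    def mean_similarity(word):
        others = [w for w in words if w != word]
        total = sum(cosine(vectors[word], vectors[w]) for w in others)
        return total / len(others)

    return min(words, key=mean_similarity)


# Hypothetical toy vectors (hand-made, NOT from a trained model).
toy_vectors = {
    "hund": [0.90, 0.10, 0.00],  # dog
    "kat": [0.80, 0.20, 0.10],   # cat
    "hest": [0.85, 0.15, 0.05],  # horse
    "bil": [0.10, 0.90, 0.30],   # car
}

intruder = find_intruder(list(toy_vectors), toy_vectors)
print(intruder)
```

With real embeddings, the dictionary would instead map each Danish word to its learned vector; the three animal words here point in nearly the same direction, so the vehicle word stands out.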
Original language: English
Title of host publication: Proceedings of 8th Language and Technology Conference
Number of pages: 5
Publication date: 2017
Publication status: Published - 2017
Event: 8th Language and Technology Conference - Poznan, Poland
Duration: 17 Nov 2017 to 19 Nov 2017

Conference

Conference: 8th Language and Technology Conference
Country: Poland
City: Poznan
Period: 17/11/2017 to 19/11/2017

Cite this

Nielsen, F. Å., & Hansen, L. K. (2017). Open semantic analysis: The case of word level semantics in Danish. In Proceedings of 8th Language and Technology Conference.
@inproceedings{8f1d1a60e1f947b58ce285e1bd4ad833,
title = "Open semantic analysis: The case of word level semantics in Danish",
abstract = "The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled and unsupervised learning implemented for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. We test the semantic representations’ alignment with single word sentiment using supervised learning. We find that logistic regression and large random forests perform well with Word2vec features.",
author = "Nielsen, {Finn {\AA}rup} and Hansen, {Lars Kai}",
year = "2017",
language = "English",
booktitle = "Proceedings of 8th Language and Technology Conference",

}

Nielsen, FÅ & Hansen, LK 2017, Open semantic analysis: The case of word level semantics in Danish. in Proceedings of 8th Language and Technology Conference. 8th Language and Technology Conference, Poznan, Poland, 17/11/2017.

Open semantic analysis: The case of word level semantics in Danish. / Nielsen, Finn Årup; Hansen, Lars Kai.

Proceedings of 8th Language and Technology Conference. 2017.


TY - GEN

T1 - Open semantic analysis: The case of word level semantics in Danish

AU - Nielsen, Finn Årup

AU - Hansen, Lars Kai

PY - 2017

Y1 - 2017

N2 - The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled and unsupervised learning implemented for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. We test the semantic representations’ alignment with single word sentiment using supervised learning. We find that logistic regression and large random forests perform well with Word2vec features.

AB - The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled and unsupervised learning implemented for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. We test the semantic representations’ alignment with single word sentiment using supervised learning. We find that logistic regression and large random forests perform well with Word2vec features.

M3 - Article in proceedings

BT - Proceedings of 8th Language and Technology Conference

ER -

Nielsen FÅ, Hansen LK. Open semantic analysis: The case of word level semantics in Danish. In Proceedings of 8th Language and Technology Conference. 2017.