Title: | Advances in Weakly Supervised Learning of Morphology |
Author(s): | Kohonen, Oskar |
Date: | 2015 |
Language: | en |
Pages: | 148 + app. 92 |
Department: | Department of Computer Science (Tietotekniikan laitos) |
ISBN: | 978-952-60-6271-6 (electronic) 978-952-60-6270-9 (printed) |
Series: | Aalto University publication series DOCTORAL DISSERTATIONS, 91/2015 |
ISSN: | 1799-4942 (electronic) 1799-4934 (printed) 1799-4934 (ISSN-L) |
Supervising professor(s): | Oja, Erkki, Distinguished Prof. Emeritus, Aalto University, Department of Information and Computer Science, Finland |
Thesis advisor(s): | Lagus, Krista, Dr., Aalto University, Department of Computer Science, Finland |
Subject: | Linguistics |
Keywords: | morphology, allomorphy, machine learning, unsupervised learning, semi-supervised learning |
Abstract: Morphological analysis decomposes words into smaller constituents. It is an important problem in natural language processing (NLP), particularly for morphologically rich languages, whose large vocabularies make statistical modeling difficult. Morphological analysis has traditionally been approached with rule-based methods, which yield accurate results but are expensive to produce. More recently, unsupervised machine learning methods have been shown to perform well enough to benefit applications such as speech recognition and machine translation. Unsupervised methods, however, do not typically model allomorphy, that is, non-concatenative structure such as pretty/prettier. Moreover, the accuracy of unsupervised methods remains far behind that of rule-based methods, with the best unsupervised methods yielding F-scores between 50% and 66% in Morpho Challenge 2010.
Parts:
[Publication 1]: Oskar Kohonen, Sami Virpioja, and Mikaela Klami. Allomorfessor: Towards Unsupervised Morpheme Analysis. In Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Revised Selected Papers, volume 5706 of Lecture Notes in Computer Science, Aarhus, Denmark, pages 975-982, September 2009.
[Publication 2]: Sami Virpioja, Oskar Kohonen, and Krista Lagus. Unsupervised Morpheme Analysis with Allomorfessor. In Multilingual Information Access Evaluation I. Text Retrieval Experiments, CLEF 2009, volume 6241 of Lecture Notes in Computer Science, Corfu, Greece, pages 609-616, September 2010.
[Publication 3]: Sami Virpioja, Oskar Kohonen, and Krista Lagus. Evaluating the Effect of Word Frequencies in a Probabilistic Generative Model of Morphology. In Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011, Riga, Latvia, pages 230-237, May 2011.
[Publication 4]: Oskar Kohonen, Sami Virpioja, and Krista Lagus. Semi-Supervised Learning of Concatenative Morphology. In Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology, Uppsala, Sweden, pages 78-86, July 2010.
[Publication 5]: Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo. Supervised Morphological Segmentation in a Low-Resource Learning Setting using Conditional Random Fields. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning (CoNLL), Sofia, Bulgaria, pages 29-37, August 2013.
[Publication 6]: Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, and Mikko Kurimo. Painless Semi-Supervised Morphological Segmentation using Conditional Random Fields. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Gothenburg, Sweden, pages 84-89, May 2014.
[Publication 7]: Teemu Ruokolainen, Oskar Kohonen, Kairit Sirts, Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. A Comparative Study on Semi-Supervised Morphological Segmentation. Submitted, Computational Linguistics, 27 pages, 2014.
Unless otherwise stated, all rights belong to the author. You may download, display, and print this publication for your own personal use. Commercial use is prohibited.