Title: | Morph-based speech retrieval: Indexing methods and evaluations of unsupervised morphological analysis / Morfeihin perustuva puhetiedonhaku: indeksointimenetelmiä sekä ohjaamattoman morfologisen analyysin evaluaatioita |
Author(s): | Turunen, Ville T. |
Date: | 2012 |
Language: | en |
Pages: | 228 |
Department: | Department of Information and Computer Science (Tietojenkäsittelytieteen laitos) |
ISBN: | 978-952-60-4718-8 (electronic) 978-952-60-4717-1 (printed) |
Series: | Aalto University publication series DOCTORAL DISSERTATIONS, 97/2012 |
ISSN: | 1799-4942 (electronic) 1799-4934 (printed) 1799-4934 (ISSN-L) |
Supervising professor(s): | Oja, Erkki, Prof. |
Thesis advisor(s): | Kurimo, Mikko, Dr. |
Subject: | Linguistics |
Keywords: | speech retrieval, spoken document retrieval, subword indexing, morphemes, out-of-vocabulary, confusion networks, morphological analysis, puhetiedonhaku, sananosat, morfeemi, konfuusioverkko, morfologinen analyysi |
OEVS yes | |
Digitized thesis: | ask |
|
|
Abstract: Speech retrieval makes it possible to find information in collections of spoken material. Speech recognition is used to convert the spoken words into text, and information retrieval methods are used to search the recognized text. Conventional recognition systems have a predetermined vocabulary, so out-of-vocabulary words can never be recognized correctly. Rare words are usually left out of the vocabulary, which is problematic for retrieval because query words are often rare words, such as proper names. A limited vocabulary is especially problematic for languages with a large number of word forms, such as Finnish. |
|
Parts:
[Publication 1]: Mikko Kurimo, Ville Turunen and Inger Ekman. An evaluation of a spoken document retrieval baseline system in Finnish. In Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004 - ICSLP), Jeju Island, Korea, pp. 1585-1588, October 2004.
[Publication 2]: Mikko Kurimo and Ville Turunen. To recover from speech recognition errors in spoken document retrieval. In Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005 - Eurospeech), Lisbon, Portugal, pp. 605-608, September 2005.
[Publication 3]: Ville T. Turunen and Mikko Kurimo. Using latent semantic indexing for morph-based spoken document retrieval. In Proceedings of the 9th International Conference on Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh PA, USA, pp. 341-344, September 2006.
[Publication 4]: Ville T. Turunen and Mikko Kurimo. Indexing confusion networks for morph-based spoken document retrieval. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, pp. 631-638, July 2007.
[Publication 5]: Ville T. Turunen. Reducing the effect of OOV query words by using morph-based spoken document retrieval. In Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 2158-2161, September 2008.
[Publication 6]: Ville T. Turunen and Mikko Kurimo. Speech retrieval from unsegmented Finnish audio using statistical morpheme-like units for segmentation, recognition, and retrieval. ACM Transactions on Speech and Language Processing, Vol. 8, No. 1, pp. 1-25, October 2011.
[Publication 7]: Sami Virpioja, Ville T. Turunen, Sebastian Spiegler, Oskar Kohonen and Mikko Kurimo. Empirical comparison of evaluation methods for unsupervised learning of morphology. Traitement Automatique des Langues, Vol. 52, No. 2, pp. 45-90, 2011. |
|
|
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.