Raul Sirel

General

  • Date of birth: 14.11.1986

  • E-mail: rsirel@texta.ee

  • Phone: No phone number? Send an e-mail!

  • Research interests: language technology, machine learning, history, geography.

Career

  • 01.01.2017–… TEXTA, CTO; Founder (1,00)

  • 01.05.2015–31.12.2016 STACC, Project Manager (1,00)

  • 01.09.2013–31.12.2013 University of Western Sydney, Visiting Researcher (0,50)

  • 01.09.2013–31.12.2013 NICTA Canberra Research Lab, Visiting Researcher (0,50)

  • 01.07.2011–30.04.2015 STACC, Researcher (1,00)

  • 15.11.2010–30.06.2011 STACC, Researcher (0,50)

Education

  • 2011–… University of Tartu, PhD Studies in General Linguistics

  • 2009–2011 University of Tartu, MA Studies in Computational Linguistics

  • 2006–2009 University of Tartu, BA Studies in Computational Linguistics

  • 2003–2006 Tallinn Technical Secondary School

Qualifications

Academic Degree

  • Raul Sirel, Master’s Degree, 2011, (sup) Margus Treumuth, Poolautomaatne teadmusbaaside konstrueerimine dialoogsüsteemidele (Semi-Automatic Knowledge Base Construction for Dialogue Agents), University of Tartu.

  • Raul Sirel, PhD student, (sup) Kadri Muischnek; Jaak Vilo, Eestikeelsete terviselugude andmekaeve (Text mining of Estonian Medical Records), University of Tartu.

Additional Training

  • 2014 - 13th Estonian Summer School on Computer and Systems Science

  • 2013 - Lisbon Machine Learning Summer School (LxMLS)

  • 2013 - 12th Estonian Summer School on Computer and Systems Science

  • 2013 - Australian Health Informatics Summer School

  • 2012 - 11th Estonian Summer School on Computer and Systems Science

Projects

Scientific Projects

  • EMBEDDIA “Cross-Lingual Embeddings for Less-Represented Languages in European News Media”, TEXTA OÜ, H2020: https://cordis.europa.eu/project/id/825153.

  • TAR16013 (EXCITE) “Estonian Centre of Excellence in ICT Research (1.09.2016−1.03.2023)”, Maarja Kruusmaa, Tallinn University of Technology, School of Information Technologies, Centre for Biorobotics, Cybernetica AS.

  • EKTR3 “TEXTA Toolkit 2.0 (1.01.2018−31.12.2020)”, Raul Sirel, TEXTA.

  • IUT34-4 “Data Science Methods and Applications (DSMA) (1.01.2015−31.12.2020)”, Jaak Vilo, University of Tartu, Faculty of Science and Technology, Institute of Computer Science.

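  • EKT108 “TEXTA tööriistakomplekti jätkuarendus ja juurutamine (Further Development and Deployment of the TEXTA Toolkit) (1.01.2017−31.12.2017)”, Raul Sirel, TEXTA.

  • EKT68 “Töövahendite raamistik tekstikorpustest teadmuse tuletamiseks (A Framework of Tools for Deriving Knowledge from Text Corpora) (1.01.2015−31.12.2016)”, Raul Sirel, STACC.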

  • ETF9124 “Modelling of conversational agent and Estonian dialogue corpus (1.01.2012−31.12.2014)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.

  • EKT5 “Eestikeelse dialoogi pragmaatika analüsaator (Pragmatics Analyzer for Estonian Dialogue) (1.01.2011−31.12.2013)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.

  • SF0180078s08 “Development and implementation of formalisms and efficient algorithms of natural language processing for the Estonian language (1.01.2008−31.12.2013)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.

  • ETF7503 “Communicative strategies in a communication model: modelling Estonian dialogue on the computer (1.01.2008−31.12.2011)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.

  • EKKTT09-65 “Automaatne parafraaside leidmine ning sõnade ja lühifraaside tõlkimine paralleelkorpuste abil (Automatic Paraphrase Detection and Translation of Words and Short Phrases Using Parallel Corpora) (1.01.2009−31.12.2010)”, Maarika Traat, University of Tartu, Faculty of Mathematics and Computer Science.

Commercial Projects

  • TTJA “Automatic tagging of e-mails 2”, 2021-2022, TEXTA OÜ.

  • MKM “Bürokratt chatbot 1 & 2”, 2020-2021, TEXTA OÜ.

  • MKM “Bürokratt - automatic tagging of e-mails”, 2020-2021, TEXTA OÜ.

  • TTJA “Automatic tagging of e-mails”, 2020-2021, TEXTA OÜ.

  • Rahvusraamatukogu “Automatic tagger: KRATT 1 & 2”, 2019-2021, TEXTA OÜ.

  • HTM “Detecting GDPR leaks from document registry”, 2018, TEXTA OÜ.

  • Ekspress Grupp “Automatic moderation of comments”, 2019-…, TEXTA OÜ.

  • RIK “Automatic deidentification of court decisions”, 2019, TEXTA OÜ.

  • Õhtuleht “Automatic tagging of newspaper articles”, 2018, TEXTA OÜ.

  • Äripäev “Automatic tagging of newspaper articles”, 2018, TEXTA OÜ.

  • CV Online “Recommendation engine for CVs and job offers”, 2017, TEXTA OÜ.

  • Õhtuleht “Automatic moderation of comments”, 2016, STACC OÜ.

Supervised Dissertations

  • Katrin Valdson, Master’s Degree, 2016, (sup) Raul Sirel; Aleksandr Tkatšenko, Mustripõhine informatsiooni eraldamine Eesti kohtulahenditest (Pattern based information extraction from Estonian court documents), University of Tartu, Faculty of Science and Technology, Institute of Computer Science.

Teaching

Publications

  • Asula, Marit; Makke, Jane; Freienthal, Linda; Kuulmets, Hele-Andra; Sirel, Raul (2021). Kratt: Developing an Automatic Subject Indexing Tool for the National Library of Estonia. Cataloging & Classification Quarterly, 1−19. DOI: 10.1080/01639374.2021.1998283.

  • Vaik, Kristiina; Asula, Marit; Sirel, Raul (2020). Hybrid Tagger – An Industry-driven Solution for Extreme Multi-label Text Classification. In: Proceedings of the LREC2020 Industry Track (26−30). Language Resources and Evaluation Conference (LREC 2020), Marseille, 11–16 May 2020. European Language Resources Association (ELRA). DOI: 10.5281/zenodo.4306169.

  • Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara (2015). Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction. Journal of the American Medical Informatics Association, 22 (E1), E48−E66. DOI: 10.1136/amiajnl-2014-002868.

  • Sirel, Raul (2013). Meetodeid tekstide leksikaalsete ja grammatiliste erinevuste tuvastamiseks meditsiiniliste tarbetekstide näitel. Eesti Rakenduslingvistika Ühingu aastaraamat, 9, 265−278. DOI: 10.5128/ERYa9.17.

  • Reisberg, S; Sirel, R; Kalda, R; Merzin, M; Pruulmann, J; Vilo, J (2013). Elektrooniliste terviselugude analüüsimise võimalused Tartu perearstide infosüsteemi näitel. Eesti Arst, 92 (8), 452−459.

  • Sirel, Raul (2013). Morphostatistical Approach to Medical Document Classification. Proceedings of the 4th International Louhi Workshop on Health Document Text Mining and Information Analysis (Louhi 2013): 11-12 Feb 2013 Sydney. Ed. Suominen, Hanna.

  • Sirel, R. (2012). Knowledge Acquisition Tool for Dialogue Systems. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 201−205.

  • Sirel, Raul (2012). Dynamic User Interfaces for Synchronous Encoding and Linguistic Uniforming of Textual Clinical Data. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 206−212.

  • Muischnek, K.; Kaalep, H.-J.; Sirel, R. (2011). Korpuslingvistiline lähenemine eesti internetikeele automaatsele morfoloogilisele analüüsile. Metslang, H.; Langemets, M.; Sepper, M.-M. (Eds.). Eesti Rakenduslingvistika Ühingu aastaraamat (111−127). Tallinn: Eesti Rakenduslingvistika Ühing. DOI: 10.5128/ERYa7.07.