Raul Sirel
General
Date of birth: 14.11.1986
E-mail: rsirel@texta.ee
Phone: No phone number? Send an e-mail!
Research interests: language technology, machine learning, history, geography.
Career
01.01.2017–… TEXTA, CTO; Founder (1,00)
01.05.2015–31.12.2016 STACC, Project Manager (1,00)
01.09.2013–31.12.2013 University of Western Sydney, Visiting Researcher (0,50)
01.09.2013–31.12.2013 NICTA Canberra Research Lab, Visiting Researcher (0,50)
01.07.2011–30.04.2015 STACC, Researcher (1,00)
15.11.2010–30.06.2011 STACC, Researcher (0,50)
Education
2011–… University of Tartu, PhD Studies in General Linguistics
2009–2011 University of Tartu, MA Studies in Computational Linguistics
2006–2009 University of Tartu, BA Studies in Computational Linguistics
2003–2006 Tallinn Technical Secondary School
Qualifications
Academic Degree
Raul Sirel, Master’s Degree, 2011, (sup) Margus Treumuth, Poolautomaatne teadmusbaaside konstrueerimine dialoogsüsteemidele (Semi-Automatic Knowledge Base Construction for Dialogue Agents), University of Tartu.
Raul Sirel, PhD student, (sup) Kadri Muischnek; Jaak Vilo, Eestikeelsete terviselugude andmekaeve (Text mining of Estonian Medical Records), University of Tartu.
Additional Training
2014 - 13th Estonian Summer School on Computer and Systems Science
2013 - Lisbon Machine Learning Summer School (LxMLS)
2013 - 12th Estonian Summer School on Computer and Systems Science
2013 - Australian Health Informatics Summer School
2012 - 11th Estonian Summer School on Computer and Systems Science
Projects
Scientific Projects
EMBEDDIA “Cross-Lingual Embeddings for Less-Represented Languages in European News Media”, TEXTA OÜ, H2020: https://cordis.europa.eu/project/id/825153.
TAR16013 (EXCITE) “Estonian Centre of Excellence in ICT Research (1.09.2016−1.03.2023)”, Maarja Kruusmaa, Tallinn University of Technology, School of Information Technologies, Centre for Biorobotics, Cybernetica AS.
EKTR3 “TEXTA Toolkit 2.0 (1.01.2018−31.12.2020)”, Raul Sirel, TEXTA.
IUT34-4 “Data Science Methods and Applications (DSMA) (1.01.2015−31.12.2020)”, Jaak Vilo, University of Tartu, Faculty of Science and Technology, Institute of Computer Science.
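EKT108 “TEXTA tööriistakomplekti jätkuarendus ja juurutamine (Further Development and Deployment of the TEXTA Toolkit) (1.01.2017−31.12.2017)”, Raul Sirel, TEXTA.
EKT68 “Töövahendite raamistik tekstikorpustest teadmuse tuletamiseks (A Framework of Tools for Deriving Knowledge from Text Corpora) (1.01.2015−31.12.2016)”, Raul Sirel, STACC.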
ETF9124 “Modelling of conversational agent and Estonian dialogue corpus (1.01.2012−31.12.2014)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
EKT5 “Eestikeelse dialoogi pragmaatika analüsaator (Pragmatics Analyser for Estonian Dialogue) (1.01.2011−31.12.2013)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
SF0180078s08 “Development and implementation of formalisms and efficient algorithms of natural language processing for the Estonian language (1.01.2008−31.12.2013)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
ETF7503 “Communicative strategies in a communication model: modelling Estonian dialogue on the computer (1.01.2008−31.12.2011)”, Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
EKKTT09-65 “Automaatne parafraaside leidmine ning sõnade ja lühifraaside tõlkimine paralleelkorpuste abil (Automatic Paraphrase Detection and Translation of Words and Short Phrases Using Parallel Corpora) (1.01.2009−31.12.2010)”, Maarika Traat, University of Tartu, Faculty of Mathematics and Computer Science.
Commercial Projects
TTJA “Automatic tagging of e-mails 2”, 2021-2022, TEXTA OÜ.
MKM “Bürokratt chatbot 1 & 2”, 2020-2021, TEXTA OÜ.
MKM “Bürokratt - automatic tagging of e-mails”, 2020-2021, TEXTA OÜ.
TTJA “Automatic tagging of e-mails”, 2020-2021, TEXTA OÜ.
Rahvusraamatukogu “Automatic tagger: KRATT 1 & 2”, 2019-2021, TEXTA OÜ.
HTM “Detecting GDPR leaks from document registry”, 2018, TEXTA OÜ.
Ekspress Grupp “Automatic moderation of comments”, 2019-…, TEXTA OÜ.
RIK “Automatic deidentification of court decisions”, 2019, TEXTA OÜ.
Õhtuleht “Automatic tagging of newspaper articles”, 2018, TEXTA OÜ.
Äripäev “Automatic tagging of newspaper articles”, 2018, TEXTA OÜ.
CV Online “Recommendation engine for CVs and job offers”, 2017, TEXTA OÜ.
Õhtuleht “Automatic moderation of comments”, 2016, STACC OÜ.
Supervised Dissertations
Katrin Valdson, Master’s Degree, 2016, (sup) Raul Sirel; Aleksandr Tkatšenko, Mustripõhine informatsiooni eraldamine Eesti kohtulahenditest (Pattern based information extraction from Estonian court documents), University of Tartu, Faculty of Science and Technology, Institute of Computer Science.
Teaching
2015 Estonian NLP in Python (course)
2014 Corpus Linguistics (course)
Publications
Asula, Marit; Makke, Jane; Freienthal, Linda; Kuulmets, Hele-Andra; Sirel, Raul (2021). Kratt: Developing an Automatic Subject Indexing Tool for the National Library of Estonia. Cataloging & Classification Quarterly, 1−19. DOI: 10.1080/01639374.2021.1998283.
Vaik, Kristiina; Asula, Marit; Sirel, Raul (2020). Hybrid Tagger – An Industry-driven Solution for Extreme Multi-label Text Classification. In: Proceedings of the LREC2020 Industry Track (26−30). Language Resources and Evaluation Conference (LREC 2020), Marseille, 11–16 May 2020. European Language Resources Association (ELRA). DOI: 10.5281/zenodo.4306169.
Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara (2015). Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction. Journal of the American Medical Informatics Association, 22 (E1), E48−E66. DOI: 10.1136/amiajnl-2014-002868.
Sirel, Raul (2013). Meetodeid tekstide leksikaalsete ja grammatiliste erinevuste tuvastamiseks meditsiiniliste tarbetekstide näitel. Eesti Rakenduslingvistika Ühingu aastaraamat, 9, 265−278. DOI: 10.5128/ERYa9.17.
Reisberg, S; Sirel, R; Kalda, R; Merzin, M; Pruulmann, J; Vilo, J (2013). Elektrooniliste terviselugude analüüsimise võimalused Tartu perearstide infosüsteemi näitel. Eesti Arst, 92 (8), 452−459.
Sirel, Raul (2013). Morphostatistical Approach to Medical Document Classification. Proceedings of the 4th International Louhi Workshop on Health Document Text Mining and Information Analysis (Louhi 2013): 11-12 Feb 2013 Sydney. Ed. Suominen, Hanna.
Sirel, R. (2012). Knowledge Acquisition Tool for Dialogue Systems. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 201−205.
Sirel, Raul (2012). Dynamic User Interfaces for Synchronous Encoding and Linguistic Uniforming of Textual Clinical Data. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 206−212.
Muischnek, K.; Kaalep, H.-J.; Sirel, R. (2011). Korpuslingvistiline lähenemine eesti internetikeele automaatsele morfoloogilisele analüüsile. Metslang, H.; Langemets, M.; Sepper, M.-M. (Eds.). Eesti Rakenduslingvistika Ühingu aastaraamat (111−127). Tallinn: Eesti Rakenduslingvistika Ühing. DOI: 10.5128/ERYa7.07.