Raul Sirel
==========

.. image:: ../img/raul.png
   :width: 200

General
#######

* Date of birth: 14.11.1986
* E-mail: rsirel@texta.ee
* Phone: No phone number? Send an e-mail!
* Research interests: language technology, machine learning, history, geography.

Career
######

* 01.01.2017–... TEXTA, CTO; Founder (1.00)
* 01.05.2015–31.12.2016 STACC, Project Manager (1.00)
* 01.09.2013–31.12.2013 University of Western Sydney, Visiting Researcher (0.50)
* 01.09.2013–31.12.2013 NICTA Canberra Research Lab, Visiting Researcher (0.50)
* 01.07.2011–30.04.2015 STACC, Researcher (1.00)
* 15.11.2010–30.06.2011 STACC, Researcher (0.50)

Education
#########

* 2011–... University of Tartu, PhD Studies in General Linguistics
* 2009–2011 University of Tartu, MA Studies in Computational Linguistics
* 2006–2009 University of Tartu, BA Studies in Computational Linguistics
* 2003–2006 Tallinn Technical Secondary School

Qualifications
##############

Academic Degree
***************

* Raul Sirel, Master's Degree, 2011, (sup) Margus Treumuth, Poolautomaatne teadmusbaaside konstrueerimine dialoogsüsteemidele (Semi-Automatic Knowledge Base Construction for Dialogue Agents), University of Tartu.
* Raul Sirel, PhD student, (sup) Kadri Muischnek; Jaak Vilo, Eestikeelsete terviselugude andmekaeve (Text mining of Estonian Medical Records), University of Tartu.

Additional Training
*******************

* 2014 - 13th Estonian Summer School on Computer and Systems Science
* 2013 - Lisbon Machine Learning Summer School (LxMLS)
* 2013 - 12th Estonian Summer School on Computer and Systems Science
* 2013 - Australian Health Informatics Summer School
* 2012 - 11th Estonian Summer School on Computer and Systems Science

Projects
########

Scientific Projects
*******************

* EMBEDDIA "Cross-Lingual Embeddings for Less-Represented Languages in European News Media", TEXTA OÜ, H2020: https://cordis.europa.eu/project/id/825153.
* TAR16013 (EXCITE) "Estonian Centre of Excellence in ICT Research (1.09.2016−1.03.2023)", Maarja Kruusmaa, Tallinn University of Technology, School of Information Technologies, Centre for Biorobotics, Cybernetica AS.
* EKTR3 "TEXTA Toolkit 2.0 (1.01.2018−31.12.2020)", Raul Sirel, TEXTA.
* IUT34-4 "Data Science Methods and Applications (DSMA) (1.01.2015−31.12.2020)", Jaak Vilo, University of Tartu, Faculty of Science and Technology, Institute of Computer Science.
* EKT108 "TEXTA tööriistakomplekti jätkuarendus ja juurutamine (Further development and deployment of the TEXTA toolkit) (1.01.2017−31.12.2017)", Raul Sirel, TEXTA.
* EKT68 "Töövahendite raamistik tekstikorpustest teadmuse tuletamiseks (A framework of tools for deriving knowledge from text corpora) (1.01.2015−31.12.2016)", Raul Sirel, STACC.
* ETF9124 "Modelling of conversational agent and Estonian dialogue corpus (1.01.2012−31.12.2014)", Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
* EKT5 "Eestikeelse dialoogi pragmaatika analüsaator (Pragmatics analyser for Estonian dialogue) (1.01.2011−31.12.2013)", Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
* SF0180078s08 "Development and implementation of formalisms and efficient algorithms of natural language processing for the Estonian language (1.01.2008−31.12.2013)", Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
* ETF7503 "Communicative strategies in a communication model: modelling Estonian dialogue on the computer (1.01.2008−31.12.2011)", Mare Koit, University of Tartu, Faculty of Mathematics and Computer Science.
* EKKTT09-65 "Automaatne parafraaside leidmine ning sõnade ja lühifraaside tõlkimine paralleelkorpuste abil (Automatic paraphrase detection and translation of words and short phrases using parallel corpora) (1.01.2009−31.12.2010)", Maarika Traat, University of Tartu, Faculty of Mathematics and Computer Science.

Commercial Projects
*******************

* TTJA "Automatic tagging of e-mails 2", 2021-2022, TEXTA OÜ.
* MKM "Bürokratt chatbot 1 & 2", 2020-2021, TEXTA OÜ.
* MKM "Bürokratt - automatic tagging of e-mails", 2020-2021, TEXTA OÜ.
* TTJA "Automatic tagging of e-mails", 2020-2021, TEXTA OÜ.
* Rahvusraamatukogu "Automatic tagger: KRATT 1 & 2", 2019-2021, TEXTA OÜ.
* HTM "Detecting GDPR leaks from document registry", 2018, TEXTA OÜ.
* Ekspress Grupp "Automatic moderation of comments", 2019-..., TEXTA OÜ.
* RIK "Automatic deidentification of court decisions", 2019, TEXTA OÜ.
* Õhtuleht "Automatic tagging of newspaper articles", 2018, TEXTA OÜ.
* Äripäev "Automatic tagging of newspaper articles", 2018, TEXTA OÜ.
* CV Online "Recommendation engine for CVs and job offers", 2017, TEXTA OÜ.
* Õhtuleht "Automatic moderation of comments", 2016, STACC OÜ.

Supervised Dissertations
########################

* Katrin Valdson, Master's Degree, 2016, (sup) Raul Sirel; Aleksandr Tkatšenko, Mustripõhine informatsiooni eraldamine Eesti kohtulahenditest (Pattern based information extraction from Estonian court documents), University of Tartu, Faculty of Science and Technology, Institute of Computer Science.

Teaching
########

* 2023 Introduction to Machine Learning (lecture)
* 2015 Estonian NLP in Python (course)
* 2014 Corpus Linguistics (course)

Publications
############

* Asula, Marit; Makke, Jane; Freienthal, Linda; Kuulmets, Hele-Andra; Sirel, Raul (2021). Kratt: Developing an Automatic Subject Indexing Tool for the National Library of Estonia. Cataloging & Classification Quarterly, 1−19. DOI: 10.1080/01639374.2021.1998283.
* Vaik, Kristiina; Asula, Marit; Sirel, Raul (2020). Hybrid Tagger – An Industry-driven Solution for Extreme Multi-label Text Classification. In: Proceedings of the LREC2020 Industry Track (26−30). Language Resources and Evaluation Conference (LREC 2020), Marseille, 11–16 May 2020. European Language Resources Association (ELRA). DOI: 10.5281/zenodo.4306169.
* Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara (2015). Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction. Journal of the American Medical Informatics Association, 22 (E1), E48−E66. DOI: 10.1136/amiajnl-2014-002868.
* Sirel, Raul (2013). Meetodeid tekstide leksikaalsete ja grammatiliste erinevuste tuvastamiseks meditsiiniliste tarbetekstide näitel. Eesti Rakenduslingvistika Ühingu aastaraamat, 9, 265−278. DOI: 10.5128/ERYa9.17.
* Reisberg, S; Sirel, R; Kalda, R; Merzin, M; Pruulmann, J; Vilo, J (2013). Elektrooniliste terviselugude analüüsimise võimalused Tartu perearstide infosüsteemi näitel. Eesti Arst, 92 (8), 452−459.
* Sirel, Raul (2013). Morphostatistical Approach to Medical Document Classification. Proceedings of the 4th International Louhi Workshop on Health Document Text Mining and Information Analysis (Louhi 2013): 11-12 Feb 2013 Sydney. Ed. Suominen, Hanna.
* Sirel, R. (2012). Knowledge Acquisition Tool for Dialogue Systems. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 201−205.
* Sirel, Raul (2012). Dynamic User Interfaces for Synchronous Encoding and Linguistic Uniforming of Textual Clinical Data. Frontiers in Artificial Intelligence and Applications, 247: Human Language Technologies – The Baltic Perspective, 4.-5. Oct Tartu. Ed. Arvi Tavast, Kadri Muischnek, Mare Koit. Amsterdam: IOS Press, 206−212.
* Muischnek, K.; Kaalep, H.-J.; Sirel, R. (2011). Korpuslingvistiline lähenemine eesti internetikeele automaatsele morfoloogilisele analüüsile. Metslang, H.; Langemets, M.; Sepper, M.-M. (Eds.). Eesti Rakenduslingvistika Ühingu aastaraamat (111−127). Tallinn: Eesti Rakenduslingvistika Ühing. DOI: 10.5128/ERYa7.07.