My Background

I am often asked, by linguists and laypeople alike, why I decided to pursue a career in linguistics. This question, regrettably, is one for which I have never been able to furnish a suitable answer. The study and documentation of the world’s languages has simply always been a fascination of mine, for as long as I can reasonably remember; accordingly, not long after I discovered that an entire academic field existed dedicated to this very study, I resolved to apply myself to it.

 

After completing my International Baccalaureate diploma in 2018 in Halifax, Nova Scotia, and receiving a parting scholarship, I moved to Edmonton, Alberta, to pursue a B.A. in linguistics (minoring in anthropology) at the University of Alberta, and quickly took an interest in morphosyntax and semantics. In my second year, I had the good fortune of taking a course on global writing systems offered by Dr. Antti Arppe, through whom I took my first steps into practical language documentation work in November 2019. This first foray into ‘real’ linguistic work involved sifting through hundreds of hours of raw audio recordings of Plains Cree to mark the positions of individual words and sentences of note, and providing (prewritten) annotations giving the transcribed and translated forms of these words and sentences. Ultimately, I annotated 124 recording sessions, totalling well over 600 hours of raw audio, over the course of my first two years working on the project.

 

Over the course of late 2019 and early 2020, I became increasingly involved with the language documentation work being performed by Dr. Arppe and his research group, the Alberta Language Technology Lab. In May 2020, I was awarded the Roger S. Smith Undergraduate Researcher Award, a scholarship worth $5000, to pursue research in Plains Cree lexical semantics. This project involved manually reviewing all 21 345 entries in the largest Plains Cree dictionary of the time and assigning a semantic classification to each entry, in order to allow the contents of the dictionary to be organised by meaning, rather than simply alphabetically. During this time, I also began experimenting with vector semantics, a computational method for assigning semantic classes to lexical items, which eventually yielded several publications (including my Honours thesis) concerning the application of the method to the classification of entries in bilingual dictionaries of Plains Cree, as well as of several other Indigenous Canadian languages.

 

In spring of 2021, I began working on the Maskwacîs Speech Database, manually reviewing each of the 20 300 entries in the database to standardise the written forms of the Cree transcriptions, to provide morphological analyses for each entry, and to standardise the English glossing conventions. Later that year, I also began working with native Cree speakers in Maskwacîs, Alberta (primarily Rose Makinaw) to review the audio recordings of the entries in the database, to add any missing or underspecified English definitions, and to elicit novel vocabulary and example sentences. This native speaker ‘validation’ process for the Maskwacîs Speech Database remains in progress.

 

In summer of 2021, I received the Undergraduate Research Initiative Stipend, a scholarship worth $6000, to author an English translation of the Dictionnaire de la langue des Cris, a late-19th-century French missionary dictionary of Plains Cree which, being written in French, was inaccessible to most Plains Cree communities, the majority of which now speak English (often exclusively). This translation process took several months, ultimately resulting in all 11 411 entries in the dictionary being translated and digitised by hand, with the resulting translation being made freely available online.

 

 

 

In winter of 2022

 

Select Occupational Experience

  • Graduate Research Assistant, Alberta Language Technology Lab (2022-present)
    • Headed lexicographic documentation projects and corpus development for Plains Cree, developed morphological models for several Indigenous languages, assisted in conference organisation, and produced academic publications representing the lab.
  • Undergraduate Research Assistant, Alberta Language Technology Lab (2019-2022)
    • Performed various linguistic data-related tasks, including semantic classification, translation, interlinearisation, audio annotation, and vocabulary elicitation.
  • Tutor-Mentor, Tutorbright (2019-2023)
    • Taught both English and French as second languages to students of various ages, typically on an individual basis.

Publications

Papers

Dacanay, D., Harrigan, A., & Arppe, A. (2021). Computational analysis versus human intuition: a critical comparison of vector semantics with manual semantic classification in the context of Plains Cree. In Silfverberg, M., & Desjardins, J.  (eds.), Proceedings of the 4th Workshop on Computational Methods for Endangered Languages, 1, 33-43. doi:10.33011/computel.v1i.971 

 

Dacanay, D., Harrigan, A., Wolvengrey, A. & Arppe, A. (2021). The more detail, the better? – Investigating the effects of semantic ontology specificity on vector semantic classification with a Plains Cree / nêhiyawêwin dictionary. In Mager, M., Oncevay, A., Rios, A., Meza Ruiz, I.V., Palmer, A., Neubig, G., Kann, K.  (eds.), Proceedings of the 1st Workshop on NLP for Indigenous Languages of the Americas, 1, 143-52. doi:10.18653/v1/2021.americasnlp-1.15

 

Dacanay, D., Poulin, J., & Arppe, A. (2022). kêtiski-kotahâskwâtam: The effectiveness of various hypernymic levels of WordNet synsets as vector semantic classification categories. In Macaulay, M. & Noodin, M. (eds.), Papers of the Fifty-Third Algonquian Conference (PAC53), 53.

 

Dacanay, D. (2022). Lexical Semantic Classification in Plains Cree (nêhiyawêwin): Manual and Computational Approaches (Honours thesis, University of Alberta).

 

Dacanay, D. & Arppe, A. (2023a, forthcoming). Digitizing, translating, and standardizing Pr. Albert Lacombe’s Dictionnaire de la langue des Cris (1874). In Macaulay, M. & Noodin, M. (eds.), Papers of the Fifty-Fourth Algonquian Conference (PAC54), 54.

 

Poulin, J., Dacanay, D., & Arppe, A. (2023b, forthcoming). Speech Database (Speech-DB) – An on-line platform for recording, storing, validating, and searching spoken language data. In Proceedings of the 1st Workshop on NLP Applications to Field Linguistics (Field Matters).

 

Arppe, A., Neitsch, A., Dacanay, D., Poulin, J., Hieber, D., & Harrigan, A. (2023c, forthcoming). Finding words that aren’t there: Using word embeddings to improve dictionary search for low-resource languages. In Proceedings of the 3rd Workshop on NLP for Indigenous Languages of the Americas.

 

Dacanay, D., & Arppe, A. (2024, forthcoming). misi-mîkiwâhp pêsêkinosa ohci – A corpus of miscellaneous Plains Cree texts. In Papers of the Fifty-Fifth Algonquian Conference (PAC55), 55.

Presentations

Arppe, A., Poulin, J., Harrigan, A., Schmirler, K., Dacanay, D., & Makinaw, R. (2022). êkosi ê-nêhiyawi-pîkiskwêcik maskwacîsihk – Towards a Spoken Dictionary of Maskwacîs Cree. Presentation conducted at the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-5). 

 

Dacanay, D. & Schmirler, K. (2022). An Analogy-Based Alternative to “Counter-Intuitive” Grammatical Animacy in Plains Cree/nêhiyawêwin. Presentation conducted at the Fifty-Fourth Algonquian Conference (PAC54).

Awards

  • International Baccalaureate Diploma Scholarship (2018)
    • $1000CAD
  • Roger S. Smith Undergraduate Researcher Award (2020)
    • $5000CAD
  • Undergraduate Research Initiative Stipend (2021)
    • $6000CAD

Workshops

Volunteer

  • Canadian Indigenous Languages and Literacy Development Institute Summer Program, University of Alberta, Edmonton, Alberta, July 10-28 2023
    • Volunteer Teaching Assistant (VeSPA)
  • Alberta Conference on Linguistics, University of Calgary, Calgary, Alberta, October 29 2022
    • Presenter

Attendee

  • Spring Training in Experimental Psycholinguistics, Centre for Comparative Psycholinguistics, Edmonton, Alberta, June 12-17 2023 
  • The Minimal Course approach for teaching/learning Algonquian Languages, Carleton University, Online, October 17 2021
  • First Workshop on Natural Language Processing for Indigenous Languages of the Americas, Association for Computational Linguistics, Online, June 11 2021
  • Organization of Alberta Students in Speech (OASIS) Conference, University of Alberta, Edmonton, Alberta, September 28 2019