5 datasets found

Organizations: The Marconi Lab

  • The Makerere Gendered Corpus: A Gendered English to Luganda Parallel Corpus

    This English-Luganda parallel sentence corpus consists of gendered examples created by a team of researchers from Makerere AI Lab at Makerere University with a team of Luganda...
  • Kiswahili Monolingual Corpus

    This dataset contains 100,000 Kiswahili sentences. We want to thank the team at the Makerere AI and Marconi Labs at Makerere University, TAVODET Youth Development (TYD)...
  • Lumasaba Monolingual Corpus

    Lumasaba, sometimes known as Lugisu, is a Bantu language spoken in the eastern part of Uganda. This dataset contains a total of 39,999 sentences. The sentences are split into two...
  • Luganda Monolingual Corpus

    This dataset contains 100,000 Luganda sentences. Luganda is a Bantu language and is one of the major languages spoken in Uganda. This dataset was compiled by researchers at the...
  • Acoli Monolingual Corpus

    Acoli is a very low-resourced language spoken in parts of Northern Uganda. This dataset contains 40,037 Acoli sentences. The sentences were collected and evaluated by Acoli...
You can also access this registry using the API (see API Docs).