3 datasets found

Licenses: Creative Commons Attribution
Organizations: The Marconi Lab
Formats: CSV

  • Lumasaba Monolingual Corpus

    Lumasaba, sometimes known as Lugisu, is a Bantu language spoken in the eastern part of Uganda. This dataset contains a total of 39,999 sentences. The sentences are split into two...
  • Luganda Monolingual Corpus

    This dataset contains 100,000 Luganda sentences. Luganda is a Bantu language and one of the major languages spoken in Uganda. This dataset was compiled by researchers at the...
  • Acoli Monolingual Corpus

    Acoli is a very low-resourced language spoken in parts of Northern Uganda. This dataset contains 40,037 Acoli sentences. The sentences were collected and evaluated by Acoli...
You can also access this registry using the API (see API Docs).
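
The listing does not say which API the registry exposes, so the following is only a sketch under a stated assumption: that the registry offers a CKAN-style Action API with a `package_search` endpoint. The base URL, the organization slug `marconi-lab`, and the helper names `build_search_url` and `dataset_titles` are hypothetical placeholders; check the API Docs for the real values. The sketch shows how the three filters above (organization, format, and a row limit) might be turned into a query, and how dataset titles could be read from the decoded JSON response.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with the registry's actual address.
BASE_URL = "https://registry.example.org"


def build_search_url(base_url, organization, res_format, rows=10):
    """Build a CKAN-style package_search URL (assumed API, not confirmed by the page)."""
    params = {
        # fq is the filter-query parameter in CKAN's package_search action.
        "fq": f'organization:"{organization}" AND res_format:"{res_format}"',
        "rows": rows,
    }
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


def dataset_titles(response):
    """Return dataset titles from a decoded package_search JSON response."""
    if not response.get("success"):
        return []
    return [pkg.get("title", "") for pkg in response["result"]["results"]]


# "marconi-lab" is an assumed organization slug, used only for illustration.
url = build_search_url(BASE_URL, "marconi-lab", "CSV", rows=3)
print(url)
```

Fetching `url` with any HTTP client and passing the decoded JSON to `dataset_titles` would then list the matching datasets, provided the assumption about the API holds.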