This repository curates NLP datasets and models for languages spoken in Ghana. It is maintained by Ghana NLP, with the goal of supporting research and development of natural language processing for all Ghanaian languages.
Join Ghana NLP to stay involved.
Name | Description | Link |
---|---|---|
Twi-English Parallel Sentences | Aligned Twi-English translation pairs | View |
Fante Speech Transcribed | Transcribed multi-speaker speech dataset | View |
Twi Transcribed | Single-speaker transcribed Asante Twi Bible dataset (split at verse level) | View |
Twi Transcribed | Single-speaker transcribed Asante Twi Bible dataset (split at utterance level) | View |
Name | Description | Link |
---|---|---|
ABENA | BERT models for Asante Twi (cased: 1, uncased: 2, distilled uncased: 3) and Akuapem Twi (cased: 4) | 1, 2, 3, 4 |
Akan Whisper | Speech recognition model for Akan | View |
Asante Twi Speech Recognition | Speech recognition and transcription model for Asante Twi | View |
Name | Description | Link |
---|---|---|
Dagbani Orthography | Spelling guide corpus | View |
Name | Task | Framework | Link |
---|---|---|---|
DagBERT | Language modeling | Transformers | GitHub |
(No entries yet — contribute!)
(No entries yet — contribute!)
- Abron (abr) (No data)
- Adamorobe Sign Language (ads) (No data)
- Adangbe (adq) (No data)
- Adele (ade) (No data)
- Ahanta (aha) (No data)
- Akposo (kpo) (No data)
- Animere (anf) (No data)
- Anufo (cko) (No data)
- Anyin (any) (No data)
- Avatime (avn) (No data)
- Awutu (afu) (No data)
- Bimoba (bim) (No data)
- Bisa (bib) (No data)
- Bondoukou Kulango (kzc) (No data)
- Boro (xxb) (No data)
- Buli (bwu) (No data)
- Chakali (cli) (No data)
- Chala (cll) (No data)
- Cherepon (cpn) (No data)
- Chumburung (ncu) (No data)
- Dangme (ada) (No data)
- Deg (mzw) (No data)
- Delo (ntr) (No data)
- Dompo (doy) (No data)
- Dwang (nnu) (No data)
- Esahie (sfw) (No data)
- Ewe (ewe) (No data)
- Farefare (gur) (No data)
- Ga (gaa) (No data)
- Ghanaian Pidgin English (gpe) (No data)
- Ghanaian Sign Language (gse) (No data)
- Gikyode (acd) (No data)
- Gonja (gjn) (No data)
- Gua (gwx) (No data)
- Hanga (hag) (No data)
- Jwira-Pepesa (jwi) (No data)
- Kamara (jmr) (No data)
- Kantosi (xkt) (No data)
- Kasem (xsm) (No data)
- Konkomba (xon) (No data)
- Konni (kma) (No data)
- Kplang (kph) (No data)
- Krache (kye) (No data)
- Kusaal (kus) (No data)
- Larteh (lar) (No data)
- Lelemi (lef) (No data)
- Ligbi (lig) (No data)
- Logba (lgq) (No data)
- Mampruli (maw) (No data)
- Nafaanra (nfr) (No data)
- Nawuri (naw) (No data)
- Nchumbulu (nlu) (No data)
- Nkami (nkq) (No data)
- Nkonya (nko) (No data)
- Ntcham (bud) (No data)
- Nyagbo (nyb) (No data)
- Nzema (nzi) (No data)
- Paasaal (sig) (No data)
- Safaliba (saf) (No data)
- Sekpele (lip) (No data)
- Selee (snw) (No data)
- Siwu (akp) (No data)
- Southern Birifor (biv) (No data)
- Southern Dagaare (dga) (No data)
- Tafi (tcd) (No data)
- Tampulma (tpm) (No data)
- Tumulung Sisaala (sil) (No data)
- Tuwuli (bov) (No data)
- Vagla (vag) (No data)
- Wali (wlx) (No data)
- Wasa (wss) (No data)
- Western Sisaala (ssl) (No data)
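
For anyone tracking coverage with a script, the entries above follow a regular `- Name (code) (No data)` pattern. The snippet below is a minimal sketch of parsing such lines, assuming the three-letter codes are lowercase and that an entry with resources would simply omit the `(No data)` marker. The names `LanguageEntry`, `ENTRY_PATTERN`, and `parse_entry` are hypothetical helpers for illustration and are not part of this repository.

```python
import re
from typing import NamedTuple, Optional


class LanguageEntry(NamedTuple):
    """One parsed list entry (hypothetical structure, for illustration only)."""
    name: str
    code: str
    has_data: bool


# Matches lines such as "- Abron (abr) (No data)".
# Assumption: codes are three lowercase letters, as in the list above.
ENTRY_PATTERN = re.compile(
    r"^-\s+(?P<name>.+?)\s+\((?P<code>[a-z]{3})\)(?P<no_data>\s+\(No data\))?\s*$"
)


def parse_entry(line: str) -> Optional[LanguageEntry]:
    """Return a LanguageEntry for a list line, or None if the line does not match."""
    match = ENTRY_PATTERN.match(line.strip())
    if match is None:
        return None
    return LanguageEntry(
        name=match.group("name"),
        code=match.group("code"),
        # Assumption: an entry has data when the "(No data)" marker is absent.
        has_data=match.group("no_data") is None,
    )


sample = [
    "- Abron (abr) (No data)",
    "- Adamorobe Sign Language (ads) (No data)",
    "- Jwira-Pepesa (jwi) (No data)",
    "Not a list entry",
]
entries = [e for e in map(parse_entry, sample) if e is not None]
missing = [e.code for e in entries if not e.has_data]
print(missing)  # ['abr', 'ads', 'jwi']
```

Lines that do not match the pattern are skipped rather than raising an error, so the same function can be run over the whole README without first isolating the list.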
- Fork the repo
- Add your dataset or model under the correct language section
- Submit a pull request with a clear description
Or open an issue to suggest links or ask questions.
We’re a community of researchers and developers building NLP tools for Ghanaian languages.
👉 Join us here
Each dataset or model has its own license. Check the linked source or contact its maintainers for reuse conditions.
Maintained with 💛 by Ghana NLP