The digital divide for languages like Somali is a significant challenge. We're building a grassroots, community-driven approach to create the resources needed to bring Somali language support to the world's largest technology platforms.
Collect and curate a large, high-quality, open-source dataset of Somali text and speech data.
Fine-tune existing models on our dataset to demonstrate the potential of a Somali language model.
Advocate for Somali language inclusion with tech companies and secure funding for long-term sustainability.

This is a grassroots movement. We're building a community of volunteers, linguists, technologists, and Somali speakers who believe in the power of language inclusion.
Help transcribe Somali speech from broadcasts.
Submit Somali text and digitize documents.
Share our mission with your network.
Get news about our progress, contribution opportunities, and milestones.