The SYSPIN project offers publicly available, studio-recorded text-to-speech (TTS) datasets in multiple Indian languages. These validated speech and text files are designed for academic research, industrial development, and innovation in TTS synthesis. The corpus, created by IISc Bengaluru, is released under a CC-BY-4.0 license, ensuring open access for researchers and developers aiming to advance speech technology in Indian languages.
If you use this corpus, please cite the following reference:
Abhayjeet et al. 'SYSPIN_S1.0 Corpus - A TTS Corpus of 900+ hours in nine Indian Languages', 2025.
To download the SYSPIN dataset, please click the button below:
Download Dataset
© 2024 SYnthesizing SPeech in INdian languages