VAANI is a large multilingual speech dataset from the Indian Institute of Science (IISc), Bangalore. It contains about 21,500 hours of audio recorded from 110,000 speakers in 120 districts across 22 states of India. It covers 86 languages and dialects of India, spanning both widely spoken and less-resourced languages, and includes 835 hours of transcribed speech.
The dataset is intended for speech and language research. It is released under CC BY 4.0 and can be used for training and benchmarking AI systems, with the aim of benefiting speakers of both major and under-represented languages in India.
Background and Purpose
VAANI (the name means "voice" or "speech" in several Indian languages) was developed at the Indian Institute of Science (IISc) in Bangalore to address the shortage of speech data for Indian languages. India's linguistic diversity is among the greatest in the world, yet the majority of existing speech data covers only the most widely spoken languages. VAANI was built to narrow that gap by collecting speech that reflects the country's full linguistic range.
The data was collected from 110,000 speakers in 120 districts across 22 Indian states. The dataset covers 86 languages and dialects, including widely spoken languages such as Hindi, Tamil, Telugu, Bengali, Kannada, and Malayalam, as well as less-resourced languages such as Gondi, Santali, Kurukh, Wancho, and Tenyidie.
Dataset Composition and Key Features
VAANI contains 21,500 hours of audio in total. Of this, 835 hours are transcribed; the remaining audio is provided without transcription.
Key features of the dataset:
- Coverage of 86 languages and dialects, including both major and less-resourced languages
- Speech from 110,000 speakers
- 21,500 hours of audio, of which 835 hours are transcribed
- Collection in 120 districts across 22 Indian states
- Release under CC BY 4.0
- Suitability for training and benchmarking AI systems
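The headline figures imply a few ratios that are easy to miss when reading the list. The following is a minimal sketch that derives them from the numbers stated in this description only; the constant and function names are illustrative assumptions, and the per-speaker and per-district values are simple averages that say nothing about how the data is actually distributed.

```python
# Figures as stated in this dataset description (not read from the data itself).
TOTAL_AUDIO_HOURS = 21_500
TRANSCRIBED_HOURS = 835
SPEAKERS = 110_000
DISTRICTS = 120


def summarize() -> dict[str, float]:
    """Derive simple ratios from the headline figures.

    All values are plain averages; the real distribution across
    speakers and districts may be uneven.
    """
    return {
        "transcribed_share_pct": 100 * TRANSCRIBED_HOURS / TOTAL_AUDIO_HOURS,
        "untranscribed_hours": float(TOTAL_AUDIO_HOURS - TRANSCRIBED_HOURS),
        "avg_minutes_per_speaker": 60 * TOTAL_AUDIO_HOURS / SPEAKERS,
        "avg_speakers_per_district": SPEAKERS / DISTRICTS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under these figures, roughly 3.9% of the audio is transcribed, and each speaker contributes a little under 12 minutes on average.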
Applications and Use Cases
VAANI is released for training and benchmarking AI systems that work with Indian languages. Both the transcribed portion (835 hours) and the remaining audio are available for this work under the CC BY 4.0 license.
Significance for Indian Language Technology
The scale and breadth of VAANI matter for Indian language technology because most existing speech data covers only the most widely spoken languages. By extending coverage to 86 languages and dialects, including less-resourced ones, the dataset gives researchers and developers material for languages that have so far been under-represented.