MTS AI taught Cotype Lite to communicate in the Tatar language.


MTS AI has developed a new version of its large language model Cotype Lite that works with texts in the Tatar language. The company showcased it at the Kazan Digital Week forum, which took place in the capital of Tatarstan from September 9 to 11. The model can process documents of up to 8,000 tokens (approximately five A4 pages) and can extract and summarize data within seconds.

Cotype Lite can be used in archives, libraries, and government and private organizations, wherever information in Tatar needs to be processed and documents analyzed. For example, the model can speed up the processing of applications in government agencies: Cotype extracts key information such as the subject of the request, the location, and the applicant's personal data, and transfers it to the relevant database. Like other models in the Cotype family, this version can be installed within an organization's own infrastructure, which keeps data inside the organization and guards against leaks.
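The application-processing flow described above can be sketched roughly as follows. This is a minimal illustration, not MTS AI's implementation: it assumes the model has been prompted to reply with a JSON object, and the field names (`subject`, `location`, `applicant`), the `applications` table, and the helper functions are all hypothetical. The call to the model itself is left out, since its API is not described here.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class ApplicationRecord:
    """Key fields extracted from a citizen's application (hypothetical schema)."""
    subject: str
    location: str
    applicant: str


def parse_model_output(raw: str) -> ApplicationRecord:
    """Parse the model's reply, assumed to be a JSON object with three keys."""
    data = json.loads(raw)
    return ApplicationRecord(
        subject=data["subject"],
        location=data["location"],
        applicant=data["applicant"],
    )


def save_record(conn: sqlite3.Connection, record: ApplicationRecord) -> int:
    """Store the record in a local table and return the new row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applications ("
        "id INTEGER PRIMARY KEY, subject TEXT, location TEXT, applicant TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO applications (subject, location, applicant) VALUES (?, ?, ?)",
        (record.subject, record.location, record.applicant),
    )
    conn.commit()
    return cur.lastrowid


if __name__ == "__main__":
    # Made-up reply standing in for the model's output on one application.
    reply = '{"subject": "Road repair", "location": "Kazan", "applicant": "A. Example"}'
    connection = sqlite3.connect(":memory:")
    row_id = save_record(connection, parse_model_output(reply))
    print(f"Stored application as row {row_id}")
```

Keeping the parsing step separate from the database step means a malformed model reply fails before anything is written, which matters when the records contain personal data.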

"In creating a large language model in Tatar, the developers at MTS AI had several goals. Firstly, we wanted to support the linguistic diversity of Russia, helping its languages develop and remain relevant in the digital age. Secondly, this project demonstrated that we are capable of adapting our models to any scientific and business tasks, including such non-trivial ones as processing information in the languages of the peoples of Russia," said Dmitry Markov, Executive Director of MTS AI.

To enable Cotype Lite to understand an unfamiliar language, the developers compiled a dataset and translated it from Russian to Tatar. All the data and the model’s responses were then checked by experts in Turkic languages and native speakers.

According to the developers, Cotype Lite, which has 8 billion parameters, ranks among the best LLMs in its class. If needed, MTS AI can create a Tatar-language LLM with more parameters (up to 70 billion) and a larger context window of up to 32,000 tokens, allowing the model to perform tasks such as translation and long-text generation. MTS AI is also ready to adapt the Cotype family of models to other regional languages of Russia.
