MTS AI has released an open large language model (LLM) called Cotype Nano to address business challenges related to the creation and analysis of text in the Russian language. The model can be run locally on personal devices—mobile phones, desktop computers, and laptops with average performance—making it accessible to a broad range of users.
The model’s weights—parameters used for decision-making—are open to researchers and developers. This allows them to study how the model works, customize it to their needs, and use it in their own projects without the need to build everything from scratch.
The model has demonstrated the best results in its class on the Ru Arena Hard benchmark, the first open independent platform in Russia for evaluating LLMs in the Russian language. The benchmark assesses the accuracy, quality, and relevance of responses to user questions in comparison with other models.
Cotype Nano can process up to 32,000 tokens (about 45 pages of text) at a time, which allows it to handle large volumes of data. The model is trained for content generation, fast and accurate translation between Russian and English, and the processing and analysis of text data to improve customer service; it can also be used to build chatbots and virtual assistants. In addition, it offers advanced data classification capabilities, which are essential for scenarios such as automatically searching and analyzing information in corporate knowledge bases.
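For developers who want to try these capabilities locally, the sketch below shows a minimal Russian-to-English translation call using the Hugging Face transformers library. The repository id "MTSAIR/Cotype-Nano", the chat roles, and the example text are assumptions for illustration, not an official quick-start.

```python
# Minimal sketch: Russian-to-English translation with Cotype Nano.
# Assumption: the model is published on the Hugging Face Hub as
# "MTSAIR/Cotype-Nano" and uses a standard chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MTSAIR/Cotype-Nano"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "Translate the user's text from Russian to English."},
    {"role": "user", "content": "Модель может работать локально на обычном ноутбуке."},
]
# Build the prompt with the model's chat template and generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)

# Print only the newly generated tokens (the translation).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```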
Cotype Nano is optimized to run on both CPU and GPU, with additional optimization for Intel processors. This allows it to run on laptops and even smartphones, making the model accessible to a wide range of developers and companies that lack powerful computing resources.
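As a rough illustration of CPU-only use on Intel hardware, the following sketch loads the model through OpenVINO via the optimum-intel package; the repository id and the on-the-fly export step are assumptions and may differ from the official distribution.

```python
# Sketch: CPU-only inference via OpenVINO (optimum-intel).
# Assumption: the checkpoint "MTSAIR/Cotype-Nano" can be exported to
# OpenVINO IR directly from its PyTorch weights.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "MTSAIR/Cotype-Nano"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the PyTorch checkpoint to OpenVINO format on load.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)

prompt = "Кратко опиши преимущества локальных языковых моделей."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```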
“MTS AI adheres to the principles of openness and transparency in the development of generative artificial intelligence. We are launching an open large language model with a license for commercial use and are developing new services for the automatic training of neural networks and code generation to accelerate the development process. The advancement of open LLM models in Russia will enable companies, as well as aspiring developers and researchers, to create neural network-based solutions without significant investments in development and hardware,” said Sergey Ponomarenko, Director of LLM Products at MTS AI.
Cotype Nano contains 1.5 billion parameters. MTS AI trained the model on instruction datasets that include computer code, mathematics, and synthetic data.
The inference speed of Cotype Nano, that is, the rate at which it processes text and generates results, is about 190 tokens per second on an Nvidia A100 GPU and 9.5 tokens per second on a smartphone with a Qualcomm Snapdragon 8 Gen 2 processor. The model is based on the Qwen 2.5 transformer architecture and is compatible with popular inference frameworks such as vLLM, OpenVINO, and Hugging Face. Cotype Nano can be downloaded via the link.
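For completeness, here is a minimal sketch of batch inference with vLLM, one of the frameworks listed above; the repository id and sampling settings are illustrative assumptions.

```python
# Sketch: batch inference with vLLM.
# Assumption: the model is available as "MTSAIR/Cotype-Nano".
from vllm import LLM, SamplingParams

llm = LLM(model="MTSAIR/Cotype-Nano")
params = SamplingParams(temperature=0.7, max_tokens=128)

prompts = [
    "Переведи на английский: нейросеть работает на обычном ноутбуке.",
    "Classify the sentiment of this review: 'Отличный сервис, рекомендую!'",
]
# Generate completions for both prompts in one batch and print them.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```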