AI Hackers and New DeepSeek Competitors

This issue of #InfocusAI covers the latest AI Index Report, a hybrid LLM line from a US-based startup and another MIT project that helps create new drugs. We will also say a word about a new RAG evaluation framework and the cybersecurity threats posed by AI agents.

AI-focused digest – News from the AI world

Issue 63, March 27 – April 10, 2025

San Francisco startup releases a line of LLMs to compete with DeepSeek

The LLM market has a notable new player. This week, Deep Cogito, a San Francisco-based startup co-founded by an ex-Google engineer, released its first open source language model line, Cogito v1, which is capable of competing against LLaMA* and DeepSeek. All the models are hybrid, that is, they combine a standard mode with a reasoning mode: they can answer immediately or pause to “self-reflect,” like OpenAI's o-series or DeepSeek R1. The line comes in 5 sizes: 3, 8, 14, 32 and 70 billion parameters. All of them are trained using iterated distillation and amplification (IDA). In benchmarks, Cogito 70B (Standard) beat LLaMA 3.3 70B on MMLU by 6.4 percentage points (91.7% versus 85.3%) and LLaMA 4 Scout 109B on the aggregate benchmark score (54.5% versus 53.3%). Compared to DeepSeek R1 Distill 70B, Cogito 70B (Reasoning) performed better in general and multilingual tests, with a notable 91.0% on MMLU and 92.7% on MGSM. See details at VentureBeat.

*LLaMA is an open source language model released by Meta Platforms. The organization is considered extremist and banned in the Russian Federation.

Stanford University releases latest AI Index Report

We will not go through the entire 400+ page report, but here are 5 of its valuable insights:

  • AI model performance kept improving throughout 2024. On demanding benchmarks such as MMMU, GPQA and SWE-bench, scores rose by 18.8, 48.9 and 67.3 percentage points, respectively.
  • Business is betting heavily on AI, especially in the US, where private investment in the technology passed the USD 100 billion mark, almost 12 times as much as in China.
  • Soon, there will be no AI-free companies left. In 2024, artificial intelligence was used by 78% of organizations. Research confirms that smart solutions improve workforce productivity and help close the skill gap.
  • The US and China lead the global AI race. 40 notable AI models were created in the US last year, while China put out 15.
  • Attitudes toward AI differ from country to country. China, Indonesia and Thailand, for instance, see more benefits than drawbacks in the development of neural networks, while Canada, the US, the Netherlands, Germany, France and the UK are far less optimistic.

See the full report here.

MIT researchers figure out how to improve the drug creation process with AI

Scientists from MIT and the MIT-IBM Watson AI Lab, together with their peers from the University of Notre Dame (USA), developed a promising LLM-based tool that simplifies the drug creation process. The model lets users request, in natural language, molecular structures with certain characteristics and receive detailed descriptions of them along with step-by-step synthesis instructions. Unlike other language model-based approaches to the same problem, this project uses the LLM in conjunction with powerful graph-based AI models created specifically to design molecular structures. According to the researchers, this makes the molecular structures generated by the system align more closely with the provided specifications and more feasible to synthesize. See details here.

New open source RAG evaluation framework is now available

Vectara and researchers from the University of Waterloo released Open RAG Eval, an open source framework for evaluating RAG (retrieval-augmented generation) systems. It can be used to assess RAG performance in 4 areas:

  • Hallucination detection evaluates the extent to which the generated content contains fictitious information not supported by source documents.
  • Citation quantifies the extent to which citations in the response are substantiated by source documents.
  • Auto nugget evaluates whether the significant pieces of information from source documents are present in the generated responses.
  • UMBRELA measures overall retriever performance.
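
To make the citation and nugget ideas above more concrete, here is a minimal toy sketch in Python. It is not the Open RAG Eval implementation or its API: the function names, the token-overlap matching and the 0.8 threshold are all assumptions made purely for illustration, and a real framework would typically rely on LLM-based or model-based judgments rather than word overlap.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for real semantic matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(claim: str, evidence: str) -> float:
    """Fraction of the claim's tokens that also appear in the evidence."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(evidence)) / len(claim_tokens)


def nugget_coverage(nuggets: list[str], answer: str, threshold: float = 0.8) -> float:
    """Share of key facts (nuggets) that the answer appears to contain."""
    if not nuggets:
        return 0.0
    hits = sum(1 for nugget in nuggets if support(nugget, answer) >= threshold)
    return hits / len(nuggets)


def citation_support(pairs: list[tuple[str, str]], threshold: float = 0.8) -> float:
    """Share of (answer sentence, cited passage) pairs where the passage backs the sentence."""
    if not pairs:
        return 0.0
    backed = sum(1 for sentence, passage in pairs if support(sentence, passage) >= threshold)
    return backed / len(pairs)


# Hypothetical example data, invented for this sketch.
source = "Open RAG Eval was released by Vectara and the University of Waterloo."
answer = "The framework was released by Vectara."

print(nugget_coverage(["released by Vectara", "University of Waterloo"], answer))
print(citation_support([(answer, source), ("It costs ten dollars.", source)]))
```

In this toy example, the answer covers one of the two nuggets, and only one of the two cited sentences is backed by its passage, so both scores come out at one half.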

See more at VentureBeat.

AI agents may cause hacker unemployment

Cybersecurity experts warn that AI agents will soon rank among the top information security threats. It is only a question of when criminals will stop hiring human hackers and hand their nefarious affairs over to highly autonomous artificial intelligence systems, writes MIT Technology Review. AI agents capable of planning, reasoning and performing complex tasks can be used to detect vulnerable targets, infiltrate systems and steal large volumes of confidential data, all while being cheaper than professional hackers. Researchers have already demonstrated that these agents can replicate sophisticated attacks, so it is high time to think about effective strategies for detecting and countering the new threat.
