The new issue of #InfocusAI is on its way to you. In this issue, we discuss OpenAI’s plans for the release of GPT-4.5 and GPT-5, learn more about Andrew Ng’s agentic object detection method, explore experiments in mind reading, and dive into the updates to Mistral AI’s Le Chat.
AI-focused digest – News from the AI world
Issue No. 59, January 17 – February 13, 2025
OpenAI CEO Reveals Plans for GPT-5
OpenAI is preparing changes to its model lineup. The company’s CEO, Sam Altman, announced plans to release GPT-4.5 and GPT-5. According to Altman, the first to launch will be GPT-4.5, internally codenamed Orion. It will be “OpenAI’s last non-reasoning model.” Notably, the model was trained in October of last year, but its release was delayed due to high training costs and a lack of data. The next step will be the release of GPT-5, which will combine multiple OpenAI technologies, including the company’s most advanced model to date, o3. OpenAI plans to stop offering o3 as a standalone model and to integrate its capabilities into the new system instead. Users of the free version of ChatGPT will get unlimited access to GPT-5 at the standard intelligence setting. Plus subscribers will be able to use GPT-5 at a higher level of intelligence, while Pro users will gain access to the system’s maximum capabilities. Altman did not specify exact release dates but indicated that the launches would happen in the coming weeks or months.
US Develops Object Detection System Without Pre-Labeled Data
American scientist Andrew Ng has introduced a revolutionary method called Agentic Object Detection. Unlike traditional methods, it does not require pre-labeled data and finds objects based on textual descriptions. For example, a user can ask the system to find “unripe strawberries” in an image, and it will handle the task without prior training on labeled data. The technology uses advanced reasoning algorithms to analyze various object characteristics: color, shape, and spatial relationships. Experts predict widespread applications for agentic detection across industries, from gaming to industrial automation. The developers continue to refine the system and plan to add object tracking and support for video content.
Meta* Develops Mind-Reading Technology for Text Input
Researchers at Meta have proposed a non-invasive method for “reading” thoughts that decodes the text a person plans to type from their brain activity. Participants in the experiment were asked to memorize sentences and then type them on a standard keyboard. Their brain signals were recorded using magnetoencephalography (MEG) and electroencephalography (EEG). What sets the method apart is the Brain2Qwerty architecture. It consists of several components: a convolutional module analyzes short time windows of brain signals, highlighting key features of motor activity during keystrokes; a transformer considers the context of the entire phrase; and a pre-trained language model “corrects” potential errors, taking into account natural language statistics. During the experiment, 35 healthy volunteers typed sentences, and the system was able to decode the text with relatively low error rates. With MEG, the average error rate was 32% per character, and for some participants it was as low as 19%.
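To make the per-character figures more concrete, here is a minimal sketch of how such an error rate is commonly computed. It assumes the metric is the usual character error rate, that is, the edit distance between the decoded text and the sentence actually typed, divided by the length of the typed sentence. The digest does not say exactly how the researchers calculated it, and the function names and sample sentences below are purely illustrative.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    substitutions, insertions, and deletions that turn one string
    into the other."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text.
    This definition is an assumption, not taken from the digest."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    typed = "the cat sat"    # what the participant typed (made-up example)
    decoded = "the bat sat"  # what a decoder produced (made-up example)
    print(f"{character_error_rate(typed, decoded):.1%}")
```

Under this definition, an error rate of 32% means that roughly one in three characters of a sentence would need to be corrected to recover what the participant typed.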
*Meta is recognized as an extremist organization and banned in Russia.
Mistral AI Introduces Updated Le Chat
Mistral AI has announced a major update to its AI assistant, Le Chat. The key feature of the new version is the Flash Answers function, which generates responses at up to 1,000 words per second. Le Chat has also received an improved document and image recognition system, which the developers claim is the best in the industry. The assistant can analyze complex PDFs, tables, logs, and even hard-to-read images. The update includes a built-in code interpreter that lets users run programs in an isolated environment, conduct scientific analysis, and generate images. The system gathers information from various sources, including web searches, journalistic materials, and social media, which, according to the developers, ensures comprehensive and well-founded responses. In the near future, the creators plan to add the ability to connect to corporate systems and create AI agents.
Russian Scientists Decipher Pushkin’s Manuscripts Using AI
Specialists from Smart Engines have applied artificial intelligence to decipher crossed-out fragments in Alexander Pushkin’s manuscripts. The Da Vinci neural network architecture, originally developed for document recognition, was able to reconstruct the crossed-out words by analyzing the characteristic features of the poet’s handwriting. The system studies pen movements in preserved texts and uses this data to restore lost fragments. According to Smart Engines CEO Vladimir Arlazarov, this technology opens up new possibilities for studying not only Pushkin’s manuscripts but also other historical documents. The method has already uncovered several previously unknown words in the poet’s drafts.