07.03.2024

New Claudes, Singing Portraits and a Surgeon’s Avatar

Elon Musk’s lawsuit against OpenAI, a flood of investment in a developer of humanoid robots, a new family of models from Anthropic, an AI system that creates singing portraits, and a neurosurgeon’s avatar – we cover all this in the new issue of the #InfocusAI digest.

AI-focused digest – News from the AI world

Issue 37, February 22 – March 7, 2024

Elon Musk goes to court to force OpenAI to serve humanity

Spring began with another lawsuit against OpenAI. This time the creator of ChatGPT was sued by Elon Musk. He accused the company and its CEO Sam Altman of deviating from the company’s original mission – to create AGI for the benefit of all humanity – in favor of financial gain for certain individuals and companies, particularly Microsoft. The story has been covered by media around the world. The lawsuit reportedly aims to force OpenAI to return to its original mission and open its technology to the public. As expected, OpenAI disagrees with the billionaire’s accusations. The company’s official response can be found on its blog.

Technology leaders invest in humanoid robots

If you’re interested in AI and robotics, here is a new name to remember: Figure AI. This startup, which develops humanoid robots, has already attracted a number of major investors and technology companies. Bloomberg reports that OpenAI, Microsoft, Nvidia, Amazon, Jeff Bezos himself (through Explore Investments LLC), Intel Corp., Samsung, LG Innotek, Parkway Venture Capital, Align Ventures, ARK Venture Fund, Aliya Capital Partners, Tamarack, Boscolo Intervest Ltd., and BOLD Capital Partners have invested in the project. Figure AI engineers are reportedly now working on a robot called Figure 01. The startup believes that its humanoid robots could one day take over dangerous jobs and solve the labor shortage.

Anthropic has introduced a new family of Claude models

Anthropic has recently introduced its new family of AI models, Claude 3. The models have musical and poetic names: Haiku, Sonnet and Opus. The “most intelligent” of them is Opus. According to the developer, it outperforms GPT-4 and Gemini on all fronts, including math problem solving, coding, reasoning and question answering. Haiku is the fastest and most cost-effective model in its intelligence category: it can read an information- and data-dense research paper with charts and graphs in less than three seconds. Sonnet has a higher level of intelligence and performs well on tasks such as knowledge retrieval or sales automation, while being twice as fast as Claude 2 and Claude 2.1 in most cases. Learn more about the abilities of the Claude 3 family on the Anthropic website.

Researchers at Alibaba developed an AI system that brings portraits to life

Alibaba Group’s Institute for Intelligent Computing has developed a new AI system with an Audio2Video diffusion model called EMO (short for Emote Portrait Alive). From a single portrait photo, it can generate a strikingly realistic video in which the person speaks or sings, with facial and head movements that closely match the audio track. VentureBeat writes that Alibaba’s development could be considered a significant advancement in “talking head video generation”. Read the scientific paper with more details here.

MIT works with AR/VR startup to create an avatar of a renowned neurosurgeon to remotely train doctors

The Massachusetts Institute of Technology and EDUCSIM, a medical simulator and AR/VR startup, have joined forces to create a virtual avatar of renowned pediatric neurosurgeon Benjamin Warf. Using the avatar, Dr Warf, while in Boston, can show aspiring surgeons somewhere in São Paulo, or even further away, how to perform brain surgery. Through virtual reality goggles, they can see the doctor’s digital twin almost as if it were alive, ask questions and get answers. The avatar has synchronous and asynchronous modes. In the synchronous mode, Dr Warf operates his avatar remotely in real time. The virtual doctor can walk around the room, talk to surgeons, guide their actions and show them how to perform certain procedures on a brain model. It is something like “holoportation”. In the asynchronous mode, the aspiring doctors watch a prerecorded avatar demonstration. However, they can still interact with the mentor, or more specifically, with a neural network trained on the research and the extensive set of questions and answers provided by Warf. Here MIT News talks about the history, goals and features of the project.
