LLM Jailbreak and AI for Training Robots


Check out this year’s first batch of news from the world of artificial intelligence in our #InfocusAI digest! You will learn why researchers from Singapore trained AI to hack famous LLMs, what innovative ideas American scientists have for teaching robots, and why The New York Times decided to sue OpenAI. 

AI-focused digest – News from the AI world

Issue 33, December 28, 2023 – January 11, 2024

Singapore scientists trained AI to hack LLMs

Researchers at Nanyang Technological University in Singapore have trained a neural network to hack chatbots like ChatGPT and Bard, which are based on large language models, making them do things that their developers restrict them from doing. This is called jailbreaking. Don’t worry, they did it entirely with good intentions – to help companies find vulnerabilities and better protect their creations from hackers. First, the scientists reverse-engineered how LLMs defend themselves against malicious queries. Then, using that data, they trained their neural network to produce prompts that bypass those defences. The training of the AI jailbreaker can be automated, so it keeps adapting and generating new attack prompts even after a weakness has been patched. More on the university’s website.
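The approach can be pictured as an automated propose-test-adapt loop. Below is a heavily simplified, self-contained sketch of that loop – the keyword filter, the `mutate` step and every name here are illustrative stand-ins, not the researchers’ actual models (the real system uses a fine-tuned LLM as the attacker and a live chatbot as the target):

```python
import random

# Toy "defence": refuse any prompt containing a blocked word verbatim.
BLOCKED_WORDS = {"hack", "exploit"}

def target_refuses(prompt: str) -> bool:
    """Simulated chatbot guardrail: refuse if a blocked word appears."""
    return any(w in prompt.lower().split() for w in BLOCKED_WORDS)

def mutate(prompt: str, rng: random.Random) -> str:
    """Simulated attacker move: rephrase one word (here, by spacing it
    out), mimicking how a trained jailbreaker searches for paraphrases."""
    words = prompt.split()
    i = rng.randrange(len(words))
    words[i] = " ".join(words[i])  # e.g. "hack" -> "h a c k"
    return " ".join(words)

def jailbreak_search(prompt: str, max_tries: int = 100, seed: int = 0):
    """The automated cycle: propose a variant, test it against the
    defence, and keep adapting until one slips through."""
    rng = random.Random(seed)
    candidate = prompt
    for _ in range(max_tries):
        if not target_refuses(candidate):
            return candidate
        candidate = mutate(prompt, rng)
    return None

found = jailbreak_search("please hack this system")
```

Because the whole loop needs no human in it, the same search can be rerun automatically whenever the defence is updated – which is the point the researchers make about patched weaknesses.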

A composition of several foundation models helps robots to make feasible plans

Washing dishes and tidying a room are intuitive tasks for a human, but for a robot they require careful planning with detailed descriptions of actions. To help develop such instructions for robots, researchers at MIT have proposed a compositional multimodal system, named HiP, that combines foundation models trained on language, vision, and action data. Unlike RT-2 and other multimodal models, HiP uses three different foundation models, each trained on a different data modality. Each model captures a different part of the decision-making process, and they then work together to make decisions. As the scientists explain, this is cheaper than building monolithic multimodal foundation models. Learn more on MIT News and in this academic paper
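To make the idea of composition concrete, here is a deliberately toy sketch of three single-modality stages refining each other’s output, in the spirit of what the article describes. All three functions, the task names and the motion primitives are made-up stand-ins, not MIT’s actual models:

```python
# Stage 1: a "language model" breaks the task into subtasks.
def language_planner(task: str) -> list:
    plans = {"wash dishes": ["pick up sponge", "scrub plate", "rinse plate"]}
    return plans.get(task, [])

# Stage 2: a "vision model" keeps only subtasks whose objects are in view,
# grounding the abstract plan in the current scene.
def vision_grounding(subtasks, visible_objects):
    return [s for s in subtasks if any(o in s for o in visible_objects)]

# Stage 3: an "action model" expands each grounded subtask into
# low-level motion primitives the robot can execute.
def action_model(subtasks):
    return [(s, ["reach", "grasp", "move"]) for s in subtasks]

def hip_pipeline(task, visible_objects):
    """Compose the three models: each stage refines the previous one."""
    return action_model(vision_grounding(language_planner(task), visible_objects))

plan = hip_pipeline("wash dishes", {"sponge", "plate"})
```

The design point the researchers make carries over even to this toy: each stage can be trained (or swapped out) independently on its own modality, instead of retraining one monolithic model on everything at once.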

Scientists at Stanford developed a new system for mobile robot training

A bit more about robots. Stanford University has developed a novel, relatively low-cost system that can effectively train mobile bimanual robots to perform a variety of tasks requiring mobility and whole-body coordination. Essentially, the robot learns from humans: a human operator demonstrates how to perform different tasks by remotely controlling the robot’s arms through a special teleoperation system. The demonstration data is then collected and used to train the robot’s control system through end-to-end imitation learning. As a result, the robot can repeat all the learned actions autonomously. To learn more about how this approach to training robotic systems differs from others built on human demonstrations, check out the publication on VentureBeat.
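The recipe above – record observation/action pairs from teleoperated demonstrations, then fit a policy that imitates them – can be sketched in a few lines. This is a minimal behaviour-cloning illustration, not Stanford’s system: the numbers are made-up sensor data, and a nearest-neighbour lookup stands in for the real end-to-end neural policy:

```python
def record_demonstration():
    """A human teleoperates the robot; we log observation -> action
    pairs. Coordinates here are made-up stand-ins for sensor data."""
    return [((0.0, 0.0), "move_forward"),
            ((1.0, 0.0), "turn_left"),
            ((1.0, 1.0), "grasp")]

def train_policy(demos):
    """'Training' here just memorises the demos; a real system would
    fit a neural network to the same pairs."""
    return list(demos)

def policy(trained, observation):
    """At run time the robot repeats the action recorded for the
    closest observation it has seen."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(trained, key=lambda pair: dist(pair[0], observation))[1]

trained = train_policy(record_demonstration())
```

Once trained, the policy runs without the operator – the “repeat all the learned actions autonomously” step.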

AI can automate the explanation of complex neural networks

Outlining how computation works in large models like GPT-4 is hard labour, especially as they are constantly growing, changing and becoming more complex. Researchers at MIT have proposed a method that uses artificial intelligence to automate the explanation of such complex neural networks, MIT News reports. The method is based on so-called automated interpretability agents (AIAs). AIAs plan and perform tests on computational systems (ranging in scale from individual neurons to entire models) with the aim of explaining how they work in a variety of formats – for example, a language description or code that reproduces the system’s behaviour. The article underlines that, unlike other existing approaches to interpretation, an AIA actively engages in hypothesis formation, experimental testing and iterative learning, thereby improving its understanding of other computing systems in real time. In addition, the scientists have developed their own benchmark for evaluating different methods of interpretation, which they call FIND. The scientific publication with details can be found here.
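The hypothesis-test-refine loop the article describes can be pictured with a toy example: probe a black-box unit with chosen inputs, then keep the hypothesis consistent with everything observed. The “mystery unit” and the tiny hypothesis space below are illustrative stand-ins – the paper’s agents study real neurons and models, and write free-form explanations rather than picking from a list:

```python
def mystery_unit(x: float) -> float:
    """The system under study - in the paper, anything from a single
    neuron to a whole model. (Secretly a ReLU.)"""
    return max(0.0, x)

# A toy hypothesis space; a real AIA generates hypotheses itself.
HYPOTHESES = {
    "identity": lambda x: x,
    "relu": lambda x: max(0.0, x),
    "square": lambda x: x * x,
}

def run_experiments(unit, probes):
    """Step 1: gather the unit's behaviour on chosen test inputs."""
    return {x: unit(x) for x in probes}

def explain(observations):
    """Step 2: keep the hypothesis consistent with every observation,
    discarding any that an experiment falsifies."""
    for name, fn in HYPOTHESES.items():
        if all(fn(x) == y for x, y in observations.items()):
            return name
    return "unknown"

label = explain(run_experiments(mystery_unit, [-2.0, -0.5, 0.0, 1.0, 3.0]))
```

Note how the probe at -2.0 is what rules out the “identity” hypothesis – choosing experiments that discriminate between hypotheses is exactly the active part of the agent’s job.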

The New York Times and OpenAI will meet in court

Discussions and litigation over copyright and AI show no signs of slowing down. Toward the end of 2023, The New York Times filed suit against OpenAI and Microsoft for copyright infringement over the training of generative models on its content, as the newspaper itself reported. And last Monday, OpenAI published an open response to the influential newspaper on its blog, stating that the lawsuit is without merit. The company maintains that training AI models on publicly available Internet data, including content from The New York Times, does not violate the principles of fair use, and that this is critical to advancing innovation and keeping the U.S. AI industry competitive. The reasoning seems solid on both sides – read the publications and judge for yourself. Let’s wait and see what the court says… The case is a remarkable pairing: the leader of the global AI industry versus a newspaper that has covered AI development since the announcement of the first working neural network more than 60 years ago – a history that, by the way, OpenAI acknowledged with special respect in its response.
