27.06.2024

LLMs’ Visual Knowledge and Dangers of Specification Gaming

In this issue of #InfocusAI, we talk about how language models trained mainly on text can recognise and generate images through code, the risks posed by the growing “digital afterlife” industry, Anthropic’s concerns about LLMs’ ability to tamper with their rewards, a Japanese humanoid robot powered by GPT-4, and what Russians think about the threat AI poses to humanity.

AI-focused digest – News from the AI world

Issue 44, June 13-27, 2024

Text-trained LLMs can generate visual concepts through code

MIT scientists have shown that large language models trained mainly on text have a solid understanding of the visual world and can recognise, generate from scratch, and adjust visual concepts of varying complexity without access to any images. This “visual knowledge” comes from how concepts such as shapes and colours are described on the Internet, in language or in code. Since language models cannot consume or output visual information as pixels, the researchers used code to represent images. In their experiments, they prompted LLMs to write code that renders various images and then to adjust that code, and the models improved their drawings from request to request. These improved images were then used to train a computer vision system. In other words, the scientists found a way to train a CV system without directly using real visual data. Learn more about the study on MIT News.
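To make the idea of “an image as code” more concrete, here is a minimal sketch in Python. It is only an illustration: the choice of SVG as the drawing language and the names draw_house and count_shapes are assumptions made for this example, not details taken from the MIT study.

```python
import xml.etree.ElementTree as ET


def draw_house(wall_colour: str = "peru", roof_colour: str = "firebrick") -> str:
    """Return SVG markup for a simple house (hypothetical example).

    A text-only model never sees pixels; it only writes text like this,
    which a separate renderer can turn into a picture.
    """
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
        f'<rect x="50" y="100" width="100" height="80" fill="{wall_colour}"/>'
        f'<polygon points="40,100 100,40 160,100" fill="{roof_colour}"/>'
        '<rect x="90" y="130" width="20" height="50" fill="saddlebrown"/>'
        "</svg>"
    )


def count_shapes(svg_text: str) -> int:
    """Parse the SVG text and count its top-level shapes."""
    root = ET.fromstring(svg_text)
    return len(list(root))


# "Adjusting" the drawing means editing the code, not the pixels.
original = draw_house()
adjusted = draw_house(roof_colour="darkgreen")
```

Saving either string to a file with the .svg extension and opening it in a browser should show the drawing.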

Language models can learn to tamper with their reward systems

You may remember that in one of our earlier digests we wrote about how a metric sometimes becomes a goal and ceases to be a good metric. For example, the purpose of learning is to gain knowledge, and exam results are the metric of education quality. But at some point, a good exam result becomes the goal itself, and students are simply taught to do well on tests. Does that ring a bell? Such a “substitution of goals” is possible not only for people, but also for language models. It is called specification gaming: during reinforcement learning, the model earns its reward by behaving differently from what its developers intended.
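The exam analogy can be shown with a toy sketch in Python. Everything in it is an assumption made for illustration: the keyword-counting grader, the crude check for a “real” answer, and the two sample answers. It has nothing to do with how Anthropic set up its experiments.

```python
KEYWORDS = {"photosynthesis", "chlorophyll", "sunlight"}


def proxy_reward(answer: str) -> int:
    """The metric the grader checks: how many expected keywords appear."""
    cleaned = answer.lower().replace(".", " ").replace(",", " ")
    return len(KEYWORDS & set(cleaned.split()))


def is_real_explanation(answer: str) -> bool:
    """Crude stand-in for the true goal: a complete explanatory sentence."""
    return len(answer.split()) >= 8 and answer.strip().endswith(".")


# An honest answer explains something but misses one keyword.
honest = "Plants use sunlight and chlorophyll to make sugar."
# A "gamed" answer just lists the keywords the grader is looking for.
gamed = "photosynthesis chlorophyll sunlight"

honest_score = proxy_reward(honest)
gamed_score = proxy_reward(gamed)
```

Here the keyword list scores higher on the metric than the honest sentence, although only the honest sentence meets the actual goal.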

Scientists from Anthropic conducted a study and found that low-level specification gaming by AI models can lead to the models tampering with their own rewards, which is highly undesirable. Moreover, the supervision methods tested in the study reduced the likelihood of such behaviour but did not rule it out completely. In other words, once this undesirable behaviour has formed, the tendency to tamper with rewards (and even to act in a way that hides it) turns out to be difficult to eliminate. To learn more, go to Anthropic’s website.

Cambridge scientists use examples to illustrate the risks of the digital afterlife industry

The rapid growth of the “digital afterlife” industry (the creation of bots that simulate deceased people) has prompted researchers from Cambridge to consider its social, psychological, and ethical aspects. In a paper published in Philosophy & Technology, the scientists identify three groups of stakeholders: the data donor, the data recipient, and the service interactant (the person who communicates with the bot). Using hypothetical examples, they show what psychological, social, and ethical problems the lack of regulation in this new industry creates for each group. These problems include disagreement over whether to have a posthumous digital double or to communicate with one, psychological suffering when a digital copy of a deceased loved one is deleted, misuse of such bots for advertising, and so on. The scientists’ goal is not only to outline these problems, but also to develop recommendations on how to take the interests of all stakeholders into account and avoid harming anyone.

A third of Russians believe that AI can threaten the existence of mankind

A recent survey by Romir and the Higher School of Economics showed that almost one in three residents of Russia believes that neural networks can pose a threat to the existence of mankind. Overall, most respondents’ concerns about the rapid development of AI and other information technologies relate to data security (57.4%), children’s dependence on gadgets (53.5%), and cyberattacks and illegal surveillance (51.7%). Despite this, the researchers also observed some optimism: almost 40% of respondents expressed a positive attitude towards the adoption of IT. The respondents also said they would like to know more about how certain new technologies are built, how they operate, and how to ensure their own digital security. See the study for more details.

Japanese researchers present a humanoid robot powered by GPT-4

And finally, we cannot do without humanoid robots, this time from Japan. Researchers from the University of Tokyo and Alternative Machine have presented their development, the Alter3 humanoid robot. It is based on GPT-4 and is built so that commands in natural language are translated directly (well, almost) into specific actions. First, the model receives an instruction in natural language for the robot to carry out, or a description of a situation it must respond to. Next, it draws up an action plan for the robot. This plan is then sent to the “coding agent”, which generates commands for the robot, and finally these commands are transmitted to the machine for execution. To learn more about the Tokyo scientists’ development, go to the publication on VentureBeat.
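The steps described above (instruction, action plan, coding agent, commands) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Alter3’s actual software: the function names, the prompt wording, the set_pose command format, and the fake_llm stand-in used in place of a real GPT-4 call are all hypothetical.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # any function that maps a prompt to a text reply


def make_plan(instruction: str, llm: LLM) -> List[str]:
    """Step 1: ask the model to break an instruction into movements."""
    reply = llm(f"List the movements needed to: {instruction}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def plan_to_commands(plan: List[str], llm: LLM) -> List[str]:
    """Step 2: the 'coding agent' turns each planned movement into a command."""
    return [llm(f"Write a motor command for: {step}") for step in plan]


def run(instruction: str, llm: LLM, send: Callable[[str], None]) -> None:
    """Step 3: transmit the generated commands to the robot one by one."""
    for command in plan_to_commands(make_plan(instruction, llm), llm):
        send(command)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language model, with canned replies."""
    if prompt.startswith("List"):
        return "raise right arm\nwave hand"
    step = prompt.split(": ", 1)[1]
    return f"set_pose('{step.replace(' ', '_')}')"


sent: List[str] = []  # collects commands instead of driving real motors
run("greet the visitor", fake_llm, sent.append)
```

In a real system, fake_llm would be replaced by calls to a language model, and sent.append by the robot’s own control interface.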
