#InfocusAI is happy to present another selection of studies in the field of artificial intelligence. In this issue, we explore the effect LLMs have on the quality of peer reviews, a new approach to assessing how well generative models understand the world, an innovative method of discovering new materials, a promising simulation of human behavior, and how AI skills affect salaries.
AI-focused digest – News from the AI world
Issue 53, October 24 – November 14, 2024
LLMs can reduce the quality of peer reviews
Experts are increasingly using large language models to peer review computer science papers, and this is lowering the quality of reviews. Scientists analyzed over 50 thousand reviews of computer-science papers written for scientific conferences in 2023-2024 and concluded that 7% to 17% of the sentences in these reviews were created by LLMs. Stanford University professor James Zou reported this in Nature, citing a recent study. Text not written by a human can be recognized by its formal tone and by the frequent use of specific words typical of AI. For instance, the words “commendable” and “meticulous” now appear in reviews 10 times more often than before 2022. AI-generated analysis of scientific texts usually turns out superficial and overly general, and such reviews often lack references to specific sections of the publication. You can read about the solutions scientists have proposed here.
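As a rough illustration of the word-frequency signal mentioned above, here is a minimal Python sketch (not the study’s actual pipeline) that counts how often such LLM-typical marker words appear per thousand words in two sets of reviews. The word list and the sample reviews are illustrative assumptions.

```python
import re

MARKER_WORDS = {"commendable", "meticulous"}  # words cited as typical of LLM output

def marker_rate(texts):
    """Occurrences of marker words per 1,000 words across a set of reviews."""
    total_words = 0
    hits = 0
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        total_words += len(words)
        hits += sum(1 for w in words if w in MARKER_WORDS)
    return 1000 * hits / total_words if total_words else 0.0

# Hypothetical mini-corpora; the real study compared tens of thousands of reviews.
reviews_before_2022 = [
    "The method is sound, but the evaluation section is too limited.",
]
reviews_2023_2024 = [
    "The authors' meticulous analysis and commendable writing are noted.",
]

print(marker_rate(reviews_before_2022))  # expected: 0.0
print(marker_rate(reviews_2023_2024))    # much higher rate of LLM-favored words
```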
Study: Generative AI is not as good at understanding the world as it seems
Modern generative models possess impressive capabilities, so it seems as if they implicitly learn certain general truths about the world, and numerous studies support this notion. However, scientists from MIT, Harvard and Cornell, inspired by the Myhill–Nerode theorem, have developed a new approach to assessing how well artificial intelligence understands the world and found that generative models do not grasp the world model and its rules with the integrity and clarity they appear to. This is reported by MIT News. For example, in a navigation experiment in New York, a GPT-type model can provide step-by-step driving directions with almost perfect accuracy. But minor changes to the environment, such as closing a few streets and adding detours, cause the model to fail, and non-existent objects appear on the city map it generates. Based on this and several other experiments, the scientists concluded that a model performing well in one context may fail if the task or environment changes even slightly. As the scientists note, creating generative models that genuinely reflect the underlying logic of the domains they model would bring great value. You can read about this new method of assessing how close models are to achieving this goal in this preprint.
AI can aid in creating new materials, drawing ideas from art
Markus Buehler, a professor at MIT, has developed an AI-based method for finding hidden relationships between science and art in order to suggest new materials. In particular, as reported by MIT News, the method makes it possible to find common features in such seemingly unrelated things as biological tissue and Beethoven’s Symphony No. 9, or to create new biological materials inspired by Kandinsky’s Composition VII. In short, the scientist used generative artificial intelligence to convert data from a thousand scientific papers about biological materials into a comprehensive ontological knowledge graph and then taught the AI to reason over the graph and link previously unrelated, dissimilar concepts. You can find the scientific details of the method and descriptions of the experiments in the journal Machine Learning: Science and Technology.
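To give a sense of what an ontological knowledge graph looks like in code, here is a toy sketch (not Buehler’s actual pipeline) that builds a small graph from invented subject-relation-object triples and then finds a path linking a material to a piece of music. Every triple below is an illustrative assumption.

```python
import networkx as nx

# Invented triples standing in for relations extracted from scientific papers.
triples = [
    ("spider silk", "exhibits", "hierarchical structure"),
    ("Symphony No. 9", "exhibits", "hierarchical structure"),
    ("hierarchical structure", "enables", "toughness"),
]

graph = nx.DiGraph()
for subj, rel, obj in triples:
    graph.add_edge(subj, obj, relation=rel)

# Query the graph for a connection between two concepts from different domains.
path = nx.shortest_path(graph.to_undirected(), "spider silk", "Symphony No. 9")
print(" -> ".join(path))  # spider silk -> hierarchical structure -> Symphony No. 9
```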
Scientists create an AI model capable of simulating and predicting human behavior
Here is another invention of this fall that we believe is important. Scientists from several leading universities and research centers around the world have combined their efforts to create a computational model named Centaur, capable of predicting and simulating human behavior in any situation that can be described in natural language. Essentially, Centaur is the first real candidate for a unified model of human cognition, the scientists state. The model was developed by fine-tuning the open LLM Llama 3.1 70B (created by Meta, recognized as extremist and prohibited in the Russian Federation) on the novel Psych-101 dataset, which contains the results of 160 psychological experiments involving over 60 thousand people and more than 10 million individual choices. The study showed that Centaur is excellent at simulating the behavior of new participants and successfully handles generalization tasks. Of course, more experiments, studies and adjustments are required. However, the scientists forecast that their development will have a disruptive impact on the cognitive sciences. For details, see this preprint.
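For readers curious about the general recipe, the sketch below shows what LoRA-style fine-tuning of an open LLM on a text dataset can look like with the Hugging Face stack. It is not the authors’ training code: the file name psych101.jsonl, the "text" field and all hyperparameters are illustrative assumptions, and a 70B model would in practice require multi-GPU sharding or quantization.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "meta-llama/Llama-3.1-70B"  # assumed Hugging Face identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Attach low-rank adapters instead of updating all 70B parameters.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Each record is assumed to hold one experiment transcript in a "text" field.
dataset = load_dataset("json", data_files="psych101.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="centaur-sketch",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```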
SuperJob: Neural networks raise salaries for programmers but lower them for AI users
Good news for programmers, and not so good for marketing specialists. According to a recent study by SuperJob, developers experienced in machine learning earn more than colleagues without such experience. For example, a Python programmer skilled in ML earns 370 thousand rubles a month on average, while a specialist without such knowledge earns 270 thousand rubles. But for positions that require using neural networks on the job, the salaries offered are lower than those for experts working in the same fields without AI. Researchers attribute this to employers believing that artificial intelligence greatly simplifies the work and reduces the time spent on routine operations, so they are unwilling to pay more. Those losing out because of AI include project managers, graphic and UX/UI designers, sales managers and marketing experts.