In this issue of #InfocusAI we talk about activation functions for neural networks, a new CV algorithm for improving astronomical images, and Chinese companies' reaction to the call for a pause in LLM development. At the end of the article, we recommend a few research papers to help you better understand what to expect from the rapid development of large language models.
AI-focused digest – News from the AI world
Issue 15, 23 March – 6 April 2023
Non-standard activation functions can increase the accuracy of neural networks
Neural networks can be designed in a way that minimizes the probability of misclassifying input data. That is the conclusion of MIT researchers after studying infinitely wide (in the number of neurons per layer) and infinitely deep (in the number of layers) neural networks trained on classification tasks. The researchers discovered that the standard activation functions developers often use in practice are in some cases not the best choice: a network's performance can actually get worse as its depth increases. In an article published in Proceedings of the National Academy of Sciences, they describe how the right choice of activation function can help developers design networks that classify data more accurately. They report that functions that have never been used before sometimes outperform the familiar ReLU or sigmoid, while remaining simple and easy to implement. The study also confirms the value of theoretical analysis. “If you go after a principled understanding of these models, that can actually lead you to new activation functions that you would otherwise never have thought of”, says Caroline Uhler, co-author of the research paper, as quoted on the MIT News website. The research is well explained in this article on the institute’s news portal, and the scientific version is here.
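For readers who want to experiment along these lines, below is a minimal sketch of how an activation function becomes a plug-in choice in a model. The `CustomActivation` here is a made-up smooth function for illustration only, not the one derived in the PNAS paper, and the architecture is our own toy example:

```python
import torch
import torch.nn as nn

class CustomActivation(nn.Module):
    """Hypothetical non-standard activation: a smooth blend of the identity
    and tanh. Illustration only; not the function from the PNAS paper."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 0.5 * x + 0.5 * torch.tanh(x)

def make_classifier(activation: nn.Module, width: int = 64, depth: int = 4) -> nn.Sequential:
    """A plain MLP classifier in which the activation is a plug-in choice."""
    layers = [nn.Linear(2, width), activation]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), activation]
    layers.append(nn.Linear(width, 2))  # logits for two classes
    return nn.Sequential(*layers)

# Identical architectures that differ only in the activation function:
relu_net = make_classifier(nn.ReLU())
custom_net = make_classifier(CustomActivation())
```

Training both networks on the same data and comparing test accuracy is then a straightforward way to see how much the activation choice alone can matter, especially as `depth` grows.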
Researchers develop a CV solution for clearer astronomical images
Researchers at Northwestern University and Tsinghua University have adapted a computer-vision algorithm originally used for sharpening photos to deblur astronomical images from telescopes. The motivation is not prettier space pictures: images, even from the best ground-based telescopes, are blurred by the Earth’s atmosphere, and these distortions can lead to errors in measurements of astronomical objects. For example, atmospheric blur can warp the apparent shape of galaxies. The new solution carefully removes the blur and does so faster and more accurately than currently used techniques. For training, the researchers took atmosphere-free images from the Hubble Space Telescope and added simulated atmospheric blur to them. The AI tool was originally designed to match the parameters of the Vera C. Rubin Observatory, which is expected to be fully operational next year. Read more about the project on the Innovation News Network website and check the code, guides and solution test results on GitHub.
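To give a flavour of that training setup, here is a minimal sketch of how one might turn atmosphere-free images into (blurred, sharp) training pairs. A Gaussian kernel stands in for the atmospheric point-spread function, and all parameter values are illustrative assumptions; the published pipeline models the atmosphere in far more detail:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_training_pair(sharp, seeing_sigma=2.0, noise_sigma=0.01, rng=None):
    """Simulate atmospheric degradation of a sharp (atmosphere-free) image.

    A Gaussian blur is only a crude stand-in for the real atmospheric PSF;
    the actual project is far more sophisticated.
    """
    rng = rng or np.random.default_rng(0)
    blurred = gaussian_filter(sharp, sigma=seeing_sigma)            # atmospheric blur
    blurred = blurred + rng.normal(0.0, noise_sigma, sharp.shape)   # detector noise
    return blurred, sharp  # network input, network target

# Example: a synthetic "galaxy" (a 2-D Gaussian blob) as the sharp image.
y, x = np.mgrid[-32:32, -32:32]
galaxy = np.exp(-(x**2 + y**2) / (2 * 6.0**2))
blurred, target = make_training_pair(galaxy)
```

A deblurring network trained on such pairs learns to invert the degradation, which is why starting from space-based (atmosphere-free) images is so convenient: the ground truth comes for free.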
UNESCO calls for implementing an ethical framework for AI immediately
Soon after the open letter from prominent AI experts urging IT companies to pause the training of powerful neural networks until shared safety protocols and regulatory policies are in place, UNESCO published an appeal to governments. The organisation points to its Recommendation on the Ethics of Artificial Intelligence, endorsed in 2021, and urges countries to implement it at the national level. The document is the first major guide to maximising the benefits and reducing the risks of AI. It contains values, principles and recommendations on ethical issues around AI, such as discrimination, disinformation, gender inequality, protection of privacy and personal data, human rights and environmental protection. The document was signed by all 193 Member States, but just over 40 countries are currently working in partnership with the organisation to implement the recommendations at the national level. UNESCO calls on the others to join the movement for ethical AI as well.
Chinese tech companies don’t share the idea of pausing AI development
Chinese IT companies do not seem to have much faith in artificially restraining technological progress: in their view, rather than pausing the development of large ML models like GPT-4, more effort should be put into making AI safe and manageable. Last week, China Daily ran an article compiling the views of key figures in the Chinese industry in response to the call by Elon Musk and other AI thought leaders to pause the development of powerful neural networks. In short, their position boils down to this: whether people want it or not, the technology will drive a new round of industrial revolution, and China needs to catch up with GPT and increase the country’s competitiveness. China’s AI spending is expected to reach $14.76 billion this year, about a tenth of the global AI market.
Ernie Bot taps into Chinese home appliances and cars
More news from China. Chinese home appliance, electronics and car manufacturers are stepping up to incorporate LLM-based chatbots into their products. More specifically, they are betting on Ernie Bot, currently the most advanced Chinese equivalent of ChatGPT. So far, major home appliance manufacturers such as Midea Group, Hisense Visual Technology Co Ltd and Sichuan Changhong Electric Co Ltd have joined the Ernie Bot ecosystem, as have automakers Jidu Auto, Geely, Dongfeng Nissan and Hongqi. The companies hope that these efforts will improve human-machine interaction, spur the development of NLP technologies and create new drivers of revenue growth. More in this China Daily article.
What we need to know about Large Language Models
And lastly, for a better understanding of LLM technology, we recommend the recent article “Eight Things to Know about Large Language Models” by New York University professor Samuel R. Bowman. He presents eight claims about large language models that are more or less accepted in the professional community and supports them with previously published academic papers and statements by AI researchers. Here are the points:
1. LLMs predictably get more capable with increasing investment, even without targeted innovation.
2. Many important LLM behaviours emerge unpredictably as a byproduct of increasing investment.
3. LLMs often appear to learn and use representations of the outside world.
4. There are no reliable techniques for steering the behaviour of LLMs.
5. Experts are not yet able to interpret the inner workings of LLMs.
6. Human performance on a task isn’t an upper bound on LLM performance.
7. LLMs need not express the values of their creators nor the values encoded in web text.
8. Brief interactions with LLMs are often misleading.
Many will also find the discussion section of Bowman’s article quite interesting. No spoilers: read the article here. In addition, if you have time for more than 300 pages, check out Stanford University’s Artificial Intelligence Index Report for the latest trends in the technology, as well as its impact on science, the environment, the labour market and the economy.