After a short break, the #InfocusAI digest is back. In this issue, we talk about another CV model improvement from MIT that accelerates semantic segmentation of high-resolution images, and about a newly patented US technology that speeds up complex computations. We also cover how Chinese scientists suggest using LVLMs to detect industrial anomalies, what tool Meta (recognised as extremist and banned in the Russian Federation) has developed to check computer vision models for biases, and how much water ChatGPT “drinks”. Buckle up and let’s go!
AI-focused digest – news from the AI world
Issue 25, August 17 – September 14, 2023
MIT developed a more efficient CV model for autonomous vehicles
MIT, in collaboration with the MIT-IBM Watson AI Lab, has developed a more efficient computer vision model that performs semantic segmentation of high-resolution images faster on an edge device (e.g., a vehicle’s on-board computer). This is particularly valuable for autonomous vehicles, which need to recognise surrounding objects rapidly and accurately and make quick decisions about further movement. High-resolution images contain millions of pixels. Vision transformers, which are widely used now, chop images into patches of pixels and encode these patches into tokens. The next step is generating an attention map that captures how each token relates to every other token. The more pixels, the bigger the attention map, and the amount of computation grows quadratically with the number of tokens. MIT came up with a simpler mechanism for constructing the attention map, which they implemented in a new model series, EfficientViT. In short, the scientists replaced the nonlinear (softmax) similarity function in the attention module with a linear one based on ReLU. This makes it possible to change the order of operations and reduce the total number of calculations. To compensate for the resulting loss of accuracy, the researchers added components that capture local feature interactions and a module that helps distinguish between objects of different scales. This did not lead to a significant increase in computation. Tests showed that EfficientViT performs semantic segmentation up to nine times faster than popular vision transformers, without giving up accuracy. The researchers also emphasise that the EfficientViT architecture is hardware-friendly. Read the details on MIT News.
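The reordering of operations described above can be illustrated with a toy example. This is a minimal sketch in plain Python of a generic ReLU-based linear attention with row normalisation, not the EfficientViT code itself. The function names, the EPS constant and the toy sizes are all assumptions made for illustration.

```python
import random

EPS = 1e-6  # assumed small constant to avoid division by zero


def relu(m):
    """Element-wise ReLU on a matrix stored as a list of rows."""
    return [[max(0.0, x) for x in row] for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def attention_quadratic_order(q, k, v):
    """Compute (ReLU(Q) ReLU(K)^T) V: builds the full N x N attention map."""
    q, k = relu(q), relu(k)
    scores = matmul(q, transpose(k))   # N x N map, the expensive part
    out = matmul(scores, v)            # N x d
    # Normalise each output row by the sum of its attention scores.
    return [[x / (sum(s) + EPS) for x in o] for s, o in zip(scores, out)]


def attention_linear_order(q, k, v):
    """Compute ReLU(Q) (ReLU(K)^T V): never builds the N x N map."""
    q, k = relu(q), relu(k)
    kv = matmul(transpose(k), v)             # d x d summary of keys and values
    k_sum = [sum(col) for col in zip(*k)]    # length-d vector of summed keys
    out = matmul(q, kv)                      # N x d
    norms = [sum(a * b for a, b in zip(row, k_sum)) + EPS for row in q]
    return [[x / n for x in o] for o, n in zip(out, norms)]


# Toy data: N tokens of dimension d (hypothetical sizes).
random.seed(0)
N, D = 6, 3
q = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(N)]
k = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(N)]
v = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(N)]

slow = attention_quadratic_order(q, k, v)
fast = attention_linear_order(q, k, v)
max_diff = max(abs(x - y) for rs, rf in zip(slow, fast) for x, y in zip(rs, rf))
print("largest difference between the two orderings:", max_diff)
```

Both functions return the same values up to rounding, because matrix multiplication is associative once the softmax is gone. For N tokens of dimension d, the first ordering needs on the order of N²·d multiplications and the second N·d², which is where the saving comes from when N is much larger than d.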
A new technology for high-speed complex computation was patented in the USA
Last week, US-based Gigantor Technologies Inc announced that it had received a patent for its new invention, Custom Mass Multiplication Circuits, which the company says multiplies AI’s ability to perform complex calculations at high speed. Not much is known about the technology, except that it is based on a unique non-von Neumann model. The developers claim that their invention far surpasses the capabilities of the most advanced GPUs and TPUs and marks nothing less than a revolutionary step in the field of artificial intelligence. We’ll see where this move leads us… Find some details in the press release on BusinessWire.
Chinese researchers applied LVLM technology to IAD tasks
Chinese scientists have developed a unique method for detecting anomalies in industrial products, based on Large Vision-Language Models (LVLMs). The main difference between the new approach, called AnomalyGPT, and other IAD (Industrial Anomaly Detection) solutions is that it doesn’t require manual threshold adjustments to distinguish between abnormal and normal samples. Anomalous images with corresponding textual descriptions were used as training data. The researchers also employed an image decoder and prompt embeddings to fine-tune the tool. AnomalyGPT can not only indicate the presence and location of an anomaly, but also provide information about the image. It also supports multi-turn dialogues. AnomalyGPT’s accuracy in anomaly detection was tested on several data sets, and in all cases the new method outperformed previously presented ones. More details are in this preprint and on GitHub.
Meta* released a new tool to probe CV models for biases
Meta* has opened access to its new tool for testing computer vision models for biases. FACET (a tortured acronym for “FAirness in Computer Vision EvaluaTion”) allows researchers and AI experts to evaluate whether their CV models behave fairly and without bias when classifying people with various physical and demographic characteristics, for example by profession or occupation. In particular, it can be used to answer questions such as “Are models better at classifying people as skateboarders when their perceived gender presentation has more stereotypically male attributes?” and “Are any biases magnified when the person has curly hair compared to straight hair?” Creating FACET required manual labelling of over 30,000 pictures representing over 50,000 people. Incidentally, this is hardly the first Meta* tool for testing AI models for biases. Learn more about the IT giant’s successes and failures on the road to fair AI on TechCrunch.
*Meta is recognised as extremist and banned in the Russian Federation.
Scientists have calculated how much water ChatGPT “drinks”
And lastly, let’s talk about the environment, and more specifically, the importance of making artificial intelligence more efficient, both in training and in application. The Associated Press has published a detailed article about how much water artificial intelligence consumes. Citing an as yet unpublished study from the University of California, Riverside, the publication reports that OpenAI’s ChatGPT bot “drinks” up to half a litre of water every time it is given a series of 5-50 questions or prompts, depending on the weather and the server location. Water is required to remove heat from data centres and to cool the power plants that supply them with electricity. Microsoft, which provided OpenAI with computing power, reported that its water consumption in 2022 increased by 34% compared to 2021, to nearly 1.7 billion gallons (that’s more than 2,500 Olympic swimming pools). Google reported an increase of 20%. Experts attribute this to the AI activities of the IT giants. According to the West Des Moines Water Works, Microsoft pumped about 11.5 million gallons of water into its data centre cluster near Des Moines, Iowa, in July 2022 (a month before OpenAI said it had completed GPT-4 training). Quite a lot, isn’t it? This cluster can be considered the true birthplace of ChatGPT. Scientists hope that publicity around such large-scale water consumption by artificial intelligence will push AI giants and the scientific community to use resources more sustainably. Now, when talking to ChatGPT, will you consider how much water it took to respond?
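For the curious, the figures above can be sanity-checked with some quick arithmetic. The sketch below assumes US liquid gallons and a nominal Olympic pool of 50 m x 25 m x 2 m (2,500 cubic metres). Neither assumption comes from the AP article, and the per-prompt range simply divides half a litre by 50 and by 5.

```python
LITRES_PER_US_GALLON = 3.785411784  # definition of the US liquid gallon
OLYMPIC_POOL_LITRES = 2_500_000     # assumption: 50 m x 25 m x 2 m pool


def gallons_to_pools(gallons):
    """Convert US gallons to the number of nominal Olympic pools."""
    return gallons * LITRES_PER_US_GALLON / OLYMPIC_POOL_LITRES


def ml_per_prompt(litres, prompts):
    """Average millilitres of water per prompt for a batch of prompts."""
    return litres * 1000 / prompts


# Microsoft's reported 2022 consumption: nearly 1.7 billion gallons.
pools = gallons_to_pools(1.7e9)

# Up to half a litre per series of 5-50 prompts.
low_ml = ml_per_prompt(0.5, 50)   # longest series, least water per prompt
high_ml = ml_per_prompt(0.5, 5)   # shortest series, most water per prompt

print(f"Olympic pools: about {pools:.0f}")
print(f"Water per prompt: {low_ml:.0f} to {high_ml:.0f} ml")
```

Under these assumptions the pool count comes out a little above 2,500, in line with the comparison in the text, and a single prompt costs at most somewhere between a couple of teaspoons and a small glass of water.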