Controlled Forgetting in Vision-Language Models

Key Takeaways

Vision-language models (VLMs) have high generalization ability, but this can lead to risks if not controlled properly
The approximate domain unlearning (ADU) algorithm allows VLMs to "forget" specific domains to prevent misrecognition
ADU introduces a new perspective on risk management by enabling flexible AI configuration suited to individual practical scenarios
The proposed algorithm promotes separation between domains in the feature space and reduces recognition accuracy for unnecessary domains
ADU has the potential to provide safe and reliable AI technology for various applications

Introduction to Vision-Language Models
Vision-language models (VLMs) are a core technology in modern artificial intelligence (AI). They can recognize objects across different forms of expression, or domains, such as photographs, illustrations, and sketches. This strong generalization ability lets them recognize objects accurately regardless of the domain an image comes from. However, it is a double-edged sword, because it creates risks when left uncontrolled. For example, a VLM may recognize both real cars and illustrated cars as "cars," which causes problems when the model is installed in a system that needs to distinguish between the two.

The Need for Domain Control
The risk associated with VLMs' generalization ability is a significant concern, especially in safety-critical applications. Addressing it requires technology that can control learned knowledge according to the application. This is where the concept of domain control comes in. Domain control refers to the ability to selectively forget or ignore specific domains in a VLM to prevent misrecognition. It can be achieved with algorithms that distinguish between domains and reduce the recognition accuracy for unnecessary ones.
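To make "reducing the recognition accuracy for unnecessary domains" concrete, the sketch below measures accuracy separately for each domain. The function name `per_domain_accuracy`, the domain names, and the prediction records are all hypothetical and chosen for illustration; they are not taken from the researchers' evaluation.

```python
from collections import defaultdict


def per_domain_accuracy(records):
    """Return {domain: accuracy} for (domain, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, true_label, predicted_label in records:
        total[domain] += 1
        correct[domain] += int(true_label == predicted_label)
    return {domain: correct[domain] / total[domain] for domain in total}


# Hypothetical predictions from a model after domain control:
# photos should stay recognizable, illustrations should not.
records = [
    ("photo", "car", "car"),
    ("photo", "dog", "dog"),
    ("photo", "car", "car"),
    ("photo", "dog", "car"),
    ("illustration", "car", "dog"),
    ("illustration", "dog", "dog"),
    ("illustration", "car", "dog"),
    ("illustration", "dog", "car"),
]

accuracy = per_domain_accuracy(records)
print(accuracy)
```

Under this view, successful domain control shows up as high accuracy on the domains that are kept and low accuracy on the domains that are meant to be forgotten.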

The Approximate Domain Unlearning Algorithm
A team of researchers led by Associate Professor Go Irie of Tokyo University of Science has proposed the approximate domain unlearning (ADU) algorithm, which allows VLMs to "forget" specific domains. The algorithm introduces a Domain Disentangling Loss that promotes separation between domains in the feature space. It also uses an instance-wise prompt generator that captures the different domain appearances in each image. Together, these components reduce the recognition accuracy for unnecessary domains while maintaining it for the domains that are still needed. This enables flexible AI configuration suited to individual practical scenarios, a form of knowledge control that was previously impossible.
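One way to picture a training objective for this kind of selective forgetting is sketched below. It is an assumption made for illustration, not the researchers' published loss: images from retained domains get an ordinary cross-entropy term, while images from forgotten domains are pushed toward a uniform (uninformative) prediction. All function names, the `forget_weight` parameter, and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def retain_loss(logits, label):
    """Standard cross-entropy: low when the true class is predicted confidently."""
    return -math.log(softmax(logits)[label])


def forget_loss(logits):
    """Cross-entropy against a uniform target: lowest when the prediction is uniform."""
    probs = softmax(logits)
    return -sum(math.log(p) for p in probs) / len(probs)


def domain_forgetting_loss(batch, forget_domains, forget_weight=1.0):
    """Average loss over (logits, label, domain) samples (illustrative assumption)."""
    total = 0.0
    for logits, label, domain in batch:
        if domain in forget_domains:
            total += forget_weight * forget_loss(logits)
        else:
            total += retain_loss(logits, label)
    return total / len(batch)


# Hypothetical batch: class logits, true class index, domain name.
batch = [
    ([4.0, 0.5, 0.1], 0, "photo"),
    ([3.0, 0.2, 0.4], 0, "illustration"),
]
print(domain_forgetting_loss(batch, forget_domains={"illustration"}))
```

Minimizing such an objective would keep predictions sharp on retained domains and flatten them on forgotten ones. The sketch covers only this forgetting objective; it does not implement the Domain Disentangling Loss or the instance-wise prompt generator.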

Technical Difficulty and Solution
A major technical difficulty in making a VLM forget a domain is that domains are not clearly distinguished inside the model. Different domains overlap in the feature space, making it challenging to select and forget only specific ones. To overcome this challenge, the researchers introduced the Domain Disentangling Loss, which promotes separation between domains in the feature space. Once the domains are separated, the algorithm can target the unnecessary domains and reduce recognition accuracy for them alone.
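The difference between overlapping and separated domains can be illustrated with a simple score. The sketch below is a stand-in chosen for clarity, not the Domain Disentangling Loss itself: it divides the average distance between domain centroids by the average distance of features to their own centroid. The function names and the two-dimensional toy features are hypothetical.

```python
import math


def centroid(vectors):
    """Coordinate-wise mean of a list of equal-length feature vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def separation_score(features_by_domain, eps=1e-12):
    """Between-centroid distance divided by within-domain spread (higher = more separated)."""
    centroids = {d: centroid(vs) for d, vs in features_by_domain.items()}
    spreads = [
        math.dist(v, centroids[d])
        for d, vs in features_by_domain.items()
        for v in vs
    ]
    within = sum(spreads) / len(spreads)
    names = sorted(centroids)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    between = sum(math.dist(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)
    return between / (within + eps)


# Toy 2-D features: domains share one centroid vs. sit in distinct regions.
overlapping = {
    "photo": [(0.0, 0.0), (1.0, 1.0)],
    "sketch": [(0.0, 1.0), (1.0, 0.0)],
}
separated = {
    "photo": [(0.0, 0.0), (0.0, 1.0)],
    "sketch": [(4.0, 0.0), (4.0, 1.0)],
}
print(separation_score(overlapping), separation_score(separated))
```

A score near zero means the domains occupy the same region of the feature space, which is the situation that makes selective forgetting hard. A training loss that promotes domain separation would tend to raise a score of this kind.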

Implications and Future Directions
The ADU algorithm has significant implications for the development of safe and reliable AI technology. Because it lets a model be configured for each practical scenario, it can help prevent misrecognition and improve safety in a range of applications. The researchers believe that a system whose functions can be controlled freely will make it possible to provide safe and reliable AI technology to the world. The algorithm also introduces a new perspective on risk management: the strong generalization ability of modern AI models is itself a source of risk, and flexible configuration is needed to keep such models both safe and adaptable.

Conclusion and Future Research
In conclusion, the ADU algorithm is a notable advance in the development of vision-language models. By enabling VLMs to "forget" specific domains, it can help prevent misrecognition and improve safety in various applications. The researchers' findings have been presented at the 39th Conference on Neural Information Processing Systems (NeurIPS 2025). Future research should refine the ADU algorithm to make domain control in VLMs more efficient and effective, and should further explore its implications for risk management and safety-critical applications.