Despite the massive recent leaps forward in machine learning models, experts are still grappling with the challenge of ensuring that machines don’t forget previously learned knowledge, especially when they are learning new knowledge.
This problem is known as catastrophic forgetting, or catastrophic interference. It occurs when the weights of an artificial neural network are optimized for a new task, which can, in turn, interfere with prior knowledge stored in the same weights. As an AI model parses new inputs, the statistical relationships between the model’s internal representations can change, mix or overlap, potentially leading to reduced performance (or “model drift”) or, at worst, to the model abruptly and drastically forgetting its prior training.
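The interference described above can be reproduced in a deliberately tiny setting. The sketch below is an illustrative toy, not drawn from any study cited here: a model with a single weight is trained on a hypothetical task A, then on a conflicting hypothetical task B, and its error on task A is measured before and after. The names `task_a`, `task_b`, `train` and `mse` are assumptions made for the example.

```python
# Toy demonstration: one shared weight, two tasks learned in sequence.

def mse(w, data):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr=0.05, steps=200):
    """Plain gradient descent on the mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

inputs = (-2.0, -1.0, 1.0, 2.0)
task_a = [(x, 2.0 * x) for x in inputs]   # hypothetical task A: best weight is 2
task_b = [(x, -1.0 * x) for x in inputs]  # hypothetical task B: best weight is -1

w = train(0.0, task_a)
loss_a_before = mse(w, task_a)   # close to zero: task A has been learned

w = train(w, task_b)             # the same weight is now optimized for task B
loss_a_after = mse(w, task_a)    # task A error rises sharply

print(loss_a_before, loss_a_after)
```

Because both tasks compete for the same weight, fitting task B necessarily overwrites what was learned for task A. In this toy the error on task A goes from roughly zero to about 22.5.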
There are a number of factors that might lead to a model ‘forgetting’. These include overfitting the model to new training data, limited model capacity, shared parameters, using a training technique that is ill-suited to the task, and the lack of regularization.
Nevertheless, some experts point out that the exact mechanisms behind catastrophic forgetting aren’t yet well understood.
“While there are a lot of studies in the field of continual learning investigating how to address catastrophic forgetting experimentally through algorithm design, there is still a lack of understanding on what factors are important and how they affect catastrophic forgetting,” explained Sen Lin, an assistant professor in the University of Houston’s computer science department and the co-author of a recent study on the effect of catastrophic forgetting on continual learning. “Our study filled these gaps up by revealing three important factors: model over-parameterization, task similarity, and task ordering, and their impacts on learning performance.”
In general, approaches to prevent catastrophic forgetting fall into three broad categories: regularization, memory-based techniques, and architecture-based methods.
Regularization techniques preserve the weight parameters that matter most to old tasks while the model is trained on new tasks. Elastic weight consolidation (EWC) is one such method.
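As a rough illustration of the regularization idea, the sketch below adds a quadratic penalty that pulls each weight back toward its value after the old task, scaled by how important that weight was. It is written in the spirit of elastic weight consolidation but is not a faithful implementation. The importance estimate (the mean squared input, which for a linear model with squared loss is proportional to the diagonal Fisher information), the penalty strength `lam`, and the two toy tasks are all assumptions chosen so that the new task can be solved with the unimportant weight.

```python
# Toy two-weight linear model: y = w[0] * x[0] + w[1] * x[1]

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train(w, data, anchor=None, importance=None, lam=0.0, lr=0.05, steps=500):
    """Gradient descent; optionally adds the gradient of the penalty
    (lam / 2) * sum_i importance[i] * (w[i] - anchor[i]) ** 2."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in data:
            err = predict(w, x) - y
            for i in range(2):
                grad[i] += 2 * err * x[i] / len(data)
        if anchor is not None:
            for i in range(2):
                grad[i] += lam * importance[i] * (w[i] - anchor[i])
        for i in range(2):
            w[i] -= lr * grad[i]
    return w

values = (-2.0, -1.0, 1.0, 2.0)
task_a = [((v, 0.0), 2.0 * v) for v in values]  # old task: only w[0] matters
task_b = [((v, v), -1.0 * v) for v in values]   # new task: needs w[0] + w[1] = -1

w_a = train([0.0, 0.0], task_a)

# Assumed importance estimate: mean squared input per weight on the old task.
importance = [sum(x[i] ** 2 for x, _ in task_a) / len(task_a) for i in range(2)]

w_plain = train(w_a, task_b)                                   # no penalty
w_reg = train(w_a, task_b, anchor=w_a, importance=importance, lam=10.0)

print("plain:      ", mse(w_plain, task_a), mse(w_plain, task_b))
print("regularized:", mse(w_reg, task_a), mse(w_reg, task_b))
```

Without the penalty, gradient descent spreads the change across both weights and degrades the old task. With it, the important weight stays close to its old value and the unimportant one absorbs the new task. Real networks rarely separate this cleanly, so the penalty usually trades some new-task accuracy for retention.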
Architecture-based techniques modify the model architecture itself, either by “freezing” the parameters that are critical to old tasks while new tasks are learned, or by increasing the model size when more capacity is required.
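Both architectural ideas, freezing and growing, can be sketched on a toy linear model. The example below is a hypothetical illustration rather than any specific published method: after the old task is learned, its weight is marked as frozen, a fresh weight is appended for a new input feature, and only the new weight is updated. The `frozen` mask and the task definitions are assumptions made for the example.

```python
def predict(weights, x):
    # zip stops at the shorter sequence, so old one-feature inputs still
    # work after the model has grown to two weights.
    return sum(w * xi for w, xi in zip(weights, x))

def mse(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def train(weights, frozen, data, lr=0.05, steps=500):
    """Gradient descent that skips every weight whose frozen flag is True."""
    weights = list(weights)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            for i, xi in enumerate(x):
                grads[i] += 2 * err * xi / len(data)
        for i in range(len(weights)):
            if not frozen[i]:
                weights[i] -= lr * grads[i]
    return weights

values = (-2.0, -1.0, 1.0, 2.0)
task_a = [((v,), 2.0 * v) for v in values]     # old task: one input feature
task_b = [((v, v), -1.0 * v) for v in values]  # new task: two input features

# Learn the old task with a single trainable weight.
weights = train([0.0], [False], task_a)
old_weight = weights[0]

# Grow the model by one weight and freeze the old one.
weights = weights + [0.0]
frozen = [True, False]
weights = train(weights, frozen, task_b)

print(weights, mse(weights, task_a), mse(weights, task_b))
```

Freezing guarantees the old task is untouched, at the cost of extra parameters for every new task, which is why growth-based approaches tend to be paired with some limit on model size.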
Memory-based techniques store information about old tasks in some form of memory, which the model can then “replay” while learning the current task.
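A minimal sketch of the replay idea follows, assuming a small fixed-size buffer filled by reservoir sampling and a toy two-weight linear model. The `ReplayBuffer` class, its capacity, the batch mix and the two tasks are all hypothetical choices for illustration. A production system would typically store far more data or, as in generative replay, train a model to regenerate old examples.

```python
import random

class ReplayBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep each example seen so far with equal probability.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        return self.rng.sample(self.items, min(k, len(self.items)))

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def sgd_step(w, batch, lr=0.05):
    grad = [0.0, 0.0]
    for x, y in batch:
        err = predict(w, x) - y
        for i in range(2):
            grad[i] += 2 * err * x[i] / len(batch)
    return [w[i] - lr * grad[i] for i in range(2)]

values = (-2.0, -1.0, 1.0, 2.0)
task_a = [((v, 0.0), 2.0 * v) for v in values]  # old task
task_b = [((v, v), -1.0 * v) for v in values]   # new task

# Learn the old task and remember a few of its examples.
buffer = ReplayBuffer(capacity=3)
w = [0.0, 0.0]
for _ in range(500):
    w = sgd_step(w, task_a)
for example in task_a:
    buffer.add(example)

# Learn the new task twice: without replay, and with replayed old examples.
w_plain, w_replay = list(w), list(w)
for _ in range(2000):
    w_plain = sgd_step(w_plain, task_b)
    w_replay = sgd_step(w_replay, task_b + buffer.sample(2))

print("no replay:", mse(w_plain, task_a), mse(w_plain, task_b))
print("replay:   ", mse(w_replay, task_a), mse(w_replay, task_b))
```

Mixing a handful of stored examples into each update keeps the old task in the training signal, so the model settles on weights that fit both tasks instead of drifting toward the new one alone.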
It is also possible to customize these techniques further with a hybrid approach, in which two or more of the aforementioned methods are combined to overcome the limitations of any one method. For instance, variational continual learning (VCL) integrates both elastic weight consolidation (EWC) and generative replay (GR) to regularize model weights while replaying old training data via a variational auto-encoder.
Despite this myriad of potential solutions, a universal solution for catastrophic forgetting has yet to be found. With AI models becoming ever larger, more complex and more polyvalent, catastrophic forgetting remains a crucial obstacle to overcome in the quest for continual learning.