Optimal Retrieval for Continual Learning at Scale

Date

2024

Authors

Hickok, Truman

Abstract

In recent years, deep neural networks have emerged as extremely scalable machine learning models. These models are often trained on billions of samples; however, as gaps in models' skill sets are revealed, naively continuing training after the expensive initial training phase leads to rapid forgetting of past tasks and reduced transfer to new tasks. Continual learning research is concerned with developing methods to counteract these effects, allowing models to continue training over data streams of indefinite length without overwriting existing representations.

One of the most widely used approaches in continual learning is replay. Replay methods support interleaved learning by storing past experiences in a replay buffer. Although methods exist for selectively constructing the buffer and reprocessing its contents, the problem of selectively retrieving samples from the buffer has received limited attention. Current solutions have been tested in limited settings and, more importantly, in isolation. Existing work has also not explored the impact of duplicate replays on performance.

In this thesis, we propose a framework for evaluating selective retrieval strategies, categorized by simple, independent class- and sample-selective primitives. We evaluate several combinations of existing selective retrieval strategies and report their performance. Furthermore, we propose a set of strategies to prevent duplicate replays and explore whether new samples with low loss values can be learned without replay. To match our problem setting to a realistic continual learning pipeline, we restrict our experiments to a large, pre-trained, open-vocabulary object detection model, which is fully fine-tuned on a sequence of 15 datasets.
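
To make the idea of composing class- and sample-selective primitives with duplicate prevention concrete, the following is a minimal illustrative sketch, not the implementation used in the thesis. The class name `ReplayBuffer`, its methods, the uniform class choice, and the highest-loss-first sample choice are all assumptions chosen for illustration; the abstract does not specify which primitives are used.

```python
import random
from collections import defaultdict


class ReplayBuffer:
    """Toy replay buffer (hypothetical); stores sample ids with a class and a loss."""

    def __init__(self, seed=0):
        self.by_class = defaultdict(list)  # class_id -> list of sample ids
        self.loss = {}                     # sample_id -> last recorded loss
        self.replayed = set()              # ids already replayed in this pass
        self.rng = random.Random(seed)

    def add(self, sample_id, class_id, loss):
        self.by_class[class_id].append(sample_id)
        self.loss[sample_id] = loss

    def select_classes(self, k):
        # Class-selective primitive (assumed): uniform choice among classes
        # that still hold at least one sample not yet replayed.
        available = [
            c for c, ids in self.by_class.items()
            if any(i not in self.replayed for i in ids)
        ]
        return self.rng.sample(available, min(k, len(available)))

    def select_samples(self, class_id, n):
        # Sample-selective primitive (assumed): highest stored loss first,
        # skipping anything already replayed to avoid duplicates.
        fresh = [i for i in self.by_class[class_id] if i not in self.replayed]
        fresh.sort(key=lambda i: self.loss[i], reverse=True)
        return fresh[:n]

    def retrieve(self, num_classes, per_class):
        # Compose the two primitives, then mark the batch as replayed.
        batch = []
        for c in self.select_classes(num_classes):
            batch.extend(self.select_samples(c, per_class))
        self.replayed.update(batch)
        return batch

    def reset(self):
        # Start a new pass: every stored sample becomes eligible again.
        self.replayed.clear()


buf = ReplayBuffer(seed=0)
for i in range(6):
    buf.add(sample_id=i, class_id=i % 2, loss=float(i))
first = buf.retrieve(num_classes=2, per_class=2)
second = buf.retrieve(num_classes=2, per_class=2)
```

In this sketch the two primitives are independent, so either could be swapped for another rule (for example, random sample choice) without touching the other, and the `replayed` set guarantees that no sample is returned twice until `reset` is called.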

Department

Electrical and Computer Engineering