Incremental Learning in Deep Learning
Abstract
Researchers often try to capture as much information as they can, whether by using existing architectures, creating new ones, going deeper, or employing different training methods. This paper compares ideas and methods that are used heavily in Machine Learning and examines which of them work best in practice. These methods are prevalent across domains such as Computer Vision and Natural Language Processing (NLP).
Transfer Learning is the Key
Throughout our work, we have tried to keep generalization in focus, because that is what matters in the end. Any model should be robust and able to work outside your research environment. When a model lacks generalization, we often try to train it on datasets it has never encountered … and that is when things start to get much more complex. Each dataset comes with its own characteristics, and we have to adjust our setup so the model can accommodate them.
One common way to address this is transfer learning: reusing what a model has learned in one domain in another.
Given a specific task in a particular domain, we first train our model on a labelled dataset from that task and domain. In practice, this dataset is usually the largest one available in the domain, so that the features it yields can be reused effectively. In computer vision, that is mostly ImageNet, which has 1,000 classes and more than 1 million images. A network trained on it is bound to extract features that are difficult to obtain otherwise. Initial layers usually capture small, low-level details such as edges and textures, and as we go deeper, the ConvNet captures increasingly task-specific features; this is what makes ConvNets fantastic feature extractors.
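To make this concrete, here is a minimal sketch of using an ImageNet-pretrained ConvNet as a pure feature extractor. It assumes PyTorch and torchvision; ResNet-18 and the input sizes are illustrative choices, not specifics from the discussion above.

```python
# Minimal sketch: ImageNet-pretrained ConvNet as a feature extractor
# (assumes torchvision >= 0.13; older versions use pretrained=True instead).
import torch
import torchvision.models as models

# Load a ConvNet pretrained on ImageNet (1,000 classes, >1M images).
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Drop the final classification layer so the network only produces features.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

with torch.no_grad():
    image_batch = torch.randn(4, 3, 224, 224)   # stand-in for real, preprocessed images
    features = feature_extractor(image_batch)   # shape: (4, 512, 1, 1)
    features = features.flatten(1)              # (4, 512) feature vectors

print(features.shape)
```

The resulting feature vectors can then be fed to any downstream classifier for the new task.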
Normally we let the ConvNet learn these features by training it on the larger dataset and then modify only the end of the network: the fully connected layers at the end can be replaced with whatever combination of linear layers the new classification task requires. This makes it easy to transfer the knowledge of our network to another task.
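Below is a hedged sketch of that head-replacement step, again assuming torchvision's ResNet-18. Freezing the backbone, the 256-unit hidden layer, and the 10-class output are hypothetical choices made for illustration, not prescriptions from the text.

```python
# Minimal sketch: swap the fully connected head for a new task and freeze the backbone.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pretrained feature extractor so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the ImageNet classifier with a small stack of linear layers for the new task.
num_features = model.fc.in_features          # 512 for ResNet-18
model.fc = nn.Sequential(
    nn.Linear(num_features, 256),
    nn.ReLU(),
    nn.Linear(256, 10),                      # 10 = number of classes in the new dataset
)

# Only the parameters of the new head are passed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

Whether to keep the backbone frozen or fine-tune it end-to-end depends on how similar the new dataset is to ImageNet and how much labelled data is available.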
To read more about it, please refer to this original paper:
Using Transfer Learning to Introduce Generalization in Models
Link to all research papers:
- An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
- Visualizing and Understanding Convolutional Networks
- Universal Language Model Fine-tuning for Text Classification
- Learning Without Forgetting
- Deep Residual Learning for Image Recognition
- Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution