Adaptive Similarity Bootstrapping for Self-Distillation based Representation Learning
- Resource Type
- Conference
- Authors
- Lebailly, Tim; Stegmuller, Thomas; Bozorgtabar, Behzad; Thiran, Jean-Philippe; Tuytelaars, Tinne
- Source
- 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 16459-16468, Oct. 2023
- Subject
- Computing and Processing
- Signal Processing and Analysis
- Training
- Representation learning
- Schedules
- Computer vision
- Codes
- Self-supervised learning
- Behavioral sciences
- Language
- ISSN
- 2380-7504
Most self-supervised methods for representation learning leverage a cross-view consistency objective, i.e., they maximize the representation similarity of a given image’s augmented views. Recent work (NNCLR) goes beyond the cross-view paradigm and uses positive pairs from different images, obtained via nearest-neighbor bootstrapping, in a contrastive setting. We empirically show that, as opposed to the contrastive setting, which relies on negative samples, incorporating nearest-neighbor bootstrapping into a self-distillation scheme can lead to a performance drop or even collapse. We scrutinize the reason for this unexpected behavior and provide a solution. We propose to adaptively bootstrap neighbors based on the estimated quality of the latent space. We report consistent improvements over both the naive bootstrapping approach and the original baselines, across various self-distillation method/backbone combinations and standard downstream tasks. Our code is publicly available at https://github.com/tileb1/AdaSim.
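To make the idea concrete, here is a minimal NumPy sketch of adaptive nearest-neighbor bootstrapping. It is not the authors' implementation: the fixed similarity threshold `tau` is an illustrative stand-in for the paper's adaptive estimate of latent-space quality, and the function names are hypothetical. The sketch looks up each embedding's nearest neighbor in a memory queue of past embeddings and swaps it in only when the neighbor is similar enough; otherwise it falls back to the original (cross-view) embedding.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def adaptive_nn_bootstrap(z, queue, tau=0.7):
    """Replace each embedding in `z` by its nearest neighbor from `queue`
    when the neighbor is similar enough; otherwise keep the original.

    z:     (B, D) batch of embeddings
    queue: (Q, D) memory bank of past embeddings
    tau:   similarity gate (stand-in for the adaptive quality estimate)
    """
    z = l2_normalize(z)
    queue = l2_normalize(queue)
    sims = z @ queue.T                        # cosine similarities, (B, Q)
    nn_idx = sims.argmax(axis=1)              # nearest neighbor per sample
    nn_sim = sims[np.arange(len(z)), nn_idx]  # similarity to that neighbor
    use_nn = (nn_sim >= tau)[:, None]         # adaptive gate per sample
    return np.where(use_nn, queue[nn_idx], z)
```

In a self-distillation pipeline, the returned embeddings would serve as the (stop-gradient) targets for the student network; gating on similarity is what prevents low-quality neighbors from dragging training toward collapse.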