Contrastive learning has been used recently to reach remarkable performance in the image domain and is closing the performance gap with supervised pre-training when transferred to downstream tasks. These methods learn representations by mapping different augmented views of the same input sample (positive pairs) closer while pushing augmented views of different input samples (negative pairs) apart. Contrastive methods require a large number of negative samples to achieve good performance, necessitating large batch sizes or memory banks of negative samples. More recently, Bootstrap Your Own Latent (BYOL) showed that explicit negative pairs are not required to learn transferable visual representations. Instead, it uses two networks – an online network and a target network – to encode two augmented views of the same input sample. The online network is trained to predict the output of the target network, and the target network is updated with an exponential moving average of the online network.

Figure 1: Overview of our proposed method. We sample two random viewpoints s_{i,c_1} and s_{i,c_2} of the same action sequence captured from different viewing angles and transform them with two asymmetric augmentation pipelines. The resulting views are passed to the online and target networks. The online network is trained to predict the output of the target network. The stop-gradient operator ensures that the gradients are only propagated through the online network, while the target network's parameters are updated with an exponential moving average of the online network's parameters.

In this work, we adapt BYOL for skeleton-based action recognition. As is the case with contrastive learning, data augmentation is an essential part of this method, as it guides the network to learn relevant features in the absence of labels. We propose new data augmentation approaches for action sequences tailored to make the learned representations robust to semantically-irrelevant variations. We introduce two asymmetric augmentation pipelines to reduce the distribution shift between self-supervised pre-training and supervised fine-tuning. We also propose a multi-viewpoint sampling method to leverage readily available positive pairs to produce distinct views of the same sample.
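To make the training step described in Figure 1 concrete, below is a minimal PyTorch-style sketch of one BYOL update on a pair of viewpoints. The encoder architecture, input shape, augmentation functions, and momentum value `tau` are illustrative assumptions, not the authors' actual implementation.

```python
import copy
import torch
import torch.nn.functional as F

class BYOLBranch(torch.nn.Module):
    """Encoder + projector (+ predictor, used only by the online branch)."""
    def __init__(self, t=64, joints=25, dim=256):
        super().__init__()
        # Placeholder encoder: flattens a (T, J, 3) skeleton sequence.
        self.encoder = torch.nn.Sequential(
            torch.nn.Flatten(),
            torch.nn.Linear(t * joints * 3, dim),
            torch.nn.ReLU(),
        )
        self.projector = torch.nn.Linear(dim, dim)
        self.predictor = torch.nn.Linear(dim, dim)

    def project(self, x):
        return self.projector(self.encoder(x))

def byol_loss(online, target, x_online, x_target):
    # The online branch predicts the target projection; torch.no_grad() plays the
    # role of the stop-gradient, so gradients flow only through the online network.
    p = F.normalize(online.predictor(online.project(x_online)), dim=-1)
    with torch.no_grad():
        z = F.normalize(target.project(x_target), dim=-1)
    return (2 - 2 * (p * z).sum(dim=-1)).mean()

@torch.no_grad()
def ema_update(online, target, tau=0.996):
    # Target parameters track an exponential moving average of the online ones.
    for p_o, p_t in zip(online.parameters(), target.parameters()):
        p_t.mul_(tau).add_(p_o, alpha=1 - tau)

def training_step(online, target, optimizer, view_1, view_2, augment_a, augment_b):
    # view_1 / view_2: the same action recorded from two camera viewpoints
    # (the readily available positive pair); each goes through a different,
    # asymmetric augmentation pipeline.
    x1, x2 = augment_a(view_1), augment_b(view_2)
    # Symmetrized objective, as in the original BYOL: each view predicts the other.
    loss = byol_loss(online, target, x1, x2) + byol_loss(online, target, x2, x1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(online, target)
    return loss.item()

online = BYOLBranch()
target = copy.deepcopy(online)
for p in target.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.Adam(online.parameters(), lr=1e-3)
```

Note that the target network is never updated by gradient descent; it only follows the online network through the EMA, which, together with the stop-gradient, is what keeps the two branches from collapsing to a trivial constant representation despite the absence of negative pairs.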