Just Mix Once: Mixing Samples with Implicit Group Distribution

Giorgio Giannone, Serhii Havrylov, Jordan Massiah, Emine Yilmaz, Yunlong Jiao

December 2021

Abstract

Recent work has unveiled how average generalization frequently relies on superficial patterns in data. The consequences are brittle models with poor performance in the presence of domain shift in group distribution at test time. When the subgroups in the training data are known, we can use tools from robust optimization to tackle the problem. However, group annotation and identification are time-consuming tasks, especially on large datasets. A recent line of research~\cite{liu2021just} is trying to solve this problem with implicit group distribution at training time, leveraging self-supervision and oversampling to improve generalization on minority groups. Following such ideas, we propose a new class-conditional variant of MixUp~\cite{zhang2017mixup} for worst-group generalization, augmenting the training distribution with a continuous distribution of groups. Our method, called Just Mix Once (JM1), is domain-agnostic, computationally efficient, and performs on par or better than the state-of-the-art on worst-group generalization.

Type

Preprint

Publication

In NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and Applications