Just Mix Once: Mixing Samples with Implicit Group Distribution


Recent work has unveiled how average-case generalization frequently relies on superficial patterns in the data. The consequence is brittle models that perform poorly under domain shift in the group distribution at test time. When the subgroups in the training data are known, we can use tools from robust optimization to tackle the problem. However, group annotation and identification are time-consuming, especially on large datasets. A recent line of research~\cite{liu2021just} tackles this problem with an implicit group distribution at training time, leveraging self-supervision and oversampling to improve generalization on minority groups. Following these ideas, we propose a new class-conditional variant of MixUp~\cite{zhang2017mixup} for worst-group generalization, augmenting the training distribution with a continuous distribution of groups. Our method, called Just Mix Once (JM1), is domain-agnostic and computationally efficient, and performs on par with or better than the state of the art on worst-group generalization.
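The core idea of a class-conditional MixUp can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes pairs are mixed within the same class (so labels are preserved while inputs interpolate across unknown groups), with the mixing weight drawn from a Beta distribution as in standard MixUp; the function name and signature are hypothetical.

```python
import numpy as np

def class_conditional_mixup(x, y, alpha=1.0, rng=None):
    """Mix each sample with a randomly chosen sample of the SAME class.

    Hypothetical sketch: unlike vanilla MixUp, which mixes arbitrary pairs
    (and their labels), pairing within a class keeps labels intact while
    the inputs interpolate between samples from different latent groups.
    """
    rng = np.random.default_rng(rng)
    lam = rng.beta(alpha, alpha)          # mixing weight, shared across the batch
    x_mixed = x.astype(float).copy()
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)      # indices of samples in class c
        perm = rng.permutation(idx)       # random within-class pairing
        x_mixed[idx] = lam * x[idx] + (1 - lam) * x[perm]
    return x_mixed, y, lam
```

Because mixing stays within a class, the targets need no interpolation, which keeps the method compatible with any standard classification loss.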

In NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and Applications
Yunlong Jiao
Applied Machine Learning Research
