A primary limitation of Graph Neural Networks (GNNs) is their vulnerability to distribution shifts. In real-world applications, models often rely on “shortcuts”: spurious correlations, such as irrelevant motifs in molecular structures, that appear predictive during training but vanish in unseen environments. While invariant learning offers a remedy, balancing information retention against noise suppression in non-Euclidean spaces remains a significant hurdle. The problem is especially difficult when the nature of the environmental factors is unknown, which often causes models to prioritize unstable patterns over robust causal logic.
To address this, the USTC team developed the InfoIGL framework, which harmonizes the Variational Information Bottleneck (VIB) with multi-level contrastive learning. The architecture uses an attention-based redundancy filter to strategically compress the input graph, minimizing the mutual information between the raw data and its latent representation to “squeeze out” spurious noise. To preserve predictive power, the framework incorporates both semantic-level and instance-level contrastive learning. This dual-optimization strategy aligns representations across simulated environments, forcing the model to isolate the “invariant causal core” of the graph data for reliable decision-making.
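The two ingredients described above can be illustrated with a deliberately simplified sketch. This is not InfoIGL's actual implementation: the function names (`attention_compress`, `infonce_loss`), the node-dropping heuristic, and all parameter choices are illustrative assumptions. It only shows the general shape of an attention-based compression step followed by an instance-level contrastive (InfoNCE-style) objective.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_compress(node_feats, w_attn, keep_ratio=0.5):
    """Toy stand-in for an attention-based redundancy filter: score each
    node against an attention vector, keep only the top fraction, and
    reweight the survivors. Dropping low-scoring nodes is one crude way
    to reduce how much of the raw input the representation retains."""
    scores = softmax([dot(f, w_attn) for f in node_feats])
    k = max(1, int(len(node_feats) * keep_ratio))
    top = sorted(range(len(scores)), key=lambda i: scores[i])[-k:]
    return [[x * scores[i] for x in node_feats[i]] for i in sorted(top)]

def infonce_loss(anchor, positive, negatives, temp=0.5):
    """Instance-level contrastive (InfoNCE-style) loss: pull the anchor
    representation toward its positive view, away from the negatives."""
    def cos(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    logits = [cos(anchor, positive) / temp] + \
             [cos(anchor, n) / temp for n in negatives]
    return -math.log(softmax(logits)[0])

# Usage: compress a 4-node toy graph, then score one anchor embedding.
feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [2.0, 1.0]]
sub = attention_compress(feats, w_attn=[1.0, -1.0], keep_ratio=0.5)
loss = infonce_loss([1.0, 0.0], [0.9, 0.1], [[-1.0, 0.2], [0.0, 1.0]])
```

In the full framework, both pieces would be trained jointly so that the compression discards environment-specific noise while the contrastive terms keep the class-relevant (invariant) signal; the sketch omits the learning loop and the semantic-level term entirely.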
Comprehensive evaluations on DrugOOD, OpenGraph, and diverse synthetic datasets demonstrate InfoIGL’s robustness against structural and scale shifts. Notably, it achieved substantial accuracy gains on the HIV dataset compared to previous SOTA baselines. This work underscores the theoretical value of the Information Bottleneck principle in graph learning and provides a practical roadmap for developing adaptive graph intelligence systems capable of consistent performance in unpredictable real-world settings.
