Data Efficiency of the Symmetric Information Bottleneck
ORAL
Abstract
The information bottleneck is an example of traditional Dimensionality Reduction; it compresses one set of variables while preserving maximal information about another. The symmetric information bottleneck, on the other hand, is a Dual Dimensionality Reduction technique that simultaneously compresses two sets of random variables while preserving maximal information between the compressed sets. We explore the data size requirements of both methods by analytically calculating error bounds and mean squared errors in the estimation of the mutual information terms used in both bottlenecks. We additionally introduce and examine the data size requirements of the deterministic symmetric information bottleneck, a symmetric bottleneck in which the mutual information is replaced by entropy and the resulting compression mappings are deterministic. We show that, in many situations of practical interest, the symmetric information bottleneck is more data efficient than the non-symmetric information bottleneck. We believe that this is an example of a more general principle: Dual Dimensionality Reduction methods are often more data efficient than their traditional Dimensionality Reduction equivalents.
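Both bottlenecks rest on estimating mutual information from finite data, which is where the data size requirements enter. As a generic illustration (not the authors' analysis), here is a minimal sketch of the standard plug-in (histogram) estimator of mutual information for discrete variables; its well-known upward finite-sample bias is the kind of estimation error whose scaling with sample size the abstract refers to. The function name is ours.

```python
import numpy as np

def plug_in_mutual_information(x, y):
    """Plug-in (maximum-likelihood) estimate of I(X;Y) in bits
    from paired samples of two discrete variables."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    # Map observed symbols to contiguous indices.
    _, x_idx = np.unique(x, return_inverse=True)
    _, y_idx = np.unique(y, return_inverse=True)
    # Empirical joint distribution over observed symbol pairs.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= n
    # Marginals, kept 2-D so the outer product broadcasts.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

# Perfectly correlated binary variables carry one bit of information:
print(plug_in_mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
# Independent variables carry none:
print(plug_in_mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```

Because the plug-in estimate is biased upward by roughly (|X|-1)(|Y|-1)/(2N ln 2) for N samples, estimators of this type need more data as the alphabets grow, which motivates the kind of sample-size analysis described above.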
Presenters
-
K. Michael Martini
Emory University
Authors
-
K. Michael Martini
Emory University
-
Ilya M. Nemenman
Emory University