Data Efficiency of the Symmetric Information Bottleneck
ORAL
Abstract
The information bottleneck is an example of traditional Dimensionality Reduction; it compresses one set of variables while preserving maximal information about another. The symmetric information bottleneck, on the other hand, is a Dual Dimensionality Reduction technique that simultaneously compresses two sets of random variables while preserving maximal information between the compressed sets. We explore the data size requirements of both methods by analytically calculating error bounds and mean squared errors in the estimation of the mutual information terms used in both bottlenecks. We additionally introduce and examine the data size requirements of the deterministic symmetric information bottleneck, a symmetric bottleneck in which the mutual information is replaced by entropy and the resulting compression mappings are deterministic. We show that, in many situations of practical interest, the symmetric information bottleneck is more data efficient than the non-symmetric information bottleneck. We believe that this is an example of a more general principle: Dual Dimensionality Reduction methods are often more data efficient than their traditional Dimensionality Reduction equivalents.
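Both bottlenecks rest on estimating mutual information from finite data, which is where the data size requirements enter. As a generic illustration (not the authors' analysis), here is a minimal sketch of the standard plug-in (histogram) estimator of mutual information for discrete variables; its well-known upward finite-sample bias is the kind of estimation error whose scaling with sample size the abstract refers to. The function name is ours.

```python
import numpy as np

def plug_in_mutual_information(x, y):
    """Plug-in (maximum-likelihood) estimate of I(X;Y) in bits
    from paired samples of two discrete variables."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    # Map observed symbols to contiguous indices.
    _, x_idx = np.unique(x, return_inverse=True)
    _, y_idx = np.unique(y, return_inverse=True)
    # Empirical joint distribution over observed symbol pairs.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= n
    # Marginals, kept 2-D so the outer product broadcasts.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

# Perfectly correlated binary variables carry one bit of information:
print(plug_in_mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
# Independent variables carry none:
print(plug_in_mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```

Because the plug-in estimate is biased upward by roughly (|X|-1)(|Y|-1)/(2N ln 2) for N samples, estimators of this type need more data as the alphabets grow, which motivates the kind of sample-size analysis described above.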
Presenters
-
K. Michael Martini
Emory University
Authors
-
K. Michael Martini
Emory University
-
Ilya M. Nemenman
Emory University