Identifying pattern in microarray expression series using algorithmic information theory
ORAL
Abstract
We introduce a method of detecting pattern in data series independent of the nature of the pattern. This is achieved by calculating a lower bound on the Algorithmic Information Content (AIC) of the data series, the exact value of the AIC being fundamentally uncomputable. This bound also provides us with a measure of the algorithmic compressibility. Data series which are highly compressible are more likely to result from simple underlying mechanisms than series which are incompressible. We show that the compression in bits is a universal currency by which we can order data series according to their significance, even if they are from different experiments or exhibit different kinds of pattern or noise. We test our method on microarray time series of the yeast cell cycle and show that it is very successful at blindly selecting genes identified by independent experimental studies, without making any assumptions about what kind of pattern these data series contain.
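The idea of ranking series by compression in bits can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes a general-purpose compressor (zlib) as a crude stand-in for an AIC-based bound, an equal-width quantization into four levels, and a comparison against random shufflings of the same values. The names `quantize`, `compressed_bits` and `bits_saved`, and all parameter values, are hypothetical.

```python
import math
import random
import zlib


def quantize(series, levels=4):
    """Map each value to a symbol 0..levels-1 by equal-width binning (an assumption)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return bytes(len(series))  # constant series: every symbol is 0
    width = (hi - lo) / levels
    return bytes(min(int((x - lo) / width), levels - 1) for x in series)


def compressed_bits(symbols):
    """Size in bits of the zlib-compressed symbol string (stand-in compressor)."""
    return 8 * len(zlib.compress(symbols, 9))


def bits_saved(series, levels=4, shuffles=200, seed=0):
    """Bits saved relative to the mean compressed size of shuffled copies.

    Shuffling keeps the distribution of values but destroys their order,
    so a positive score indicates structure in the ordering itself.
    """
    rng = random.Random(seed)
    symbols = quantize(series, levels)
    original = compressed_bits(symbols)
    total = 0
    for _ in range(shuffles):
        perm = list(symbols)
        rng.shuffle(perm)
        total += compressed_bits(bytes(perm))
    return total / shuffles - original


# Illustrative synthetic data, not microarray measurements.
periodic = [math.sin(2 * math.pi * t / 16) for t in range(256)]
noise_rng = random.Random(1)
noise = [noise_rng.random() for _ in range(256)]

# Rank series from different "experiments" by the same currency: bits saved.
ranking = sorted(
    {"periodic": periodic, "noise": noise}.items(),
    key=lambda item: bits_saved(item[1]),
    reverse=True,
)
for name, series in ranking:
    print(name, bits_saved(series))
```

Real microarray time series are much shorter than these synthetic examples, and the fixed overhead of a general-purpose compressor can dominate at such lengths, which is one reason the sketch scores a series relative to its own shufflings rather than by raw compressed size.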
–
Authors
-
Sebastian Ahnert
University of Cambridge, UK
-
Karen Willbrand
Ecole des Mines de Paris, Fontainebleau, France
-
Francis Brown
Ecole Normale Superieure, Paris, France
-
Thomas Fink
Institut Curie, Paris, France