Coarse-graining, scale invariance, and emergence of summarization in narratives
ORAL
Abstract
One of the hallmarks of intelligence is the ability to abstract and summarize complex information into compact representations. This cognitive process is closely linked with the verbal recall of memories, where coherent information is often chunked and retrieved as summaries. To understand how summarization may arise in natural language, we develop a toy model of narratives in which narrative structures are abstracted into a hierarchy of keypoints, each representing a coarse-grained version of the underlying clauses. The exact solution to this model shows that at any given level of the hierarchy, there exists a scaling regime where the distribution of compression ratios for the keypoints, measured as fractions of narrative length, approaches a universal limit. In this limit, summarization emerges as a natural consequence of the system's scale invariance: for any keypoint with a fixed compression ratio drawn from this invariant distribution, the longer the narrative is, the larger the fraction of the narrative it summarizes. Large language model-assisted analysis of a human memory study reveals similar distributions of the compression ratios in the recall of meaningful narratives, where the empirical scaling exponent is found to approach the theoretical prediction in the limit of long narratives. Overall, our theory provides a framework for the systematic coarse-graining of language and enables a quantitative understanding of summarization and abstraction.
Presenters
-
Weishun Zhong
Institute for Advanced Study
Authors
-
Weishun Zhong
Institute for Advanced Study
-
Tankut Can
Emory University and Institute for Advanced Study
-
Mikhail Katkov
Weizmann Institute of Science and Institute for Advanced Study
-
Ilya Shnayderman
Weizmann Institute of Science
-
Antonios Georgiou
Institute for Advanced Study
-
Misha Tsodyks
Weizmann Institute of Science and Institute for Advanced Study