Quantifying thermodynamic properties of texts using Jaynes' principle of maximum entropy
POSTER
Abstract
Jaynes’ principle of maximum entropy can be used to study language by quantifying patterns between specific sequences of letters [1,2]. The empirical frequencies of pairwise letter combinations in the words of a given text serve as constraints: maximizing the entropy subject to them assigns an interaction potential to each pairwise combination. This framework yields a Boltzmann distribution over the energies of all possible words, both real words and pseudowords [1,2]. We can therefore examine properties analogous to those in thermodynamics, such as average energy, temperature, and heat capacity, for English texts by various authors. By calculating the heat capacity as a function of temperature for the word probability distribution of a given text, we find signals at specific temperatures that correspond to changes in word type and composition. We also find that the probability distribution for the energies of words in a specific text has a characteristic temperature.
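As a rough illustration of the heat capacity calculation mentioned in the abstract, the sketch below assumes each word has already been assigned an energy E and that word probabilities follow P(E) proportional to exp(-E/T), with Boltzmann's constant set to 1. The heat capacity is then taken from the energy fluctuations, C(T) = (<E^2> - <E>^2) / T^2. The function names and the energy values are hypothetical and chosen only for illustration; fitting the pairwise interaction potentials to a real text is not shown, and this is not necessarily the authors' implementation.

```python
import math

def boltzmann_probabilities(energies, temperature):
    """Return P_i proportional to exp(-E_i / T), with k_B set to 1 (assumption)."""
    # Shift by the minimum energy before exponentiating for numerical stability;
    # the shift cancels when the weights are normalized.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def heat_capacity(energies, temperature):
    """Return C(T) = (<E^2> - <E>^2) / T^2 under the Boltzmann distribution."""
    probs = boltzmann_probabilities(energies, temperature)
    mean_e = sum(p * e for p, e in zip(probs, energies))
    mean_e2 = sum(p * e * e for p, e in zip(probs, energies))
    return (mean_e2 - mean_e ** 2) / temperature ** 2

# Hypothetical word energies, invented for illustration only.
example_energies = [0.0, 1.0, 1.0, 2.5, 4.0]

# Scan a temperature grid and locate the largest heat capacity on that grid.
temperatures = [0.1 * k for k in range(1, 51)]
capacities = [heat_capacity(example_energies, t) for t in temperatures]
peak_temperature = temperatures[capacities.index(max(capacities))]
```

In this toy setting, a peak in `capacities` marks a temperature at which the distribution shifts weight between groups of energy levels, which is the kind of signal the abstract describes for real texts.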
[1] G. J. Stephens and W. Bialek, Physical Review E 81, 066119 (2010)
[2] A. Corral and M. G. del Muro, Entropy 22, 179 (2020)
Presenters
-
Fahd Tarek Hatoum
Emory University
Authors
-
Fahd Tarek Hatoum
Emory University
-
Kristen P Gram
Emory University
-
Jiayu Sui
Emory University
-
Effrosyni Seitaridou
Emory University
-
Alfred C Farris
Emory University