
Quantifying thermodynamic properties of texts using Jaynes' principle of maximum entropy

POSTER

Abstract

Jaynes’ principle of maximum entropy can be used to study language by quantifying patterns among specific sequences of letters [1,2]. The empirical frequencies of pairwise letter combinations in the words of a given text serve as the constraints under which entropy is maximized, which assigns an interaction potential to each pairwise combination. This framework yields a Boltzmann distribution over the energies of all possible words, both real words and pseudowords [1,2]. We can therefore examine quantities analogous to those in thermodynamics, such as average energy, temperature, and heat capacity, for English texts by various authors. By calculating the heat capacity as a function of temperature for the word probability distribution of a given text, we find signals at specific temperatures that correspond to changes in word type and composition. We also find that the probability distribution of word energies in a specific text has a characteristic temperature.
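As an illustration of the heat-capacity calculation described above, the following is a minimal sketch, not the authors' implementation. It assumes that each word already has an energy (in the framework above, this would come from the fitted pairwise interaction potentials), sets the Boltzmann constant to 1, and uses the standard fluctuation formula C(T) = (⟨E²⟩ − ⟨E⟩²)/T². The function names and the `toy_energies` values are hypothetical and are not data from the study.

```python
import math


def boltzmann_probs(energies, temperature):
    """Return p_i proportional to exp(-E_i / T), with k_B set to 1 (assumption)."""
    e_min = min(energies)  # shift energies so the largest weight is exp(0) = 1
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]


def mean_energy(energies, temperature):
    """Average energy <E> under the Boltzmann distribution at this temperature."""
    probs = boltzmann_probs(energies, temperature)
    return sum(p * e for p, e in zip(probs, energies))


def heat_capacity(energies, temperature):
    """Heat capacity from energy fluctuations: (<E^2> - <E>^2) / T^2."""
    probs = boltzmann_probs(energies, temperature)
    e1 = sum(p * e for p, e in zip(probs, energies))
    e2 = sum(p * e * e for p, e in zip(probs, energies))
    return (e2 - e1 * e1) / temperature ** 2


# Hypothetical word energies in arbitrary units (illustrative only).
toy_energies = [0.0, 1.0, 1.0, 2.5, 4.0]

# Scan a temperature grid and locate the temperature where C(T) is largest.
temperatures = [0.1 * k for k in range(1, 51)]
curve = [heat_capacity(toy_energies, t) for t in temperatures]
t_peak = temperatures[curve.index(max(curve))]
```

In a sketch like this, peaks in `curve` play the role of the signals mentioned in the abstract: they mark temperatures at which the set of words carrying most of the probability changes.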



[1] G. J. Stephens and W. Bialek, Physical Review E 81, 066119 (2010)

[2] A. Corral and M. G. del Muro, Entropy 22, 179 (2020)

Presenters

  • Fahd Tarek Hatoum

    Emory University

Authors

  • Fahd Tarek Hatoum

    Emory University

  • Kristen P Gram

    Emory University

  • Jiayu Sui

    Emory University

  • Effrosyni Seitaridou

    Emory University

  • Alfred C Farris

    Emory University