Bumblebee: A transformer-based foundation model for proton-proton collisions, demonstrated on the pp->ttbar process
ORAL
Abstract
Bumblebee is a transformer-based foundation model pre-trained on proton-proton collision data from the Large Hadron Collider (LHC) and designed to generalize across particle physics tasks. Inspired by BERT, Bumblebee removes positional encodings and embeds particle 4-vectors, which lets it capture both generator- and reconstruction-level information while remaining invariant to the order of the input sequence. Pre-trained on a masked-modeling task, in which parts of the input are masked and then predicted, Bumblebee improves top quark reconstruction resolution by 10-20% over traditional methods on Monte Carlo simulations of the pp->ttbar process. It also performs well on downstream tasks, including distinguishing bound states of top quark pairs and identifying whether the process was initiated by gluons or quarks. Bumblebee could be applied to a wide range of physics processes at the LHC, aid in the discovery of new particles, and facilitate fast detector simulation.
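As a rough illustration of the two ideas named in the abstract (masked pre-training on particle 4-vectors, and order invariance from omitting positional encodings), a minimal sketch might look like the following. It is not the Bumblebee implementation: the function names, the all-zero mask placeholder, the 25% mask fraction, and the energy-based toy attention score are all assumptions made for this example, and events are assumed to be non-empty.

```python
import math
import random

# Assumed placeholder standing in for a masked particle (E, px, py, pz).
MASK = (0.0, 0.0, 0.0, 0.0)


def mask_event(event, mask_fraction=0.25, rng=None):
    """Hide a random subset of particles in a non-empty event.

    Returns (masked_event, targets), where targets maps each masked index
    to the original 4-vector that a model would be trained to predict.
    """
    rng = rng or random.Random()
    n_mask = max(1, round(mask_fraction * len(event)))
    masked_idx = rng.sample(range(len(event)), n_mask)
    targets = {i: event[i] for i in masked_idx}
    masked = [MASK if i in targets else p for i, p in enumerate(event)]
    return masked, targets


def attention_pool(event, scale=100.0):
    """Softmax-weighted sum of 4-vectors with no positional term.

    The toy score is the particle energy divided by `scale`. Because no
    position enters the score or the sum, reordering the particles leaves
    the result unchanged up to floating-point rounding.
    """
    scores = [p[0] / scale for p in event]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return tuple(
        sum(w * p[k] for w, p in zip(weights, event)) / total
        for k in range(4)
    )
```

Calling `attention_pool` on an event and on a shuffled copy of it gives the same summary up to rounding, which is the property the abstract attributes to dropping positional encodings. `mask_event` produces the kind of (input, target) pair that a masked pre-training objective would consume.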
Presenters
-
Ethan Colbert
Purdue University
Authors
-
Yao Yao
Purdue University
-
Mia Liu
Purdue University
-
Andrew James Wildridge
CMS
-
Andreas Jung
Purdue University
-
Jack Rodgers
Purdue University
-
Ethan Colbert
Purdue University