Bumblebee: A transformer-based foundation model for proton-proton collision, demonstrated on the pp->ttbar process.

ORAL

Abstract

Bumblebee is a transformer-based foundation model pre-trained on proton-proton collision data from the Large Hadron Collider (LHC) and designed to generalize across particle physics tasks. Inspired by BERT, Bumblebee removes positional encodings and embeds particle 4-vectors, which lets it capture both generator- and reconstruction-level information while remaining invariant to sequence order. It is pre-trained with a masked-prediction task in which parts of the input are masked and then predicted. On Monte Carlo simulations of the pp->ttbar process, Bumblebee improves top quark reconstruction resolution by 10-20% over traditional methods. It also performs well on downstream tasks, including distinguishing bound states of top quark pairs and identifying whether the process originated from gluons or quarks. Bumblebee could be applied to a wide range of physics processes at the LHC, and could also aid in the discovery of new particles and facilitate fast detector simulations.
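
The two ideas in the abstract, attention without positional encodings and masked pre-training, can be illustrated with a small NumPy sketch. This is not Bumblebee's architecture or code: the single attention head, the random weights, the toy 4-vectors, and the names `self_attention`, `mask_particles`, and `encode` are all assumptions made for illustration only.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention with no positional encoding added to x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def mask_particles(x, mask_idx, mask_value=0.0):
    """Return a copy of x with the selected particles replaced by mask_value."""
    masked = x.copy()
    masked[mask_idx] = mask_value
    return masked

rng = np.random.default_rng(0)
n_particles, d_model = 6, 8

# Toy particle 4-vectors (E, px, py, pz); random numbers, not physics data.
four_vectors = rng.normal(size=(n_particles, 4))

# Hypothetical linear embedding and attention weights.
w_embed = rng.normal(size=(4, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

def encode(p4):
    """Embed 4-vectors and apply one attention layer (illustrative only)."""
    return self_attention(p4 @ w_embed, w_q, w_k, w_v)

# Shuffling the input particles only shuffles the output rows in the same way,
# so any order-independent pooling of the output is unchanged.
perm = rng.permutation(n_particles)
out = encode(four_vectors)
out_perm = encode(four_vectors[perm])
equivariant = np.allclose(out[perm], out_perm)
pooled_invariant = np.allclose(out.mean(axis=0), out_perm.mean(axis=0))

# Masked pre-training setup: hide some particles, keep the originals as targets.
mask_idx = [1, 4]
masked_input = mask_particles(four_vectors, mask_idx)
targets = four_vectors[mask_idx]
```

In a sketch like this, a model would be trained to recover `targets` from `masked_input`; the loss, the mask token, and the choice of which particles to mask are left open here because the abstract does not specify them.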

Presenters

  • Ethan Colbert

    Purdue University

Authors

  • Yao Yao

    Purdue University

  • Mia Liu

    Purdue University

  • Andrew James Wildridge

    CMS

  • Andreas Jung

    Purdue University

  • Jack Rodgers

    Purdue University

  • Ethan Colbert

    Purdue University