QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
ORAL
Abstract
Large Language Models (LLMs) have become critical tools across many domains, but fine-tuning them for specific tasks remains challenging. We propose Quantum-informed Tensor Adaptation (QuanTA), a novel method that leverages quantum computation-inspired techniques to achieve efficient high-rank fine-tuning. Unlike Low-Rank Adaptation (LoRA), which may struggle with complex downstream tasks because of its low-rank nature, QuanTA enables high-rank adaptations, supported by theoretical results such as the universality theorem and the rank representation theorem, to overcome these limitations. QuanTA delivers improved performance on commonsense reasoning and arithmetic reasoning tasks, as well as better scalability, while using fewer trainable parameters than other methods and introducing no inference overhead. Furthermore, QuanTA can be integrated with existing fine-tuning algorithms, providing a scalable and efficient approach to enhancing LLMs and advancing the state of the art in natural language processing.
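The abstract's central claim is that small tensors acting on factorized axes of the hidden dimension can realize a high-rank update with far fewer trainable parameters than a dense matrix. Below is a minimal, hypothetical PyTorch sketch of that idea; the function name `quanta_style_update` and the factorization 64 = 4 × 4 × 4 are illustrative choices, not the authors' code, and the exact QuanTA parameterization (and how it is combined with the pretrained weights) is specified in the paper.

```python
import torch

# Factorize the hidden dimension D into three small axes, D = d1 * d2 * d3.
d1, d2, d3 = 4, 4, 4
D = d1 * d2 * d3  # 64

# Adaptation "gates": each acts on a pair of axes, analogous to two-qubit gates
# in a quantum circuit. Initialized near identity so the adapted map starts
# close to the pretrained behavior.
T12 = (torch.eye(d1 * d2).reshape(d1, d2, d1, d2)
       + 0.01 * torch.randn(d1, d2, d1, d2)).requires_grad_()
T23 = (torch.eye(d2 * d3).reshape(d2, d3, d2, d3)
       + 0.01 * torch.randn(d2, d3, d2, d3)).requires_grad_()

def quanta_style_update(x):
    """Apply the circuit-like adaptation to a batch of activations x: (batch, D)."""
    batch = x.shape[0]
    t = x.reshape(batch, d1, d2, d3)
    # Gate on axes (1, 2): contract the (d1, d2) indices of the activation tensor.
    t = torch.einsum('abij,nijc->nabc', T12, t)
    # Gate on axes (2, 3): contract the (d2, d3) indices.
    t = torch.einsum('bcjk,najk->nabc', T23, t)
    return t.reshape(batch, D)

# The composed map is an implicit D x D matrix whose rank is not capped by a
# small LoRA rank r (near identity it is full rank), yet the trainable parameter
# count is only (d1*d2)**2 + (d2*d3)**2 = 512, versus D*D = 4096 for a dense matrix.
x = torch.randn(2, D)
print(quanta_style_update(x).shape)  # torch.Size([2, 64])
```

This sketch only illustrates the parameter-count versus rank trade-off behind the high-rank claim; it makes no statement about the specific circuit layout or training recipe used in QuanTA.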
Publication: https://nips.cc/virtual/2024/poster/96019
arXiv: https://arxiv.org/abs/2406.00132
Presenters
- Zhuo Chen, Massachusetts Institute of Technology

Authors
- Zhuo Chen, Massachusetts Institute of Technology
- Rumen Dangovski, Massachusetts Institute of Technology
- Charlotte Loh, Massachusetts Institute of Technology
- Owen Dugan, Massachusetts Institute of Technology
- Di Luo, Massachusetts Institute of Technology
- Marin Soljačić, Massachusetts Institute of Technology