LiteKV: A KV cache compression method for efficient inference on resource-constrained devices
Authors:
Zhipeng Zhang, Dmitry Ilvovsky
Publication:
Information Processing & Management
© 2026 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.