RWKV-edge: Deeply Compressed RWKV for Resource-Constrained Devices

Wonkyo Choe, Yangfeng Ji, Felix Xiaozhu Lin

Publication year
2024
Access
Open access

Abstract

Non-transformer LLMs have achieved major breakthroughs toward deploying LLMs on resource-constrained platforms such as mobile robots and smartphones. Recently, a novel RNN-based LLM family, Receptance Weighted Key Value (RWKV), has shown strong computational efficiency; nevertheless, RWKV models still have high parameter counts, which limits their deployment. In this paper, we propose a suite of compression techniques, ranging from model architecture optimizations to post-training compression, tailored to the RWKV architecture. Combined, our techniques reduce the memory footprint of RWKV models by 3.4x to 5x with only negligible degradation in accuracy; compared to transformer LLMs with similar accuracy, our models require 4x less memory.

Keywords

cs.LG, cs.PF
