Generic Expert Coverage for Pruning Sparse Mixture-of-Experts Language Models
Researchers have introduced Generic Expert Coverage, a method for pruning redundant experts from sparsely activated Mixture-of-Experts language models without requiring additional calibration data. The approach targets the difficulty of shrinking large models whose expert networks are often overlapping or unnecessary, which could make such models more efficient to deploy.
Covered by 1 source
- arXiv CS.AI: Yongqin Zeng, Sicheng Pan, Jiale Wang, Hai-tao Zheng, Hong-Gee Kim, Chunxia Ma, XiuTeng Zhou