Efficient Large Scale Language Modeling with Mixtures of Experts
Meta is also working on efficient language models with MoE
#language-model
#scaling
#mixture-of-experts
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Scaling language models with less global warming
#language-model
#scaling
#mixture-of-experts