Efficient Large Scale Language Modeling with Mixture-of-Experts
Meta is also working on efficient language models with MoE
#language-model
#scaling
#mixture-of-experts
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Scaling language models with less global warming
#language-model
#scaling
#mixture-of-experts
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
A large-scale speech model is here
#speech-model
#scaling
Scaling Vision Transformers
Scaling up vision transformers takes them higher
#vision-transformer
#scaling