New Mixture-of-Experts (MoE) Models. MS Phi-2 2.7B Small Model. StripedHyena 7B Models. DeepMind Imagen 2. Diffusion Models + XGBoost. promptbase. Automated Continual Learning. CogAgent V-L Model.
Is there a good primer for understanding mixture-of-experts and how it differs from the original Transformer architecture?
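As a quick orientation (not a substitute for a full primer): in the original Transformer, every token passes through the same dense feed-forward network inside each block, whereas an MoE layer keeps several such feed-forward "experts" and a small learned router sends each token to only a few of them. The sketch below is a minimal, hypothetical PyTorch illustration of that idea; the names (`Expert`, `MoELayer`, `top_k`, etc.) are illustrative and not taken from any particular model or library.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    """One expert: the same two-layer FFN used in a standard Transformer block."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        return self.net(x)

class MoELayer(nn.Module):
    """Replaces the single dense FFN with several experts plus a learned router."""
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList([Expert(d_model, d_ff) for _ in range(num_experts)])
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.top_k = top_k

    def forward(self, x):
        # x: (batch, seq, d_model) -> flatten to a list of tokens
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                     # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(x.shape)

# Usage: a drop-in replacement for the FFN sub-layer. Only top_k experts run per
# token, so total parameters grow with num_experts while per-token compute stays
# roughly that of a single dense FFN.
layer = MoELayer(d_model=64, d_ff=256)
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```

The attention sub-layers are unchanged; the MoE swap happens only in the feed-forward position, which is why MoE models can have far more parameters than a dense Transformer at similar inference cost.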