Data Machina #250
Llama-3 Watershed Moment. Multi AI Agent Collaboration. AI Agents Planning. Idefics2-8B V-L Model. Google Gemini Cookbook. Quantisation Intro. torchtune. DeepMind Penzai. YouTube Commons Dataset.
Llama 3: A Watershed AI moment? I reckon the release of Llama 3 is perhaps one of the most important moments in AI development so far. The Llama 3 stable is already giving birth to all sorts of amazing animals and model derivatives. You can expect Llama 3 to unleash the mother of all battles against closed AI models like GPT-4.
Meta AI just posted: “Our largest Llama 3 models are over 400B parameters. And they are still being trained.” The upcoming Llama-400B will change the playing field for many independent researchers, small AI startups, solo AI developers, and also enterprise AI apps. For now, The Zuck and Yann LeCun are the bastions of “open AI.”
Quick Llama 3 Summary:
A family of SOTA, open models available in both 8B & 70B parameter sizes, in pre-trained base and instruction-tuned versions
License. Open, but not fully Apache 2.0 open source: the license is free for research and commercial applications but comes with limitations. Read the Llama 3 license here.
Open models and weights upon request. Get them here.
Trained on 24K GPUs and more than 15 trillion tokens. Massive for models of this size.
Context window expanded to 8,192 tokens. Many people expected at least 128K.
New tokeniser with a 128K-token vocabulary, built on top of OpenAI’s tiktoken.
Meta AI official blogpost: Introducing Meta Llama 3: The most capable openly available LLM to date
Nathan’s great overview of all the tech details: Llama 3: Scaling open LLMs to AGI
Run Llama 3 with the Meta AI intelligent assistant. Llama 3 has been integrated into Meta AI. Try it for chat, coding tasks, and problem solving here. It also runs on Facebook, WhatsApp, and Instagram. If you’re not in the US, try it with a VPN.
Easily deploy Llama 3 on cloud AI stacks. Using Hugging Face’s deployment integrations, you can now deploy Llama 3 on Azure ML, Google Vertex AI, Amazon SageMaker, or Hugging Face hosting. Check out: Hugging Face Meta-Llama-3-8B click-to-deploy.
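Before (or instead of) spinning up a managed endpoint, a quick local smoke test with transformers is often enough. A minimal sketch, assuming you’ve been granted access to the gated meta-llama/Meta-Llama-3-8B-Instruct repo, have a recent transformers release, and roughly 16 GB of GPU memory for bf16 weights:

```python
# Minimal local test of Llama-3-8B-Instruct via transformers (sketch, not production code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated repo: request access first

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 weights need ~16 GB of VRAM for the 8B model
    device_map="auto",
)

# Llama 3 Instruct expects the chat template shipped with its tokenizer.
messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Explain the GGUF format in one sentence."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 uses <|eot_id|> to end assistant turns, so pass it as an extra stop token.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

output = model.generate(input_ids, max_new_tokens=128, eos_token_id=terminators, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```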
Run Llama 3 at blazing speed and super-low cost; a minimal API sketch follows after the two options below.
Run it on the Together AI inference engine. Use the Together 2.0 Inference Engine to get up to 350 tokens per second for Llama 3 8B and up to 150 tokens per second for Llama 3 70B, running in full FP16 precision. Blogpost: Together AI releases Meta Llama 3 for inference and fine-tuning.
Run it on Groq AI hardware. Groq is an innovative AI hardware stack optimised for super-efficient, super-fast, cheap AI compute. Researchers at ChatLabs show how running Llama 3 on Groq blows GPT-4 Turbo, Claude 3 Opus, and Gemini Pro out of the water on speed and price. Blogpost: Meta AI Llama 3 With Groq Outperforms Private Models on Speed/Price/Quality Dimensions?
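Both services expose OpenAI-compatible endpoints, so a quick speed test needs little more than the openai Python client pointed at a different base URL. A minimal sketch, assuming the base URLs and model ids below (taken from each provider’s docs at the time of writing, so double-check them) and that you have exported TOGETHER_API_KEY or GROQ_API_KEY:

```python
# Sketch: calling Llama 3 through an OpenAI-compatible endpoint (Together or Groq).
# Base URLs and model ids are assumptions -- verify them against each provider's docs.
import os
from openai import OpenAI

PROVIDERS = {
    "together": {
        "base_url": "https://api.together.xyz/v1",
        "model": "meta-llama/Llama-3-70b-chat-hf",   # assumed Together model id
        "key": os.environ.get("TOGETHER_API_KEY"),
    },
    "groq": {
        "base_url": "https://api.groq.com/openai/v1",
        "model": "llama3-70b-8192",                   # assumed Groq model id
        "key": os.environ.get("GROQ_API_KEY"),
    },
}

cfg = PROVIDERS["groq"]  # or "together"
client = OpenAI(api_key=cfg["key"], base_url=cfg["base_url"])

resp = client.chat.completions.create(
    model=cfg["model"],
    messages=[{"role": "user", "content": "Summarise the Llama 3 release in two sentences."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```

Because swapping providers only means changing the base URL and model id, reproducing these speed/price comparisons yourself is cheap and easy.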
Run Llama-3-Instruct-8B GGUF for efficient chat. GGUF is a binary format optimised for quick loading and saving of models. The Llama 3 instruction-tuned models are optimised for dialogue and outperform most open-source chat models. Get Meta-Llama-3-8B-Instruct-GGUF here. Thanks to the great @nousresearch and @ggerganov.
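If you want that GGUF running locally, llama-cpp-python is the usual route. A minimal sketch, assuming a 4-bit quant downloaded from the repo above (the filename below is illustrative) and a recent llama-cpp-python build that reads Llama 3’s chat template from the GGUF metadata; on older builds you may need to pass chat_format="llama-3" explicitly:

```python
# Sketch: chatting with a Llama-3-8B-Instruct GGUF file via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # illustrative local filename for a 4-bit quant
    n_ctx=8192,        # Llama 3's native context window
    n_gpu_layers=-1,   # offload all layers to GPU/Metal if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the GGUF format good for?"}],
    max_tokens=200,
)
print(out["choices"][0]["message"]["content"])
```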
Run Llama 3 on Apple silicon devices. You can now run any Llama 3 model quantised to 4-bit or 8-bit on your local Apple silicon device using Apple’s MLX framework. Thanks to the awesome @Prince_Canuma.
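A minimal sketch with the mlx-lm package (pip install mlx-lm), assuming a pre-quantised 4-bit build from the mlx-community organisation on the Hub; the exact repo name is an assumption, so check what the community has published:

```python
# Sketch: running a 4-bit Llama 3 quant on Apple silicon with mlx-lm.
from mlx_lm import load, generate

# Assumed community repo name for a pre-quantised 4-bit build; verify on the Hugging Face Hub.
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

prompt = "Explain in one paragraph why on-device inference matters."
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```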
See how Llama 3 was jailbroken. The researchers at Meta AI say they spent a lot of time safeguarding and red-teaming Llama 3. Well, I’m not so sure about that because, inevitably, lots of jailbreaks are starting to pop up. Check out A Trivial Jailbreak Against Llama 3 or Jailbreaking Llama 3 for educational purposes.
Have a nice week.
10 Link-o-Troned
[tutorial] Overview of LM Model Alignment Methods (77 slides)
[amazing] MSR VASA-1: Lifelike Audio-Driven Talking Faces in Real Time
the ML Pythonista
Deep & Other Learning Bits
AI/DL ResearchDocs
Stanford STORM: Writing Wikipedia-like Articles from Scratch with LLMs
DeepMind: The Limits of Token Prediction & Many-shot In-context Learning
Mini-Gemini: Enhancing Multi-Modality in Vision-Language Models (repo, etc)
MLOps Untangled
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.