Data Machina #260

Jul 8

Vision-Language Models Booming. PaliGemma. Phi-3 Vision. Florence-2. LLaVA-NeXT. ML in Video Games. PCA in Latent Space. MosaicML Agents Framework. MoEs at Scale. GraphRAG. Image SSL on a Shoestring.


1 Comment

Nick's Newsletter

Nick’s Substack

Jul 12

This article brilliantly captures the rapid advancements in Vision-Language Models (VLMs). It's exciting to see the emergence of small, powerful VLMs like LLaVA-NeXT, PaliGemma, and Florence-2, which are not only democratizing access but also pushing the boundaries of multimodal capabilities. These models, with their open-source availability and state-of-the-art performance, mark a significant shift in the AI landscape, making cutting-edge technology more accessible and versatile. It's a thrilling time for innovation in AI!
