Data Machina #224
Latest on AI Agents. AI Agents in PROD. OpenAgents. AgentTuning. MusicAgent. Fuyu-8B. AI Agents vs. Devs. PaLI-3. The End of Finetuning. Llemma Model for Math.
On boring AI conferences and the latest in AI Agents. This week I attended yet another big AI conference in which the main theme was “the latest in AI.” Alas, the event was extremely boring, a huge disappointment. There was nothing about “the latest in AI”; rather, it was a mixed bag with loads of AI fluff, AI marketing spin, and common basic AI stuff… My feedback to the organiser: “Replace the dull presenters and the suits behind booths with AI Agents, and you’ll do better!” Speaking of AI Agents…
AI Agents in production. This is a great post on lessons learned from building AI agents in production. For the past six months, engineers at Parcha have been building enterprise-grade AI Agents that instantly automate manual workflows in compliance and operations. The engineers share some reflections and the lessons they have learned in this blogpost: Building AI Agents in Production.
An open platform for AI Agents. This looks really awesome. The team at Xlang-AI just released OpenAgents, an open platform for using and hosting language agents. The platform already includes a Data/SQL Agent, a Plugins Agent with 200+ tools, and a Web Agent for autonomous web browsing. OpenAgents is easy to deploy and comes with a web UI. Check out the repo, demo, and instructions: OpenAgents: An Open Platform for Language Agents in the Wild.
Tuning open AI Agents for complex tasks. Agents built on open LLMs have shown good performance on general tasks, but they are still inferior to commercial agents when performing complex tasks in the real world. These agents employ LLMs as the central controller responsible for planning, memorisation, and tool utilisation, so they require both fine-grained prompting and robust LLMs to achieve satisfactory performance. AgentTuning is a simple and general method to enhance the agent abilities of LLMs while maintaining their general LLM capabilities. See model, code, dataset & paper: AgentTuning: Enabling Generalised Agent Abilities For LLMs.
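For readers new to the pattern, here is a minimal sketch of the "LLM as central controller" loop described above: plan, call a tool, remember the observation, repeat. Everything in it is an assumption made for illustration. `stub_controller` is a hard-coded stand-in for a real LLM call, and the `TOOLS` registry and the `tool:argument` / `final:answer` text protocol are hypothetical; none of this is taken from AgentTuning or any other project mentioned here.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def stub_controller(task: str, memory: List[str]) -> str:
    """Stand-in for an LLM call. Returns 'tool:argument' or 'final:answer'."""
    if not memory:
        # Planning: nothing observed yet, so choose a tool for the task.
        return "add:" + task
    # Memory holds an observation, so finish with the latest one.
    return "final:" + memory[-1]


def run_agent(
    task: str,
    controller: Callable[[str, List[str]], str] = stub_controller,
    max_steps: int = 5,
) -> str:
    """Run the plan -> tool -> remember loop until the controller finishes."""
    memory: List[str] = []  # observations gathered so far
    for _ in range(max_steps):
        action, _, payload = controller(task, memory).partition(":")
        if action == "final":
            return payload
        observation = TOOLS[action](payload)  # tool utilisation
        memory.append(observation)            # memorisation
    raise RuntimeError("agent did not finish within max_steps")


if __name__ == "__main__":
    print(run_agent("2+3"))
```

In a real agent the controller would be an LLM prompted with the task, the tool descriptions, and the memory so far; the loop around it stays roughly this shape.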
An AI Agent for music. Researchers at Microsoft just released MusicAgent, an agent that automates music understanding and music generation workflows. The agent can be used for songwriting, lyrics generation, text-to-music, singing voice generation, and much more. MusicAgent provides integration with Spotify and with music/audio models on Hugging Face & GitHub. Paper, repo here: MusicAgent: An AI Agent for Music Understanding and Generation.
A multimodal architecture for AI Agents. Researchers at Adept just open-sourced Fuyu-8B, a multimodal model for agents that has a much simpler architecture and training procedure than other multimodal models. The model is easy to scale and deploy. Fuyu-8B was designed from the ground up for digital agents, so it can support arbitrary image resolutions, answer questions about graphs and diagrams, answer UI-based questions, and do fine-grained localization on screen images. Blogpost: Fuyu-8B: A Multimodal Architecture for AI Agents.
Have a nice week.
10 Link-o-Troned
the ML Pythonista
Deep & Other Learning Bits
AI/ DL ResearchDocs
data v-i-s-i-o-n-s
MLOps Untangled
AI startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.