Data Machina #219
AI Agents: Better Than Humans? Cognitive Architectures for Language Agents (CoALA). μAgents. Against LLM Maximalism. XTTS Voice Cloning. AI Forecasting. NeXt-GPT Any2any Multimodal LLM. phi-1 1.3B
AI Agents: Better than Human Agents? So I booked my flights. I just realised I need to change my info before checking in because I made a typo. But the bloody app won’t let me do it. After 20 min on the phone, I finally get to talk to a human agent. The human agent instructs me to change the info via the app. I explain that I can’t because the app doesn’t allow me to do that! The human agent then asks me 10s of irrelevant Qs to finally conclude that I have to use the app to change my info before checking in!
Me: “Are you really a human agent? You just seem to be reading from a script and NOT listening to what I’m telling you! Are you an AI agent with a human voice?”
The Human Agent: “Sir, we follow top-class procedures to ensure that your experience as a client is always the best. Please log in to our app to provide your correct information before checking in.”
This human agent is pathetically useless. These days, most human agents sound and act 99% like AI agents. I guess we will be much better off with AI Agents that sound and act 99% like humans.
A survey on LLM-based AI agents. An excellent survey in which the researchers present a framework for LLM-based agents with three components: brain, perception, and action. The researchers explore AI agents in 3 areas: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. They then explore agent societies: the behaviour & personality of agents, the social phenomena that emerge when they form societies, and the insights they offer for human society. Paper: The Rise and Potential of LLM-Based Agents: A Survey.
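To make the brain / perception / action split concrete, here is a minimal toy sketch in Python. It is not code from the survey: the class, the tool names and the rule-based "brain" (a stand-in for an LLM call) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Toy agent with three parts: perception, brain, action."""

    # Hypothetical tool registry: action name -> callable that performs it.
    tools: Dict[str, Callable[[str], str]]
    # Everything the agent has observed so far.
    memory: List[str] = field(default_factory=list)

    def perceive(self, raw_input: str) -> str:
        # Perception: turn raw input into a normalised observation.
        return raw_input.strip().lower()

    def think(self, observation: str) -> str:
        # Brain: a rule-based stand-in for an LLM; stores the observation
        # and returns the name of the action to take.
        self.memory.append(observation)
        return "change_booking" if "typo" in observation else "small_talk"

    def act(self, action: str, observation: str) -> str:
        # Action: run the tool the brain selected.
        return self.tools[action](observation)

    def step(self, raw_input: str) -> str:
        observation = self.perceive(raw_input)
        return self.act(self.think(observation), observation)


agent = Agent(
    tools={
        "change_booking": lambda obs: "Booking details updated.",
        "small_talk": lambda obs: "How can I help?",
    }
)
print(agent.step("  I made a TYPO in my passenger info  "))
```

In a real agent the `think` step would be an LLM call and the tools would hit actual systems; the point here is only the shape of the loop.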
A new architecture for AI language agents. This is an interesting paper which aims to bring some order to the piecemeal approach followed by many researchers in AI agents. Researchers @PrincetonUni introduce Cognitive Architectures for Language Agents (CoALA), a conceptual framework to systematise diverse methods for LLM-based reasoning, grounding, learning, and decision making. Check out the paper, and the accompanying website with references and resources: Cognitive Architectures for Language Agents.
Building an AI agent with visual AI programming. Building AI agent apps programmatically is hard. A basic-to-mid level AI Agent app involves a lot of design cycles, prototyping, debugging, remote execution bits & pieces… Visual AI programming comes to the rescue. Last week I mentioned Rivet, an open source visual AI programming environment. Check out this post on how to build your first AI agent with Rivet. And also a quick vid below on using visual AI programming to build a simple chatbot.
An AI agent dev framework. μAgents is a fast and lightweight framework that makes it easy to build autonomous, decentralised AI agents. See this demo on how to build a μAgent for restaurant bookings.
A new library for building autonomous AI agents. Agents is an open-source library/framework for building autonomous language agents. The library supports key features like long-short term memory, tool usage, web navigation, multi-agent communication, human-agent interaction and symbolic control. See paper, code, examples, and website here: Agents: An Open-source Framework for Autonomous Language Agents.
New research on AI agents for medical diagnosis. In this paper, the researchers introduce a multi-agent conversational framework where doctor-AI and patient-AI agents interact to diagnose medical conditions, evaluated by a grader-AI agent and medical experts. The researchers assessed the accuracy of GPT-4 vs. traditional evaluation methods when diagnosing 140 cases. They found a decline in diagnostic accuracy, and identified key limitations in LLMs’ ability to integrate details from conversational interactions to improve diagnostic accuracy. Paper: Testing the Limits of Language Models: A Conversational Framework for Medical AI Assessment.
Managing a team of AI agents. MetaGPT is a multi-agent framework that assigns different roles to GPTs to form a collaborative software company for complex tasks. MetaGPT takes a one-line requirement as input, and outputs: competitive analysis, user stories, PRD, data structures, APIs, docs, tasks and repo. See: MetaGPT: The Multi-Agent Framework.
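The "one-line requirement in, company artefacts out" idea can be sketched as a simple role pipeline. This is not MetaGPT's actual API: the role functions, artefact names and `run_company` helper below are hypothetical, and each role is a plain function standing in for a GPT prompted with that role.

```python
from typing import Callable, Dict, List, Tuple

# A role takes the original requirement plus the artefacts produced so far
# and returns its own artefact as text.
Role = Callable[[str, Dict[str, str]], str]


def product_manager(requirement: str, artifacts: Dict[str, str]) -> str:
    # Hypothetical role: writes the PRD from the raw requirement.
    return f"PRD for: {requirement}"


def architect(requirement: str, artifacts: Dict[str, str]) -> str:
    # Hypothetical role: designs APIs and data structures from the PRD.
    return f"APIs and data structures derived from [{artifacts['prd']}]"


def project_manager(requirement: str, artifacts: Dict[str, str]) -> str:
    # Hypothetical role: breaks the design down into tasks.
    return f"Task list derived from [{artifacts['design']}]"


# Roles run in order; each one can read what earlier roles produced.
PIPELINE: List[Tuple[str, Role]] = [
    ("prd", product_manager),
    ("design", architect),
    ("tasks", project_manager),
]


def run_company(requirement: str) -> Dict[str, str]:
    artifacts: Dict[str, str] = {}
    for name, role in PIPELINE:
        artifacts[name] = role(requirement, artifacts)
    return artifacts


result = run_company("a CLI snake game")
for name, text in result.items():
    print(f"{name}: {text}")
```

The design choice worth noticing is the shared `artifacts` dict: each role builds on the previous role's output instead of re-reading the user's one-liner from scratch.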
A new community on AI Agents. This is a community of ML experts, professional prompt engineers, and AI enthusiasts that provide open source AI agents. Check out Open source AI Agent community.
Have a nice week.
10 Link-o-Troned
the ML Pythonista
Deep & Other Learning Bits
AI/ DL ResearchDocs
MSR - Textbooks is All You Need [phi-1 1.3B a Super Efficient Model]
Chain of Density (CoD) Prompting for Better GPT-4 Summarisation
data v-i-s-i-o-n-s
MLOps Untangled
AI startups -> radar
ML Datasets & Stuff
A Dataset with ~120k Python Code Exercises Generated by ChatGPT 3.5
The Human Chronome Project - A DB of Human Global Activities
Google MADLAD-400: A Multilingual & Doc-Level Large Audited Dataset
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.
Loved this post and many useful links as always. Thanks
AI agents are starting to beat human agents in many areas already. This link is a short article about a trial we did a few weeks ago for Financial Guidance on UK Pensions queries. Our LLM AI Chatbot makes 5-10x FEWER ERRORS than typical human agents of financial services companies. So very much supporting your thesis. https://medium.com/@mattgosden/trial-results-show-that-ai-answers-financial-queries-better-than-humans-6df77bbcab31
Hey Matt - thanks for sending the link, very interesting. It does seem to support my thesis :-)
There's research from Harvard/MIT/Wharton unis, working with businesses, that points to the same results. It’s early days and there’s a bit of debate on what evaluation methods & benchmarks we use, and how.
Most orgs try to automate the human agents with scripts, templates, etc for interacting with humans and resolving tasks derived from the chat. But being an automated human is not a human thing.
Most probably, AI agents built with the frameworks I mentioned, properly finetuned on instruction/task-oriented datasets and with a bit of RLHF (soon RLAIF), will soon beat most human agents at resolving most customer queries.