Data Machina #186
LLMs: The latest. Stanford Advances in Foundation Models. Neuro-Symbolic LLM. TransformerXL explained. Trustable & auto-retrainable ML models. Open problems in Deep Learning.
[Maybe] The Latest in Language Models? A Tour de Force. Let me tell you: I don’t know about you, but it’s a bit challenging for me to keep up with the latest in LLMs. I use this handy, annotated Large Language Models spreadsheet. It’s not fully comprehensive, but it’s cool.
OpenAI released InstructGPT, a new LM that follows English instructions better than GPT-3. More here: Aligning Language Models to Follow Instructions
StanfordNLP published a new LM framework for composing search and LMs, with gains of up to 120% over GPT-3.5. TL;DR: use imperative code and forget about prompt engineering. Link: Demonstrate–Search–Predict (DSP).
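The core idea — the program, not a hand-tuned prompt, orchestrates the retrieve-then-generate flow — can be sketched in a few lines of plain Python. This is a toy illustration of the pattern, not DSP's actual API: `search` and `generate` below are made-up stand-ins for a real retriever and LM call.

```python
# Toy sketch of the demonstrate-search-predict pattern: imperative code
# composes a retriever and an LM. `search` and `generate` are hypothetical
# stand-ins, not the DSP framework's real functions.

def search(query, corpus, k=2):
    # Toy retriever: rank passages by word overlap with the query.
    def overlap(passage):
        return len(set(query.lower().split()) & set(passage.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def generate(prompt):
    # Stand-in for a call to a real LM (e.g. GPT-3.5); here it just echoes.
    return f"Answer based on: {prompt}"

def dsp_pipeline(question, demonstrations, corpus):
    passages = search(question, corpus)                      # Search
    prompt = "\n".join(demonstrations + passages + [question])
    return generate(prompt)                                  # Predict

corpus = ["Paris is the capital of France.", "The Nile is in Africa."]
demos = ["Q: capital of Spain? A: Madrid"]                   # Demonstrate
print(dsp_pipeline("What is the capital of France?", demos, corpus))
```

The point of the pattern is that retrieval, demonstration selection, and generation become ordinary function calls you can loop, branch on, and debug.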
CarperAI released a set of diff models for editing/generating code. These diff models are autoregressive LMs trained on millions of GitHub commits. Check out: Diff Models – A New Way to Edit Code
Meta AI keeps delivering amazing LLM research. They just released MAV3D (Make-A-Video3D), a new method for generating 3D dynamic scenes from text descriptions (paper, demo). Meanwhile, Google Research released MusicLM, a new model for generating high-fidelity music from text descriptions (paper, demo).
The Inverse Scaling Prize awarded 7 prizes to researchers who identified important tasks on which LMs perform worse the larger they are (“inverse scaling”).
Speaking of scaling LMs, Jason @GoogleBrain gave a great talk on why Scaling Unlocks Emergent Abilities in Language Models (slides).
Many professionals in media, entertainment & the arts are up in arms: they feel threatened by generative AI and LLMs. This has prompted some researchers to find a way to algorithmically mark and detect AI-generated text. Paper: A Watermark for Large Language Models
LangChain is becoming the de facto library for building LM apps. This is a great post on Getting Started with LLMs using LangChain.
If you are interested in building apps with LangChain, @lostintangent developed a one-click dev environment for building LLM apps with LangChain.
If you need inspiration and examples on agents and chains for building LM apps, checkout the very new LangChain Hub.
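At its simplest, the “chain” pattern these libraries package up is a prompt template piped into an LM call. The sketch below uses plain Python with illustrative class names and a fake LM — it is not LangChain’s actual API, just the shape of what an LLMChain-style abstraction does:

```python
# Minimal sketch of the prompt-template + LM "chain" pattern. Class names
# are illustrative; fake_llm stands in for a call to a hosted model.

class PromptTemplate:
    def __init__(self, template):
        self.template = template

    def format(self, **kwargs):
        return self.template.format(**kwargs)

class Chain:
    def __init__(self, prompt, llm):
        self.prompt, self.llm = prompt, llm

    def run(self, **kwargs):
        # Fill the template, then hand the prompt to the LM.
        return self.llm(self.prompt.format(**kwargs))

def fake_llm(prompt):
    # Stand-in for an API call to a real LLM.
    return f"LLM response to: {prompt!r}"

chain = Chain(PromptTemplate("Summarize {topic} in one line."), fake_llm)
print(chain.run(topic="neuro-symbolic AI"))
```

Real chains add memory, output parsing, and tool calls on top, which is where a library like LangChain earns its keep.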
Researchers @CarnegieMellonUni ask: Why Do Nearest Neighbor Language Models Work? They show that retrieval-augmented kNN-LMs perform better than standard parametric LMs (Jan 2023).
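The kNN-LM recipe these papers study is simple to state: look up the current context vector in a datastore of (context, next-token) pairs, turn the nearest neighbors into a next-token distribution, and interpolate it with the parametric LM’s softmax. A toy sketch with made-up numbers, assuming Euclidean distance and a softmax-over-negative-distance weighting:

```python
import numpy as np

# Toy kNN-LM sketch: mix the parametric LM's next-token distribution with
# a distribution built from nearest neighbors in a (context -> next token)
# datastore. All vectors and probabilities here are illustrative.

def knn_distribution(query, keys, values, vocab_size, k=2, temp=1.0):
    dists = np.linalg.norm(keys - query, axis=1)  # distance to stored contexts
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temp)
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for w, v in zip(weights, values[nearest]):
        p[v] += w                                 # aggregate weight per token
    return p

def interpolate(p_lm, p_knn, lam=0.25):
    # p(w|x) = lam * p_kNN(w|x) + (1 - lam) * p_LM(w|x)
    return lam * p_knn + (1 - lam) * p_lm

vocab_size = 4
keys = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]])  # stored context vectors
values = np.array([2, 3, 2])                           # their next tokens
p_lm = np.array([0.4, 0.3, 0.2, 0.1])
p_knn = knn_distribution(np.array([0.0, 0.1]), keys, values, vocab_size)
p = interpolate(p_lm, p_knn)
print(p, p.sum())
```

The non-parametric term is what lets the model copy rare continuations it has literally seen before, which is one of the candidate explanations the paper examines.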
LMs have many limitations. Neuro-Symbolic AI to the rescue? Oblivious to the AI marketing storm from the Tech Goliaths, a small group of researchers @LIT_AI_Lab in Austria built SymbolicAI: a compositional, neuro-symbolic framework that combines LLMs with differentiable programming. The framework extends and augments LLMs with magic powers, and it’s beautifully documented. Awesome!
Version 4.0 of Talking About Large Language Models (Jan 25, 2023) is out. It’s a great paper by Murray @ImperialCollege.
Remember ELIZA, the very first AI therapist? This startup has developed Serena, a chatbot that uses LLMs for mental health therapy.
Feels miserable outside, like in London? Here are some indoor Language Model entertainment suggestions:
Amusing: ask GPT anything, and a living portrait replies to your query.
Bookworm fun: pick a book from a library to talk to. The library is a bit tailored for those energetic Silicon Valley hustlers, but it’s OK.
Pretty amazing: Instruct Pix2Pix – load an image, then write some text to edit it on the fly.
Have a nice week.
Thanks for reading Data Machina! Subscribe free to receive new posts every week.
the ML Pythonista
the ML codeR
Deep & Other Learning Bits
AI/ DL ResearchDocs
AI startups -> radar
ML Datasets & Stuff
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.