Data Machina #186
LLMs: The latest. Stanford Advances in Foundation Models. Neuro-Symbolic LLM. TransformerXL explained. Trustable & auto-retrainable ML models. Open problems in Deep Learning.
[Maybe] The Latest in Language Models? A Tour de Force. Let me tell you: I don’t know about you, but it’s a bit challenging for me to keep up with the latest in LLMs. I use this handy, annotated Large Language Models spreadsheet. It’s not fully comprehensive, but it’s cool.
OpenAI released InstructGPT, a new LM that follows English instructions better than GPT-3. More here: Aligning Language Models to Follow Instructions
StanfordNLP published a new LM framework for composing search and LMs, with gains of up to 120% over GPT-3.5. TL;DR: use imperative code and forget about prompt engineering. Link: Demonstrate–Search–Predict (DSP).
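The core idea — the program, not a hand-tuned prompt, orchestrates the retrieve-then-generate flow — can be sketched in a few lines of plain Python. This is a toy illustration of the pattern, not DSP's actual API: `search` and `generate` below are made-up stand-ins for a real retriever and LM call.

```python
# Toy sketch of the demonstrate-search-predict pattern: imperative code
# composes a retriever and an LM. `search` and `generate` are hypothetical
# stand-ins, not the DSP framework's real functions.

def search(query, corpus, k=2):
    # Toy retriever: rank passages by word overlap with the query.
    def overlap(passage):
        return len(set(query.lower().split()) & set(passage.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def generate(prompt):
    # Stand-in for a call to a real LM (e.g. GPT-3.5); here it just echoes.
    return f"Answer based on: {prompt}"

def dsp_pipeline(question, demonstrations, corpus):
    passages = search(question, corpus)                      # Search
    prompt = "\n".join(demonstrations + passages + [question])
    return generate(prompt)                                  # Predict

corpus = ["Paris is the capital of France.", "The Nile is in Africa."]
demos = ["Q: capital of Spain? A: Madrid"]                   # Demonstrate
print(dsp_pipeline("What is the capital of France?", demos, corpus))
```

The point of the pattern is that retrieval, demonstration selection, and generation become ordinary function calls you can loop, branch on, and debug.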
CarperAI released a set of diff models for editing/generating code. These diff models are autoregressive LMs trained on millions of GitHub commits. Check out: Diff Models – A New Way to Edit Code
Meta AI keeps delivering amazing LLM research. They just released MAV3D (Make-A-Video3D), a new method for generating 3D dynamic scenes from text descriptions (paper, demo). Meanwhile, Google Research released MusicLM, a new model for generating high-fidelity music from text descriptions (paper, demo).
The Inverse Scaling Prize awarded 7 prizes to researchers who identified important tasks on which LMs perform worse the larger they are (“inverse scaling”).
Speaking of scaling LMs, Jason @GoogleBrain gave a great talk on why Scaling Unlocks Emergent Abilities in Language Models (slides).
Many professionals in media, entertainment & the arts are up in arms: they feel threatened by generative AI and LLMs. This has prompted some researchers to find a way to algorithmically mark and detect AI-generated text. Paper: A Watermark for Large Language Models
LangChain is becoming the de facto library for building LM apps. This is a great post on Getting Started with LLMs using LangChain.
If you are interested in building apps with LangChain, @lostintangent developed a one-click dev environment for building LLM apps with LangChain.
If you need inspiration and examples on agents and chains for building LM apps, checkout the very new LangChain Hub.
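At its simplest, the “chain” pattern these libraries package up is a prompt template piped into an LM call. The sketch below uses plain Python with illustrative class names and a fake LM — it is not LangChain’s actual API, just the shape of what an LLMChain-style abstraction does:

```python
# Minimal sketch of the prompt-template + LM "chain" pattern. Class names
# are illustrative; fake_llm stands in for a call to a hosted model.

class PromptTemplate:
    def __init__(self, template):
        self.template = template

    def format(self, **kwargs):
        return self.template.format(**kwargs)

class Chain:
    def __init__(self, prompt, llm):
        self.prompt, self.llm = prompt, llm

    def run(self, **kwargs):
        # Fill the template, then hand the prompt to the LM.
        return self.llm(self.prompt.format(**kwargs))

def fake_llm(prompt):
    # Stand-in for an API call to a real LLM.
    return f"LLM response to: {prompt!r}"

chain = Chain(PromptTemplate("Summarize {topic} in one line."), fake_llm)
print(chain.run(topic="neuro-symbolic AI"))
```

Real chains add memory, output parsing, and tool calls on top, which is where a library like LangChain earns its keep.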
Researchers @CarnegieMellonUni ask: Why Do Nearest Neighbor Language Models Work? They show that retrieval-augmented kNN-LMs perform better than standard parametric LMs (Jan 2023).
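The kNN-LM recipe these papers study is simple to state: look up the current context vector in a datastore of (context, next-token) pairs, turn the nearest neighbors into a next-token distribution, and interpolate it with the parametric LM’s softmax. A toy sketch with made-up numbers, assuming Euclidean distance and a softmax-over-negative-distance weighting:

```python
import numpy as np

# Toy kNN-LM sketch: mix the parametric LM's next-token distribution with
# a distribution built from nearest neighbors in a (context -> next token)
# datastore. All vectors and probabilities here are illustrative.

def knn_distribution(query, keys, values, vocab_size, k=2, temp=1.0):
    dists = np.linalg.norm(keys - query, axis=1)  # distance to stored contexts
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temp)
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for w, v in zip(weights, values[nearest]):
        p[v] += w                                 # aggregate weight per token
    return p

def interpolate(p_lm, p_knn, lam=0.25):
    # p(w|x) = lam * p_kNN(w|x) + (1 - lam) * p_LM(w|x)
    return lam * p_knn + (1 - lam) * p_lm

vocab_size = 4
keys = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]])  # stored context vectors
values = np.array([2, 3, 2])                           # their next tokens
p_lm = np.array([0.4, 0.3, 0.2, 0.1])
p_knn = knn_distribution(np.array([0.0, 0.1]), keys, values, vocab_size)
p = interpolate(p_lm, p_knn)
print(p, p.sum())
```

The non-parametric term is what lets the model copy rare continuations it has literally seen before, which is one of the candidate explanations the paper examines.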
LMs have many limitations. Neuro-Symbolic AI to the rescue? Oblivious to the AI marketing storm from the Tech Goliaths, a small group of researchers @LIT_AI_Lab in Austria built SymbolicAI: a compositional, neuro-symbolic framework that combines LLMs with differentiable programming. The framework extends and augments LLMs with magic powers, and it’s beautifully documented. Awesome!
Version 4.0 of Talking About Large Language Models (Jan 25, 2023) is out. It’s a great paper by Murray @ImperialCollege.
Remember ELIZA, the very first AI therapist? This startup has developed Serena, a chatbot that uses LLMs for mental health therapy.
Feels miserable outside, like in London? Here are some indoor Language Model entertainment suggestions:
Amusing: ask GPT anything, and a living portrait replies to your query.
Bookworm fun: pick a book from a library to talk to. The library is a bit tailored for those energetic Silicon Valley hustlers, but it’s OK.
Pretty amazing: Instruct Pix2Pix – load an image, then write some text to edit it on the fly.
Have a nice week.
Thanks for reading Data Machina! Subscribe free to receive new posts every week.
the ML Pythonista
the ML codeR
Deep & Other Learning Bits
AI/ DL ResearchDocs
AI startups -> radar
ML Datasets & Stuff
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.