Data Machina #184
Next Frontiers in LLMs. Diffusion vs. Autoregression for LMs. Multimodal Deep Learning. GCViT SoTA Vision Transformer. The New KerasNLP. SoTA Self-Supervised CNNs. Generative TS forecasting.
Next Frontiers in LLMs: Extension & Augmentation. Language Models were not originally designed for search, knowledge base extraction, calculation, or rule-based symbolic computation.
Everybody in the AI/ML community knows that an LLM is not a search engine, a calculator, or Wikipedia. LLMs have limitations, but they can be extended and augmented.
And yet, some popular scientists and CS professors are obsessed with hunting for LLM failure examples, stubbornly trying to prove that LLMs fail at things they were not designed for… Here they go: LLMs like ChatGPT Say The Darnedest Things. Why waste so much time and negative energy?
I’ve just read Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT. My first impression is that Stephen Wolfram has all of a sudden become aware of the powers & challenges of both LLMs and his computational knowledge engine. He gives examples of combining the two, but he doesn’t provide an actual, practical solution for linking them together.
Well, it took Harrison just a few days to code a Colab notebook showing How to Give ChatGPT a Wolfram|Alpha Neural Implant, as Wolfram suggests in his post. I think this is brilliant!
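The "neural implant" idea boils down to routing: decide whether a prompt is a computation (send it to the engine) or a language task (send it to the LLM). Here's a minimal sketch of that routing pattern, with a toy safe arithmetic evaluator standing in for Wolfram|Alpha and a stubbed `ask_llm` standing in for a real ChatGPT/GPT-3 API call — the function names and the regex heuristic are illustrative assumptions, not Harrison's actual implementation.

```python
import ast
import operator
import re

# Toy stand-in for Wolfram|Alpha: safely evaluate arithmetic expressions
# by walking the Python AST (no eval of arbitrary code).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.USub: operator.neg}

def compute(expr: str):
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval"))

def ask_llm(prompt: str) -> str:
    # Stub for a real LLM call (e.g. the OpenAI completions API).
    return f"[LLM answer to: {prompt}]"

def route(prompt: str) -> str:
    # Crude heuristic router: pure arithmetic goes to the engine,
    # everything else goes to the LLM.
    if re.fullmatch(r"[\d\s\.\+\-\*\/\(\)]+", prompt.strip()):
        return str(compute(prompt))
    return ask_llm(prompt)

print(route("12 * (3 + 4)"))       # engine path → 84
print(route("Who wrote Hamlet?"))  # LLM path
```

A production version would replace the regex heuristic with the LLM itself deciding when to call the tool, which is essentially what LangChain's tool-use chains do.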
Here’s a great example of knowledge-base & search LLM augmentation: Perplexity.ai: Ask Anything (try it here), which uses ChatGPT and OpenAI’s new embedding model. Nice.
I also like what the team @dagster built: A GitHub support bot with GPT3, LangChain & Python that leverages a support knowledge base. If you recall from previous DM issues, LangChain is an awesome library for building apps on top of LLMs.
And here is a similar project by @ht2: providing a Confluence+Zendesk support desk with a naturally queryable knowledge base using GPT3, indexing, and embeddings.
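Both support-bot projects follow the same retrieval pattern: embed the knowledge-base documents, embed the incoming question, pick the nearest documents by cosine similarity, and hand them to GPT-3 as context. A minimal sketch of that retrieval step, using a toy bag-of-words "embedding" in place of a real embedding model (the function names and sample knowledge base are my own illustration, not either project's code):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts. A real bot would call an
    # embedding model (e.g. OpenAI's embeddings endpoint) instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: list, k: int = 1) -> list:
    # Rank knowledge-base documents by similarity to the question.
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

kb = [
    "To reset your password, open Settings and click 'Reset password'.",
    "Billing invoices are emailed on the first day of each month.",
    "The API rate limit is 100 requests per minute.",
]
context = retrieve("How do I reset my password?", kb)[0]
# The retrieved document would be prepended to the user's question
# and sent to GPT-3; here we just show the retrieval step.
print(context)
```

The indexing libraries mentioned here (and GPT Index below) exist precisely to scale this pattern past a handful of documents.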
Another really exciting project is GPT Index, a set of data structures designed to make it easier to use large external knowledge bases with LLMs.
DeepMind (of course!) has also built a chatbot called Sparrow that is extended with Google Search. In this post, the DeepMind research team explains how Sparrow works. Sparrow is a very politically correct & safe chatbot trained with RLHF. See a summary of Sparrow’s 23 Dialogue Model Rules in this post. Quite revealing.
A team of researchers @AllenAI_Institute investigated why LLMs have limitations and identified the four main ones. They published a paper proposing a modular, neuro-symbolic architecture that combines LLMs, external knowledge sources and discrete reasoning to overcome those limitations. Perhaps that’s the way to go.
A few days ago, Kojo (a partner @MatrixVentures who invests in AI startups) posted: Reasoning Apps: The Next Frontier for LLMs. If you’re exploring the idea of building a startup or monetising an app around LLMs, this is a good read.
What do you think?
Have a nice week.
10 Link-o-Troned
A Pythonista *Experience*
Scripting aRt
Deep & Other Learning Bits
Implementing RLHF: Learning to Summarise with CarperAI’s trlX
ALToolbox - a Framework for Practical Active Learning in NLP
Self-supervised Contrastive Learning for Time Series Classification
ResearchDocs
El Robótico
data v-i-s-i-o-n-s
DataEng Wranglings
AI startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.