What are Foundation Models? I guess the shortest answer is the definition coined by Stanford HAI:
"A new paradigm of AI models, trained on large-scale, unlabelled data, that can be adapted to many different applications."
Currently, most (but not all) foundation models are large pre-trained models: LLMs like BERT and GPT-3, and multimodal models like DALL-E and Flamingo.
Here’s Stanford HAI’s Workshop on Foundation Models: Day 1 and Day 2.
Also worth mentioning that Samuel @CambridgeUni Machine Intelligence Lab has recently published a cool free course on foundation models.
The two camps on foundation models. People in one camp believe that it’s all about scaling, while people in the other camp say it’s about the lack of interpretability and [symbolic] reasoning. In Can Foundation Models Talk Causality? the team @Tu-Darmstadt argues that causality, and Pearlian counterfactual theory, could be the missing link in foundation models.
Another interesting paper @StanfordUni claims that “foundation models generalise and achieve SoTA performance on data cleaning and integration tasks, even though they are not trained for these data tasks.” Knowing that data cleaning & integration is such a PITA, it’s worth reading Can Foundation Models Wrangle Your Data?
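The core trick in that line of work is simple enough to sketch: cast a data-wrangling task (here, entity matching) as a natural-language Yes/No question and hand it to a completion model. A minimal sketch below, assuming you'd plug the resulting prompt into whatever LLM API you use; the record fields and wording are illustrative, not the paper's exact template.

```python
# Sketch: frame entity matching as a Yes/No prompt for an LLM,
# in the spirit of "Can Foundation Models Wrangle Your Data?".
# The actual model call is left out; any completion API would do.

def entity_match_prompt(record_a: dict, record_b: dict) -> str:
    """Serialize two records into a Yes/No entity-matching prompt."""
    def serialize(record: dict) -> str:
        # Flatten a record into "field: value" pairs.
        return ". ".join(f"{k}: {v}" for k, v in record.items())

    return (
        f"Product A is {serialize(record_a)}. "
        f"Product B is {serialize(record_b)}. "
        "Are Product A and Product B the same? Yes or No:"
    )

# Two (hypothetical) product records that refer to the same item.
a = {"title": "Apple iPhone 13 128GB", "price": "799"}
b = {"title": "iPhone 13 (128 GB)", "price": "799.00"}

prompt = entity_match_prompt(a, b)
print(prompt)
```

The surprising result is that, with a handful of in-context examples prepended to prompts like this, a general-purpose model competes with systems trained specifically for matching and cleaning.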
The long read. This is a nice post from the folks at Stanford’s Center for Research on Foundation Models, in which they share their Reflections on Foundation Models and why these models are so important.
Have a nice Sunday.
10 Link-o-Troned
A Pythonista *Experience*
Scripting aRt
Love from Julia
data v-i-s-i-o-n-s
Distributed de-Entangler
Forschung!
Algorithmic Potpourri
[Free book] MIT Algos for Decision Making Under Uncertainty
Robots & Cyborgs like <you>
Deep & Other Learning Bits
startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.