On Multimodal Machine Learning (MMML). A big convergence is happening across language, vision, and pre-trained large AI models in general.
Multimodal ML is emerging as a discipline for building general-purpose, universal models across different modalities. An important area of MMML deals with large-scale, self-supervised, pre-trained models (foundation models) that can generalise with little or no fine-tuning.
Last week, I read: Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions. It’s a great overview.
AFAIK, the seminal paper Pre-trained Transformers as Universal Computation Engines (published by a team from UC Berkeley, Facebook AI & Google Brain) opened the gates for Multimodal ML and Foundation Models.
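The trick in that paper, in a nutshell: take a transformer pre-trained only on language, freeze its self-attention and feed-forward blocks, and fine-tune just small input/output layers (plus the layer norms) for a new modality. Here's a minimal PyTorch sketch of the idea, assuming a Hugging Face GPT-2 backbone; the class and projection names are mine, not the paper's code:

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

class FrozenPretrainedTransformer(nn.Module):
    """Sketch of a Frozen Pre-trained Transformer (FPT): the language-
    pre-trained blocks are frozen; only the input projection, output
    head and layer norms are trained on the new modality."""
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")  # pre-trained on text
        for name, param in self.backbone.named_parameters():
            # Keep layer-norm params trainable, freeze everything else.
            param.requires_grad = "ln" in name
        d_model = self.backbone.config.n_embd
        self.input_proj = nn.Linear(in_dim, d_model)   # trained from scratch
        self.head = nn.Linear(d_model, num_classes)    # trained from scratch

    def forward(self, x):  # x: (batch, seq_len, in_dim), any modality's tokens
        h = self.backbone(inputs_embeds=self.input_proj(x)).last_hidden_state
        return self.head(h[:, -1])  # predict from the last position

model = FrozenPretrainedTransformer(in_dim=64, num_classes=10)
logits = model(torch.randn(2, 16, 64))  # e.g. flattened image patches or bits
```

The striking finding was that this frozen language backbone transfers surprisingly well to non-language sequence tasks.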
Last Thursday, the team @MSAGI (Microsoft Artificial General Intelligence) published Foundation Transformers, a true general-purpose architecture that can be used across all modalities (language, vision, speech) with guaranteed training stability.
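The stability claim rests on what the authors call Sub-LayerNorm: as I read it, each sublayer gets a second LayerNorm just before its output projection, on top of the usual pre-LN, together with a theoretically derived initialization that I omit here. A rough sketch of the feed-forward variant (my reading, not the official code):

```python
import torch
import torch.nn as nn

class SubLNFeedForward(nn.Module):
    """Sketch of a Sub-LN feed-forward block: an extra LayerNorm sits
    before the output projection, in addition to the standard pre-LN."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.pre_ln = nn.LayerNorm(d_model)  # standard pre-LN
        self.fc_in = nn.Linear(d_model, d_ff)
        self.sub_ln = nn.LayerNorm(d_ff)     # the extra "sub" LayerNorm
        self.fc_out = nn.Linear(d_ff, d_model)

    def forward(self, x):
        # Residual connection around the doubly normalised FFN path.
        h = self.fc_out(self.sub_ln(torch.relu(self.fc_in(self.pre_ln(x)))))
        return x + h
```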
And two days ago, the team @GoogleAIResearch published UL2 20B: An Open Source Unified Language Learner, which improves the performance of language models universally across datasets and setups.
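UL2's core recipe is a mixture of denoisers: the model is pre-trained on several corruption schemes (regular span corruption, sequential/prefix-LM denoising, and extreme corruption), each tagged with a mode token like [R], [S] or [X]. A toy span-corruption builder in that spirit; all names and rates below are illustrative, not Google's code:

```python
import random

def span_corrupt(tokens, mean_span=3, rate=0.15, mode="[R]"):
    """Toy R-denoiser-style example builder: mask random spans with
    T5-style sentinels; the target reconstructs the masked spans."""
    inputs, targets = [mode], []   # UL2 prepends a mode token to the input
    i, sentinel_id = 0, 0
    while i < len(tokens):
        if random.random() < rate / mean_span:  # ~15% of tokens get masked
            span = tokens[i:i + mean_span]
            sentinel = f"<extra_id_{sentinel_id}>"
            inputs.append(sentinel)
            targets += [sentinel] + span
            i += len(span)
            sentinel_id += 1
        else:
            inputs.append(tokens[i])
            i += 1
    return inputs, targets

x, y = span_corrupt("the quick brown fox jumps over the lazy dog".split())
# x -> ['[R]', 'the', '<extra_id_0>', 'jumps', ...]
# y -> ['<extra_id_0>', 'quick', 'brown', 'fox']
```

In this toy scheme, an X-style (extreme) denoiser would simply crank up `rate` and `mean_span`, while the S-denoiser is closer to a prefix-LM objective.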
The team @CSCarnegieMellonUni has published some great free tutorials and courses on Multimodal ML. Check these out:
Have a nice week.
10 Link-o-Troned
A Pythonista *Experience*
Scripting aRt
Deep & Other Learning Bits
ResearchDocs
Algorithmic Potpourri
El Robótico
data v-i-s-i-o-n-s
DataEng Wranglings
startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.