Data Machina #229

Generative AI Video. Stability AI SVD. Meta AI EMU. The Bayesian Brain. Probabilistic ML. UltraFastBERT. Gorilla OpenFunctions. Voyage-01 Embeddings. GAIA. Orca 2. ALERTA-NET.

Nov 26, 2023

Generative AI Video. Tinkering with GenAI in regulated industries like finance, insurance, pharma, or telecom can cost you a fortune. But in media, marketing, and creative industries you can play with stuff like GenAI video for fun & profit. Here’s the latest on GenAI video.

Stability AI SVD image-to-video. This week, Stability AI introduced Stable Video Diffusion (SVD) Image-to-Video, a diffusion model that takes a still image as a conditioning frame and generates a video from it. This model was trained to generate 25 frames at resolution 576x1024 given a context frame of the same size, finetuned from SVD Image-to-Video [14 frames]. Check out the repo, model: Stable Video Diffusion Image-to-Video Model Card.

Meta AI EMU SOTA text-to-video. Ten days ago, Meta AI researchers introduced Emu Video, a simple, state-of-the-art method for text-to-video generation based on diffusion models. Emu Video is a unified model that can generate videos from a variety of inputs: text only, image only, or both text and image. Check out the project website, demo, paper: Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning.

Baidu VideoGen HD text-to-video. Baidu VideoGen is a text-to-video generation model that uses reference-guided latent diffusion to generate high-definition video with high frame fidelity and strong temporal consistency. VideoGen first uses Stable Diffusion to generate a high-quality image from the text prompt, then uses that image as a reference to guide video generation. Check out the paper, demos: VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation.

Alibaba I2VGen-XL Video Synthesis. Researchers at Alibaba introduced I2VGen-XL, a model that addresses the scarcity of well-aligned text-video data, the complex inherent structure of videos, and the difficulty of ensuring semantic accuracy and visual quality at the same time. The model was trained on 35 million single-shot text-video pairs and 6 billion text-image pairs. Check out the paper, repo and demos: I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models.

Shanghai AI Lab SEINE generative video transitions. SEINE is a model that aims to generate high-quality long videos with smooth and creative transitions between scenes and varying lengths of shot-level videos. It uses a random-mask video diffusion model that auto-generates transitions based on textual descriptions, taking different scenes as inputs, combined with text-based control. Check out the paper, model, demos: SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction.

PKU-YUAN Video-LLaVA video-image understanding. This is a Large Vision-Language Model (LVLM) for visual-language understanding that learns from a mixed dataset of images and videos, so that the two modalities mutually enhance each other. Video-LLaVA unifies visual representation in a single mode, and outperforms models designed specifically for images or videos. Check out the paper, repo and demo: Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.

Some AI activities for the weekend. The weather in London is cold and miserable. And I bet you’re sick & tired of reading about the OpenAI drama, and all the paid PR/media propaganda on the Q* breakthrough, super AGI achieved, and AI existential risk. So don’t venture outside. Instead:

Play Death by AI, a survival party game. Can you survive my judgment and cheat death? Invite your friends to the game. Describe a deadly scenario. Enter a survival strategy. AI decides if you survive.
Join The Church of AI. At some point AI will have God-like powers. Experience what happens when an intelligent machine is able to expand its intelligence exponentially for eternity.
Learn about Q-Learning. All of a sudden, there are so many “experts” talking about Q-learning! It’s worthwhile to teach them a lesson.
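
If you want something concrete to wave at the “experts”, here is a minimal sketch of tabular Q-learning in Python. The corridor environment, the function names, and the hyperparameters are all made up for illustration; they don't come from any of the resources linked in this issue.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    The agent starts at state 0 and gets a reward of 1 when it reaches
    the last state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Walls clamp the agent inside the corridor.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next action
            # (the terminal state has no future value).
            best_next = 0.0 if next_state == goal else max(q[next_state])
            td_target = reward + gamma * best_next
            q[state][action] += alpha * (td_target - q[state][action])

            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state (0 = left, 1 = right)."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    q_table = train_q_table()
    print(greedy_policy(q_table))
```

With these settings the greedy policy should end up choosing "right" in every non-terminal state, since that is the shortest path to the only reward. The update is off-policy: it always bootstraps from the best next action, even when the agent explored.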

Have a nice week.