Data Machina #206
Improving Model Finetuning. Orthogonal Finetuning. GLoRA. GPT-Engineer. GPT Functions. What-if MLOps. Transformers for Time Series. Voicebox SoTA Speech. LLMs in Prod. LLM Assistant for Spark.
Improving Model Finetuning. Since the early days of transfer learning and ULMFit, finetuning of NLP models has improved in many amazing ways. Today, you can’t avoid finetuning if you want to exploit the full power of LLMs. Finetuning is evolving quickly, and finetuned models outperform generalist models on specific tasks. We are going to see many new, much more sophisticated finetuning methods in the months to come. Here are my latest notes on finetuning:
This is a nice Review of Eight Fine Tuning Methods.
Improving finetuning for text-to-image models. In generative AI with text-to-image models, the bottleneck is often the user's prompts: there is a limit to the variety and quality of images you can coax out of a model with text prompts alone. Check out these two finetuning approaches that address this issue:
Textual Inversion (TI). You can improve images generated by Stable Diffusion with TI. The idea behind textual inversion is to use a few example images to teach the text model a new pseudo-word, training its embedding to sit close to a visual representation of those images. Textual Inversion: A method to finetune Stable Diffusion Model
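To make the idea concrete, here's a minimal toy sketch of the training loop at the heart of textual inversion: everything is frozen except one new embedding vector, which is optimized toward a target representation. The sizes, the squared-error surrogate loss, and the `target` vector are all hypothetical stand-ins; in the real method the gradient signal comes from the diffusion reconstruction loss on a handful of example images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical sizes): a frozen token-embedding table plus one
# new trainable vector for the pseudo-word, e.g. "<my-cat>".
embed_dim = 16
vocab = rng.normal(size=(100, embed_dim))   # frozen pretrained embeddings
new_token = rng.normal(size=embed_dim)      # the ONLY trainable parameter

# Stand-in for the visual target; in real textual inversion this signal
# comes from the diffusion loss on the user's example images.
target = rng.normal(size=embed_dim)

lr = 0.1
losses = []
for step in range(200):
    # Squared-error surrogate for the diffusion reconstruction loss.
    grad = 2 * (new_token - target)
    new_token -= lr * grad
    losses.append(float(np.sum((new_token - target) ** 2)))

print(f"loss after 1 step: {losses[0]:.4f}, after 200 steps: {losses[-1]:.2e}")
```

The point is the parameter budget: one vector of size `embed_dim` is trained, so the pretrained model is untouched and the learned "word" can be shared as a tiny file.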
Orthogonal Finetuning (OFT). OFT is a new finetuning method for adapting text-to-image diffusion models to downstream tasks. The basic idea is to finetune the pretrained weight matrices with an orthogonal transform, which lets you finetune text-to-image diffusion models effectively and stably. Paper, code: Controlling Text-to-Image Diffusion by Orthogonal Finetuning.
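Here's a small sketch of why an orthogonal transform is a gentle way to finetune. One standard trick (assumed here for illustration; the paper uses a block-diagonal parametrization for efficiency) is the Cayley transform, which turns any skew-symmetric trainable matrix into an orthogonal one, so norms and pairwise angles of the pretrained weights are preserved by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Trainable parameter: a skew-symmetric matrix A (A.T == -A).
M = rng.normal(size=(d, d)) * 0.1
A = M - M.T

# Cayley transform: R is orthogonal for any skew-symmetric A.
I = np.eye(d)
R = (I - A) @ np.linalg.inv(I + A)

# OFT-style update: rotate the pretrained weight matrix instead of
# adding a delta to it.
W = rng.normal(size=(d, d))         # stand-in pretrained weight
W_finetuned = R @ W

# R preserves column norms and angles between columns of W, which is
# the structural property that keeps the pretrained model stable.
print(np.allclose(R.T @ R, I))                                      # True
print(np.allclose(np.linalg.norm(W_finetuned, axis=0),
                  np.linalg.norm(W, axis=0)))                       # True
```

Because the update is a rotation rather than an arbitrary delta, the finetuned model can't drift far from the pretrained geometry, which is the stability argument the paper makes.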
More efficient finetuning of LLMs. Since finetuning is pretty much unavoidable these days, a lot of research is focused on efficient computation, and faster and cheaper ways to finetune models. A key goal is to democratise finetuning methods and make them available to users in local machines with cheap computation.
Finetuning Falcon in 1 hour with 1 GPU. If you recall, Falcon 40B is an Apache 2.0, causal decoder-only model that currently outperforms all other existing open source LLMs. This is a great post showing how to finetune Falcon in 1 hour on a single GPU instead of a day on 6 GPUs. Finetuning Falcon LLMs More Efficiently With LoRA and Adapters.
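The core trick behind that speedup is LoRA: freeze the big pretrained weight and train only a low-rank delta. A minimal numpy sketch (hypothetical layer sizes, not Falcon's actual dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes in the spirit of a large transformer layer.
d_in, d_out, r = 1024, 1024, 8   # r is the LoRA rank
alpha = 16                        # LoRA scaling factor

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero init

def lora_forward(x):
    # The base path stays frozen; only A and B get gradients in training.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero, LoRA starts as an exact no-op.
print(np.allclose(lora_forward(x), W @ x))  # True

full, lora = W.size, A.size + B.size
print(f"trainable params: full={full:,}, LoRA={lora:,} ({lora / full:.2%})")
```

Training ~1.5% of the parameters per adapted layer is what makes a single-GPU run feasible; the post combines this with adapter layers and mixed precision for the 1-hour result.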
LLaMA Efficient Tuning End-to-End. There is an arms race happening on better, more powerful methods to finetune LLaMA-based models. A few days ago, Yaowei Zheng published a new way to finetune LLaMA with PEFT (PT + SFT + RLHF with QLoRA). The great thing about this method is that it supports the full GPT training pipeline (see diagram): pre-training, supervised finetuning, and RLHF.
A more efficient LoRA. Another key area of research in LLMs is finding methods to improve Low-Rank Adaptation (LoRA), or ways to accelerate the training of large models while consuming less memory. QLoRA, for example, combines quantisation with LoRA. This is a nice post on Exploring QLoRA's Potential for Innovation. Two weeks ago, a team from Meta AI and CMU introduced GLoRA (generalised LoRA). GLoRA employs a generalized prompt module to optimize pre-trained model weights and adjust intermediate activations. Check out the paper and code: One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning.
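The "Q" in QLoRA is 4-bit quantization of the frozen base weights, with LoRA adapters trained in higher precision on top. As a rough sketch of the memory trick, here's simplified blockwise absmax quantization to 4-bit integers; note QLoRA's actual NF4 format uses a normal-distribution-aware codebook plus double quantization, which this toy version omits:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64)).astype(np.float32)  # toy "pretrained" weight

def quantize_4bit(w, block=64):
    # Simplified absmax quantization: one float scale per block,
    # weights stored as 4-bit signed integers in [-8, 7].
    flat = w.reshape(-1, block)
    scale = np.abs(flat).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(flat / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale, shape):
    # The frozen base weight is dequantized on the fly for the forward pass.
    return (q.astype(np.float32) * scale).reshape(shape)

q, scale = quantize_4bit(W)
W_hat = dequantize(q, scale, W.shape)
print(f"max abs reconstruction error: {np.abs(W - W_hat).max():.4f}")
```

Storing the base model at ~4 bits per weight (instead of 16) while backpropagating only through the small LoRA matrices is what lets QLoRA finetune 65B-class models on a single consumer GPU.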
Have a nice week.
10 Link-o-Troned
GPT-Engineer: Tell the AI What to Build, Then the AI Builds It
[free] LLMs in Production Conference II, 15/06/23 (vid recording)
the ML Pythonista
Deep & Other Learning Bits
AI/ DL ResearchDocs
data v-i-s-i-o-n-s
MLOps Untangled
Berkeley Motion - MLOps for Auto Continual Learning & Monitoring
Giskard - Open-Source Testing Framework for ML Models (Tab to LLM models)
AI startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.