Data Machina #237

New SOTA AI Code Generation. AlphaCodium. Graph ML 2024. The Modern AI Stack. QAnything AI. Apple AIM. NVDIA QAChat. MetAI Self-Rewarding Models. Dataland. Preference Tuning: DPO, IPO & TKO.

Jan 21, 2024

AI Code Generation: A New Paradigm. Several developer surveys indicate that devs -especially Sr. devs- who use AI tools are “more productive.” Today you can use lots of AI tools for code completion, pair programming, data generation, an even having a team of AI coding agents to complete tedious tasks for you.

Back in October, the team at DeepSense wrote an excellent blogpost on the state of the art in AI coding agents, detailing the pros & cos, and workflows of each AI coding agent.

Worth mentioning that -fundamentally- all these AI coding agents are mostly based on prompt engineering, chaining prompts and CoT techniques, and hacking prompts for AI code generation.

New SOTA AI code generation paradigm. But five days ago, the team at Codium introduced AlphaCodium: a new, SOTA AI code generation approach that implements a test-based, multi- stage, code-oriented iterative flow.

AlphaCodium was tested on the DeepMind CodeContest Dataset, which has +13K competitive coding problems. Some key points about AlphaCodium:

It moves away from the engineering prompt-answer (hacking prompts) basic paradigm to a coding "flow" paradigm, where the AI agent builds and tests the answer (code solution) by using self-reflection and other techniques iteratively
It beats DeepMind AlphaCode2 -SOTA until now- which uses a brute force approach of clustering a few coding solutions among a search space of a million of AI generated coding solutions
It beats CodeChain, another near-SOTA approach which uses a chain of sub-module-based self-revisions
It doesn’t need intensive model fine-tuning and massive computation like other AI code generation approaches

You can read all the details on AlphaCodium in these links:

stable-code-3b: A new smallish, SOTA AI code completion model. Stability AI just released stable-code-3b, a 2.7B billion parameter decoder-only language model pre-trained on 1.3 trillion tokens of diverse textual and code datasets. stable-code-3b is trained on 18 programming languages and demonstrates state-of-the-art performance on AI code completion (compared to models of similar size). Blogpost - Stable Code 3B: Coding on the Edge

Programming -not prompting- Foundation Models. Prompt engineering is a hack for NLP tasks that was not originally designed for coding. Standford DSPy is a framework for developing high-quality LM systems. DSPy minimises much of the issues of prompt engineering, hacking prompts, or stuffing everything together into one single clever prompt. DSPy separates the flow of your program (modules) from the parameters (prompt instructions, few-shot examples, and LM weights), which DSPy optimisers can tune if you give them an objective. In this interview below, Omar -the creator of DSPy and ColBERT- discusses the benefits and advantages of DSPy programmatic approach. Highly recommended, it will change the way you think about developing LLM apps.