Data Machina #148

SotA in Speech Recognition: A few days ago, Google AI open-sourced Lingvo, a deep learning framework for state-of-the-art speech recognition. Click here to read more on Google AI Lingvo and access the original paper and the code.

How to be Schmidhubered: In this fascinating conversation, Lex Fridman @MIT talks to Juergen Schmidhuber on Gödel Machines, meta-learning, LSTMs and more.

On Causal Inference and The Book of Why: I’m reading this book and also following this intellectual match between Judea Pearl, the creator of Bayesian Networks and Probabilistic AI, and Andrew Gelman, a famous Bayesian statistician. Pearl bashes statisticians and thinks AI needs causal inference, and Gelman has difficulty understanding the point of Pearl’s writing on causal inference. Enjoy the reading.

Please help me keep Data Machina independent, no sponsors, no ads

This is the weekly, full version of Data Machina.

If you’d like to keep receiving it after March 1, please click here to subscribe and get a 35% discount now.

You don’t have to do anything to receive a free version of Data Machina every 2 weeks.

10 Link-o-Troned

  1. Seven Myths in Machine Learning Research

  2. Making Black Box Machine Learning Models Explainable

  3. Recent Advances in Machine Reading

  4. Bandit Swarm Networks

  5. My Reactions to DeepMind’s StarCraft II Agent AlphaStar

  6. Learning Topology Methods for Unsupervised Learning

  7. Marketing Spend & Bayesian Structural Time Series @Uber

  8. Google Machine Learning Course Notes

  9. The Latest in AI for GUI-Based Software Testing

  10. Reconstruct 99% of Twitter’s Firehose Anytime with SnowFlake

A Pythonista *Experience*

  1. PyTorch Under The Hood

  2. Graph Neural Networks Meet Personalized PageRank

  3. Harvard NLP Annotated Transformer from Scratch in PyTorch

beCause of Dennis & Bjarne

  1. frugally deep - Use Keras Models in C++ with Ease

  2. Kaldi - A Toolkit for Speech Recognition in C++

  3. MeTA - A Modern C++ Data Science Toolkit

Scripting aRt

  1. Causal Inference & Directed Acyclic Graphs (DAGs)

  2. Fast Out-of-Core Learning with Vowpal Wabbit in R

  3. Training Many Models in Parallel with RStudio Jobs

Love from Julia

  1. Fundamentals of Machine Learning with Julia [pdf, 352 pages]

  2. Getting to Machine Learning from a General Purpose Compiler

  3. MLJ - A Pure Julia Machine Learning Framework


  1. Deep Learning Examples in Clojure MXNet

  2. ClojureAI - A List of AI, ML & Data Science Resources

  3. Flare: Clojure Dynamic Neural Net Library


  1. Transpiling Python to Scala with Neural Machine Translation

  2. High Performance NLP with Spark & Scala

  3. Dealing with Data Skewness and Salting Spark to Scale

data v-i-s-i-o-n-s

  1. Uber’s Autonomous Vehicle Visualization System

  2. VVVV - Visualising Various Views of Variability

  3. Plotting Neural Networks

Distributed de-Entangler

  1. Serverless is Dead

  2. cube.js - An Open Source Serverless Analytics Framework

  3. Zero to JupyterHub in the Cloud with Kubernetes & Docker

Blockchain Über Alles

  1. Why Ricardian (not Smart) Contracts are the Blockchain Killer App

  2. You Don’t Need Blockchain: 8 Popular Cases, Why they Won’t Work

  3. Probabilistic Smart Contracts on the Blockchain

IoTea - everyThing/anyThing

  1. Arduino-based, Open Source Mobile Phone for Free Calls

  2. k3s - Lightweight Kubernetes for IoT & Edge

  3. ZeroPhone - an Open Source Raspberry Pi Smartphone


  1. Online Meta-Learning

  2. Intro to Contextual Embeddings [one of the best, no-hype intros ever]

  3. Time Series Prediction with Time-delay Embeddings

Algorithmic Potpourri

  1. Comparing Algorithms: Beyond Worst-case Analysis

  2. Swarm Intelligence Algorithm for Privacy-Preserving Data Mining

  3. Polymorphic Encryption Algorithms

Robots & Cyborgs like <you>

  1. Stanford AI: Intro to Generalizable Autonomy in Robotics

  2. MIND KIT: Maker Kit Exclusively for Robotics

  3. Programming for Robotics with ROS - [videos & slides]

Deep & Other Learning Bits

  1. Yann LeCun: The Epistemology of Deep Learning

  2. Google Brain: Introduction to Meta-Learning

  3. Reinforcement Learning & Optimal Control [book, slides & videos]

startups -> radar

  1. - Intelligent Web Agents for Autonomous Collaboration

  2. - A Service for Managing Machine Learning Models

  3. Armorblox - Deep Learning for Stopping Socially Engineered Attacks

ML Datasets & Stuff

  1. QMUL OpenLogo: Brand Logos Detection Dataset

  2. SIPRI Dataset: Major Arms Transfers 1950-2018

  3. Were You Born in a Full Moon? The Full Moon Dataset

Postscript, etc

Spread the word: share Data Machina with your friends

Tips? Suggestions? Feedback? Send email to Carlos

Curated by Carlos @ds_ldn in the middle of the night.