Data Machina #150

Free subscription, published every 2 weeks

A Tour-de-force on Causality & Machine Learning: Right! So there has been a lot of ‘debate’ recently on whether Causality/Causal Inference can help Machine Learning overcome some of its “limitations” or not.

First, apologies: a link correction is needed. In the previous free version of Data Machina, I wrote On Causal Inference and The Book of Why.

As I said, I’m reading this fascinating book and also following the intellectual match between Judea Pearl, the creator of Bayesian Networks and Probabilistic AI, and Andrew Gelman, a famous Bayesian statistician. Here is the correct link: Pearl bashes statisticians and thinks AI needs causal inference, and Gelman has difficulty understanding the point of Pearl’s writing on causal inference. Enjoy the reading.

Second, the basics. Let me recommend the following readings, books, courses and other material on Causal Inference:

Third, let me bombard you with some interesting stuff on Machine Learning and Causality:

In this post Judea Pearl writes about The Seven Tools of Causal Inference and Reflections on Machine Learning

Back in 2017, the Unofficial Google Data Science Team published this great post: Causality in Machine Learning

In this interview, Judea Pearl argues that To Build Truly Intelligent Machines, [we should] Teach Them Cause and Effect

At this forum some great minds like Judea Pearl, Michael Jordan, Leon Bottou, Hal Varian… discuss Drawing Causal Inference from Big Data

Michiel, a PhD researcher in Machine Intelligence, explains what Machine Learning can and can’t do, and how Climbing the Ladder of Causality can help.

Yanir, a Kaggle Master, claims that the often overlooked topic of causality should be more relevant for data scientists than Deep Learning. He elaborates: Why You Should Stop Worrying About Deep Learning And Deepen Your Understanding Of Causality Instead

Here you can access all the course materials of: Harvard’s Advances in Causality and Foundations of Machine Learning

In this paper, the folks at IBM Research write about Explaining Deep Learning Models using Causal Inference

In this presentation, the team at Oxford Policy Management wonders How can Machine Learning be Employed to Help with Causal Inference?

In this video talk, Professor Bernhard Schölkopf at the Max Planck Institute for Intelligent Systems talks about Statistical and Causal Approaches to Machine Learning

And finally, the ever great Ferenc (it’s been ages since we last met!) writes about why he’s become a full-on Causal Reasoning believer, and why Causal Inference and Causal Diagrams complement Deep Learning. Read more here: Machine Learning beyond Curve Fitting: An Intro to Causal Inference and do-Calculus

Postscript, etc

Spread the word: share Data Machina with your friends

Tips? Suggestions? Feedback? Send an email to Carlos

Curated by Carlos @ds_ldn in the middle of the night.

This is Data Machina - Free Subscription Edition

Would you like to receive the full Data Machina - Paid Subscription edition every week? Please click here to subscribe and get a 35% discount now

Data Machina #149 - Paid Subscription

On Deep Learning and New Programming Languages: Deep Learning may need a new programming language to overcome Python’s weaknesses

Swift for TensorFlow: it is time to embrace Swift for Machine Learning

Swift for TensorFlow: The Next-Generation Machine Learning Framework

Flux in Julia: we need a language to write differentiable algorithms.

Maybe Rust?: A work-in-progress catalog on the state of Rust for Machine Learning

Or Owl in OCaml?: a state-of-the-art platform for functional scientific and numerical computing

10 Link-o-Troned

  1. The World's 1st Immersive Linear Algebra Online Book

  2. JupyterBooks - Inspirational Machine Learning Notebooks

  3. Easily Transpile Trained ML Models into Native Python, C, Java

  4. Taco Bell Programming: Hadoop Hell and The Unix Zen

  5. Viewing Matrices & Probability as Graphs

  6. Awesome Network, Graph Embeddings - A Curated List

  7. Visual Exploration of Neural Nets with Activation Atlases

  8. Ultrafast Geospatial DB for Geofencing & Location-based Apps

  9. GPU Accelerated Javascript for Massive Parallel Computations

  10. [free ebook] Harvard+Stanford Intro to Probability [609 pages, pdf]

A Pythonista *Experience*

  1. ThunderGBM: Fast GBDTs and Random Forests on GPUs

  2. Poincaré Embeddings for Learning Hierarchical Representations

  3. Location Embeddings: Implementing Loc2Vec in PyTorch

beCause of Dennis & Bjarne

  1. rec2c - A Fast, Open-source Tokenizer in C++

  2. Shogun - Unified, Efficient Machine Learning

  3. xtensor - C++ Tensor Algebra Library

Scripting aRt

  1. Classifying News Content with R, bash & Vowpal Wabbit

  2. Functional Data Analysis in R Course: Lectures & Codes

  3. Feature Selection in R with mlr

Love from Julia

  1. Probabilistic Programming with Programmable Inference

  2. Google's Machine Learning Crash Course in Julia

  3. Multiple Dispatch - An Example for Math Optimizers


  1. Getting Started with Clojure and MXNet on AWS

  2. A Clojure Wrapper for DeepLearning4J

  3. You’re in a Maze of Deeply Nested Maps, All Alike


  1. How to Deploy KubeFlow on Lightbend (9 Chapters)

  2. Category Theory for Programmers, Milewski 3 March 2019

  3. DynaML - A Scala & JVM Machine Learning Toolbox

data v-i-s-i-o-n-s

  1. A Visual Exploration of Exoplanets

  2. A New Way to Visualise Interactions in Neural Nets

  3. Google DataGIF Maker to Compare Data & Tell Stories

Distributed de-Entangler

  1. [free ebook] Distributed Systems for Fun & Profit

  2. Overview of Ozone: A Modern Object Store for Hadoop

  3. All the Talks from Facebook Data@Scale Conference

Blockchain Über Alles

  1. Decentralized, Self-Sovereign, and Blockchain Identity

  2. The 1st Python Blockchain with Turing Complete Contracts

  3. Why Is It Impossible to Solve the Blockchain Trilemma?

IoTea - everyThing/anyThing

  1. Noise Mapping with KafkaSQL, RasPi & Software-Defined Radio

  2. IoT & Fraud Detection with Kafka, TensorFlow & Google Cloud

  3. Industrial IoT with Kafka, Flink and CrateDB


  1. A Fast Multi-pattern Regex Matcher for Modern CPUs

  2. Agents that Learn to Follow Directions in Google Street View

  3. Emotion-based Fake News Detection with word2vec & RNNs

Algorithmic Potpourri

  1. Image-Based Airbnb Pricing Algorithm

  2. Training Time Estimation for scikit-learn Algorithms

  3. Interactive, Online: Path Finder Algorithms

Robots & Cyborgs like <you>

  1. The Amazing MIT Mini Cheetah Robot

  2. Learning to Walk via Deep Reinforcement Learning

  3. Learning from Demos to Mimic Human Behaviour

Deep & Other Learning Bits

  1. Deep Learning to Federated Learning in 10 Lines of Code

  2. All the Projects from UC Berkeley Deep Learning Spring 2019

  3. Integrating Domain Knowledge into Deep Learning [pdf]

startups -> radar

  1. RaptorMaps - Machine Learning for Solar Panel Inspections

  2. Orcam - Advanced Wearable AI Devices for the Blind

  3. Freenome - AI Genomics for Cancer Detection

ML Datasets & Stuff

  1. Common Voice - The Largest Human Voice Dataset

  2. Who Links Whom? 1.78 Billion Links Graph Dataset

  3. GrapAL - Knowledge Graph of 40 Million Academic Papers
