The NLP Arms Race continues… Just a few days ago, Microsoft AI released MT-DNN, a Multi-task Deep NN that outperforms Google’s BERT on almost every NLP benchmark.
OpenAI became ClosedAI as they decided not to release their large-scale GPT-2 model for fake text generation because it was… too dangerous.
The folks @OpenAI thought it’d be a great idea to call a few selected journos to show them the results behind closed doors and spin the PR news cycle: New AI fake text generator may be too dangerous to release, say creators…
Several people in the NLP/ML community followed up with some interesting posts:
In the meantime, someone has published an Open Clone of OpenAI's Unreleased WebText Dataset …LOL!
A plea for your help to fund Data Machina. Help me keep Data Machina fully independent and free from ads, sponsors, or any marketing bias.
This is the weekly, full version of Data Machina.
If you’d like to keep receiving it after March 1, please click here to subscribe and get a 35% discount. You don’t have to do anything to receive a free, much shorter version of Data Machina with a few topics every 2 weeks.
Your paid subscriptions will help me keep Data Machina independent and will enable me to always curate 100% unbiased content. Many thanks for your support and help.
Probabilistic Programming and AI
Data Science is Different Now
The Unreasonable Effectiveness of Deep Feature Extraction
How Powerful are Graph Neural Networks [pdf opens slowly]
In-Depth Tutorial: AllenNLP (From Basics to ELMo & BERT)
OpenAI: Better Language Models and Their Implications
Berkeley AI: Controlling False Discoveries in Large-Scale Experiments
Facebook AI Open Sources New ELFOpenGo Dataset and Research
An Open Source Engine for Search & Machine Learning Ranking
Andrew Ng: How to Choose Your First AI Project
Yann LeCun: Deep Learning Will Require New Types of Hardware
Automatic Differentiation + Optimization in PyTorch
Neural Nets + Gaussian Processes: The Neural Processes Family
Pretrained Language Models for Google's BERT, OpenAI GPT-2
Mask R-CNN for Object Segmentation in C++
xForest - Super Fast, Scalable Random Forests in C++
Flashlight - A C++ Library for Machine Learning
Probability & Statistics: A Simulation-based Introduction
Explore & Visualise Boosted Regression Trees
Anatomy of a Logistic Growth Curve
A Julia Package for Prescriptive Analytics
Solving Partially Observable Markov Decision Processes
Julia Reinforcement Learning
Clojure at Netflix: The Good, The Bad & The Ugly
Object Detection with MXNet Clojure
Intro to Probabilistic Programming with MIT’s MetaProb
Testing Machine Learning Systems in Staging
Introduction to Kafka Streaming with Scala
Initial Impressions of Scala from a Java & Python Data Engineer
Visualising Global Temperature Anomalies 1880-2017
An Alt, Data-Driven Country Map [Winner World Dataviz Prize]
The DNA of Good Government [Winner World Dataviz Prize]
What Comes after Serverless? A Deployless Future
Federated Learning: The Future of Distributed Machine Learning
Hipster Shop: Cloud-Native Microservices Demo App & Code
ETH Zurich Research: Bitcoin as a Transaction Ledger (pdf)
The Ocean Protocol for Decentralized AI Data & Services (pdf)
OS Blockchain & Smart Contracts with Hyperledger Fabric
From Tensorflow to ML Kit: ML for Android Apps
Machine Learning for Mobile with Tensorflow
QP/C++ Open Framework for Real-time Embedded Systems
A New Theory for Selective Prediction
Explainable Text-Driven Neural Net for Stock Prediction
Cool Papers: Machine Learning/AI in Fashion
Closeness Centrality in Neo4j
Reinforcement Learning Algorithms: Free Book & Tutorial
[free book] Algorithms for Walking, Running, Flying… Robots
Self-Driving Cars MIT Lecture, Chief Scientist @Waymo
PythonRobotics - A Collection of Python Code for Robotics
Robotic Soft Sensing with Embedded Sensors & RNNs
Introduction to Reinforcement Learning [pdf, 519 pages]
xfer: Open Source Neural Network Transfer Learning
Deep Unsupervised Learning - UC Berkeley Spring 2019
Kite - AI Turbocharged Python Programming
BlazingDB - GPU-accelerated SQL for AI Workloads
R2.ai - Everyone’s Intelligent AutoML
UK Weather Stations Dataset 1853-2019
The NSFW (Not Safe for Work) Dataset, 220K Images
The Visual AI Dialog Challenge Dataset
Spread the word: share Data Machina with your friends
Tips? Suggestions? Feedback? Send an email to Carlos
Curated by Carlos @ds_ldn in the middle of the night.