On Deep Learning and Tabular Data. Is Deep Learning worth all the effort, complexity, and cost for supervised learning tasks on tabular data? Or are tree-ensemble methods faster, more accurate, and more cost-efficient than DL? Here are a few papers and write-ups that speak to those questions.
Back in 2020, Google published a paper titled TabNet: Attentive Interpretable Tabular Learning, in which the team claimed that TabNet outperforms previous work across tabular datasets from different domains (pdf).
Since then, though, many people have disputed the TabNet paper's results. In 2021, Michael Clark wrote an excellent summary of findings on DL for tabular data. His conclusion: Definitely Deep Learning is Not All You Need for Tabular Data.
In June 2022, the DSAR group at the University of Tübingen published a new paper: Deep Neural Networks and Tabular Data: A Survey. They claim that algorithms based on gradient-boosted tree ensembles still mostly outperform DL models on supervised learning tasks (pdf).
And just recently, a team at Inria Saclay & Sorbonne University (some of them from the scikit-learn team) published Why Do Tree-based Models Still Outperform Deep Learning on Tabular Data?, in which they provide a systematic benchmark and results showing that tree-based models remain state-of-the-art on medium-sized data, even without accounting for their superior speed (pdf).
US & UK competition on Privacy-Preserving ML (PPML). As per my previous post DM #158, PPML is likely to become part of the regulatory framework for financial services. A few days ago, the US & UK announced a competition on Financial Crime, Healthcare and Privacy-Preserving Federated Learning. From the type and level of agencies & regulators involved, you can tell PPML is at the top of the agenda. Get all the details on this Privacy Enhancing Tech Innovation Competition here.
10 Link-o-Troned
A Pythonista *Experience*
beCause of Dennis & Bjarne
C++ Incremental Decision Trees
Scripting aRt
Love from Julia
(Paren(th)ethical)
ScalaTOR
data v-i-s-i-o-n-s
Distributed de-Entangler
Forschung!
Algorithmic Potpourri
Robots & Cyborgs like <you>
Deep & Other Learning Bits
startups -> radar
ML Datasets & Stuff
Postscript, etc
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.