Epoch 2️⃣0️⃣: This week in ML (+ Bioinformatics 🧬 and Astronomy 🌌)
Implicit Differentiation, Flow-based sampling, DNA-GCN, LEXACTUM, PyAutoLens and more ...
Automatic differentiation (autodiff) has revolutionized machine learning. It allows expressing complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. In this paper, the authors propose a unified, efficient and modular approach for implicit differentiation of optimization problems. The user defines a function F that captures the optimality conditions of the problem to be differentiated (directly in Python, in the authors' implementation). Autodiff of F is then combined with implicit differentiation to differentiate the optimization problem automatically. Link to the code and Yannic Kilcher’s video.
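To make the idea concrete, here is a minimal sketch of implicit differentiation on a toy scalar problem. It is not the authors' library: the problem, the function names (`F`, `solve`, `implicit_grad`) and the hand-written partial derivatives are all assumptions chosen for illustration. In the paper's approach the partial derivatives of F would come from autodiff instead. The optimality condition F(x, theta) = x^3 + x - theta = 0 corresponds to minimizing x^4/4 + x^2/2 - theta*x, and the implicit function theorem gives dx*/dtheta = -(dF/dx)^(-1) * dF/dtheta.

```python
def F(x, theta):
    """Optimality condition of the toy problem: F(x*, theta) = 0 at the solution."""
    return x**3 + x - theta


def dF_dx(x, theta):
    """Partial derivative of F with respect to x (hand-written here, autodiff in practice)."""
    return 3.0 * x**2 + 1.0


def dF_dtheta(x, theta):
    """Partial derivative of F with respect to theta."""
    return -1.0


def solve(theta, x0=0.0, tol=1e-12, max_iter=100):
    """Find x* with F(x*, theta) = 0 using Newton's method."""
    x = x0
    for _ in range(max_iter):
        step = F(x, theta) / dF_dx(x, theta)
        x -= step
        if abs(step) < tol:
            break
    return x


def implicit_grad(theta):
    """dx*/dtheta from the implicit function theorem, without differentiating the solver."""
    x_star = solve(theta)
    return -dF_dtheta(x_star, theta) / dF_dx(x_star, theta)


def finite_diff_grad(theta, eps=1e-6):
    """Central finite difference through the solver, used only as a sanity check."""
    return (solve(theta + eps) - solve(theta - eps)) / (2.0 * eps)


print(implicit_grad(2.0), finite_diff_grad(2.0))
```

The point of the design is that `implicit_grad` only needs the solution and the derivatives of F, so the solver itself can be any black box.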
Algorithms based on normalizing flows are emerging as promising machine learning approaches to sampling complicated probability distributions in a way that can be made asymptotically exact. In the context of lattice field theory, proof-of-principle studies have demonstrated the effectiveness of this approach for scalar theories, gauge theories, and statistical systems. This work develops approaches that enable flow-based sampling of theories with dynamical fermions, which is necessary for the technique to be applied to lattice field theory studies of the Standard Model of particle physics and many condensed matter systems.
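One common way such sampling is made asymptotically exact is to use the flow as the proposal of an independence Metropolis sampler. The sketch below assumes that scheme and is deliberately tiny: the "flow" is a fixed affine map of a Gaussian, and the target is a one-variable quartic action rather than a lattice theory with fermions. The names `action`, `flow_sample` and `flow_mcmc`, and the parameter values, are hypothetical.

```python
import math
import random


def action(x):
    """Toy single-variable action S(x); the target density is proportional to exp(-S(x))."""
    return 0.5 * x**2 + 0.25 * x**4


def flow_sample(rng, mu=0.0, sigma=0.8):
    """Draw x = mu + sigma * z with z ~ N(0, 1) and return x with its exact log-density."""
    z = rng.gauss(0.0, 1.0)
    x = mu + sigma * z
    log_q = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return x, log_q


def flow_mcmc(n_steps, seed=0):
    """Independence Metropolis chain that uses the flow as its proposal distribution."""
    rng = random.Random(seed)
    x, log_q = flow_sample(rng)
    log_w = -action(x) - log_q  # log of the unnormalized importance weight p(x) / q(x)
    chain, accepted = [], 0
    for _ in range(n_steps):
        x_new, log_q_new = flow_sample(rng)
        log_w_new = -action(x_new) - log_q_new
        # Accept with probability min(1, w_new / w_old).
        if rng.random() < math.exp(min(0.0, log_w_new - log_w)):
            x, log_w = x_new, log_w_new
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps


chain, rate = flow_mcmc(20000, seed=1)
print("acceptance rate:", rate)
```

The closer the flow's density is to the target, the higher the acceptance rate. The accept/reject step corrects whatever mismatch remains, which is what removes the bias of an imperfectly trained flow.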
ML in Bioinformatics 🧬
DNA-GCN: Graph convolutional networks for predicting DNA-protein binding
Predicting DNA-protein binding is an important and classic problem in bioinformatics. Convolutional neural networks have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. However, no previous study has used graph convolutional networks for motif inference, and that is what the authors propose here. They build a single sequence k-mer graph for the whole dataset, based on k-mer co-occurrence and on the relationship between k-mers and sequences, and then learn a DNA Graph Convolutional Network (DNA-GCN) on that graph. They evaluate their model on 50 datasets from ENCODE. Link to the code.
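As a rough illustration of the graph-building step, the sketch below extracts k-mers and collects the two kinds of edges mentioned above: sequence to k-mer edges and k-mer to k-mer co-occurrence edges. The raw-count edge weights, the sliding-window definition of co-occurrence, and the function names are assumptions made for this example. The paper's exact weighting scheme is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_graph(sequences, k=3, window=2):
    """Build edge dictionaries for a heterogeneous sequence / k-mer graph.

    seq_edges maps (sequence index, k-mer) to how often the k-mer occurs in that sequence.
    cooc maps a sorted (k-mer, k-mer) pair to the number of sliding windows containing both.
    """
    seq_edges = {}
    cooc = Counter()
    for idx, seq in enumerate(sequences):
        tokens = kmers(seq, k)
        for kmer, count in Counter(tokens).items():
            seq_edges[(idx, kmer)] = count
        # Slide a window over consecutive k-mers and count distinct pairs inside it.
        for start in range(max(1, len(tokens) - window + 1)):
            in_window = sorted(set(tokens[start:start + window]))
            for a, b in combinations(in_window, 2):
                cooc[(a, b)] += 1
    return seq_edges, cooc


seq_edges, cooc = build_kmer_graph(["ACGT", "ACGA"], k=3, window=2)
print(seq_edges)
print(dict(cooc))
```

A graph convolutional network would then be trained on the adjacency matrix assembled from these two edge sets, with sequence nodes carrying the binding labels.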
Ten Quick Tips for Deep Learning in Biology
In biological research, deep learning is increasingly used to derive novel insights from high-dimensional biological data. To make the biological applications of deep learning more accessible to scientists who have some experience with machine learning, the authors solicited input from a community of researchers with varied biological and deep learning interests. These individuals collaboratively wrote the manuscript using GitHub and the Manubot manuscript generation toolset. The goal was to articulate a practical, accessible, and concise set of guidelines and suggestions to follow when using deep learning. Link to the accompanying GitHub Repository 📒.
Astroinformatics 🌌
As we enter the era of large-scale imaging surveys with upcoming telescopes such as LSST and SKA, the number of known strong gravitational lensing systems is expected to increase dramatically. However, these events are still very rare, and finding them requires the efficient processing of millions of images. To tackle this image processing problem, the authors present machine learning techniques and apply them to the Gravitational Lens Finding Challenge. The Convolutional Neural Networks (CNNs) presented have been re-implemented within a new modular and extensible framework, LEXACTUM. Link to the code.
PyAutoLens: Open-Source Strong Gravitational Lensing
Strong gravitational lensing, in which a background source galaxy appears multiple times because its light rays are deflected by the mass of one or more foreground lens galaxies, provides astronomers with a powerful tool to study dark matter, cosmology and the most distant Universe. PyAutoLens is an open-source Python 3.6+ package for strong gravitational lensing, with core features including fully automated strong lens modeling of galaxies and galaxy clusters, support for direct imaging and interferometer datasets, and comprehensive tools for simulating samples of strong lenses. ⭐️⭐️ Link to the Code.
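To see why a single source can appear more than once, here is a small sketch that solves the lens equation for a singular isothermal sphere (SIS) along one axis. It does not use the PyAutoLens API. The lens model, the function name `sis_images` and the one-dimensional setup are assumptions picked because the solution can be written down in closed form: the lens equation is beta = theta - theta_E * sign(theta), where beta is the source position, theta an image position and theta_E the Einstein radius.

```python
def sis_images(beta, theta_e):
    """Image positions for a singular isothermal sphere lens, along the source-lens axis.

    beta is the true (unlensed) source position and theta_e the Einstein radius,
    both in the same angular units. Each candidate is kept only if it satisfies
    beta = theta - theta_e * sign(theta).
    """
    images = []
    outer = beta + theta_e  # image on the positive side, valid only if positive
    if outer > 0:
        images.append(outer)
    inner = beta - theta_e  # image on the negative side, valid only if negative
    if inner < 0:
        images.append(inner)
    return images


# A source inside the Einstein radius is doubly imaged, one outside it is not.
print(sis_images(0.3, 1.0))
print(sis_images(2.0, 1.0))
```

Real lens modeling fits far richer mass and light profiles to imaging data, but the same counting of lens-equation solutions is what produces multiple images.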
Explainable, Interpretable, Bias and Ethics in AI
The AI Ethics Brief #59: Carbon-accounting for AI, filtered dating, dark patterns, and more ...
What Really Happened When Google Ousted Timnit Gebru (Wired Article)
Interesting Events and News 📰
Articles and Resources 📃 I liked
⭐️⭐️ ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
⭐️⭐️ Hacker's guide to deep-learning side-channel attacks: code walkthrough
Recommended Podcasts 🎧
Gradient Dissent | What could make AI conscious? with Wojciech Zaremba, co-founder of OpenAI
The Robot Brains Podcast | Mike Schuster on whether AI can help hedge fund investors to beat the market?
Test & Code | 156: Flake8: Python linting framework with Pyflakes, pycodestyle, McCabe, and more - Anthony Sottile
Machine Learning – Software Engineering Daily | Machine Learning: The Great Stagnation with Mark Saroufim
Houston We Have a Podcast | The International Space Station and Beyond
Gravity Assist: Before You Launch: Practice, Practice, Practice