Epoch 1️⃣8️⃣: This week in ML (+ Bioinformatics 🧬 and Astronomy 🌌)
Expire-Span, CogView, DeepAb, ProteinBERT, Ghost Artifacts in Dark Energy and more ...
Sorry for the late upload; I decided to take some time off given the current situation in India.
Not All Memories are Created Equal: Learning to Forget by Expiring
Abstract: Attention mechanisms have shown promising results in sequence-modeling tasks that require long-term memory. Recent work has investigated mechanisms to reduce the computational cost of preserving and storing memories. However, not all content in the past is equally important to remember. The authors propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information. This forgetting of memories enables Transformers to efficiently attend over tens of thousands of previous timesteps, since not all states from previous timesteps are preserved. Link to the code and Yannic Kilcher’s video.
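The core idea can be sketched in a few lines: each stored memory predicts its own expiration span, and an attention mask ramps smoothly to zero once that span is exceeded. Below is a minimal NumPy sketch of that masking, not the authors' implementation; the span projection `w`, the maximum span `L_MAX`, and the ramp length `RAMP` are illustrative stand-ins for learned/tuned quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

L_MAX = 16      # maximum allowed span (the paper's L); illustrative value
RAMP = 4        # soft-mask ramp length R; illustrative value

def expire_span_mask(h, w, t):
    """Soft retention mask for memories 0..t-1 at current step t.

    h : (t, d) hidden states of past timesteps
    w : (d,)   learned projection that predicts each memory's span
    """
    # learned expiration span e_i = L_MAX * sigmoid(w . h_i)
    e = L_MAX / (1.0 + np.exp(-h @ w))
    ages = t - np.arange(len(h))          # how long ago each memory was written
    r = e - ages                          # remaining life; very negative = expired
    # ramp linearly from 1 (alive) to 0 (expired) over RAMP steps
    return np.clip(1.0 + r / RAMP, 0.0, 1.0)

def attend(q, k, v, mask):
    """Softmax attention where expired memories contribute nothing."""
    scores = np.exp(q @ k.T / np.sqrt(k.shape[1]))
    scores = scores * mask                # zero out expired memories
    return (scores / scores.sum()) @ v

d, t = 8, 32
h = rng.standard_normal((t, d))           # stand-in hidden states
w = rng.standard_normal(d)                # stand-in span predictor
mask = expire_span_mask(h, w, t)
out = attend(rng.standard_normal(d), h, h, mask)
```

In the paper the mask multiplies the attention weights during training, so the span predictor receives gradients and learns which memories to expire; here the weights are random, so the sketch only illustrates the mechanics.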
CogView: Mastering Text-to-Image Generation via Transformers
Abstract: Text-to-image generation in the general domain has long been an open problem, requiring both a generative model and cross-modal understanding. The authors propose CogView, a 4-billion-parameter Transformer with a VQ-VAE tokenizer, to advance this problem. They also demonstrate finetuning strategies for various downstream tasks (e.g., style learning, super-resolution, text-image ranking, and fashion design) and methods to stabilize pretraining (e.g., eliminating NaN losses). CogView (zero-shot) achieves a new state-of-the-art FID on blurred MS COCO, outperforming previous GAN-based models as well as DALL-E, a recent similar work. Link to the code.
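At a high level, the approach flattens a caption and the VQ-VAE codes of its image into one token stream and trains the Transformer with ordinary next-token cross-entropy. A minimal NumPy sketch of that framing follows; the vocabulary sizes, the `BOI` separator token, and the random logits are illustrative assumptions, not CogView's actual configuration. (Subtracting the row max before the softmax is a standard stabilization trick, loosely in the spirit of the paper's NaN-elimination methods.)

```python
import numpy as np

rng = np.random.default_rng(0)

TEXT_VOCAB, IMAGE_VOCAB = 1000, 8192   # hypothetical sizes; CogView's differ
BOI = TEXT_VOCAB                       # hypothetical begin-of-image token
VOCAB = TEXT_VOCAB + 1 + IMAGE_VOCAB   # one shared vocabulary

def build_sequence(text_ids, image_codes):
    """Concatenate text tokens and VQ-VAE image codes into one stream."""
    image_ids = TEXT_VOCAB + 1 + np.asarray(image_codes)  # shift into shared space
    return np.concatenate([text_ids, [BOI], image_ids])

def next_token_loss(logits, seq):
    """Average cross-entropy of predicting each token from its prefix."""
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = seq[1:]
    return -log_probs[np.arange(len(targets)), targets].mean()

seq = build_sequence(rng.integers(0, TEXT_VOCAB, 16),   # 16 caption tokens
                     rng.integers(0, IMAGE_VOCAB, 64))  # 64 image codes
logits = rng.standard_normal((len(seq) - 1, VOCAB))     # stand-in for the model
loss = next_token_loss(logits, seq)
```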
ML in Bioinformatics 🧬
Antibody structure prediction using interpretable deep learning
Therapeutic antibodies make up a rapidly growing segment of the biologics market. However, rational design of antibodies is hindered by reliance on experimental methods for determining antibody structures. In recent years, deep learning methods have driven significant advances in general protein structure prediction. Here, the authors present DeepAb, a deep learning method for predicting accurate antibody Fv structures from sequence. They evaluate DeepAb on two benchmark sets – one balanced for structural diversity and the other composed of clinical-stage therapeutic antibodies – and find that their method consistently outperforms the leading alternatives. By introducing a directly interpretable attention mechanism, they show that their network attends to physically important residue pairs.
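The interpretability claim rests on reading the attention map directly: each entry says how much residue i attends to residue j, so the most-attended pairs can be inspected against known structural contacts. Here is a minimal NumPy sketch of that kind of inspection, with random stand-in embeddings and projections rather than DeepAb's trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def residue_attention(h, Wq, Wk):
    """Row-wise softmax attention map over residue pairs."""
    scores = (h @ Wq) @ (h @ Wk).T / np.sqrt(Wq.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    a = np.exp(scores)
    return a / a.sum(axis=1, keepdims=True)

def top_pairs(attn, k=5):
    """Most-attended residue pairs (i != j), for inspection."""
    a = attn.copy()
    np.fill_diagonal(a, -np.inf)                  # ignore self-attention
    idx = np.argsort(a, axis=None)[::-1][:k]
    return [tuple(np.unravel_index(i, a.shape)) for i in idx]

n_res, d = 20, 16                      # hypothetical loop length and width
h = rng.standard_normal((n_res, d))    # stand-in residue embeddings
Wq = rng.standard_normal((d, d))       # stand-in query projection
Wk = rng.standard_normal((d, d))       # stand-in key projection
attn = residue_attention(h, Wq, Wk)
pairs = top_pairs(attn)
```

With a trained network, `pairs` would be compared against physically important residue contacts; with random weights it only demonstrates the bookkeeping.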
ProteinBERT: A universal deep-learning model of protein sequence and function
Self-supervised deep language modeling has shown unprecedented success across natural language tasks, and has recently been repurposed for biological sequences. However, existing models and pretraining methods are designed and optimized for text analysis. The authors introduce ProteinBERT, a deep language model specifically designed for proteins. Their pretraining scheme combines masked language modeling with a novel task of Gene Ontology (GO) annotation prediction. They introduce novel architectural elements that make the model highly efficient and flexible for very long sequences. The architecture of ProteinBERT combines local (per-residue) and global (whole-protein) representations, allowing end-to-end processing of both kinds of inputs and outputs. Link to the code.
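The pretraining objective sums two losses: masked-token cross-entropy on the local (per-residue) output and binary cross-entropy on the global GO-annotation output. A minimal NumPy sketch with hypothetical sizes and random stand-in model outputs (in the real model, `logits` and `scores` come from the local and global heads):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlm_loss(logits, targets, masked):
    """Cross-entropy on masked residue positions only."""
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_pos = -log_probs[np.arange(len(targets)), targets]
    return per_pos[masked].mean()

def go_loss(scores, labels):
    """Binary cross-entropy over the multi-label GO annotation vector."""
    p = 1.0 / (1.0 + np.exp(-scores))
    return -(labels * np.log(p) + (1 - labels) * np.log(1 - p)).mean()

L_SEQ, N_AA, N_GO = 128, 26, 512                   # hypothetical sizes
logits = rng.standard_normal((L_SEQ, N_AA))        # local head: per-residue
scores = rng.standard_normal(N_GO)                 # global head: per-protein
targets = rng.integers(0, N_AA, L_SEQ)             # true residue identities
masked = rng.random(L_SEQ) < 0.15                  # 15% of positions masked
labels = (rng.random(N_GO) < 0.01).astype(float)   # sparse GO labels

loss = mlm_loss(logits, targets, masked) + go_loss(scores, labels)
```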
Astroinformatics 🌌
Finding flares in Kepler and TESS data with recurrent deep neural networks
Stellar flares are an important aspect of magnetic activity – from both stellar-evolution and circumstellar-habitability viewpoints – but finding them automatically and accurately is still a challenge in the Big Data era of astronomy. The authors present an experiment to detect flares in space-borne photometric data using deep neural networks. Using a set of artificial data and real photometric data, they trained a set of neural networks and found that the best-performing architectures were recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) layers. Testing the network trained on Kepler data on TESS light curves showed that it generalizes and finds flares – with similar effectiveness – in completely new data with previously unseen sampling and characteristics. Link to the code.
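Conceptually, the flare finder runs a recurrent network along the light curve and emits a per-cadence flare probability. The sketch below wires up a single untrained NumPy LSTM cell over a synthetic light curve with one injected flare; the architecture, sizes, and weights are illustrative, not the authors' trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    n = len(h)
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    g = np.tanh(z[2*n:3*n])                 # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3*n:]))      # output gate
    c = f * c + i * g
    return np.tanh(c) * o, c

def flare_probs(flux, W, U, b, w_out):
    """Per-cadence flare probability from the (untrained) LSTM."""
    n = len(b) // 4
    h, c = np.zeros(n), np.zeros(n)
    probs = []
    for x in flux:
        h, c = lstm_step(np.array([x]), h, c, W, U, b)
        probs.append(1.0 / (1.0 + np.exp(-(w_out @ h))))
    return np.array(probs)

# synthetic light curve: flat star + one injected flare (sharp rise, slow decay)
t = np.arange(200)
flux = 1.0 + 0.001 * rng.standard_normal(200)
flux[100:120] += 0.05 * np.exp(-(t[100:120] - 100) / 5.0)

n = 8                                           # hypothetical hidden size
W = rng.standard_normal((4 * n, 1))
U = rng.standard_normal((4 * n, n)) * 0.1
b, w_out = np.zeros(4 * n), rng.standard_normal(n)
probs = flare_probs(flux, W, U, b, w_out)
```

In the paper the network is trained on labeled Kepler cadences so that `probs` spikes at flares; untrained weights only demonstrate the data flow.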
Ghost artifacts and scattered light in the Dark Energy Survey
Astronomical images are often plagued by unwanted artifacts that arise from a number of sources, including imperfect optics, faulty image sensors, cosmic-ray hits, and even airplanes and artificial satellites. Spurious reflections (known as “ghosts”) and light scattered off the surfaces of a camera and/or telescope are particularly difficult to avoid. Detecting ghosts and scattered light efficiently in large cosmological surveys that will acquire petabytes of data can be a daunting task. In this paper, the authors use data from the Dark Energy Survey to develop, train, and validate a machine learning model that detects ghosts and scattered light using convolutional neural networks. As a proof of principle, they show that their method is promising for the Rubin Observatory and beyond.
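The classification step can be pictured as a convolutional network mapping a focal-plane image to a ghost/no-ghost score. Below is a deliberately tiny NumPy sketch (one convolution layer, global average pooling, logistic output) with random, untrained filters; it illustrates the pipeline shape, not the authors' actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, kern):
    """Valid 2-D cross-correlation of a single-channel image."""
    kh, kw = kern.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i+kh, j:j+kw] * kern).sum()
    return out

def ghost_score(img, kerns, w):
    """Tiny CNN: conv -> ReLU -> global average pool -> logistic score."""
    feats = [np.maximum(conv2d(img, k), 0).mean() for k in kerns]
    return 1.0 / (1.0 + np.exp(-(np.array(feats) @ w)))

img = rng.standard_normal((32, 32))            # stand-in for a survey cutout
yy, xx = np.mgrid[:32, :32]
img += 2.0 * np.exp(-((yy - 16)**2 + (xx - 16)**2) / 40.0)  # fake diffuse ghost

kerns = rng.standard_normal((4, 5, 5)) * 0.1   # untrained filters (sketch only)
w = rng.standard_normal(4)
score = ghost_score(img, kerns, w)
```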
Explainable, Interpretable, Bias and Ethics in AI
The AI Ethics Brief #57: Race and Digital Society, AI winter to AI hype, privacy in China, and more
Twitter | Sharing learnings about our image cropping algorithm
Report finds startling disinterest in ethical, responsible use of AI among business leaders
Interesting Events and News 📰
😍🤩 See the Wild Plans for Nüwa, a Proposed City on Mars Built Inside a Giant Cliff
Google Cloud unveils Vertex AI, one platform, every ML tool you need
The Military Is Funding Ethicists to Keep Its Brain Enhancement Experiments in Check
👨🏻‍🎨 DeepMind Releases Algorithm To Create Mind-Blowing Paintings Just From Text
Articles and Resources I liked 📃
DeepMind’s Latest A.I. Health Breakthrough Has Some Problems
A very Bayesian interpretation of decision trees and other machine learning algorithms
How to Deploy TransGAN for generating CelebA-like pictures ⭐️
Recommended Podcasts 🎧
Gradient Dissent | Alyssa Simpson Rochwerger on responsible machine learning in the real world
TWIML AI Podcast | Using AI to Map the Human Immune System w/ Jabran Zahid - #485
The Stack Overflow Podcast | Build engineering at Apple and the future of deploy previews
The Robot Brains Podcast | Peter Chen on building brains for robots in the real world
Python Bytes | #234 The Astronomy-filled edition with Dr. Becky
Data Science at Home | MLOps: what is and why it is important Part 2 (Ep. 152)
The Joy of X Podcast | Eve Marder on the Crucial Resilience of Neurons