Epoch 1️⃣8️⃣: This week in ML (+ Bioinformatics 🧬 and Astronomy 🌌)
Expire-Span, CogView, DeepAb, ProteinBERT, Ghost Artifacts in Dark Energy and more ...
Sorry for the late upload; I decided to take some time off given the current situation in India.
Not All Memories are Created Equal: Learning to Forget by Expiring
Abstract: Attention mechanisms have shown promising results in sequence-modeling tasks that require long-term memory. Recent work has investigated mechanisms to reduce the computational cost of preserving and storing memories. However, not all content in the past is equally important to remember. The authors propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information. This forgetting of memories enables Transformers to efficiently attend over tens of thousands of previous timesteps, since not all states from previous timesteps are preserved. Link to the code and Yannic Kilcher’s video.
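The core idea can be sketched in a few lines: each stored memory predicts its own expiration span, and an attention mask ramps smoothly to zero once that span is exceeded. Below is a minimal NumPy sketch of that masking, not the authors' implementation; the span projection `w`, the maximum span `L_MAX`, and the ramp length `RAMP` are illustrative stand-ins for learned/tuned quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

L_MAX = 16      # maximum allowed span (the paper's L); illustrative value
RAMP = 4        # soft-mask ramp length R; illustrative value

def expire_span_mask(h, w, t):
    """Soft retention mask for memories 0..t-1 at current step t.

    h : (t, d) hidden states of past timesteps
    w : (d,)   learned projection that predicts each memory's span
    """
    # learned expiration span e_i = L_MAX * sigmoid(w . h_i)
    e = L_MAX / (1.0 + np.exp(-h @ w))
    ages = t - np.arange(len(h))          # how long ago each memory was written
    r = e - ages                          # remaining life; very negative = expired
    # ramp linearly from 1 (alive) to 0 (expired) over RAMP steps
    return np.clip(1.0 + r / RAMP, 0.0, 1.0)

def attend(q, k, v, mask):
    """Softmax attention where expired memories contribute nothing."""
    scores = np.exp(q @ k.T / np.sqrt(k.shape[1]))
    scores = scores * mask                # zero out expired memories
    return (scores / scores.sum()) @ v

d, t = 8, 32
h = rng.standard_normal((t, d))           # stand-in hidden states
w = rng.standard_normal(d)                # stand-in span predictor
mask = expire_span_mask(h, w, t)
out = attend(rng.standard_normal(d), h, h, mask)
```

In the paper the mask multiplies the attention weights during training, so the span predictor receives gradients and learns which memories to expire; here the weights are random, so the sketch only illustrates the mechanics.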
CogView: Mastering Text-to-Image Generation via Transformers
Abstract: Text-to-image generation in the general domain has long been an open problem, requiring both a generative model and cross-modal understanding. The authors propose CogView, a 4-billion-parameter Transformer with a VQ-VAE tokenizer, to advance this problem. They also demonstrate finetuning strategies for various downstream tasks (e.g., style learning, super-resolution, text-image ranking, and fashion design) and methods to stabilize pretraining (e.g., eliminating NaN losses). CogView (zero-shot) achieves a new state-of-the-art FID on blurred MS COCO, outperforming previous GAN-based models as well as DALL-E, a recent similar work. Link to the code.
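At a high level, the approach flattens a caption and the VQ-VAE codes of its image into one token stream and trains the Transformer with ordinary next-token cross-entropy. A minimal NumPy sketch of that framing follows; the vocabulary sizes, the `BOI` separator token, and the random logits are illustrative assumptions, not CogView's actual configuration. (Subtracting the row max before the softmax is a standard stabilization trick, loosely in the spirit of the paper's NaN-elimination methods.)

```python
import numpy as np

rng = np.random.default_rng(0)

TEXT_VOCAB, IMAGE_VOCAB = 1000, 8192   # hypothetical sizes; CogView's differ
BOI = TEXT_VOCAB                       # hypothetical begin-of-image token
VOCAB = TEXT_VOCAB + 1 + IMAGE_VOCAB   # one shared vocabulary

def build_sequence(text_ids, image_codes):
    """Concatenate text tokens and VQ-VAE image codes into one stream."""
    image_ids = TEXT_VOCAB + 1 + np.asarray(image_codes)  # shift into shared space
    return np.concatenate([text_ids, [BOI], image_ids])

def next_token_loss(logits, seq):
    """Average cross-entropy of predicting each token from its prefix."""
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = seq[1:]
    return -log_probs[np.arange(len(targets)), targets].mean()

seq = build_sequence(rng.integers(0, TEXT_VOCAB, 16),   # 16 caption tokens
                     rng.integers(0, IMAGE_VOCAB, 64))  # 64 image codes
logits = rng.standard_normal((len(seq) - 1, VOCAB))     # stand-in for the model
loss = next_token_loss(logits, seq)
```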
ML in Bioinformatics 🧬
Antibody structure prediction using interpretable deep learning
Therapeutic antibodies make up a rapidly growing segment of the biologics market. However, rational design of antibodies is hindered by reliance on experimental methods for determining antibody structures. In recent years, deep learning methods have driven significant advances in general protein structure prediction. Here, the authors present DeepAb, a deep learning method for predicting accurate antibody Fv structures from sequence. They evaluate DeepAb on two benchmark sets – one balanced for structural diversity and the other composed of clinical-stage therapeutic antibodies – and find that their method consistently outperforms the leading alternatives. By introducing a directly interpretable attention mechanism, they show that their network attends to physically important residue pairs.
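The interpretability claim rests on reading the attention map directly: each entry says how much residue i attends to residue j, so the most-attended pairs can be inspected against known structural contacts. Here is a minimal NumPy sketch of that kind of inspection, with random stand-in embeddings and projections rather than DeepAb's trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def residue_attention(h, Wq, Wk):
    """Row-wise softmax attention map over residue pairs."""
    scores = (h @ Wq) @ (h @ Wk).T / np.sqrt(Wq.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    a = np.exp(scores)
    return a / a.sum(axis=1, keepdims=True)

def top_pairs(attn, k=5):
    """Most-attended residue pairs (i != j), for inspection."""
    a = attn.copy()
    np.fill_diagonal(a, -np.inf)                  # ignore self-attention
    idx = np.argsort(a, axis=None)[::-1][:k]
    return [tuple(np.unravel_index(i, a.shape)) for i in idx]

n_res, d = 20, 16                      # hypothetical loop length and width
h = rng.standard_normal((n_res, d))    # stand-in residue embeddings
Wq = rng.standard_normal((d, d))       # stand-in query projection
Wk = rng.standard_normal((d, d))       # stand-in key projection
attn = residue_attention(h, Wq, Wk)
pairs = top_pairs(attn)
```

With a trained network, `pairs` would be compared against physically important residue contacts; with random weights it only demonstrates the bookkeeping.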
ProteinBERT: A universal deep-learning model of protein sequence and function
Self-supervised deep language modeling has shown unprecedented success across natural language tasks, and has recently been repurposed for biological sequences. However, existing models and pretraining methods are designed and optimized for text analysis. The authors introduce ProteinBERT, a deep language model specifically designed for proteins. Their pretraining scheme combines masked language modeling with a novel task of Gene Ontology (GO) annotation prediction. They introduce novel architectural elements that make the model highly efficient and flexible for very long sequences. The architecture of ProteinBERT combines local (per-residue) and global (whole-protein) representations, allowing end-to-end processing of both kinds of inputs and outputs. Link to the code.
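The pretraining objective sums two losses: masked-token cross-entropy on the local (per-residue) output and binary cross-entropy on the global GO-annotation output. A minimal NumPy sketch with hypothetical sizes and random stand-in model outputs (in the real model, `logits` and `scores` come from the local and global heads):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlm_loss(logits, targets, masked):
    """Cross-entropy on masked residue positions only."""
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_pos = -log_probs[np.arange(len(targets)), targets]
    return per_pos[masked].mean()

def go_loss(scores, labels):
    """Binary cross-entropy over the multi-label GO annotation vector."""
    p = 1.0 / (1.0 + np.exp(-scores))
    return -(labels * np.log(p) + (1 - labels) * np.log(1 - p)).mean()

L_SEQ, N_AA, N_GO = 128, 26, 512                   # hypothetical sizes
logits = rng.standard_normal((L_SEQ, N_AA))        # local head: per-residue
scores = rng.standard_normal(N_GO)                 # global head: per-protein
targets = rng.integers(0, N_AA, L_SEQ)             # true residue identities
masked = rng.random(L_SEQ) < 0.15                  # 15% of positions masked
labels = (rng.random(N_GO) < 0.01).astype(float)   # sparse GO labels

loss = mlm_loss(logits, targets, masked) + go_loss(scores, labels)
```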
Astroinformatics 🌌
Finding flares in Kepler and TESS data with recurrent deep neural networks
Stellar flares are an important aspect of magnetic activity – from both stellar-evolution and circumstellar-habitability viewpoints – but finding them automatically and accurately is still a challenge in the Big Data era of astronomy. The authors present an experiment to detect flares in space-borne photometric data using deep neural networks. Using a set of artificial data and real photometric data, they trained a set of neural networks and found that the best-performing architectures were recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) layers. Testing the network trained on Kepler data on TESS light curves showed that it generalizes and finds flares – with similar effectiveness – in completely new data with previously unseen sampling and characteristics. Link to the code.
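Conceptually, the flare finder runs a recurrent network along the light curve and emits a per-cadence flare probability. The sketch below wires up a single untrained NumPy LSTM cell over a synthetic light curve with one injected flare; the architecture, sizes, and weights are illustrative, not the authors' trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    n = len(h)
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    g = np.tanh(z[2*n:3*n])                 # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3*n:]))      # output gate
    c = f * c + i * g
    return np.tanh(c) * o, c

def flare_probs(flux, W, U, b, w_out):
    """Per-cadence flare probability from the (untrained) LSTM."""
    n = len(b) // 4
    h, c = np.zeros(n), np.zeros(n)
    probs = []
    for x in flux:
        h, c = lstm_step(np.array([x]), h, c, W, U, b)
        probs.append(1.0 / (1.0 + np.exp(-(w_out @ h))))
    return np.array(probs)

# synthetic light curve: flat star + one injected flare (sharp rise, slow decay)
t = np.arange(200)
flux = 1.0 + 0.001 * rng.standard_normal(200)
flux[100:120] += 0.05 * np.exp(-(t[100:120] - 100) / 5.0)

n = 8                                           # hypothetical hidden size
W = rng.standard_normal((4 * n, 1))
U = rng.standard_normal((4 * n, n)) * 0.1
b, w_out = np.zeros(4 * n), rng.standard_normal(n)
probs = flare_probs(flux, W, U, b, w_out)
```

In the paper the network is trained on labeled Kepler cadences so that `probs` spikes at flares; untrained weights only demonstrate the data flow.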
Ghost artifacts and scattered light in the Dark Energy Survey
Astronomical images are often plagued by unwanted artifacts that arise from a number of sources, including imperfect optics, faulty image sensors, cosmic-ray hits, and even airplanes and artificial satellites. Spurious reflections (known as “ghosts”) and light scattered off the surfaces of a camera and/or telescope are particularly difficult to avoid. Detecting ghosts and scattered light efficiently in large cosmological surveys that will acquire petabytes of data can be a daunting task. In this paper, the authors use data from the Dark Energy Survey to develop, train, and validate a machine learning model that detects ghosts and scattered light using convolutional neural networks. As a proof of principle, they show that their method is promising for the Rubin Observatory and beyond.
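The classification step can be pictured as a convolutional network mapping a focal-plane image to a ghost/no-ghost score. Below is a deliberately tiny NumPy sketch (one convolution layer, global average pooling, logistic output) with random, untrained filters; it illustrates the pipeline shape, not the authors' actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, kern):
    """Valid 2-D cross-correlation of a single-channel image."""
    kh, kw = kern.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i+kh, j:j+kw] * kern).sum()
    return out

def ghost_score(img, kerns, w):
    """Tiny CNN: conv -> ReLU -> global average pool -> logistic score."""
    feats = [np.maximum(conv2d(img, k), 0).mean() for k in kerns]
    return 1.0 / (1.0 + np.exp(-(np.array(feats) @ w)))

img = rng.standard_normal((32, 32))            # stand-in for a survey cutout
yy, xx = np.mgrid[:32, :32]
img += 2.0 * np.exp(-((yy - 16)**2 + (xx - 16)**2) / 40.0)  # fake diffuse ghost

kerns = rng.standard_normal((4, 5, 5)) * 0.1   # untrained filters (sketch only)
w = rng.standard_normal(4)
score = ghost_score(img, kerns, w)
```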
Explainable, Interpretable, Bias and Ethics in AI
The AI Ethics Brief #57: Race and Digital Society, AI winter to AI hype, privacy in China, and more
Twitter | Sharing learnings about our image cropping algorithm
Report finds startling disinterest in ethical, responsible use of AI among business leaders
Interesting Events and News 📰
😍🤩 See the Wild Plans for Nüwa, a Proposed City on Mars Built Inside a Giant Cliff
Google Cloud unveils Vertex AI, one platform, every ML tool you need
The Military Is Funding Ethicists to Keep Its Brain Enhancement Experiments in Check
👨🏻‍🎨 DeepMind Releases Algorithm To Create Mind-Blowing Paintings Just From Text
Articles and Resources I liked 📃
DeepMind’s Latest A.I. Health Breakthrough Has Some Problems
A very Bayesian interpretation of decision trees and other machine learning algorithms
How to Deploy TransGAN for generating CelebA-like pictures ⭐️
Recommended Podcasts 🎧
Gradient Dissent | Alyssa Simpson Rochwerger on responsible machine learning in the real world
TWIML AI Podcast | Using AI to Map the Human Immune System w/ Jabran Zahid - #485
The Stack Overflow Podcast | Build engineering at Apple and the future of deploy previews
The Robot Brains Podcast | Peter Chen on building brains for robots in the real world
Python Bytes | #234 The Astronomy-filled edition with Dr. Becky
Data Science at Home | MLOps: what is and why it is important Part 2 (Ep. 152)
The Joy of X Podcast | Eve Marder on the Crucial Resilience of Neurons