1 NeurIPS 2019

This repository collects the papers and my notes from talks and other parts of NeurIPS 2019 that I attended. I created it mostly for my own sake, but also because I found it tedious to track down the papers and slides for the various talks I wanted to go to. Also included are my personal notes, which will be updated as the week goes on.

1.1 Sunday

1.1.1 Habana Labs

  • Notes
  • Talk was a brief showcase of the Habana Goya Inference Processor. If you're not familiar with this type of processor, it lets you fine-tune the inference of your model by reducing its precision without losing accuracy. This sentence, liberally taken from the Nvidia website, summarizes the importance of reduced precision:

Reduced precision inference significantly reduces application
latency, which is a requirement for many real-time services,
auto and embedded applications.
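The quoted claim can be made concrete with a toy sketch of post-training int8 quantization (my own illustration; `quantize_int8` and `dequantize` are made-up helpers, not Habana's or Nvidia's API):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric quantization: map the largest magnitude to 127."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

weights = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q.nbytes / weights.nbytes)  # 0.25 -- int8 storage is 4x smaller
```

The round-trip error is bounded by half a quantization step, which is why accuracy often survives the precision cut; real inference processors apply the same idea to activations and arithmetic, not just storage.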

1.1.2 Facebook Hardware

  • Notes
  • Very interesting talk on the Open Accelerator Infrastructure. I couldn't stay for the whole thing, but the gist is that the OAI stack brings an open-source-software feel to hardware. The stack aims to reduce the time to integration for AI systems. There is an impressive number of companies behind it, including Habana. For more detail, see the link below.
  • OAI

1.1.3 ML in Finance

  • Notes
  • Personal Note: Two Sigma folks were very nice and approachable.
  • Attended the end of the J.P. Morgan talk and the Two Sigma talk in this session. Both were very high-level talks about ML in finance. J.P. Morgan covered the basics of RL and its application to financial data. Two Sigma's talk broke down different uses of ML for finding opportunities within the market. They provided an example of how to break down an Instagram photo into consumable data for CNNs (object recognition) and LSTMs (sentiment analysis). The process of training forecasting models was also covered.

1.2 Monday (tutorials)

1.2.1 Deep Learning with Bayesian Principles

  • Notes
  • The author, Mohammad Emtiyaz Khan, focused on the benefits of combining Bayesian learning and deep learning approaches. He showed how to derive common DL optimizers like Adam and RMSProp from Bayesian principles. Khan stressed the importance of such Bayesian principles in lifelong learning. This talk was one of my favorites because Khan did a great job of distilling the benefits of Bayesian learning into well-explained equations.

1.2.2 Efficient Processing of Deep Neural Networks

  • Notes
  • Vivienne Sze discussed the various processing methods available and being researched for AI computation. Specifically, Sze drove home the impact of reads from DRAM in training. This talk was quite dense and covered many aspects of AI processing, from chip specifics to co-design and Neural Architecture Search (NAS).

1.2.3 Reinforcement Learning: Past, Present, and Future Perspectives

  • Notes
  • Katja Hofmann, of Microsoft Research, talked about Reinforcement Learning (RL) from its inception to standard practice. Hofmann outlined Deep Q-Learning and some improvements that have been made to the method, such as Bootstrapped DQNs from Osband et al. (2018). She then explained the Actor-Critic model and shared a number of papers I am going to go read.

1.3 Tuesday

1.3.1 Uniform Convergence May Be Unable to Explain Generalization in Deep Learning

1.3.2 Logarithmic Regret for Online Control

1.3.3 Legendre Memory Units: Continuous-Time Representation in RNNs

1.3.4 Point-Voxel CNN for Efficient 3D Deep Learning

  • Site
  • Paper
  • Won a gold medal in the Lyft Challenge
  • Shows a large improvement over PointNet

1.3.5 Conditional Independence Testing Using GANs

1.3.6 Machine Learning Meets Single-Cell Biology: Insights and Challenges

  • Speaker: Dana Pe'er
  • Biology is becoming a data science
  • Mentioned the success of tSNE for biological data
  • Was able to use data science methods to discover a new cell type that forms a checkpoint in DNA structure. The cell is as rare as 7 in 10,000.
  • Was able to map the spatio-temporal development of mammalian endoderm.
    • Setty et al. (2019), Nature

1.3.7 Causal Confusion in Imitation Learning

1.3.8 Generative Modeling by Estimating Gradients of the Data Distribution

  • Paper
  • Stable training as opposed to GANs
  • Better or comparable sample quality to GANs
  • Inpainting example was quite cool

1.3.9 Reducing the Variance in Online Optimization by Transporting Past Gradients

1.3.10 SySCD: A System-Aware Parallel Coordinate Descent Algorithm

1.3.11 Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

1.3.12 Hindsight Credit Assignment

1.3.13 Weight Agnostic Neural Networks

  • Paper
  • Slides
  • Site
  • Questions the importance of architecture in the learning process. Makes the comparison to precocial species, which have certain abilities from birth.
  • Shows that a neural architecture search can find networks that perform on multiple reinforcement learning and supervised learning tasks.
  • Based on NEAT
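The evaluation idea above can be sketched in a toy form (mine, not the paper's code): instead of training weights, an architecture is scored with a single weight shared across all connections, swept over a few candidate values.

```python
import numpy as np

def weight_agnostic_fitness(forward, inputs, targets,
                            shared_weights=(-2.0, -1.0, 1.0, 2.0)):
    """Score an architecture by average performance over shared weight values."""
    scores = [-np.mean((forward(inputs, w) - targets) ** 2)  # negative MSE
              for w in shared_weights]
    return float(np.mean(scores))

# Toy "architecture": y = tanh(w * x), scored against the target y = tanh(x)
x = np.linspace(-1.0, 1.0, 50)
fitness = weight_agnostic_fitness(lambda inp, w: np.tanh(w * inp), x, np.tanh(x))
```

A search like NEAT then evolves structures that score well under this weight-agnostic fitness, which is what lets the architecture itself encode the solution.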

1.3.14 Other Papers

  1. Neural Networks with Cheap Differential Operators
  2. Sequential Neural Processes

1.4 Wednesday

1.4.1 Fast and Accurate Least-Mean-Squares Solvers

  • Paper
  • Novel method to compute the Carathéodory set
  • Mentioned a heavy dependence on the dimensionality being low for the benefits to show.

1.4.2 Calibration tests in multi-class classification: A unifying framework

1.4.3 Verified Uncertainty Calibration

  • Code
  • Paper
  • Platt scaling: its calibration error cannot be reliably estimated
  • Proposes a debiased estimator
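For context, a minimal sketch of what Platt scaling itself does (my own toy, not the paper's code): fit p = sigmoid(a·score + b) to map raw classifier scores to probabilities. The paper's point concerns estimating the calibration error of such continuous methods, which this sketch does not address.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit (a, b) by gradient descent on the logistic loss."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = sigmoid(a * scores + b)
        grad = p - labels                 # dLoss/dLogit for log loss
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    return a, b

rng = np.random.default_rng(0)
scores = rng.normal(size=1000)
labels = (rng.random(1000) < sigmoid(2.0 * scores)).astype(float)  # true a=2, b=0

a, b = fit_platt(scores, labels)  # a should land near 2, b near 0
```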

1.4.4 Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

1.4.5 Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG

  • Slides
  • Paper
  • Combines principal component projection and regression (PCP, PCR)

1.4.6 PIDForest: Anomaly Detection via Partial Identification

1.4.7 Guided Similarity Separation for Image Retrieval

  • Paper
  • Graph convolutional network that models the descriptor graph, where the descriptors are what images are compared against for retrieval.

1.4.8 CNAPs: Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes

  • Paper
  • Slides
  • Code
  • Showed added classes and added test images without re-training
  • Meta-Dataset: a dataset of datasets

1.4.9 Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

1.4.10 Efficient Meta Learning via Minibatch Proximal Update

1.4.11 Reconciling meta-learning and continual learning with online mixtures of tasks

1.4.12 Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Convex Optimization

  • Slides
  • Paper
  • The goal wasn't to minimize regret, but to achieve an optimal competitive ratio.
  • Competitive ratio: the cost of the learner's actions compared to the best possible course of action in hindsight
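The competitive-ratio objective is just this arithmetic (the numbers below are made up for illustration):

```python
# Ratio of the online learner's total cost to the optimal cost in hindsight.
learner_costs = [3.0, 2.5, 4.0, 1.5]   # cost the online algorithm incurred per round
optimal_costs = [2.0, 2.0, 3.0, 1.0]   # offline optimum, known only in hindsight

competitive_ratio = sum(learner_costs) / sum(optimal_costs)
print(competitive_ratio)  # 1.375
```

A c-competitive algorithm guarantees this ratio never exceeds c on any input, which is a different (worst-case, multiplicative) guarantee than a regret bound.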

1.4.13 Strategizing against No-regret Learners

  • Two types of agents: Strategic and Learning
  • Strategic agents maximize utility whereas learning agents play to learn how to play
  • Bidders' behavior in online auctions is largely consistent with a no-regret learner
  • Mean-based algorithm: play the historically best action
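The mean-based rule can be sketched in its Follow-the-Leader form (my own toy, not the paper's code): each round, play the action with the highest cumulative (equivalently, mean) historical reward.

```python
import numpy as np

def mean_based_play(reward_history):
    """reward_history: (rounds, actions) array; pick the historically best action."""
    return int(np.argmax(reward_history.sum(axis=0)))

rng = np.random.default_rng(1)
means = np.array([0.3, 0.5, 0.7])      # Bernoulli reward means per action
history, plays = [], []
for t in range(500):
    action = mean_based_play(np.array(history)) if history else 0
    plays.append(action)
    history.append((rng.random(3) < means).astype(float))  # full feedback

# After enough rounds the rule locks onto the best action (index 2)
```

The talk's point is that a strategic opponent can exploit exactly this predictability in a learning agent.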

1.5 Thursday

1.5.1 Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs

  • Video
  • Code
  • Introduced "Brain-Score" evaluation system
    • Measures the similarity of human mistakes to artificial neural network mistakes
  • CORnet-S performs similarly to ResNet on ImageNet with significantly fewer layers
  • Theme: shallow networks with recurrent layers can outperform deep networks
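A hedged sketch of the idea behind a Brain-Score-style similarity measure: correlate per-image error patterns of humans and a model. The data here are invented, and the real benchmark is far richer (it includes neural recordings, not just behavior).

```python
import numpy as np

human_errors = np.array([0.1, 0.8, 0.3, 0.9, 0.2])  # per-image human error rate
model_errors = np.array([0.2, 0.7, 0.4, 0.8, 0.1])  # per-image model error rate

# Pearson correlation of the two error patterns; closer to 1 = more brain-like mistakes
similarity = np.corrcoef(human_errors, model_errors)[0, 1]
```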

1.5.2 Learning Perceptual Inference by Contrasting

1.5.3 Universality and individuality in neural dynamics across large populations of recurrent networks

1.5.4 Better Transfer Learning with Inferred Successor Maps

  • Slides
  • Transfer learning in RL
  • Mentioned Dayan (1993), the fifth time I've heard that cited here. Need to read it.
  • Clustered tasks by similarity in reward
  • Bayesian Successor Representation
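For reference, a sketch of a plain (non-Bayesian) successor representation, the object the talk builds on (my own toy, not the authors' code): M[s, s'] estimates the expected discounted future occupancy of s' starting from s, learned with a TD-style update.

```python
import numpy as np

n_states, gamma, alpha = 4, 0.9, 0.1
M = np.eye(n_states)

def sr_update(M, s, s_next):
    """One TD update of the successor matrix after transition s -> s_next."""
    onehot = np.eye(n_states)[s]
    M[s] += alpha * (onehot + gamma * M[s_next] - M[s])
    return M

rng = np.random.default_rng(0)
s = 0
for _ in range(2000):                      # random walk on a 4-state ring
    s_next = (s + rng.choice([-1, 1])) % n_states
    M = sr_update(M, s, s_next)
    s = s_next

# M[0, 0] > M[0, 2]: a state predicts its own occupancy more than the far state
```

Because reward enters only through a final dot product with M, the representation transfers across tasks with shared dynamics, which is what the talk's Bayesian variant exploits.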

1.5.5 A Unified Theory for the Origin of Grid Cells through the Lens of Pattern Formation

  • Slides
  • Site
  • Code (doesn't seem to be posted yet)
  • Neural networks learn grid patterns
  • High level but very interesting talk

1.5.6 Infra-slow brain dynamics as a marker for cognitive function and decline

1.5.7 Agency + Automation: Designing Artificial Intelligence into Interactive Systems

1.5.8 Making AI Forget You: Data Deletion in Machine Learning

1.5.9 XLNet: Generalized Autoregressive Pretraining for Language Understanding

1.5.10 Other Papers

  1. On the Downstream Performance of Compressed Word Embeddings

1.6 Workshops

1.6.1 MLSys: Systems for machine learning

  • This workshop covered tools for machine learning. Many of the tools focused on distributed training, model compilation, and workflow improvements.
  • Some highlights:
    • SKTime: Think scikit-learn but for time series.
    • Condensa: Programmable Model Compression
    • NeMo: toolkit for building AI applications using neural modules
  • Vivienne Sze Talk
    • Stressed that DRAM reads are expensive and MACs are not (relatively)
    • energy estimation tool
    • (More or less the same talk as the other keynote she gave)

1.6.2 Machine Learning for Physical Sciences

  • Modeling Turbulent Flow
    • Rayleigh-Bénard convection model
    • The author's TF-flow model was able to accurately model small-scale eddies
    • Could not speak to what would happen at an increased resolution
    • Trained on 10K sequences of RBC data
  • JAX: MD simulations in pure python
    • JIT compiled for GPU
    • Ability to run model step by step
    • ML as a first class citizen, any function can be a neural network
    • JAX
    • Paper
  • Katie Boumann: Black Hole discovery
  • Alán Aspuru-Guzik
    • ML and MD
    • SMILES
    • Autoencoders for drug discovery
    • "Self driving laboratory" = least amount of experiments, optimal outcome
    • ChemOS
    • The author warns against the use of SMILES due to grammar constraints and suggests that SELFIES be used instead, though SELFIES is feature-incomplete.
  • Lenka Zdeborova
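The JAX points above can be illustrated with a minimal sketch (my own, not the speakers' MD code): a pure Python step function, JIT-compiled for the accelerator, run step by step, with gradients available through the same transforms.

```python
import jax
import jax.numpy as jnp

@jax.jit
def step(position, velocity, dt=0.01):
    """One semi-implicit Euler step of a particle in a harmonic potential."""
    force = -position                  # F = -x for U(x) = x^2 / 2
    velocity = velocity + dt * force
    position = position + dt * velocity
    return position, velocity

pos, vel = jnp.array([1.0, 0.0]), jnp.array([0.0, 1.0])
for _ in range(100):
    pos, vel = step(pos, vel)          # run the model step by step

# "ML as a first-class citizen": jax.grad applies to any pure function,
# so the potential's forces come from autodiff just like a network's gradients
energy = lambda x: 0.5 * jnp.sum(x ** 2)
grad_energy = jax.grad(energy)
```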

Author: Sam Partee

Created: 2019-12-26 Thu 16:52