Jeong and I attended NIPS 2017 in December 2017. Our notes are as follows.
Takeaways for Professionals
As shown in the statistics shared by the organizers during the opening remarks, the majority of NIPS papers are from academia. Even the papers from industry, which are only a small fraction, are mostly from research organizations. What can professionals take away from this academic conference? In my experience, people from industry can get the following benefits from NIPS.
 Cutting-edge research: This might not be applicable in practice immediately, but it can still provide important perspectives and directions on each problem.
 Recruiting: I would say that 90% of sponsors are focused on hiring. All the big companies had their after-parties (a.k.a. recruiting events).
 Networking: For some people, this is the most important benefit of NIPS. With over 7,000 attendees, NIPS 2017 was the largest academic conference in machine learning. Every day I enjoyed conversations with many people in the same field at the poster sessions, at the after-parties, and even on the way back home in an UberPool.
Technical Trends
We noticed the following technical trends.
 Meta-learning
 Interpretability
 ML systems (or systems for ML)
 Bayesian modeling
 Unsupervised learning
 Probabilistic programming
Below are areas that I would like to investigate further in 2018.
 Model Interpretation
 Attention models
 Online learning
 Reinforcement learning
Detailed Session-by-Session Notes
Below are more detailed notes:
On 12/4 (Mon)
Tutorials
Title  Comments 

Deep Learning: Practice and Trends  Very good summary of deep learning's current status and trends. CNNs, RNNs, adversarial networks and unsupervised domain adaptation are closer to actual application. These models should be in professionals' toolboxes. Meta-learning and graph networks are interesting but further away from application.
Deep Probabilistic Modeling with Gaussian Processes  This talk raises an important point. In real-world applications, we need to know not only pointwise predictions but also the level of uncertainty in those predictions to support decision making.
Geometric Deep Learning on Graphs and Manifolds by Michael Bronstein  This talk focuses on an interesting trend in deep learning: applying deep learning to graph data. In my opinion, there is still a long way to go before this field produces real applications.
Opening Remarks / Invited Talk
Title  Comments 

Opening Remarks & Powering the next 100 years  The opening remarks included several interesting statistics about NIPS, which show that it is a very academia-centric conference. The invited talk explained the huge amount of energy humans need and the limitations of fossil fuels and low-carbon technologies. It presented some ideas on how machine learning can help develop new energy sources (fusion) over the next 100 years and have a big impact, including exploration and inference on experimental data, and incorporating the preferences of human domain experts into the ML approach. Several Bayesian approaches were mentioned. The talk is about applied machine learning in physics, which can have a large impact on the world. Thanks to many open-source frameworks, it is getting much easier to apply ML to different problems. ML is becoming a major tool and will have a huge impact across different domains.
Poster Sessions
Title  Comments 

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Understanding and Improvement  Google's blog post and paper on understanding deep learning models. The method can also be used to improve prediction performance. The key idea is to use Singular Vector Canonical Correlation Analysis (SVCCA) to analyze hidden-layer representations.
DropoutNet: Addressing Cold Start in Recommender Systems  This focuses only on the item cold start. It needs a metadata-based vector representation of new items.
LightGBM: A Highly Efficient Gradient Boosting Decision Tree  This paper explains the implementation of LightGBM. It uses a different approximation approach from XGBoost's.
Discovering Potential Correlations via Hypercontractivity  An interesting idea for finding potential relationships in a subset of the data.
Other interesting papers  * Learning Hierarchical Information Flow with Recurrent Neural Modules * Learning ReLUs via Gradient Descent * Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling * Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems
On 12/5 (Tue)
Invited Talk
Title  Comments 

Why AI Will Make it Possible to Reprogram the Human Genome  This is one of the most impactful areas of AI/DL. Lately, AI/DL has been used to tackle many challenges in healthcare and shown some promising results. 
Test of Time Award: Random Features for Large-Scale Kernel Machines  This was the spotlight talk of NIPS 2017. It stirred up a lot of discussion online. I highly recommend that you watch the video. Points from both sides of the discussion are valid. Some related discussions: "Yann LeCun's rebuttal to Ali's talk" and "Alchemy, Rigour and Engineering".
The Trouble with Bias  This is a good topic. The data collection and creation process can introduce strong, undesirable bias into a data set. ML algorithms can reproduce and even reinforce such bias. This is more than a technical problem.
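The Test of Time paper in the table above approximates a kernel with a random feature map, so that a linear method on the features behaves roughly like a kernel method. Below is a minimal sketch of that idea for the Gaussian (RBF) kernel, in plain Python. The function names, the bandwidth `sigma`, and the number of features are illustrative assumptions, not code from the paper or the talk.

```python
import math
import random


def make_random_features(dim, num_features, sigma, seed=0):
    """Draw random projection weights and phases for an RBF kernel.

    Assumes the kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)),
    whose Fourier transform is a Gaussian with standard deviation 1 / sigma.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, phases


def feature_map(x, weights, phases):
    """Map x to sqrt(2 / D) * cos(w . x + b) for each of the D random features."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(weights, phases)]


def rbf_kernel(x, y, sigma):
    """Exact Gaussian kernel, used here only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical inputs, chosen only for illustration.
x = [0.5, -1.0, 0.3]
y = [0.1, -0.4, 0.9]
weights, phases = make_random_features(dim=3, num_features=5000, sigma=1.0)
zx = feature_map(x, weights, phases)
zy = feature_map(y, weights, phases)

# The inner product of the feature maps approximates the kernel value.
approx = sum(a * b for a, b in zip(zx, zy))
exact = rbf_kernel(x, y, sigma=1.0)
```

The approximation error shrinks roughly with the square root of the number of random features, so `approx` and `exact` should be close here but not identical.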
Poster Sessions
Title  Comments 

A Unified Approach to Interpreting Model Predictions  Uses expectations and Shapley values to interpret model predictions. It unifies several previous approaches, including LIME. https://github.com/slundberg/shap
Positive-Unlabeled Learning with Non-Negative Risk Estimator  One-class classification is very useful in the real world, e.g. ad clicks, content views, etc. This paper uses a different loss function for PU learning.
An Applied Algorithmic Foundation for Hierarchical Clustering  There were several papers on hierarchical clustering; this is just one of them. Hierarchical clustering is also very useful in the real world. This paper focuses more on the foundation (the objective function) of the problem.
Affinity Clustering: Hierarchical Clustering at Scale  Another hierarchical clustering paper. It is a bottom-up method that makes many merge decisions in each round.
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results  This is an interesting semi-supervised deep learning approach. I feel it uses the student to prevent overfitting. Teacher and student improve each other in a virtuous cycle.
Unbiased estimates for linear regression via volume sampling  Choosing samples wisely can yield performance similar to (not much worse than) using the entire data set. This is useful in scenarios where labels are costly to obtain.
A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control  There were several papers on multi-armed bandits (MAB); this is one of them. MAB can be very useful in website optimization.
Other interesting papers  * Streaming Weak Submodularity: Interpreting Neural Networks on the Fly * Generalization Properties of Learning with Random Features 
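To make the Shapley-value idea from "A Unified Approach to Interpreting Model Predictions" in the table above more concrete, here is a minimal sketch that computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over all feature orderings. It is a simplification under stated assumptions: "missing" features are replaced by a fixed baseline value rather than by an expectation over data, the toy model is hypothetical, and this is not the algorithm used by the shap library, which relies on approximations to scale beyond a handful of features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every ordering of the features.

    Features not yet "switched on" keep their baseline value (an assumption
    made for this sketch). Cost grows factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # switch feature i to its real value
            new = model(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the additive feature gets its full effect, the two interacting features split their joint effect equally, and the attributions sum to the difference between the prediction at `x` and the prediction at the baseline.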
On 12/6 (Wed)
Invited Talk
Title  Comments 

The Unreasonable Effectiveness of Structure  This talk discussed structure in inputs and outputs, and then presented a way to describe "structure" in data (Probabilistic Soft Logic, http://psl.linqs.org/).
Deep Learning for Robotics  If you work in the robotics domain, this is a must-attend talk. It discussed many unsolved pieces of the AI robotics puzzle and how DL (deep reinforcement learning, meta-learning, etc.) can help. Some of the ideas might be useful in other domains.
Poster Sessions
Title  Comments 

Clustering with Noisy Queries  This paper describes and analyzes a way to gather answers for a clustering problem. Instead of asking "does element u belong to cluster A?", the paper suggests asking "do elements u and v belong to the same cluster?"
End-to-End Differentiable Proving  A very interesting paper that tries to combine neural networks with a first-order logic expert system. It learns vector representations of symbols.
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games  Looks like a fun place to try AI :).
Attention Is All You Need  A new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. 
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles  Measuring uncertainty is very important. This paper describes a way (a simple non-Bayesian baseline) to measure it.
Other interesting papers  * Train longer, generalize better: closing the generalization gap in large batch training of neural networks * Unsupervised Image-to-Image Translation Networks * A simple neural network module for relational reasoning * Style Transfer from Non-Parallel Text by Cross-Alignment
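The attention mechanism that "Attention Is All You Need" in the table above builds on can be sketched in a few lines. The following plain-Python version of scaled dot-product attention (softmax of the scaled query-key dot products, used to weight the values) is only an illustration: it omits multiple heads, masking, and learned projections, and the function names and toy inputs are assumptions of this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights).

    Each query is compared with every key, the scores are divided by
    sqrt(d_k), and the softmax weights are used to average the values.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy inputs: two queries attending over three key/value pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outputs, weights = attention(queries, keys, values)
```

Each row of `weights` sums to one, so every output is a weighted average of the value vectors, with more weight on the keys that align with the query.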
On 12/7 (Thu)
Invited Talk
Title  Comments 

Learning State Representations  This is a very interesting talk. It tried to peel back the layers of how humans make decisions and learn. The researcher also designed experiments to test the hypothesis that "we cluster experiences together into task states based on similarity and learning happens within a cluster, not across cluster borders", and then tried to design a model structure to represent such a cluster (state).
On Bayesian Deep Learning and Deep Bayesian Learning  This talk is about combining Bayesian learning and deep learning. This topic can be very useful in the future. The talk also covered several projects in this area.
Symposium – Interpretable ML
Title  Comments 

About this symposium  I think interpretability is a very important part of modeling. As mentioned in one talk of this symposium, interpretability is not a purely computational problem and goes beyond technology. The final goal is still to untangle (understand) causal impact; model interpretability can be valuable in at least two ways: debugging model predictions and helping generate hypotheses for controlled experiments.
Invited talk: The role of causality for interpretability  This talk discussed how to use causality in model interpretability.
Invited talk: Interpretable Discovery in Large Image Data Sets  This talk presented DEMUD (SVD-based, plus explanations), a method for interpreting image data sets.
Poster  * Detecting Bias in Black-Box Models Using Transparent Model Distillation * The Intriguing Properties of Model Explanations * Feature importance scores and lossless feature pruning using Banzhaf power indices
Debate about whether or not interpretability is necessary for machine learning  An interesting debate about interpretability. Worth watching.
Other Resources
NIPS videos, slides and notes are available as follows.
 NIPS 2017 Proceedings
 Slides
 Videos
 Curated Resources
 Notes

 NIPS 2017 – Day 1 Highlights by Emmanuel Ameisen
 NIPS 2017 – Day 2 Highlights by Emmanuel Ameisen
 NIPS 2017 – Day 3 Highlights by Emmanuel Ameisen
 Highlights from My First NIPS by Ryan Rosario
 NIPS 2017 Notes by David Abel (pdf)
 NIPS 2017 Reports by Viktoriya Krakovna
 NIPS 2017 notes and thoughts by Olga Liakhovich