Up to Speed on AI & Deep Learning: November Update

Sharing some of the latest research, announcements, and resources on deep learning.

Continuing our series of deep learning updates, we pulled together some of the awesome resources that have emerged since our last post. In case you missed it, you can find all past updates here. As always, this list is not comprehensive, so let us know if there’s something we should add, or if you’re interested in discussing this area further. If you’re a machine learning practitioner or student, join our Talent Network here to get exposed to awesome ML opportunities.

Research & Announcements

OpenPose: Real-time multi-person keypoint detection library for body, face, and hands by CMU. OpenPose is a library for real-time multi-person keypoint detection and multi-threading written in C++ using OpenCV and Caffe.

An extension: OpenPose from CMU implemented using TensorFlow with Custom Architecture for fast inference. It changes the network structure to enable real-time processing on the CPU or on low-power embedded devices.

‘Mind-reading’ AI: Neural Encoding and Decoding with Deep Learning for Dynamic Natural Vision by Wen et al. of Purdue Engineering. Researchers have demonstrated how to decode what the human brain is seeing by using artificial intelligence to interpret fMRI scans from people watching videos, representing a sort of mind-reading technology. YouTube video here. Original paper here.

Dynamic Routing Between Capsules by Hinton et al. of Google Brain. New paper by Geoffrey Hinton. From the abstract: _A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or object part. We show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits._ PyTorch implementation of the paper here.
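To make the idea of an activity vector a bit more concrete, here is a minimal sketch of the "squash" nonlinearity described in the paper, which keeps a capsule's direction while mapping its length into the range [0, 1) so the length can be read as the probability that the entity is present. The function names and sample vectors are illustrative assumptions, not code from the paper or from the linked PyTorch implementation.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction.

    Implements v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    eps is an illustrative guard against division by zero.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a vector, read here as 'presence probability'."""
    return math.sqrt(sum(x * x for x in v))


# A weakly activated capsule is squashed toward zero length...
weak = squash([0.1, 0.0, 0.1])
# ...while a strongly activated one ends up with length close to 1.
strong = squash([3.0, 4.0, 0.0])

print(round(length(weak), 4), round(length(strong), 4))
```

Short input vectors come out with a length near zero and long ones with a length near one, while the ratio between components (the "instantiation parameters") is preserved.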

DeepXplore: Automated Whitebox Testing of Deep Learning Systems by Pei et al. of Columbia Engineering.

Researchers at Columbia and Lehigh universities have come up with a way to automatically error-check the thousands to millions of neurons in a deep learning neural network. Their tool feeds confusing, real-world inputs into the network to expose rare instances of flawed reasoning by clusters of neurons.
DeepXplore efficiently finds thousands of incorrect corner case behaviors (e.g., self-driving cars crashing into guard rails and malware masquerading as benign software) in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data.

Voice Conversion with Non-Parallel Data (i.e. Speaking like Kate Winslet) by Ahn and Park. Deep neural networks for voice conversion (voice style transfer) in TensorFlow. GitHub repo.

Uber AI Labs Open Sources Pyro, a Deep Probabilistic Programming Language. Website and docs here.

Why probabilistic modeling? To correctly capture uncertainty in models and predictions for unsupervised and semi-supervised learning, and to provide AI systems with declarative prior knowledge.
Why (universal) probabilistic programs? To provide a clear and high-level, but complete, language for specifying complex models.
Why deep probabilistic models? To learn generative knowledge from data and reify knowledge of how to do inference.
Why inference by optimization? To enable scaling to large data and leverage advances in modern optimization and variational inference.
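As a rough illustration of "inference by optimization", the sketch below fits a Gaussian approximation to a posterior by gradient ascent on the evidence lower bound (ELBO). It is plain Python, not Pyro code, and it makes several simplifying assumptions: a toy model with prior mu ~ Normal(0, 1) and observations x_i ~ Normal(mu, 1), made-up data, and analytic ELBO gradients in place of the stochastic gradient estimates a real probabilistic programming system would use.

```python
import math


def fit_gaussian_posterior(data, steps=500, lr=0.05):
    """Fit q(mu) = Normal(m, s^2) to the posterior of the toy model.

    For this model the ELBO is, up to a constant,
      -0.5*(m^2 + s^2) - 0.5*sum((x - m)^2) - 0.5*n*s^2 + log(s),
    so its gradients are available in closed form.
    """
    n = len(data)
    total = sum(data)
    m, rho = 0.0, 0.0  # s = exp(rho) keeps the standard deviation positive
    for _ in range(steps):
        s2 = math.exp(2.0 * rho)
        grad_m = total - (n + 1) * m      # d ELBO / d m
        grad_rho = 1.0 - (n + 1) * s2     # d ELBO / d rho
        m += lr * grad_m                  # gradient ascent step
        rho += lr * grad_rho
    return m, math.exp(rho)


data = [1.2, 0.8, 1.5, 0.9, 1.1]  # made-up observations
m, s = fit_gaussian_posterior(data)

# This toy model is conjugate, so the exact posterior is known and
# can be used to check what the optimizer found.
exact_mean = sum(data) / (len(data) + 1)
exact_std = math.sqrt(1.0 / (len(data) + 1))
print(m, s, exact_mean, exact_std)
```

Because the toy model is conjugate, the optimized mean and standard deviation can be compared against the exact posterior. The appeal of the optimization view is that the same loop still applies when no closed-form answer exists.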

Resources, Tutorials & Data

Deep RL Bootcamp by UC Berkeley. _This two-day long bootcamp will teach you the foundations of Deep RL through a mixture of lectures and hands-on lab sessions, so you can go on and build new fascinating applications using these techniques and maybe even push the algorithmic frontier._ Great topics, content, and speakers — the lectures and labs are available online.

Latest Deep Learning OCR with Keras and Supervisely in 15 minutes by Deep Systems. This tutorial is a gentle introduction to building a modern text recognition system using deep learning in 15 minutes.

AMA with two of the team members at DeepMind who developed AlphaGo. From the AMA: _AlphaGo Zero uses a quite different approach to deep RL than typical (model-free) algorithms such as policy gradient or Q-learning. By using AlphaGo search we massively improve the policy and self-play outcomes, and then we apply simple, gradient-based updates to train the next policy + value network. This appears to be much more stable than incremental, gradient-based policy improvements that can potentially forget previous improvements._

BTW: How to Read a Paper by S. Keshav of University of Waterloo. Not newly published, but very useful as the number of new ML papers continues to grow rapidly.

By Isaac Madan. Isaac is an investor at Venrock (email). If you’re interested in deep learning or there are resources I should share in a future newsletter, I’d love to hear from you. If you’re a machine learning practitioner or student, join our Talent Network here to get exposed to awesome ML opportunities.

Requests for Startups is a newsletter of entrepreneurial ideas & perspectives by investors, operators, and influencers.
