ML Digest: Going Hands-On with Aurélien Géron.
Welcome to this week's Best of Machine Learning Digest. In this weekly newsletter, we resurface some of the best Machine Learning resources posted in the past week. This time we received 58 submissions, and we'd love for you to get involved and push that number higher.
Get involved and educate hundreds of ML Engineers around the world. Start now.

Papers

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
 
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
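The paper's core idea, casting every NLP task as text in, text out, can be illustrated with a tiny formatting helper. This is a hedged sketch: the task prefixes below mirror examples given in the paper, but the function itself is hypothetical, not the authors' code.

```python
def to_text_to_text(task: str, text: str) -> str:
    """Prepend a task prefix so one seq2seq model can handle every task.

    Follows the T5 convention of plain-text task prefixes; the prefix
    strings mirror examples from the paper.
    """
    prefixes = {
        "translation": "translate English to German: ",
        "summarization": "summarize: ",
        "acceptability": "cola sentence: ",
    }
    return prefixes[task] + text

# Every problem becomes the same interface: feed a string, decode a string.
print(to_text_to_text("summarization", "state authorities dispatched aid"))
```

Because inputs and outputs are always strings, the same model, loss, and decoding procedure apply unchanged across tasks, which is what enables the paper's systematic comparison.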
 
Self-Paced Contextual Reinforcement Learning
 
Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.
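The gist of gradually shifting the training distribution toward a target context distribution can be sketched in a few lines. Note this is a deliberate simplification: the paper has the agent *optimize* the intermediate distribution under a relative-entropy constraint, whereas here the interpolation schedule (`pace`) is fixed by hand and the names are made up for illustration.

```python
import random

def curriculum_contexts(n_steps, init_mu, target_mu, init_sigma, target_sigma, pace=0.05):
    """Yield training contexts from a Gaussian that drifts toward the target.

    Simplified stand-in for self-paced curriculum learning: start from an
    easy, controlled context distribution and move it gradually toward the
    (possibly broad or sharp) target distribution.
    """
    mu, sigma = init_mu, init_sigma
    for _ in range(n_steps):
        yield random.gauss(mu, sigma)
        mu += pace * (target_mu - mu)          # drift mean toward the target
        sigma += pace * (target_sigma - sigma) # adapt spread toward the target
```

Early samples come from the easy initial distribution; late samples are effectively drawn from the target, so the agent never faces the hardest contexts cold.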
 

Projects

This week, 14 Projects were posted on Best of ML. In the following, we're showing you the top two posts of this week.
CIFAR-10H - Human guess distribution soft labels for CIFAR-10 (Dataset Release)
 
CIFAR-10H is a new dataset of soft labels reflecting human perceptual uncertainty for the 10,000-image CIFAR-10 test set, which we are releasing today.
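Soft labels like CIFAR-10H's human guess distributions drop straight into the usual training loss: replace the one-hot target in cross-entropy with the full label distribution. A minimal pure-Python sketch (the function name and toy numbers are illustrative, not part of the release):

```python
import math

def soft_cross_entropy(pred_probs, soft_target):
    """Cross-entropy against a full label distribution instead of a one-hot.

    `soft_target` could be one CIFAR-10H row: the fraction of human
    annotators who guessed each class for a single test image.
    """
    eps = 1e-12  # guard against log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(soft_target, pred_probs))

# With a one-hot target this reduces to ordinary cross-entropy:
one_hot = [0.0, 0.0, 1.0]
soft = [0.1, 0.2, 0.7]    # humans were unsure between classes
pred = [0.2, 0.2, 0.6]
print(soft_cross_entropy(pred, one_hot))  # -log(0.6), about 0.51
print(soft_cross_entropy(pred, soft))
```

Training against the soft targets penalizes a model less for confusions that humans also make, which is exactly the perceptual uncertainty the dataset captures.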
 
Fast Super Resolution GAN
 
From Reddit: "I've been super intrigued by image super resolution problems. Reading online, I found the SRGAN paper to be interesting, especially how the PSNR and SSIM metrics are unreliable when compared to human perception of quality. I wanted to create a faster version of the SRGAN, so I decided to use a MobileNet as the generator. This idea is somewhat inspired by Realtime Image Enhancement, Galteri et al. I want to use it to upsample low quality videos, for scenarios when you may not have access to high speed internet. You can leverage the GPU to do synthetic super resolution. I would appreciate any ideas towards increasing speed/quality of this project."
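The speedup from swapping in a MobileNet-style generator comes largely from depthwise-separable convolutions. A back-of-the-envelope parameter count (a sketch using the standard formulas, not the project's actual architecture) shows why:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise mix."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 64
std = conv_params(k, c_in, c_out)                  # 36,864
sep = depthwise_separable_params(k, c_in, c_out)   # 4,672
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

Roughly an order of magnitude fewer weights (and multiply-adds) per layer, which is what makes real-time super resolution on modest hardware plausible.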
 

Books

This week, 2 Books were posted on Best of ML. In the following, we're showing you both of them.
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
 
Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.

By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks. With exercises in each chapter to help you apply what you’ve learned, all you need is programming experience to get started.
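The book's starting point, simple linear regression, fits in a few lines of plain Python. This is a minimal least-squares sketch under the textbook closed-form solution, not code from the book:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b in one dimension."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # data lies exactly on y = 2x + 1
print(a, b)  # 2.0 1.0
```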
 
Feature Engineering for Machine Learning
 
Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.

Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples.
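One of the representation problems the book covers, turning raw text into numbers, can be sketched as a minimal bag-of-words count. This pure-Python version is an illustration of the idea, not the book's code (which uses the packages above):

```python
from collections import Counter

def bag_of_words(docs):
    """Map each document to a vector of word counts over a shared vocabulary."""
    vocab = sorted({w for doc in docs for w in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([counts.get(w, 0) for w in vocab])
    return vocab, vectors

vocab, vecs = bag_of_words(["the cat sat", "the cat ate the cat"])
print(vocab)  # ['ate', 'cat', 'sat', 'the']
print(vecs)   # [[0, 1, 1, 1], [1, 2, 0, 2]]
```

Each document becomes a fixed-length numeric vector, which is the format downstream models expect.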
 

Blog Posts

This week, 38 Blog Posts were posted on Best of ML. In the following, we're showing you the top two posts of this week.
How to Improve Training your Deep Neural Network in Tensorflow 2.0
 
When it comes to building and training deep neural networks, you need to set a large number of hyperparameters. Setting those parameters right has a tremendous influence on the success of your net, and also on the time you spend heating the air, a.k.a. training your model. One of the parameters you always have to choose is the so-called learning rate (also known as the update rate or step size). For a long time, selecting it well was mostly trial and error, or black art. However, there exists a smart yet simple technique for finding a decent learning rate, which I guess became popular through its use in fastai.
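The technique in question, the learning-rate range test popularized by fastai, can be sketched on a toy problem: ramp the learning rate up exponentially, record the loss after each step, and pick a rate somewhat below the point where the loss starts to blow up. This is a simplified illustration on a 1-D quadratic, not the post's code:

```python
def lr_range_test(grad, w0, lr_min=1e-4, lr_max=10.0, steps=50):
    """Exponentially sweep the learning rate, logging (lr, loss-after-step)."""
    factor = (lr_max / lr_min) ** (1 / (steps - 1))
    w, lr, history = w0, lr_min, []
    for _ in range(steps):
        w = w - lr * grad(w)          # one SGD step at the current rate
        history.append((lr, w * w))   # loss = w^2 for this toy problem
        lr *= factor
    return history

# Toy problem: minimize w^2, gradient 2w. A single step shrinks the loss
# for 0 < lr < 1 and overshoots (diverges) for lr > 1.
hist = lr_range_test(lambda w: 2 * w, w0=1.0)
best_lr = min(hist, key=lambda t: t[1])[0]
print(f"loss bottoms out around lr = {best_lr:.3f}")
```

In practice you would plot `history`, look for the steepest downward slope, and choose a rate an order of magnitude below where the curve turns upward, rather than taking the literal minimum.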
 
Book Review: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and…
 
If you’re comfortable with coding in Python and want a quick introduction to both classic and deep learning techniques in Python from an experienced practitioner, “Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems” by Aurélien Géron just might be the book for you!
 






Monn Ventures · Winterthurerstrasse 649 · Zürich 8051 · Switzerland

Email Marketing Powered by Mailchimp