Neural Architecture Search: Foundations and Trends
by Colin White, Debadeepta Dey
In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas from computer vision to natural language understanding to speech recognition. While many factors played into the rise of deep learning, the design of high-performing neural architectures has been crucial to its success. Neural architecture search (NAS), the process of automating the design of architectures for a given task, has seen rapid development with over 1200 papers released in the last two years, resulting in human-designed architectures being outpaced by automatically-designed architectures. In this tutorial, we give an organized and easy-to-follow guide to NAS. We survey search spaces, black-box optimization techniques, weight sharing techniques, speedup techniques, extensions, and applications. For each topic, we take care to describe the fundamentals in addition to cutting-edge trends. We conclude by describing best practices for NAS research, resources, and promising future directions.
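The black-box optimization loop at the heart of many NAS methods can be sketched in a few lines: sample an architecture from the search space, evaluate it, keep the best. The sketch below uses random search over a toy layer-based search space; the operation names, widths, and the `proxy_accuracy` stand-in for expensive training and validation are all illustrative assumptions, not part of any particular NAS system.

```python
import random

# A toy search space: an architecture is a list of (operation, width)
# choices. The operation and width vocabularies are illustrative.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
WIDTHS = [16, 32, 64]

def sample_architecture(n_layers=4, rng=random):
    """Draw one architecture uniformly at random from the search space."""
    return [(rng.choice(OPS), rng.choice(WIDTHS)) for _ in range(n_layers)]

def proxy_accuracy(arch, rng=random):
    """Stand-in for the expensive step: a real NAS run would train
    `arch` and return its validation accuracy. Here we fake a score
    that mildly prefers wider layers, plus noise."""
    width_score = sum(w for _, w in arch) / (len(arch) * max(WIDTHS))
    return 0.5 + 0.4 * width_score + rng.uniform(-0.05, 0.05)

def random_search(budget=20, seed=0):
    """Black-box NAS baseline: sample, evaluate, keep the incumbent."""
    rng = random.Random(seed)
    best_arch, best_acc = None, float("-inf")
    for _ in range(budget):
        arch = sample_architecture(rng=rng)
        acc = proxy_accuracy(arch, rng=rng)
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch, best_acc
```

Random search is a standard baseline in the NAS literature; the black-box methods surveyed in the tutorial (evolutionary search, Bayesian optimization, reinforcement learning) replace the uniform sampling with a smarter proposal mechanism while keeping this same evaluate-and-compare loop.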
Automated Reinforcement Learning (AutoRL)
by Aleksandra Faust
Training reinforcement learning (RL) systems that perform well on real-world end tasks is difficult for a number of reasons. One significant reason is that engineers and applied researchers face a multitude of design choices when mapping the real-world problem into the Partially Observable Markov Decision Process (POMDP) abstraction, which is often insufficient to capture all aspects of the problem. As a result, the engineer proceeds by trial and error, optimizing the RL system design until satisfactory performance is reached. This is a tiring, time-consuming, and inefficient process. Learning to learn and AutoRL automate parts of this process, allowing users to focus on higher-level design questions. In this tutorial, we will go over the currently established techniques, such as environment, algorithm, representation, and reward learning, and discuss the available tools, how and why they work, and when they fail. Finally, since this is an emerging field, we will conclude with a future outlook and the open problems facing the field.
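The trial-and-error loop described above can be framed as a black-box search over a design choice, with the end-task return as the objective. The toy sketch below tunes the exploration rate of an ε-greedy agent on a three-armed bandit; the bandit, the candidate values, and all function names are illustrative stand-ins for a real RL system, not any established AutoRL tool.

```python
import random

def run_bandit(epsilon, n_steps=2000, seed=0):
    """Stand-in for an RL training run: epsilon-greedy action selection
    on a 3-armed Gaussian bandit. Returns the total reward collected,
    playing the role of the end-task score."""
    rng = random.Random(seed)
    true_means = [0.2, 0.5, 0.8]           # hidden arm qualities
    estimates, counts = [0.0] * 3, [0] * 3
    total = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:          # explore
            arm = rng.randrange(3)
        else:                               # exploit current estimate
            arm = max(range(3), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 0.1)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total

def tune_epsilon(candidates=(0.01, 0.05, 0.1, 0.3, 0.5)):
    """AutoRL-style outer loop: treat the RL system as a black box and
    pick the design choice with the best end-task return."""
    return max(candidates, key=run_bandit)
```

Practical AutoRL methods replace this grid of candidates with learned or adaptive outer loops (e.g., population-based or gradient-based hyperparameter search), and apply the same idea to richer design choices such as reward shaping or environment design.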
Efficient Neural Architecture Search
by Tejaswini Pedapati, Martin Wistuba
The growing interest in the automation of deep learning has led to the development of a wide variety of automated methods for Neural Architecture Search. However, early neural architecture search algorithms were computationally intensive and took several GPU days. Training a candidate network is the most expensive step of the search. Rather than training each candidate network from scratch, subsequent algorithms proposed sharing parameters amongst the candidate networks. But these parameter-sharing algorithms had their own drawbacks. In this tutorial, we give an overview of several of these one-shot algorithms, their drawbacks, and how to combat them. Later advances accelerated the search by training fewer candidates, using techniques such as zero-shot, few-shot, and transfer learning. Using only characteristics of a randomly initialized neural network, some search algorithms were able to find a well-performing model; rather than searching from scratch, other methods leveraged transfer learning. We cover each of these flavors of algorithms that expedited Neural Architecture Search.
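To make the zero-shot idea concrete, one family of training-free scores counts how many distinct ReLU activation patterns a randomly initialized network produces on random inputs, loosely in the spirit of proxies such as NASWOT: more distinct patterns suggests a more expressive architecture. The sketch below is a simplified illustration under that assumption, not a faithful reimplementation of any published method.

```python
import random
import math

def relu_pattern(weights, x):
    """Binary on/off pattern of every ReLU unit in a small MLP at input x."""
    pattern, h = [], x
    for layer in weights:                      # layer: list of weight rows
        h = [sum(w * v for w, v in zip(row, h)) for row in layer]
        pattern.extend(1 if v > 0 else 0 for v in h)
        h = [max(0.0, v) for v in h]           # ReLU before the next layer
    return tuple(pattern)

def zero_cost_score(layer_sizes, n_probes=32, seed=0):
    """Training-free proxy: count distinct activation patterns that a
    randomly initialized net produces on random probe inputs. No
    gradient step or label is ever used."""
    rng = random.Random(seed)
    dims = list(layer_sizes)
    # Gaussian init scaled by fan-in, one weight matrix per layer.
    weights = [
        [[rng.gauss(0, 1 / math.sqrt(m)) for _ in range(m)] for _ in range(n)]
        for m, n in zip(dims, dims[1:])
    ]
    patterns = set()
    for _ in range(n_probes):
        x = [rng.gauss(0, 1) for _ in range(dims[0])]
        patterns.add(relu_pattern(weights, x))
    return len(patterns)
```

A network with only two ReLU units can realize at most four patterns, so it is capped at a low score, while a wider network typically separates far more probes; ranking candidates by such a score lets a search discard weak architectures without training any of them.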
Learning Curves for Decision Making in Supervised Machine Learning
by Felix Mohr and Jan N. van Rijn
Learning curves have commonly been adopted in machine learning to assess the performance of a learning algorithm with respect to a certain resource, e.g., the number of training examples or the number of training iterations. Learning curves have important applications in several contexts of machine learning, most importantly data acquisition, early stopping of model training, and model selection. For example, by modeling the learning curve, one can assess at an early stage whether an algorithm and hyperparameter configuration have the potential to be a suitable choice, often speeding up the algorithm selection process. Some models answer the binary decision question of whether a certain algorithm at a certain budget will outperform a certain reference performance, whereas more complex models predict the entire learning curve of an algorithm. This tutorial presents a framework that categorizes learning curve approaches using three criteria: the decision situation that they address, the intrinsic learning curve question that they answer, and the type of resources that they use. We present prominent methods from the literature that use learning curves for decision making and classify them within this framework. This tutorial follows an extensive survey paper that we recently published.
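A minimal version of the binary decision question can be sketched by fitting an inverse power law, error(n) ≈ a·n^(−b), to the observed part of a learning curve and extrapolating it to the full budget. The power-law model and the function names below are illustrative assumptions; many parametric families are used in practice.

```python
import math

def fit_power_law(sizes, errors):
    """Fit error(n) ~= a * n**(-b) by linear least squares in log-log
    space, where the model becomes: log e = log a - b * log n."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - slope * mx)
    return a, -slope                      # b is the negated slope

def will_beat_reference(sizes, errors, full_size, reference_error):
    """Binary learning-curve decision: extrapolate the fitted curve to
    the full budget and compare against a reference performance."""
    a, b = fit_power_law(sizes, errors)
    predicted = a * full_size ** (-b)
    return predicted < reference_error
```

If the extrapolated error at the full budget is worse than the reference (e.g., the incumbent model's error), the candidate can be discarded early, which is exactly the decision-making use of learning curves the framework categorizes.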
Statistical Analysis for Benchmarking Experimental Data (Virtual)
by Tome Eftimov, Peter Korošec
In the era of AutoML, comprehensive comparison of the performance of single-objective stochastic optimization algorithms has become an increasingly important task. One of the most common ways to compare the performance of stochastic optimization algorithms is to apply statistical analyses. However, several caveats must still be addressed in order to draw relevant and valid conclusions. First of all, the performance measures should be selected with great care, since some measures can be correlated and their data then feeds into the statistical analyses. Further, the statistical analyses require good knowledge from the user to be applied properly; this knowledge is often lacking, which leads to incorrect conclusions. Next, the standard approaches can be influenced by outliers (e.g., poor runs) or by statistically insignificant differences (solutions within some ε-neighborhood) that exist in the data. This tutorial will provide an overview of current approaches for analyzing algorithm performance, with special emphasis on caveats that are often overlooked. We will show how these can easily be avoided by applying simple principles that lead to Deep Statistical Comparison. The tutorial will not be based on equations, but mainly on examples through which a deeper understanding of statistics will be achieved. Examples will be based on various comparison scenarios for single-objective optimization algorithms. The tutorial will end with a demonstration of a web-service-based framework (i.e., DSCTool) for statistical comparison of single-objective stochastic optimization algorithms. In addition, R clients for performing the analyses will also be presented.
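To illustrate why outliers (e.g., poor runs) can mislead comparisons of raw averages, the sketch below computes Friedman-style average ranks across problems. The rank transform is what makes such nonparametric analyses robust: one catastrophic run moves an algorithm's mean a lot but its rank on that problem by at most one place. The data layout is an illustrative assumption, and ties are ignored for simplicity; this is a teaching sketch, not the Deep Statistical Comparison procedure itself.

```python
def average_ranks(results):
    """results[p][a] is the performance of algorithm a on problem p
    (lower is better). Returns the average rank of each algorithm
    across problems (rank 1 = best on that problem). Ties are broken
    by index order for simplicity."""
    n_algos = len(results[0])
    rank_sums = [0.0] * n_algos
    for row in results:
        order = sorted(range(n_algos), key=lambda a: row[a])
        for rank, algo in enumerate(order, start=1):
            rank_sums[algo] += rank
    return [s / len(results) for s in rank_sums]
```

For instance, if algorithm A beats algorithm B on three of four problems but has one very poor run, A's mean performance looks worse than B's, while A's average rank correctly remains better; comparing ranks instead of raw means is the first step of Friedman-type tests and of rank-based comparison pipelines.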