Topic Title: Automated machine learning on Platform of Artificial Intelligence
Technical Area: Automated machine learning (AutoML), Learning2Learn, Meta Learning, Transfer Learning
Machine learning has achieved great success in the past decades; in particular, the last few years have seen deep neural networks succeed in many challenging applications, such as speech recognition, image recognition, and machine translation. Along with this success comes a paradigm shift from feature engineering to hyper-parameter tuning, neural architecture design, and transfer learning, all of which still require substantial expert knowledge and ample time. AutoML automates the end-to-end process of applying machine learning to real-world problems; it offers not only simpler solutions, but also faster creation of those solutions, and models that often outperform hand-designed ones.
At Alibaba, tens of thousands of machine learning jobs run on PAI (Platform of Artificial Intelligence) every day. Most of them are offline training jobs, after which the well-tuned models are used in search, advertisement display, commodity recommendation, and other applications. Together these jobs consume tens of thousands of CPU cores and thousands of GPU cards. Besides applications inside Alibaba, PAI also supplies a PaaS service through Aliyun Cloud. Since most PAI users are algorithm developers, we look forward to using advanced AutoML techniques such as learning2learn, Bayesian optimization, reinforcement learning, evolutionary algorithms, and transfer learning to reduce the barriers to entry for machine learning and accelerate algorithm application.
We invite researchers who are either experts in AutoML or keenly aware of the challenges and opportunities their own fields bring to it, to work on this new and active research domain and to build the AutoML product on PAI.
Automated machine learning can target various stages of the machine learning process. On PAI we are interested in the following topics, among others:
- Hyper-parameter optimization of the learning algorithm
- Deep neural network architecture search
- Transfer learning
- Device placement optimization in the operation graph
- Optimization of the optimization algorithms used in neural networks
- Optimization of the activation functions in neural networks
Related Research Topics
1. Hyper-parameter Tuning
In recent years, machine learning models have exploded in complexity and expressibility at the cost of staggering computational costs and a growing number of tuning parameters that are difficult to set by standard optimization techniques. These hyperparameters are inputs to machine learning algorithms that govern how the algorithm’s performance generalizes to new, unseen data; examples of hyperparameters include those that impact model architecture, amount of regularization, and learning rates. The quality of a predictive model critically depends on its hyperparameter configuration, but it is poorly understood how these hyperparameters interact with each other to affect the quality of the resulting model. Consequently, practitioners often default to brute-force methods like random search and grid search.
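As a concrete baseline, the random search mentioned above can be sketched in a few lines. The objective function here is a hypothetical stand-in for a real training-and-validation run (on PAI this would launch an actual training job), and the log-uniform sampling ranges are illustrative assumptions:

```python
import random

def validation_score(learning_rate, l2_reg):
    """Hypothetical stand-in for training a model and returning its
    validation accuracy; a made-up smooth objective peaking near
    lr=0.1, l2=0.01."""
    return 1.0 - (learning_rate - 0.1) ** 2 - (l2_reg - 0.01) ** 2

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample log-uniformly, the usual choice for scale parameters
        # such as learning rates and regularization strengths.
        config = {
            "learning_rate": 10 ** rng.uniform(-4, 0),
            "l2_reg": 10 ** rng.uniform(-5, -1),
        }
        score = validation_score(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best, score = random_search(n_trials=50)
```

Despite its simplicity, random search is a surprisingly strong baseline when only a few of the hyperparameters matter, which is why it remains the default brute-force method in practice.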
Hyper-parameter optimization comprises configuration selection and configuration evaluation. For selection, Bayesian optimization methods are dominant; however, their sequential nature and the curse of dimensionality make them hard to apply in big-data scenarios. For evaluation, there is still no good general-purpose early-stopping algorithm.
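One practical response to the evaluation problem is successive halving: allocate a small training budget to many configurations, keep only the better fraction, and repeat with a larger budget. A minimal sketch, where `partial_score` is a hypothetical stand-in for training a configuration under a given budget:

```python
import random

def partial_score(config, budget):
    """Hypothetical stand-in: validation score after training for
    `budget` epochs. Better configs (higher `quality`) score higher."""
    return config["quality"] * (1 - 0.5 ** budget)

def successive_halving(configs, min_budget=1, eta=2):
    budget = min_budget
    while len(configs) > 1:
        # Evaluate every surviving config at the current budget...
        scored = sorted(configs,
                        key=lambda c: partial_score(c, budget),
                        reverse=True)
        # ...then keep the top 1/eta and grow the budget for survivors.
        configs = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return configs[0]

rng = random.Random(0)
candidates = [{"id": i, "quality": rng.random()} for i in range(16)]
winner = successive_halving(candidates)
```

The appeal is that most of the total budget is spent on promising configurations, while poor ones are stopped after only a cheap partial evaluation.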
2. Neural Architecture Search
Discovering high-performance neural network architectures has required years of extensive research by human experts through trial and error. The combinatorial explosion of the design space makes handcrafted architectures not only expensive to obtain, but also likely to be suboptimal in performance. Recently, there has been a surge of interest in using algorithms to automate the manual process of architecture design. The goal is to find the optimal architecture in a given search space such that validation accuracy on the given task is maximized. Representative architecture search algorithms can be categorized into evolutionary algorithms and reinforcement learning.
When using evolutionary algorithms (EA), each neural network structure is encoded as a string, and random mutations and recombinations of the strings are performed during the search process. When using reinforcement learning (RL), the agent performs a sequence of actions that specifies the structure of the model; this model is then trained and its validation performance is returned as the reward, which is used to update the controller (typically an RNN). Although both EA and RL methods have learned network structures that outperform manually designed architectures, they require significant computational resources. We need highly efficient algorithms to accelerate architecture search, so that it can be applied to daily machine learning jobs on PAI.
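The EA side of this loop can be sketched with a toy search space. Everything here is illustrative: the operation vocabulary, depth, and especially the `fitness` function, which stands in for the expensive step of actually training each candidate and measuring validation accuracy:

```python
import random

rng = random.Random(42)
# Toy search space: each position chooses the operation for one layer.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
DEPTH = 6

def fitness(arch):
    """Hypothetical stand-in for training `arch` and returning its
    validation accuracy; here it simply rewards conv3x3 layers."""
    return sum(op == "conv3x3" for op in arch) / DEPTH

def mutate(arch):
    # Random mutation: resample the operation at one position.
    child = list(arch)
    child[rng.randrange(DEPTH)] = rng.choice(OPS)
    return child

def evolve(pop_size=20, generations=30):
    population = [[rng.choice(OPS) for _ in range(DEPTH)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection: mutate a good parent, replace the worst.
        parent = max(rng.sample(population, 3), key=fitness)
        child = mutate(parent)
        worst = min(range(pop_size), key=lambda i: fitness(population[i]))
        population[worst] = child
    return max(population, key=fitness)

best = evolve()
```

In a real system each `fitness` call is a full (or truncated) training run, which is exactly why the computational cost of EA- and RL-based search is the central obstacle.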
3. Transfer Learning
Many machine learning methods work well only under a common assumption: the training and test data are drawn from the same feature space and the same distribution. When the distribution changes, most statistical models need to be rebuilt from scratch using newly collected training data. In many real-world applications, it is expensive or impossible to re-collect the needed training data and rebuild the models, so it would be desirable to reduce that effort. In such cases, knowledge transfer, or transfer learning, between task domains is attractive. Transfer learning is classified into three settings: inductive transfer learning, transductive transfer learning, and unsupervised transfer learning. So far, transfer learning techniques have mainly been applied to small-scale applications of limited variety; much exploration remains to be done on large-scale data applications.
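The most common practical pattern, fine-tuning, can be sketched on a toy problem: keep a feature representation learned on a source task frozen, and retrain only a small head on the target task's data. Here `source_features` is a hypothetical stand-in for the frozen lower layers of a pre-trained network, and the target task is a made-up 1-D classification problem:

```python
import math
import random

def source_features(x):
    """Hypothetical feature extractor assumed to be pre-trained on a
    source task; in practice, the frozen lower layers of a deep net."""
    return [x, x * x, math.sin(x)]

def train_head(data, lr=0.1, epochs=200):
    """Train only a logistic-regression head on the target-task data,
    keeping the feature extractor fixed."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = source_features(x)
            z = sum(wi * fi for wi, fi in zip(w, feats)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, feats)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wi * fi for wi, fi in zip(w, source_features(x))) + b
    return 1 if z > 0 else 0

# Tiny target task: classify whether x > 0.
rng = random.Random(1)
target_data = [(x, 1 if x > 0 else 0)
               for x in (rng.uniform(-2, 2) for _ in range(40))]
w, b = train_head(target_data)
```

Because only the head is trained, the target task needs far less labeled data than training from scratch, which is the core appeal of inductive transfer in practice.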