Guanya Shi 石冠亚

I am a G4 PhD student in the Department of Computing and Mathematical Sciences at Caltech, advised by Profs. Soon-Jo Chung and Yisong Yue. I am also a member of the Center for Autonomous Systems and Technologies and of DOLCIT, which is broadly centered around research pertaining to statistical decision theory, statistical machine learning, and optimization. I also collaborate closely with Profs. Anima Anandkumar, Adam Wierman, and Joel Burdick. Currently I am working at the intersection of machine learning and control theory, and their applications to robotics.

I received my bachelor's degree from Tsinghua University. In summer 2016, I was fortunate enough to be selected for the Stanford UGVR program. I was also an ML research intern in the NVIDIA AI algorithm group in 2020.

CV  /  Email  /  GitHub  /  Google Scholar  /  Twitter  /  LinkedIn

News: I am co-organizing Control Meets Learning, a virtual seminar series on the intersection of control and learning.

News: I was awarded the prestigious Simoudis Discovery Prize at Caltech CMS.

News: Two theory papers about online learning and control were accepted by NeurIPS 2020.

News: Our Neural Swarm paper was accepted by ICRA 2020 and highlighted by Caltech news and Yahoo news.

News: Interviewed by Facebook PyTorch team about learning and control research in robotic systems. [video]

News: Our work on Neural Lander was accepted by ICRA 2019 and reported by Caltech homepage.

Research and Selected Publications

My research interests lie at the intersection of machine learning and control theory, spanning the entire spectrum from theory and foundations, through algorithms, to real-world dynamical systems in robotics.
Neural-Lander Family: Learning Based Nonlinear Provably Stable Control in Multi-Agent and Changing Environments

Neural-Lander: Stable Drone Landing Control using Learned Dynamics
Guanya Shi, Xichen Shi, Michael O'Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
International Conference on Robotics and Automation (ICRA), 2019
[arXiv] [video] [Caltech homepage news] [highlighted by Import AI]

We present a novel deep-learning-based robust nonlinear controller for stable quadrotor control during landing. Our approach blends together a nominal dynamics model coupled with a DNN that learns the high-order interactions, such as the complex interactions between the ground and multi-rotor airflow. To the best of our knowledge, this is the first DNN-based nonlinear feedback controller with stability guarantees that can utilize arbitrarily large neural nets.
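The idea of combining a nominal model with a learned residual can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration: the altitude-only dynamics, the function names, and the `learned_residual_force` stand-in (a fixed formula in place of a trained DNN). It is not the controller from the paper and it carries no stability guarantee.

```python
def learned_residual_force(height):
    # Hypothetical stand-in for a trained network that predicts the
    # extra aerodynamic force near the ground (larger at low height).
    return 0.5 / (height + 0.1)


def control_thrust(desired_accel, height, mass, gravity=9.81,
                   residual=learned_residual_force):
    # Nominal term cancels gravity and tracks the desired acceleration;
    # the learned term compensates for the predicted residual force.
    return mass * (desired_accel + gravity) - residual(height)


def resulting_accel(thrust, height, mass, gravity=9.81,
                    residual=learned_residual_force):
    # Assumed "true" dynamics: nominal model plus the residual force.
    return thrust / mass - gravity + residual(height) / mass


if __name__ == "__main__":
    thrust = control_thrust(desired_accel=0.0, height=0.2, mass=1.5)
    print(resulting_accel(thrust, height=0.2, mass=1.5))
```

When the learned residual matches the true residual, as it does by construction in this toy setup, the commanded thrust yields the desired acceleration up to floating-point error.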

Neural-Swarm: Heterogeneous Multi-Robot Control and Planning Using Learned Interactions
Guanya Shi, Wolfgang Hoenig, Xichen Shi, Yisong Yue, Soon-Jo Chung
International Conference on Robotics and Automation (ICRA), 2020
Journal version submitted to IEEE Transactions on Robotics (T-RO)
[arXiv] [video] [Caltech news] [Yahoo news]

Close-proximity control and planning are challenging due to the complex aerodynamic interaction effects between multirotors. We propose Neural-Swarm, a nonlinear decentralized stable learning-based controller and motion planner for close-proximity flight of heterogeneous multirotor swarms. We employ heterogeneous deep sets to encode multi-vehicle interactions in an index-free manner, enabling better generalization to new formations and varying numbers of vehicles.
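A rough sketch of what an index-free (permutation-invariant) encoding with one feature map per vehicle type might look like is given below. The feature maps `phi_small` and `phi_large`, the readout `rho`, and the vehicle type names are hypothetical placeholders for trained networks, chosen only to show the sum-pooling structure.

```python
import math


def phi_small(rel_state):
    # Hypothetical per-neighbor feature map for "small" vehicles.
    dx, dy, dz = rel_state
    return [dx, dy, dz, math.exp(-(dx * dx + dy * dy + dz * dz))]


def phi_large(rel_state):
    # Hypothetical per-neighbor feature map for "large" vehicles.
    dx, dy, dz = rel_state
    return [2.0 * dx, 2.0 * dy, 2.0 * dz,
            2.0 * math.exp(-(dx * dx + dy * dy + dz * dz))]


PHI = {"small": phi_small, "large": phi_large}


def rho(pooled):
    # Hypothetical readout applied to the pooled features.
    return math.tanh(sum(pooled))


def encode_neighbors(neighbors):
    # neighbors: list of (vehicle_type, relative_state) pairs.
    # Features are summed, so neither the order nor the number of
    # neighbors needs to be fixed in advance.
    pooled = [0.0] * 4
    for kind, rel_state in neighbors:
        feats = PHI[kind](rel_state)
        pooled = [p + f for p, f in zip(pooled, feats)]
    return rho(pooled)
```

Because the pooling is a sum, reordering the neighbor list changes the result only by floating-point rounding, and the same function accepts zero, one, or many neighbors.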

Neural-Fly: Meta-Learning-Based Robust Adaptive Flight Control under Uncertain Wind Conditions
Michael O'Connell, Guanya Shi, Xichen Shi, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
ongoing work
[demo video: a drone flying in Caltech CAST fan wall]

Real-time model learning proves challenging for complex dynamical systems. Deep learning has high representation power but is often too slow to update onboard. On the other hand, adaptive control, which relies on simple linear parameter models, can update as fast as the feedback control loop. We propose an online composite adaptation method that treats outputs from a deep neural network as a set of basis functions capable of representing different wind conditions. Meta-learning techniques are used to optimize the network such that the last layer can be adapted quickly. We validate our approach by flying a drone in an open-air wind tunnel under varying wind conditions.
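To make the "network outputs as basis functions, fast linear last layer" idea concrete, here is a minimal sketch. It swaps in a plain normalized gradient (LMS) update for the composite adaptation law, which is a simplification and not the method described above, and the `basis` function is a hypothetical stand-in for frozen network features.

```python
def basis(x):
    # Hypothetical stand-in for the frozen features of a trained network.
    return [1.0, x, x * x]


def predict(coeffs, x):
    # Linear last layer: prediction is a weighted sum of basis outputs.
    return sum(c * p for c, p in zip(coeffs, basis(x)))


def adapt(coeffs, x, y, lr=0.5):
    # One normalized gradient step on the squared prediction error.
    # Only the linear coefficients change, so the update is cheap.
    phi = basis(x)
    err = predict(coeffs, x) - y
    norm = 1.0 + sum(p * p for p in phi)
    return [c - lr * err * p / norm for c, p in zip(coeffs, phi)]


if __name__ == "__main__":
    coeffs = [0.0, 0.0, 0.0]
    for _ in range(50):
        coeffs = adapt(coeffs, x=2.0, y=3.0)
    print(predict(coeffs, 2.0))
```

Each step shrinks the error on the current sample by a factor strictly between 0 and 1 (for `0 < lr <= 1`), which is why repeated updates on the same measurement converge toward it.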

Foundations of Online Learning and Control Theory

Online Optimization with Memory and Competitive Control
Guanya Shi, Yiheng Lin, Soon-Jo Chung, Yisong Yue, Adam Wierman
Neural Information Processing Systems (NeurIPS), 2020

This paper presents competitive algorithms for a novel class of online optimization problems with memory. We consider a setting where the learner seeks to minimize the sum of a hitting cost and a switching cost that depends on the previous p decisions. This setting generalizes Smoothed Online Convex Optimization. The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio. Further, we show a connection between online optimization with memory and online control with adversarial disturbances. This connection, in turn, leads to a new constant-competitive policy for a rich class of online control problems.
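The cost structure described above can be sketched for scalar decisions as follows. The quadratic forms, the `weights` that combine the previous p decisions, and the zero initial history are assumptions chosen for illustration. The `greedy_step` helper is a naive one-step minimizer, not Optimistic Regularized Online Balanced Descent.

```python
def hitting_cost(y, v, m=1.0):
    # Assumed quadratic hitting cost with minimizer v.
    return 0.5 * m * (y - v) ** 2


def switching_cost(y, history, weights):
    # Assumed form: penalize deviation from a weighted combination of
    # the previous p decisions (history[0] is the most recent one).
    target = sum(w * h for w, h in zip(weights, history))
    return 0.5 * (y - target) ** 2


def total_cost(decisions, minimizers, weights, m=1.0):
    # Sum of hitting and switching costs over the whole sequence,
    # starting from an all-zero history of length p = len(weights).
    history = [0.0] * len(weights)
    cost = 0.0
    for y, v in zip(decisions, minimizers):
        cost += hitting_cost(y, v, m) + switching_cost(y, history, weights)
        history = [y] + history[:-1]
    return cost


def greedy_step(v, history, weights, m=1.0):
    # Closed-form minimizer of hitting + switching cost for one round.
    target = sum(w * h for w, h in zip(weights, history))
    return (m * v + target) / (m + 1.0)
```

With `weights=[1.0]` (p = 1) the switching cost reduces to a squared distance from the previous decision, which is the familiar smoothed online optimization setting.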

The Power of Predictions in Online Control
Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, Adam Wierman
Neural Information Processing Systems (NeurIPS), 2020

We study the impact of predictions in online Linear Quadratic Regulator control with both stochastic and adversarial disturbances in the dynamics. In both settings, we characterize the optimal policy and derive tight bounds on the minimum cost and dynamic regret. Perhaps surprisingly, our analysis shows that the conventional greedy MPC approach is a near-optimal policy in both stochastic and adversarial settings. Specifically, for length-T problems, MPC requires only O(log T) predictions to reach O(1) dynamic regret, which matches (up to lower-order terms) our lower bound on the required prediction horizon for constant regret.
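As a rough illustration of how an MPC policy uses a short window of disturbance predictions, the sketch below handles a scalar system x' = a·x + b·u + w with stage cost q·x² + r·u². The scalar setting, the function name, and the `terminal_weight` parameter are assumptions made for this example. It solves the k-step problem exactly by a backward recursion and returns only the first action.

```python
def mpc_action(x, predictions, a, b, q, r, terminal_weight):
    # predictions: [w_0, ..., w_{k-1}], the next k disturbances.
    # The cost-to-go is kept as P * x**2 + 2 * p * x + const.
    P, p = terminal_weight, 0.0
    # Backward pass over w_{k-1}, ..., w_1 (w_0 is used in the action).
    for w in reversed(predictions[1:]):
        c = r / (r + b * b * P)
        p = a * c * (P * w + p)   # uses P from the later step
        P = q + a * a * c * P
    w0 = predictions[0] if predictions else 0.0
    # First-order optimality condition for the current action.
    return -b * (P * (a * x + w0) + p) / (r + b * b * P)


if __name__ == "__main__":
    # Hypothetical numbers: unit system, two predicted disturbances.
    print(mpc_action(1.0, [1.0, 1.0], a=1.0, b=1.0, q=1.0, r=1.0,
                     terminal_weight=1.0))
```

Re-running `mpc_action` at every step with a fresh prediction window gives the receding-horizon behavior. A longer window lets the action anticipate disturbances further ahead.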

Competitive Control with Delayed Imperfect Information
Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, Adam Wierman

This paper studies the impact of imperfect information in online control with adversarial disturbances. In particular, we consider both delayed state feedback and inexact predictions of future disturbances. We introduce a greedy, myopic policy that yields a constant competitive ratio against the offline optimal policy with delayed feedback and inexact predictions. A special case of our result is a constant competitive policy for the case of exact predictions and no delay, a previously open problem. We also analyze the fundamental limits of online control with limited information by showing that our competitive ratio bounds for the greedy, myopic policy in the adversarial setting match (up to lower-order terms) lower bounds in the stochastic setting.

Uncertainty Quantification and Safe Explorations in Dynamical Systems

Fast Uncertainty Quantification for Deep Object Pose Estimation
Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu
Internship project at NVIDIA AI algorithm research group
[arXiv] [project website]

Deep learning-based object pose estimators are often unreliable and overconfident, especially when the input image is outside the training domain, for instance, with sim2real transfer. Efficient and robust uncertainty quantification (UQ) in pose estimators is critically needed in many robotic tasks. In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation. We ensemble 2-3 pre-trained models with different neural network architectures and/or training data sources, and compute their average pairwise disagreement against one another to obtain the uncertainty quantification. We evaluate the proposed UQ method on three tasks where our uncertainty quantification yields much stronger correlations with pose estimation errors than the baselines. Moreover, in a real robot grasping task, our method increases the grasping success rate from 35% to 90%.
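The "average pairwise disagreement" computation can be sketched in a few lines. As a simplifying assumption, each model's output is reduced here to a 3-D translation and disagreement is measured by Euclidean distance. The disagreement metric used in the actual work is not stated above, so `translation_distance` is only a placeholder that can be swapped out.

```python
import itertools
import math


def translation_distance(p, q):
    # Placeholder disagreement metric: Euclidean distance between
    # two predicted object positions.
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def ensemble_uncertainty(predictions, distance=translation_distance):
    # Average the disagreement over all unordered pairs of models.
    pairs = list(itertools.combinations(predictions, 2))
    if not pairs:
        raise ValueError("need predictions from at least two models")
    return sum(distance(p, q) for p, q in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical outputs of three pre-trained pose estimators.
    preds = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)]
    print(ensemble_uncertainty(preds))
```

A score of zero means all models agree. A downstream task such as grasping could, for example, skip or re-observe objects whose score exceeds a chosen threshold.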

Robust Regression for Safe Exploration in Control
Anqi Liu, Guanya Shi, Soon-Jo Chung, Animashree Anandkumar, Yisong Yue
Conference on Learning for Dynamics and Control (L4DC), 2020

We study the problem of safe learning and exploration in sequential control problems. The goal is to safely collect data samples from operating in an environment, in order to learn to achieve a challenging control goal (e.g., an agile maneuver close to a boundary). A central challenge in this setting is how to quantify uncertainty in order to choose provably-safe actions that allow us to collect informative data and reduce uncertainty, thereby achieving both improved controller safety and optimality. To address this challenge, we present a deep robust regression model that is trained to directly predict the uncertainty bounds for safe exploration. We derive generalization bounds for learning and connect them with safety and stability bounds in control. We demonstrate empirically that our robust regression approach can outperform the conventional Gaussian process (GP) based safe exploration in settings where it is difficult to specify a good GP prior.

Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems
Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
IEEE Robotics and Automation Letters (RA-L)
[arXiv] [blog]

Learning-based control algorithms require data collection with abundant supervision for training. Safe exploration algorithms ensure the safety of this data collection process even when only partial knowledge is available. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained stochastic optimal control with dynamics learning and feedback control. We derive an iterative convex optimization algorithm that solves an Information-cost Stochastic Nonlinear Optimal Control problem (Info-SNOC). The optimization objective encodes both optimal performance and exploration for learning, and safety is incorporated as distributionally robust chance constraints. The dynamics are predicted from a robust regression model that is learned from data. We prove the safety of rollouts from our exploration method and the reduction in uncertainty over epochs, thereby guaranteeing the consistency of our learning method. We demonstrate that our approach has a higher success rate in ensuring safety when compared to a deterministic trajectory optimization approach.


CS/CNS/EE/IDS 165: Foundations of Machine Learning and Statistical Inference, Caltech

Talks and Activities

"Using Deep Learning and PyTorch to Power Next Generation Aircraft at Caltech", interviewed by Facebook PyTorch team about learning and control research in robotic systems. [video]

Conference and Journal Reviewing

Conference: ICML 2020, NeurIPS 2020, ICRA 2019-2020, IROS 2020, CoRL 2020, ICLR 2021

Journal: IEEE Transactions on Automatic Control (TAC)

Course Projects

Teaching Mario to Play Mario: Reinforcement Learning on Super Mario Bros.
Guanya Shi, Botao Hu, Yan Wu

Final project of Caltech CS 159. We present a deep learning model that successfully learns control policies from high-dimensional input data using reinforcement learning. The model is based on the Deep Q-Network (DQN): a convolutional neural network trained with the Q-learning algorithm, whose input is a tile representation of the screen and whose output is a value estimation function. In addition, a replay buffer, a target network, and double Q-learning are applied to reduce data dependency and better approximate true gradient descent.
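For readers unfamiliar with how a target network and double Q-learning fit together, the sketch below computes the standard double Q-learning regression target for one transition. The function name and the plain-list representation of Q-values are assumptions for illustration. It is not the project's code.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    # next_q_online / next_q_target: Q-values of the next state for
    # every action, from the online and the target network.
    if done:
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(next_q_online)),
                      key=next_q_online.__getitem__)
    # ... and the target network evaluates it.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    print(double_q_target(1.0, False, [0.1, 0.9], [5.0, 2.0], gamma=0.5))
```

Separating action selection from action evaluation is what distinguishes this target from the plain DQN target, which would take the maximum of `next_q_target` directly.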


I love playing basketball, soccer and MOBA games. I am also very interested in photography, hiking, travelling and cooking. Here are some photos taken by me. Feel free to email me if you are interested in these photos.

Winter Tsinghua  /  Beckman Auditorium, Caltech
Beijing National Stadium  /  Catalina Island
Wudaokou, Beijing  /  Tokugawaen, Nagoya, Japan
Santa Monica, California  /  Yosemite, California
