Guanya Shi 石冠亚

I am a G4 PhD student in the Department of Computing and Mathematical Sciences at Caltech, advised by Profs. Soon-Jo Chung and Yisong Yue. I am also a member of the Center for Autonomous Systems and Technologies (CAST) and of DOLCIT, which is broadly centered on research in statistical decision theory, statistical machine learning, and optimization. I also collaborate closely with Profs. Anima Anandkumar, Adam Wierman, and Joel Burdick. I currently work at the intersection of machine learning and control theory, and on their applications to robotics.

I received my bachelor's degree from Tsinghua University. In summer 2016, I was fortunate to be selected for the Stanford UGVR program. In 2020, I was an ML research intern in the NVIDIA AI algorithm group.

CV  /  Email  /  GitHub  /  Google Scholar  /  Twitter  /  LinkedIn

News: I am co-organizing Control Meets Learning, a virtual seminar series on the intersection of control and learning.

News: I was awarded the prestigious Simoudis Discovery Prize at Caltech CMS.

News: Our Neural-Swarm2 paper was accepted by IEEE Transactions on Robotics (see the close-proximity flight of 16 drones).

News: Our FastUQ paper (my intern project at NVIDIA) was accepted by ICRA 2021.

News: Two theory papers about online learning and control were accepted by NeurIPS 2020.

News: Our Neural-Swarm paper was accepted by ICRA 2020 and highlighted by Caltech news and Yahoo news.

News: Our work on Neural-Lander was accepted by ICRA 2019 and reported on the Caltech homepage.

Research and Selected Publications

My research interests lie at the intersection of machine learning and control theory, spanning the entire spectrum from theory and foundations, through algorithms, to solving cutting-edge problems in real-world dynamical systems such as robotics.

Neural-Lander Family: Learning Based Nonlinear Provably Stable Control in Multi-Agent and Changing Environments

Neural-Lander: Stable Drone Landing Control using Learned Dynamics
Guanya Shi, Xichen Shi, Michael O'Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
International Conference on Robotics and Automation (ICRA), 2019
[arXiv] [video] [Caltech homepage news] [highlighted by Import AI]

We present a novel deep-learning-based robust nonlinear controller for stable quadrotor control during landing. Our approach combines a nominal dynamics model with a DNN that learns the high-order interactions, such as the complex interactions between the ground and multi-rotor airflow. To the best of our knowledge, this is the first DNN-based nonlinear feedback controller with stability guarantees that can utilize arbitrarily large neural nets. Compared to a nonlinear baseline controller, our method lands the drone faster and more smoothly.
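
As a rough illustration of the "nominal model plus learned residual" idea, here is a minimal one-dimensional sketch. It is not the Neural-Lander controller itself: the mass, the `learned_residual` function (a made-up stand-in for the DNN), and the vertical-only dynamics are all assumptions chosen for illustration.

```python
MASS = 1.0      # kg, hypothetical vehicle mass
GRAVITY = 9.81  # m/s^2


def learned_residual(height):
    """Stand-in for the DNN: a made-up ground-effect force (N) that grows near the ground."""
    return 0.5 / (height + 0.1) ** 2


def control_thrust(desired_accel, height, residual_model=learned_residual):
    """Thrust that cancels gravity and the predicted residual force."""
    return MASS * (desired_accel + GRAVITY) - residual_model(height)


def vertical_accel(thrust, height, true_residual=learned_residual):
    """Nominal model plus residual: m * a = thrust - m * g + f_a(height)."""
    return (thrust - MASS * GRAVITY + true_residual(height)) / MASS


# With a perfect residual model the commanded acceleration is achieved;
# without compensation the unmodeled force pushes the vehicle upward.
compensated = vertical_accel(control_thrust(-0.2, 0.05), 0.05)
uncompensated = vertical_accel(control_thrust(-0.2, 0.05, lambda h: 0.0), 0.05)
```

In this toy setting, `compensated` matches the commanded acceleration while `uncompensated` does not, which is the intuition behind feeding the learned term into the controller.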

Neural-Swarm: Heterogeneous Multi-Robot Control and Planning Using Learned Interactions
Guanya Shi, Wolfgang Hoenig, Xichen Shi, Yisong Yue, Soon-Jo Chung
International Conference on Robotics and Automation (ICRA), 2020
Journal version accepted by IEEE Transactions on Robotics (T-RO)
[arXiv] [video] [Caltech news] [Yahoo news]

Close-proximity control and planning are challenging due to the complex aerodynamic interaction effects between multirotors. We propose Neural-Swarm, a nonlinear, decentralized, stable, learning-based controller and motion planner for close-proximity flight of heterogeneous multirotor swarms. We develop and employ heterogeneous deep sets to encode multi-vehicle interactions in an index-free manner, enabling better generalization to new formations and varying numbers of vehicles. Neural-Swarm enables close-proximity flight with a 24 cm minimum vertical distance for a heterogeneous aerial team of 16 robots (compared with 60 cm in prior work).
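
The index-free encoding can be sketched as follows: each neighbor is passed through an encoder chosen by its vehicle type, the features are summed, and a decoder maps the pooled feature to an interaction force. The encoders `phi_small` and `phi_large`, the decoder `rho`, and the two vehicle types are hypothetical hand-written stand-ins for learned networks, not the trained models from the paper.

```python
import math


def phi_small(rel_pos):
    """Hypothetical encoder for 'small' neighbors (stand-in for a learned network)."""
    x, y, z = rel_pos
    return [math.tanh(x + z), math.tanh(y - z)]


def phi_large(rel_pos):
    """Hypothetical encoder for 'large' neighbors (stand-in for a learned network)."""
    x, y, z = rel_pos
    return [math.tanh(2.0 * z), math.tanh(x * y)]


ENCODERS = {"small": phi_small, "large": phi_large}


def rho(pooled):
    """Hypothetical decoder mapping the pooled feature to a scalar interaction force."""
    return 0.3 * pooled[0] - 0.7 * pooled[1]


def interaction_force(neighbors):
    """neighbors: list of (vehicle_type, relative_position) pairs, in any order."""
    pooled = [0.0, 0.0]
    for vehicle_type, rel_pos in neighbors:
        feature = ENCODERS[vehicle_type](rel_pos)
        pooled = [p + f for p, f in zip(pooled, feature)]
    return rho(pooled)
```

Because the features are summed, the output does not depend on the order in which neighbors are listed, and the same function accepts any number of neighbors.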

Neural-Fly: Meta-Learning-Based Robust Adaptive Flight Control under Uncertain Wind Conditions
Michael O'Connell, Guanya Shi, Xichen Shi, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
ongoing work
[preliminary version on arXiv] [demo video: a drone in CAST fan wall]

Real-time model learning is challenging for complex dynamical systems. Deep learning has high representation power but is often too slow to update onboard and hard to analyze. On the other hand, adaptive control based on simple linear parameter models can update as fast as the feedback control loop. We propose an online composite adaptation method that treats outputs from a deep neural network as a set of basis functions capable of representing different wind conditions. Meta-learning techniques are used to optimize the network so that the last layer can be adapted quickly. We validate our approach by flying a drone in an open-air wind tunnel under varying wind conditions.
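
To make the "network outputs as basis functions" idea concrete, here is a minimal sketch in which a frozen feature map plays the role of the network and only the linear last-layer coefficients are updated online. It uses a plain normalized gradient step, which is simpler than the composite adaptation law described above; the `basis` features and the step size are assumptions for illustration.

```python
import math


def basis(x):
    """Stand-in for the frozen network's outputs phi(x); hypothetical features."""
    return [1.0, math.sin(x), math.cos(x)]


def predict(coeffs, x):
    """Linear last layer: prediction = coeffs . phi(x)."""
    return sum(a * p for a, p in zip(coeffs, basis(x)))


def adapt(coeffs, x, measured, step=0.5):
    """One normalized gradient step on the squared prediction error."""
    phi = basis(x)
    error = measured - predict(coeffs, x)
    norm = sum(p * p for p in phi)
    return [a + step * error * p / norm for a, p in zip(coeffs, phi)]


def run_adaptation(true_coeffs, num_steps=200):
    """Adapt from zero against noise-free measurements generated by true_coeffs."""
    coeffs = [0.0, 0.0, 0.0]
    for k in range(num_steps):
        x = 0.1 * k
        measured = sum(a * p for a, p in zip(true_coeffs, basis(x)))
        coeffs = adapt(coeffs, x, measured)
    return coeffs
```

Each update costs only a few multiplications, which is why a linear last layer can be adapted at the rate of the feedback loop while the rest of the network stays fixed.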

Foundations of Online Learning and Control Theory

Online Optimization with Memory and Competitive Control
Guanya Shi, Yiheng Lin, Soon-Jo Chung, Yisong Yue, Adam Wierman
Neural Information Processing Systems (NeurIPS), 2020
[arXiv] [NeurIPS video]

This paper presents competitive algorithms for a novel class of online optimization problems with memory. We consider a setting where the learner seeks to minimize the sum of a hitting cost and a switching cost that depends on the previous p decisions. This setting generalizes Smoothed Online Convex Optimization. The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio. Further, we show a connection between online optimization with memory and online control with adversarial disturbances. This connection, in turn, leads to a new constant-competitive policy for a rich class of online control problems.
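
In symbols, the setting can be written roughly as below. The notation is an assumption made here for illustration: y_t is the decision at round t, f_t the hitting cost, c the switching cost, and C the competitive ratio; the paper's exact formulation may differ.

```latex
\min_{y_1,\dots,y_T} \; \sum_{t=1}^{T} \Big( f_t(y_t) + c(y_t, y_{t-1}, \dots, y_{t-p}) \Big),
\qquad
\mathrm{cost}(\mathrm{ALG}) \;\le\; C \cdot \mathrm{cost}(\mathrm{OPT}).
```

An online algorithm must choose y_t without seeing future hitting costs, while OPT is the offline optimum. When the switching cost depends only on the single previous decision (p = 1), this has the form of the Smoothed Online Convex Optimization setting.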

The Power of Predictions in Online Control
Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, Adam Wierman
Neural Information Processing Systems (NeurIPS), 2020
[arXiv] [NeurIPS video]

We study the impact of predictions in online Linear Quadratic Regulator control with both stochastic and adversarial disturbances in the dynamics. In both settings, we characterize the optimal policy and derive tight bounds on the minimum cost and dynamic regret. Perhaps surprisingly, our analysis shows that the conventional greedy MPC approach is a near-optimal policy in both stochastic and adversarial settings. Specifically, for length-T problems, MPC requires only O(log T) predictions to reach O(1) dynamic regret, which matches (up to lower-order terms) our lower bound on the required prediction horizon for constant regret.

Competitive Control with Delayed Imperfect Information
Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, Adam Wierman

This paper studies the impact of imperfect information in online control with adversarial disturbances. In particular, we consider both delayed state feedback and inexact predictions of future disturbances. We introduce a greedy, myopic policy that yields a constant competitive ratio against the offline optimal policy with delayed feedback and inexact predictions. A special case of our result is a constant-competitive policy for the case of exact predictions and no delay, a previously open problem. We also analyze the fundamental limits of online control with limited information by showing that our competitive ratio bounds for the greedy, myopic policy in the adversarial setting match (up to lower-order terms) lower bounds in the stochastic setting.

Uncertainty Quantification and Safe Explorations in Dynamical Systems

Fast Uncertainty Quantification for Deep Object Pose Estimation
Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu
Internship project at NVIDIA AI algorithm research group
International Conference on Robotics and Automation (ICRA), 2021
[arXiv] [project website]

Deep learning-based object pose estimators are often unreliable and overconfident, especially when the input image is outside the training domain, for instance, with sim2real transfer. Efficient and robust uncertainty quantification (UQ) in pose estimators is critically needed in many robotic tasks. In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation. We ensemble 2-3 pre-trained models with different neural network architectures and/or training data sources, and compute their average pairwise disagreement to obtain the uncertainty estimate. We evaluate the proposed UQ method on three tasks, where our uncertainty estimates yield much stronger correlations with pose estimation errors than the baselines. Moreover, in a real robot grasping task, our method increases the grasping success rate from 35% to 90%.
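
The average pairwise disagreement can be sketched in a few lines. As a simplification, this example compares only predicted object positions with a Euclidean distance; the function names and the choice of metric are assumptions for illustration, and a full 6-DoF version would swap in a pose-aware disagreement metric.

```python
import math
from itertools import combinations


def translation_distance(pose_a, pose_b):
    """Euclidean distance between two predicted object positions (x, y, z)."""
    return math.dist(pose_a, pose_b)


def ensemble_uncertainty(predictions, metric=translation_distance):
    """Average disagreement over all unordered pairs of ensemble members."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        raise ValueError("need at least two ensemble members")
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical models predicting the position of the same object.
agreeing = [(0.10, 0.20, 0.50), (0.11, 0.20, 0.50), (0.10, 0.21, 0.50)]
disagreeing = [(0.10, 0.20, 0.50), (0.40, 0.10, 0.70), (0.00, 0.35, 0.30)]
```

Here `ensemble_uncertainty(disagreeing)` is larger than `ensemble_uncertainty(agreeing)`, so a downstream task such as grasping could treat the former prediction as less trustworthy.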

Robust Regression for Safe Exploration in Control
Anqi Liu, Guanya Shi, Soon-Jo Chung, Animashree Anandkumar, Yisong Yue
Conference on Learning for Dynamics and Control (L4DC), 2020

We study the problem of safe learning and exploration in sequential control problems. The goal is to safely collect data samples from operating in an environment, in order to learn to achieve a challenging control goal (e.g., an agile maneuver close to a boundary). A central challenge in this setting is how to quantify uncertainty in order to choose provably-safe actions that allow us to collect informative data and reduce uncertainty, thereby achieving both improved controller safety and optimality. To address this challenge, we present a deep robust regression model that is trained to directly predict the uncertainty bounds for safe exploration. We derive generalization bounds for learning and connect them with safety and stability bounds in control. We demonstrate empirically that our robust regression approach can outperform the conventional Gaussian process (GP) based safe exploration in settings where it is difficult to specify a good GP prior.

Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems
Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung
IEEE Robotics and Automation Letters (RA-L)
[arXiv] [blog]

Learning-based control algorithms require data collection with abundant supervision for training. Safe exploration algorithms ensure the safety of this data collection process even when only partial knowledge is available. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained stochastic optimal control with dynamics learning and feedback control. We derive an iterative convex optimization algorithm that solves an Information-cost Stochastic Nonlinear Optimal Control problem (Info-SNOC). The optimization objective encodes both optimal performance and exploration for learning, and safety is incorporated as distributionally robust chance constraints. The dynamics are predicted from a robust regression model that is learned from data. We prove the safety of rollouts from our exploration method and the reduction in uncertainty over epochs, thereby guaranteeing the consistency of our learning method. We demonstrate that our approach has a higher success rate in ensuring safety than a deterministic trajectory optimization approach.

Other Fun Projects

Teaching Mario to Play Mario: Reinforcement Learning on Super Mario Bros.
Guanya Shi, Botao Hu, Yan Wu

Final project for Caltech CS 159. We present a deep learning model that successfully learns control policies from high-dimensional input data using reinforcement learning. The model is based on the Deep Q-Network (DQN): a convolutional neural network trained with the Q-learning algorithm, whose input is a tile representation of the screen and whose output is an estimate of the value function. In addition, a replay buffer, a target network, and double Q-learning are applied to reduce correlation in the training data and to better approximate true gradient descent.
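
For readers unfamiliar with double Q-learning, the training target can be sketched as follows: the online network picks the next action and the target network evaluates it. This is a generic illustration rather than the project's actual code; the function name, the list-based Q-values, and the default discount factor are assumptions.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double Q-learning target for one transition.

    next_q_online / next_q_target: Q-values of the next state, one entry per action,
    from the online network and the target network respectively.
    """
    if done:
        # No bootstrapping past the end of an episode.
        return reward
    # The online network selects the action, the target network evaluates it.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

Separating action selection from action evaluation reduces the overestimation bias that arises when a single network does both.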


I love playing basketball, soccer and MOBA games. I am also very interested in photography, hiking, travelling and cooking. Here are some photos taken by me. Feel free to email me if you are interested in these photos.

Winter Tsinghua  /  Beckman Auditorium, Caltech
Beijing National Stadium  /  Catalina Island
Wudaokou, Beijing  /  Tokugawaen, Nagoya, Japan
Santa Monica, California  /  Yosemite, California
