Autonomous Spacecraft Trajectory Design via Reinforcement Learning

Status: Completed

Start Date: 2019-08-01

End Date: 2023-05-12

Description: NASA's 2018 Strategic Plan highlighted the need for new technologies to enable deep space missions. In particular, NASA aims to move crewed spaceflight beyond Low Earth Orbit (LEO) and to develop the technologies that will return us to the Moon and, eventually, carry us to Mars. Establishing a permanent human presence outside of LEO introduces numerous new mission constraints that must be overcome by equally novel approaches. While short-term missions can benefit from years of prior preparation, long-term crewed missions demand methods that adapt quickly, and at times autonomously, as unexpected obstacles arise. For trajectory design, this challenge is compounded by increasingly complex dynamical models and mission constraints.

This project aims to ease the computational burden of traditional trajectory design techniques by leveraging recent advancements in machine learning, in particular Reinforcement Learning (RL). One of the primary challenges in multibody trajectory design is finding desired motion within an immense solution space, a problem worsened by added complexity in the dynamical model, such as additional gravitational bodies or low-thrust propulsion. RL offers both a way of organizing the search through this high-dimensional solution space and a means of interacting directly with complex environments.

The research develops a framework that autonomously learns from a multibody environment with no a priori knowledge of the system dynamics, opening new ways of pathfinding through historically challenging high-fidelity dynamical models. By separating the learning from the environment, the methodology allows rapid trajectory design directly in the desired force model. Furthermore, the agent/environment boundary keeps the learning model-agnostic and readily generalizable to other force models, a property that is especially appealing in trajectory design because the resulting techniques are not tied to one particular application.
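To make the agent/environment boundary concrete, the following is a minimal sketch of how a multibody force model might be hidden behind a step()/reset() interface so that the learner never sees the equations of motion. The source does not specify the project's environment, reward, or dynamics formulation; the planar circular restricted three-body problem (CR3BP), the Earth-Moon mass ratio, the target state, and the reward shaping below are all illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative sketch of the agent/environment boundary described above.
# The agent interacts only through step() and reset(); the force model
# (here, the planar CR3BP) stays entirely inside the environment.

MU = 0.012150585  # Earth-Moon mass ratio, nondimensional (assumed system)

def cr3bp_eom(t, s, thrust):
    """Planar CR3BP equations of motion in the rotating frame, plus thrust."""
    x, y, vx, vy = s
    r1 = np.hypot(x + MU, y)         # distance to the primary (Earth)
    r2 = np.hypot(x - 1 + MU, y)     # distance to the secondary (Moon)
    ax = 2 * vy + x - (1 - MU) * (x + MU) / r1**3 - MU * (x - 1 + MU) / r2**3
    ay = -2 * vx + y - (1 - MU) * y / r1**3 - MU * y / r2**3
    return [vx, vy, ax + thrust[0], ay + thrust[1]]

class CR3BPEnv:
    """Gym-style environment wrapping the force model behind step()/reset()."""

    def __init__(self, target, dt=0.05, max_steps=400):
        self.target = np.asarray(target, dtype=float)  # desired state (assumed)
        self.dt, self.max_steps = dt, max_steps

    def reset(self):
        self.t = 0
        # Start near the Moon with a small random perturbation (illustrative).
        self.state = np.array([0.85, 0.0, 0.0, 0.2]) + 0.01 * np.random.randn(4)
        return self.state.copy()

    def step(self, action):
        # Propagate one time step directly in the full force model.
        thrust = np.clip(np.asarray(action, dtype=float), -1e-2, 1e-2)
        sol = solve_ivp(cr3bp_eom, (0, self.dt), self.state,
                        args=(thrust,), rtol=1e-9, atol=1e-12)
        self.state = sol.y[:, -1]
        self.t += 1
        # Reward: negative distance to the target state (illustrative shaping).
        err = np.linalg.norm(self.state - self.target)
        done = err < 1e-3 or self.t >= self.max_steps
        return self.state.copy(), -err, done, {}

# Hypothetical usage with a placeholder random policy; a real agent (e.g.,
# a policy-gradient learner) would plug in the same way.
env = CR3BPEnv(target=[0.8369, 0.0, 0.0, 0.0])  # near Earth-Moon L1
obs, done = env.reset(), False
while not done:
    action = np.random.uniform(-1e-2, 1e-2, size=2)  # placeholder policy
    obs, reward, done, _ = env.step(action)
```

Because the agent touches only step() and reset(), swapping in a higher-fidelity model (additional gravitational bodies, an ephemeris-based force model) changes only the environment's integrator and leaves the learner untouched, which is the model-agnostic property the description emphasizes.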
Benefits: NASA has a stated need for automated trajectory design techniques. Many current approaches rely on simplifying assumptions and computationally expensive brute-force algorithms. Using RL allows both rapid computation of new trajectories and the discovery of new paths through difficult dynamical regimes. By having the algorithm learn directly from a force model, complex spacecraft trajectories can be uncovered that may not exist in a simplified environment. This capability will help NASA move crewed spaceflight outside of LEO and establish a permanent human presence in deep space.

Lead Organization: Purdue University-Main Campus