Computer graphics has long sought to create realistic depictions of humans, both in appearance and in movement. A natural starting point for animation has been to collect real-world data about how people move. Prerecorded motion clips have become a standard building block for producing the motion of virtual characters in film, games, and interactive simulations. Of course, it is impossible to capture every motion a person might perform, and so animation methods attempt to do as much as possible with the limited movements that can be recorded using a motion capture setup or authored by hand.
With motion clips in hand, the basic idea for creating new motions is simple: play sequences of the prerecorded clips while being careful to only switch to a new clip when this can be done without causing any noticeable artifacts. Smoothly blending between motions during the transition period can further help to increase the number of feasible transitions between clips. For interactive settings such as games and simulations, control can be added by making decisions about which clip to play next based on the user’s goals, such as a particular desired walking direction or speed. A set of motion clips and their feasible clip transitions naturally form a "motion graph," which is a directed graph of motions with demarcated transition points where motions can join or split off. Modern game engines exploit various extensions of these ideas, such as the further use of blending to interpolate between motions, for example, walks of different speeds or turn rates, and the use of independent motion clips for the upper body and lower body when this can be done without introducing artifacts. However, the ongoing motion is still constructed from an underlying set of motions that are following their preset paths.
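To make the motion graph idea concrete, the following is a minimal sketch rather than any engine's actual implementation. It assumes that transitions are only available at the end of each clip and that each clip carries a single average speed; the `Clip` class, the `next_clip` function, and the clip names and speeds are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """A prerecorded motion clip (hypothetical, heavily simplified)."""
    name: str
    speed: float  # assumed annotation: average forward speed of the clip
    transitions: list = field(default_factory=list)  # clips that may follow


def next_clip(graph, current, desired_speed):
    """Greedy control: among the feasible successors of the current clip,
    pick the one whose speed is closest to the user's desired speed."""
    candidates = graph[current].transitions
    return min(candidates, key=lambda name: abs(graph[name].speed - desired_speed))


# A tiny directed graph: there is no direct edge from "idle" to "run".
graph = {
    "idle": Clip("idle", 0.0, ["idle", "walk"]),
    "walk": Clip("walk", 1.4, ["idle", "walk", "run"]),
    "run": Clip("run", 3.5, ["walk", "run"]),
}

first = next_clip(graph, "idle", desired_speed=3.5)
second = next_clip(graph, first, desired_speed=3.5)
```

In this toy graph a character standing idle that is asked to run must first play a walk clip, because only transitions present in the graph are allowed. The greedy rule is an assumption of the sketch; a controller could equally search several clips ahead.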
The preceding model comes with some obvious limitations, however. First, if the user commands a sudden turn or change in speed, the motion must wait until the next available clip transition before anything can be done. At this point, many game engines choose to rapidly cut or blend to the desired motion rather than producing a higher quality continuous motion that feels sluggish to the player because of the longer response time. Second, the motion can never really leave the confines of the prerecorded paths, at least not without further manual crafting of motion clip blends.
The following paper is exciting because it proposes to largely discard the idea of motion clips. Instead, it treats each motion clip as a set of independent motion vectors, where a motion vector consists of a character pose and its related velocities. Taken together, these high-dimensional motion vectors define a motion field that governs how the state of a character evolves over time. For any given state of the character, the passive dynamics are defined by the mean motion field, which can be reconstructed as a weighted average of the motion vectors nearest the current state. However, the mean motion does not allow for any control, and so produces realistic but unresponsive motion. One of the key insights of this paper, then, is to also use the neighboring motion vectors to define a discrete set of actions, each representing an alternative choice for guiding the future evolution of the motion. With this ability to "steer" the motion of the character in hand, reinforcement learning can be used to great effect to precompute the optimal actions for given task goals, such as walking in a given direction at a given speed.
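The weighted average and the neighbor-derived actions described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the inverse-squared-distance weighting, the rule of boosting one neighbor per action, the simple Euler-style update in `step`, and every function name are assumptions made here for illustration.

```python
import math

# A motion vector is a (pose, velocity) pair of equal-length tuples of floats.


def distance(a, b):
    """Euclidean distance between two motion vectors (pose and velocity joined)."""
    return math.dist(a[0] + a[1], b[0] + b[1])


def nearest(state, database, k):
    """The k motion vectors in the database closest to the current state."""
    return sorted(database, key=lambda m: distance(state, m))[:k]


def similarity_weights(state, neighbors, eps=1e-9):
    """Assumed kernel: inverse squared distance, normalized to sum to 1."""
    raw = [1.0 / (distance(state, m) ** 2 + eps) for m in neighbors]
    total = sum(raw)
    return [w / total for w in raw]


def blend_velocity(neighbors, weights):
    """Weighted average of the neighbors' velocities."""
    dim = len(neighbors[0][1])
    return tuple(
        sum(w * m[1][i] for w, m in zip(weights, neighbors)) for i in range(dim)
    )


def candidate_actions(weights):
    """Assumed action set: one weighting per neighbor, each favoring that
    neighbor by raising its weight to 1 before renormalizing."""
    actions = []
    for i in range(len(weights)):
        boosted = list(weights)
        boosted[i] = 1.0
        total = sum(boosted)
        actions.append([w / total for w in boosted])
    return actions


def step(state, neighbors, weights, dt=1.0):
    """Advance the state by one time step using the blended velocity."""
    velocity = blend_velocity(neighbors, weights)
    pose = tuple(p + dt * v for p, v in zip(state[0], velocity))
    return (pose, velocity)
```

Calling `step` with the similarity weights follows the mean motion, while calling it with one of the weightings returned by `candidate_actions` steers the character toward a particular neighbor. A learned policy would choose among those weightings at each step; that part is omitted here.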
The beauty of the method is that the synthesized motions are free to depart from the prerecorded motion clips, and yet they can still be guided to satisfy the goals of the motion. Part of the magic comes from the available control actions being implicitly derived from the motion field and therefore not needing to be explicitly specified. Also exciting is that the method supports the use of largely unstructured motion data and is conceptually easy to integrate with other existing kinematic and dynamic methods. The paper convincingly demonstrates the benefits of rethinking the existing representations of a problem in order to arrive at new solutions. Lastly, it offers a compelling example of how reinforcement learning methods can be applied in high-dimensional settings.