Using deep learning to make video game characters move more realistically
TL;DR: Researchers at the University of Edinburgh and Adobe Research have developed an AI that can make video game characters interact more naturally with their surroundings. The technique uses a deep neural network called a "neural state machine," or NSM, to animate a character accurately by inferring its movements in a given scenario.
In the days of 8- and 16-bit video games, character animations were pretty rudimentary, with most games having static environments and limited interaction. Therefore, the avatar’s movements didn’t require a lot of different animations.
With the switch to 3D, animation became more complicated. Now that games feature huge open worlds to explore and interact with, animating their characters requires hundreds, if not thousands, of different movement skills.
One way to speed up the tedious animation process is to use motion capture (mo-cap) to digitize the movements of actors into the corresponding character animations. The result is in-game movement that appears more realistic. However, it is next to impossible to capture every possible way a player might interact with the environment.
The transitions between animations can also seem awkward and stiff. Usually, the changes between moves are handled by canned animations that play out the same way every time. Think about how a character might sit on a chair or put down a box. It gets even more complicated when the objects vary in size: resting your arms on chairs of different sizes, or lifting objects of varying sizes and shapes, becomes cumbersome to animate.
In their paper, "Neural State Machine for Character-Scene Interactions," the team illustrates the complexity of a given animation with the example of picking up an object. Before the object can even be lifted, several movements must be considered and animated: starting to walk, slowing down, turning around while placing the feet precisely, and approaching the object. All of this happens before the action of picking up the item.
Researchers call this “planning and adapting,” and this is where deep learning begins to come in.
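The phase sequence described above can be sketched as a tiny hand-coded state machine. This is purely illustrative: the actual NSM learns phases and transitions from data, and the phase names and transition table below are invented for the example.

```python
# Illustrative only: a hand-coded finite state machine for the
# "pick up an object" phases described above. The real NSM learns
# these transitions from mo-cap data; the names here are hypothetical.

TRANSITIONS = {
    "walk": "slow_down",
    "slow_down": "turn",
    "turn": "place_feet",
    "place_feet": "reach",
    "reach": "pick_up",
    "pick_up": None,  # terminal phase
}

def run_phases(start="walk"):
    """Follow the phase chain from `start` to the terminal phase."""
    sequence = [start]
    while TRANSITIONS[sequence[-1]] is not None:
        sequence.append(TRANSITIONS[sequence[-1]])
    return sequence

print(" -> ".join(run_phases()))
# walk -> slow_down -> turn -> place_feet -> reach -> pick_up
```

A fixed table like this is exactly what breaks down when objects and scenes vary, which is why the researchers replace it with a learned network.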
"Achieving this in production-ready quality is not easy and is time-consuming," says doctoral student and lead author Sebastian Starke. "Our Neural State Machine instead learns the required motion and state transitions directly from the scene geometry and a given goal action. At the same time, our method is capable of producing several types of high-quality movements and actions from a single network."
The NSM is trained using mo-cap data to learn how to switch naturally from one movement to another. The network infers the character’s next pose based on both their previous pose and the geometry of the scene.
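This next-pose inference is autoregressive: each predicted pose is fed back in as the input for the following step. The sketch below shows only that feedback loop, with a random linear layer standing in for the trained network; the dimensions and feature encodings are invented for illustration.

```python
import numpy as np

# Hypothetical sketch of the autoregressive idea: a function maps
# (previous pose, scene geometry features) -> next pose. A random
# linear layer stands in for the trained NSM; all sizes are invented.

rng = np.random.default_rng(0)

POSE_DIM = 12   # e.g. a handful of joint angles (hypothetical)
SCENE_DIM = 8   # e.g. a small encoding of nearby geometry (hypothetical)

# Stand-in "trained" weights.
W = 0.1 * rng.standard_normal((POSE_DIM, POSE_DIM + SCENE_DIM))

def next_pose(prev_pose, scene_feat):
    """One autoregressive step: predict the next pose from the
    previous pose and the local scene geometry."""
    x = np.concatenate([prev_pose, scene_feat])
    return np.tanh(W @ x)

pose = np.zeros(POSE_DIM)                 # rest pose
scene = rng.standard_normal(SCENE_DIM)    # fixed scene encoding
trajectory = [pose]
for _ in range(5):                        # roll out a short motion
    pose = next_pose(pose, trajectory and scene)
    trajectory.append(pose)

print(len(trajectory), trajectory[-1].shape)  # 6 (12,)
```

Because the scene features enter every step, changing the geometry changes the whole rollout, which is the property that lets one network adapt to chairs, boxes, or obstacles of different shapes.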
For example, the animation of an avatar walking through a door would be different if an object blocked the entrance. Instead of simply walking through, the character would need to go around or step over the obstacle.
The framework the researchers created lets users move the character around environments with simple control commands. In addition, the NSM does not need to keep all of the motion capture data: once the motions are learned, the mo-cap data can be compressed and stored while the learned behaviors are retained.
"The technique essentially mimics how a human intuitively moves through a scene or environment and how they interact with objects, realistically and precisely," said paper co-author Taku Komura, who holds the Chair of Computer Graphics at the University of Edinburgh.
The researchers plan to continue working on related scenarios, such as moving a character naturally through a crowd or performing multiple actions simultaneously. Think of the repetitive, jerky movements of characters running through crowds in the Assassin's Creed games.
The team will present their research at SIGGRAPH Asia 2019, to be held in Brisbane, Australia, November 17-20; the paper appears in ACM Transactions on Graphics.