How Do You Grade a Movement? The Hidden Problem at the Heart of Motion AI

Before an AI can generate good movement, someone has to decide what "good" means. That turns out to be one of the deepest questions in the field — and the place where somatic knowledge has the most to offer.


Imagine you have built an AI that generates human movement. You type "a person walks across the room and sits down," and it produces an animation. Now answer a simple question: is the animation any good?

You would think this would be easy to check. You watch it. Either it looks right or it doesn't. But turn that intuition into something a computer can measure — a number, a score, a standard the AI can be trained to improve against — and you discover one of the quietest and hardest problems in the entire field. How do you grade a movement?

This is not an academic side-question. It is foundational. An AI learns by being told how well it is doing. Whatever you measure becomes what it optimises toward. Choose the wrong measure of "good movement," and you get an AI that is excellent at the wrong thing. So the question of how to grade movement quietly determines what kind of movement these systems learn to make.


The First Answer: Does It Obey Physics?

The first natural answer is: a good movement obeys the laws of physics. Real bodies have weight. Feet don't slide along the floor when they're planted. You can't balance in a position that would topple a real person. You can't accelerate without a force to cause it.

So one whole family of grading methods checks generated movement against physics. Does the foot skate? Penalise it. Does the body float or balance impossibly? Penalise it. This catches a lot of the obvious failures that make AI movement look fake — the eerie gliding, the subtle floating, the physically impossible poses.
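
To make the foot-skating check above concrete, here is a minimal sketch in Python. It is an illustration only, not the metric of any particular paper: the function name, the y-up coordinate convention, and the 5 cm contact threshold are all assumptions chosen for the example.

```python
import math

def foot_skate_penalty(foot_positions, contact_height=0.05):
    """Total horizontal foot travel (metres) across frames where the foot is planted.

    foot_positions: one (x, y, z) tuple per frame, with y as height above the floor
                    (an assumed convention for this sketch).
    contact_height: hypothetical threshold below which the foot counts as planted.
    """
    penalty = 0.0
    for prev, curr in zip(foot_positions, foot_positions[1:]):
        planted = prev[1] < contact_height and curr[1] < contact_height
        if planted:
            # Any horizontal displacement while planted is treated as skating.
            penalty += math.hypot(curr[0] - prev[0], curr[2] - prev[2])
    return penalty

# A planted foot that drifts sideways is penalised; a lifted foot in swing is not.
skating = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
swinging = [(0.0, 0.2, 0.0), (0.3, 0.2, 0.0)]
print(foot_skate_penalty(skating), foot_skate_penalty(swinging))
```

A score of zero means the planted foot never slid. Note what the number does not say: a lifeless walk with firmly planted feet also scores zero.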

But here is the catch, and it is a deep one: a movement can be perfectly physically valid and still look completely wrong. A robot executing a technically correct walk can be unmistakably lifeless. The physics checks out; the movement is dead. Physical correctness is necessary but nowhere near sufficient. Obeying physics keeps a movement from looking broken. It does nothing to make it look alive.


The Second Answer: Does It Look Right to a Human?

So the field turned to a second answer: ask humans. Show people generated movements, have them rate which look real and which look fake, and train the AI to produce movements that humans rate as real.

This captures something the physics approach misses — the hard-to-name quality of aliveness that humans perceive instantly but can't reduce to a rule. A trained eye knows when a movement is convincing.

But this approach has its own problem: human judgements are coarse and inconsistent. Ask someone "real or fake?" and you get a crude binary. Ask many people and they disagree. Turning these scattered, subjective, low-resolution judgements into a precise, scalable measure an AI can train against is genuinely hard. The signal is real but noisy and thin.
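
A small sketch shows why the signal is thin. Assume each rater gives one clip a vote of 1 (looks real) or 0 (looks fake). The function name and the simple majority-agreement measure are illustrative choices, not a standard from the literature.

```python
from statistics import mean

def summarise_ratings(votes):
    """Summarise binary real/fake votes for a single clip.

    votes: list of 1 (looks real) or 0 (looks fake), one entry per rater.
    Returns (share_real, agreement), where agreement is the fraction of raters
    siding with the majority: 0.5 is an even split, 1.0 is unanimous.
    """
    share_real = mean(votes)
    agreement = max(share_real, 1 - share_real)
    return share_real, agreement

print(summarise_ratings([1, 1, 1, 0]))  # (0.75, 0.75)
print(summarise_ratings([1, 0, 1, 0]))  # (0.5, 0.5)
```

With four raters, a clip can only score 0, 0.25, 0.5, 0.75, or 1, and an even split says nothing about which way the clip leans. Getting a finer, more stable number means collecting many more votes per clip, which is where the cost comes from.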


This Year's Advance: Combining the Two

A 2025 paper called PP-Motion, from a team at Tsinghua University, made a real advance by combining the two answers into one measure. It calculates how far a movement is from being physically valid, giving a fine-grained, continuous physics score rather than a crude pass/fail. Then it blends that with a model of human perception. The result tracks both whether a movement is physically sound and whether it looks right to people, in a single grade.
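
The general idea of fusing a continuous physics score with a perception score can be sketched in a few lines of Python. This is emphatically not PP-Motion's formulation, which the text above describes only in outline. The exponential mapping, the equal weighting, and every name here are assumptions made purely for illustration.

```python
import math

def blended_grade(physics_error, perception_score, weight=0.5, scale=1.0):
    """Combine a physics distance and a perception score into one grade in [0, 1].

    physics_error:    distance from physical validity, >= 0 (0 means fully valid).
    perception_score: how real the motion looks to people, in [0, 1].
    weight, scale:    hypothetical tuning knobs, not values from any paper.
    """
    # Map the unbounded error onto (0, 1]: zero error gives 1.0, large error approaches 0.
    physics_score = math.exp(-physics_error / scale)
    return weight * physics_score + (1 - weight) * perception_score

# Same perceived realism, different physical error: the cleaner motion grades higher.
print(blended_grade(0.5, 0.8) > blended_grade(2.0, 0.8))  # True
```

The point of the continuous physics term is that a slightly implausible motion loses a little, rather than failing outright, so the grade can still rank two imperfect motions against each other.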

This is genuine progress. It is arguably the most complete answer the field currently has to the question of how to grade a movement: physical feasibility plus human perception, fused.

And yet — if you come to this from a background in movement practice, you can immediately feel what is still missing.


The Third Answer Nobody Is Measuring Yet

Both of the answers above — physics and human perception — share something. They are both external. They grade a movement from the outside: physics, from the standpoint of mechanical law; perception, from the standpoint of an observer watching.

But anyone who has trained seriously in movement — dancers, somatic practitioners, martial artists, athletes — knows there is a third standard that neither of these captures. The standard of whether a movement is true to the experience it expresses. Whether it comes from the right place in the body. Whether the effort is organised with integrity or merely arranged to look correct from outside.

A skilled mover can perform two versions of the same phrase: one that satisfies any external observer, and one that is also internally honest — initiated from the centre, supported by the breath, free of unnecessary holding, expressing a genuine inner state rather than a convincing outer shape. From the outside, with enough skill, these can look identical. From the inside, they are entirely different. And it is the inside difference that movement practice spends years cultivating.

No established AI grading system measures this. Physics can't see it: both versions obey physics. Human perception can barely see it: a convincing outer shape fools most observers. The interior fidelity of a movement, its truth to lived experience, sits outside the standard metrics the field currently uses.


Why This Is the Opening for Somatic Knowledge

This gap is not a failure of the engineers. It is a genuinely hard frontier — possibly the hardest in the field — because interior fidelity is, almost by definition, not visible from the outside. You cannot measure it with a camera or a physics simulation. You can only assess it through signals closer to the body's interior (like the muscle activity that recent EMG-based research has started to capture) and, ultimately, through the trained perception of people who know movement from the inside.

This is exactly where movement practitioners have something AI research needs and largely lacks: a cultivated, reliable, teachable capacity to perceive interior movement quality. The decades of refinement in disciplines like Laban Movement Analysis, the Alexander Technique, and somatic education are, among other things, decades of work on how to assess movement from the inside — the very thing the field's grading systems cannot yet do.

The future of grading movement — and therefore the future of what movement AI learns to make — runs through that knowledge. The first two answers, physics and perception, were built by engineers. The third answer, interior fidelity, will have to be built in collaboration with the people who have spent their lives learning to feel it.


Further Reading

Zhao, S., et al. (2025). PP-Motion: Physical-perceptual fidelity evaluation for human motion generation. arXiv:2508.08179. https://arxiv.org/abs/2508.08179

Lin, J., et al. (2026). The quest for generalisable motion generation: Data, model, and evaluation. arXiv:2510.26794. https://arxiv.org/abs/2510.26794