Thinking in Movement: Maxine Sheets-Johnstone and the Intent Layer as a Site of Cognition, Not Execution
Framing the Intersection
The May deep analysis established, through Merleau-Ponty, that movement is organised by motor intentionality — a forward-model-like reaching toward possibility that precedes explicit representation. The June deep analysis established, through Thomas Fuchs, that the resulting movement skill is sedimented as body memory that is constitutive rather than representational. This month's frontier report documented the AI field's independent convergence on the necessity of an intermediate layer between abstract instruction and physical execution — what the field calls intent, state, or continuous residual.
There is a temptation to interpret this intermediate layer as a mere translation stage: a place where the abstract command gets converted into executable detail. On this reading, the intent layer is a mechanism — a smart intermediary that makes the mapping from idea to action tractable. Useful, but not itself a site of anything interesting; just a better pipeline.
The phenomenologist Maxine Sheets-Johnstone offers a radically different, and I will argue more accurate, understanding. In her account of thinking in movement, the intermediate layer is not a translation stage at all. It is a site of cognition — a place where thinking actually happens, in and as movement, rather than a channel through which pre-formed thoughts are transmitted to the muscles. This distinction has significant consequences for how we should understand what AI motion systems are doing, and what they are not.
Thinking in Movement: The Argument
Sheets-Johnstone's central claim, developed across The Primacy of Movement (1999/2011) and related essays, is that movement is not merely the output of cognition but is itself a form of thinking. When an improvising dancer moves, when an infant explores a new motor possibility, when any skilled mover responds to an unfolding situation, they are not first thinking and then moving. They are thinking in movement — the movement is the thought, the cognition is happening kinetically, in real time, in the medium of the moving body itself.
This is a strong claim and it is easy to underestimate. Sheets-Johnstone is not saying that movement expresses thought, or accompanies thought, or is guided by thought. She is saying that in certain fundamental cases, movement is thought — that there is a genuine cognitive process that takes place in the kinetic medium and cannot be translated without remainder into propositional or representational terms.
Her primary evidence is developmental and phenomenological. The infant learning to move does not have a prior mental plan of movement that it then executes; it thinks its way into movement possibilities kinetically, discovering what its body can do by doing it. The improvising dancer does not choreograph internally and then perform; the choreography emerges in the moving, as a kinetic thinking that is generative in real time. In both cases, the movement is exploratory and generative in the way that thinking is — it discovers, it resolves, it creates — and it does so in the medium of movement, not in an inner mental space that movement merely reports.
The Intermediate Layer Reconsidered
Now return to the AI field's intermediate layer with Sheets-Johnstone's argument in view.
The engineering conception of the intent layer (MIND, SCRIPT) is that it is a representation that bridges language and action — a learned intermediate that makes the mapping tractable. The intent is computed from the instruction and then used to determine the action. It is a stage in a pipeline: instruction in, intent derived, action out. Cognition, on this picture, happens elsewhere (in whatever process produced the instruction); the intent layer is downstream machinery.
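The pipeline picture described above can be made concrete with a deliberately small sketch. It is not the architecture of MIND or SCRIPT: the `Intent` type and the functions `derive_intent`, `decode_action`, and `pipeline` are hypothetical stand-ins, and the "learned mapping" is replaced by trivial word counts. The sketch only illustrates the structural point that the intent is computed once, upstream of the movement.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intent:
    """Hypothetical latent intent: a tiny vector standing in for a learned representation."""
    latent: List[float]


def derive_intent(instruction: str) -> Intent:
    # Stand-in for a learned language-to-latent mapping (assumption, not a real model):
    # two toy features, the word count and the total character count.
    words = instruction.lower().split()
    return Intent(latent=[float(len(words)), float(sum(len(w) for w in words))])


def decode_action(intent: Intent, steps: int = 3) -> List[List[float]]:
    # Stand-in for a controller: each frame is a fixed fraction of the intent,
    # so the whole trajectory is determined before any movement occurs.
    return [[x * (t + 1) / steps for x in intent.latent] for t in range(steps)]


def pipeline(instruction: str) -> List[List[float]]:
    # Instruction in, intent derived, action out. Nothing in the moving feeds back
    # into the intent.
    intent = derive_intent(instruction)
    return decode_action(intent)


if __name__ == "__main__":
    for frame in pipeline("raise left arm"):
        print(frame)
```

In this toy version the same instruction always yields the same trajectory, because the intent is settled before the first frame is produced and is never revised by the movement itself.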
Sheets-Johnstone's account suggests this gets the structure importantly wrong for the case of genuinely generative movement. In thinking-in-movement, the intermediate layer is not downstream of cognition — it is the cognition. The generative work, the actual thinking, happens in the kinetic intermediate, not before it. The improvising body is not executing an intent computed elsewhere; it is thinking, in the medium of movement and its immediate anticipatory organisation.
This distinction maps onto a real difference between two kinds of movement, and it clarifies what current AI can and cannot do.
Reproductive movement — performing a known phrase, executing a familiar action — can plausibly be understood on the pipeline model. There is an intent (the known movement's organisation) that determines the execution. AI systems that generate movement from learned patterns are doing something structurally similar: retrieving and adapting a learned organisation. For this kind of movement, the engineering conception of the intent layer may be adequate.
Generative movement — improvising, discovering, responding creatively to an unfolding situation — cannot be understood on the pipeline model, because there is no pre-existing intent to execute. The intent is being created in the moving, as thinking-in-movement. This is the kind of movement that somatic improvisation practices cultivate, and it is precisely the kind that current AI motion systems do not do, because they generate by adapting learned patterns rather than by thinking kinetically in real time.
Why AI Motion Systems Do Not Think in Movement
The distinction lets us state precisely what is missing from even the most sophisticated current systems, including the intent-layer architectures of June 2026.
MIND generates behavioural intent from a text instruction. The intent is derived from the instruction and the model's learned priors; it is then used to drive physics-based humanoid control. This is genuinely impressive, and for reproductive and instruction-following movement it works well. But the intent in MIND is computed, not thought. It is the output of a learned mapping from language to a latent representation. Nothing in the system is thinking in movement, that is, discovering in the kinetic medium and in real time a movement organisation that did not pre-exist the moving.
This is not a criticism of MIND, which does not claim to improvise creatively. It is a clarification of the boundary. The intent layer, as the field currently builds it, is a computed bridge, not a site of kinetic cognition. It makes the pipeline work; it does not think.
The gap matters most exactly where somatic practice is richest: in improvisation, in creative response, in the generative discovery of movement that has never been made before. These are the cases where movement is thinking, where the intermediate layer is cognitive rather than merely mediating. And these are the cases current AI cannot reach, because it generates by adapting the already-known rather than by thinking kinetically into the not-yet-known.
What This Implies for Somatic AI
Three implications follow, and they are more subtle than a simple "AI cannot improvise."
First, the target is real-time kinetic generativity, not just intermediate representation. A somatic-AI system that aspired to genuine co-creative dialogue would need to do more than compute an intent layer from an instruction. It would need to participate in real-time kinetic generation — to respond to the practitioner's thinking-in-movement with something that is itself generative in the moment, not retrieved from a learned repertoire. This is a far harder target than instruction-following generation, and naming it precisely matters: the goal is not better mapping but kinetic responsiveness.
Second, the practitioner's thinking-in-movement is the irreplaceable element. If genuine movement cognition happens in the kinetic medium, and current AI does not have that medium (it has no moving body thinking in real time), then in any somatic-AI co-creation the thinking in movement is contributed by the practitioner. The AI can respond, propose, extend — but the kinetic cognition, the actual thinking-in-movement, is the human's. This is not a limitation to be engineered away; it is the structural role of the human in the collaboration. The practitioner thinks in movement; the AI participates in the resulting field.
Third, sensing the intent layer is the closest AI can come. If the AI cannot itself think in movement, the most it can do is sense the practitioner's kinetic thinking as it happens, through intermediate-layer signals such as EMG-derived intent estimates and patterns of muscular organisation, which read the thinking-in-movement from the inside, as it forms. An AI that senses the practitioner's intent layer in real time is not thinking in movement itself, but it is reading the practitioner's kinetic thought at the layer where that thought lives. This is the achievable version of participation in thinking-in-movement: not replicating it, but sensing and responding to it at its own level.
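The sensing role described in this third implication can be sketched as a streaming loop. The example below assumes a single EMG channel delivered as plain floats and uses a common simplification: rectify each sample and smooth it with a causal moving average to obtain an activation envelope. The names `emg_envelope` and `sense_and_respond`, the window size, and the threshold are all hypothetical choices for illustration, and the "respond" label is a placeholder for whatever the system would actually do.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def emg_envelope(samples: Iterable[float], window: int = 4) -> Iterator[float]:
    """Rectify each sample and smooth it with a causal moving average."""
    buf = deque(maxlen=window)  # keeps only the most recent `window` rectified samples
    for s in samples:
        buf.append(abs(s))
        yield sum(buf) / len(buf)


def sense_and_respond(
    samples: Iterable[float], threshold: float = 0.5, window: int = 4
) -> Iterator[Tuple[int, float, str]]:
    """Yield (sample index, envelope level, decision) as each sample arrives."""
    for i, level in enumerate(emg_envelope(samples, window)):
        # The system does not generate the movement; it only reads the practitioner's
        # rising muscular activation and decides whether to respond.
        decision = "respond" if level >= threshold else "wait"
        yield i, level, decision


if __name__ == "__main__":
    stream = [0.0, 0.0, 1.0, -1.0, 1.0, -1.0]  # made-up signal, not real EMG data
    for step in sense_and_respond(stream, threshold=0.5, window=2):
        print(step)
```

The loop is causal: each decision depends only on samples already received, which is the minimal condition for responding to a movement while it is still forming.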
Conclusion
The AI field's convergence on the intent layer is a genuine and important discovery: movement must be organised through an intermediate layer, and modelling that layer is what makes generation work. But Sheets-Johnstone's account of thinking in movement reveals that the field's conception of this layer — as a computed bridge between instruction and action — captures only its reproductive function. In generative movement, the intermediate layer is not a bridge but a site of cognition: the place where thinking happens, kinetically, in real time.
Current AI builds the bridge. It does not think in movement. And the difference is precisely the difference between a system that can execute and adapt known movement and a partner that can think kinetically alongside a human in the generative present. For somatic AI, the honest and clarifying conclusion is that the thinking-in-movement remains the human's contribution — and that the highest achievable role for the AI is to sense that kinetic thinking at the intermediate layer where it lives, and to respond to it, rather than to replicate it. The intent layer is where the field and somatic practice meet. It is also where they remain, for now, importantly different.
APA References
Li, B., Zhang, R., Liang, H., et al. (2026). MIND: Multi-scale intent diffusion for text-driven physics-based humanoid control. arXiv:2605.26006. https://arxiv.org/abs/2605.26006
Merleau-Ponty, M. (1962). Phenomenology of perception (C. Smith, Trans.). Routledge. (Original work published 1945)
Sheets-Johnstone, M. (2009). The corporeal turn: An interdisciplinary reader. Imprint Academic.
Sheets-Johnstone, M. (2011). The primacy of movement (2nd ed.). John Benjamins. https://doi.org/10.1075/aicr.82
Varela, F. J., Thompson, E., & Rosch, E. (1991). The embodied mind: Cognitive science and human experience. MIT Press.