Somatic-AI Content Platform

Kinesthetic Intelligence and the Limits of Motion Manifold Learning: A Phenomenological Critique

I. The Problem of Beginning: Why Kinematics Is Not Kinesthesia

Contemporary machine learning approaches to movement representation share a foundational assumption so pervasive it is rarely articulated: that movement can be adequately described as the coordinated displacement of body segments through Euclidean space over time. Motion capture systems, skeletal pose sequences, and the manifold-learning architectures that compress and generalize them all operate on this premise. A motion manifold, in the technical sense established by Holden et al. (2015), is a lower-dimensional embedding of high-dimensional kinematic data—joint angles, root velocities, contact flags—learned through an autoencoder whose latent space is presumed to capture the structure of natural movement. The representational commitment here is to trajectory: the where and when of bodily displacement.
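To make the representational commitment concrete, the following is a minimal sketch of what enters such a pipeline. It is not the architecture of Holden et al. (2015): the `Frame` layout, the three-joint toy skeleton, and the fixed random projection in `make_linear_encoder` are all assumptions chosen for illustration, with the projection standing in for a trained encoder. The point is only that every input field records where and when.

```python
from dataclasses import dataclass
from typing import List
import random


@dataclass
class Frame:
    """One time step of purely kinematic data (hypothetical layout)."""
    joint_angles: List[float]   # radians, one value per joint
    root_velocity: List[float]  # (vx, vy, vz) of the root segment
    contacts: List[int]         # 1 if a foot touches the ground, else 0


def flatten(window: List[Frame]) -> List[float]:
    """Concatenate a window of frames into one high-dimensional vector."""
    vec: List[float] = []
    for frame in window:
        vec.extend(frame.joint_angles)
        vec.extend(frame.root_velocity)
        vec.extend(float(c) for c in frame.contacts)
    return vec


def make_linear_encoder(in_dim: int, latent_dim: int, seed: int = 0):
    """Return a fixed random projection; an assumed stand-in for a trained encoder."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(latent_dim)]

    def encode(x: List[float]) -> List[float]:
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return encode


# Four frames of a toy three-joint skeleton with two contact flags.
window = [Frame([0.1 * t, 0.2, -0.3], [1.0, 0.0, 0.0], [t % 2, 1])
          for t in range(4)]
x = flatten(window)                       # 4 frames * (3 + 3 + 2) values
encode = make_linear_encoder(len(x), latent_dim=2)
z = encode(x)
print(len(x), len(z))  # prints "32 2"
```

Nothing in `Frame` has a field for how the movement was felt or performed, so whatever the encoder learns, it learns from displacement alone.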

Maxine Sheets-Johnstone's phenomenology of movement begins from a fundamentally different premise. In The Primacy of Movement (2011), she argues that kinesthetic consciousness is not a sensory channel that reads off spatial displacement from proprioceptive signals; it is, rather, the original form of self-awareness, temporally and ontologically prior to any objectified notion of the body as a thing that moves through space. The moving body does not first exist and then move: it comes to know itself through movement. Before representation, before perception of the external world as a field of objects, there is movement, and within movement there is a qualitative richness that current computational architectures do not encode and cannot, in principle, recover.

This essay examines what Sheets-Johnstone's account demands of any adequate representation of movement, and why motion manifolds as currently constructed fall short of that demand across three specific axes of failure. The stakes are not merely philosophical. For AI systems that claim to model movement—whether for character animation, human-robot interaction, or somatic feedback applications—the question of whether the latent space encodes anything analogous to kinesthetic quality, or only kinematic trajectory, determines what such systems can actually know about the body.

II. Qualitative Dynamics: The Unit Sheets-Johnstone Proposes

The core theoretical contribution of Sheets-Johnstone's mature phenomenology is the concept of qualitative dynamics—a term that resists easy paraphrase because it names precisely what is absent from the vocabulary of computational motion science. For Sheets-Johnstone (2011), the fundamental unit of kinesthetic experience is not a pose, not a keyframe, not a sequence of joint-angle vectors. It is a dynamic quality: the felt character of a movement as it unfolds—its tensional structure, its linear quality (whether it moves in a straight, curved, or undulating path), its amplitudinal spread, and its projectional quality (its sense of gathered force or release, its directional impetus).

These four dimensions—tension, linearity, amplitude, projection—are not features added to movement by a perceiving subject; they are constitutive of what the movement is as lived experience. They are what Sheets-Johnstone (2011), drawing on her earlier phenomenological analyses of dance, calls the "qualitative dynamics" that make one movement phenomenologically distinct from another even when the two movements are kinematically similar. A swing of the arm with high tension and projective force is not merely the same joint-angle trajectory performed at higher velocity: it is a qualitatively different movement, experienced differently from the inside and perceived differently from the outside.

This account has direct implications for motion manifold design. A latent space learned from kinematic data cannot, in principle, encode qualitative dynamics, because the training signal contains no information about tensional quality, projectional intent, or amplitudinal spread as experienced phenomena. What the autoencoder learns is the statistical structure of positional displacement sequences. What Sheets-Johnstone (2011) identifies as the unit of kinesthetic experience—the qualitative dynamics that give movement its character—is systematically absent from the input, and therefore cannot be present in the latent space, however sophisticated the architecture.

III. Movement Thinking: Intelligence Without Representation

A second dimension of Sheets-Johnstone's argument concerns the epistemic status of movement. She proposes, provocatively, that movement is not a product of intelligence but its original form. "Thinking in movement" (Sheets-Johnstone, 2011, p. 117) is not a metaphor for the way that cognitive processes are implemented in embodied action; it is a claim about the phylogenetic and ontogenetic priority of kinesthetic intelligence. Animate creatures solve problems, navigate environments, and coordinate with conspecifics through movement before they develop the representational capacities that are ordinarily identified with cognition. The infant's discovery of the world is a discovery made through movement—through the felt consequences of kicking, grasping, turning, and reaching.

This thesis connects directly to Husserl's account of kinesthesia in Thing and Space (1907/1997). Husserl argued that the constitution of spatial experience—the sense that there is an objective world extended in three dimensions—depends on kinesthetic sensations that accompany voluntary movement. The body does not perceive space; it constitutes space through movement. The kinesthetic "I can"—the felt capacity to move—is what first opens a spatial field for experience. Without this grounding of kinesthesia in constitutive bodily activity, the perceived world would have no depth, no orientation, no sense of reachability.

Gallagher (2005), synthesizing Husserl with more recent empirical work in cognitive neuroscience, distinguishes the body schema—the pre-reflective sensorimotor system that organizes ongoing movement—from the body image, the explicit representation of the body available to reflective consciousness. The body schema is not a model; it is a capacity, a dynamic readiness for movement that is neither propositional nor representational in any ordinary sense. What Gallagher's (2005) distinction reveals is that the intelligence operative in skilled movement is not the application of stored representations to motor execution; it is the ongoing, non-representational coordination of the sensing-moving body with its environment.

For motion manifold learning, this creates a foundational problem. The training paradigm assumes that movement intelligence is captured in the statistical structure of movement sequences—that what needs to be learned is the manifold of trajectories that a body typically produces. But if Sheets-Johnstone and Gallagher are correct, the intelligence in movement is not located in the trajectory; it is located in the capacity to produce qualitatively appropriate movement in response to felt circumstances. A manifold of trajectories encodes the outputs of movement intelligence, not the intelligence itself. This is analogous to studying cognition purely through behavioural outputs while systematically excluding any account of the processes that generate them.

IV. What a Phenomenologically Adequate Motion Representation Would Need to Encode

If we take Sheets-Johnstone's (2011) account seriously as a constraint on representational adequacy, what would a phenomenologically adequate motion manifold need to encode? Three necessary conditions emerge from the preceding analysis.

First, the representation must encode tensional quality: the degree of muscular engagement, effort, and resistance that characterises a movement's inner dynamic. Tensional quality, which is Sheets-Johnstone's (2011) term and is comparable to the effort factors of Rudolf Laban's earlier movement analysis, is not reducible to force output or velocity profile, though it is correlated with both. Two movements may exhibit identical force profiles while differing in tensional quality, one performed with gathered, sustained attention, the other with sudden release. Encoding tensional quality requires either physiological proxies (EMG signals, intramuscular pressure) or perceptual judgements from skilled observers. Neither is standardly included in the training sets used for motion manifold learning.

Second, the representation must encode temporal quality as experienced duration, not as a metric parameter. Husserl (1907/1997) distinguished between objective time—the measurable succession of events—and the living present (lebendige Gegenwart) of temporal consciousness, which retains the just-passed and anticipates the about-to-come in a dynamic structure of protention and retention. In kinesthetic consciousness, this means that a movement is not experienced as a sequence of instants but as a dynamic whole with an internal temporal shape: a movement has a beginning that is already oriented toward its end, and an end that carries the trace of its beginning. Current motion manifold architectures treat time as an indexing parameter over frames; they do not encode the phenomenological structure of movement as lived temporal experience.

Third, the representation must encode intentional quality—the sense that movement is toward something, that it is organised around a perceptual or practical target in a way that is intrinsic to its character, not added post-hoc. Gallagher (2005) argues that the body schema is always already intentional: it is structured by practical goals and affordances rather than by abstract spatial metrics. A reaching movement is not first a trajectory and then an intention; it is constituted as reaching by its orientation toward the graspable. This intentional quality—what Merleau-Ponty termed the "intentional arc"—is phenomenologically non-separable from the movement's kinetic character, but is entirely absent from representations that encode only joint configurations and velocities.

V. Three Specific Gaps Between Current Motion Manifolds and Sheets-Johnstone's Account

Gap 1: The Flattening of Effort into Velocity. Current architectures, including the convolutional autoencoder framework of Holden et al. (2015), represent movement dynamics principally through positional sequences from which velocity and acceleration profiles can be derived. Effort quality in Sheets-Johnstone's (2011) sense—the tensional, energetic character of a movement—is systematically reduced to its kinematic correlates. But tensional quality and velocity are not equivalent: slow movements can carry intense effort; fast movements can be effortless. The latent space of a kinematic autoencoder cannot distinguish these cases because the training signal does not. This is not a problem of model capacity; it is a problem of representational vocabulary. No amount of additional network depth will learn to encode a quality that is absent from the input.
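The argument of Gap 1 can be illustrated with a toy case. In the sketch below, `kinematic_encode` is a hypothetical stand-in for any deterministic encoder that sees only positions, and the two recordings, their position samples, and their EMG values are invented for illustration. Because the positional input is identical, the two codes must be identical, whatever the effort signal says.

```python
def kinematic_encode(positions):
    """Toy stand-in for a kinematic encoder: (mean position, mean speed)."""
    speeds = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return (sum(positions) / len(positions), sum(speeds) / len(speeds))


# Two hypothetical recordings of the same slow arm raise. The positional
# trace is identical; only the (assumed) normalised EMG channel differs.
relaxed = {
    "positions": [0.0, 0.1, 0.2, 0.3, 0.4],
    "emg": [0.05, 0.05, 0.06, 0.05, 0.05],
}
braced = {
    "positions": [0.0, 0.1, 0.2, 0.3, 0.4],
    "emg": [0.80, 0.85, 0.90, 0.85, 0.80],
}

z_relaxed = kinematic_encode(relaxed["positions"])
z_braced = kinematic_encode(braced["positions"])

# The encoder never sees the EMG channel, so the codes cannot differ.
same_code = z_relaxed == z_braced

# The effort proxy, by contrast, separates the two recordings clearly.
effort_gap = (sum(braced["emg"]) / len(braced["emg"])
              - sum(relaxed["emg"]) / len(relaxed["emg"]))

print(same_code, effort_gap > 0.5)  # prints "True True"
```

Replacing the toy encoder with a deeper one changes nothing here: any function of the positional trace alone returns the same value for both recordings.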

Gap 2: The Absence of Kinesthetic Self-Reference. Sheets-Johnstone (2011) insists that kinesthetic consciousness is self-referential: the moving body is simultaneously the subject and object of kinesthetic experience, and this self-reference is constitutive, not incidental. What is felt in movement is not only the spatial outcome but the moving-itself—the sense of one's own body as an animate, dynamically organised whole. Motion manifolds encode the outputs of this self-referential process—the observed trajectories—but have no representational slot for the first-person dimension of kinesthetic self-awareness. This is not merely a limitation of available data; it reflects a deeper architectural commitment to third-person, allocentric representation that is phenomenologically misaligned with the structure of kinesthetic consciousness as Sheets-Johnstone (2011) describes it.

Gap 3: The Loss of Qualitative Individuation. For Sheets-Johnstone (2011), qualitative dynamics individuate movements in a way that kinematic metrics do not. Two performances of a gesture that are kinematically near-identical—that would cluster together in any embedding learned from positional data—may be phenomenologically remote: one performed with openness and weight, the other with held breath and surface agility. The manifold topology reflects kinematic similarity, not qualitative-dynamic similarity. This matters practically as well as theoretically: for any application in which the felt quality of movement is consequential—rehabilitation, expressive performance, somatic practice—a latent space that cannot individuate qualitative dynamics will systematically group together movements that are experientially distinct and separate movements that are experientially related.

VI. Toward a Post-Kinematic Representational Framework

The implications of the foregoing analysis are not defeatist. They do not entail that computational motion representation is impossible or that AI systems cannot meaningfully engage with movement. They entail, rather, that the field requires a principled theoretical account of what movement is as lived experience before it can design representational architectures adequate to movement's structure.

Several directions are suggested by the analysis. Multimodal training regimes that include physiological effort signals alongside kinematic data could begin to close the first gap. Architectures that model movement as continuous temporal processes—rather than sequences of discrete frames—could better approximate the phenomenological structure of movement as lived duration. And evaluation frameworks that assess learned representations not only on reconstruction accuracy but on qualitative-dynamic fidelity—using perceptual ratings from trained movement practitioners as ground truth—could establish a metric capable of detecting the third gap.
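One way the proposed evaluation could be operationalised is as a rank correlation between pairwise distances in the latent space and pairwise dissimilarity ratings from trained practitioners. The sketch below is an assumption about how such a metric might look, not an established benchmark: the function name `qualitative_fidelity`, the four latent codes, and the ratings are all hypothetical, and Spearman's coefficient is one choice among several.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def qualitative_fidelity(latents, rated_dissimilarity):
    """Correlate latent distances with practitioner-rated dissimilarity."""
    pairs = list(combinations(range(len(latents)), 2))
    latent_d = [math.dist(latents[i], latents[j]) for i, j in pairs]
    rated_d = [rated_dissimilarity[pair] for pair in pairs]
    return spearman(latent_d, rated_d)


# Hypothetical latent codes for four clips and hypothetical ratings in [0, 1].
latents = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (9.0, 0.0)]
rated = {(0, 1): 0.9, (0, 2): 0.2, (0, 3): 0.3,
         (1, 2): 0.8, (1, 3): 0.7, (2, 3): 0.1}

score = qualitative_fidelity(latents, rated)
print(round(score, 2))  # prints "-0.54"
```

In this invented case the score is negative: clips 0 and 1 sit close together in the latent space yet are rated as highly dissimilar, which is the pattern the third gap predicts. A reconstruction-accuracy metric would not register the mismatch at all.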

What Sheets-Johnstone's (2011) phenomenology ultimately demands, however, is more than better data or different architectures. It demands a reconceptualisation of what motion manifold learning is for. If kinesthetic intelligence is, as she argues, the original form of intelligence, and if movement thinking precedes and grounds propositional, representational, and linguistic cognition, then systems that learn from movement data are not merely compressing kinematic sequences. They are, potentially, engaging with a domain of intelligence that has its own qualitative structure, its own temporal form, and its own epistemological commitments. Whether current architectures are capable of doing justice to that structure is, on the analysis presented here, doubtful. That they should aspire to do so is, philosophically, inescapable.

References

Gallagher, S. (2005). How the body shapes the mind. Oxford University Press.

Holden, D., Saito, J., Komura, T., & Joyce, T. (2015). Learning motion manifolds with convolutional autoencoders. In SIGGRAPH Asia 2015 Technical Briefs (Article 18). ACM.

Husserl, E. (1997). Thing and space: Lectures of 1907 (R. Rojcewicz, Trans.). Kluwer Academic Publishers. (Original lectures delivered 1907)

Sheets-Johnstone, M. (2011). The primacy of movement (2nd ed., expanded). John Benjamins Publishing Company.