Community Scout Digest — 8–14 April 2026
1. arXiv
Learning Long-term Motion Embeddings for Efficient Kinematics Generation
Stracke, N., Bauer, K., Baumann, S.A., Bautista, M.A., Susskind, J., & Ommer, B. (2026-04-13). Learning Long-term Motion Embeddings for Efficient Kinematics Generation. arXiv:2604.11737. https://arxiv.org/abs/2604.11737
From the CompVis group, this paper sidesteps pixel-level video synthesis entirely: instead of generating full video, a flow-matching model generates a learned 64× temporally compressed motion embedding derived from large-scale tracker trajectories. Text prompts or spatial interaction signals condition the generation. The result outperforms both general video generators and task-specific motion models on motion distribution quality — directly relevant to any pipeline that needs expressive, controllable body kinematics without video rendering overhead.
LPM 1.0: Video-based Character Performance Model
Yang, C., Ge, C., Zhang, E., et al. (2026-04-10). LPM 1.0: Video-based Character Performance Model. arXiv:2604.07823. https://arxiv.org/abs/2604.07823 | Project: https://large-performance-model.github.io
A 17B-parameter Diffusion Transformer (with a distilled real-time streaming variant) trained to model the full audiovisual performance of a conversational character — synchronised speaking, listening, micro-expressions, and natural body motion — at infinite length. The paper frames conversation as a performance problem grounded in Goffman and nonverbal communication theory, which puts it unusually close to somatic and embodied intelligence research. Relevant to interactive embodied character systems and the question of how AI can sustain affective, movement-based presence over time.
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision
Cha, H., Woo, W., Kim, B., & Joo, H. (2026-04-08). Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision. arXiv:2604.04934. https://arxiv.org/abs/2604.04934 | Project: https://hyunsoocha.github.io/vanast/
Seoul National University proposes a unified one-stage framework that simultaneously transfers garments and animates a human body from a pose-guidance video, eliminating the identity drift common in two-stage pipelines. The architecture — a Dual Module video diffusion transformer — is noteworthy for its handling of full body motion coherence under clothing change, which maps onto body-to-visual synthesis challenges in somatic practice documentation and costume-aware performance AI.
A Unified Conditional Flow for Motion Generation, Editing, and Intra-Structural Retargeting 🚩
Li, J., Song, X., Wang, S., Huang, H., & Zhao, Y. (2026-04-14/15). A Unified Conditional Flow for Motion Generation, Editing, and Intra-Structural Retargeting. arXiv:2604.13427. https://arxiv.org/abs/2604.13427
🚩 Submission date varies between sources (April 14 vs 15); included as borderline.
A single rectified-flow model — with per-joint tokenisation and kinematic constraints — performs text-to-motion generation, zero-shot motion editing, and zero-shot skeletal retargeting within one unified framework. Evaluated on SnapMoGen and Mixamo. This is directly relevant to choreographic editing workflows: the ability to describe motion in language and transfer it across skeletal structures without retraining has clear applications in somatic AI tools.
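To make the rectified-flow idea concrete, here is a minimal sketch of the sampling step only: a velocity field is integrated from noise at t=0 to a sample at t=1 with fixed Euler steps. This is not the paper's model. The names `euler_sample` and `make_toy_velocity` are hypothetical, the pose is a flat list of floats rather than per-joint tokens, and the trained network is replaced by a hand-written velocity that points along a straight line to one known target.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int = 8,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # current time, always strictly below 1 inside the loop
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Vector) -> Callable[[Vector, float], Vector]:
    """Assumption: stands in for a trained network.

    Returns the exact straight-line velocity toward a single known target,
    i.e. (target - x) / (1 - t).
    """
    def velocity(x: Vector, t: float) -> Vector:
        remaining = 1.0 - t  # time left until t=1
        return [(ti - xi) / remaining for ti, xi in zip(target, x)]

    return velocity


# Hypothetical three-value "pose"; real motion data would be far larger.
noise = [0.5, -1.2, 0.3]
target_pose = [0.0, 1.0, 0.25]
sample = euler_sample(make_toy_velocity(target_pose), noise, num_steps=4)
```

Because the toy path is a straight line, the Euler integration reaches the target even with very few steps, which is the property rectified flows aim for. A learned velocity field would only approximate this behaviour.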
BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion 🚩
Jia, T., Yang, K., Yang, X., Tang, X., et al. (2026-04-06). BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion. arXiv:2604.04395. https://arxiv.org/abs/2604.04395
🚩 Submitted April 6; may have appeared in arXiv listings during the Apr 8–14 window.
Introduces CM-Data, the first large-scale (~10 hours) 3D conducting motion dataset, alongside a BiMamba-Transformer diffusion model that synthesises fine-grained conductor gestures from music. Applications cited include music education, virtual performance, and digital human animation. Directly relevant as a gesture-from-audio synthesis model — a close analogue to somatic gesture generation from musical or auditory cues.
Dynamic Whole-Body Dancing with Humanoid Robots 🚩
Zhang, S., Wu, J., Liu, G., et al. (2026-04-05). Dynamic Whole-Body Dancing with Humanoid Robots — A Model-Based Control Approach. arXiv:2604.03999. https://arxiv.org/abs/2604.03999
🚩 Submitted April 5; may have been indexed and circulated during the Apr 8–14 window.
A model-based control framework that converts human motion capture data into dynamically feasible whole-body dance routines executed by physical humanoid robots, validated in a live public performance with four robots dancing in coordination. Sits at the intersection of dance as choreographic data, embodied robotics, and MPC stability — relevant background for practitioners thinking about how somatic movement knowledge transfers to physical AI agents.
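The three flagged entries above all turn on whether a submission date falls inside the 8–14 April window. The sketch below shows one way that check could be automated, using the dates listed in this section. The function name `in_window` and the dictionary layout are illustrative assumptions, and for arXiv:2604.13427, where sources disagree, the later date (April 15) is assumed so that the entry is flagged rather than silently accepted.

```python
from datetime import date

# Digest window, inclusive on both ends.
WINDOW_START = date(2026, 4, 8)
WINDOW_END = date(2026, 4, 14)


def in_window(submitted: date, start: date = WINDOW_START, end: date = WINDOW_END) -> bool:
    """Return True if the submission date lies inside the inclusive window."""
    return start <= submitted <= end


# Submission dates as cited in this section, keyed by arXiv identifier.
submissions = {
    "2604.11737": date(2026, 4, 13),
    "2604.07823": date(2026, 4, 10),
    "2604.04934": date(2026, 4, 8),
    "2604.13427": date(2026, 4, 15),  # assumption: sources say April 14 or 15
    "2604.04395": date(2026, 4, 6),
    "2604.03999": date(2026, 4, 5),
}

# Entries whose submission date falls outside the window get a flag.
flagged = [arxiv_id for arxiv_id, day in submissions.items() if not in_window(day)]
```

With these dates, `flagged` holds the same three papers marked 🚩 above. Choosing April 14 instead for arXiv:2604.13427 would clear its flag, which is why that entry is treated as borderline.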
2. Hugging Face Daily Papers
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents
Tencent Hunyuan. (2026-04-10). HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents. HuggingFace Daily Papers. https://huggingface.co/papers/2604.07430
Tencent Hunyuan's open embodied foundation model for real-world physical agents. Relevant as an emerging open-weight baseline for embodied AI, signalling rapid scaling of publicly accessible embodied reasoning models. Less directly tied to somatic practice but worth tracking as infrastructure for physical AI pipelines.
Also noted Apr 8 (HuggingFace): Action Images: End-to-End Policy Learning via Multiview Video Generation (https://huggingface.co/papers/2604.06168) — uses multiview video generation as a supervisory signal for robot policy learning; relevant to the body-as-data / movement-to-policy direction.
3. Import AI Newsletter
Import AI #453 — "Breaking AI agents; MirrorCode; and ten views on gradual disempowerment" Jack Clark. (2026-04-13). https://jack-clark.net/2026/04/13/import-ai-453-breaking-ai-agents-mirrorcode-and-ten-views-on-gradual-disempowerment/
Issue #453 (the only issue in the Apr 8–14 window) covers AI agent security vulnerabilities, the MirrorCode software-reimplementation benchmark, and policy frameworks for AI-driven economic disruption. No items relevant to motion, dance, embodied AI, or physical AI this week.
4. Lab Blogs
Gemini Robotics-ER 1.6: Powering Real-World Robotics Through Enhanced Embodied Reasoning
Google DeepMind. (2026-04-14). Gemini Robotics-ER 1.6. DeepMind Blog. https://deepmind.google/blog/gemini-robotics-er-1-6/
A model update focused on spatial understanding, success detection, and instrument reading in real-world robotic environments, improving how AI agents perceive and reason about physical tasks such as facility inspection and equipment monitoring. Directly relevant as a leading frontier model for embodied physical reasoning; its spatial and success-detection capabilities are foundational for any AI system that must interpret or respond to a body in space.
Meta AI blog (Apr 8–14): No posts on motion, video generation, embodied AI, or physical/body AI in this window.
5. GitHub Trending
No motion-capture, dance, pose-estimation, or creative-coding repositories appeared in the Python trending list during this week. Nothing to report.
6. Conference News
MOCO 2026 — Programme Published, Registration Open
Movement and Computing. (Ongoing). MOCO'26 — 10th International Conference on Movement and Computing. https://moco26.movementcomputing.org/
Conference dates: 23–25 April 2026, Cité des Arts, Montpellier, France. The full three-day programme is now published (paper sessions, keynotes, practice works, doctoral consortium). This year's theme foregrounds health applications within movement and computing. Acceptance notifications went out on January 12, and the camera-ready deadline was in March 2026. The conference is now about 5 days away, making it the most significant near-term gathering for the movement-and-computing field. No new CFP or paper-list announcement was issued in the Apr 8–14 window, but programme details are live.
NIME 2026 — Upcoming, No New Announcements This Week
NIME. (Ongoing). NIME 2026 — New Interfaces for Musical Expression. https://nime2026.org/
Conference dates: 23–26 June 2026, London (Imperial College / Loughborough University London). Review process underway; no accepted-paper announcement in the Apr 8–14 window.