Contrastive learning-driven spatiotemporal dynamically adaptive framework for stylized 3D human motion generation.

Year: 2026
Authors: Song Z et al.
Affiliation: College of Physical Education, China

Abstract

Most existing methods for 3D human motion generation focus primarily on global style statistics in the temporal dimension, which limits their ability to capture local stylistic variations in dynamic motions. As a result, the generated sequences often lack expressive detail. To address this challenge, a contrastive learning-driven framework is proposed for spatiotemporal dynamically adaptive stylized 3D human motion generation. Building upon conventional spatial attention (SA) and temporal attention (TA) modules, two instance normalization variants, spatial attention instance normalization (SAIN) and temporal attention instance normalization (TAIN), are introduced to disentangle and extract motion style features from local and global perspectives, respectively. Simultaneously, a dual-path structure is employed to isolate pure motion content at both local and global levels, ensuring effective separation of style and content information. A style injector, composed of spatially adaptive dynamic attention (SADA) and temporally adaptive dynamic attention (TADA) modules, is developed to integrate the extracted style features with motion content in a temporally and spatially ordered manner, enabling fine-grained style injection. During training, a style contrastive loss and a content contrastive loss are incorporated to pull features with similar styles or contents into compact clusters in the feature space while pushing dissimilar ones apart. This enhances both the stylistic diversity and the content fidelity of the generated sequences. Comprehensive experiments on the Xia dataset demonstrate the superior performance of the proposed method, which achieves an FID of 0.06, accuracy of 96.70%, diversity of 5.67, and multimodality of 0.97, all of which are close to those of real data (FID 0.01). In the motion style transfer task, the model attains 94.11 CRA and 89.41 SRA, outperforming state-of-the-art baselines.
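
The style contrastive loss described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a supervised contrastive (SupCon-style) objective over cosine similarities with a temperature of 0.1. The function names, the temperature, and the toy 2-D features are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SupCon-style loss (an assumed form, not the paper's definition).

    For each anchor i, samples sharing its label are positives; all other
    samples are contrasted against them. Anchors without positives are skipped.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Scaled similarities between the anchor and every other sample.
        logits = [dot(z[i], z[j]) / temperature for j in range(n) if j != i]
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Average negative log-probability over the anchor's positives.
        loss_i = -sum(
            dot(z[i], z[p]) / temperature - log_denom for p in positives
        ) / len(positives)
        total += loss_i
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    # Toy style features: two tight groups in a 2-D feature space.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    clustered = supervised_contrastive_loss(feats, ["angry", "angry", "proud", "proud"])
    mixed = supervised_contrastive_loss(feats, ["angry", "proud", "angry", "proud"])
    print(f"labels match clusters: {clustered:.4f}")
    print(f"labels cut across clusters: {mixed:.4f}")
```

Under these assumptions, the loss is small when features with the same style label already sit close together and large when same-label features are far apart, which is the clustering behaviour the abstract attributes to the contrastive terms. A content contrastive loss could reuse the same function with content labels in place of style labels.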

Original publication: https://europepmc.org/article/MED/41712582