Facial rigging—creating FACS-based blendshapes together with inner-mouth geometry (teeth, gums, and tongue)—remains a major bottleneck in 3D character production. Existing pipelines still require substantial designer effort, especially for manual landmark annotation, per-character template adjustment, and inner-mouth placement.
We present OmniFaceRig, a fully automatic end-to-end pipeline that converts a static surface-only 3D character mesh, with no pre-modeled oral cavity, into an inner-mouth-aware FACS rig with up to 155 blendshapes, procedurally fitted teeth, gums, and tongue, and re-packed UV/texture. OmniFaceRig supports diverse topologies—humans, humanoids, long-muzzled animals (e.g., dogs, wolves, foxes), and short-muzzled animals (e.g., cats, bears, rabbits, tigers)—with no manual landmarks, no user-provided templates, and no per-asset setup.
The pipeline combines hybrid VLM+CV riggability checking, multi-model face parsing, dense keypoint-driven template registration, procedural inner-mouth construction, and collision-aware blendshape transfer. For non-human characters, OmniFaceRig selects topology-specific face and inner-mouth templates and uses collision-aware inner-mouth fitting to reduce teeth-face intersections without exposing users to category-specific tuning.
We also publicly release Omni-Bench, a freely available benchmark dataset of 1,000 biped 3D characters with FACS facial blendshapes and inner-mouth geometry, spanning humans, humanoids, cats, dogs, and other animals. Experiments show high final rigging success on screened Omni-Bench inputs, nearly complete face detection recall from the segmentation ensemble, reliable inner-mouth placement with low penetration, and 20–30 s end-to-end processing time per asset on a single A100 GPU, including data I/O. Together, OmniFaceRig provides an automatic path from static generated characters to animation-ready facial rigs across both human and non-human topologies.
Each card below shows three columns: the input body image, the original surface-only GLB, and the rigged GLB with FACS animation. Click the ▶ Load button on a card to activate both 3D viewers at once. Inside each viewer, drag the splitter to toggle texture (left half) vs. solid grey (right half); camera and split position are shared across both viewers (left-drag rotate, scroll zoom, shift+drag pan). After 2.5 s of idle time the model slowly auto-rotates.
Riggability assessment, multi-model face detection & segmentation, landmark extraction, and rigid + non-rigid template registration produce a fitted face mesh.
Face mesh fusion, teeth registration & texture baking, and FACS blendshape transfer produce the final inner-mouth-aware FACS rig.
To facilitate future research in automatic facial rigging, we release Omni-Bench, to our knowledge the first open-source benchmark of biped 3D characters with FACS facial blendshapes and inner-mouth geometry (teeth, gums, and tongue). Omni-Bench contains 1,000 rigged biped 3D characters, split into 500 human and humanoid characters and 500 animals (150 cats across 10 breeds, 150 dogs across 10 breeds, and 200 other common animals including bears, tigers, lions, foxes, wolves, rabbits, and deer). All assets are released in T-pose with up to 12 appearance variations and full generation-pipeline metadata (text prompt, intermediate 2D reference image, final 3D mesh).
Omni-Bench stands out in three critical ways. First, it uniquely supports both human and animal characters within a unified rigging framework. Second, every model is equipped with a complete set of auto-generated FACS blendshapes (up to 155 shapes) that explicitly include teeth, gums, and tongue — a feature absent in nearly all existing large-scale datasets. Third, each asset includes the complete generation pipeline data (text + reference image + 3D mesh), making it a valuable resource for text-to-3D generation and multimodal research.
| Dataset | Year | Total Models | Species | FACS Blendshapes | Inner Mouth | Full Character | Text + 2D Image |
|---|---|---|---|---|---|---|---|
| BFM | 2009 | 200 | Human | ✗ | ✗ | ✗ | ✗ |
| CoMA | 2018 | 144 | Human | ✗ | ✗ | ✗ | ✗ |
| VOCASET | 2019 | 12 (4D) | Human | ✗ | ✗ | ✗ | ✗ |
| FaceScape | 2020 | 16,940 | Human | ✗ | ✗ | ✗ | ✗ |
| ICT FaceKit | 2020 | Parametric | Human | ✓ | ✓ | ✗ | ✗ |
| Multiface | 2022 | 13 | Human | ✗ | ✗ | ✗ | ✗ |
| RaBit | 2023 | 1,500 | Human & Cartoon | ✗ | ✗ | ✓ | ✗ |
| Anymate | 2025 | 230,000 | Human & Animal | ✗ | ✗ | ✓ | ✗ |
| Omni-Bench (Ours) | 2026 | 1,000 | Human & Animal | ✓ | ✓ | ✓ | ✓ |
If you find this work useful, please cite our paper.
@article{wang2026omnifacerig,
title={OmniFaceRig: Fully Automatic Inner-Mouth-Aware Face Rigging Across Diverse 3D Character Topologies},
author={Wang, Chao and Ma, Guangyao and Doublestein, John and Chen, Junming and Lin, Yiming and Su, Zhaoen and Luo, Xiaomin and Cheng, Shiyang and Shen, Jie and Roble, Doug and Wang, Dilin and Li, Yilei and Ranjan, Rakesh},
journal={arXiv preprint arXiv:2606.08043},
year={2026}
}