360°-Motion Dataset
Project page | Paper | Code
Acknowledgments
We thank Jinwen Cao, Yisong Guo, Haowen Ji, Jichao Wang, and Yi Wang from Kuaishou Technology for their help in constructing our 360°-Motion Dataset.
News
- [2024-12] We release the V1 dataset (72,000 videos covering 50 entities, 6 UE scenes, and 121 trajectory templates).
Data structure
├── 360Motion-Dataset                      Video Number        Cam-Obj Distance (m)
    ├── 480_720/384_672
        ├── Desert (desert)                18,000              [3.06, 13.39]
            ├── location_data.json
        ├── HDRI
            ├── loc1 (snowy street)        3,600               [3.43, 13.02]
            ├── loc2 (park)                3,600               [4.16, 12.22]
            ├── loc3 (indoor open space)   3,600               [3.62, 12.79]
            ├── loc11 (gymnastics room)    3,600               [4.06, 12.32]
            ├── loc13 (autumn forest)      3,600               [4.49, 11.91]
            ├── location_data.json
        ├── RefPic
        ├── CharacterInfo.json
        ├── Hemi12_transforms.json
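As a quick sanity check, the per-scene video counts in the tree above can be summed to recover the released totals. The sketch below assumes that the listed counts apply to each of the two resolution folders; the dictionary keys are just illustrative labels taken from the folder names.

```python
# Per-scene video counts for ONE resolution folder, copied from the tree above.
# Assumption: the same counts hold for both 480_720 and 384_672.
videos_per_scene = {
    "Desert": 18_000,
    "loc1": 3_600,
    "loc2": 3_600,
    "loc3": 3_600,
    "loc11": 3_600,
    "loc13": 3_600,
}

num_resolutions = 2  # 480_720 and 384_672

per_resolution = sum(videos_per_scene.values())
total = per_resolution * num_resolutions

print(f"videos per resolution: {per_resolution}")
print(f"videos in total:       {total}")
```

Under that assumption, the result is 36,000 videos per resolution and 72,000 in total, which matches the V1 release note.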
(1) Released Dataset Information
| Argument | Description | Argument | Description |
|---|---|---|---|
| Video Resolution | (1) 480×720 (2) 384×672 | Frames/Duration/FPS | 99/3.3s/30 |
| UE Scenes | 6 (1 desert+5 HDRIs) | Video Samples | (1) 36,000 (2) 36,000 |
| Camera Intrinsics (fx,fy) | (1) 1060.606 (2) 989.899 | Sensor Width/Height (mm) | (1) 23.76/15.84 (2) 23.76/13.365 |
| Hemi12_transforms.json | 12 surrounding cameras | CharacterInfo.json | entity prompts |
| RefPic | 50 animals | 1/2/3 Trajectory Templates | 36/60/35 (121 in total) |
| {D/N}_{locX} | {Day/Night}_{LocationX} | {C}_{XX}_{35mm} | {Close-Up Shot}_{Cam. Index (1-12)}_{Focal Length} |
Note that the resolution of 384×672 refers to our internal video diffusion resolution. In fact, we render the video at a resolution of 378×672 (aspect ratio 9:16), with a 3-pixel black border added to both the top and bottom.
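The listed camera intrinsics appear consistent with the standard pinhole relation `f_px = f_mm * pixels / sensor_mm`. The sketch below recomputes them under two assumptions that are not stated explicitly in the table: that the focal length is the 35 mm from the sequence naming scheme, and that the 13.365 mm sensor height corresponds to the 378-pixel rendered height before padding. The function name `focal_length_px` is hypothetical.

```python
def focal_length_px(focal_mm: float, sensor_mm: float, pixels: int) -> float:
    """Pinhole focal length in pixels: f_px = f_mm * pixels / sensor_mm."""
    return focal_mm * pixels / sensor_mm


FOCAL_MM = 35.0  # assumption: taken from the "35mm" suffix in sequence names

# name -> (width_px, height_px, sensor_width_mm, sensor_height_mm)
configs = {
    "480x720": (720, 480, 23.76, 15.84),
    # assumption: 378 is the rendered height before the 3-pixel borders
    "384x672": (672, 378, 23.76, 13.365),
}

intrinsics = {}
for name, (w, h, sensor_w, sensor_h) in configs.items():
    fx = focal_length_px(FOCAL_MM, sensor_w, w)
    fy = focal_length_px(FOCAL_MM, sensor_h, h)
    intrinsics[name] = (fx, fy)
    print(f"{name}: fx={fx:.3f}, fy={fy:.3f}")
```

With these inputs, both resolutions give `fx == fy` up to floating-point error, at about 1060.606 and 989.899 pixels respectively, in agreement with the table.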
(2) Differences from the Dataset Used to Train Our Internal Video Diffusion Model
The full dataset, which includes more entities and UE scenes, is still undergoing our internal license check and has not been released yet.
| Argument | Released Dataset | Our Internal Dataset |
|---|---|---|
| Video Resolution | (1) 480×720 (2) 384×672 | 384×672 |
| Entities | 50 (all animals) | 70 (20 humans+50 animals) |
| Video Samples | (1) 36,000 (2) 36,000 | 54,000 |
| Scenes | 6 | 9 (+city, forest, asian town) |
| Trajectory Templates | 121 | 96 |
(3) Load Dataset Sample
Change the root path to `dataset`. We provide a script to load our dataset (video, entity, and pose sequence) as follows. It generates the sampled video for visualization in the same folder.

    python load_dataset.py

Visualize the 6DoF pose sequence via Open3D as follows.

    python vis_trajecotry.py

After running the visualization script, you will get an interactive window like this. Note that we have converted the right-handed coordinate system (Open3D) to the left-handed coordinate system in order to better align with the motion trajectory of the video.
[Image: interactive Open3D window showing the 6DoF pose sequence]
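To illustrate what a right-handed to left-handed conversion involves, here is a minimal sketch that mirrors one axis of a 4×4 pose matrix by computing `S @ P @ S`, where `S` is a diagonal matrix with a single `-1`. Which axis the provided scripts actually flip is not stated here, so the choice of the y axis below is an assumption, and `flip_handedness` is a hypothetical helper rather than part of the released code.

```python
def flip_handedness(pose, axis=1):
    """Re-express a 4x4 pose in a mirrored coordinate system.

    Computes S @ pose @ S, where S = diag(1, 1, 1, 1) with a -1 at `axis`.
    axis=1 (the y axis) is an assumption made for illustration only.
    """
    s = [1.0, 1.0, 1.0, 1.0]
    s[axis] = -1.0
    # S is diagonal, so (S P S)[i][j] = s[i] * P[i][j] * s[j].
    return [[s[i] * pose[i][j] * s[j] for j in range(4)] for i in range(4)]


# Example pose: 90-degree rotation about z, translation (1, 2, 3).
pose_rh = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

pose_lh = flip_handedness(pose_rh)
for row in pose_lh:
    print(row)
```

The mirrored pose has its y translation negated and its rotation direction about z reversed, and applying the function twice returns the original pose.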
Citation
@inproceedings{fu20243dtrajmaster,
author = {Fu, Xiao and Liu, Xian and Wang, Xintao and Peng, Sida and Xia, Menghan and Shi, Xiaoyu and Yuan, Ziyang and Wan, Pengfei and Zhang, Di and Lin, Dahua},
title = {3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation},
booktitle = {ICLR},
year = {2025}
}
Contact
Xiao Fu: lemonaddie0909@gmail.com
