README.md CHANGED
@@ -1,110 +1,3 @@
- ---
- tags:
- - computer_vision
- - animal_pose_and_shape_estimation
- - DeepLabCut
- pipeline_tag: image-to-3d
- ---
- # MODEL CARD:
-
- ## Model Details
-
- • PRIMA model(s) developed by the [M.W. Mathis Lab](http://www.mackenziemathislab.org/) in 2026, trained to predict quadruped shape and pose from images.
- Please see **paper link** for details.
-
- • There are two main models:
- - `s1ckpt.ckpt` is the stage-1 model, trained on the Animal3D, CtrlAni3D, and Quadruped2D datasets.
- - `s3ckpt.ckpt` is the stage-3 model, trained on the Animal3D, CtrlAni3D, and Quadruped3D datasets.
-
- ```python
- from pathlib import Path
- from huggingface_hub import hf_hub_download
-
- repo_id = "MLAdaptiveIntelligence/PRIMA"
-
- model_dir = Path("./prima_model")
- model_dir.mkdir(parents=True, exist_ok=True)
-
- # download stage-1 checkpoint
- s1_path = hf_hub_download(
-     repo_id=repo_id,
-     filename="s1ckpt.ckpt",
-     local_dir=model_dir,
- )
-
- # download stage-3 checkpoint
- s3_path = hf_hub_download(
-     repo_id=repo_id,
-     filename="s3ckpt.ckpt",
-     local_dir=model_dir,
- )
- ```
42
- ## Intended Use
- • Intended for shape and pose estimation of quadruped images taken from a single view.
-
- • Intended for academic and research professionals working in fields related to animal behavior, such as neuroscience and ecology.
-
- • Not suitable as a zero-shot model for applications that require high shape and pose precision, but it can be further optimized with 2D keypoint annotations, either manual or from SuperAnimal, to improve accuracy. It is also not suitable for videos that look dramatically different from those we show in the paper.
-
- ## Metrics
- • PA-MPJPE (Procrustes-aligned mean per-joint position error), computed over 3D joints.
-
- • PA-MPVPE (Procrustes-aligned mean per-vertex position error), computed over the SMAL mesh vertices.
-
- • PCK (Percentage of Correct Keypoints) measures the proportion of predicted keypoints within a specified threshold of the ground-truth keypoints.
-
- • AUC (Area Under the Curve), computed by integrating the PCK values as the threshold varies from 0 to 1.
-
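The PCK and AUC definitions above can be sketched in a few lines of NumPy. This is a hypothetical illustration, not the paper's evaluation code; keypoints are assumed to be normalized so that thresholds range over (0, 1], and AUC is approximated as the mean PCK over evenly spaced thresholds:

```python
import numpy as np

def pck(pred, gt, threshold):
    """Fraction of predicted keypoints within `threshold` of ground truth.

    pred, gt: (N, K, 2) arrays of 2D keypoints, assumed normalized to [0, 1].
    """
    dists = np.linalg.norm(pred - gt, axis=-1)  # (N, K) per-keypoint errors
    return float((dists < threshold).mean())

def auc(pred, gt, num_thresholds=100):
    """Approximate the integral of PCK over thresholds in (0, 1]."""
    thresholds = np.linspace(0.0, 1.0, num_thresholds + 1)[1:]
    pcks = [pck(pred, gt, t) for t in thresholds]
    return float(np.mean(pcks))  # Riemann-sum approximation; range is 1

# toy example: two images, three keypoints each, every prediction
# offset by (0.05, 0.05), i.e. a Euclidean error of ~0.071
gt = np.zeros((2, 3, 2))
pred = gt + 0.05
print(pck(pred, gt, 0.1))  # 1.0 -- all keypoints within 0.1
```
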
- ## Evaluation Data
- • In the paper we benchmark on Animal3D, CtrlAni3D, Quadruped2D, and AnimalKingdom.
-
- ## Training Data
- The models were trained jointly on the following datasets:
- - **Animal3D**: see full details at (1).
- - **CtrlAni3D**: see full details at (2).
- - **Quadruped2D**: see full details at (3).
- - **Quadruped3D**: see full details at **paper link**.
-
- ## Ethical Considerations
- • No experimental data were collected for this model; all datasets used are cited.
-
- ## License
- Modified MIT.
-
- Copyright 2026 by Mackenzie Mathis, Xiaohang Yu, and contributors.
-
- Permission is hereby granted to you (hereafter "LICENSEE") a fully-paid, non-exclusive,
- and non-transferable license for academic, non-commercial purposes only (hereafter "LICENSE")
- to use the "MODEL" weights (hereafter "MODEL"), subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all copies or substantial
- portions of the Software.
-
- This software may not be used to harm any animal deliberately.
-
- LICENSEE acknowledges that the MODEL is a research tool.
- THE MODEL IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
- BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
- WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MODEL
- OR THE USE OR OTHER DEALINGS IN THE MODEL.
-
- If this license is not appropriate for your application, please contact Prof. Mackenzie W. Mathis
- (mackenzie@post.harvard.edu) and/or the TTO office at EPFL (tto@epfl.ch) for a commercial use license.
-
- Please cite **paper link** if you use this model in your work.
-
- ## References
- 1. Xu, J., Zhang, Y., Peng, J., Ma, W., Jesslen, A., Ji, P., Hu, Q., Zhang, J., Liu, Q., Wang, J., et al.: Animal3D: a comprehensive dataset of 3D animal pose and shape. In: ICCV, pp. 9099–9109 (2023)
- 2. Lyu, J., Zhu, T., Gu, Y., Lin, L., Cheng, P., Liu, Y., Tang, X., An, L.: AniMer: animal pose and shape estimation using a family-aware transformer. In: CVPR, pp. 17486–17496 (2025)
- 3. Ye, S., Filippova, A., Lauer, J., Schneider, S., Vidal, M., Qiu, T., Mathis, A., Mathis, M.W.: SuperAnimal pretrained pose estimation models for behavioral analysis. Nature Communications 15(1), 5165 (2024)
 
+ ---
+ license: apache-2.0
+ ---
amr_vitbb.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:dcf670ec02263ef338a8f01ef3330c2ca120f25b8f0b5d59128e1886652d8eb7
- size 2523794490
config_s1_HYDRA.yaml DELETED
@@ -1,147 +0,0 @@
- task_name: train
- tags:
-   - dev
- train: true
- test: false
- ckpt_path: true
- seed: null
- trainer:
-   _target_: pytorch_lightning.Trainer
-   default_root_dir: ${paths.output_dir}
-   accelerator: gpu
-   devices: 1
-   deterministic: false
-   num_sanity_val_steps: 0
-   log_every_n_steps: ${GENERAL.LOG_STEPS}
-   val_check_interval: ${GENERAL.VAL_STEPS}
-   check_val_every_n_epoch: ${GENERAL.VAL_EPOCHS}
-   precision: 16-mixed
-   max_steps: ${GENERAL.TOTAL_STEPS}
-   limit_val_batches: 80
- paths:
-   root_dir: ${oc.env:PROJECT_ROOT}
-   data_dir: ${paths.root_dir}/data/
-   log_dir: logs/
-   output_dir: ${hydra:runtime.output_dir}
-   work_dir: ${hydra:runtime.cwd}
- extras:
-   ignore_warnings: false
-   enforce_tags: true
-   print_config: true
- exp_name: newbioposeStage02
- SMAL:
-   DATA_DIR: data/smal
-   MODEL_PATH: data/smal/my_smpl_00781_4_all.pkl
-   SHAPE_PRIOR_PATH: data/smal/my_smpl_data_00781_4_all.pkl
-   POSE_PRIOR_PATH: data/smal/walking_toy_symmetric_pose_prior_with_cov_35parts.pkl
-   NUM_JOINTS: 34
- EXTRA:
-   FOCAL_LENGTH: 1000
-   NUM_LOG_IMAGES: 4
-   NUM_LOG_SAMPLES_PER_IMAGE: 4
-   PELVIS_IND: 0
- DATASETS:
-   CONFIG:
-     SCALE_FACTOR: 0.3
-     ROT_FACTOR: 30
-     TRANS_FACTOR: 0.02
-     COLOR_SCALE: 0.2
-     ROT_AUG_RATE: 0.6
-     TRANS_AUG_RATE: 0.5
-     DO_FLIP: false
-     FLIP_AUG_RATE: 0.0
-     EXTREME_CROP_AUG_RATE: 0.0
-     EXTREME_CROP_AUG_LEVEL: 1
-   ANIMAL3D:
-     ROOT_IMAGE: ./datasets/animal3d/
-     JSON_FILE:
-       TRAIN: ./datasets/animal3d/train.json
-       TEST: ./datasets/animal3d/test.json
-     WEIGHT: 1.0
-   CONTROL_ANIMAL3D:
-     ROOT_IMAGE: ./datasets/control_animal3dlatest/
-     JSON_FILE:
-       TRAIN: ./datasets/control_animal3dlatest/train.json
-       TEST: ./datasets/control_animal3dlatest/test.json
-     WEIGHT: 0.5
-   QUADRUPED2D:
-     ROOT_IMAGE: ./datasets/quadruped2d/
-     JSON_FILE:
-       TRAIN: ./datasets/quadruped2d/train.json
-       TEST: ./datasets/quadruped2d/test.json
-     WEIGHT: 0.15
- GENERAL:
-   TOTAL_STEPS: 450000
-   LOG_STEPS: 533
-   VAL_STEPS: 533
-   VAL_EPOCHS: 1
-   CHECKPOINT_EPOCHS: 1
-   CHECKPOINT_SAVE_TOP_K: 2
-   NUM_WORKERS: 2
-   PREFETCH_FACTOR: 2
- LOSS_WEIGHTS:
-   KEYPOINTS_3D: 0.05
-   KEYPOINTS_2D: 0.01
-   INTERMEDIATE_KP2D: 0.001
-   INTERMEDIATE_KP3D: 0.001
-   GLOBAL_ORIENT: 0.005
-   POSE: 0.001
-   BETAS: 0.0005
-   TRANSL: 0.0005
-   ADVERSARIAL: 0.0
-   SUPCON: 0.0005
- TRAIN:
-   LR: 3.75e-06
-   WEIGHT_DECAY: 0.0001
-   BATCH_SIZE: 48
-   LOSS_REDUCTION: mean
-   NUM_TRAIN_SAMPLES: 2
-   NUM_TEST_SAMPLES: 64
-   POSE_2D_NOISE_RATIO: 0.01
-   SMPL_PARAM_NOISE_RATIO: 0.005
- MODEL:
-   IMAGE_SIZE: 256
-   IMAGE_MEAN:
-     - 0.485
-     - 0.456
-     - 0.406
-   IMAGE_STD:
-     - 0.229
-     - 0.224
-     - 0.225
-   BACKBONE:
-     TYPE: vith
-     PRETRAINED_WEIGHTS: ./data/amr_vitbb.pth
-     FREEZE: false
-   USE_BIOCLIP_EMBEDDING: true
-   BIOCLIP_EMBEDDING:
-     EMBED_DIM: 1280
-     TYPE: bioclip1
-   USE_KEYPOINT_EMBEDDING: false
-   KEYPOINT_EMBEDDING:
-     NUM_KEYPOINTS: 26
-     KEYPOINT_DIM: 2
-     EMBED_DIM: 1280
-     HIDDEN_DIM: 512
-     TYPE: token
-   SMAL_HEAD:
-     TYPE: new_bio_pose_transformer_decoder
-     IN_CHANNELS: 1280
-     IEF_ITERS: 1
-     DECODER_DIM: 1280
-     NUM_DECODER_LAYERS: 6
-     NUM_HEADS: 8
-     MLP_RATIO: 4.0
-     USE_KEYPOINT_2D_TOKENS: true
-     USE_KEYPOINT_3D_TOKENS: true
-     KEYPOINT_TOKEN_UPDATE: true
-     KP2D_INJECT_IMAGE_FEAT: true
-     TRANSFORMER_DECODER:
-       depth: 6
-       heads: 8
-       mlp_dim: 1024
-       dim_head: 64
-       dropout: 0.0
-       emb_dropout: 0.0
-       norm: layer
-       context_dim: 1280
config_s3_HYDRA.yaml DELETED
@@ -1,147 +0,0 @@
- task_name: train
- tags:
-   - dev
- train: true
- test: false
- ckpt_path: true
- seed: null
- trainer:
-   _target_: pytorch_lightning.Trainer
-   default_root_dir: ${paths.output_dir}
-   accelerator: gpu
-   devices: 1
-   deterministic: false
-   num_sanity_val_steps: 0
-   log_every_n_steps: ${GENERAL.LOG_STEPS}
-   val_check_interval: ${GENERAL.VAL_STEPS}
-   check_val_every_n_epoch: ${GENERAL.VAL_EPOCHS}
-   precision: 16-mixed
-   max_steps: ${GENERAL.TOTAL_STEPS}
-   limit_val_batches: 80
- paths:
-   root_dir: ${oc.env:PROJECT_ROOT}
-   data_dir: ${paths.root_dir}/data/
-   log_dir: logs/
-   output_dir: ${hydra:runtime.output_dir}
-   work_dir: ${hydra:runtime.cwd}
- extras:
-   ignore_warnings: false
-   enforce_tags: true
-   print_config: true
- exp_name: bestquad3dStage02
- SMAL:
-   DATA_DIR: data/smal
-   MODEL_PATH: data/smal/my_smpl_00781_4_all.pkl
-   SHAPE_PRIOR_PATH: data/smal/my_smpl_data_00781_4_all.pkl
-   POSE_PRIOR_PATH: data/smal/walking_toy_symmetric_pose_prior_with_cov_35parts.pkl
-   NUM_JOINTS: 34
- EXTRA:
-   FOCAL_LENGTH: 1000
-   NUM_LOG_IMAGES: 4
-   NUM_LOG_SAMPLES_PER_IMAGE: 4
-   PELVIS_IND: 0
- DATASETS:
-   CONFIG:
-     SCALE_FACTOR: 0.3
-     ROT_FACTOR: 30
-     TRANS_FACTOR: 0.02
-     COLOR_SCALE: 0.2
-     ROT_AUG_RATE: 0.6
-     TRANS_AUG_RATE: 0.5
-     DO_FLIP: false
-     FLIP_AUG_RATE: 0.0
-     EXTREME_CROP_AUG_RATE: 0.0
-     EXTREME_CROP_AUG_LEVEL: 1
-   ANIMAL3D:
-     ROOT_IMAGE: ./datasets/animal3d/
-     JSON_FILE:
-       TRAIN: ./datasets/animal3d/train.json
-       TEST: ./datasets/animal3d/test.json
-     WEIGHT: 1.0
-   CONTROL_ANIMAL3D:
-     ROOT_IMAGE: ./datasets/control_animal3dlatest/
-     JSON_FILE:
-       TRAIN: ./datasets/control_animal3dlatest/train.json
-       TEST: ./datasets/control_animal3dlatest/test.json
-     WEIGHT: 0.5
-   QUADRUPED2D:
-     ROOT_IMAGE: ./datasets/quadruped2d/
-     JSON_FILE:
-       TRAIN: ./datasets/quadruped2d/train3d_60filtered.json
-       TEST: ./datasets/quadruped2d/test.json
-     WEIGHT: 0.5
- GENERAL:
-   TOTAL_STEPS: 450000
-   LOG_STEPS: 451
-   VAL_STEPS: 451
-   VAL_EPOCHS: 1
-   CHECKPOINT_EPOCHS: 1
-   CHECKPOINT_SAVE_TOP_K: 2
-   NUM_WORKERS: 2
-   PREFETCH_FACTOR: 2
- LOSS_WEIGHTS:
-   KEYPOINTS_3D: 0.05
-   KEYPOINTS_2D: 0.01
-   INTERMEDIATE_KP2D: 0.01
-   INTERMEDIATE_KP3D: 0.01
-   GLOBAL_ORIENT: 0.005
-   POSE: 0.001
-   BETAS: 0.0005
-   TRANSL: 0.0005
-   ADVERSARIAL: 0.0
-   SUPCON: 0.0005
- TRAIN:
-   LR: 3.75e-06
-   WEIGHT_DECAY: 0.0001
-   BATCH_SIZE: 48
-   LOSS_REDUCTION: mean
-   NUM_TRAIN_SAMPLES: 2
-   NUM_TEST_SAMPLES: 64
-   POSE_2D_NOISE_RATIO: 0.01
-   SMPL_PARAM_NOISE_RATIO: 0.005
- MODEL:
-   IMAGE_SIZE: 256
-   IMAGE_MEAN:
-     - 0.485
-     - 0.456
-     - 0.406
-   IMAGE_STD:
-     - 0.229
-     - 0.224
-     - 0.225
-   BACKBONE:
-     TYPE: vith
-     PRETRAINED_WEIGHTS: ./data/amr_vitbb.pth
-     FREEZE: false
-   USE_BIOCLIP_EMBEDDING: true
-   BIOCLIP_EMBEDDING:
-     EMBED_DIM: 1280
-     TYPE: bioclip1
-   USE_KEYPOINT_EMBEDDING: false
-   KEYPOINT_EMBEDDING:
-     NUM_KEYPOINTS: 26
-     KEYPOINT_DIM: 2
-     EMBED_DIM: 1280
-     HIDDEN_DIM: 512
-     TYPE: token
-   SMAL_HEAD:
-     TYPE: new_bio_pose_transformer_decoder
-     IN_CHANNELS: 1280
-     IEF_ITERS: 1
-     DECODER_DIM: 1280
-     NUM_DECODER_LAYERS: 6
-     NUM_HEADS: 8
-     MLP_RATIO: 4.0
-     USE_KEYPOINT_2D_TOKENS: true
-     USE_KEYPOINT_3D_TOKENS: true
-     KEYPOINT_TOKEN_UPDATE: true
-     KP2D_INJECT_IMAGE_FEAT: true
-     TRANSFORMER_DECODER:
-       depth: 6
-       heads: 8
-       mlp_dim: 1024
-       dim_head: 64
-       dropout: 0.0
-       emb_dropout: 0.0
-       norm: layer
-       context_dim: 1280
my_smpl_00781_4_all.pkl DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:22831db0e0e564dc95128e098da19995c2dda39b1aa18acc1335a6e62e0e3a59
- size 33686326
my_smpl_data_00781_4_all.pkl DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b21364a1eb9cc60c9ff5bb07182eab3d715a200da48904e0e0465fbb8b57e153
- size 246211
s3ckpt.ckpt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:194c24e8179ce8ef2fe52e2e7c39f67d3a246c94277cc4aa9dc883d578087239
- size 10222809027
walking_toy_symmetric_pose_prior_with_cov_35parts.pkl DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:c91a65abff0a67f40f888b3e7c05c350e9d1c128a07ee6c1b01ed4449cf8379f
- size 541909