Open to Collab

Muhammad Umair

umair894

AI & ML interests

Multimodal Reidentification | Feature Upscaling | Object Tracking | PhD UESTC

Recent Activity

liked a Space about 18 hours ago

linoyts/Qwen-Image-Edit-2511-AnyPose

upvoted a paper 2 days ago

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

liked a Space 3 days ago

tencent/HY-Motion-1.0

upvoted a paper 2 days ago

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published 4 days ago • 41

upvoted a paper 8 days ago

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published 10 days ago • 64

upvoted a paper 14 days ago

DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation

Paper • 2512.21252 • Published 15 days ago • 34

upvoted a paper 15 days ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 17 days ago • 61

upvoted a paper 16 days ago

Name That Part: 3D Part Segmentation and Naming

Paper • 2512.18003 • Published 20 days ago • 3

upvoted a paper 29 days ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published 30 days ago • 128

upvoted 3 papers about 1 month ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 167

MedSAM3: Delving into Segment Anything with Medical Concepts

Paper • 2511.19046 • Published Nov 24, 2025 • 49

Insights from the ICLR Peer Review and Rebuttal Process

Paper • 2511.15462 • Published Nov 19, 2025 • 6

upvoted 4 papers about 2 months ago

PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16, 2025 • 111

upvoted 3 papers 2 months ago

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5, 2025 • 122

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Paper • 2510.23607 • Published Oct 27, 2025 • 177

A Definition of AGI

Paper • 2510.18212 • Published Oct 21, 2025 • 35

upvoted 4 papers 3 months ago

VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos

Paper • 2510.19488 • Published Oct 22, 2025 • 19

DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents

Paper • 2510.19336 • Published Oct 22, 2025 • 16

Chronos-2: From Univariate to Universal Forecasting

Paper • 2510.15821 • Published Oct 17, 2025 • 19

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published Oct 17, 2025 • 50
