Zixuan Jiang

Andrew0425

https://anxmuy.github.io/

AnXMuy

AI & ML interests

Vision Language Models, Multimodal Language Models, Remote Sensing

Recent Activity

upvoted a paper about 17 hours ago

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

Paper • 2605.29430 • Published 12 days ago • 1

submitted a paper to Daily Papers about 17 hours ago

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

Paper • 2605.29430 • Published 12 days ago • 1

authored a paper about 19 hours ago

Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images

Paper • 2508.18067 • Published Aug 25, 2025

authored a paper about 20 hours ago

MMAE: A Massive Multitask Audio Editing Benchmark

Paper • 2606.07229 • Published 4 days ago • 41

upvoted a paper about 23 hours ago

MMAE: A Massive Multitask Audio Editing Benchmark

Paper • 2606.07229 • Published 4 days ago • 41

liked 2 datasets 5 months ago

earth-insights/DE-Benchmark

Viewer • Updated Oct 1, 2025 • 540 • 6 • 2

earth-insights/DE-Dataset

Viewer • Updated Sep 28, 2025 • 321k • 14 • 2

updated a model 7 months ago

earth-insights/Qwen2.5-VL-3B-RC-1120

Image-Text-to-Text • 4B • Updated Nov 20, 2025 • 4

published a model 7 months ago

earth-insights/Qwen2.5-VL-3B-RC-1120

Image-Text-to-Text • 4B • Updated Nov 20, 2025 • 4

upvoted a paper 7 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

updated a model 8 months ago

earth-insights/Qwen2.5-VL-3B-RC

Image-Text-to-Text • 4B • Updated Oct 2, 2025 • 5

published a model 8 months ago

earth-insights/Qwen2.5-VL-3B-RC

Image-Text-to-Text • 4B • Updated Oct 2, 2025 • 5

updated a model 8 months ago

earth-insights/DescribeEarth

4B • Updated Oct 2, 2025 • 8 • 1

published a model 8 months ago

earth-insights/DescribeEarth

4B • Updated Oct 2, 2025 • 8 • 1

updated a dataset 8 months ago

earth-insights/DE-Dataset

Viewer • Updated Sep 28, 2025 • 321k • 14 • 2

published a dataset 8 months ago

earth-insights/DE-Dataset

Viewer • Updated Sep 28, 2025 • 321k • 14 • 2

upvoted a collection 12 months ago

Describe Anything

Collection

Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 1 day ago • 63
