Tim Wu

changtimwu

AI & ML interests

DL, IoT, DevOps

Recent Activity

liked a Space 29 days ago

build-small-hackathon/Aiko-AI-Waifu

liked a model about 1 month ago

Roblox/cubepart

new activity about 2 months ago

zhifeixie/Mega-ASR: The weight should be loaded by any Qwen3-ASR implementation, right?

Organizations

liked a Space 29 days ago

Aiko Chan - an AI Waifu and Companion

👀

Meet Aiko-chan, my AI Anime Waifu

liked a model about 1 month ago

Roblox/cubepart

Text-to-3D • Updated May 28 • 18

New activity in zhifeixie/Mega-ASR about 2 months ago

The weight should be loaded by any Qwen3-ASR implementation, right?

#3 opened about 2 months ago by changtimwu

New activity in Tiiny/SmallThinker-21BA3B-Instruct 12 months ago

Are there any other frameworks tested besides transformers that can be deployed?

#5 opened 12 months ago by DarrenChen

liked a model 12 months ago

RedHatAI/Qwen3-32B-NVFP4

Text Generation • 19B • Updated Nov 21, 2025 • 3.22k • 9

New activity in omeng-nvidia/saved_models_Qwen3-30B-A3B_nvfp4_hf about 1 year ago

Can you explain how this model was built?

#2 opened about 1 year ago by changtimwu

liked a model about 1 year ago

Qwen/Qwen3-32B-FP8

Text Generation • 33B • Updated Jul 26, 2025 • 92k • 84

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.94k

The ultimate guide to training LLM on large GPU Clusters

upvoted a paper over 1 year ago

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

Paper • 2401.09670 • Published Jan 18, 2024 • 2

upvoted an article over 1 year ago

Uncensor any LLM with abliteration

Article • mlabonne • Jun 13, 2024 • 874

liked a model over 1 year ago

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 519k • 1.61k

upvoted a paper over 1 year ago

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Jul 21, 2025 • 133

liked a model over 1 year ago

QuantFactory/Llama-3.2-Taiwan-Legal-3B-Instruct-GGUF

Text Generation • 3B • Updated Nov 2, 2024 • 420 • 10

upvoted 2 papers over 1 year ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9, 2025 • 105

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 125

liked a Space almost 2 years ago

Llama3.1 S V0.2 Checkpoint 2024 08 20

😻

115

Chat with Llama3.1 using spoken audio or synthesize speech

liked a model almost 2 years ago

shenzhi-wang/Llama3.1-8B-Chinese-Chat

Text Generation • 8B • Updated Jul 29, 2024 • 7.42k • 266

liked a model about 2 years ago

openbmb/MiniCPM-Llama3-V-2_5-gguf

Updated Feb 27, 2025 • 4.9k • 217

liked a Space about 2 years ago

Microsoft Phi-3-Vision-128k

😻

219

Chat with an image using Phi-3 Vision model

liked a model about 2 years ago

google/paligemma-3b-pt-224

Image-Text-to-Text • 3B • Updated Sep 21, 2024 • 180k • 511
