Hugging Face
Deqing Fu
PRO
deqing
13 followers · 18 following
https://deqingfu.github.io
DeqingFu
AI & ML interests
None yet
Recent Activity
updated a model 4 days ago: deqing/convergent-llama-300M-muon-6digit-addition_6digit_custom6
upvoted a paper 6 days ago: Value-Aware Stochastic KV Cache Eviction for Reasoning Models
submitted a paper 6 days ago: Value-Aware Stochastic KV Cache Eviction for Reasoning Models
deqing's models (158), sorted by recently updated
deqing/lstm-window-4-v5 • Text Generation • 0.2B • Updated Mar 19 • 2
deqing/llama-300M-v5-fivegram • Text Generation • 0.3B • Updated Mar 18 • 15
deqing/llama-300M-v5-base_7 • Text Generation • 0.3B • Updated Mar 18 • 5
deqing/llama-300M-v5-permute • Text Generation • 0.3B • Updated Mar 17 • 21
deqing/llama-300M-v5-isolate-old • Text Generation • 0.3B • Updated Mar 16 • 3
deqing/test-fone-hub-upload • Updated Mar 16
deqing/llama-600M-v4-isolate • Text Generation • 0.6B • Updated Mar 14 • 4
deqing/llama-600M-v4-fivegram • 0.6B • Updated Mar 13
deqing/llama-600M-v4-bigram • Text Generation • 0.6B • Updated Mar 12 • 2
deqing/llama-600M-v4-unigram • 0.6B • Updated Mar 11 • 14
deqing/mamba-370m-v4 • Updated Mar 11
deqing/llama-600M-v4-swap_numbers • Text Generation • 0.6B • Updated Mar 10 • 5
deqing/llama-600M-v4-isolate-old • Text Generation • 0.6B • Updated Mar 9 • 9
deqing/llama-600M-v4-original • Text Generation • 0.6B • Updated Mar 8 • 3
deqing/llama-300M-v3-muon-original • Text Generation • 0.3B • Updated Mar 6 • 3
deqing/llama-300M-v3-original • Text Generation • 0.3B • Updated Mar 5 • 6
deqing/llama-300M-v2-isolate • Text Generation • 0.3B • Updated Mar 2 • 7
deqing/llama-300M-v2-swap_numbers • Text Generation • 0.3B • Updated Mar 1 • 5
deqing/llama-300M-v2-fourgram • Text Generation • 0.3B • Updated Feb 28 • 2
deqing/llama-300M-v2-trigram • Text Generation • 0.3B • Updated Feb 28 • 2
deqing/llama-300M-v2-bigram • Text Generation • 0.3B • Updated Feb 28 • 13
deqing/llama-300M-v2-unigram • Text Generation • 0.3B • Updated Feb 27 • 18
deqing/llama-300M-v2-fivegram • Text Generation • 0.3B • Updated Feb 26 • 3
deqing/llama-300M-v2-text_only • Text Generation • 0.3B • Updated Feb 26 • 3
deqing/llama-300M-v2-uniform • Text Generation • 0.3B • Updated Feb 26 • 2
deqing/llama-300M-v2-original • Text Generation • 0.3B • Updated Feb 26 • 4
deqing/llama-300M-trigram • 0.3B • Updated Feb 23 • 2
deqing/fone-llama-3.2-1B-fineweb-sample-100BT-fone3d-hybrid-tile-v3 • Updated Feb 23
deqing/fone-llama-3.2-1B-fineweb-sample-100BT-fone3d-hybrid-tile-v2 • 1B • Updated Feb 22 • 1
deqing/llama-300M-bigram • 0.3B • Updated Feb 22 • 1