AI & ML interests
None yet
Organizations
None yet
zeliang0426/DS_Qwen25-7-full-lora-3k
Updated
zeliang0426/DS_Qwen25-7-cache-lora-3k
Updated
zeliang0426/DS-Qwen25-7-Think
Updated
zeliang0426/QKV_Qwen25-7-full-param-3k
Text Generation • 8B • Updated
zeliang0426/QKV_Qwen25-3-full-param-3k
Text Generation • 3B • Updated • 99
zeliang0426/DS-LLama-vanilla-6K
Updated
zeliang0426/Distill-LLama-8B-4k
Text Generation • 8B • Updated • 2
zeliang0426/Distill-LLama-8B-5k-try_2
Text Generation • 8B • Updated • 1
zeliang0426/Distill-LLama-8B-5k-smallLR
Text Generation • 8B • Updated • 3
zeliang0426/Distill-LLama-8B-6k
Text Generation • 8B • Updated • 1
zeliang0426/Distill-LLama-8B-7k
Text Generation • 8B • Updated • 1
zeliang0426/Distill-LLama-8B-5k
Text Generation • 8B • Updated • 2
zeliang0426/Distill-LLama-8B-5k-largeLR
Updated
zeliang0426/Qwen25-3-Think-nglobal_16
Text Generation • 3B • Updated
zeliang0426/Qwen25-3-Think-nglobal_32
Text Generation • 3B • Updated
zeliang0426/Qwen25-3-Think-no_global
Text Generation • 3B • Updated
zeliang0426/Qwen25-3-Think-nglobal_48
Text Generation • 3B • Updated
zeliang0426/Qwen25-3-Cache-Sink
Text Generation • 3B • Updated
zeliang0426/Distill_Llama_Darpo-full-lora-3k
Updated
Text Generation • 3B • Updated
Text Generation • 3B • Updated • 1
zeliang0426/Long_Distill_Llama_Darpo-cache-adapter-3k
Text Generation • 8B • Updated • 1
zeliang0426/Distill_Llama_SFT-full
Updated
zeliang0426/DS_Darpo-full-lora-3k
Updated
zeliang0426/DS_Llama_Darpo-cache-lora-3k
Updated
zeliang0426/Short_DS_Llama_Darpo-cache-adapter-3k
Text Generation • 7B • Updated • 1
zeliang0426/DS_Llama_Darpo-cache-adapter-3k
Text Generation • 7B • Updated • 1