4 5 17

chen

2395959141pq

AI & ML interests

Generative AI, CV

Organizations

None yet

upvoted 2 articles 4 months ago

Article

Efficient Request Queueing – Optimizing LLM Performance

Apr 2, 2025

•

21

Article

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

Apr 16, 2025

•

59

upvoted 2 articles 7 months ago

Article

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

Aug 17, 2022

•

122

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

+3

May 24, 2023

•

171

upvoted an article 8 months ago

Article

Mixture of Experts Explained

+4

Dec 11, 2023

•

1.03k