NVIDIA A10, 24 GB GDDR6
AMD EPYC 7313P
128 GB DDR4 ECC
Thanks in advance to anyone who can help. I can't find a setup that's fast enough: all 31b models run slowly and I don't know what to do or how to configure things. If someone could send me their config too, thanks!
Yo, I personally love the Qwen2.5-Coder line of models. I use it to adversarially review code from other models very frequently. With your setup you could use Qwen/Qwen2.5-Coder-14B-Instruct-GGUF with the q5_0.gguf quantized version. As far as configs go, you could set:
Temperature 0.6
Top_P 1.0
Min_P 0
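If you're serving through Ollama, those sampling settings can go in a Modelfile. A minimal sketch, assuming you've already downloaded the q5_0 GGUF (the filename and model name below are illustrative, not exact):

```
# Hypothetical Modelfile: point FROM at your downloaded GGUF file
FROM ./qwen2.5-coder-14b-instruct-q5_0.gguf

# Sampling settings suggested above
PARAMETER temperature 0.6
PARAMETER top_p 1.0
PARAMETER min_p 0
```

Then build and run it with something like `ollama create qwen-coder -f Modelfile` followed by `ollama run qwen-coder`.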
Alternatives you could use would be:
DeepSeek-Coder-V2-Lite-Instruct
Qwen2.5-Coder-7B Q8
For 31b models to fit on your hardware you would have to use q3 quants, and the quality is not going to be the greatest. Alternatively, you could look into a service like Modal. They offer free GPU credits monthly. You can run an app as a shell and use ollama through their GPaaS, which gives you varying GPUs with enough VRAM for the specific models you're looking for. But if fully local is what you want, the models listed above should fit your needs.
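As a rough sanity check on why 14B at q5_0 fits in 24 GB while a 31b model needs q3, you can estimate GGUF file size as parameters times effective bits per weight. The bits-per-weight figures below are approximate rules of thumb, not exact values (k-quants mix block sizes, and you still need headroom for KV cache and context):

```python
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough GGUF size in GB: billions of params * effective bits per weight."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common quants (rule of thumb)
QUANT_BITS = {"q3_k_m": 3.9, "q4_0": 4.5, "q5_0": 5.5, "q8_0": 8.5}

# 14B at q5_0 leaves headroom on a 24 GB A10; a ~31B model needs q3 to fit
print(f"14B @ q5_0:   {gguf_size_gb(14, QUANT_BITS['q5_0']):.1f} GB")
print(f"31B @ q3_k_m: {gguf_size_gb(31, QUANT_BITS['q3_k_m']):.1f} GB")
```

At q5_0 that comes to roughly 9.6 GB for 14B, versus about 15 GB for a 31B model even at q3, which is why the smaller model is the comfortable fit on a 24 GB card.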
omg thank you, you're the best, on god fr fr