Tag: vLLM

Enhancing DeepSeek Models with MLA and FP8 Optimizations in vLLM

A compressed summary: Enhanced performance, with DeepSeek models seeing up to 3x throughput from the MLA and FP8 optimizations.
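Neither post's code is reproduced here; as a rough illustration of the serving path these optimizations target, the sketch below loads a DeepSeek checkpoint with vLLM's offline LLM API. The model id and sampling settings are assumptions made for the example; recent vLLM versions select the MLA attention path automatically for DeepSeek-style architectures, and FP8 quantization can typically be requested through the quantization option.

```python
# A minimal, hypothetical sketch of running a DeepSeek model with vLLM's
# offline API. The model id and sampling values are illustrative assumptions,
# not settings taken from the post above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Lite",  # assumed checkpoint for illustration
    trust_remote_code=True,                # DeepSeek repos include custom model code
    # quantization="fp8",                  # optional: FP8 quantization, if supported for the model
)

sampling = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(
    ["Explain multi-head latent attention (MLA) in one sentence."],
    sampling,
)
for out in outputs:
    print(out.outputs[0].text)
```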


Introducing AIBrix: Cost-Effective and Scalable Control Plane for vLLM

Open-source large language models (LLMs) such as LLaMA, DeepSeek, Qwen, and Mistral…
