vLLM Home
Stage: Incubation

vLLM is an open-source library for fast LLM inference and serving. vLLM uses PagedAttention, an attention algorithm that efficiently manages attention keys and values. Equipped with PagedAttention, vLLM sets a new state of the art in LLM serving: it delivers up to 24x higher throughput than HuggingFace Transformers, without requiring any model architecture changes.

GitHub: https://github.com/vllm-project
Contribution Policy: https://github.com/vllm-project/vllm/blob/57b7be0e1c4e594c58a78297ab65fbb3ec206958/CONTRIBUTING.md#L4
License: Apache 2.0
Requirements Doc: https://github.com/vllm-project/vllm/blob/main/docs/requirements-docs.txt
Maintainers:
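The throughput gain comes from how PagedAttention stores the KV cache: in fixed-size blocks allocated on demand from a shared pool, rather than one large contiguous buffer preallocated per sequence. A minimal sketch of that block-table idea, in plain Python with hypothetical names (this is illustrative only, not vLLM's actual implementation or API):

```python
BLOCK_SIZE = 16  # tokens per KV-cache block (illustrative value)

class BlockPool:
    """A shared pool of free physical block ids, like a page allocator."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))

    def alloc(self):
        if not self.free:
            raise MemoryError("KV cache exhausted")
        return self.free.pop()

    def free_blocks(self, blocks):
        self.free.extend(blocks)

class Sequence:
    """Tracks one sequence's logical-to-physical block table."""
    def __init__(self, pool):
        self.pool = pool
        self.block_table = []  # physical block id per logical block
        self.num_tokens = 0

    def append_token(self):
        # A new block is allocated only when the current one is full,
        # so per-sequence waste is bounded by one partial block.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.pool.alloc())
        self.num_tokens += 1

    def release(self):
        # Finished sequences return their blocks to the shared pool.
        self.pool.free_blocks(self.block_table)
        self.block_table = []
        self.num_tokens = 0

pool = BlockPool(num_blocks=64)
seq = Sequence(pool)
for _ in range(40):          # 40 tokens need ceil(40/16) = 3 blocks
    seq.append_token()
print(len(seq.block_table))  # 3
seq.release()
print(len(pool.free))        # 64 (all blocks returned)
```

Because blocks are requested lazily and returned when a request finishes, memory that a contiguous layout would reserve for worst-case sequence lengths can instead serve more concurrent requests, which is the source of the throughput improvement described above.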