vllm.v1.core.kv_cache_metrics
KV cache metrics tracking.
BlockMetricsState
Tracks lifecycle metrics for a single KV cache block.
Source code in vllm/v1/core/kv_cache_metrics.py
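As a rough illustration of what per-block lifecycle tracking could look like, here is a minimal sketch. It is not the vLLM implementation: the class name `BlockLifecycleSketch`, its fields, and its methods are all hypothetical, and it assumes the state records an allocation timestamp plus a bounded history of access timestamps.

```python
import time
from collections import deque
from typing import Optional


class BlockLifecycleSketch:
    """Hypothetical stand-in for per-block state; not vLLM's BlockMetricsState."""

    def __init__(self, now_ns: Optional[int] = None, max_history: int = 4) -> None:
        # Allocation time on a monotonic clock, in nanoseconds.
        self.birth_ns = time.monotonic_ns() if now_ns is None else now_ns
        # Bounded access history keeps per-block memory constant.
        self.access_ns = deque(maxlen=max_history)

    def record_access(self, now_ns: Optional[int] = None) -> None:
        # Append the access time; the oldest entry is dropped once the deque is full.
        self.access_ns.append(time.monotonic_ns() if now_ns is None else now_ns)

    def lifetime_seconds(self, now_ns: int) -> float:
        # Time elapsed since allocation.
        return (now_ns - self.birth_ns) / 1e9

    def idle_seconds(self, now_ns: int) -> float:
        # Time since the last access, or since allocation if never accessed.
        last = self.access_ns[-1] if self.access_ns else self.birth_ns
        return (now_ns - last) / 1e9
```

Passing `now_ns` explicitly, as this sketch allows, makes the timing logic easy to exercise without relying on the real clock.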
KVCacheMetricsCollector
Collects KV cache residency metrics with sampling.
Source code in vllm/v1/core/kv_cache_metrics.py
drain_events
drain_events() -> list[KVCacheEvictionEvent]
on_block_accessed
on_block_accessed(block: KVCacheBlock) -> None
on_block_allocated
on_block_allocated(block: KVCacheBlock) -> None
on_block_evicted
on_block_evicted(block: KVCacheBlock) -> None
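The four hooks above suggest an allocate, access, evict, drain lifecycle. The sketch below shows one way a sampling collector with that shape might behave; it is an assumption-laden illustration, not vLLM's code. `SamplingCollectorSketch`, `EvictionEventSketch`, the `sample_rate`, `seed`, and `clock` parameters, and the event fields are all hypothetical, and blocks are identified by a plain integer ID rather than a `KVCacheBlock`.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class EvictionEventSketch:
    """Hypothetical event record; the real KVCacheEvictionEvent may differ."""
    block_id: int
    lifetime_seconds: float
    idle_seconds: float
    access_count: int


class SamplingCollectorSketch:
    """Hypothetical collector that tracks only a random sample of blocks."""

    def __init__(self, sample_rate: float, seed: int = 0, clock=time.monotonic) -> None:
        if not 0.0 < sample_rate <= 1.0:
            raise ValueError("sample_rate must be in (0, 1]")
        self.sample_rate = sample_rate
        self._rng = random.Random(seed)
        self._clock = clock
        self._tracked = {}  # block_id -> [birth, last_access, access_count]
        self._events = []

    def on_block_allocated(self, block_id: int) -> None:
        # Decide once, at allocation, whether this block is sampled.
        if self._rng.random() < self.sample_rate:
            now = self._clock()
            self._tracked[block_id] = [now, now, 0]

    def on_block_accessed(self, block_id: int) -> None:
        # Unsampled blocks are ignored, keeping the hot path cheap.
        state = self._tracked.get(block_id)
        if state is not None:
            state[1] = self._clock()
            state[2] += 1

    def on_block_evicted(self, block_id: int) -> None:
        # Emit an event only for blocks that were sampled at allocation.
        state = self._tracked.pop(block_id, None)
        if state is None:
            return
        birth, last, count = state
        now = self._clock()
        self._events.append(
            EvictionEventSketch(block_id, now - birth, now - last, count)
        )

    def drain_events(self) -> list:
        # Hand back accumulated events and reset the buffer.
        events, self._events = self._events, []
        return events
```

In this sketch, sampling at allocation time bounds the bookkeeping cost: a block that was not sampled costs only a dictionary lookup on access and eviction. Swapping the buffer in `drain_events` means each event is returned exactly once.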