vllm.config.observability ¶
ObservabilityConfig ¶
Configuration for observability - metrics and tracing.
Source code in vllm/config/observability.py
collect_detailed_traces class-attribute instance-attribute ¶
collect_detailed_traces: (
list[DetailedTraceModules] | None
) = None
Setting this only makes sense if --otlp-traces-endpoint is also set. When set, detailed traces are collected for the specified modules. This may involve costly and/or blocking operations and can therefore affect performance.
Note that collecting detailed timing information for each request can be expensive.
collect_model_execute_time cached property ¶
collect_model_execute_time: bool
Whether to collect model execute time for the request.
collect_model_forward_time cached property ¶
collect_model_forward_time: bool
Whether to collect model forward time for the request.
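As a rough illustration of how the two cached properties above could be derived from collect_detailed_traces, here is a minimal sketch. The class `TraceSettingsSketch`, the helper `_wants`, and the module names "model", "worker", and "all" are assumptions made for this example; this page does not list the allowed values of DetailedTraceModules, so treat the mapping as hypothetical and not as the library's implementation.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Optional


@dataclass
class TraceSettingsSketch:
    """Hypothetical stand-in for the config; not the vLLM class."""

    collect_detailed_traces: Optional[list[str]] = None

    def _wants(self, module: str) -> bool:
        # Assumed rule: a module is traced if it is named explicitly
        # or if the catch-all value "all" is present.
        traces = self.collect_detailed_traces
        return traces is not None and (module in traces or "all" in traces)

    @cached_property
    def collect_model_forward_time(self) -> bool:
        # Computed once on first access, then stored on the instance.
        return self._wants("model")

    @cached_property
    def collect_model_execute_time(self) -> bool:
        return self._wants("worker")
```

Because both properties are cached, changing collect_detailed_traces after the first access would not update them in this sketch.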
kv_cache_metrics class-attribute instance-attribute ¶
kv_cache_metrics: bool = False
Enable KV cache residency metrics (lifetime, idle time, reuse gaps). Uses sampling to minimize overhead. Requires log stats to be enabled (i.e., --disable-log-stats not set).
kv_cache_metrics_sample class-attribute instance-attribute ¶
kv_cache_metrics_sample: float = Field(
default=0.01, gt=0, le=1
)
Sampling rate for KV cache metrics, in the range (0.0, 1.0]. The default of 0.01 samples 1% of blocks.
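To make the sampling rate concrete, the following sketch shows one simple way a per-block sampling decision could work under the documented bounds (gt=0, le=1). The helpers `validate_sample_rate` and `should_sample` are hypothetical names invented for this example; how vLLM actually selects blocks is not described on this page.

```python
import random


def validate_sample_rate(rate: float) -> float:
    """Enforce the documented bounds: strictly above 0, at most 1."""
    if not (0.0 < rate <= 1.0):
        raise ValueError(f"sample rate must be in (0.0, 1.0], got {rate}")
    return rate


def should_sample(rate: float, rng: random.Random) -> bool:
    """Decide whether to track one block (hypothetical helper)."""
    # rng.random() is in [0.0, 1.0), so a rate of 1.0 always samples.
    return rng.random() < validate_sample_rate(rate)


rng = random.Random(0)
tracked = sum(should_sample(0.01, rng) for _ in range(100_000))
# tracked should be close to 1% of the 100,000 simulated blocks.
```

A lower rate reduces bookkeeping overhead at the cost of noisier residency statistics.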
otlp_traces_endpoint class-attribute instance-attribute ¶
otlp_traces_endpoint: str | None = None
Target URL to which OpenTelemetry traces will be sent.
show_hidden_metrics cached property ¶
show_hidden_metrics: bool
Check if the hidden metrics should be shown.
show_hidden_metrics_for_version class-attribute instance-attribute ¶
show_hidden_metrics_for_version: str | None = None
Enable deprecated Prometheus metrics that have been hidden since the specified version. For example, if a previously deprecated metric has been hidden since the v0.7.0 release, you can use --show-hidden-metrics-for-version=0.7 as a temporary escape hatch while you migrate to the new metrics. The metric is likely to be removed completely in an upcoming release.
_validate_collect_detailed_traces classmethod ¶
_validate_collect_detailed_traces(
value: list[DetailedTraceModules] | None,
) -> list[DetailedTraceModules] | None
Handle the legacy case where users might provide a comma-separated string instead of a list of strings.
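The legacy handling described for _validate_collect_detailed_traces can be sketched as a small standalone function. This is an illustration under assumptions, not the validator's actual source: the function name `normalize_trace_modules` is hypothetical, plain strings stand in for DetailedTraceModules, and the module names in the usage lines are placeholders.

```python
from typing import Optional


def normalize_trace_modules(
    value: Optional[list[str]],
) -> Optional[list[str]]:
    """Split a single comma-separated entry into separate module names."""
    # Legacy form: one list element that still contains commas,
    # e.g. ["model,worker"] instead of ["model", "worker"].
    if value is not None and len(value) == 1 and "," in value[0]:
        return [part.strip() for part in value[0].split(",")]
    # None and already-separated lists pass through unchanged.
    return value


legacy = normalize_trace_modules(["model,worker"])
modern = normalize_trace_modules(["model", "worker"])
```

Both calls yield the same two-element list, so downstream code only ever has to deal with the list form.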
_validate_otlp_traces_endpoint classmethod ¶
_validate_show_hidden_metrics_for_version classmethod ¶
_validate_tracing_config ¶
compute_hash ¶
compute_hash() -> str
WARNING: Whenever a new field is added to this config, ensure that it is included in the factors list if it affects the computation graph.
Provide a hash that uniquely identifies all the configs that affect the structure of the computation graph from input ids/embeddings to the final hidden states, excluding anything before input ids/embeddings and after the final hidden states.
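The factors-list pattern mentioned in the warning can be sketched as follows. This is a minimal illustration, assuming the hash is taken over the string form of a list of graph-affecting values; the function name `compute_hash_from_factors`, the choice of SHA-256, and the example factor values are all assumptions and are not taken from the vLLM source.

```python
import hashlib
from typing import Any


def compute_hash_from_factors(factors: list[Any]) -> str:
    """Hash the string form of the factors list (hypothetical helper)."""
    return hashlib.sha256(str(factors).encode()).hexdigest()


# Identical factors give identical hashes; any change to a factor
# that affects the computation graph gives a different hash.
hash_a = compute_hash_from_factors(["float16", 4096])
hash_b = compute_hash_from_factors(["float16", 4096])
hash_c = compute_hash_from_factors(["bfloat16", 4096])
```

A field that is left out of the factors list cannot influence the hash, which is why the warning asks that every new graph-affecting field be added to it.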