vmalert 可以执行一系列给定的rule(基于metricsql),然后发送报警到Alertmanager
特性
- 集成VictoriaMetrics TSDB
- MetricsQL 表达式校验
- prometheus 报警规则格式支持
- 集成Alertmanager
- 轻量级没有额外的依赖
使用
- 构建
目前需要自己构建,很简单 make vmalert 就可以了 - 启动
依赖一系列的规则(promql&&metricssql),数据源地址,通知地址Alertmanager地址,方便处理,聚合报警以及发送通知
命令:
./bin/vmalert -rule=alert.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093
一个参考规则
groups:
- name: groupGorSingleAlert
rules:
- alert: VMRows
for: 10s
expr: vm_rows > 0
labels:
label: bar
host: "{{ $labels.instance }}"
annotations:
summary: "{{ $value|humanize }}"
description: "{{$labels}}"
- name: TestGroup
rules:
- alert: Conns
expr: sum(vm_tcplistener_conns) by(instance) > 1
annotations:
summary: "Too high connection number for {{$labels.instance}}"
description: "It is {{ $value }} connections for {{$labels.instance}}"
- alert: ExampleAlertAlwaysFiring
expr: sum by(job)
(up == 1)
- 参考命令
vmalert-20200511-085826-heads-cluster-0-g6c88e352
Usage of ./vmalert:
-datasource.basicAuth.password string
Optional basic auth password for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username for -datasource.url
-datasource.url string
Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-evaluationInterval duration
How often to evaluate the rules. Default 1m (default 1m0s)
-external.url string
External URL is used as alert's source for sent alerts to the notifier
-http.disableResponseCompression
Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
-http.maxGracefulShutdownDuration duration
The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpListenAddr string
Address to listen for http connections (default ":8880")
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
-notifier.url string
Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
-remoteread.basicAuth.password string
Optional basic auth password for -remoteread.url
-remoteread.basicAuth.username string
Optional basic auth username for -remoteread.url
-remoteread.lookback duration
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
-remoteread.url vmalert
Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remotewrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
-remotewrite.basicAuth.password string
Optional basic auth password for -remotewrite.url
-remotewrite.basicAuth.username string
Optional basic auth username for -remotewrite.url
-remotewrite.url string
Optional URL to Victoria Metrics or VMInsert where to persist alerts state in form of timeseries. E.g. http://127.0.0.1:8428
-rule value
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule /path/to/file. Path to a single file with alerting rules
-rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
-rule.validateTemplates
Indicates to validate annotation and label templates (default true)
-version
Show VictoriaMetrics version
vmalert 暴露的服务
http://<vmalert-addr>/api/v1/alerts - 所有激活的报警;
http://<vmalert-addr>/api/v1/<groupName>/<alertID>/status" - 根据id获取报警状态
http://<vmalert-addr>/metrics - 应用程序的metrics
http://<vmalert-addr>/-/reload - 热加载(目前暂时还没实现)
说明
vmalert 是一个很不错的工具,弥补了VictoriaMetrics以前需要依赖外部进行报警的缺陷,以前我们需要基于原生prometheus
或者promxy或者grafana 等工具进行报警的处理,vmalert依托metrcisql 增强的扩展完善了VictoriaMetrics
参考资料
https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert