I. Basic introduction

vmagent is a small agent that helps you collect metrics from various sources, relabel and filter the collected metrics, and store them in VictoriaMetrics or any other storage system via the Prometheus remote_write protocol or the VictoriaMetrics remote_write protocol.

II. Background

VictoriaMetrics provides an efficient solution for storing and monitoring metrics, but its users needed something fast and light on memory for scraping metrics from Prometheus-compatible exporters into VictoriaMetrics. User infrastructures also differ widely, and no two are exactly alike. vmagent was added to make VictoriaMetrics more flexible: it can accept metrics over popular push protocols, and it can discover Prometheus-compatible targets and scrape metrics from them.

III. Features

vmagent:

- Can be used as a drop-in replacement for Prometheus for discovering and scraping targets such as node_exporter. Note that single-node VictoriaMetrics can also discover and scrape Prometheus-compatible targets in the same way as vmagent.
- Can add, remove, and modify labels via Prometheus-style relabeling, and can filter data before sending it to remote storage.
- Can accept data via all the ingestion protocols supported by VictoriaMetrics.
- Can aggregate incoming samples by time and by labels before sending them to remote storage.
- Can replicate collected metrics simultaneously to multiple Prometheus-compatible remote storage systems.
- Can save egress network bandwidth costs when sending data to VictoriaMetrics via the VictoriaMetrics remote write protocol.
- Works smoothly in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics are buffered at -remoteWrite.tmpDataPath and are sent as soon as the connection is restored. The maximum disk usage of the buffer can be limited with -remoteWrite.maxDiskUsagePerURL.
- Uses much less RAM, CPU, disk I/O, and network bandwidth than Prometheus. Memory and CPU usage can be reduced further if needed.
- Can spread scrape targets among multiple vmagent instances when a large number of targets must be scraped.
- Can load scrape configs from multiple files.
- Can efficiently scrape targets that expose millions of time series, such as the /federate endpoint in Prometheus.
- Can deal with high cardinality and high churn rate by limiting the number of unique time series at scrape time and before sending them to remote storage.
- Can write collected metrics to multiple tenants.
- Can read data from Kafka and write data to Kafka.
- Can read data from and write data to Google PubSub.

Details for each feature are in the vmagent documentation listed under the reference docs at the end.

1. Multiple vmagent instances scraping metrics, with targets split by hash

    statefulMode: true
    extraArgs:
      envflag.enable: "true"
      envflag.prefix: VM_
      loggerFormat: json
      httpListenAddr: ":8429"
      promscrape.dropOriginalLabels: "false"
      # key: add the cluster sharding parameters
      promscrape.cluster.membersCount: "2"
      # key: label every metric with the agent that scraped it
      promscrape.cluster.memberLabel: vmagent_instance
      # key: set memberNum to the pod name; vmagent extracts the numeric suffix from it
      promscrape.cluster.memberNum: $(POD_NAME)
      # deduplication
      streamAggr.dedupInterval: 15s
      streamAggr.dropInputLabels: replica
      # when enabled, the targets page still shows each target's most recent error, but the log is no longer flooded
      promscrape.suppressScrapeErrors: "true"
      promscrape.cluster.memberURLTemplate: http://vmagent-custom-agent-%d.monitoring.svc.cluster.local:8429/targets
      # maximum scrape response size to accept
      promscrape.maxScrapeSize: 128MB
    extraEnvs:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name

The /targets output of the two agents shows that agent-0 and agent-1 scrape different targets at the same time: each target appears on exactly one agent. This matters a great deal in a large-scale monitoring system.

vmagent-custom-agent-0:

    curl 10.246.107.189:8429/targets
    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vminsert/0 (1/1 up)
        state=up, endpoint=http://10.246.20.116:8480/metrics, labels={container="vminsert",endpoint="http",instance="10.246.20.116:8480",job="victoria-metrics-victoria-metrics-cluster-vminsert",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vminsert-6cc878f2sp85",service="victoria-metrics-victoria-metrics-cluster-vminsert",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=15.929s ago, scrape_duration=3ms, scrape_response_size=49.113KiB, samples_scraped=661, error=
    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vmselect/0 (2/2 up)
        state=up, endpoint=http://10.246.105.47:8481/metrics, labels={container="vmselect",endpoint="http",instance="10.246.105.47:8481",job="victoria-metrics-victoria-metrics-cluster-vmselect",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vmselect-95fcf5bq4dch",service="victoria-metrics-victoria-metrics-cluster-vmselect",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=2, scrapes_failed=0, last_scrape=11.976s ago, scrape_duration=3ms, scrape_response_size=50.388KiB, samples_scraped=643, error=
        state=up, endpoint=http://10.246.107.172:8481/metrics, labels={container="vmselect",endpoint="http",instance="10.246.107.172:8481",job="victoria-metrics-victoria-metrics-cluster-vmselect",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vmselect-95fcf5bcggsk",service="victoria-metrics-victoria-metrics-cluster-vmselect",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=13.698s ago, scrape_duration=3ms, scrape_response_size=51.166KiB, samples_scraped=653, error=
    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vmstorage/0 (1/1 up)
        state=up, endpoint=http://10.246.105.32:8482/metrics, labels={container="vmstorage",endpoint="http",instance="10.246.105.32:8482",job="victoria-metrics-victoria-metrics-cluster-vmstorage",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vmstorage-0",service="victoria-metrics-victoria-metrics-cluster-vmstorage",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=16.649s ago, scrape_duration=2ms, scrape_response_size=31.727KiB, samples_scraped=553, error=
    job=serviceScrape/vm/victoria-operator-victoria-metrics-operator/0 (1/1 up)
        state=up, endpoint=http://10.246.20.99:8080/metrics, labels={container="operator",endpoint="http",instance="10.246.20.99:8080",job="victoria-operator-victoria-metrics-operator",namespace="vm",pod="victoria-operator-victoria-metrics-operator-67d7769d59-ltn22",service="victoria-operator-victoria-metrics-operator",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=2, scrapes_failed=0, last_scrape=2.686s ago, scrape_duration=8ms, scrape_response_size=335.582KiB, samples_scraped=2876, error=
    job=serviceScrape/vm/vmagent-custom-agent/0 (1/1 up)
        state=up, endpoint=http://10.246.107.189:8429/metrics, labels={container="vmagent",endpoint="http",instance="10.246.107.189:8429",job="vmagent-custom-agent",namespace="vm",pod="vmagent-custom-agent-0",service="vmagent-custom-agent",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=24.940s ago, scrape_duration=2ms, scrape_response_size=55.216KiB, samples_scraped=870, error=
    job=serviceScrape/vm/vmalert-custom-alert/0 (1/1 up)
        state=up, endpoint=http://10.246.107.148:8080/metrics, labels={container="vmalert",endpoint="http",instance="10.246.107.148:8080",job="vmalert-custom-alert",namespace="vm",pod="vmalert-custom-alert-6f46bb8d55-c999x",service="vmalert-custom-alert",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=2, scrapes_failed=0, last_scrape=9.930s ago, scrape_duration=2ms, scrape_response_size=33.076KiB, samples_scraped=515, error=
    job=serviceScrape/vm/vmalertmanager-custom-alertmanager/0 (2/2 up)
        state=up, endpoint=http://10.246.113.90:9093/metrics, labels={container="alertmanager",endpoint="http",instance="10.246.113.90:9093",job="vmalertmanager-custom-alertmanager-nodeport",namespace="vm",pod="vmalertmanager-custom-alertmanager-0",service="vmalertmanager-custom-alertmanager-nodeport",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=29.998s ago, scrape_duration=3ms, scrape_response_size=59.288KiB, samples_scraped=591, error=
        state=up, endpoint=http://10.246.113.90:9093/metrics, labels={container="alertmanager",endpoint="http",instance="10.246.113.90:9093",job="vmalertmanager-custom-alertmanager",namespace="vm",pod="vmalertmanager-custom-alertmanager-0",service="vmalertmanager-custom-alertmanager",vmagent_instance="vmagent-custom-agent-0"}, scrapes_total=1, scrapes_failed=0, last_scrape=26.385s ago, scrape_duration=4ms, scrape (output truncated)

vmagent-custom-agent-1:

    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vminsert/0 (1/1 up)
        state=up, endpoint=http://10.246.12.59:8480/metrics, labels={container="vminsert",endpoint="http",instance="10.246.12.59:8480",job="victoria-metrics-victoria-metrics-cluster-vminsert",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vminsert-6cc878f2mr55",service="victoria-metrics-victoria-metrics-cluster-vminsert",vmagent_instance="vmagent-custom-agent-1"}, scrapes_total=3, scrapes_failed=0, last_scrape=12.212s ago, scrape_duration=2ms, scrape_response_size=51.016KiB, samples_scraped=684, error=
    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vmstorage/0 (1/1 up)
        state=up, endpoint=http://10.246.107.151:8482/metrics, labels={container="vmstorage",endpoint="http",instance="10.246.107.151:8482",job="victoria-metrics-victoria-metrics-cluster-vmstorage",namespace="vm",pod="victoria-metrics-victoria-metrics-cluster-vmstorage-1",service="victoria-metrics-victoria-metrics-cluster-vmstorage",vmagent_instance="vmagent-custom-agent-1"}, scrapes_total=2, scrapes_failed=0, last_scrape=17.820s ago, scrape_duration=3ms, scrape_response_size=31.824KiB, samples_scraped=554, error=
    job=serviceScrape/vm/vmagent-custom-agent/0 (1/1 up)
        state=up, endpoint=http://10.246.105.29:8429/metrics, labels={container="vmagent",endpoint="http",instance="10.246.105.29:8429",job="vmagent-custom-agent",namespace="vm",pod="vmagent-custom-agent-1",service="vmagent-custom-agent",vmagent_instance="vmagent-custom-agent-1"}, scrapes_total=3, scrapes_failed=0, last_scrape=1.481s ago, scrape_duration=2ms, scrape_response_size=55.213KiB, samples_scraped=869, error=
    job=serviceScrape/vm/victoria-metrics-victoria-metrics-cluster-vmselect/0 (0/0 up)
    job=serviceScrape/vm/victoria-operator-victoria-metrics-operator/0 (0/0 up)
    job=serviceScrape/vm/vmalert-custom-alert/0 (0/0 up)
    job=serviceScrape/vm/vmalertmanager-custom-alertmanager/0 (0/0 up)

2. Sharding

The operator supports sharding in vmagent cluster mode for scraping a large number of targets. Sharding distributes scraping among multiple VMAgent deployments.

Example usage. This is a complete example of the VMAgent high-availability features:

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: ha-example
    spec:
      # ...
      selectAllByDefault: true
      vmAgentExternalLabelName: vmagent_ha
      remoteWrite:
        - url: http://vmsingle-example.default.svc:8428/api/v1/write
      # Replication:
      scrapeInterval: 30s
      replicaCount: 2
      # StatefulMode:
      statefulMode: true
      statefulStorage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 20Gi
      # Sharding
      shardCount: 3
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: vmagent
                  app.kubernetes.io/instance: custom-agent
              topologyKey: kubernetes.io/hostname
      # ...

With shardCount: 3 and replicaCount: 2, this configuration produces 3 deployments (StatefulSets when statefulMode is enabled) with 2 replicas each. Each of them has its own shard number and scrapes only about 1/3 of all targets.

In addition, you can use the special placeholder %SHARD_NUM% in fields of the VMAgent spec. When the operator creates the Deployment or StatefulSet for vmagent, it replaces the placeholder with the current shard number. A typical use is in the podAntiAffinity section, matching on a shard-num label from the pod template, so that the scheduler does not place pods with the same shard number on the same node. The example above matches on the app.kubernetes.io labels instead and does not show the placeholder. You can use a different topologyKey to spread pods across availability zones or regions rather than nodes.

Note that the operator currently does not use the -promscrape.cluster.replicationFactor parameter of vmagent. Instead it creates replicaCount replicas for every shard, which increases resource consumption. This is expected to be fixed in a future release; see the corresponding upstream issue for details.

3. Inline additional scrape configs in the VMAgent CRD

Add the scrape configs directly to the VMAgent spec, in spec.inlineScrapeConfig. The value is raw text in YAML format. See the following example:

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: example
    spec:
      # ...
      selectAllByDefault: true
      inlineScrapeConfig: |
        - job_name: prometheus
          static_configs:
            - targets: [localhost:9090]
      remoteWrite:
        - url: http://vmsingle-example.default.svc:8428/api/v1/write
      # ...

Note: do not put passwords and tokens in inlineScrapeConfig; use a Secret instead.

4.
Defining additional scrape configs as a Kubernetes Secret

Define a Kubernetes Secret that holds the scrape configs under a key. In the example below the key is prometheus-additional.yaml:

    apiVersion: v1
    kind: Secret
    metadata:
      name: additional-scrape-configs
    stringData:
      prometheus-additional.yaml: |
        - job_name: prometheus
          static_configs:
            - targets: [localhost:9090]

After that, specify the name of the Secret and the key in the additionalScrapeConfigs section of the VMAgent CRD:

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: example
    spec:
      # ...
      selectAllByDefault: true
      additionalScrapeConfigs:
        name: additional-scrape-configs
        key: prometheus-additional.yaml
      remoteWrite:
        - url: http://vmsingle-example.default.svc:8428/api/v1/write
      # ...

Note: only one Secret can be specified in the VMAgent CRD configuration, so use it for all additional scrape configs.

5. Scraping Prometheus targets and writing to single-node VictoriaMetrics

Example command for scraping Prometheus targets and writing the data to single-node VictoriaMetrics:

    /path/to/vmagent -promscrape.config=/path/to/prometheus.yml -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write

6. Drop-in replacement for Prometheus

If you use Prometheus only for scraping metrics from various targets and forwarding them to remote storage, then vmagent can replace Prometheus. Typically vmagent requires less RAM, CPU, and network bandwidth than Prometheus. See the related docs for details.

7.
Relabeling and filtering

vmagent can add, remove, or update labels on the collected data before sending it to remote storage. It can also drop unwanted samples via Prometheus-like relabeling before sending the collected data to remote storage. See the relabeling guide for details.

Referencing relabel configs from the VMAgent CRD YAML:

    relabelConfig:
      name: vmagent-relabel
      key: relabel.yaml
    # remote write target: your vminsert address
    remoteWrite:
      - url: http://victoria-metrics-victoria-metrics-cluster-vminsert:8480/insert/0/prometheus
        urlRelabelConfig:
          name: vmagent-relabel
          key: target-1-relabel.yaml
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: vmagent-relabel
      namespace: vm
    data:
      relabel.yaml: |
        - source_labels: [aa]
          separator: foobar
          regex: foo.bar
          target_label: aaa
          replacement: xxx
        - action: keep
          source_labels: [aaa]
        - action: drop
          source_labels: [aaa]
        - action: replace
          replacement: prod
          target_label: cluster
        - action: replace
          source_labels: [namespace]
          target_label: ns
        - action: labeldrop
          regex: pod
        - action: drop
          source_labels: [__name__]
          regex: go_.*
        - action: drop
          source_labels: [job]
          regex: debug
      target-1-relabel.yaml: |
        - action: keep_if_equal
          source_labels: [foo, bar]
        - action: drop_if_equal
          source_labels: [foo, bar]

8. Flexible deduplication

Deduplication in stream aggregation allows arbitrarily complex deduplication schemes for the collected samples. For example, setting streamAggr.dedupInterval: 60s on its own instructs vmagent to send only the last sample of every time series for each 60-second interval. The following configuration additionally instructs vmagent to merge time series that differ only in the value of the replica label, and then to send only the last sample of every merged series every 60 seconds:

    streamAggr.dedupInterval: 60s
    streamAggr.dropInputLabels: replica

9. Listing all targets

    sum(vm_promscrape_scrape_pool_targets) by (scrape_job)

10.
Writing metrics to Kafka

Below is a complete example of writing metrics to Kafka:

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: ent-example
    spec:
      # enabling enterprise features
      license:
        keyRef:
          name: k8s-secret-that-contains-license
          key: key-in-a-secret-that-contains-license
      image:
        tag: v1.110.13-enterprise
      # using enterprise features: writing metrics to Kafka
      # more details about kafka integration you can read on https://docs.victoriametrics.com/victoriametrics/vmagent/#kafka-integration
      remoteWrite:
        # sasl with username and password
        - url: kafka://broker-1:9092/?topic=prom-rw-1&security.protocol=SASL_SSL&sasl.mechanisms=PLAIN
          # it requires to create kubernetes secret kafka-basic-auth with keys username and password in the same namespace
          basicAuth:
            username:
              name: kafka-basic-auth
              key: username
            password:
              name: kafka-basic-auth
              key: password
        # tls with CA, certificate, and key from a secret
        - url: kafka://localhost:9092/?topic=prom-rw-2&security.protocol=SSL
          # it requires to create kubernetes secret kafka-tls with keys ca.pem, cert.pem and key.pem in the same namespace
          tlsConfig:
            ca:
              secret:
                name: kafka-tls
                key: ca.pem
            cert:
              secret:
                name: kafka-tls
                key: cert.pem
            keySecret:
              name: kafka-tls
              key: key.pem
      # ...other fields...

11.
DaemonSet mode

vmagent can be configured to run as a DaemonSet instead of a Deployment or StatefulSet. The operator provides seamless switching between the launch modes: daemonSetMode, statefulMode, and the default mode.

Key features:

- Reduces the network traffic used for metrics scraping.
- Distributes the load of collecting metrics.
- Provides resilience to individual pod failures.

In this mode a VMAgent pod is started on every Kubernetes node. The operator configures VMAgent to apply a spec.nodeName field selector to its Kubernetes API requests for Pods. This field selector is supported only for role: pod and can be used only with VMPodScrape. It restricts the set of objects that VMAgent can select.

Config example:

    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - default
        selectors:
          - role: pod
            field: spec.nodeName=%{KUBE_NODE_NAME}

VMAgent object example:

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: per-node
    spec:
      selectAllByDefault: true
      daemonSetMode: true
      remoteWrite:
        - url: http://vmsingle-example.default.svc:8428/api/v1/write

daemonSetMode has the following restrictions:

- Sharding is not supported.
- podDisruptionBudget is not supported.
- horizontalPodAutoScaler is not supported.
- The volume for the persistent queue can be mounted via volumes and must be a hostPath or emptyDir.
- Only VMPodScrape is supported.
- Restarting vmagent causes a short gap in metrics collection.
- Only one pod from the DaemonSet is deployed on each node.

IV. Monitoring dashboard

https://grafana.com/grafana/dashboards/12683-victoriametrics-vmagent/

V. Production configuration

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    metadata:
      name: custom-agent
      namespace: monitoring
    spec:
      imagePullSecrets:
        - name: uhub-registry
      secrets:
        - etcd-client-cert
      scrapeInterval: 30s
      vmAgentExternalLabelName: vmagent_ha
      statefulMode: true
      #daemonSetMode: true
      replicaCount: 2
      #relabelConfig:
      #  name: vmagent-relabel
      #  key: relabel.yaml
      #inlineRelabelConfig:
      #  - target_label: bar1
      #  - source_labels: [aa]
      additionalScrapeConfigs:
        name: additional-scrape-configs
        key: prometheus-additional.yaml
      # Sharding
      #shardCount: 2
      # remote write target: your vminsert address; writing to Kafka is also supported
      remoteWrite:
        - url: http://vm-victoria-metrics-cluster-vminsert:8480/insert/0/prometheus
          # if vmagent_remotewrite_pending_data_bytes stays above 0 and keeps rising, increase the number of queues so data is written faster
          queues: 100
          maxBlockSize: 67108864
          maxRowsPerBlock: 20000
          # show the real remoteWrite URL (for debugging)
          showURL: true
          #urlRelabelConfig:
          #  name: vmagent-relabel
          #  key: target-1-relabel.yaml
          #inlineUrlRelabelConfig:
          #  - action: keep_if_equal
          #    source_labels: [foo1, bar2]
      #image:
      #  repository: uhub.service.ucloud.cn/base-image/victoriametrics/vmagent
      #  tag: v1.128.0
      #  pullPolicy: Always
      # image:
      #   repository: operator
      #   tag: v0.65.0
      extraArgs:
        envflag.enable: "true"
        envflag.prefix: VM_
        loggerFormat: json
        httpListenAddr: ":8429"
        promscrape.dropOriginalLabels: "false"
        # key: add the cluster sharding parameters
        promscrape.cluster.membersCount: "2"
        # key: label every metric with the agent that scraped it
        promscrape.cluster.memberLabel: vmagent_instance
        # key: set memberNum to the pod name; vmagent extracts the numeric suffix from it
        promscrape.cluster.memberNum: $(POD_NAME)
        # deduplication
        streamAggr.dedupInterval: 15s
        streamAggr.dropInputLabels: replica
        # when enabled, the targets page still shows each target's most recent error, but the log is no longer flooded
        promscrape.suppressScrapeErrors: "true"
        promscrape.cluster.memberURLTemplate: http://vmagent-custom-agent-%d.monitoring.svc.cluster.local:8429/targets
        # maximum scrape response size to accept
        promscrape.maxScrapeSize: 128MB
      extraEnvs:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      # if you have a custom configmap, it can be enabled here
      externalLabels:
        cluster: xxx-prod
      resources:
        limits:
          cpu: 8
          memory: 10Gi
        requests:
          cpu: 100m
          memory: 256Mi
      # ------------ Scrape configuration ---------------
      scrapeConfigSelector: {}  # optional: match all VMScrapeConfig objects
      #  matchLabels:
      #    app: custom-scrape
      serviceScrapeNamespaceSelector: {}  # allow all namespaces
      serviceScrapeSelector: {}  # match all VMServiceScrape objects
      podScrapeSelector: {}  # match all VMPodScrape objects
      podScrapeNamespaceSelector: {}  # allow all namespaces
      nodeScrapeSelector: {}
      staticScrapeSelector: {}
      probeSelector: {}
      serviceMonitor:
        enabled: true
        interval: 30s
      # scrape jobs are better kept in a Secret: easier to maintain and tidier
      #inlineScrapeConfig: |
      #  - job_name: prometheus
      #    static_configs:
      #      - targets: [localhost:9090]
      tolerations:
        - effect: NoSchedule
          key: service
          operator: Equal
          value: sre-victoria-metrics
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: vmagent
                  app.kubernetes.io/instance: custom-agent
              topologyKey: kubernetes.io/hostname
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: business
                    operator: In
                    values:
                      - sre
    #---
    #apiVersion: v1
    #kind: ConfigMap
    #metadata:
    #  name: vmagent-relabel
    #  namespace: vm
    #data:
    #  relabel.yaml: |
    #    #- action: replace
    #    #  replacement: prod
    #    #  target_label: cluster
    #    #- action: replace
    #    #  source_labels: [namespace]
    #    #  target_label: ns
    #    # drop a useless label (very safe)
    #    #- action: labeldrop
    #    #  regex: controller_revision_hash
    #    #- action: drop
    #    #  source_labels: [__name__]
    #    #  regex: go_threads
    #    # drop noisy time series
    #    #- action: drop
    #    #  source_labels: [__name__]
    #    #  regex: go_threads
    #    #- action: drop
    #    #  source_labels: [__name__]
    #    #  regex: go_.*
    #    #- action: drop
    #    #  source_labels: [job]
    #    #  regex: debug
    #  target-1-relabel.yaml: |
    #    - action: keep_if_equal
    #      source_labels: [foo, bar]
    #    - action: drop_if_equal
    #      source_labels: [foo, bar]
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: additional-scrape-configs
      namespace: monitoring
    stringData:
      prometheus-additional.yaml: |
        - job_name: kube-proxy-monitor/kube-proxy/0
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          honor_labels: false
          kubernetes_sd_configs:
            - namespaces:
                names:
                  - kube-system
              role: endpoints
          relabel_configs:
            - action: keep
              regex: kubelet
              source_labels:
                - __meta_kubernetes_service_label_k8s_app
            - action: keep
              regex: http-metrics
              source_labels:
                - __meta_kubernetes_endpoint_port_name
            - regex: Node;(.*)
              replacement: ${1}
              separator: ;
              source_labels:
                - __meta_kubernetes_endpoint_address_target_kind
                - __meta_kubernetes_endpoint_address_target_name
              target_label: node
            - replacement: kube-proxy
              target_label: job
            - regex: (.+)
              replacement: ${1}
              source_labels:
                - __meta_kubernetes_service_label_jobLabel
              target_label: job
            - replacement: http-metrics
              target_label: endpoint
            - regex: (.*):(.*)
              replacement: ${1}:10249
              separator: ":"
              source_labels:
                - __address__
              target_label: __address__
        - job_name: cadvisor
          kubernetes_sd_configs:
            - role: node
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          # Note: rules keyed on __name__, container, mountpoint, device, or name
          # filter scraped samples only under metric_relabel_configs; under
          # relabel_configs they are evaluated against target labels.
          relabel_configs:
            - action: labeldrop
              regex: (id|container_id|image|image_id|container_hash|pod_uid|uid|node_name|name|device|interface|mountpoint|endpoint)
            - action: drop
              source_labels: [container]
              regex: POD|pause
            - action: drop
              source_labels: [mountpoint]
              regex: /var/lib/kubelet/pods/.+|/var/lib/docker/.+|/run/containerd/.+
            - action: drop
              source_labels: [device]
              regex: veth.*|cni.*|docker.*|tun.*
            - action: drop
              source_labels: [name]
              regex: .+_[0-9a-f]{8,}$
            - action: drop
              source_labels: [__name__]
              regex: container_fs_.*
            - source_labels: [__name__]
              regex: .*_bucket
              action: drop
            - source_labels: [__name__]
              regex: container_cpu_cfs_.*|container_blkio_.*|container_tasks_.*|container_hugetlb_.*
              action: drop
            - action: labelmap
              regex: __meta_kubernetes_node_label_(.+)
            - target_label: __address__
              replacement: kubernetes.default.svc:443
            - source_labels: [__meta_kubernetes_node_name]
              regex: (.+)
              target_label: __metrics_path__
              replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
        - job_name: xxx-xxx-nginx-exporter
          metrics_path: /metrics
          scrape_timeout: 20s
          static_configs:
            - targets:
                - 10.50.66.179:9113
                - 10.50.101.137:9113
              labels:
                external: ucloud-nginx
                severity: critical
                business: xxx
        - job_name: nacos-monitor
          metrics_path: /nacos/actuator/prometheus
          static_configs:
            - targets:
                - 10.53.4.126:8848
                - 10.53.5.43:8848
                - 10.53.6.192:8848
        - job_name: blackbox-exporter-http200
          metrics_path: /probe
          params:
            module: [http_2xx]
          static_configs:
            - targets:
                - https://api.xx.xxx
                - https://xxx.xxx.cn
                - https://xxx.cn
          relabel_configs:
            - source_labels: [__address__]
              target_label: __param_target
            - source_labels: [__param_target]
              target_label: instance
            - target_label: __address__
              replacement: blackbox-exporter.sre.svc.cluster.local:9115
        - job_name: blackbox-exporter-tcp
          metrics_path: /probe
          params:
            module: [tcp_connect]
          static_configs:
            - targets:
                - 117.50.46.70:443
                - 117.50.218.62:443
          relabel_configs:
            - source_labels: [__address__]
              target_label: __param_target
            - source_labels: [__param_target]
              target_label: instance
            - target_label: __address__
              replacement: blackbox-exporter.sre.svc.cluster.local:9115

Creating the etcd secret

vmagent has to authenticate when it pulls etcd monitoring data, so the etcd client certificate is provided as a Secret. Here the Secret is exported from an existing one in the k8s-monitor namespace, edited as needed, and applied; it can also be created from the files under /etc/kubernetes/ssl.

    kubectl get secrets etcd-client-cert -o yaml -n k8s-monitor > etcd-client-cert.yaml
    vim etcd-client-cert.yaml
    kubectl apply -f etcd-client-cert.yaml
    secret/etcd-client-cert created

Reference docs

https://docs.victoriametrics.com/victoriametrics/vmagent/
https://docs.victoriametrics.com/operator/resources/
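
As a rough illustration of the hash-based split from section 1 (promscrape.cluster.membersCount and promscrape.cluster.memberNum), the Python sketch below assigns each target to one member by hashing its labels. It only illustrates the idea: the article does not show vmagent's real hash function or how it serializes labels, so SHA-256, the label serialization, the function names member_for_target and shard_targets, and the sample targets are all assumptions made for this example.

```python
import hashlib


def member_for_target(labels, members_count):
    """Return the member index in [0, members_count) that owns this target."""
    # Serialize labels in sorted order so the same target always maps
    # to the same member, regardless of dict ordering.
    key = ",".join(f"{name}={value}" for name, value in sorted(labels.items()))
    # SHA-256 is an assumption for this sketch, not vmagent's real hash.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % members_count


def shard_targets(targets, members_count):
    """Group targets by the member that would scrape them."""
    shards = {member: [] for member in range(members_count)}
    for labels in targets:
        shards[member_for_target(labels, members_count)].append(labels)
    return shards


if __name__ == "__main__":
    # Hypothetical targets; the addresses are made up for the example.
    targets = [
        {"job": "node", "instance": f"10.0.0.{i}:9100"} for i in range(1, 9)
    ]
    for member, owned in shard_targets(targets, 2).items():
        print(member, [t["instance"] for t in owned])
```

In this sketch every member can evaluate the same function over the full target list and keep only the targets whose index equals its own member number, so the members need no coordination. That matches the /targets output in section 1, where each target is listed under exactly one agent.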