## Series
- Part 1: AI Data Governance: Advanced Use of the LangChain4J Text Classifier for Field Mapping

## Preface

The "slow / expensive / unstable" problems of AI services are usually not single-point failures but whole-chain problems:

- Is prompt assembly too heavy?
- Are tool calls jittery?
- Is LLM inference timing out or getting rate-limited?
- Do tokens suddenly spike in certain scenarios?

✅ Without observability you can only guess; with observability you can pinpoint issues with data.

This article uses a runnable demo to break a single `/api/chat` request into a clear chain of entry → tool → LLM → response, with Traces / Metrics / Logs wired up end to end.

## 1. Overview

### 1.1 End-to-End Topology

### 1.2 Component Responsibilities

| Component | What it does | What you get |
| --- | --- | --- |
| LangChain4J | Calls an OpenAI-compatible model (DashScope compatible-mode) | LLM response content + TokenUsage |
| Micrometer Observation | Business-semantic instrumentation, span segmentation, context propagation | `ai.pipeline` / `ai.tool.*` / `ai.llm` |
| OpenTelemetry OTLP | Standardized export protocol | Unified reporting of traces / metrics |
| Jaeger | Trace collection and UI | Span latency, errors, upstream/downstream relationships |
| Prometheus v3 | OTLP metrics ingestion | QPS, latency histograms, token distributions |
| Grafana | Unified dashboards | Metrics + Logs (optionally add a Trace data source) |
| Loki + Promtail | Log collection and search | Structured logs + trace-correlated troubleshooting |

⚠️ The low-cardinality principle matters a lot: only use bounded enumerations such as `model` / `scene` / `provider` / `tool` as metric/span tags. Never use `sessionId` or `message` as tags, or you get cardinality explosion, storage bloat, and slow queries.

**Application side (Spring Boot + LangChain4J)**

- LangChain4J (`OpenAiChatModel`): calls a model speaking the OpenAI-compatible protocol (here, DashScope's compatible-mode)
- Micrometer Observation / `@Observed`: semantic instrumentation in business code, splitting the AI pipeline into multiple spans
- `MeterRegistry` (Counter / Timer / Summary): turns LLM latency and token usage into queryable metrics

**Tracing (OpenTelemetry + Jaeger)**

- `spring-boot-starter-opentelemetry`: gives Spring Boot 4 OTel tracing out of the box (Micrometer Tracing bridge + exporter)
- OTLP exporter: ships trace data to Jaeger over OTLP
- Jaeger all-in-one: receives OTLP and serves the UI (16686) for browsing call chains

**Metrics (Prometheus OTLP Receiver + Grafana)**

- Prometheus v3 OTLP receiver: directly ingests the metrics the app pushes over OTLP
- Grafana: Prometheus data source for dashboards and alerting

**Logging (structured logs + Loki + Promtail)**

- ECS structured logs: field-structured log lines that are easier to search and aggregate
- Loki: stores the logs
- Promtail: tails Docker container logs and pushes them to Loki

## 2. Code Walkthrough

### 2.1 Application Side: Configuration and Model Access

#### 2.1.1 Configuration Properties: One Place for Provider + Tools

```java
package org.example.config;

import jakarta.validation.constraints.NotBlank;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.validation.annotation.Validated;

import java.time.Duration;

@Validated
@ConfigurationProperties(prefix = "ai.provider")
public record AiProviderProperties(
        @NotBlank String baseUrl,
        @NotBlank String model,
        Duration timeout) {
}
```

```java
package org.example.config;

import jakarta.validation.constraints.NotBlank;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.validation.annotation.Validated;

@Validated
@ConfigurationProperties(prefix = "ai.tools")
public record AiToolsProperties(@NotBlank String baseUrl) {
}
```

✅ Extracting the model URL, model name, timeout, and tool URL into "operable configuration" means later multi-model routing or canary strategies never touch business code.
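One detail worth making explicit: `@ConfigurationProperties` records still have to be registered with the container. The demo registers `AiProviderProperties` via `@EnableConfigurationProperties` in the next section; `AiToolsProperties` needs the same treatment. A minimal sketch of a main class using `@ConfigurationPropertiesScan` to pick up both at once (hypothetical; the original demo's main class is not shown):

```java
package org.example;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.properties.ConfigurationPropertiesScan;

// Hypothetical main class: @ConfigurationPropertiesScan registers every
// @ConfigurationProperties record under org.example.config, so no
// per-record @EnableConfigurationProperties declaration is needed.
@SpringBootApplication
@ConfigurationPropertiesScan("org.example.config")
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
```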
#### 2.1.2 LangChain4J: OpenAiChatModel (OpenAI-Compatible Protocol)

```java
package org.example.config;

import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableConfigurationProperties(AiProviderProperties.class)
public class LangChain4jConfig {

    @Bean
    public ChatModel chatModel(AiProviderProperties props) {
        return OpenAiChatModel.builder()
                .baseUrl(props.baseUrl())
                .apiKey(System.getenv("LANGCHAIN4J_KEY"))
                .modelName(props.model())
                .timeout(props.timeout())
                .build();
    }
}
```

Role: expose a single `ChatModel` so the business layer depends only on the interface.

Observability payoff: later we tag the `ai.llm` span with `model` / `provider` dimensions, so Jaeger can filter by model and metrics can be grouped by `model`.

#### 2.1.3 Tool-Call Client: Unified RestClient Builder Configuration

```java
package org.example.config;

import org.springframework.boot.restclient.autoconfigure.RestClientBuilderConfigurer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestClient;

@Configuration
public class RestClientConfig {

    @Bean
    public RestClient.Builder restClientBuilder(RestClientBuilderConfigurer configurer) {
        return configurer.configure(RestClient.builder());
    }
}
```

Role: route `RestClient` through Spring Boot's unified configuration (interceptors, timeouts, observation, and so on).

Trace value: tool calls also become independent spans in the trace (the business code additionally wraps them in a manual Observation).

#### 2.1.4 Entry Controller

```java
package org.example.controller;

import jakarta.annotation.Resource;
import org.example.dto.ChatRequest;
import org.example.dto.ChatResponse;
import org.example.service.AiChatService;
import org.springframework.web.bind.annotation.*;

@RestController
public class ChatController {

    @Resource
    private AiChatService aiChatService;

    @PostMapping("/api/chat")
    public ChatResponse chat(@RequestBody ChatRequest chatRequest) {
        return aiChatService.chat(chatRequest);
    }
}
```

DTOs:

```java
package org.example.dto;

public record ChatRequest(String sessionId, String scene, String message) {
}
```

```java
package org.example.dto;

public record ChatResponse(String sessionId, String answer) {
}
```
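With the controller and DTOs in place, the pipeline can be exercised directly. A quick sketch, assuming the app runs locally on port 8080 per the configuration in section 2.3 (field values are arbitrary; `scene` should stay a low-cardinality enum, see section 2.2.2):

```bash
curl -s -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"sessionId": "demo-1", "scene": "faq", "message": "What is observability?"}'

# Expected response shape (answer text varies):
# {"sessionId":"demo-1","answer":"..."}
```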
### 2.2 Observability Core: Pipeline Segmentation + Traces + Metrics + Tokens

#### 2.2.1 Downstream Tool Service (Demo Dependency + Trace Propagation)

```java
package org.example.controller;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.time.Instant;
import java.util.Map;

@RestController
public class ToolController {

    @GetMapping("/api/tools/time")
    public Map<String, Object> time() {
        return Map.of("now", Instant.now().toString());
    }
}
```

Role: simulates an internal tool/dependency service. In a real project this could be a config lookup, a user-profile lookup, a database query, a rule engine, and so on.

Trace value: in Jaeger you can see how long `ai.tool.time` takes and decide whether the slowness lives in the tool or in the LLM.

#### 2.2.2 AiChatService: Splitting One AI Request into 3 Spans

✅ Suggested span names:

- `ai.pipeline`: the whole chain (entry to response)
- `ai.tool.time`: the tool call
- `ai.llm`: the model call

✅ Metrics:

- `ai_chat_requests_total`: request volume
- `ai_llm_latency`: LLM latency histogram (P95/P99)

```java
package org.example.service;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.output.TokenUsage;
import io.micrometer.core.instrument.*;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationRegistry;
import io.micrometer.observation.annotation.Observed;
import jakarta.annotation.Resource;
import org.example.config.AiProviderProperties;
import org.example.config.AiToolsProperties;
import org.example.dto.ChatRequest;
import org.example.dto.ChatResponse;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;

import java.util.Map;

@Service
public class AiChatService {

    @Resource
    private ChatModel chatModel;

    @Resource
    private ObservationRegistry observationRegistry;

    private final Counter requests;
    private final Timer llmLatency;
    private final DistributionSummary promptTokens;
    private final DistributionSummary completionTokens;
    private final RestClient toolClient;
    private final AiProviderProperties props;

    public AiChatService(AiProviderProperties props,
                         AiToolsProperties toolsProperties,
                         RestClient.Builder restClientBuilder,
                         MeterRegistry meterRegistry) {
        this.props = props;

        // Internal tool client (demonstrates trace segmentation)
        this.toolClient = restClientBuilder
                .baseUrl(System.getProperty("TOOLS_BASE_URL", toolsProperties.baseUrl()))
                .build();

        // Metrics: request volume
        this.requests = Counter.builder("ai_chat_requests_total")
                .description("Total AI chat requests")
                .tag("model", props.model())
                .register(meterRegistry);

        // Metrics: LLM latency
        this.llmLatency = Timer.builder("ai_llm_latency")
                .description("LLM call latency")
                .tag("model", props.model())
                .publishPercentileHistogram()
                .register(meterRegistry);

        // Metrics: tokens
        this.promptTokens = DistributionSummary.builder("ai_llm_tokens_prompt")
                .description("Prompt tokens")
                .tag("model", props.model())
                .register(meterRegistry);
        this.completionTokens = DistributionSummary.builder("ai_llm_tokens_completion")
                .description("Completion tokens")
                .tag("model", props.model())
                .register(meterRegistry);
    }

    @Observed(name = "ai.chat", contextualName = "ai-chat")
    public ChatResponse chat(ChatRequest req) {
        return Observation.createNotStarted("ai.pipeline", observationRegistry)
                .lowCardinalityKeyValue("scene", safe(req.scene()))
                .lowCardinalityKeyValue("model", props.model())
                .observe(() -> {
                    String prompt = "你是企业级 Java AI 助手，请用工程化语言回答。\n"
                            + "当前时间: " + fetchNowFromTool() + "\n"
                            + "场景: " + safe(req.scene()) + "\n"
                            + "用户问题: " + safe(req.message()) + "\n";
                    requests.increment();
                    String answer = llmLatency.record(() -> invokeModel(prompt));
                    return new ChatResponse(req.sessionId(), answer);
                });
    }

    private String fetchNowFromTool() {
        return Observation.createNotStarted("ai.tool.time", observationRegistry)
                .lowCardinalityKeyValue("tool", "time")
                .observe(() -> {
                    Map<?, ?> body = toolClient.get()
                            .uri("/api/tools/time")
                            .retrieve()
                            .body(Map.class);
                    Object now = (body != null) ? body.get("now") : null;
                    return now != null ? now.toString() : "unknown";
                });
    }

    private String invokeModel(String prompt) {
        return Observation.createNotStarted("ai.llm", observationRegistry)
                .lowCardinalityKeyValue("provider", "openai-compatible")
                .lowCardinalityKeyValue("model", props.model())
                .observe(() -> {
                    dev.langchain4j.model.chat.response.ChatResponse resp =
                            chatModel.chat(UserMessage.from(prompt));
                    TokenUsage usage = resp.tokenUsage();
                    if (usage != null) {
                        if (usage.inputTokenCount() != null) promptTokens.record(usage.inputTokenCount());
                        if (usage.outputTokenCount() != null) completionTokens.record(usage.outputTokenCount());
                    }
                    return resp.aiMessage().text();
                });
    }

    private static String safe(String s) {
        return (s == null || s.isBlank()) ? "unknown" : s;
    }
}
```
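A practical companion to the low-cardinality warning in section 1.2 is cutting observation noise at the source. Spring Boot applies `ObservationPredicate` beans to the auto-configured `ObservationRegistry`, which can keep actuator traffic out of traces and metrics. A minimal sketch (an optional tuning, not part of the original demo):

```java
package org.example.config;

import io.micrometer.observation.ObservationPredicate;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.server.observation.ServerRequestObservationContext;

@Configuration
public class ObservationFilterConfig {

    // Drop server observations for /actuator/** so dashboards and Jaeger
    // only show real traffic. Spring Boot wires this predicate into the
    // auto-configured ObservationRegistry automatically.
    @Bean
    ObservationPredicate skipActuatorObservations() {
        return (name, context) -> {
            if (context instanceof ServerRequestObservationContext serverContext) {
                return !serverContext.getCarrier().getRequestURI().startsWith("/actuator");
            }
            return true;
        };
    }
}
```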
### 2.3 Configuration: Tracing / Metrics / Logging

```properties
# App
spring.application.name=ai-observability
server.port=8080

# AI Provider
ai.provider.base-url=https://dashscope.aliyuncs.com/compatible-mode/v1
ai.provider.model=qwen-long
ai.provider.timeout=10s
ai.tools.base-url=http://localhost:8080

# Actuator
management.endpoints.web.exposure.include=health,info,metrics,loggers,threaddump,httpexchanges

# Tracing (demo: 100% sampling)
management.tracing.sampling.probability=1.0
management.opentelemetry.tracing.export.otlp.endpoint=http://localhost:4318/v1/traces
management.opentelemetry.tracing.export.otlp.transport=http
management.opentelemetry.tracing.export.otlp.timeout=5s

# Metrics (OTLP -> Prometheus OTLP Receiver)
management.otlp.metrics.export.url=http://localhost:9090/api/v1/otlp/v1/metrics
management.otlp.metrics.export.step=10s

# Logging
logging.structured.format.console=ecs
logging.level.org.example=INFO
logging.level.io.opentelemetry.exporter=INFO
logging.level.io.micrometer.tracing=INFO
```

Key points:

- Tracing: OTLP traces go to Jaeger's port 4318 (HTTP).
- Metrics: OTLP metrics are pushed straight to Prometheus v3's OTLP receiver.
- Logging: ECS structured logs for easy Loki search (injecting a traceId field is a recommended follow-up).

### 2.4 Observability Stack: Docker Compose

#### 2.4.1 docker-compose.yml

```yaml
services:
  prometheus:
    image: prom/prometheus:v3.8.0
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --web.enable-otlp-receiver
    volumes:
      - ./observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"

  jaeger:
    image: jaegertracing/all-in-one:1.76.0
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - "16686:16686"
      - "4317:4317"
      - "4318:4318"

  loki:
    image: grafana/loki:latest
    command: ["-config.file=/etc/loki/config.yml"]
    volumes:
      - ./observability/loki-config.yml:/etc/loki/config.yml:ro
    ports:
      - "3100:3100"

  grafana:
    image: grafana/grafana:latest
    environment:
      GF_SECURITY_ADMIN_USER: admin
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - ./observability/grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "3001:3000"
    depends_on:
      - prometheus
      - loki
      - jaeger
```

Entry points:

- Grafana: http://localhost:3001 (admin/admin)
- Jaeger: http://localhost:16686
- Prometheus: http://localhost:9090
- Loki: http://localhost:3100
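One gap to be aware of: the compose file above does not actually define a Promtail service, even though the logging stack in section 1.2 relies on Promtail to ship container logs into Loki for the LogQL queries in 2.6.2. A minimal sketch of the missing pieces (image tag, file names, and the `container` label are assumptions chosen to match the queries in 2.6.2):

```yaml
# Additional service for docker-compose.yml (assumed)
  promtail:
    image: grafana/promtail:latest
    command: ["-config.file=/etc/promtail/config.yml"]
    volumes:
      - ./observability/promtail-config.yml:/etc/promtail/config.yml:ro
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - loki
```

```yaml
# observability/promtail-config.yml (assumed)
server:
  http_listen_port: 9080
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # Expose the Docker container name as the `container` label
      # used by the LogQL examples in section 2.6.2
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
```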
#### 2.4.2 Grafana Data Sources (datasources.yml)

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```

Role: Grafana works out of the box; metrics come from Prometheus, logs from Loki.

### 2.5 Verification

1. Start the observability stack: `docker compose up -d`
2. Start the application and send a request
3. Check the results:
   - Jaeger (http://localhost:16686): pick the `ai-observability` service; you should see the `ai.pipeline` / `ai.tool.time` / `ai.llm` spans, with the HTTP call chain stitched together
   - Prometheus (http://localhost:9090): query `ai_chat_requests_total`, `ai_llm_latency_seconds_count`, and the other metrics pushed in over OTLP
   - Grafana (http://localhost:3001, admin/admin): Explore → Prometheus / Loki for ad-hoc queries

### 2.6 Grafana Query Examples (PromQL + LogQL)

Notes on Prometheus metric naming:

- A Micrometer `Timer` is usually exported as `*_seconds_bucket` / `*_seconds_sum` / `*_seconds_count` (with histograms enabled)
- A `Counter` generally keeps its name, e.g. `ai_chat_requests_total`
- A `DistributionSummary` is usually `*_sum` / `*_count` / `*_max`
- If your metric names differ, search for `ai_` in Grafana (Explore → Metrics) first and substitute the actual names.

#### 2.6.1 Prometheus (PromQL) Examples

1. AI request QPS (requests per second):

```promql
sum by (model) (rate(ai_chat_requests_total[5m]))
```

2. Requests per 1-minute window (more intuitive):

```promql
sum by (model) (increase(ai_chat_requests_total[1m]))
```

3. LLM mean latency (last 5 minutes):

```promql
sum by (model) (rate(ai_llm_latency_seconds_sum[5m]))
/
sum by (model) (rate(ai_llm_latency_seconds_count[5m]))
```

4. LLM call count (using the Timer's count):

```promql
sum by (model) (rate(ai_llm_latency_seconds_count[5m]))
```

5. Mean prompt tokens (last 5 minutes):

```promql
sum by (model) (rate(ai_llm_tokens_prompt_sum[5m]))
/
sum by (model) (rate(ai_llm_tokens_prompt_count[5m]))
```

6. Mean completion tokens (last 5 minutes):

```promql
sum by (model) (rate(ai_llm_tokens_completion_sum[5m]))
/
sum by (model) (rate(ai_llm_tokens_completion_count[5m]))
```

7. Prompt token peak (last 15 minutes):

```promql
max_over_time(ai_llm_tokens_prompt_max[15m])
```

8. HTTP QPS of `/api/chat` (Spring default metric):

```promql
sum(rate(http_server_requests_seconds_count{uri="/api/chat"}[5m]))
```

9. P95 latency of `/api/chat` (Spring default metric):

```promql
histogram_quantile(0.95,
  sum by (le) (rate(http_server_requests_seconds_bucket{uri="/api/chat"}[5m]))
)
```

#### 2.6.2 Loki (LogQL) Examples

With `logging.structured.format.console=ecs` enabled, log lines are usually JSON. The common pattern is to parse fields with `| json` first, then filter/aggregate. In ECS the trace field is typically `trace.id` in the JSON; the LogQL `json` parser flattens it to the `trace_id` label. Check one raw log line and use whatever field name actually appears.

View application logs (filter by container name):

```logql
{container=~".*ai-observability.*"}
```

Only ERROR:

```logql
{container=~".*ai-observability.*"} | json | level="ERROR"
```

Search by keyword (e.g. `ai.llm`):

```logql
{container=~".*ai-observability.*"} |= "ai.llm"
```

Stitch logs together by traceId:

```logql
{container=~".*ai-observability.*"} | json | trace_id="YOUR_TRACE_ID"
```

ERROR count over 5 minutes (for charts):

```logql
sum(count_over_time({container=~".*ai-observability.*"} | json | level="ERROR" [5m]))
```

Count by error keyword (e.g. `timeout`):

```logql
sum(count_over_time({container=~".*ai-observability.*"} |= "timeout" [5m]))
```

## Summary

The point of this setup is not "getting OTel running" but turning AI requests into an explainable, quantifiable, traceable pipeline:

- Tracing: splits one AI request into pipeline → tool → llm, so bottlenecks are located with data instead of guesswork
- Metrics: Timer + Counter + token Summary turn performance and cost into a data asset
- Logging: structured logs land in Loki, providing the evidence chain for troubleshooting

Next upgrades:

- RAG: add `ai.rag.retrieve` / `ai.rerank` spans to locate retrieval and rerank latency
- Memory: add `ai.memory.read` / `ai.memory.write` spans to observe hit rates and cold-start cost
- Error taxonomy: distinct error events for timeouts, rate limiting, 5xx, and parse failures, feeding alerting and attribution
- Cost dashboards: aggregate tokens by `model` / `scene` for top-scenario and threshold alerts (two dimensions: P95 latency + tokens)