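## Preface

An API gateway is a key component of a microservice architecture. We evolved from a simple Nginx reverse proxy to a full-featured API gateway system, and learned a lot along the way.

## 1. Where the Problems Began

At first, we used Nginx as a reverse proxy:

```nginx
upstream backend {
    server app1:8080;
    server app2:8080;
    server app3:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}
```

This was fine while traffic was low. As the business grew, problems appeared:

- No unified authentication: every service had to implement its own login logic
- No rate limiting: a single malicious user could bring down the whole system
- No routing control: requests could not be routed dynamically based on their content
- No observability: request chains could not be traced

## 2. Building Our Own API Gateway

We decided to build our own API gateway. Its core features are as follows.

### 2.1 Authentication and Authorization

```python
from functools import wraps

from flask import Flask, request

app = Flask(__name__)

def require_auth(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        token = request.headers.get("Authorization")
        # verify_token and proxy_to_backend are gateway helpers not shown in this article
        if not token or not verify_token(token):
            return {"error": "Unauthorized"}, 401
        return f(*args, **kwargs)
    return decorated

@app.route("/api/users")
@require_auth
def get_users():
    return proxy_to_backend("user-service", request)
```

### 2.2 Rate Limiting

```python
from ratelimit import limits, sleep_and_retry

@sleep_and_retry
@limits(calls=100, period=60)  # at most 100 requests per 60 seconds
def handle_request(client_id):
    # The decorators enforce the limit; the route handler does the proxying.
    # Note: the counter is shared by all callers, so client_id does not
    # give each client its own quota here.
    return None

@app.before_request
def rate_limit():
    client_id = request.headers.get("X-Client-ID")
    handle_request(client_id)
```

### 2.3 Request Routing

```python
ROUTES = {
    "/api/users": "user-service:8080",
    "/api/orders": "order-service:8080",
    "/api/products": "product-service:8080",
}

@app.route("/api/<path:path>", methods=["GET", "POST", "PUT", "DELETE"])
def route_request(path):
    full_path = f"/api/{path}"
    # Exact match only: sub-paths such as /api/users/42 are not covered by this table
    backend = ROUTES.get(full_path)
    if not backend:
        return {"error": "Not Found"}, 404
    return proxy_to_backend(backend, request)
```

### 2.4 Distributed Tracing

```python
import uuid

from flask import g, request
from opentelemetry import trace  # imported in the original; not used in this snippet

@app.before_request
def add_trace_id():
    # Reuse the caller's trace ID if present, otherwise generate one.
    # Flask's request.headers is immutable, so the ID is kept on g;
    # proxy_to_backend should send it to the backend as X-Trace-ID.
    g.trace_id = request.headers.get("X-Trace-ID") or str(uuid.uuid4())

@app.after_request
def log_request(response):
    print(f"Trace-ID: {g.trace_id}, "
          f"Method: {request.method}, "
          f"Path: {request.path}, "
          f"Status: {response.status_code}")
    return response
```

## 3. High-Availability Rework

After the first version of the gateway had been running for a while, it suffered a single-point failure. We reworked it for high availability.

### 3.1 Multi-Instance Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
        - name: gateway
          image: api-gateway:v1.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: 256Mi
              cpu: 100m
            limits:
              memory: 512Mi
              cpu: 500m
```

### 3.2 Load Balancing

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-gateway
spec:
  type: LoadBalancer
  selector:
    app: api-gateway
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

### 3.3 Failover

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retry():
    session = requests.Session()
    # Retry up to 3 times with exponential backoff on common upstream errors
    retry = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504],
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    return session
```

## 4. Performance Optimization

### 4.1 Caching Strategy

```python
from functools import lru_cache

# lru_cache has no expiry: an entry stays until it is evicted by newer keys
@lru_cache(maxsize=1000)
def get_user_profile(user_id):
    return proxy_to_backend("user-service", f"/users/{user_id}")

@app.route("/api/users/<user_id>")
def fetch_user(user_id):
    return get_user_profile(user_id)
```

### 4.2 Asynchronous Processing

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)

@app.route("/api/batch")
def batch_request():
    # Fan out to the backends in parallel, then wait for all of them
    futures = []
    for service in ["service1", "service2", "service3"]:
        future = executor.submit(proxy_to_backend, service)
        futures.append(future)
    results = [f.result() for f in futures]
    return results  # Flask 2.2+ serializes a returned list as JSON
```

## 5. Collaboration Challenges in Multilingual Teams

In an international team, the gateway's error logs and alert messages need to be available in multiple languages. We use 同言翻译 (Transync AI) to automatically translate the gateway's error messages and documentation, so that team members around the world can understand and resolve problems quickly.

## 6. Monitoring and Alerting

```python
import time

from flask import g, request
from prometheus_client import Counter, Histogram, start_http_server

# Request counter
request_count = Counter(
    "gateway_requests_total", "Total requests", ["method", "path", "status"]
)

# Request latency histogram
request_duration = Histogram(
    "gateway_request_duration_seconds", "Request duration"
)

@app.before_request
def start_timer():
    g.start_time = time.time()

@app.after_request
def record_metrics(response):
    duration = time.time() - g.start_time
    request_count.labels(
        method=request.method,
        path=request.path,
        status=response.status_code,
    ).inc()
    request_duration.observe(duration)
    return response

# Start the Prometheus metrics server
start_http_server(8081)
```

## 7. Performance Comparison

| Metric | Before | After | Change |
| --- | --- | --- | --- |
| QPS | 5,000 | 20,000 | +300% |
| P99 latency | 500 ms | 50 ms | -90% |
| Availability | 99.5% | 99.95% | +0.45 percentage points |
| Failure recovery time | 10 minutes | 30 seconds | -95% |

## 8. Best Practices

- Separate concerns: implement authentication, rate limiting, routing, and similar logic independently
- Observability first: build thorough logging, monitoring, and distributed tracing
- Progressive rollout: release new features with canary deployments to avoid the risk of a full rollout
- Regular review: periodically analyze the gateway's bottlenecks and optimization opportunities
- Thorough documentation: the gateway's rules and configuration need clear documentation

## 9. Conclusion

Taking an API gateway from a simple reverse proxy to a full-featured system was full of challenges. Those same challenges made our architecture more robust and more efficient. We hope this article gives you some ideas. If you are building an API gateway too, feel free to share your experience.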
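As a supplement to the rate limiting in section 2.2: the problem statement talks about a single malicious user, which suggests a quota per client rather than one shared counter. The following is a minimal sketch of one way to do that with a sliding window keyed by client ID. The class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are illustrative assumptions, not part of the gateway described above.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: at most `calls` requests per `period` seconds per client."""

    def __init__(self, calls=100, period=60.0, clock=time.monotonic):
        self.calls = calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be exercised without sleeping
        self._hits = defaultdict(deque)  # client_id -> timestamps of recent requests
        self._lock = threading.Lock()

    def allow(self, client_id):
        """Return True and record the request if the client is under its quota."""
        now = self.clock()
        with self._lock:
            hits = self._hits[client_id]
            # Drop timestamps that have fallen out of the window
            while hits and now - hits[0] >= self.period:
                hits.popleft()
            if len(hits) >= self.calls:
                return False
            hits.append(now)
            return True
```

In a Flask `before_request` hook, one could call `allow()` with the `X-Client-ID` header value and return a 429 response when it returns False. The state lives in one process, so with several gateway replicas each instance counts separately; a limit shared across instances would need a shared store.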
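As a supplement to the request routing in section 2.3: a route table keyed by exact path does not cover sub-paths such as `/api/users/42`. One possible approach, sketched here under the assumption that route keys are path prefixes, is to pick the longest matching prefix. The function name `resolve_backend` is hypothetical; the `ROUTES` table is copied from section 2.3.

```python
ROUTES = {
    "/api/users": "user-service:8080",
    "/api/orders": "order-service:8080",
    "/api/products": "product-service:8080",
}


def resolve_backend(path, routes=ROUTES):
    """Return the backend for the longest route prefix matching `path`, or None."""
    best_prefix = None
    for prefix in routes:
        # Match the prefix itself or any sub-path below it, but not
        # look-alikes such as /api/usersX
        if path == prefix or path.startswith(prefix + "/"):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return routes[best_prefix] if best_prefix is not None else None
```

The linear scan is fine for a handful of routes; a larger table would likely call for a trie or a sorted prefix list.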
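As a supplement to the caching strategy in section 4.1: `functools.lru_cache` never expires entries, so a cached user profile can be served long after it has changed in the backend. A small cache with a time-to-live is one way around this. The sketch below is an assumption about how that might look; `TTLCache`, `get_or_load`, and the `loader` callback are hypothetical names, and the 30-second default is arbitrary.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical LRU cache whose entries expire `ttl` seconds after loading."""

    def __init__(self, maxsize=1000, ttl=30.0, clock=time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return a fresh cached value, or call loader(key) and cache the result."""
        now = self.clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)  # mark as most recently used
            return entry[1]
        value = loader(key)
        self._data[key] = (now + self.ttl, value)
        self._data.move_to_end(key)
        # Evict least recently used entries beyond the size limit
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)
        return value
```

Here `loader` would stand in for the call that fetches the profile from the user service. This sketch is not thread-safe; a lock around `get_or_load` would be needed if the gateway serves requests from multiple threads.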