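Monitoring and Management
Spring Boot Actuator provides production-ready monitoring and management features, including health checks, metrics collection, auditing, and HTTP tracing. This chapter covers how to use and extend Actuator.
Actuator Overview
What Is Actuator?
Actuator is Spring Boot's production-ready feature module. It provides:
- Health checks: monitoring of application health
- Metrics collection: performance metrics and business metrics
- Endpoint exposure: access over HTTP or JMX
- Auditing: recording of important events
- Remote management: remote configuration and debugging
Adding the Dependency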
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
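Built-in Endpoints
| Endpoint | Description | Exposed by default |
|---|---|---|
| health | Application health status | HTTP/JMX |
| info | Application information | HTTP/JMX |
| beans | List of Spring beans | JMX |
| conditions | Auto-configuration condition report | JMX |
| configprops | Configuration properties | JMX |
| env | Environment properties | JMX |
| loggers | Logger configuration | JMX |
| metrics | Metrics | JMX |
| mappings | URL mappings | JMX |
| shutdown | Gracefully shuts down the application | None |
| threaddump | Thread dump | JMX |
| heapdump | Heap dump | None |
| caches | Cache information | JMX |
| scheduledtasks | Scheduled tasks | JMX |
Note: these defaults match early Spring Boot 2.x releases. From 2.5 on, only health is exposed over HTTP by default, and from 3.0 on, only health is exposed over JMX by default.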
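Endpoint Configuration
Exposing Endpoints
management:
  endpoints:
    web:
      exposure:
        # Expose all endpoints
        include: "*"
        # Exclude specific endpoints
        exclude: shutdown,heapdump
    # JMX exposure
    jmx:
      exposure:
        include: "*"
Recommended practice:
management:
  endpoints:
    web:
      exposure:
        # In production, expose only the endpoints you need
        include: health,info,metrics,prometheus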
Endpoint Security
management:
  endpoints:
    web:
      base-path: /actuator  # default path
      exposure:
        include: health,info
  endpoint:
    health:
      show-details: when-authorized  # show details to authorized users only
      # show-details: always         # always show details
      # show-details: never          # never show details
With Spring Security:
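import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ActuatorSecurityConfig {
    @Bean
    public SecurityFilterChain actuatorSecurity(HttpSecurity http) throws Exception {
        http
            // Apply this chain to Actuator endpoints only
            // (securityMatcher needs Spring Security 5.8+; older versions use requestMatcher)
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                // health and info are open to everyone
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                // every other endpoint requires the ACTUATOR role
                .anyRequest().hasRole("ACTUATOR")
            )
            .httpBasic(Customizer.withDefaults());
        return http.build();
    }
}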
Customizing the Endpoint Path
management:
  endpoints:
    web:
      base-path: /management  # change the base path
  server:
    port: 8081          # serve management endpoints on a separate port
    address: 127.0.0.1  # accept local connections only
Health Checks
Basic Usage
A GET /actuator/health request returns a response such as:
{
"status": "UP",
"components": {
"db": {
"status": "UP",
"details": {
"database": "MySQL",
"validationQuery": "isValid()"
}
},
"diskSpace": {
"status": "UP",
"details": {
"total": 107374182400,
"free": 53687091200,
"threshold": 10485760,
"exists": true
}
},
"ping": {
"status": "UP"
},
"redis": {
"status": "UP",
"details": {
"version": "7.0.0"
}
}
}
}
Health Status
| Status | Description |
|---|---|
| UP | Running normally |
| DOWN | Service unavailable |
| OUT_OF_SERVICE | Service taken out of service |
| UNKNOWN | Status unknown |
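The overall status in the health response is derived from the component statuses. The sketch below is a simplified, plain-Java model of that behavior, not Spring Boot's own code. It assumes the default severity order (DOWN, OUT_OF_SERVICE, UP, UNKNOWN) and the default HTTP mapping, in which DOWN and OUT_OF_SERVICE produce 503 and everything else produces 200. The class and method names are hypothetical.

```java
import java.util.List;

// Simplified model of health status aggregation; not Spring Boot's actual implementation.
final class HealthStatusSketch {

    // Declared from most to least severe, mirroring Spring Boot's default order.
    enum Status { DOWN, OUT_OF_SERVICE, UP, UNKNOWN }

    /** Returns the most severe status among the components, or UNKNOWN if there are none. */
    static Status aggregate(List<Status> components) {
        Status result = null;
        for (Status s : components) {
            // A smaller ordinal means a more severe status.
            if (result == null || s.compareTo(result) < 0) {
                result = s;
            }
        }
        return result == null ? Status.UNKNOWN : result;
    }

    /** HTTP status code the health endpoint returns by default for a given status. */
    static int httpStatus(Status status) {
        return (status == Status.DOWN || status == Status.OUT_OF_SERVICE) ? 503 : 200;
    }

    public static void main(String[] args) {
        Status overall = aggregate(List.of(Status.UP, Status.UP, Status.DOWN));
        System.out.println(overall + " -> HTTP " + httpStatus(overall));
    }
}
```

A single DOWN component is therefore enough to make the whole endpoint report DOWN, which matters when the endpoint backs a load balancer check or a Kubernetes probe.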
Custom Health Indicators
@Component
public class CustomHealthIndicator implements HealthIndicator {
    // ExternalService stands for an application-specific dependency
    @Autowired
    private ExternalService externalService;

    @Override
    public Health health() {
        try {
            // Check the external service
            if (externalService.isAvailable()) {
                return Health.up()
                        .withDetail("service", "External Service")
                        .withDetail("responseTime", "100ms")
                        .build();
            } else {
                return Health.down()
                        .withDetail("service", "External Service")
                        .withDetail("error", "Service unavailable")
                        .build();
            }
        } catch (Exception e) {
            // Report DOWN and attach the exception as a detail
            return Health.down(e)
                    .withDetail("service", "External Service")
                    .build();
        }
    }
}
Composite Health Checks
@Component
public class DatabaseHealthIndicator implements HealthIndicator {
    @Autowired
    private DataSource dataSource;

    @Override
    public Health health() {
        // Borrow a connection and validate it with a 1-second timeout
        try (Connection conn = dataSource.getConnection()) {
            if (conn.isValid(1)) {
                return Health.up()
                        .withDetail("database", "MySQL")
                        .withDetail("validationQuery", "isValid()")
                        .build();
            }
            return Health.down().withDetail("error", "Connection invalid").build();
        } catch (SQLException e) {
            return Health.down(e).build();
        }
    }
}
Health Check Configuration
management:
  endpoint:
    health:
      show-details: always
      group:
        # Custom health groups
        liveness:
          include: ping,diskSpace
        readiness:
          include: db,redis
      probes:
        enabled: true  # enable Kubernetes probes
Kubernetes Probes
Spring Boot 2.3+ supports Kubernetes probes:
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
Endpoints:
- /actuator/health/liveness: liveness probe
- /actuator/health/readiness: readiness probe
Kubernetes configuration:
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
Metrics
Built-in Metrics
A GET /actuator/metrics request returns the available metric names:
{
"names": [
"jvm.memory.max",
"jvm.memory.used",
"jvm.gc.pause",
"process.cpu.usage",
"system.cpu.usage",
"http.server.requests",
"tomcat.threads.busy"
]
}
Viewing a Single Metric
GET /actuator/metrics/jvm.memory.used:
{
"name": "jvm.memory.used",
"description": "The amount of used memory",
"baseUnit": "bytes",
"measurements": [
{
"statistic": "VALUE",
"value": 123456789
}
],
"availableTags": [
{
"tag": "area",
"values": ["heap", "nonheap"]
},
{
"tag": "id",
"values": ["G1 Survivor Space", "G1 Old Gen", "G1 Eden Space"]
}
]
}
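Filtering by Tag
Append a tag=KEY:VALUE query parameter to drill into one dimension of a metric, for example GET /actuator/metrics/jvm.memory.used?tag=area:heap.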
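Several tag parameters can be combined on one request to narrow the result further. As a minimal sketch, the hypothetical helper below builds such a request path using only the JDK. It URL-encodes each KEY:VALUE pair so that values containing spaces, such as the id value "G1 Old Gen" shown earlier, survive the round trip.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of Spring Boot.
final class MetricUrlSketch {

    /** Builds the path and query string for one metric, adding one tag=KEY:VALUE parameter per tag. */
    static String metricPath(String metricName, Map<String, String> tags) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            String pair = tag.getKey() + ":" + tag.getValue();
            // Form-style encoding: ':' becomes %3A and a space becomes '+'
            query.add("tag=" + URLEncoder.encode(pair, StandardCharsets.UTF_8));
        }
        String path = "/actuator/metrics/" + metricName;
        return tags.isEmpty() ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("area", "heap");
        tags.put("id", "G1 Old Gen");
        System.out.println(metricPath("jvm.memory.used", tags));
    }
}
```

The resulting path can be passed to any HTTP client. With both tags applied, the response should cover only the heap measurement for that one memory pool.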
Custom Metrics
@Service
public class OrderService {
    private final MeterRegistry meterRegistry;
    private final Counter orderCounter;
    private final Timer orderTimer;

    // Spring injects the MeterRegistry through this single constructor
    public OrderService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        // Order counter
        this.orderCounter = Counter.builder("orders.created")
                .description("Total orders created")
                .tag("type", "online")
                .register(meterRegistry);
        // Order processing timer
        this.orderTimer = Timer.builder("orders.processing.time")
                .description("Order processing time")
                .register(meterRegistry);
    }

    public Order createOrder(OrderDTO dto) {
        // Time the whole operation and return its result
        return orderTimer.record(() -> {
            // Process the order (processOrder is application-specific logic)
            Order order = processOrder(dto);
            // Increment the counter
            orderCounter.increment();
            return order;
        });
    }
}
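Metric Types
| Type | Description | Typical use |
|---|---|---|
| Counter | Count that only increases | Request count, error count |
| Gauge | Value that can go up or down | Current connections, queue size |
| Timer | Duration and frequency of timed events | Request latency |
| DistributionSummary | Distribution of recorded values | Request size distribution |
Example: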
@Service
public class MetricsService {
    private final MeterRegistry registry;
    // Counter: monotonically increasing count
    private final Counter requestCounter;
    // Gauge: current value, read from this AtomicInteger
    private final AtomicInteger activeConnections = new AtomicInteger(0);
    // Timer: durations
    private final Timer requestTimer;
    // DistributionSummary: distribution of values
    private final DistributionSummary requestSize;

    public MetricsService(MeterRegistry registry) {
        this.registry = registry;
        // Create the Counter
        this.requestCounter = Counter.builder("app.requests")
                .description("Total requests")
                .tag("endpoint", "/api/orders")
                .register(registry);
        // Create the Gauge
        registry.gauge("app.connections.active", activeConnections);
        // Create the Timer
        this.requestTimer = Timer.builder("app.request.duration")
                .description("Request duration")
                .publishPercentiles(0.5, 0.95, 0.99)
                .register(registry);
        // Create the DistributionSummary
        this.requestSize = DistributionSummary.builder("app.request.size")
                .description("Request size in bytes")
                .baseUnit("bytes")
                .register(registry);
    }

    public void incrementRequest() {
        requestCounter.increment();
    }

    public void recordRequestDuration(long millis) {
        requestTimer.record(millis, TimeUnit.MILLISECONDS);
    }

    public void recordRequestSize(long bytes) {
        requestSize.record(bytes);
    }

    public void connectionAdded() {
        activeConnections.incrementAndGet();
    }

    public void connectionRemoved() {
        activeConnections.decrementAndGet();
    }
}
HTTP Request Metrics
HTTP request metrics are collected automatically:
management:
  metrics:
    web:
      server:
        request:
          autotime:
            enabled: true
            percentiles: 0.5,0.95,0.99
Prometheus Integration
Adding the Dependency
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
Exposing the Endpoint
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    tags:
      application: ${spring.application.name}  # tag every metric with the application name
Accessing the Prometheus Endpoint
GET /actuator/prometheus:
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 1.23456789E8
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 5.67890123E7
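The names in this output differ from the dotted Micrometer names used in code. The sketch below approximates the renaming in plain Java so that you can predict what a custom metric will be called in Prometheus. It is a simplified approximation of Micrometer's Prometheus naming convention, not its actual code, and it covers only three meter types. The class, enum, and method names are hypothetical.

```java
// Simplified approximation of Micrometer's Prometheus naming; not the library's code.
final class PrometheusNameSketch {

    enum MeterType { COUNTER, GAUGE, TIMER }

    /** Approximates the Prometheus name for a Micrometer meter; baseUnit may be null. */
    static String prometheusName(String meterName, MeterType type, String baseUnit) {
        // Dots become underscores
        String name = meterName.replace('.', '_');
        switch (type) {
            case COUNTER:
                if (baseUnit != null && !name.endsWith("_" + baseUnit)) {
                    name += "_" + baseUnit;
                }
                // Counters get a _total suffix
                if (!name.endsWith("_total")) {
                    name += "_total";
                }
                break;
            case GAUGE:
                // Gauges get the base unit as a suffix when one is set
                if (baseUnit != null && !name.endsWith("_" + baseUnit)) {
                    name += "_" + baseUnit;
                }
                break;
            case TIMER:
                // Timers are reported in seconds
                if (!name.endsWith("_seconds")) {
                    name += "_seconds";
                }
                break;
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(prometheusName("jvm.memory.used", MeterType.GAUGE, "bytes"));
        System.out.println(prometheusName("orders.created", MeterType.COUNTER, null));
        System.out.println(prometheusName("http.server.requests", MeterType.TIMER, null));
    }
}
```

A timer is published as several series that share this base name, such as a _count and a _sum series, which is why queries against HTTP request metrics typically use http_server_requests_seconds_count.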
Prometheus Configuration
# prometheus.yml
scrape_configs:
  - job_name: 'spring-boot'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['localhost:8080']
Grafana Visualization
Use Grafana to visualize the metrics:
- Add a Prometheus data source
- Import a Spring Boot dashboard (ID: 12900)
- Build custom dashboards
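Application Information
Configuring Application Information
info:
  app:
    name: "@project.name@"
    version: "@project.version@"
    description: "@project.description@"
    java:
      version: "@java.version@"
  author: 张三
  contact: "[email protected]"
The @...@ placeholders are filled in by Maven resource filtering and are quoted because a YAML value cannot start with @. On Spring Boot 2.6 and later, info.* properties are exposed only when management.info.env.enabled is set to true.
A GET /actuator/info request returns: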
{
"app": {
"name": "myapp",
"version": "1.0.0",
"description": "My Application",
"java": {
"version": "17"
}
},
"author": "张三",
"contact": "[email protected]"
}
Git Information
Add the git-commit-id-maven-plugin:
<plugin>
<groupId>io.github.git-commit-id</groupId>
<artifactId>git-commit-id-maven-plugin</artifactId>
<version>6.0.0</version>
<executions>
<execution>
<goals>
<goal>revision</goal>
</goals>
</execution>
</executions>
</plugin>
Enable full Git information:
management:
  info:
    git:
      mode: full
Custom Endpoints
Creating a Custom Endpoint
@Component
@Endpoint(id = "custom")
public class CustomEndpoint {
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> info = new HashMap<>();
        info.put("timestamp", System.currentTimeMillis());
        info.put("status", "running");
        return info;
    }

    @ReadOperation
    public Map<String, Object> detail(@Selector String name) {
        Map<String, Object> detail = new HashMap<>();
        detail.put("name", name);
        detail.put("value", "detail value");
        return detail;
    }

    @WriteOperation
    public void update(@Selector String name, @Nullable String value) {
        // Update logic goes here
    }

    @DeleteOperation
    public void delete(@Selector String name) {
        // Delete logic goes here
    }
}
Access:
- GET /actuator/custom: calls info()
- GET /actuator/custom/myname: calls detail("myname")
Web Endpoint Extension
@Component
@WebEndpoint(id = "customweb")
public class CustomWebEndpoint {
    @ReadOperation
    public WebEndpointResponse<Map<String, Object>> info() {
        Map<String, Object> data = new HashMap<>();
        data.put("message", "Hello from custom endpoint");
        // Wrap the body together with an explicit HTTP status code
        return new WebEndpointResponse<>(data, HttpStatus.OK.value());
    }
}
Controller Endpoints
@Component
@ControllerEndpoint(id = "customcontroller")
public class CustomControllerEndpoint {
    @GetMapping("/hello")
    @ResponseBody
    public String hello(@RequestParam String name) {
        return "Hello, " + name;
    }
}
Access: GET /actuator/customcontroller/hello?name=World
Auditing
Configuring Auditing
management:
  auditevents:
    enabled: true
Custom Audit Events
@Configuration
public class AuditConfig {
    // Auditing is active only when an AuditEventRepository bean is present
    @Bean
    public AuditEventRepository auditEventRepository() {
        return new InMemoryAuditEventRepository();
    }
}

@Service
@RequiredArgsConstructor
public class UserService {
    private final AuditEventRepository auditRepository;

    public void login(String username, boolean success) {
        // Arguments: principal, event type, event data
        auditRepository.add(new AuditEvent(
                username,
                "AUTHENTICATION",
                success ? "SUCCESS" : "FAILURE"
        ));
    }
}
Best Practices
1. Security Configuration
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: when-authorized
2. Separate Port
management:
  server:
    port: 8081
    address: 127.0.0.1
3. Metric Tags
management:
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active}
4. Monitoring and Alerting
Combine with Prometheus Alertmanager:
groups:
  - name: spring-boot
    rules:
      - alert: HighErrorRate
        expr: rate(http_server_requests_seconds_count{status=~"5.."}[5m]) > 0.1
        for: 5m
        annotations:
          summary: "High error rate detected"
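The threshold in this rule is an absolute number of 5xx responses per second for each time series, not a fraction of all requests. The plain-Java sketch below works through that arithmetic under a simplifying assumption: it takes the rate as the counter increase divided by the window length. Prometheus's own rate() additionally handles counter resets and extrapolates to the window edges. The class and method names are hypothetical.

```java
// Simplified arithmetic only; Prometheus's rate() is more involved.
final class ErrorRateSketch {

    /** Average per-second increase of a counter between two samples taken windowSeconds apart. */
    static double perSecondRate(double firstValue, double lastValue, double windowSeconds) {
        return (lastValue - firstValue) / windowSeconds;
    }

    /** Mirrors the comparison in the rule: true only when the rate is strictly above the threshold. */
    static boolean exceeds(double rate, double threshold) {
        return rate > threshold;
    }

    public static void main(String[] args) {
        // 45 server errors counted over a 5-minute (300-second) window
        double rate = perSecondRate(1_000, 1_045, 300);
        System.out.println(rate + " errors/s, condition met: " + exceeds(rate, 0.1));
    }
}
```

At a threshold of 0.1, the condition corresponds to more than 30 errors in any 5-minute window, and the for: 5m clause requires it to hold continuously for five minutes before the alert fires. To alert on an error ratio instead, one option is to divide by the rate of all requests.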
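Summary
In this chapter we covered:
- Actuator overview: the built-in endpoints
- Endpoint configuration: exposure, security, and path settings
- Health checks: custom health indicators and Kubernetes probes
- Metrics: built-in and custom metrics
- Prometheus integration: connecting to a monitoring system
- Custom endpoints: extending monitoring capabilities
- Best practices: security, performance, and alerting
Exercises
- Configure the health endpoint to show the database connection status
- Create a custom health indicator
- Add a custom business metric
- Integrate Prometheus and Grafana
- Configure Kubernetes probes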