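Monitoring and Management

Spring Boot Actuator provides production-ready monitoring and management features, including health checks, metrics collection, auditing, and HTTP tracing. This chapter explains how to use and extend Actuator.

Actuator Overview

What is Actuator?

Actuator is Spring Boot's production-ready feature module. It provides:

  • Health checks: monitor the health status of the application
  • Metrics collection: performance and business metrics
  • Endpoint exposure: access over HTTP or JMX
  • Auditing: record important events
  • Remote management: remote configuration and debugging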

Adding the dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

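Built-in endpoints

| Endpoint | Description | Exposed by default |
|---|---|---|
| health | Application health status | HTTP/JMX |
| info | Application information | HTTP/JMX |
| beans | List of Spring beans | JMX |
| conditions | Auto-configuration condition report | JMX |
| configprops | Configuration properties | JMX |
| env | Environment properties | JMX |
| loggers | Logger configuration | JMX |
| metrics | Metrics | JMX |
| mappings | URL mappings | JMX |
| shutdown | Gracefully shuts down the application | None (endpoint disabled by default) |
| threaddump | Thread dump | JMX |
| heapdump | Heap dump | None |
| caches | Cache information | JMX |
| scheduledtasks | Scheduled tasks | JMX |

These defaults depend on the version: since Spring Boot 2.5 only health is exposed over HTTP by default, and since Spring Boot 3.0 only health is exposed over JMX by default.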

Endpoint Configuration

Exposing endpoints

management:
  endpoints:
    web:
      exposure:
        # Expose all endpoints
        include: "*"
        # Exclude specific endpoints
        exclude: shutdown,heapdump

    # JMX exposure
    jmx:
      exposure:
        include: "*"
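The way include and exclude combine can be summarized in a few lines. The sketch below is a simplified model, not Spring Boot's own implementation; the class ExposureRules and its method names are hypothetical. It assumes that "*" matches every endpoint ID and that exclude takes precedence over include.

```java
import java.util.Set;

/** Simplified model of include/exclude exposure rules (hypothetical class, not Spring Boot code). */
final class ExposureRules {

    private final Set<String> include;
    private final Set<String> exclude;

    ExposureRules(Set<String> include, Set<String> exclude) {
        this.include = include;
        this.exclude = exclude;
    }

    /** An endpoint is exposed when it is included and not excluded; exclude wins. */
    boolean isExposed(String endpointId) {
        if (matches(exclude, endpointId)) {
            return false;
        }
        return matches(include, endpointId);
    }

    /** Assumption: "*" is the only wildcard and matches every endpoint ID. */
    private static boolean matches(Set<String> patterns, String endpointId) {
        return patterns.contains("*") || patterns.contains(endpointId);
    }

    public static void main(String[] args) {
        // Mirrors the configuration above: include "*", exclude shutdown and heapdump
        ExposureRules web = new ExposureRules(Set.of("*"), Set.of("shutdown", "heapdump"));
        System.out.println("metrics exposed: " + web.isExposed("metrics"));
        System.out.println("shutdown exposed: " + web.isExposed("shutdown"));
    }
}
```

With the configuration shown above, every endpoint except shutdown and heapdump would be reachable over HTTP, which is why the wildcard is best kept to development environments.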

Recommended practice

management:
  endpoints:
    web:
      exposure:
        # In production, expose only the endpoints you need
        include: health,info,metrics,prometheus

Endpoint security

management:
  endpoints:
    web:
      base-path: /actuator # default path
      exposure:
        include: health,info

  endpoint:
    health:
      show-details: when-authorized # show details to authorized users only
      # show-details: always # always show details
      # show-details: never # never show details

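Integrating with Spring Security

import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ActuatorSecurityConfig {

    @Bean
    public SecurityFilterChain actuatorSecurity(HttpSecurity http) throws Exception {
        http
            // Apply this filter chain to Actuator endpoints only
            // (securityMatcher, Spring Security 5.8+, replaces the older requestMatcher)
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            // Servlet-stack authorization rules
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                .anyRequest().hasRole("ACTUATOR")
            )
            .httpBasic(Customizer.withDefaults());
        return http.build();
    }
}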

Customizing the endpoint path

management:
  endpoints:
    web:
      base-path: /management # change the base path

  server:
    port: 8081 # serve Actuator on a separate port
    address: 127.0.0.1 # accept local connections only

Health Checks

Basic usage

Request GET /actuator/health:

{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MySQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 107374182400,
        "free": 53687091200,
        "threshold": 10485760,
        "exists": true
      }
    },
    "ping": {
      "status": "UP"
    },
    "redis": {
      "status": "UP",
      "details": {
        "version": "7.0.0"
      }
    }
  }
}
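The diskSpace entry in the response reports UP because the free space is far above the threshold. A minimal sketch of that comparison follows, assuming the check is simply "free bytes at or above the threshold"; DiskSpaceCheck is a hypothetical class for illustration, not the Actuator implementation.

```java
import java.io.File;

/** Simplified model of a disk-space health check (hypothetical class, not Spring Boot code). */
final class DiskSpaceCheck {

    /** Reports UP while free space is at or above the threshold, DOWN otherwise. */
    static String status(long freeBytes, long thresholdBytes) {
        return freeBytes >= thresholdBytes ? "UP" : "DOWN";
    }

    public static void main(String[] args) {
        File path = new File(".");
        long threshold = 10L * 1024 * 1024; // 10 MiB, the same value as "threshold": 10485760
        // getUsableSpace() returns the bytes available to this JVM on the partition
        System.out.println(status(path.getUsableSpace(), threshold));
    }
}
```

Using the values from the response, status(53687091200L, 10485760L) yields UP.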

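Health statuses

| Status | Description |
|---|---|
| UP | Running normally |
| DOWN | Service unavailable |
| OUT_OF_SERVICE | Service taken out of service |
| UNKNOWN | Status unknown |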
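The top-level status is derived from the component statuses. The sketch below models that aggregation under two stated assumptions: the default severity order is DOWN, OUT_OF_SERVICE, UP, UNKNOWN (most severe first), and DOWN and OUT_OF_SERVICE map to HTTP 503 while the others map to 200. HealthAggregation is a hypothetical class, not part of Actuator; check both defaults against the reference for your Spring Boot version.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

/** Simplified model of health status aggregation (hypothetical class, not Spring Boot code). */
final class HealthAggregation {

    /** Assumed default severity order, most severe first. */
    private static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known status, or UNKNOWN when there is none. */
    static String aggregate(Collection<String> componentStatuses) {
        return componentStatuses.stream()
                .filter(ORDER::contains)                      // ignore statuses outside the order
                .min(Comparator.comparingInt(ORDER::indexOf)) // lowest index = most severe
                .orElse("UNKNOWN");
    }

    /** Assumed default mapping of the aggregate status to an HTTP status code. */
    static int httpStatus(String status) {
        return status.equals("DOWN") || status.equals("OUT_OF_SERVICE") ? 503 : 200;
    }

    public static void main(String[] args) {
        String overall = aggregate(List.of("UP", "UP", "DOWN", "UP"));
        System.out.println(overall + " -> HTTP " + httpStatus(overall));
    }
}
```

Under these assumptions a single DOWN component is enough to turn the whole response into DOWN, which matters when the endpoint backs a load balancer or readiness check.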

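Custom health checks

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class CustomHealthIndicator implements HealthIndicator {

    // ExternalService is an application-specific client (not shown here)
    @Autowired
    private ExternalService externalService;

    @Override
    public Health health() {
        try {
            // Check the external service
            if (externalService.isAvailable()) {
                return Health.up()
                        .withDetail("service", "External Service")
                        .withDetail("responseTime", "100ms")
                        .build();
            } else {
                return Health.down()
                        .withDetail("service", "External Service")
                        .withDetail("error", "Service unavailable")
                        .build();
            }
        } catch (Exception e) {
            // Report the exception as part of the DOWN status
            return Health.down(e)
                    .withDetail("service", "External Service")
                    .build();
        }
    }
}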

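Database health indicator

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class DatabaseHealthIndicator implements HealthIndicator {

    @Autowired
    private DataSource dataSource;

    @Override
    public Health health() {
        // try-with-resources returns the connection to the pool
        try (Connection conn = dataSource.getConnection()) {
            // isValid(1): wait at most one second for the validation to complete
            if (conn.isValid(1)) {
                return Health.up()
                        .withDetail("database", "MySQL")
                        .withDetail("validationQuery", "isValid()")
                        .build();
            }
            return Health.down().withDetail("error", "Connection invalid").build();
        } catch (SQLException e) {
            return Health.down(e).build();
        }
    }
}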

Health check configuration

management:
  endpoint:
    health:
      show-details: always
      group:
        # Custom health groups
        liveness:
          include: ping,diskSpace
        readiness:
          include: db,redis
      probes:
        enabled: true # enable Kubernetes probes

Kubernetes probes

Spring Boot 2.3+ supports Kubernetes probes:

management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true

Endpoints

  • /actuator/health/liveness - liveness probe
  • /actuator/health/readiness - readiness probe

Kubernetes configuration

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Metrics

Built-in metrics

Request GET /actuator/metrics:

{
  "names": [
    "jvm.memory.max",
    "jvm.memory.used",
    "jvm.gc.pause",
    "process.cpu.usage",
    "system.cpu.usage",
    "http.server.requests",
    "tomcat.threads.busy"
  ]
}

Viewing a single metric

GET /actuator/metrics/jvm.memory.used

{
  "name": "jvm.memory.used",
  "description": "The amount of used memory",
  "baseUnit": "bytes",
  "measurements": [
    {
      "statistic": "VALUE",
      "value": 123456789
    }
  ],
  "availableTags": [
    {
      "tag": "area",
      "values": ["heap", "nonheap"]
    },
    {
      "tag": "id",
      "values": ["G1 Survivor Space", "G1 Old Gen", "G1 Eden Space"]
    }
  ]
}

Filtering by tag

GET /actuator/metrics/jvm.memory.used?tag=area:heap
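Tag values such as "G1 Old Gen" contain spaces, so the query string has to be URL-encoded, and the tag parameter can be repeated to narrow the result further. The helper below is a minimal sketch of building such a URL; MetricsUrl and the base URL http://localhost:8080 are assumptions for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Builds /actuator/metrics URLs with tag filters (hypothetical helper). */
final class MetricsUrl {

    /** Each tag becomes one "tag=KEY:VALUE" query parameter, percent-encoded. */
    static String build(String baseUrl, String metricName, Map<String, String> tags) {
        String query = tags.entrySet().stream()
                .map(e -> "tag=" + encode(e.getKey() + ":" + e.getValue()))
                .collect(Collectors.joining("&"));
        String url = baseUrl + "/actuator/metrics/" + metricName;
        return query.isEmpty() ? url : url + "?" + query;
    }

    /** URLEncoder writes spaces as "+"; use "%20" so every server decodes them the same way. */
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>(); // keeps insertion order
        tags.put("area", "heap");
        tags.put("id", "G1 Old Gen");
        System.out.println(build("http://localhost:8080", "jvm.memory.used", tags));
    }
}
```

The colon is percent-encoded as %3A here; the server decodes it back to KEY:VALUE before matching.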

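Custom metrics

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final MeterRegistry meterRegistry;
    private final Counter orderCounter;
    private final Timer orderTimer;

    // Single explicit constructor: Spring injects MeterRegistry without @Autowired
    public OrderService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;

        // Counter of created orders
        this.orderCounter = Counter.builder("orders.created")
                .description("Total orders created")
                .tag("type", "online")
                .register(meterRegistry);

        // Timer for order processing
        this.orderTimer = Timer.builder("orders.processing.time")
                .description("Order processing time")
                .register(meterRegistry);
    }

    public Order createOrder(OrderDTO dto) {
        // record(...) times the lambda and returns its result
        return orderTimer.record(() -> {
            // Process the order; processOrder(...) is the application's own logic (not shown)
            Order order = processOrder(dto);

            // Increment the counter
            orderCounter.increment();

            return order;
        });
    }
}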

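Metric types

| Type | Description | Typical use |
|---|---|---|
| Counter | A count that only increases | Request count, error count |
| Gauge | A value that can go up and down | Current connections, queue size |
| Timer | Timing statistics | Request latency |
| DistributionSummary | Distribution statistics | Request size distribution |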

Example

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.stereotype.Service;

@Service
public class MetricsService {

    private final MeterRegistry registry;

    // Counter: monotonically increasing count
    private final Counter requestCounter;

    // Gauge: current value, read from this field whenever the gauge is sampled
    private final AtomicInteger activeConnections = new AtomicInteger(0);

    // Timer: durations
    private final Timer requestTimer;

    // DistributionSummary: distribution of recorded values
    private final DistributionSummary requestSize;

    public MetricsService(MeterRegistry registry) {
        this.registry = registry;

        // Create the Counter
        this.requestCounter = Counter.builder("app.requests")
                .description("Total requests")
                .tag("endpoint", "/api/orders")
                .register(registry);

        // Create the Gauge
        registry.gauge("app.connections.active", activeConnections);

        // Create the Timer
        this.requestTimer = Timer.builder("app.request.duration")
                .description("Request duration")
                .publishPercentiles(0.5, 0.95, 0.99)
                .register(registry);

        // Create the DistributionSummary
        this.requestSize = DistributionSummary.builder("app.request.size")
                .description("Request size in bytes")
                .baseUnit("bytes")
                .register(registry);
    }

    public void incrementRequest() {
        requestCounter.increment();
    }

    public void recordRequestDuration(long millis) {
        requestTimer.record(millis, TimeUnit.MILLISECONDS);
    }

    public void recordRequestSize(long bytes) {
        requestSize.record(bytes);
    }

    public void connectionAdded() {
        activeConnections.incrementAndGet();
    }

    public void connectionRemoved() {
        activeConnections.decrementAndGet();
    }
}

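HTTP request metrics

Spring Boot collects HTTP request metrics automatically. The properties below (Spring Boot 2.x naming) tune the request timer:

management:
  metrics:
    web:
      server:
        request:
          autotime:
            enabled: true
            percentiles: 0.5,0.95,0.99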
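To make the percentile values concrete: p95 is the latency that 95% of requests stay at or below. The sketch below computes exact nearest-rank percentiles over a small sample. It is only an illustration of the concept; Micrometer computes its published percentiles with its own approximation, and the Percentiles class and the sample latencies are invented for this example.

```java
import java.util.Arrays;

/** Exact nearest-rank percentiles over a sample (illustration only, not Micrometer's algorithm). */
final class Percentiles {

    /** Returns the smallest sample such that at least a fraction p of all samples are <= it. */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in (0, 1]");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical request latencies in milliseconds
        long[] latencies = {12, 15, 18, 20, 22, 25, 30, 45, 120, 800};
        for (double p : new double[] {0.5, 0.95, 0.99}) {
            System.out.println("p" + Math.round(p * 100) + " = " + percentile(latencies, p) + " ms");
        }
    }
}
```

In this sample the median is 22 ms while p95 and p99 are both 800 ms: a single slow request dominates the tail, which is exactly what the average would hide.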

Prometheus Integration

Adding the dependency

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Exposing the endpoint

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus

  metrics:
    tags:
      application: ${spring.application.name} # add an application tag to every metric

Accessing the Prometheus endpoint

GET /actuator/prometheus

# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 1.23456789E8
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 5.67890123E7
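Each sample line consists of a metric name, an optional label set in braces, and a value. The parser below is a minimal sketch for lines shaped like the ones above; PrometheusSample is a hypothetical class, and the sketch deliberately ignores comment lines, timestamps, escape sequences inside label values, and special values such as +Inf.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses one sample line of the Prometheus text format (simplified sketch). */
final class PrometheusSample {

    // name, optional {labels}, whitespace, value
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)$");
    // one key="value" pair; a trailing comma after the last pair is simply skipped
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    final String name;
    final Map<String, String> labels;
    final double value;

    private PrometheusSample(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    static PrometheusSample parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (m.group(2) != null) {
            Matcher label = LABEL.matcher(m.group(2));
            while (label.find()) {
                labels.put(label.group(1), label.group(2));
            }
        }
        return new PrometheusSample(m.group(1), labels, Double.parseDouble(m.group(3)));
    }

    public static void main(String[] args) {
        PrometheusSample s = parse(
                "jvm_memory_used_bytes{area=\"heap\",id=\"G1 Old Gen\",} 1.23456789E8");
        System.out.println(s.name + " " + s.labels + " " + s.value);
    }
}
```

Note how the dotted Micrometer name jvm.memory.used appears here as jvm_memory_used_bytes: the dots become underscores and the base unit is appended.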

Prometheus configuration

# prometheus.yml
scrape_configs:
  - job_name: 'spring-boot'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['localhost:8080']

Grafana visualization

Use Grafana to display the metrics:

  1. Add Prometheus as a data source
  2. Import the Spring Boot dashboard (ID: 12900)
  3. Build custom monitoring panels

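Application Information

Configuring application information

The @...@ placeholders are replaced by Maven resource filtering at build time; quote them so the file remains valid YAML before filtering.

info:
  app:
    name: "@project.name@"
    version: "@project.version@"
    description: "@project.description@"
    java:
      version: "@java.version@"
  author: 张三
  contact: [email protected]

management:
  info:
    env:
      enabled: true # required since Spring Boot 2.6 for info.* properties to appear

Request GET /actuator/info:

{
  "app": {
    "name": "myapp",
    "version": "1.0.0",
    "description": "My Application",
    "java": {
      "version": "17"
    }
  },
  "author": "张三",
  "contact": "[email protected]"
}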

Git information

Add the git-commit-id-maven-plugin:

<plugin>
    <groupId>io.github.git-commit-id</groupId>
    <artifactId>git-commit-id-maven-plugin</artifactId>
    <version>6.0.0</version>
    <executions>
        <execution>
            <goals>
                <goal>revision</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Enable full Git information:

management:
  info:
    git:
      mode: full

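Custom Endpoints

Creating a custom endpoint

import java.util.HashMap;
import java.util.Map;
import org.springframework.boot.actuate.endpoint.annotation.DeleteOperation;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.Selector;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.lang.Nullable;
import org.springframework.stereotype.Component;

@Component
@Endpoint(id = "custom")
public class CustomEndpoint {

    // GET /actuator/custom
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> info = new HashMap<>();
        info.put("timestamp", System.currentTimeMillis());
        info.put("status", "running");
        return info;
    }

    // GET /actuator/custom/{name}
    @ReadOperation
    public Map<String, Object> detail(@Selector String name) {
        Map<String, Object> detail = new HashMap<>();
        detail.put("name", name);
        detail.put("value", "detail value");
        return detail;
    }

    // POST /actuator/custom/{name}
    @WriteOperation
    public void update(@Selector String name, @Nullable String value) {
        // Update operation
    }

    // DELETE /actuator/custom/{name}
    @DeleteOperation
    public void delete(@Selector String name) {
        // Delete operation
    }
}

Access:

  • GET /actuator/custom - calls info()
  • GET /actuator/custom/myname - calls detail("myname")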
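Web-only endpoints

import java.util.HashMap;
import java.util.Map;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.web.WebEndpointResponse;
import org.springframework.boot.actuate.endpoint.web.annotation.WebEndpoint;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;

@Component
@WebEndpoint(id = "customweb") // exposed over HTTP only, not over JMX
public class CustomWebEndpoint {

    @ReadOperation
    public WebEndpointResponse<Map<String, Object>> info() {
        Map<String, Object> data = new HashMap<>();
        data.put("message", "Hello from custom endpoint");
        // WebEndpointResponse lets the operation choose the HTTP status code
        return new WebEndpointResponse<>(data, HttpStatus.OK.value());
    }
}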

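Controller endpoints

import org.springframework.boot.actuate.endpoint.web.annotation.ControllerEndpoint;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Component
@ControllerEndpoint(id = "customcontroller")
public class CustomControllerEndpoint {

    // Regular Spring MVC mapping, served below the endpoint's base path
    @GetMapping("/hello")
    @ResponseBody
    public String hello(@RequestParam String name) {
        return "Hello, " + name;
    }
}

Access: GET /actuator/customcontroller/hello?name=World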

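Auditing

Configuring auditing

management:
  auditevents:
    enabled: true

Custom audit events

import org.springframework.boot.actuate.audit.AuditEventRepository;
import org.springframework.boot.actuate.audit.InMemoryAuditEventRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AuditConfig {

    // Auditing is only active when an AuditEventRepository bean exists
    @Bean
    public AuditEventRepository auditEventRepository() {
        return new InMemoryAuditEventRepository();
    }
}

import lombok.RequiredArgsConstructor;
import org.springframework.boot.actuate.audit.AuditEvent;
import org.springframework.boot.actuate.audit.AuditEventRepository;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class UserService {

    private final AuditEventRepository auditRepository;

    public void login(String username, boolean success) {
        auditRepository.add(new AuditEvent(
                username,
                "AUTHENTICATION",
                // Data entries use the "key=value" form
                "result=" + (success ? "SUCCESS" : "FAILURE")
        ));
    }
}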
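InMemoryAuditEventRepository keeps events in memory only and holds a bounded number of them, so the oldest entries are dropped as new ones arrive and nothing survives a restart. The sketch below illustrates that bounded-buffer behavior with plain strings; BoundedEventLog is a hypothetical class, not the Actuator implementation, and a production setup would normally persist events in its own AuditEventRepository.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Bounded in-memory event log: once full, the oldest event is evicted (illustration only). */
final class BoundedEventLog {

    private final int capacity;
    private final Deque<String> events = new ArrayDeque<>();

    BoundedEventLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Appends an event, dropping the oldest one when the log is full. */
    synchronized void add(String event) {
        if (events.size() == capacity) {
            events.removeFirst();
        }
        events.addLast(event);
    }

    /** Returns a snapshot of the retained events, oldest first. */
    synchronized List<String> find() {
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        BoundedEventLog log = new BoundedEventLog(2);
        log.add("alice AUTHENTICATION result=SUCCESS");
        log.add("bob AUTHENTICATION result=FAILURE");
        log.add("carol AUTHENTICATION result=SUCCESS"); // evicts the first entry
        System.out.println(log.find());
    }
}
```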

Best Practices

1. Security configuration

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: when-authorized

2. Dedicated port

management:
  server:
    port: 8081
    address: 127.0.0.1

3. Metric tags

management:
  metrics:
    tags:
      application: ${spring.application.name}
      # Fall back to "default" so startup does not fail when no profile is active
      environment: ${spring.profiles.active:default}

4. Monitoring and alerting

Define alerting rules in Prometheus and route the resulting alerts through Alertmanager:

groups:
  - name: spring-boot
    rules:
      - alert: HighErrorRate
        expr: rate(http_server_requests_seconds_count{status=~"5.."}[5m]) > 0.1
        for: 5m
        annotations:
          summary: "High error rate detected"
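The rule compares a per-second rate with a threshold: rate(...[5m]) is roughly the increase of the counter over the last five minutes divided by 300 seconds. The sketch below shows that arithmetic in simplified form. It ignores the extrapolation and counter-reset handling that Prometheus performs, and ErrorRate and the sample counter values are assumptions for illustration.

```java
/** Simplified arithmetic behind rate(counter[5m]) > threshold (illustration only). */
final class ErrorRate {

    /** Average per-second increase of a counter between two samples. */
    static double perSecondRate(double firstCount, double lastCount, double windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        return (lastCount - firstCount) / windowSeconds;
    }

    /** The alert condition holds while the rate is strictly above the threshold. */
    static boolean breaches(double ratePerSecond, double threshold) {
        return ratePerSecond > threshold;
    }

    public static void main(String[] args) {
        // Hypothetical samples: the 5xx counter rose from 100 to 160 within 5 minutes
        double rate = perSecondRate(100, 160, 300);
        System.out.println("rate = " + rate + "/s, breaches 0.1: " + breaches(rate, 0.1));
    }
}
```

Because of for: 5m, the condition has to hold continuously for five minutes before the alert fires, so a short burst of 5xx responses does not trigger it.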

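Summary

In this chapter we covered:

  1. Actuator overview: the built-in endpoints
  2. Endpoint configuration: exposure, security, and paths
  3. Health checks: custom health indicators and Kubernetes probes
  4. Metrics: built-in and custom metrics
  5. Prometheus integration: connecting to a monitoring system
  6. Custom endpoints: extending the monitoring capabilities
  7. Best practices: security, a dedicated management port, metric tags, and alerting

Exercises

  1. Configure the health endpoint to show the database connection status
  2. Create a custom health indicator
  3. Add custom business metrics
  4. Integrate Prometheus and Grafana
  5. Configure Kubernetes probes

References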