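Index Management
An index is the container in which Elasticsearch stores documents. This chapter covers how to create, configure, and manage indices.
Creating an Index
Basic Creation
# Create an index named articles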
PUT /articles
Response:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "articles"
}
Specifying Settings at Creation
PUT /articles
{
"settings": {
"number_of_shards": 3, # number of primary shards
"number_of_replicas": 1 # number of replicas per primary shard
}
}
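Notes:
- number_of_shards: the number of primary shards. It cannot be changed in place after the index is created (changing it requires the shrink or split API, or a reindex). Defaults to 1.
- number_of_replicas: the number of replica copies of each primary shard. It can be changed dynamically. Defaults to 1.
- Sizing guideline: keep each shard between roughly 10 GB and 50 GB.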
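To make the sizing guideline concrete, here is a minimal Python sketch that estimates a primary shard count from an expected data volume. The helper name `estimate_primary_shards` and the 30 GB default target are illustrative assumptions (a midpoint of the 10-50 GB range), not anything defined by Elasticsearch.

```python
import math


def estimate_primary_shards(expected_data_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return a primary shard count that keeps each shard near target_shard_gb.

    Hypothetical helper: the 30 GB default is only a midpoint of the
    10-50 GB guideline, not an Elasticsearch setting.
    """
    if expected_data_gb < 0 or target_shard_gb <= 0:
        raise ValueError("expected_data_gb must be >= 0 and target_shard_gb > 0")
    # An index always has at least one primary shard, even when it is empty.
    return max(1, math.ceil(expected_data_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (5, 90, 400):
        print(size_gb, "GB ->", estimate_primary_shards(size_gb), "primary shard(s)")
```

Because number_of_shards is fixed at creation, an estimate like this should be based on the expected size of the index over its lifetime rather than its initial size.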
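Specifying Mappings at Creation
The ik_max_word analyzer used below comes from the IK Analysis plugin, which must be installed on every node; without it, this request fails.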
PUT /articles
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"content": {
"type": "text",
"analyzer": "ik_max_word"
},
"author": {
"type": "keyword"
},
"category": {
"type": "keyword"
},
"views": {
"type": "integer"
},
"tags": {
"type": "keyword"
},
"status": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
}
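The created_at field above accepts three formats separated by `||`, tried in order. The following Python sketch illustrates which format a given value would fall under. It is only an approximation under stated assumptions: Elasticsearch parses Java date patterns, the `strptime` patterns here are hand-written equivalents, and the helper name `matching_format` is hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Hand-written strptime equivalents of the two pattern-based formats.
# Assumption: close enough for illustration; Java and Python parsers
# do not agree on every edge case.
PATTERNS = [
    ("yyyy-MM-dd HH:mm:ss", "%Y-%m-%d %H:%M:%S"),
    ("yyyy-MM-dd", "%Y-%m-%d"),
]


def matching_format(value: str) -> Optional[str]:
    """Return the name of the first format that accepts value, or None."""
    for es_name, py_pattern in PATTERNS:
        try:
            datetime.strptime(value, py_pattern)
            return es_name
        except ValueError:
            continue
    # epoch_millis: an integer count of milliseconds since the Unix epoch.
    if re.fullmatch(r"-?\d+", value):
        return "epoch_millis"
    return None


if __name__ == "__main__":
    for sample in ("2024-01-15 10:30:00", "2024-01-15", "1705312200000", "15/01/2024"):
        print(repr(sample), "->", matching_format(sample))
```

A value that matches none of the listed formats, such as the last sample, would be rejected at index time with a mapping error.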
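Using Index Templates
An index template is applied automatically to any newly created index whose name matches one of its patterns:
# Create an index template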
PUT /_index_template/articles_template
{
"index_patterns": ["articles*"], # index names the template applies to
"priority": 100,
"template": {
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
},
"mappings": {
"properties": {
"title": { "type": "text" },
"author": { "type": "keyword" },
"created_at": { "type": "date" }
}
}
}
}
# Create an index whose name matches the template
PUT /articles_2024
# The settings and mappings from the template are applied automatically
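The matching rule can be sketched in Python: a new index name is compared against each template's patterns, and when several composable templates match, the one with the highest priority is used. The in-memory `TEMPLATES` list, the `catch_all` entry, and both function names are hypothetical stand-ins for illustration; only `articles_template` comes from the example above.

```python
import re
from typing import Optional

# Hypothetical in-memory stand-in for the templates stored in a cluster.
TEMPLATES = [
    {"name": "articles_template", "index_patterns": ["articles*"], "priority": 100},
    {"name": "catch_all", "index_patterns": ["*"], "priority": 0},  # illustrative only
]


def pattern_matches(pattern: str, index_name: str) -> bool:
    """Match an index pattern in which '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


def select_template(index_name: str, templates=TEMPLATES) -> Optional[str]:
    """Return the name of the highest-priority template matching index_name."""
    candidates = [
        t for t in templates
        if any(pattern_matches(p, index_name) for p in t["index_patterns"])
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t["priority"])["name"]


if __name__ == "__main__":
    for name in ("articles_2024", "products"):
        print(name, "->", select_template(name))
```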
Viewing Indices
Listing All Indices
GET /_cat/indices?v
# Sample output
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open articles xxx 3 1 5 0 10.5kb 10.5kb
yellow open products xxx 1 1 0 0 225b 225b
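Column reference:
| Column | Description |
|---|---|
| health | Health status (green/yellow/red) |
| status | Index state (open/close) |
| index | Index name |
| pri | Number of primary shards |
| rep | Number of replicas |
| docs.count | Number of documents |
| store.size | Storage size (primaries and replicas) |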
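For scripted health checks, the tabular output can be parsed into records. The sketch below works on the sample output shown above; the function names are hypothetical. It assumes every column is filled in, which is not guaranteed for real output (closed indices leave some columns blank), so requesting `?format=json` is usually the more robust choice in practice.

```python
from typing import Dict, List

SAMPLE = """\
health status index    uuid pri rep docs.count docs.deleted store.size pri.store.size
green  open   articles xxx  3   1   5          0            10.5kb     10.5kb
yellow open   products xxx  1   1   0          0            225b       225b
"""


def parse_cat_indices(text: str) -> List[Dict[str, str]]:
    """Turn whitespace-separated _cat output (with a header row) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def not_green(rows: List[Dict[str, str]]) -> List[str]:
    """Return the names of indices whose health is yellow or red."""
    return [row["index"] for row in rows if row["health"] != "green"]


if __name__ == "__main__":
    rows = parse_cat_indices(SAMPLE)
    print("indices needing attention:", not_green(rows))
```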
Viewing Index Settings
GET /articles/_settings
# Response (abridged)
{
"articles": {
"settings": {
"index": {
"number_of_shards": "3",
"number_of_replicas": "1"
}
}
}
}
Viewing Index Mappings
GET /articles/_mapping
# Response (abridged)
{
"articles": {
"mappings": {
"properties": {
"title": { "type": "text" },
"author": { "type": "keyword" }
}
}
}
}
Modifying an Index
Updating Settings
# Change the number of replicas
PUT /articles/_settings
{
"number_of_replicas": 2
}
# Close the index, then reopen it
POST /articles/_close
POST /articles/_open
Note: number_of_shards cannot be changed in place once the index has been created.
Adding Field Mappings
# Add new fields to an existing index
PUT /articles/_mapping
{
"properties": {
"summary": {
"type": "text"
},
"rating": {
"type": "float"
}
}
}
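Note: the type of an existing field cannot be changed in place; a mapping update can only add new fields.
Reindexing
To change the type of an existing field, the index must be rebuilt:
# 1. Create the new index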
PUT /articles_new
{
"mappings": {
"properties": {
"title": { "type": "text" },
"author": { "type": "text" } # changed to text
}
}
}
# 2. Copy the data
POST /_reindex
{
"source": {
"index": "articles"
},
"dest": {
"index": "articles_new"
}
}
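# 3. Delete the old index (only after confirming that the copy succeeded)
DELETE /articles
# 4. Create an alias (optional); the alias can reuse the name articles only because the old index no longer exists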
POST /_aliases
{
"actions": [
{ "add": { "index": "articles_new", "alias": "articles" } }
]
}
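Since deleting the old index is irreversible, it is worth checking the reindex result before running that step. Here is a minimal Python sketch of such a check, assuming the parsed JSON body of the `POST /_reindex` response is available as a dict. The helper name `safe_to_delete_source` and the sample values are hypothetical; the response fields it reads (`timed_out`, `failures`, `version_conflicts`, `created`, `updated`) are part of the reindex response.

```python
def safe_to_delete_source(source_doc_count: int, reindex_response: dict) -> bool:
    """Decide whether the old index can be deleted after a reindex.

    source_doc_count is the document count of the old index, taken
    before the reindex started (for example from GET /articles/_count).
    """
    if reindex_response.get("timed_out"):
        return False
    if reindex_response.get("failures"):
        return False
    if reindex_response.get("version_conflicts", 0) > 0:
        return False
    copied = reindex_response.get("created", 0) + reindex_response.get("updated", 0)
    # Every source document must have been written to the destination.
    return copied == source_doc_count


if __name__ == "__main__":
    ok = {"timed_out": False, "total": 1000, "created": 1000, "updated": 0,
          "version_conflicts": 0, "failures": []}
    partial = dict(ok, created=990)
    print(safe_to_delete_source(1000, ok), safe_to_delete_source(1000, partial))
```

The comparison is only meaningful if nothing writes to the old index while the reindex runs.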
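Deleting an Index
# Delete a single index
DELETE /articles
# Delete several indices
DELETE /articles,products
# Delete indices matching a wildcard (rejected while action.destructive_requires_name is true, the default in 8.x)
DELETE /articles_*
Warning: deleting an index removes all of its data. The operation cannot be undone.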
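Index Aliases
An alias is an alternative name for one or more indices. Because clients address the alias rather than a concrete index, the index behind it can be switched with zero downtime.
Creating an Alias
# Add an alias to a single index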
POST /_aliases
{
"actions": [
{
"add": {
"index": "articles",
"alias": "articles_alias"
}
}
]
}
# Create aliases together with the index
PUT /articles
{
"aliases": {
"articles_alias": {},
"search_articles": {}
}
}
Pointing an Alias at Multiple Indices
POST /_aliases
{
"actions": [
{ "add": { "index": "articles_2023", "alias": "articles_all" } },
{ "add": { "index": "articles_2024", "alias": "articles_all" } }
]
}
# Searching articles_all queries both indices
GET /articles_all/_search
Filtered Aliases
POST /_aliases
{
"actions": [
{
"add": {
"index": "articles",
"alias": "published_articles",
"filter": {
"term": { "status": "published" }
}
}
}
]
}
# Searching published_articles returns only published articles
GET /published_articles/_search
Switching an Alias
# Atomic operation: remove the alias from the old index and add it to the new one
POST /_aliases
{
"actions": [
{ "remove": { "index": "articles_v1", "alias": "articles" } },
{ "add": { "index": "articles_v2", "alias": "articles" } }
]
}
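When the switch is scripted, the request body can be generated rather than written by hand. The sketch below builds the same two-action body as the request above; the function name `build_alias_swap` is a hypothetical helper, and sending the body to `POST /_aliases` is left to whichever HTTP client is in use.

```python
import json


def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build a POST /_aliases body that moves alias from old_index to new_index.

    Both actions travel in one request, so the cluster applies them
    atomically and the alias never points at zero indices.
    """
    if old_index == new_index:
        raise ValueError("old_index and new_index must differ")
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    body = build_alias_swap("articles", "articles_v1", "articles_v2")
    print(json.dumps(body, indent=2))
```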
Viewing Aliases
# List the aliases of an index
GET /articles/_alias
# Show which indices an alias points to
GET /_alias/articles_alias
# List all aliases
GET /_alias
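Index Statistics
Statistics for an Index
GET /articles/_stats
# Key statistics (abridged; in the full response this object is nested under _all and under indices.articles)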
{
"primaries": {
"docs": {
"count": 1000,
"deleted": 50
},
"store": {
"size_in_bytes": 1048576
},
"indexing": {
"index_total": 1050,
"index_time_in_millis": 5000
},
"search": {
"query_total": 100,
"query_time_in_millis": 500
}
}
}
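The raw counters become more useful once turned into ratios. This Python sketch derives a few from the abridged primaries object shown above; the function name `summarize` and the choice of metrics are illustrative assumptions, not part of the stats API.

```python
def summarize(primaries: dict) -> dict:
    """Derive simple per-operation averages from a primaries stats object."""
    docs = primaries["docs"]
    indexing = primaries["indexing"]
    search = primaries["search"]
    total_docs = docs["count"] + docs["deleted"]
    return {
        # Average time spent per indexing operation, in milliseconds.
        "avg_index_ms": (indexing["index_time_in_millis"] / indexing["index_total"]
                         if indexing["index_total"] else 0.0),
        # Average time spent per query, in milliseconds.
        "avg_query_ms": (search["query_time_in_millis"] / search["query_total"]
                         if search["query_total"] else 0.0),
        # Share of documents marked deleted but not yet merged away.
        "deleted_ratio": docs["deleted"] / total_docs if total_docs else 0.0,
        "store_mib": primaries["store"]["size_in_bytes"] / (1024 * 1024),
    }


if __name__ == "__main__":
    sample = {
        "docs": {"count": 1000, "deleted": 50},
        "store": {"size_in_bytes": 1048576},
        "indexing": {"index_total": 1050, "index_time_in_millis": 5000},
        "search": {"query_total": 100, "query_time_in_millis": 500},
    }
    for key, value in summarize(sample).items():
        print(f"{key}: {value:.3f}")
```

A steadily growing deleted ratio is one signal that a force merge may be worth considering for an index that no longer receives writes.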
Segment Information
GET /articles/_segments
# Shows the underlying Lucene segments
Recovery Status
GET /articles/_recovery
# Shows the recovery status of each shard
Index Settings in Detail
Common Settings
PUT /articles
{
"settings": {
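# Shard settings
"number_of_shards": 3,
"number_of_replicas": 1,
# Refresh interval (default: 1s)
"refresh_interval": "1s",
# Translog tuning for bulk writes (async durability can lose recently acknowledged writes if a node crashes)
"index.translog.durability": "async",
"index.translog.sync_interval": "5s",
"index.translog.flush_threshold_size": "512mb",
# Analyzer settings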
"analysis": {
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "stop"]
}
}
}
}
}
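Optimizing Write Performance
During a bulk import, some settings can be relaxed temporarily:
# Before the import: drop replicas and disable refresh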
PUT /articles/_settings
{
"number_of_replicas": 0,
"refresh_interval": "-1"
}
# Import the data...
# After the import: restore the settings
PUT /articles/_settings
{
"number_of_replicas": 1,
"refresh_interval": "1s"
}
# Refresh manually
POST /articles/_refresh
# Force-merge segments (best reserved for indices that no longer receive writes)
POST /articles/_forcemerge?max_num_segments=1
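A risk with the sequence above is that a failed import leaves the index with no replicas and refresh disabled. One way to guard against that, sketched here in Python, is to wrap the import in a context manager that restores the settings in a `finally` block. The `put_settings` callable is a hypothetical function supplied by the caller, assumed to send `PUT /<index>/_settings` with the given body; the two settings dicts mirror the requests above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

BULK_SETTINGS = {"number_of_replicas": 0, "refresh_interval": "-1"}
NORMAL_SETTINGS = {"number_of_replicas": 1, "refresh_interval": "1s"}


@contextmanager
def bulk_import_settings(index: str,
                         put_settings: Callable[[str, dict], None]) -> Iterator[None]:
    """Apply import-time settings, then restore the normal ones afterwards.

    put_settings is a hypothetical caller-supplied function that sends
    PUT /<index>/_settings with the given body.
    """
    put_settings(index, BULK_SETTINGS)
    try:
        yield
    finally:
        # Runs even if the import raises, so replicas and refresh come back.
        put_settings(index, NORMAL_SETTINGS)


if __name__ == "__main__":
    def fake_put_settings(index: str, body: dict) -> None:
        print(f"PUT /{index}/_settings {body}")

    with bulk_import_settings("articles", fake_put_settings):
        print("importing data...")
```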
Index Lifecycle Management
Index lifecycle management (ILM) automates the stages an index passes through as it ages.
Creating a Lifecycle Policy
PUT /_ilm/policy/articles_policy
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_size": "50gb",
"max_age": "30d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "30d",
"actions": {
"shrink": {
"number_of_shards": 1
},
"forcemerge": {
"max_num_segments": 1
},
"set_priority": {
"priority": 50
}
}
},
"cold": {
"min_age": "60d",
"actions": {
"freeze": {}, # deprecated since 7.14; a no-op in 8.x
"set_priority": {
"priority": 0
}
}
},
"delete": {
"min_age": "90d",
"actions": {
"delete": {}
}
}
}
}
}
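To see how the min_age values in the policy above partition an index's life, the following Python sketch maps an age to the phase it falls in. It is a simplified model under stated assumptions: the helper names are hypothetical, only the units ms, s, m, h, and d are handled, and the real transition also depends on ILM's periodic checks and on each phase's actions completing. Note that when the hot phase uses rollover, the age is measured from the rollover time rather than from index creation.

```python
import re

# min_age values copied from the policy above.
PHASE_MIN_AGE = {"hot": "0ms", "warm": "30d", "cold": "60d", "delete": "90d"}

_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(age: str) -> float:
    """Convert a time value such as '30d' or '0ms' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", age)
    if match is None:
        raise ValueError(f"unsupported time value: {age!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def phase_for_age(age_days: float, phases=PHASE_MIN_AGE) -> str:
    """Return the last phase whose min_age has been reached at age_days."""
    age_seconds = age_days * 86400
    current = None
    for name, min_age in sorted(phases.items(), key=lambda item: to_seconds(item[1])):
        if age_seconds >= to_seconds(min_age):
            current = name
    return current


if __name__ == "__main__":
    for days in (0, 45, 60, 95):
        print(f"day {days}: {phase_for_age(days)}")
```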
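Applying the Lifecycle Policy
Reference the policy from an index template so that every new matching index picks it up. This request reuses the name articles_template, so it replaces the template defined earlier in this chapter.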
PUT /_index_template/articles_template
{
"index_patterns": ["articles-*"],
"template": {
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"index.lifecycle.name": "articles_policy",
"index.lifecycle.rollover_alias": "articles"
}
}
}
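The template names articles as the rollover alias, but rollover can only proceed if that alias already exists and points at a write index, so a first index has to be bootstrapped by hand. The Python sketch below builds that bootstrap request and shows how rollover derives the next index name (the trailing number is incremented and zero-padded to six digits). The function names and the choice of `articles-000001` as the first index are assumptions for illustration.

```python
import re
from typing import Tuple


def bootstrap_request(alias: str, first_index: str) -> Tuple[str, str, dict]:
    """Return (method, path, body) for creating the first rollover index."""
    body = {"aliases": {alias: {"is_write_index": True}}}
    return "PUT", f"/{first_index}", body


def next_rollover_name(index_name: str) -> str:
    """Compute the name rollover gives the next index in the series."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError("index name must end in '-' followed by a number")
    prefix, counter = match.groups()
    # The incremented counter is always zero-padded to six digits.
    return f"{prefix}{int(counter) + 1:06d}"


if __name__ == "__main__":
    method, path, body = bootstrap_request("articles", "articles-000001")
    print(method, path, body)
    print("next index:", next_rollover_name("articles-000001"))
```

The first index name matches the articles-* pattern, so it also receives the lifecycle settings from the template.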
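Summary
This chapter covered:
- Creating indices (basic creation, with settings, and with mappings)
- Using index templates
- Viewing, modifying, and deleting indices
- Creating and using index aliases
- Index statistics and monitoring
- Index settings in detail
- Optimizing write performance
- Index lifecycle management
Exercises
- Create a product index with fields for name, price, category, and stock
- Create a search alias for the product index
- Use an index template to create several indices with a similar structure
- Configure an ILM policy that automatically deletes data older than 30 days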