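Redis Advanced Data Types

Beyond its five basic data types, Redis offers several advanced data types aimed at specific business scenarios. This chapter covers three of them: Bitmap, HyperLogLog (cardinality estimation), and Geospatial (geospatial indexing). Each is an extension built on top of a basic type.

Overview

Data type	Underlying type	Typical scenarios	Memory efficiency
Bitmap	String	Check-ins, user tags, Bloom filters	Very high
HyperLogLog	String	UV counting, cardinality estimation	Very high (at most about 12 KB per key)
Geospatial	Sorted Set	People nearby, location services	High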

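Bitmap

What is a Bitmap?

A Bitmap is not a separate data type; it is a set of bit-level operations defined on the String type. Because a String is binary-safe and can be up to 512 MB long, a single String can hold up to $2^{32}$ bits.

Core idea: treat each bit as a boolean. The position of the bit stands for an identifier, and its value (0 or 1) stands for that identifier's state.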
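
To make the bit-addressing idea concrete, here is a minimal pure-Python sketch that mimics setting and reading bits on a bytearray. The helper names (locate_bit, set_bit, get_bit) are made up for illustration. The sketch assumes Redis's convention that offset 0 is the most significant bit of the first byte; it is not a description of how Redis is implemented internally.

```python
def locate_bit(offset):
    """Map a bit offset to (byte index, shift inside that byte).

    Assumption: offset 0 is the most significant bit of byte 0.
    """
    return offset // 8, 7 - (offset % 8)


def set_bit(buf, offset, value):
    """Set one bit in a bytearray, growing it with zero bytes if needed.

    Returns the previous value of the bit.
    """
    byte_index, shift = locate_bit(offset)
    if byte_index >= len(buf):
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    old = (buf[byte_index] >> shift) & 1
    if value:
        buf[byte_index] |= 1 << shift
    else:
        buf[byte_index] &= ~(1 << shift) & 0xFF
    return old


def get_bit(buf, offset):
    """Read one bit; offsets past the end read as 0."""
    byte_index, shift = locate_bit(offset)
    if byte_index >= len(buf):
        return 0
    return (buf[byte_index] >> shift) & 1


bitmap = bytearray()
set_bit(bitmap, 100, 1)          # identifier 100 -> state 1
print(locate_bit(100))           # (12, 3): byte 12, shift 3
print(len(bitmap))               # 13 bytes are enough to hold bit 100
print(get_bit(bitmap, 100), get_bit(bitmap, 999))   # 1 0
```

The identifier never has to be stored: it is implied by the position of the bit, which is where the space saving comes from.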

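Why does a Bitmap save memory?

Suppose there are 100 million users and we record whether each one has checked in:

Storage method	Memory usage (rough estimate)
Set (stores user IDs)	About 400 MB (assuming 4 bytes per ID and ignoring per-element overhead)
Hash (user ID -> 1)	About 500 MB
Bitmap	About 12.5 MB (100 million bits = 12.5 MB)

A Bitmap's size is determined by the largest offset that has been set (here, the largest user ID), not by how many users actually checked in. Dense ID ranges are therefore stored very compactly, while a few very large, sparse IDs can waste memory.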
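
The 12.5 MB figure follows directly from the highest offset in use. A small sketch of that arithmetic (the function name bitmap_bytes is hypothetical, and the result covers only the raw string payload, not Redis's per-key overhead):

```python
def bitmap_bytes(max_offset):
    """Bytes needed for a bitmap whose highest set bit is max_offset."""
    return max_offset // 8 + 1


# 100 million users with IDs 0 .. 99_999_999
size = bitmap_bytes(99_999_999)
print(size, "bytes =", size / 1_000_000, "MB")   # 12500000 bytes = 12.5 MB

# The cost depends on the highest ID only: one check-in by user
# 99_999_999 needs as many bytes as check-ins by all 100 million users.
print(bitmap_bytes(2**32 - 1) // (1024 * 1024), "MiB at the maximum offset")  # 512
```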

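Basic commands

SETBIT: set a bit

Sets the bit at the given offset to 0 or 1:

# Syntax
SETBIT key offset value

# Set bit 100 to 1 (user 100 has checked in)
127.0.0.1:6379> SETBIT sign:2024-01-01 100 1
(integer) 0    # returns the bit's previous value

# Set bit 101 to 1
127.0.0.1:6379> SETBIT sign:2024-01-01 101 1
(integer) 0

# Setting the same bit again returns the old value
127.0.0.1:6379> SETBIT sign:2024-01-01 100 1
(integer) 1    # bit 100 was already 1

Parameters:

key: the key name
offset: the bit offset (starting at 0), at most $2^{32}-1$
value: the bit value; must be 0 or 1

Notes:

If the key does not exist, Redis creates a string just long enough to hold the bit.
If the offset lies beyond the current length, Redis grows the string automatically and fills the new bits with 0. The first write at a very large offset has to allocate all of that memory at once, which can briefly block the server.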

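GETBIT: get a bit

Returns the bit value at the given offset:

# Syntax
GETBIT key offset

# Check whether user 100 has checked in
127.0.0.1:6379> GETBIT sign:2024-01-01 100
(integer) 1

# Check user 999 (never set, so 0 is returned)
127.0.0.1:6379> GETBIT sign:2024-01-01 999
(integer) 0

Behavior: a bit beyond the end of the string always reads as 0; no error is raised.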

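Bit operation commands

BITCOUNT: count set bits

Counts the bits that are set to 1 in a string:

# Syntax
BITCOUNT key [start end [BYTE | BIT]]

# Count all bits
127.0.0.1:6379> BITCOUNT sign:2024-01-01
(integer) 2    # 2 users have checked in

# Count within a byte range (the default unit is bytes)
127.0.0.1:6379> BITCOUNT sign:2024-01-01 0 12
(integer) 2    # 1 bits in the first 13 bytes (bits 100 and 101 both sit in byte 12)

# Count within a bit range (Redis 7.0+)
127.0.0.1:6379> BITCOUNT sign:2024-01-01 0 100 BIT
(integer) 1    # 1 bits among the first 101 bits (bit 101 lies outside the range)

Use cases: quickly counting check-ins or active users.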
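
Byte ranges and bit ranges are easy to mix up, so here is a small pure-Python model of the range semantics. The function bitcount below is an illustrative stand-in written for this chapter, not a Redis or redis-py API. It assumes inclusive ranges, negative indexes counted from the end, and bit 0 being the most significant bit of byte 0.

```python
def bitcount(data, start=0, end=-1, unit="BYTE"):
    """Count 1 bits in data[start..end], with the range in bytes or bits."""
    total = len(data) * (8 if unit == "BIT" else 1)
    if start < 0:
        start = max(total + start, 0)
    if end < 0:
        end = total + end
    end = min(end, total - 1)
    if start > end:
        return 0
    if unit == "BYTE":
        return sum(bin(byte).count("1") for byte in data[start:end + 1])
    count = 0
    for offset in range(start, end + 1):
        count += (data[offset // 8] >> (7 - offset % 8)) & 1
    return count


# A 13-byte string with bits 100 and 101 set (both live in byte 12).
data = bytes(12) + bytes([0b00001100])

print(bitcount(data))                  # 2
print(bitcount(data, 0, 10))           # 0: bytes 0..10 cover bits 0..87 only
print(bitcount(data, 0, 12))           # 2
print(bitcount(data, 0, 100, "BIT"))   # 1: bit 101 is excluded
```

A byte range of 0 10 covers bits 0 through 87, which is why a user ID of 100 only shows up once byte 12 is included.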

BITPOS: find a bit

Finds the first bit set to the given value (0 or 1):

# Syntax
BITPOS key bit [start [end [BYTE | BIT]]]

# Find the first 1
127.0.0.1:6379> BITPOS sign:2024-01-01 1
(integer) 100    # bit 100 is the first 1

# Find the first 0
127.0.0.1:6379> BITPOS sign:2024-01-01 0
(integer) 0      # bit 0 is the first 0

Use cases: finding the first unused ID or the first free slot.

BITOP: bitwise operations

Performs a bitwise operation on one or more strings and stores the result in destkey:

# Syntax
BITOP operation destkey key [key ...]

# operation: AND, OR, XOR, NOT

# Prepare data
127.0.0.1:6379> SETBIT bits1 0 1
127.0.0.1:6379> SETBIT bits1 1 1
127.0.0.1:6379> SETBIT bits2 1 1
127.0.0.1:6379> SETBIT bits2 2 1

# AND (1 only where both inputs are 1)
127.0.0.1:6379> BITOP AND result_and bits1 bits2
(integer) 1    # length in bytes of the string stored at destkey
127.0.0.1:6379> GETBIT result_and 1
(integer) 1    # bit 1 is the only bit set in both inputs

# OR (1 where either input is 1)
127.0.0.1:6379> BITOP OR result_or bits1 bits2
(integer) 1

# XOR (1 where the inputs differ)
127.0.0.1:6379> BITOP XOR result_xor bits1 bits2
(integer) 1

# NOT (inversion; accepts exactly one source key)
127.0.0.1:6379> BITOP NOT result_not bits1
(integer) 1

Time complexity: O(N), where N is the length of the longest input string. Use it with care on large Bitmaps.
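
The four operations can be modeled on plain bytes to check what a result should look like before running it against a server. This is an illustrative sketch only: the function bitop is hypothetical, and it assumes that shorter inputs are treated as zero-padded to the length of the longest one.

```python
def bitop(op, *sources):
    """Model of AND / OR / XOR / NOT on byte strings."""
    op = op.upper()
    if op == "NOT":
        if len(sources) != 1:
            raise ValueError("NOT takes exactly one source")
        return bytes(~byte & 0xFF for byte in sources[0])

    combine = {
        "AND": lambda a, b: a & b,
        "OR": lambda a, b: a | b,
        "XOR": lambda a, b: a ^ b,
    }[op]
    length = max(len(s) for s in sources)
    padded = [s.ljust(length, b"\x00") for s in sources]   # zero-pad short inputs
    result = bytearray(padded[0])
    for source in padded[1:]:
        for i, byte in enumerate(source):
            result[i] = combine(result[i], byte)
    return bytes(result)


bits1 = bytes([0b11000000])   # bits 0 and 1 set
bits2 = bytes([0b01100000])   # bits 1 and 2 set

print(format(bitop("AND", bits1, bits2)[0], "08b"))   # 01000000
print(format(bitop("OR", bits1, bits2)[0], "08b"))    # 11100000
print(format(bitop("XOR", bits1, bits2)[0], "08b"))   # 10100000
print(format(bitop("NOT", bits1)[0], "08b"))          # 00111111
```

Note that NOT flips every bit of every byte, including the padding bits past the highest ID in use, so a count taken on a NOT result can include bits that do not correspond to any real identifier.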

Use cases in detail

Scenario 1: user check-in

import redis
import time
from datetime import datetime, timedelta

r = redis.Redis(decode_responses=True)

def sign(user_id, date=None):
    """Record a check-in for the user."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    key = f"sign:{date}"
    # The user ID is used as the bit offset
    r.setbit(key, user_id, 1)
    return True

def is_signed(user_id, date=None):
    """Return whether the user has checked in on the given date."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    key = f"sign:{date}"
    return bool(r.getbit(key, user_id))

def get_sign_count(date=None):
    """Return the number of users who checked in on the given date."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    key = f"sign:{date}"
    return r.bitcount(key)

def get_continuous_sign_days(user_id, end_date=None):
    """Return the number of consecutive check-in days ending on end_date."""
    if end_date is None:
        end_date = time.strftime('%Y-%m-%d')

    date = datetime.strptime(end_date, '%Y-%m-%d')
    continuous_days = 0

    for _ in range(365):  # look back at most one year
        key = f"sign:{date.strftime('%Y-%m-%d')}"
        if not r.getbit(key, user_id):
            break
        continuous_days += 1
        date -= timedelta(days=1)

    return continuous_days

# Usage example
sign(1001, '2024-01-15')
sign(1002, '2024-01-15')
print(f"Check-in count: {get_sign_count('2024-01-15')}")  # 2
print(f"User 1001 checked in: {is_signed(1001, '2024-01-15')}")  # True

Scenario 2: user online status

import redis

r = redis.Redis()

def set_online(user_id):
    """Mark the user as online."""
    key = "online:users"
    r.setbit(key, user_id, 1)

def set_offline(user_id):
    """Mark the user as offline."""
    key = "online:users"
    r.setbit(key, user_id, 0)

def is_online(user_id):
    """Return whether the user is online."""
    key = "online:users"
    return bool(r.getbit(key, user_id))

def get_online_count():
    """Return the number of online users."""
    key = "online:users"
    return r.bitcount(key)

# Usage example
set_online(1001)
set_online(1002)
set_offline(1001)
print(f"Online users: {get_online_count()}")  # 1

Scenario 3: counting active users

import calendar
import redis

r = redis.Redis()

def record_active(user_id, date):
    """Record that the user was active on the given date."""
    key = f"active:{date}"
    r.setbit(key, user_id, 1)

def get_daily_active(date):
    """Return the daily active user count."""
    key = f"active:{date}"
    return r.bitcount(key)

def get_weekly_active(dates):
    """Return the weekly active user count (deduplicated)."""
    keys = [f"active:{date}" for date in dates]
    # OR the daily bitmaps: a user active on any day counts once
    r.bitop('OR', 'active:weekly', *keys)
    return r.bitcount('active:weekly')

def get_monthly_active(year_month):
    """Return the monthly active user count."""
    # Assumes the daily keys exist; a missing day behaves like an all-zero bitmap
    year, month = map(int, year_month.split('-'))
    _, days = calendar.monthrange(year, month)

    keys = [f"active:{year_month}-{d:02d}" for d in range(1, days + 1)]
    r.bitop('OR', f'active:{year_month}', *keys)
    return r.bitcount(f'active:{year_month}')

# Usage example
record_active(1001, '2024-01-01')
record_active(1002, '2024-01-01')
record_active(1001, '2024-01-02')
print(f"DAU on Jan 1: {get_daily_active('2024-01-01')}")  # 2

# Compute weekly actives (deduplicated)
get_weekly_active(['2024-01-01', '2024-01-02'])

Scenario 4: Bloom filter

A Bitmap can back a simple Bloom filter:

import redis
import mmh3  # requires: pip install mmh3

r = redis.Redis()

class BloomFilter:
    def __init__(self, redis_client, key, size, hash_count):
        self.r = redis_client
        self.key = key
        self.size = size              # number of bits in the filter
        self.hash_count = hash_count  # number of hash functions

    def _get_positions(self, value):
        """Compute the bit positions for a value."""
        positions = []
        value_str = str(value)
        for i in range(self.hash_count):
            # Use a different seed for each hash function
            position = mmh3.hash(value_str, i) % self.size
            positions.append(position)
        return positions

    def add(self, value):
        """Add an element."""
        for pos in self._get_positions(value):
            self.r.setbit(self.key, pos, 1)

    def might_contain(self, value):
        """Return whether the element may be present."""
        for pos in self._get_positions(value):
            if not self.r.getbit(self.key, pos):
                return False  # definitely absent
        return True  # possibly present

# Usage example
bf = BloomFilter(r, 'bloom:emails', 10000000, 7)

# Add a known address (placeholder addresses for illustration)
bf.add('alice@example.com')

# Check addresses
print(bf.might_contain('alice@example.com'))  # True
print(bf.might_contain('bob@example.com'))    # False (or True on a false positive)
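
The size and hash_count arguments determine the false-positive rate. The sketch below applies the standard Bloom filter sizing formulas, m = -n ln p / (ln 2)^2 and k = (m / n) ln 2. The function names are hypothetical, and the one-million-item load is an assumption chosen to show how a filter of 10,000,000 bits with 7 hashes could have been sized.

```python
import math


def bloom_parameters(expected_items, false_positive_rate):
    """Return (bits, hash_count) for a target false-positive rate."""
    bits = math.ceil(
        -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    hash_count = max(1, round(bits / expected_items * math.log(2)))
    return bits, hash_count


def estimated_false_positive_rate(bits, hash_count, inserted_items):
    """Approximate false-positive rate after inserting inserted_items elements."""
    return (1 - math.exp(-hash_count * inserted_items / bits)) ** hash_count


# Assumed load: 1 million items at a 1% target rate.
bits, hash_count = bloom_parameters(1_000_000, 0.01)
print(bits, hash_count)   # roughly 9.6 million bits, 7 hash functions

# A filter of 10,000,000 bits with 7 hashes holding 1 million items:
rate = estimated_false_positive_rate(10_000_000, 7, 1_000_000)
print(f"{rate:.2%}")      # a little under 1%
```

If the filter ends up holding far more items than it was sized for, the false-positive rate rises quickly, so the expected item count is worth estimating generously.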

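Performance optimization

1. Shard sensibly

When a Bitmap becomes very large, shard it by user ID range:

SHARD_SIZE = 1_000_000

def get_shard_key(base_key, user_id, shard_size=SHARD_SIZE):
    """Return the key of the shard that holds this user."""
    shard_id = user_id // shard_size
    return f"{base_key}:shard:{shard_id}"

def set_bit_sharded(base_key, user_id, value, shard_size=SHARD_SIZE):
    """Set a bit in the correct shard."""
    shard_key = get_shard_key(base_key, user_id, shard_size)
    # The offset is relative to the start of the shard
    offset = user_id % shard_size
    r.setbit(shard_key, offset, value)

2. Batch operations

def batch_set_bits(key, user_ids):
    """Set many bits in one round trip."""
    pipe = r.pipeline()
    for user_id in user_ids:
        pipe.setbit(key, user_id, 1)
    pipe.execute()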

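HyperLogLog

What is HyperLogLog?

HyperLogLog is a probabilistic data structure for estimating the cardinality of a set (the number of distinct elements). Its characteristics:

Bounded memory: a key takes at most about 12 KB no matter how many elements are counted (Redis uses a more compact sparse encoding while the cardinality is small)
Controlled error: the standard error is about 0.81%
Add and count only: the elements themselves cannot be retrieved

Why is HyperLogLog needed?

Counting a site's UV (unique visitors) is a common requirement. Storing every visitor in a Set costs roughly:

UV count	Set memory usage (rough estimate)
1 million	About 50 MB
10 million	About 500 MB
100 million	About 5 GB

With HyperLogLog, the cost stays at about 12 KB per key regardless of the UV count.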
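
To show why the memory stays fixed, here is a simplified toy HyperLogLog in pure Python. It is a teaching sketch under stated assumptions, not Redis's implementation: the class name ToyHyperLogLog, the use of SHA-1 as the hash, and the plain small-range correction are all choices made for this example. With p = 14 it keeps 16384 registers; at 6 bits each that is 12 KB, which is where the figure above comes from.

```python
import hashlib
import math


class ToyHyperLogLog:
    """Simplified HyperLogLog with 2**p registers."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p                       # number of registers
        self.registers = bytearray(self.m)    # one small rank per register

    def add(self, item):
        # 64-bit hash of the item (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        index = x & (self.m - 1)              # low p bits select a register
        rest = x >> self.p                    # remaining 64 - p bits
        # Rank = position of the lowest 1 bit in rest, counted from 1.
        rank = (rest & -rest).bit_length() if rest else 64 - self.p + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)      # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return round(m * math.log(m / zeros))   # linear counting for small sets
        return round(raw)

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        for i, rank in enumerate(other.registers):
            if rank > self.registers[i]:
                self.registers[i] = rank


hll = ToyHyperLogLog()
for i in range(100_000):
    hll.add(f"user_{i}")
print(hll.count())            # close to 100000, typically within about 1%
print(len(hll.registers))     # 16384 registers, however many items are added
```

Adding an element only ever raises one register, so duplicates leave the sketch unchanged and the structure never grows.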

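Basic commands

PFADD: add elements

Adds elements to a HyperLogLog:

# Syntax
PFADD key element [element ...]

# Add a single element
127.0.0.1:6379> PFADD uv:2024-01-01 user_1001
(integer) 1    # 1 means the cardinality estimate changed

# Add several elements
127.0.0.1:6379> PFADD uv:2024-01-01 user_1002 user_1003 user_1004
(integer) 1

# Add a duplicate element
127.0.0.1:6379> PFADD uv:2024-01-01 user_1001
(integer) 0    # 0 means the cardinality estimate did not change

Return values:

1: at least one internal register changed, so the cardinality estimate changed (typically because a new element was seen)
0: nothing changed; the elements had (very likely) been counted already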

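PFCOUNT: get the cardinality

Returns the estimated cardinality of a HyperLogLog:

# Cardinality of a single HyperLogLog
127.0.0.1:6379> PFCOUNT uv:2024-01-01
(integer) 4    # an estimated 4 distinct users

# Cardinality of the union of several HyperLogLogs
127.0.0.1:6379> PFADD uv:2024-01-02 user_1001 user_1005
127.0.0.1:6379> PFCOUNT uv:2024-01-01 uv:2024-01-02
(integer) 5    # deduplicated estimate across both days

Note: when given several keys, PFCOUNT merges them on the fly to compute the cardinality of the union and leaves the source keys unchanged. This is slower than the single-key form, so avoid it on hot paths.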

PFMERGE: merge

Merges several HyperLogLogs into one:

# Syntax
PFMERGE destkey sourcekey [sourcekey ...]

# Merge two days of UV
127.0.0.1:6379> PFMERGE uv:2024-01:week1 uv:2024-01-01 uv:2024-01-02
OK
127.0.0.1:6379> PFCOUNT uv:2024-01:week1
(integer) 5

Use cases in detail

Scenario 1: website UV counting

import calendar
import redis
import time

r = redis.Redis(decode_responses=True)

def record_visit(user_id, page=None):
    """Record a visit by the user."""
    date = time.strftime('%Y-%m-%d')
    if page:
        # Distinct visitors per page: this is a per-page UV, not a PV
        key = f"uv:{page}:{date}"
    else:
        key = f"uv:{date}"
    r.pfadd(key, user_id)

def get_uv(date=None):
    """Return the UV for the given date."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    return r.pfcount(f"uv:{date}")

def get_monthly_uv(year_month):
    """Return the monthly UV."""
    year, month = map(int, year_month.split('-'))
    _, days = calendar.monthrange(year, month)

    # Merge the daily UV keys
    keys = [f"uv:{year_month}-{d:02d}" for d in range(1, days + 1)]
    dest_key = f"uv:{year_month}"
    r.pfmerge(dest_key, *keys)
    return r.pfcount(dest_key)

# Usage example
record_visit('user_001')
record_visit('user_002')
record_visit('user_001')  # repeat visit
print(f"Today's UV: {get_uv()}")  # about 2

Scenario 2: search term statistics

import redis
import time

r = redis.Redis()

def record_search(query, user_id):
    """Record that a user searched for a term."""
    today = time.strftime('%Y-%m-%d')
    key = f"search:uv:{query}:{today}"
    r.pfadd(key, user_id)

def get_search_uv(query, date=None):
    """Return the number of distinct users who searched for the term."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    key = f"search:uv:{query}:{date}"
    return r.pfcount(key)

# Usage example
record_search('Redis教程', 'user_001')
record_search('Redis教程', 'user_002')
print(f"Users who searched 'Redis教程': {get_search_uv('Redis教程')}")

Scenario 3: multi-dimensional statistics

import redis
import time

r = redis.Redis()

def record_action(user_id, action, dimension=None):
    """Record a user action, optionally under a dimension."""
    date = time.strftime('%Y-%m-%d')
    key_parts = [action, date]
    if dimension:
        key_parts.insert(1, dimension)
    key = f"stats:{':'.join(key_parts)}"
    r.pfadd(key, user_id)

def get_action_uv(action, dimension=None, date=None):
    """Return the UV for an action."""
    if date is None:
        date = time.strftime('%Y-%m-%d')
    key_parts = [action, date]
    if dimension:
        key_parts.insert(1, dimension)
    key = f"stats:{':'.join(key_parts)}"
    return r.pfcount(key)

# Usage example
record_action('user_001', 'purchase', 'electronics')
record_action('user_002', 'purchase', 'electronics')
record_action('user_001', 'purchase', 'books')
print(f"Users who bought electronics: {get_action_uv('purchase', 'electronics')}")

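Error and precision

The standard error of HyperLogLog is about 0.81%. This is a relative error: the proportion stays roughly constant as the cardinality grows, while the absolute deviation grows with it. Taking two standard errors (about ±1.6%) as an approximate 95% interval:

Actual cardinality	Approximate estimate range (95% interval)
1,000	984 - 1,016
10,000	9,840 - 10,160
100,000	98,400 - 101,600
1,000,000	984,000 - 1,016,000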
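
Ranges of this kind can be derived for any cardinality. The sketch below assumes the usual HyperLogLog error formula of 1.04 / sqrt(m) with m = 16384 registers, and treats two standard errors as a rough 95% band; the function name estimate_range is hypothetical.

```python
import math

REGISTERS = 16384                                # 2**14 registers (assumed)
STANDARD_ERROR = 1.04 / math.sqrt(REGISTERS)     # about 0.0081, i.e. 0.81%


def estimate_range(cardinality, sigmas=2.0):
    """Return the (low, high) band around a true cardinality."""
    delta = cardinality * STANDARD_ERROR * sigmas
    return round(cardinality - delta), round(cardinality + delta)


print(f"standard error: {STANDARD_ERROR:.4%}")
for n in (1_000, 10_000, 100_000, 1_000_000):
    low, high = estimate_range(n)
    print(f"{n:>9,}: {low:,} - {high:,}")
```

The band is proportional to the cardinality, so an estimate of one million can easily be off by more than ten thousand in absolute terms.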

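Good fit:

Large-scale statistics where exact precision is not required
Situations where memory must be saved
Real-time statistics

Poor fit:

Cases that need exact counts
Cases that need to retrieve the individual elements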

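Geospatial (geospatial index)

What is Geospatial?

Geospatial is a geospatial indexing feature that Redis builds on top of the Sorted Set. It lets you:

Store geographic positions (longitude and latitude)
Compute the distance between two points
Find positions within a given range

How it works: Redis uses the GeoHash technique to interleave the bits of the longitude and the latitude into a single 52-bit integer, which it stores as the Sorted Set score. Because GeoHash maps two-dimensional coordinates onto one dimension, nearby positions usually share a common prefix, so a range search becomes a small number of score-range scans.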
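
The bit-interleaving step can be sketched in a few lines of Python. This is a simplified illustration and makes several assumptions: 26 bits per coordinate, a latitude range of about ±85.05, and latitude bits in the even positions. The function names encode and decode are made up here, and the scores produced are not guaranteed to match what a Redis server stores.

```python
LON_RANGE = (-180.0, 180.0)
LAT_RANGE = (-85.05112878, 85.05112878)
STEP = 26   # bits per coordinate, 52 bits in total


def _quantize(value, low, high):
    """Scale a coordinate to an integer cell number in [0, 2**STEP)."""
    cell = int((value - low) / (high - low) * (1 << STEP))
    return min(cell, (1 << STEP) - 1)


def encode(lon, lat):
    """Interleave the longitude and latitude cell numbers bit by bit."""
    lon_bits = _quantize(lon, *LON_RANGE)
    lat_bits = _quantize(lat, *LAT_RANGE)
    score = 0
    for i in range(STEP):
        score |= ((lat_bits >> i) & 1) << (2 * i)        # even positions
        score |= ((lon_bits >> i) & 1) << (2 * i + 1)    # odd positions
    return score


def decode(score):
    """Recover the center of the cell that a score represents."""
    lon_bits = lat_bits = 0
    for i in range(STEP):
        lat_bits |= ((score >> (2 * i)) & 1) << i
        lon_bits |= ((score >> (2 * i + 1)) & 1) << i

    def center(bits, low, high):
        return low + (bits + 0.5) / (1 << STEP) * (high - low)

    return center(lon_bits, *LON_RANGE), center(lat_bits, *LAT_RANGE)


score = encode(116.405285, 39.904989)
print(score < 2**52)     # True: the value fits in 52 bits
print(decode(score))     # very close to, but not exactly, the input
```

The round trip loses a little precision because each coordinate is snapped to one of 2^26 cells, which is also why coordinates read back from a geospatial index differ slightly from the values that were written.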

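Basic commands

GEOADD: add positions

Adds geographic positions to a set:

# Syntax
GEOADD key longitude latitude member [longitude latitude member ...]

# Note: longitude comes first, latitude second

# Add cities: Beijing (北京), Shanghai (上海), Guangzhou (广州)
127.0.0.1:6379> GEOADD cities 116.405285 39.904989 "北京"
(integer) 1
127.0.0.1:6379> GEOADD cities 121.472644 31.231706 "上海"
(integer) 1
127.0.0.1:6379> GEOADD cities 113.280637 23.125178 "广州"
(integer) 1

# Add several at once: Shenzhen (深圳), Hangzhou (杭州)
127.0.0.1:6379> GEOADD cities 114.085947 22.547 "深圳" 120.153576 30.287459 "杭州"
(integer) 2

Valid coordinate ranges:

Longitude: -180 to 180
Latitude: -85.05112878 to 85.05112878 (a limit of the encoding; positions very close to the poles cannot be indexed)

Updating a position: running GEOADD again for an existing member updates its coordinates. The reply is then 0, because no new member was added.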

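GEOPOS: get positions

Returns the longitude and latitude of members:

# Syntax
GEOPOS key member [member ...]

# The values differ slightly from the input because coordinates are stored as a 52-bit hash
127.0.0.1:6379> GEOPOS cities 北京
1) 1) "116.40528327226638794"
   2) "39.90498968127083119"

# Get several members
127.0.0.1:6379> GEOPOS cities 北京 上海
1) 1) "116.40528327226638794"
   2) "39.90498968127083119"
2) 1) "121.47264426946640015"
   2) "31.23170610410372507"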

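GEODIST: compute a distance

Computes the distance between two members:

# Syntax
GEODIST key member1 member2 [unit]

# unit: m (meters, the default), km (kilometers), mi (miles), ft (feet)

# Distance from Beijing (北京) to Shanghai (上海)
127.0.0.1:6379> GEODIST cities 北京 上海 km
"1067.5983"    # about 1,068 km

# Beijing (北京) to Guangzhou (广州)
127.0.0.1:6379> GEODIST cities 北京 广州 km
(a string of roughly 1,889 km for the coordinates added above)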

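A distance between two coordinates can also be computed on the client with the haversine formula, which is handy for checking results or for avoiding a round trip. The sketch below assumes a perfectly spherical Earth with a radius of 6372797.560856 m; the function name haversine_km is made up for this example. Real-world distances can differ by a fraction of a percent because the Earth is not a perfect sphere.

```python
import math

EARTH_RADIUS_M = 6372797.560856   # assumed sphere radius in meters


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) / 1000


# Beijing to Shanghai, using the coordinates from the GEOADD examples
distance = haversine_km(116.405285, 39.904989, 121.472644, 31.231706)
print(f"{distance:.1f} km")   # close to 1067.6 km
```

Longitude comes first in the argument list, matching the order that the geospatial commands use.
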
GEORADIUS (deprecated): query by radius

Note: GEORADIUS has been deprecated since Redis 6.2; use GEOSEARCH instead.

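GEOSEARCH: range search (Redis 6.2+)

Finds members within a given area:

# Syntax (one center and one shape are required)
GEOSEARCH key <FROMMEMBER member | FROMLONLAT longitude latitude> <BYRADIUS radius unit | BYBOX width height unit> [ASC | DESC] [COUNT count [ANY]] [WITHCOORD] [WITHDIST] [WITHHASH]

# Example: find cities within 500 km of Beijing (北京)
# No other city in the set is that close, so only Beijing itself is returned
127.0.0.1:6379> GEOSEARCH cities FROMMEMBER 北京 BYRADIUS 500 km WITHDIST
1) 1) "北京"
   2) "0.0000"

# Example: find cities within 200 km of a coordinate
127.0.0.1:6379> GEOSEARCH cities FROMLONLAT 116.40 39.90 BYRADIUS 200 km WITHDIST WITHCOORD
1) 1) "北京"
   2) (distance in km; about 0.7 here, because the query point is not exactly the stored coordinate)
   3) 1) "116.40528327226638794"
      2) "39.90498968127083119"

# Search a rectangular area (500 km wide, 500 km high, centered on the point)
127.0.0.1:6379> GEOSEARCH cities FROMLONLAT 116.40 39.90 BYBOX 500 500 km WITHDIST

# Limit the number of results and sort by distance
127.0.0.1:6379> GEOSEARCH cities FROMMEMBER 北京 BYRADIUS 1000 km COUNT 3 ASC WITHDIST

Parameters:

Parameter	Description
`FROMMEMBER`	Use an existing member as the center
`FROMLONLAT`	Use the given longitude and latitude as the center
`BYRADIUS`	Search a circular area
`BYBOX`	Search a rectangular area
`WITHDIST`	Return the distance from the center
`WITHCOORD`	Return the coordinates
`WITHHASH`	Return the raw GeoHash score
`COUNT`	Limit the number of results
`ASC/DESC`	Sort by distance, nearest or farthest first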

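GEOHASH: get the GeoHash

Returns the GeoHash string of members:

127.0.0.1:6379> GEOHASH cities 北京 上海
1) "wx4g0bgv6d0"
2) "wtw3sj5zbj0"

The longer a GeoHash string, the more precise the position. Positions that share a prefix are close together, although two nearby positions on either side of a cell boundary can still have different prefixes.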

Use cases in detail

Scenario 1: people or shops nearby

import redis

r = redis.Redis(decode_responses=True)

def add_user_location(user_id, longitude, latitude):
    """Create or update a user's position."""
    # redis-py 4.0+ takes the (longitude, latitude, member) values as one sequence
    r.geoadd('user:locations', (longitude, latitude, f"user:{user_id}"))

def find_nearby_users(user_id, radius=5, unit='km'):
    """Find users near the given user."""
    results = r.geosearch(
        'user:locations',
        member=f"user:{user_id}",
        radius=radius,
        unit=unit,
        withdist=True,
        count=20,
        sort='ASC'
    )
    # Each result is [member, distance]; skip the user themselves
    return [
        {
            'user_id': result[0].split(':')[1],
            'distance': float(result[1])
        }
        for result in results if result[0] != f"user:{user_id}"
    ]

# Usage example
add_user_location(1, 116.40, 39.90)
add_user_location(2, 116.41, 39.91)
add_user_location(3, 116.42, 39.92)
nearby = find_nearby_users(1, radius=10)
print(f"Nearby users: {nearby}")

Scenario 2: nearby shop search

import redis

r = redis.Redis(decode_responses=True)

def add_shop(shop_id, name, longitude, latitude, category):
    """Add a shop."""
    key = f"shops:{category}"
    # redis-py 4.0+ takes the (longitude, latitude, member) values as one sequence
    r.geoadd(key, (longitude, latitude, shop_id))
    # Store the shop details
    r.hset(f"shop:{shop_id}", mapping={
        'name': name,
        'category': category
    })

def search_nearby_shops(longitude, latitude, category, radius=5):
    """Search for shops near a coordinate."""
    key = f"shops:{category}"
    results = r.geosearch(
        key,
        longitude=longitude,
        latitude=latitude,
        radius=radius,
        unit='km',
        withdist=True,
        withcoord=True,
        count=50,
        sort='ASC'
    )

    shops = []
    for result in results:
        # Each result is [member, distance, (longitude, latitude)]
        shop_id = result[0]
        distance = float(result[1])
        coord = result[2]
        details = r.hgetall(f"shop:{shop_id}")
        shops.append({
            'shop_id': shop_id,
            'name': details.get('name'),
            'category': details.get('category'),
            'distance': distance,
            'longitude': float(coord[0]),
            'latitude': float(coord[1])
        })
    return shops

# Usage example
add_shop('shop_001', '星巴克中关村店', 116.31, 39.98, 'coffee')
add_shop('shop_002', '瑞幸五道口店', 116.32, 39.99, 'coffee')
shops = search_nearby_shops(116.31, 39.98, 'coffee', radius=2)

Scenario 3: ride hailing and food delivery

import uuid
import redis

r = redis.Redis(decode_responses=True)

def update_driver_location(driver_id, longitude, latitude):
    """Update a driver's position."""
    # redis-py 4.0+ takes the (longitude, latitude, member) values as one sequence
    r.geoadd('drivers:location', (longitude, latitude, driver_id))

def find_available_drivers(longitude, latitude, radius=5, count=10):
    """Find drivers near a coordinate."""
    results = r.geosearch(
        'drivers:location',
        longitude=longitude,
        latitude=latitude,
        radius=radius,
        unit='km',
        withdist=True,
        count=count,
        sort='ASC'
    )
    return [
        {
            'driver_id': result[0],
            'distance': float(result[1])
        }
        for result in results
    ]

def calculate_eta(driver_longitude, driver_latitude, dest_longitude, dest_latitude):
    """Estimate the arrival time in minutes from the straight-line distance."""
    # Use a unique temporary key so concurrent callers do not overwrite each other
    temp_key = f"temp:calc:{uuid.uuid4().hex}"
    r.geoadd(temp_key, (driver_longitude, driver_latitude, 'driver',
                        dest_longitude, dest_latitude, 'dest'))
    try:
        distance = float(r.geodist(temp_key, 'driver', 'dest', 'km'))
    finally:
        # Clean up the temporary data
        r.delete(temp_key)

    # Assume an average speed of 30 km/h
    eta_minutes = (distance / 30) * 60

    return round(eta_minutes, 1)

# Usage example
update_driver_location('driver_001', 116.40, 39.90)
drivers = find_available_drivers(116.41, 39.91, radius=3)

Scenario 4: geofencing

import uuid
import redis

r = redis.Redis(decode_responses=True)

def check_in_fence(user_longitude, user_latitude, fence_center_lon, fence_center_lat, radius_km):
    """Return whether the user is inside the fence."""
    # Use a unique temporary key so concurrent callers do not overwrite each other
    temp_key = f"temp:fence:{uuid.uuid4().hex}"
    # Add the user's position and the fence center
    r.geoadd(temp_key, (user_longitude, user_latitude, 'user',
                        fence_center_lon, fence_center_lat, 'center'))
    try:
        distance = float(r.geodist(temp_key, 'user', 'center', 'km'))
    finally:
        # Clean up
        r.delete(temp_key)

    return distance <= radius_km

# Usage example
# Office location: 116.40, 39.90, radius 500 meters
in_office = check_in_fence(116.401, 39.901, 116.40, 39.90, 0.5)
print(f"Inside the office area: {in_office}")

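Performance optimization

1. Store by category

Keep positions of different categories in separate keys:

# Good practice
r.geoadd('shops:restaurant', (lng, lat, 'shop_001'))
r.geoadd('shops:cafe', (lng, lat, 'cafe_001'))

# Avoid
r.geoadd('shops:all', (lng, lat, 'restaurant:001'))
r.geoadd('shops:all', (lng, lat, 'cafe:001'))

2. Choose a sensible radius

The larger the search radius, the more expensive the query. Recommendations:

Real-time search: keep the radius within 5 km
Load in stages: start with a small radius and widen it step by step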
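
The staged approach can be sketched as a small helper that widens the radius until enough results come back. Everything here is illustrative: search_expanding and its parameters are hypothetical names, and the search argument stands for any function that takes a radius in kilometers and returns a list, for example a thin wrapper around a GEOSEARCH call.

```python
def search_expanding(search, radii_km=(1, 2, 5), min_results=10):
    """Call search(radius) with growing radii until enough results are found.

    Returns the results of the last search performed.
    """
    results = []
    for radius in radii_km:
        results = search(radius)
        if len(results) >= min_results:
            break
    return results


# Stand-in for a real query: distances (in km) of four shops from the user.
shop_distances = [0.5, 1.5, 3.0, 4.0]

def fake_search(radius):
    return [d for d in shop_distances if d <= radius]

print(search_expanding(fake_search, min_results=1))   # [0.5]
print(search_expanding(fake_search, min_results=2))   # [0.5, 1.5]
print(search_expanding(fake_search, min_results=3))   # [0.5, 1.5, 3.0, 4.0]
```

In dense areas the first, cheapest query is usually enough, and the wider queries only run where results are sparse.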

3. Use COUNT to limit results

# Return at most 20 results
GEOSEARCH key FROMLONLAT 116.40 39.90 BYRADIUS 10 km COUNT 20

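Choosing a data type: summary

Scenario	Recommended type	Reason
User check-in	Bitmap	Very memory-efficient, supports bitwise operations
Online users	Bitmap	Fast counting, low memory
UV counting	HyperLogLog	Bounded memory, suits large data sets
People or shops nearby	Geospatial	Built-in distance calculation
Distinct count (exact)	Set	Exact but memory-hungry
Distinct count (approximate)	HyperLogLog	Low memory but with some error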

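Summary

This chapter covered three advanced Redis data types:

Bitmap: bit operations on a String; suits check-ins, user tags, and Bloom filters, with very high memory efficiency.
HyperLogLog: a probabilistic data structure for estimating set cardinality; uses at most about 12 KB per key and suits UV counting and similar large-scale statistics.
Geospatial: a geospatial index built on the Sorted Set; supports storing positions, computing distances, and range queries, and suits location-based services.

Exercises

Use a Bitmap to implement a reward system for consecutive check-ins
Use HyperLogLog to implement daily, weekly, and monthly UV statistics for a website
Use Geospatial to implement a simple "people nearby" feature
Compare the memory usage of a Set and a HyperLogLog when counting 10 million UV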
