MariaDB Chinese Community

Containerization and Orchestration (Docker / k8s / Helm)

From local Compose to a production K8s Operator, with health checks, PVCs, and backup CronJobs

Which one should you choose?

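Scenario | Recommendation
Local development / CI | Single Docker container
A small project deployed on one server | Docker Compose
Mid-size team, multiple services | Compose + Caddy/Traefik, or managed RDS
Cross-environment consistency, K8s already in place | MariaDB Operator + Helm
Fully managed, minimal operational effort | Do not self-manage; use cloud RDS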

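Warning: self-managing a database on K8s is an order of magnitude harder than self-managing stateless services. Backups, HA, network partitions, and volume migration are each a pitfall. If your team has fewer than 5 people, a managed service is strongly recommended.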

Docker (for development)

docker run -d --name mariadb-dev \
  -e MARIADB_ROOT_PASSWORD=dev \
  -e MARIADB_DATABASE=app \
  -p 3306:3306 \
  -v mariadb-data:/var/lib/mysql \
  mariadb:11.4

Run tests on tmpfs (roughly 10× faster; the data lives in RAM and is discarded when the container stops)

docker run -d --rm --name mariadb-test \
  --tmpfs /var/lib/mysql:rw,size=2g \
  -e MARIADB_ROOT_PASSWORD=test \
  -e MARIADB_DATABASE=test \
  -p 3307:3306 \
  mariadb:11.4
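
A test runner usually has to wait until the throwaway container accepts connections before it starts. The sketch below is one minimal way to do that in Python using only the standard library; the function name `wait_for_port` and its default timings are illustrative assumptions, not part of any MariaDB tooling. Note that with a published Docker port the TCP connect can succeed before the server has finished initializing, so a real test setup would follow this with an actual query.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # An open port only proves something is listening; it does not
            # prove MariaDB has finished initializing its data directory.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For the container above, a test harness would call `wait_for_port("127.0.0.1", 3307)` because 3307 is the host port published by `mariadb-test`.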

Custom my.cnf

# my.cnf
[mariadb]
innodb_buffer_pool_size=1G
innodb_log_file_size=512M
innodb_flush_log_at_trx_commit=1
slow_query_log=1
long_query_time=1

# Mount it into the container
docker run -d --name mariadb-dev \
  -v "$PWD/my.cnf":/etc/mysql/conf.d/custom.cnf:ro \
  -v mariadb-data:/var/lib/mysql \
  mariadb:11.4

Docker Compose

# docker-compose.yml
services:
  mariadb:
    image: mariadb:11.4
    restart: unless-stopped
    environment:
      MARIADB_ROOT_PASSWORD: ${ROOT_PASS}
      MARIADB_DATABASE: app
      MARIADB_USER: app
      MARIADB_PASSWORD: ${APP_PASS}
    volumes:
      - mariadb-data:/var/lib/mysql
      - ./my.cnf:/etc/mysql/conf.d/custom.cnf:ro
    ports:
      - "127.0.0.1:3306:3306"   # bind to localhost only
    healthcheck:
      test: ["CMD", "mariadb-admin", "ping", "-h", "localhost", "-u", "root", "-p${ROOT_PASS}"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  app:
    image: yourapp:latest
    depends_on:
      mariadb:
        condition: service_healthy
    environment:
      DATABASE_URL: mysql://app:${APP_PASS}@mariadb:3306/app?charset=utf8mb4

volumes:
  mariadb-data:
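
On the application side, the `DATABASE_URL` value has to be split back into connection parameters. The following is a minimal sketch using Python's standard `urllib.parse`; the function name `parse_database_url` and the returned dictionary layout are assumptions for illustration, and most frameworks ship their own parser. One practical consequence: if `APP_PASS` contains characters such as `@` or `/`, it must be percent-encoded inside the URL.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def parse_database_url(url: str) -> dict:
    """Split a mysql:// URL into host, port, credentials, database and charset."""
    parts = urlsplit(url)
    if parts.scheme != "mysql":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port or 3306,  # fall back to the default MariaDB port
        # Credentials are not decoded by urlsplit, so undo percent-encoding here.
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "database": parts.path.lstrip("/"),
        "charset": query.get("charset", ["utf8mb4"])[0],
    }
```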

Adding automatic backups

  backup:
    image: mariadb:11.4
    depends_on:
      mariadb:
        condition: service_healthy
    volumes:
      - ./backups:/backups
    environment:
      MYSQL_PWD: ${ROOT_PASS}
    entrypoint: ["sh", "-c"]
    command:
      - |
        while true; do
          ts=$$(date +%F-%H%M)
          mariadb-dump -h mariadb -u root --single-transaction --routines --triggers \
            --all-databases | gzip > /backups/dump-$$ts.sql.gz
          find /backups -name 'dump-*.sql.gz' -mtime +7 -delete
          sleep 86400
        done
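
The retention rule in the loop above (`find ... -mtime +7 -delete`) is easy to misread: `find` truncates a file's age to whole days, so `+7` matches files that are at least 8 full days old. The sketch below reproduces that selection in Python so it can be checked in isolation; the function name `expired_dumps` is hypothetical, and unlike `find` it does not descend into subdirectories.

```python
import time
from pathlib import Path
from typing import List, Optional


def expired_dumps(backup_dir: Path, keep_days: int = 7, now: Optional[float] = None) -> List[Path]:
    """Return dump files that `find -mtime +keep_days` would select."""
    now = time.time() if now is None else now
    expired = []
    for path in sorted(backup_dir.glob("dump-*.sql.gz")):
        # find ignores the fractional part of the age, so 7.9 days counts as 7.
        age_days = int((now - path.stat().st_mtime) // 86400)
        if age_days > keep_days:
            expired.append(path)
    return expired
```

Listing the candidates before deleting them is a cheap way to confirm the retention window matches expectations.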

Kubernetes (production)

Three common options:

Option | Maintainer | Complexity
mariadb-operator | mmontes / community | Simple
Bitnami Charts | Bitnami | Medium
KubeDB | AppsCode (commercial) | Complex

Getting started with mariadb-operator

helm repo add mariadb-operator https://helm.mariadb.com/mariadb-operator
helm install mariadb-operator mariadb-operator/mariadb-operator \
  --namespace mariadb-operator --create-namespace

Define a MariaDB CR:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-prod
  namespace: app
spec:
  rootPasswordSecretKeyRef:
    name: mariadb-root
    key: password

  database: app
  username: app
  passwordSecretKeyRef:
    name: mariadb-app
    key: password

  image: mariadb:11.4

  port: 3306
  replicas: 3                    # 1 primary, 2 replicas

  replication:
    enabled: true
    primary:
      podIndex: 0
      automaticFailover: true
    replica:
      waitPoint: AfterSync

  storage:
    size: 100Gi
    storageClassName: gp3

  myCnf: |
    [mariadb]
    bind-address=*
    default_storage_engine=InnoDB
    innodb_buffer_pool_size=4G
    innodb_flush_log_at_trx_commit=1
    sync_binlog=1
    max_connections=500
    slow_query_log=1
    long_query_time=1
    binlog_format=ROW
    log-bin

  resources:
    requests:
      memory: 6Gi
      cpu: 2
    limits:
      memory: 8Gi
      cpu: 4

  metrics:
    enabled: true                # built-in exporter + ServiceMonitor

  service:
    type: ClusterIP

  primaryService:
    type: ClusterIP

Automatic backup CR

apiVersion: k8s.mariadb.com/v1alpha1
kind: Backup
metadata:
  name: nightly-backup
  namespace: app
spec:
  mariaDbRef: { name: mariadb-prod }
  schedule:
    cron: "0 3 * * *"            # daily at 03:00
    suspend: false
  maxRetention: 720h             # 30 days
  storage:
    s3:
      bucket: my-backups
      prefix: mariadb-prod
      endpoint: s3.amazonaws.com
      region: us-east-1
      accessKeyIdSecretKeyRef: { name: s3-creds, key: access }
      secretAccessKeySecretKeyRef: { name: s3-creds, key: secret }
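
Retention values such as `720h` are easy to get wrong by a factor when converting to days. The sketch below checks the arithmetic, assuming the field takes Go-style duration strings (as the `720h` value suggests); it handles only whole-number `h`, `m`, and `s` components, and the function name `parse_duration` is illustrative.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> timedelta:
    """Parse a duration like '720h' or '1h30m' into a timedelta."""
    pos = 0
    total = 0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            raise ValueError(f"unexpected text in duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Reject empty input and trailing characters this sketch does not understand.
    if pos == 0 or pos != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=total)
```

With this, `parse_duration("720h")` equals `timedelta(days=30)`, which matches the 30-day comment in the CR.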

Restore CR (one-step restore)

apiVersion: k8s.mariadb.com/v1alpha1
kind: Restore
metadata:
  name: restore-from-2026-05-17
spec:
  mariaDbRef: { name: mariadb-restore-target }
  s3:
    bucket: my-backups
    key: mariadb-prod/2026-05-17/dump.sql.gz

Bitnami Helm Chart

helm install mariadb bitnami/mariadb \
  --set auth.rootPassword="$ROOT_PASS" \
  --set auth.database=app \
  --set primary.persistence.size=100Gi \
  --set primary.resources.requests.memory=4Gi \
  --set primary.configuration="[mariadb]
innodb_buffer_pool_size=3G
slow_query_log=1
" \
  --set metrics.enabled=true

Simple to install, but you manage the HA complexity yourself.

Health check details

Liveness vs Readiness

livenessProbe:
  exec:
    command:
      - mariadb-admin
      - ping
      - -h
      - localhost
  initialDelaySeconds: 60
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3

readinessProbe:
  exec:
    command:
      - sh
      - -c
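      - |
        mariadb -h localhost -u root -p"$MARIADB_ROOT_PASSWORD" -e 'SELECT 1' >/dev/null || exit 1
        status=$(mariadb -h localhost -u root -p"$MARIADB_ROOT_PASSWORD" -e 'SHOW REPLICA STATUS\G') || exit 1
        # A primary returns no rows; only a replica must have a running IO thread.
        [ -z "$status" ] || echo "$status" | grep -q 'Slave_IO_Running: Yes'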
  initialDelaySeconds: 30
  periodSeconds: 10
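
The readiness intent is: the server must answer a query, and if it is a replica, its IO thread must be running. A probe that ends in an unconditional `|| true` can never fail, so it helps to pin the decision down separately. The sketch below expresses that decision in Python over the vertical output of `SHOW REPLICA STATUS`; the function names are hypothetical, and it checks only `Slave_IO_Running`, as the probe does.

```python
def parse_vertical(output: str) -> dict:
    """Parse vertical 'Key: value' lines into a dict; row separator lines are skipped."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and not line.lstrip().startswith("*"):
            fields[key.strip()] = value.strip()
    return fields


def is_ready(select_ok: bool, replica_status_output: str) -> bool:
    """Decide readiness from the SELECT 1 result and the replica status text."""
    if not select_ok:
        return False
    status = parse_vertical(replica_status_output)
    if not status:
        # No rows means this instance is not configured as a replica (a primary).
        return True
    return status.get("Slave_IO_Running") == "Yes"
```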

startup probe (for databases that start slowly)

startupProbe:
  exec:
    command: [mariadb-admin, ping, -h, localhost]
  failureThreshold: 60
  periodSeconds: 5

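This keeps the liveness probe from killing the pod during startup. With failureThreshold 60 and periodSeconds 5, the database gets up to about 300 seconds to come up.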
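
The time a startup probe grants is roughly the failure threshold multiplied by the period, plus any initial delay. A small helper makes it easy to compare that budget against an observed recovery time; the function name and the idea of checking against a measured startup time are illustrative assumptions.

```python
def startup_budget_seconds(failure_threshold: int, period_seconds: int,
                           initial_delay_seconds: int = 0) -> int:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + failure_threshold * period_seconds


def budget_covers(observed_startup_seconds: int, failure_threshold: int,
                  period_seconds: int) -> bool:
    """True if the probe settings leave room for the observed startup time."""
    return startup_budget_seconds(failure_threshold, period_seconds) >= observed_startup_seconds
```

For the probe above, `startup_budget_seconds(60, 5)` gives 300, so a crash recovery that takes ten minutes would still be killed and needs a larger threshold.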

Storage

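Option | Strength | Drawback
EBS gp3 / Cloud SSD | Durable data | Moving across nodes requires detach/attach
Local NVMe | Fastest | Data is lost if the node fails
OpenEBS / Longhorn | Cross-node replication | Performance overhead
Rook Ceph | Full-featured | Complex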

Rule of thumb: a cloud provider's CSI driver with EBS-style block storage, combined with MariaDB replication for HA, is a simple and stable combination.

Exposing the database to applications

ClusterIP (recommended)

When the application pods run in the same cluster, connect through internal DNS:

mariadb-prod.app.svc.cluster.local:3306
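
The address follows the usual Kubernetes pattern of service name, namespace, `svc`, and the cluster domain. A tiny helper, shown here as a sketch with a hypothetical function name, keeps that pattern in one place; `cluster.local` is the common default domain, but a cluster can be configured with a different one.

```python
def service_dns(service: str, namespace: str, port: int = 3306,
                cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster address <service>.<namespace>.svc.<domain>:<port>."""
    return f"{service}.{namespace}.svc.{cluster_domain}:{port}"
```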

LoadBalancer

Use this only when external access is required (for example, BI tools). An IP allowlist plus enforced TLS is strongly recommended.

Ingress / Gateway

MariaDB speaks its own TCP protocol, so it cannot go through an HTTP Ingress. Use an NLB / TCP LoadBalancer instead.

Monitoring

metrics:
  enabled: true

The operator runs mysqld_exporter automatically; pairing the resulting ServiceMonitor with your PrometheusRule alerts is all that is needed.

Dashboard: use Grafana's "MySQL Overview" dashboard, which is compatible with MariaDB.

Upgrades

spec:
  image: mariadb:11.4   # change to mariadb:11.4.5 (patch release)
  updateStrategy:
    type: RollingUpdate

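Within a release series (patch releases such as 11.4.x): a rolling update works normally.

Across release series (11.4 → 11.8 → 12.x): back up → build a new cluster → switch traffic. Do not upgrade in place.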
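
The upgrade rule can be sketched as a small decision function: compare the first two version components, and only allow a rolling update when they match. This is a minimal illustration with hypothetical function names and return values; it assumes plain numeric image tags such as `11.4` or `11.4.5`.

```python
def release_series(tag: str) -> tuple:
    """Return (major, minor) for a tag like '11.4' or '11.4.5'."""
    parts = tag.split(".")
    return int(parts[0]), int(parts[1])


def upgrade_plan(current: str, target: str) -> str:
    """Pick an upgrade approach based on whether the release series changes."""
    cur, new = release_series(current), release_series(target)
    if new < cur:
        raise ValueError("downgrades are not covered by this sketch")
    # Same series: patch-level change, safe for a rolling update.
    # Different series: back up, build a new cluster, then switch traffic.
    return "rolling-update" if cur == new else "backup-and-migrate"
```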

Common pitfalls

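  1. CrashLoopBackOff on first start: the probes begin too early. Add a startupProbe.
  2. OOMKilled: the memory limit is too tight or innodb_buffer_pool_size is too large. Keep the memory limit ≥ buffer pool + 1G of overhead.
  3. PVC cannot be attached on another node: the CSI driver is misconfigured or the nodes are in different AZs. Use a StorageClass whose volumes can move within the region.
  4. Replication lag spikes: the replica nodes lack CPU/IO, or binlog_format=STATEMENT triggers slow operations.
  5. Service discovery fails: a headless Service and a regular Service were mixed up. The operator usually handles this itself.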

