Today my production application crashed, and I found that the Redis master node deployed in my Kubernetes cluster had gone down. This is the log output:
1:M 24 Dec 2024 09:05:19.294 * Starting automatic rewriting of AOF on 100% growth
1:M 24 Dec 2024 09:05:19.313 * Creating AOF incr file appendonly.aof.854.incr.aof on background rewrite
1:M 24 Dec 2024 09:05:19.314 * Background append only file rewriting started by pid 1203630
1203630:C 24 Dec 2024 09:05:19.804 * Successfully created the temporary AOF base file temp-rewriteaof-bg-1203630.aof
1203630:C 24 Dec 2024 09:05:19.805 * Fork CoW for AOF rewrite: current 1 MB, peak 1 MB, average 0 MB
1:M 24 Dec 2024 09:05:19.817 * Background AOF rewrite terminated with success
1:M 24 Dec 2024 09:05:19.821 * Successfully renamed the temporary AOF base file temp-rewriteaof-bg-1203630.aof into appendonly.aof.854.base.rdb
1:M 24 Dec 2024 09:05:19.837 * Removing the history file appendonly.aof.853.incr.aof in the background
1:M 24 Dec 2024 09:05:19.842 * Removing the history file appendonly.aof.853.base.rdb in the background
1:M 24 Dec 2024 09:05:19.860 * Background AOF rewrite finished successfully
1:M 25 Dec 2024 14:11:51.028 # Fail to fsync the AOF file: Permission denied
1:signal-handler (1735135933) Received SIGTERM scheduling shutdown...
1:M 25 Dec 2024 14:12:13.693 # User requested shutdown...
1:M 25 Dec 2024 14:12:13.693 * Calling fsync() on the AOF file.
1:M 25 Dec 2024 14:12:13.693 # Redis is now ready to exit, bye bye...
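To line the events up, the epoch value in the signal-handler line can be converted and compared with the timestamp of the fsync error. This is only a minimal sketch: it assumes the log timestamps are in UTC (my assumption, the log itself does not say so), and the constant and function names are made up for illustration.

```python
from datetime import datetime, timezone

# Values copied from the log above.
SIGTERM_EPOCH = 1735135933                 # from "1:signal-handler (1735135933)"
FSYNC_ERROR = "25 Dec 2024 14:11:51.028"   # "Fail to fsync the AOF file"
SHUTDOWN = "25 Dec 2024 14:12:13.693"      # "User requested shutdown..."


def parse_redis_time(stamp: str) -> datetime:
    """Parse a Redis log timestamp, ASSUMING it is expressed in UTC."""
    parsed = datetime.strptime(stamp, "%d %b %Y %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


sigterm_at = datetime.fromtimestamp(SIGTERM_EPOCH, tz=timezone.utc)
fsync_error_at = parse_redis_time(FSYNC_ERROR)
shutdown_at = parse_redis_time(SHUTDOWN)

# Time between the first error and the termination signal.
gap_seconds = (sigterm_at - fsync_error_at).total_seconds()

print("SIGTERM received at:", sigterm_at.isoformat())
print("Seconds between fsync error and SIGTERM:", gap_seconds)
```

Under that assumption the SIGTERM lands at 14:12:13 UTC, which matches the "User requested shutdown" line, and it arrives about 22 seconds after the fsync error.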
The master node uses NFS for storage. This Redis node had been running fine for more than 3 months. Why did it suddenly output this error? BTW, the system was not facing peak traffic at the time. This is the Redis deployment in the Kubernetes cluster:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: reddwarf-redis-master
  namespace: reddwarf-cache
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: master
      app.kubernetes.io/instance: reddwarf-redis
      app.kubernetes.io/name: redis
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: master
        app.kubernetes.io/instance: reddwarf-redis
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: redis
        helm.sh/chart: redis-17.8.0
      annotations:
        checksum/configmap: 58007696c1a7ce9392e17b3fbc0176ecb231bd1e89e424833ae87abe6180ba70
        checksum/health: 617c81318616bf8cc32a4e21b4435b4c46859317d4453830d73eb1039493b9ac
        checksum/scripts: 92ddb483b2fac55ea06a3c2c48cf2d866e10764d4b75ac10fbf25f346ed20c9b
        checksum/secret: 35b6246ede69a46a5b9831e24c6072747d26e626c21eb90a24df49de3531e70c
        kubectl.kubernetes.io/restartedAt: '2024-07-04T14:47:55Z'
    spec:
      volumes:
        - name: start-scripts
          configMap:
            name: reddwarf-redis-scripts
            defaultMode: 493
        - name: health
          configMap:
            name: reddwarf-redis-health
            defaultMode: 493
        - name: config
          configMap:
            name: reddwarf-redis-configuration
            defaultMode: 420
        - name: redis-tmp-conf
          emptyDir: {}
        - name: tmp
          emptyDir: {}
      containers:
        - name: redis
          image: >-
            registry.cn-hangzhou.aliyuncs.com/reddwarf-public/redis:7.0.8-debian-11-r13
          command:
            - /bin/bash
          args:
            - '-c'
            - /opt/bitnami/scripts/start-scripts/start-master.sh
          ports:
            - name: redis
              containerPort: 6379
              protocol: TCP
          env:
            - name: BITNAMI_DEBUG
              value: 'false'
            - name: REDIS_REPLICATION_MODE
              value: master
            - name: ALLOW_EMPTY_PASSWORD
              value: 'no'
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: reddwarf-redis
                  key: redis-password
            - name: REDIS_TLS_ENABLED
              value: 'no'
            - name: REDIS_PORT
              value: '6379'
          resources: {}
          volumeMounts:
            - name: start-scripts
              mountPath: /opt/bitnami/scripts/start-scripts
            - name: health
              mountPath: /health
            - name: redis-data
              mountPath: /data
            - name: config
              mountPath: /opt/bitnami/redis/mounted-etc
            - name: redis-tmp-conf
              mountPath: /opt/bitnami/redis/etc/
            - name: tmp
              mountPath: /tmp
          livenessProbe:
            exec:
              command:
                - sh
                - '-c'
                - /health/ping_liveness_local.sh 5
            initialDelaySeconds: 20
            timeoutSeconds: 6
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            exec:
              command:
                - sh
                - '-c'
                - /health/ping_readiness_local.sh 1
            initialDelaySeconds: 20
            timeoutSeconds: 2
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 1001
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: reddwarf-redis
      serviceAccount: reddwarf-redis
      securityContext:
        fsGroup: 1001
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: master
                    app.kubernetes.io/instance: reddwarf-redis
                    app.kubernetes.io/name: redis
                topologyKey: kubernetes.io/hostname
      schedulerName: default-scheduler
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: redis-data
        creationTimestamp: null
        labels:
          app.kubernetes.io/component: master
          app.kubernetes.io/instance: reddwarf-redis
          app.kubernetes.io/name: redis
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
        volumeMode: Filesystem
      status:
        phase: Pending
  serviceName: reddwarf-redis-headless
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: RollingUpdate
  revisionHistoryLimit: 10
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
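Because the SIGTERM arrived roughly 22 seconds after the fsync error, one thing I would like to rule in or out is whether the liveness probe restarted the container. I have not confirmed that the probe was actually failing. The sketch below only estimates how long the probe settings above would take to trigger a restart, under the simplifying assumptions stated in the comments; the function and constant names are hypothetical.

```python
# Values copied from the livenessProbe section of the manifest above.
PERIOD_SECONDS = 5
FAILURE_THRESHOLD = 5

# Approximate gap in the log between the fsync error (14:11:51)
# and the SIGTERM (14:12:13).
OBSERVED_GAP_SECONDS = 22


def restart_window(period: int, threshold: int) -> tuple[int, int]:
    """Rough bounds, in seconds, from the start of a fault until
    `threshold` consecutive probe failures have been observed.

    Simplifications (assumptions, not measured behavior): every failing
    probe returns immediately, and the first failing probe runs somewhere
    between 0 and `period` seconds after the fault begins.
    """
    earliest = (threshold - 1) * period
    latest = threshold * period
    return earliest, latest


earliest, latest = restart_window(PERIOD_SECONDS, FAILURE_THRESHOLD)
within = earliest <= OBSERVED_GAP_SECONDS <= latest

print(f"Estimated restart window: {earliest}-{latest} s after the fault")
print(f"Observed gap of {OBSERVED_GAP_SECONDS} s falls inside it: {within}")
```

With these settings the estimated window is 20 to 25 seconds, so the observed gap is at least consistent with a liveness-probe restart. That does not explain why fsync returned "Permission denied" on the NFS volume in the first place.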