Thiết kế website giá rẻ

Question

We are experiencing frequent outages with our AWS Aurora MySQL RDS (version 8.0.36) cluster since upgrading from MySQL 5.7. Over the past two months, the production database has gone down three times, and each time recovery has been difficult.

Env Details

Version: 8.0.36
Cluster Configuration: RDS Proxy in front of the DB cluster
Upgrade Path: MySQL 5.7 → MySQL 8.0.36

Steps taken

Added DB Proxy: Implemented a DB proxy in front of the DB cluster to manage connections.
Reached Out to AWS Support: Engaged AWS support for assistance.
Disabled Parallel Query: Turned off ParallelQuery.
Query Optimization: Analyzed and optimized heavy queries (does not
appear to be a query-related issue).

Problem
When the issue occurs, the database enters a failing state, showing the following behaviors:

Cannot be recovered without intervention.
Logs indicate an assertion failure in log0grover.cc.
InnoDB initialization fails, suggesting potential corruption in the InnoDB tablespace.

Log

241220  9:45:23 server_audit: Audit STARTED.
Found DAS config file, trying to load DAS switcher from DAS config file.
2024-12-20 09:45:23 70369552721040:[DAS][INFO]: Calculated persistence threads 4
aurora_enable_das:0
241220  9:45:23 server_audit: server_audit_incl_users set to ''.
241220  9:45:23 server_audit: server_audit_excl_users set to ''.
2024-12-20T09:45:23.681788Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started. (ha_innodb.cc:14071)
2024-12-20T09:45:23.681848Z 1 [Note] [MY-013547] [InnoDB] Atomic write disabled (ha_innodb.cc:14095)
2024-12-20T09:45:23.681884Z 1 [Note] [MY-012932] [InnoDB] PUNCH HOLE support available (srv0start.cc:2138)
2024-12-20T09:45:23.681902Z 1 [Note] [MY-012944] [InnoDB] Uses event mutexes (srv0start.cc:2183)
2024-12-20T09:45:23.681913Z 1 [Note] [MY-012945] [InnoDB] GCC builtin __atomic_thread_fence() is used for memory barrier (srv0start.cc:2184)
2024-12-20T09:45:23.681945Z 1 [Note] [MY-012948] [InnoDB] Compressed tables use zlib 1.2.13 (srv0start.cc:2202)
2024-12-20T09:45:23.684704Z 1 [Note] [MY-012951] [InnoDB] Using hardware accelerated crc32 and polynomial multiplication. (srv0start.cc:2234)
2024-12-20T09:45:23.685080Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './' (srv0start.cc:2272)
2024-12-20T09:45:23.686379Z 1 [Note] [MY-012955] [InnoDB] Initializing buffer pool, total size = 19.503906G, instances = 4, chunk size =4.875977G  (srv0start.cc:2355)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
 (buf0buf.cc:1803)
2024-12-20T09:45:29.083910Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: log0grover.cc:807:0 thread 70370132519312 (ut0dbg.cc:58)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2024-12-20T09:45:29Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=569da3248b99719cc62f4997c36af088a01878a6
Thread pointer: 0x400043494000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 400052c060b8 thread_stack 0x40000
/rdsdbbin/oscar/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2c) [0x2b76b2c]
/rdsdbbin/oscar/bin/mysqld(print_fatal_signal(int)+0x37c) [0x1473fdc]
/rdsdbbin/oscar/bin/mysqld(my_server_abort()+0x98) [0x14745d8]
/rdsdbbin/oscar/bin/mysqld(my_abort()+0x14) [0x2b6ced4]
/rdsdbbin/oscar/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x2b0) [0x2ea0570]
/rdsdbbin/oscar/bin/mysqld(onGroverErrorImpl(int, char const*, int)+0x244) [0x10395e4]
/rdsdbbin/oscar/bin/mysqld(log_grover_open(char const*, char const*, char const*, char const*, char const*, char const*, std::vector<engineShmSegAddrName, std::allocator<engineShmSegAddrName> >&)+0x14c8) [0x2ed1a48]
/rdsdbbin/oscar/bin/mysqld(srv_start(bool)+0x1698) [0x2e59f38]
/rdsdbbin/oscar/bin/mysqld() [0x2cfbcdc]
/rdsdbbin/oscar/bin/mysqld(dd::bootstrap::DDSE_dict_init(THD*, dict_init_mode_t, unsigned int)+0xf8) [0x280bfd8]
/rdsdbbin/oscar/bin/mysqld(dd::upgrade_57::do_pre_checks_and_initialize_dd(THD*)+0x104) [0x2b3f6e4]
/rdsdbbin/oscar/bin/mysqld() [0x15b6b68]
/rdsdbbin/oscar/bin/mysqld() [0x30fe508]
/lib64/libpthread.so.0(+0x7230) [0x40002fde0230]
/lib64/libc.so.6(+0xdb7dc) [0x4000302117dc]

Has anyone encountered this issue with Aurora MySQL 8.0.x?

Are there recommended steps to verify and fix InnoDB corruption in this specific context?

Any insights on whether upgrading to a more recent MySQL 8.0.x version could address this?

I can provide more logs if needed

Thiết kế website giá rẻ

Danh mục

AWS Aurora MySQL RDS 8.0.36 Outages Post-Upgrade (Assertion Failure in log0grover.cc) [closed]