Saturday, December 9, 2023

Read How GitHub.com Upgraded Their 1200+ MySQL Servers to MySQL 8.0


GitHub has just shared details of how they upgraded more than 1200 MySQL servers to version 8.0. The world's largest software repository, built on Ruby on Rails from the beginning, has always used MySQL as its database. After years on version 5.x, they have now completed the upgrade to MySQL 8.0 and describe the process on their blog.


To give a sense of scale, GitHub's fleet of more than 1200 MySQL instances is spread across Azure data centers and bare metal hosts in their own data center. Together they store 300 TB of data and serve 5.5 million queries per second across more than 50 database clusters.


For the full details of the upgrade process, see GitHub's own blog post.

GitHub planned the upgrade carefully, with a focus on maintaining high availability and meeting their Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Here is a summary of the key points:

Preparation for the Upgrade:

Started in July 2022 with various milestones before upgrading production databases.

Prepared infrastructure for the upgrade, determining default values for MySQL 8.0 and benchmarking performance.

Ensured application compatibility by adding MySQL 8.0 to Continuous Integration (CI) and enabling pre-built containers for debugging.

Communication and Transparency:

Utilized GitHub Projects for a rolling calendar to communicate and track the upgrade schedule.

Created issue templates to coordinate the upgrade between application and database teams.

Upgrade Plan:

Implemented a gradual upgrade strategy with checkpoints and rollbacks.

Step 1 involved rolling replica upgrades with careful monitoring.

Step 2 updated the replication topology to include both 5.7 and 8.0 replicas.

Step 3 promoted a MySQL 8.0 replica to primary through a graceful failover.

Step 4 upgraded internal-facing instance types.

Step 5 involved cleanup, removing 5.7 servers after successful validation.
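The five steps above can be read as an ordered pipeline in which each step must pass a checkpoint before the next one starts. The following is only a rough sketch of that idea, not GitHub's tooling: the `Step` names and the `run_upgrade` function are hypothetical, and the checkpoint callables stand in for whatever monitoring a real team would use.

```python
from enum import Enum
from typing import Callable, Dict, List


class Step(Enum):
    """Upgrade steps in the order listed above (names are illustrative)."""
    UPGRADE_REPLICAS = 1   # rolling replica upgrades
    UPDATE_TOPOLOGY = 2    # topology holds both 5.7 and 8.0 replicas
    PROMOTE_PRIMARY = 3    # graceful failover to an 8.0 primary
    UPGRADE_INTERNAL = 4   # internal-facing instance types
    CLEANUP = 5            # remove 5.7 servers after validation


def run_upgrade(checkpoints: Dict[Step, Callable[[], bool]]) -> List[Step]:
    """Walk the steps in order and stop at the first failing checkpoint.

    Returns the steps that completed, so the caller knows how far the
    cluster got and from where a rollback would have to start.
    A step without a registered checkpoint is assumed to pass.
    """
    completed: List[Step] = []
    for step in Step:  # Enum iteration follows definition order
        check = checkpoints.get(step, lambda: True)
        if not check():
            break
        completed.append(step)
    return completed
```

For example, if the checkpoint for `Step.PROMOTE_PRIMARY` reports a problem, `run_upgrade` returns only the first two steps, and the cluster stays on its 5.7 primary.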

Ability to Roll Back:

Maintained the ability to roll back read replicas to MySQL 5.7 by keeping enough 5.7 replicas online.

Ensured backward data replication from 8.0 to 5.7 for the primary database.

Overcame challenges related to character collation and role management during rollback.
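"Keeping enough online" is essentially a capacity question: the remaining 5.7 replicas must be able to absorb read traffic on their own if the 8.0 replicas are pulled. Here is a minimal sketch of that arithmetic. The function names and the per-replica throughput figure are assumptions made for illustration and do not come from GitHub's post.

```python
import math


def min_replicas_needed(peak_read_qps: float, qps_per_replica: float) -> int:
    """Smallest number of replicas that can serve the given peak read load."""
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    return math.ceil(peak_read_qps / qps_per_replica)


def can_roll_back(online_57_replicas: int,
                  peak_read_qps: float,
                  qps_per_replica: float) -> bool:
    """True if the 5.7 replicas still online could carry peak reads alone."""
    needed = min_replicas_needed(peak_read_qps, qps_per_replica)
    return online_57_replicas >= needed
```

With an assumed peak of 10,000 read queries per second and 3,000 per replica, four 5.7 replicas would have to stay online for a rollback to remain safe.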

Final Steps:

Upgraded ancillary servers for backups or non-production workloads.

Conducted thorough validation, including a complete 24-hour traffic cycle, before removing 5.7 servers.
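The 24-hour rule can be expressed as a simple gate that must pass before any 5.7 server is removed. This sketch assumes a hypothetical `ready_for_cleanup` helper and an `open_issues` counter. GitHub's post, as summarized here, only states that a full 24-hour traffic cycle was observed.

```python
from datetime import datetime, timedelta

# One complete daily traffic cycle, as described in the summary above.
VALIDATION_WINDOW = timedelta(hours=24)


def ready_for_cleanup(promoted_at: datetime,
                      now: datetime,
                      open_issues: int = 0) -> bool:
    """True once a full window has passed on 8.0 with no open issues."""
    return open_issues == 0 and (now - promoted_at) >= VALIDATION_WINDOW
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the check easy to test.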

Challenges Overcome:

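Addressed a character collation incompatibility (MySQL 8.0 defaults to utf8mb4_0900_ai_ci, a collation that MySQL 5.7 does not recognize) by explicitly setting the default encoding and collation.

Worked around role management issues during the upgrade window, since MySQL 8.0 introduces roles that 5.7 does not support.

Maintained high confidence in backward replication for critical applications such as the GitHub.com monolith, thanks to its consistent Rails configuration.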
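The collation problem listed above can be made concrete with a small check. The collations MySQL 8.0 added on top of Unicode 9.0.0 carry `0900` in their names, and a 5.7 replica does not know them. The sketch below flags such names in a list. The function name is hypothetical, the name-based test is a heuristic rather than an official compatibility check, and in practice the list would come from a query against the server's schema metadata.

```python
from typing import Iterable, List


def collations_unknown_to_57(collations: Iterable[str]) -> List[str]:
    """Return collations that a MySQL 5.7 server would not recognize.

    Heuristic: the UCA 9.0.0 collations introduced in MySQL 8.0
    (for example utf8mb4_0900_ai_ci) all contain "_0900_" in their names.
    """
    return [name for name in collations if "_0900_" in name]
```

Any table or column that shows up in such a report would need its collation pinned to one that both versions understand before backward replication can be trusted.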

GitHub's detailed approach ensured a successful and controlled transition to MySQL 8.0, reflecting their commitment to high standards of availability and performance.
