MySQL Cluster Auto-Recovery

To cut a long story short, I had to accept that there is nothing I can do about the networking issues, and that there are likely to be significant “bursts” of dropped packets that will disconnect our MySQL 8 cluster nodes. The first step was to adjust the networking-related sysctl values. As the MySQL cluster is fairly heavily utilised, with a limit of 500 connections, it was worth reviewing the file-handle and network-related settings: net.core.somaxconn, net.core.netdev_max_backlog, net.core.rmem*, net.core.wmem*, etc.

The winner is:

It returns the following set of useful information that can be used for alerts and dashboards:

On the netdata side, the netdata user has to be granted the SELECT privilege:

I then extended two files:

In the first one, I’ve added a new query to the Service definition:

And I made a small change to the health.d/mysql.conf file:

With that in place, we get Slack notifications whenever a node goes down and leaves the cluster.
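For illustration only, the logic behind such an alert can be sketched in a few lines of Python. This is not the netdata alarm itself, and it does not reproduce the query used above: the row keys `host` and `state`, the healthy value `ONLINE`, and the host names are all assumptions made for the example. The idea is simply to compare the nodes we expect against what the cluster currently reports.

```python
from typing import Dict, Iterable, Mapping

# Assumption: a healthy node reports this state. Adjust to whatever
# your membership query actually returns.
HEALTHY_STATE = "ONLINE"


def nodes_needing_attention(
    rows: Iterable[Mapping[str, str]],
    expected_hosts: Iterable[str],
) -> Dict[str, str]:
    """Return {host: reason} for expected nodes that are missing or unhealthy.

    `rows` is the result of a membership query, one mapping per node,
    with hypothetical keys "host" and "state".
    """
    seen = {row["host"]: row["state"] for row in rows}
    problems: Dict[str, str] = {}
    for host in expected_hosts:
        state = seen.get(host)
        if state is None:
            # The node has left the cluster and is no longer listed.
            problems[host] = "missing from cluster"
        elif state != HEALTHY_STATE:
            problems[host] = f"state is {state}"
    return problems


if __name__ == "__main__":
    # Hypothetical sample data: db3 has dropped out, db2 is rejoining.
    sample_rows = [
        {"host": "db1", "state": "ONLINE"},
        {"host": "db2", "state": "RECOVERING"},
    ]
    for host, reason in nodes_needing_attention(
        sample_rows, ["db1", "db2", "db3"]
    ).items():
        print(f"{host}: {reason}")
```

Checking against a list of expected hosts matters here because a node that has left the cluster may simply disappear from the query result rather than show up with an error state.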

Source: magicofsecurity.com