Recovering from a corrupted MySQL install due to a dying hard disk

Might as well write these steps down in case I ever need them again:

Background: A test box started making a faint high-pitched squealing sound, and then powering off. Happened semi-randomly, but most commonly during periods of hard disk access (such as boot-up). Fixed the hardware problem by replacing the Power Supply Unit ($25 from MSY).


Then the above PSU semi-random-poweroff problem in turn caused the box’s 5-year-old hard disk to start playing up (age, plus having the power repeatedly die mid-write probably doesn’t help any).

Steps for moving to a new hard disk:

  • Buy a new hard disk of equal or larger size.
  • Install it in one of the USB external single hard disk enclosures that supports both an SATA or an IDE hard disk ($23 at MSY).
  • Download, burn, and boot from System Rescue CD (v1.5.6 is the current latest). Take the default boot menu option, and the default keymap.
  • After it boots, turn on and plug in the USB drive.
  • At the shell, see which disk has which device name: fdisk -l
  • Recover from the old to the new disk: ddrescue -b 2M /dev/hda /dev/sda ./ddres.txt
  • The above copied about 17 Mb per second, and claimed zero disk errors were found.
  • Poweroff, and swapped the old disk out of the machine and the new disk in.
  • Boot the new disk, was dropped into a shell during boot due to finding file system errors carried over from the old disk. Run fsck on the affected partion: fsck /dev/hda8 -y
  • Reviewing the SMART warnings in the syslog from the old disk seemed to indicate that it probably was dying, confirming that swapping the disks was the correct course of action.

Then found the previous HDD corruption had in turn corrupted a MySQL database (aren’t cascading failures great?). This manifested itself as at least 10 different MySQL errors/warnings/problems:

Error in /var/log/syslog when starting mysqld: Failed to open log (file ‘/var/log/mysql/mysql-bin.000348’, errno 2) The cheat was to delete the last line, the one referencing the /var/log/mysql/mysql-bin.000348 file, from the /var/log/mysql/mysql-bin.index file. Note that in this case I knew that the last log file contained no updates that mattered, so it really was no loss.
vim /var/log/mysql/mysql-bin.index

MySQL server would no longer start, instead giving a “Fatal error: Can’t open and lock privilege tables: Can’t find file: ‘host’ (errno: 2)” message in the logs. For me, the problem was that the /var/lib/mysql/mysql/host.MYI file was missing. What fixed it: Repair the host table.
mysqld_safe --skip-grant-tables &
mysql
mysql> use mysql
mysql> REPAIR TABLE host USE_FRM;
mysql> exit

Try to reset the host table with useful starting data, to fix this error when starting mysql client: error: ‘Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)’
mysql_fix_privilege_tables
mysqladmin shutdown

To fix these errors in /var/log/syslog :
[ERROR] /usr/sbin/mysqld:Fatal error: Can’t open and lock privilege tables: Table ‘./mysql/db’ is marked as crashed and should be repaired
[ERROR] /usr/sbin/mysqld: Table ‘./mysql/db’ is marked as crashed and should be repaired

cd /var/lib/mysql/mysql
myisamchk db
myisamchk *.MYI

To fix these warnings in the syslog:
myisamchk: warning: Table is marked as crashed
MyISAM-table ‘db.MYI’ is usable but should be fixed

myisamchk -r db

To fix this error in /var/log/syslog :
[ERROR] /usr/sbin/mysqld: Incorrect information in file: ‘./mysql/tables_priv.frm’
This did not work: repair table tables_priv USE_FRM;
Cheated: just copied /var/lib/mysql/mysql/tables_priv.* from another working machine.
chmod -x,o-r,g+w tables_priv.*
chown mysql.mysql tables_priv.*

Recurrence of this error:
ERROR 1045 (28000): Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)
And a new one:
Access denied for user ‘root’@’localhost’ (using password: NO)
… and at this point a “desc user;” showed that the user table file must have had a doubly-claimed inode with another table during the fsck, as it was a completely different user table schema from that found on another machine.
Cheated again: just copied /var/lib/mysql/mysql/user.* over from another working machine.
chown mysql.mysql user.*
/etc/init.d/mysql start

Then to fix: ERROR 1045 (28000): Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)
cat /etc/mysql/debian.cnf
Copy the “password” field’s value for the “debian-sys-maint” user to the clipboard.
mysql
mysql> use mysql
mysql> GRANT ALL PRIVILEGES ON *.* TO 'debian-sys-maint'@'localhost' IDENTIFIED BY 'insert_password_copied_above_from_clipboard' WITH GRANT OPTION;
mysql> exit

To fix this warning in /var/log/syslog: WARNING: mysqlcheck has found corrupt tables
Force a check of all tables:
mysqlcheck -A

Fix for this error when granting privileges: ERROR 126 (HY000): Incorrect key file for table ‘./mysql/db.MYI’; try to repair it
mysql
mysql> GRANT ALL PRIVILEGES ON dbname.* to 'dbuser'@'localhost' IDENTIFIED BY "fakefake";
ERROR 126 (HY000): Incorrect key file for table './mysql/db.MYI'; try to repair it
mysql> use mysql
mysql> REPAIR TABLE db USE_FRM;
+----------+--------+----------+----------------------------------------------------+
| Table    | Op     | Msg_type | Msg_text                                           |
+----------+--------+----------+----------------------------------------------------+
| mysql.db | repair | info     | Wrong bytesec: 255- 37- 32 at 0; Skipped           |
| mysql.db | repair | info     | Found block that points outside data file at 424   |
....
| mysql.db | repair | info     | Found block that points outside data file at 24960 |
| mysql.db | repair | status   | OK                                                 |
+----------+--------+----------+----------------------------------------------------+
151 rows in set (0.01 sec)
mysql> GRANT ALL PRIVILEGES ON dbname.* to 'dbuser'@'localhost' IDENTIFIED BY "fakefake";
Query OK, 0 rows affected (0.00 sec)
mysql> exit
mysqladmin shutdown
/etc/init.d/mysql start

To fix these errors in the syslog on mysqld startup:
/etc/mysql/debian-start[4592]: ERROR 1017 (HY000) at line 116: Can’t find file: ‘columns_priv’ (errno: 2)
/etc/mysql/debian-start[4592]: ERROR 1017 (HY000) at line 516: Can’t find file: ‘proc’ (errno: 2)

mysql
mysql> use mysql
mysql> REPAIR TABLE proc USE_FRM;
mysql> REPAIR TABLE columns_priv USE_FRM;
mysql> exit

… and after all the above, the box powers on, and stays on, the disk errors are gone, the mysqld service starts cleanly, and from a quick cursory glance, the data still looks okay.

1 thought on “Recovering from a corrupted MySQL install due to a dying hard disk

  1. Thanks for sharing! thorough and step by step troubleshooting steps. At least i’ll know what to do when this problem arises

Comments are closed.