Frightening:
On 2013-03-22, the server that hosts the git.kde.org virtual machine was taken down for security updates. Both virtual machines running on the server were shut down without incident, security updates were applied to the host, and the machine was rebooted.
When the host came back up and started the VMs, the VMs immediately showed evidence of file system corruption (the file system in question was ext4). It is not known at this time (and we’ll probably never know) whether this corruption had been silently ongoing for a long period of time, or was the result of something specific that occurred during the shutdown or reboot of the VM or host. There is some evidence to suggest the former, but nothing concrete.
As most of you reading this are well aware, KDE has a series of “anongit” machines whose purpose is to distribute the heavy load across the 1500 hosted Git repositories and to act as backups for the main server. However, when we checked the anongit machines, every single one of them had severely corrupted repositories and many or all repositories were missing.
How could this happen?