OpenStack: Recovering Data from a Failed Instance's Disk
****************************
Qemu-nbd tools in Ubuntu
****************************
In some scenarios, instances are running but are inaccessible through SSH and do not respond to any command. The VNC console could be displaying a boot failure or kernel panic error messages. This could be an indication of file system corruption on the VM itself. If you need to recover files or inspect the content of the instance, qemu-nbd can be used to mount the disk.
To find the instance's directory, grep for the instance name in the libvirt XML files under the common instance path:
# grep -i "instance-name" /var/lib/nova/instances/*/*.xml
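The lookup above can be wrapped in a small helper that prints only the matching directory instead of the matching XML lines. This is a minimal sketch: the function name find_instance_dir and its optional second argument are hypothetical, and it assumes the instance name appears literally in one of the XML files.

```shell
#!/bin/sh
# find_instance_dir NAME [BASE]
# Print the directory of every instance whose XML file mentions NAME.
# BASE defaults to the Nova instance path used in this document.
find_instance_dir() {
    name=$1
    base=${2:-/var/lib/nova/instances}
    # -l prints only the names of matching files, -i ignores case,
    # -F treats NAME as a literal string rather than a regular expression.
    grep -liF -- "$name" "$base"/*/*.xml 2>/dev/null |
    while IFS= read -r xml; do
        dirname "$xml"
    done | sort -u
}
```

For example, calling find_instance_dir instance-0000274a prints the directory that holds that instance's disk file, or nothing if no XML file mentions the name.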
To access the instance's disk (/var/lib/nova/instances/xxxx-instance-uuid-xxxxxx/disk), use the following steps:
1. Suspend the instance using the virsh command.
2. Connect the qemu-nbd device to the disk.
3. Mount the qemu-nbd device.
4. Unmount the device after inspecting.
5. Disconnect the qemu-nbd device.
6. Resume the instance.
If you do not follow steps 4 through 6, OpenStack Compute cannot manage the instance any longer. It fails to respond to any command issued by OpenStack Compute, and it is marked as shut down.
Once you mount the disk file, you should be able to access it and treat it as a collection of normal directories with files and a directory structure. However, we do not recommend that you edit or touch any files because this could change the access control lists (ACLs) that are used to determine which accounts can perform what operations on files and directories. Changing ACLs can make the instance unbootable if it is not already.
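Because skipping the cleanup steps leaves the instance unmanageable, it can help to script the six steps so that the disconnect and resume always run once the earlier steps have succeeded. The following is a sketch under stated assumptions, not a tested recovery tool: the helper names run and inspect_instance and the DRY_RUN variable are hypothetical, the root file system is assumed to be the first partition, and the disk is mounted read-only in line with the advice above not to modify files. By default it only prints the commands.

```shell
#!/bin/sh
# run CMD...: echo the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = 0 ] || return 0
    "$@"
}

# inspect_instance DOMAIN DISK [NBD_DEV] [MOUNT_POINT]
# Walk through the six steps. Each cleanup step runs whenever the
# matching setup step succeeded, even if a later step failed.
inspect_instance() {
    domain=$1
    disk=$2
    nbd=${3:-/dev/nbd0}
    mnt=${4:-/mnt}

    run virsh suspend "$domain" || return 1           # step 1
    if run qemu-nbd -c "$nbd" "$disk"; then           # step 2
        if run mount -o ro "${nbd}p1" "$mnt"; then    # step 3 (read-only)
            run ls -lh "$mnt"                         # inspect
            run umount "$mnt"                         # step 4
        fi
        run qemu-nbd -d "$nbd"                        # step 5
    fi
    run virsh resume "$domain"                        # step 6
}
```

Calling inspect_instance 30 /var/lib/nova/instances/instance-0000274a/disk prints the plan. Setting DRY_RUN=0 and running as root would execute it.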
Suspend the instance using the virsh command, taking note of the internal ID:
# virsh list
Id Name State
----------------------------------
1 instance-00000981 running
2 instance-000009f5 running
30 instance-0000274a running
# virsh suspend 30
Domain 30 suspended
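If you already know the libvirt domain name, the internal ID can be picked out of the listing instead of being read by eye. A minimal sketch, assuming the three-column layout shown above. The function name domain_id is hypothetical.

```shell
#!/bin/sh
# domain_id NAME
# Read `virsh list` output on stdin and print the Id of the domain
# called NAME. Returns non-zero if NAME is not in the listing.
domain_id() {
    awk -v name="$1" '
        $2 == name { print $1; found = 1 }   # column 2 is the domain name
        END { exit !found }                  # non-zero status when not found
    '
}
```

For example, virsh list | domain_id instance-0000274a prints 30 for the listing above. virsh also provides a domid subcommand that performs the same lookup directly.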
Connect the qemu-nbd device to the disk:
# cd /var/lib/nova/instances/instance-0000274a
# ls -lh
total 33M
-rw-rw---- 1 libvirt-qemu kvm 6.3K Jan 15 11:31 console.log
-rw-r--r-- 1 libvirt-qemu kvm 33M Jan 15 22:06 disk
-rw-r--r-- 1 libvirt-qemu kvm 384K Jan 15 22:06 disk.local
-rw-rw-r-- 1 nova nova 1.7K Jan 15 11:30 libvirt.xml
# qemu-nbd -c /dev/nbd0 `pwd`/disk
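The command above assumes that the nbd kernel module is loaded and that /dev/nbd0 is not already in use. The sketch below picks an unused device first. It relies on the assumption that the kernel exposes /sys/block/nbdN/pid only while nbdN is connected. The function name first_free_nbd and its optional argument are hypothetical.

```shell
#!/bin/sh
# first_free_nbd [SYSFS_BLOCK_DIR]
# Print a /dev/nbdN that has no active connection, or return non-zero
# if every device is busy or no nbd devices exist.
# Assumption: /sys/block/nbdN/pid exists only while nbdN is connected.
first_free_nbd() {
    sys=${1:-/sys/block}
    for dev in "$sys"/nbd*; do
        [ -d "$dev" ] || continue        # no nbd devices: module not loaded
        if [ ! -e "$dev/pid" ]; then
            echo "/dev/${dev##*/}"       # strip the sysfs path, keep nbdN
            return 0
        fi
    done
    return 1
}
```

A typical sequence would be to load the module with modprobe nbd max_part=16, then connect the disk to the device that first_free_nbd prints.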
Mount the qemu-nbd device.
The qemu-nbd device tries to export the instance disk's different partitions as separate devices. For example, if vda is the disk and vda1 is the root partition, qemu-nbd exports the device as /dev/nbd0 and /dev/nbd0p1, respectively:
# mount /dev/nbd0p1 /mnt/
You can now access the contents of /mnt, which correspond to the first partition of the instance's disk.
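The partition device name follows the usual Linux convention: when the disk name ends in a digit (nbd0), a "p" is inserted before the partition number, whereas a name such as vda takes the number directly. The helper below illustrates that rule. The function name partition_node is hypothetical.

```shell
#!/bin/sh
# partition_node DISK_NODE N
# Print the device node of partition N on DISK_NODE. A "p" separator
# is inserted when the disk name itself ends in a digit.
partition_node() {
    case $1 in
        *[0-9]) echo "${1}p$2" ;;   # /dev/nbd0 + 1 -> /dev/nbd0p1
        *)      echo "$1$2" ;;      # /dev/vda  + 1 -> /dev/vda1
    esac
}
```

If the partition node does not appear after connecting the disk, the nbd module may have been loaded without partition support, or the partition table may need to be re-read, for example with partprobe.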
To examine the secondary or ephemeral disk, unmount the first disk and reuse /mnt as shown here, or use an alternate mount point if you want both primary and secondary drives mounted at the same time:
# umount /mnt
# qemu-nbd -c /dev/nbd1 `pwd`/disk.local
# mount /dev/nbd1 /mnt/
# ls -lh /mnt/
total 76K
lrwxrwxrwx. 1 root root 7 Jan 15 00:44 bin -> usr/bin
dr-xr-xr-x. 4 root root 4.0K Jan 15 01:07 boot
drwxr-xr-x. 2 root root 4.0K Jan 15 00:42 dev
drwxr-xr-x. 70 root root 4.0K Jan 15 11:31 etc
drwxr-xr-x. 3 root root 4.0K Jan 15 01:07 home
lrwxrwxrwx. 1 root root 7 Jan 15 00:44 lib -> usr/lib
lrwxrwxrwx. 1 root root 9 Jan 15 00:44 lib64 -> usr/lib64
drwx------. 2 root root 16K Jan 15 00:42 lost+found
drwxr-xr-x. 2 root root 4.0K Feb 3 2012 media
drwxr-xr-x. 2 root root 4.0K Feb 3 2012 mnt
drwxr-xr-x. 2 root root 4.0K Feb 3 2012 opt
drwxr-xr-x. 2 root root 4.0K Jan 15 00:42 proc
dr-xr-x---. 3 root root 4.0K Jan 15 21:56 root
drwxr-xr-x. 14 root root 4.0K Jan 15 01:07 run
lrwxrwxrwx. 1 root root 8 Jan 15 00:44 sbin -> usr/sbin
drwxr-xr-x. 2 root root 4.0K Feb 3 2012 srv
drwxr-xr-x. 2 root root 4.0K Jan 15 00:42 sys
drwxrwxrwt. 9 root root 4.0K Jan 15 16:29 tmp
drwxr-xr-x. 13 root root 4.0K Jan 15 00:44 usr
drwxr-xr-x. 17 root root 4.0K Jan 15 00:44 var
Once you have completed the inspection, unmount the mount point and release the qemu-nbd device. If you also connected disk.local to /dev/nbd1, disconnect that device in the same way:
# umount /mnt
# qemu-nbd -d /dev/nbd0
/dev/nbd0 disconnected
Resume the instance using virsh:
# virsh list
Id Name State
----------------------------------
1 instance-00000981 running
2 instance-000009f5 running
30 instance-0000274a paused
# virsh resume 30
Domain 30 resumed
****************************
Libguestfs tools in CentOS 7
****************************
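Install the libguestfs tools with your distribution's package manager:
sudo yum install libguestfs-tools # Fedora/RHEL/CentOS
sudo apt-get install libguestfs-tools # Debian/Ubuntu
Then open the instance's disk file in guestfish: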
[boris@icehouse1 Downloads]$ guestfish --rw -a disk
Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.
Type: 'help' for help on commands
'man' to read the manual
'quit' to quit the shell
><fs> run
><fs> list-filesystems
/dev/sda1: ext4
><fs> mount /dev/sda1 /
><fs> ls /
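For recovering a single file or directory, guestfish can also be run non-interactively. The sketch below opens the image read-only (--ro), lets inspection mount the guest file systems (-i), and copies a path to the host with the copy-out command. The function name recover_file and the DRY_RUN variable are hypothetical. By default the command is only printed.

```shell
#!/bin/sh
# recover_file IMAGE GUEST_PATH DEST_DIR
# Copy GUEST_PATH out of IMAGE into DEST_DIR on the host.
# With DRY_RUN unset or non-zero, only print the guestfish command.
recover_file() {
    image=$1
    guest_path=$2
    dest=$3
    # --ro: never write to the image; -i: inspect and mount automatically.
    set -- guestfish --ro -a "$image" -i copy-out "$guest_path" "$dest"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        # copy-out needs an existing local directory.
        mkdir -p "$dest" && "$@"
    else
        echo "$*"
    fi
}
```

For example, recover_file disk /etc/fstab /tmp/recovered prints the command that would copy the guest's /etc/fstab into /tmp/recovered.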
****************************
Guestmount tools in CentOS 7
****************************
guestmount, which is part of libguestfs-tools, mounts the guest file system on the host through FUSE. The -i option inspects the disk image and mounts its file systems automatically. Run it from the instance's directory and unmount with guestunmount when finished:
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]# ls
console.log disk disk.info libvirt.xml
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]# ls -al
total 13790864
drwxr-xr-x. 2 nova nova 3864 Mar 18 15:07 .
drwxr-xr-x. 20 nova nova 3864 Mar 19 11:01 ..
-rw-rw----. 1 root root 0 Mar 19 11:01 console.log
-rw-r--r--. 1 qemu qemu 14094106624 Mar 19 12:09 disk
-rw-r--r--. 1 nova nova 79 Mar 18 15:07 disk.info
-rw-r--r--. 1 nova nova 2603 Mar 19 10:59 libvirt.xml
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]# guestmount -a disk -i /mnt
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]# ll /mnt/
total 136
dr-xr-xr-x. 2 root root 4096 Mar 18 15:44 bin
dr-xr-xr-x. 4 root root 4096 Apr 16 2014 boot
drwxr-xr-x. 10 root root 4096 Mar 19 10:22 cgroup
drwxr-xr-x. 2 root root 4096 Apr 16 2014 dev
drwxr-xr-x. 80 root root 4096 Mar 19 11:00 etc
drwxr-xr-x. 3 root root 4096 Mar 18 15:08 home
dr-xr-xr-x. 8 root root 4096 Apr 16 2014 lib
dr-xr-xr-x. 11 root root 12288 Mar 18 15:44 lib64
drwx------. 2 root root 16384 Apr 16 2014 lost+found
drwxr-xr-x. 2 root root 4096 Sep 23 2011 media
drwxr-xr-x. 2 root root 4096 Sep 23 2011 mnt
drwxr-xr-x. 2 root root 4096 Sep 23 2011 opt
drwxr-xr-x. 2 root root 4096 Apr 16 2014 proc
dr-xr-x---. 4 root root 24576 Mar 19 10:59 root
dr-xr-xr-x. 2 root root 12288 Mar 18 15:45 sbin
drwxr-xr-x. 2 root root 4096 Apr 16 2014 selinux
drwxr-xr-x. 2 root root 4096 Sep 23 2011 srv
drwxr-xr-x. 2 root root 4096 Apr 16 2014 sys
drwxrwxrwt. 3 root root 4096 Mar 19 11:00 tmp
drwxr-xr-x. 13 root root 4096 Apr 16 2014 usr
drwxr-xr-x. 19 root root 4096 Mar 19 10:14 var
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]# guestunmount /mnt/
[root@compute ea1aeb3xxxxxxxxxxxxxxxxx3157a9b81621]#
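The session above mounts the disk read-write, which is the guestmount default. When the goal is only to copy data out, a read-only mount is safer. The following sketch mounts with --ro, copies one path, and always unmounts afterwards. The function name copy_from_guest and the DRY_RUN variable are hypothetical. By default the commands are only printed.

```shell
#!/bin/sh
# copy_from_guest IMAGE GUEST_PATH DEST [MOUNT_POINT]
# Mount IMAGE read-only, copy GUEST_PATH to DEST, then unmount.
# With DRY_RUN unset or non-zero, only print the commands.
copy_from_guest() {
    image=$1
    guest_path=$2
    dest=$3
    mnt=${4:-/mnt}
    if [ "${DRY_RUN:-1}" != 0 ]; then
        echo "guestmount -a $image -i --ro $mnt"
        echo "cp -a $mnt$guest_path $dest"
        echo "guestunmount $mnt"
        return 0
    fi
    guestmount -a "$image" -i --ro "$mnt" || return 1
    cp -a "$mnt$guest_path" "$dest"
    status=$?
    guestunmount "$mnt"      # always unmount, even if the copy failed
    return "$status"
}
```

For example, copy_from_guest disk /etc /root/recovered-etc prints the three commands that would copy the guest's /etc directory to the host.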