Mounting existing SSD persistent disk to new instance (Google Cloud) - google-compute-engine

I don't know if it's because of a limitation of SSD disks or something else, but I can't see a partition on the disk to mount:
$ ll /dev/disk/by-id/
google-dataserver2 -> ../../sda
google-dataserver2-part1 -> ../../sda1
google-ssd1 -> ../../sdb
scsi-0Google_PersistentDisk_dataserver2 -> ../../sda
scsi-0Google_PersistentDisk_dataserver2-part1 -> ../../sda1
scsi-0Google_PersistentDisk_ssd1 -> ../../sdb
Since there is no -part1 entry for the google-ssd1 disk, I can't mount it.

It is possible that the filesystem starts at the beginning of the disk (instead of there being a partition table at the start, with the filesystem starting a few blocks later). Try mounting /dev/sdb or /dev/disk/by-id/google-ssd1 directly.
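For example, a minimal check-and-mount sketch (the /mnt/ssd1 mount point is just an example name):
# Check whether the raw device already holds a filesystem (should report ext4 or similar)
$ sudo blkid /dev/disk/by-id/google-ssd1
# Create a mount point and mount the whole disk
$ sudo mkdir -p /mnt/ssd1
$ sudo mount /dev/disk/by-id/google-ssd1 /mnt/ssd1
If blkid prints nothing, the disk may simply never have been formatted, in which case it needs a filesystem (e.g. via mkfs.ext4) before it can be mounted.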

Related

Google Compute instance won't mount persistent disk, maintains ~100% CPU

During some routine use of my web server (saving posts via WordPress), my instance suddenly jumped up to 400% CPU usage and wouldn't come back down below 100%. Restarting and stopping/starting the instance didn't change anything.
Looking at the last bit of my serial output:
[ 0.678602] md: Waiting for all devices to be available before autodetect
[ 0.679518] md: If you don't use raid, use raid=noautodetect
[ 0.680548] md: Autodetecting RAID arrays.
[ 0.681284] md: Scanned 0 and added 0 devices.
[ 0.682173] md: autorun ...
[ 0.682765] md: ... autorun DONE.
[ 0.683716] VFS: Cannot open root device "sda1" or unknown-block(0,0): error -6
[ 0.685298] Please append a correct "root=" boot option; here are the available partitions:
[ 0.686676] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.688489] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-30-generic #34~14.04.1-Ubuntu
[ 0.689287] Hardware name: Google Google, BIOS Google 01/01/2011
[ 0.689287] ffffea00008ae400 ffff880024ee7db8 ffffffff817af477 000000000000111e
[ 0.689287] ffffffff81a7c6c0 ffff880024ee7e38 ffffffff817a9338 ffff880024ee7dd8
[ 0.689287] ffffffff00000010 ffff880024ee7e48 ffff880024ee7de8 ffff880024ee7e38
[ 0.689287] Call Trace:
[ 0.689287] [<ffffffff817af477>] dump_stack+0x45/0x57
[ 0.689287] [<ffffffff817a9338>] panic+0xc1/0x1f5
[ 0.689287] [<ffffffff81d3e5f3>] mount_block_root+0x210/0x2a9
[ 0.689287] [<ffffffff81d3e822>] mount_root+0x54/0x58
[ 0.689287] [<ffffffff81d3e993>] prepare_namespace+0x16d/0x1a6
[ 0.689287] [<ffffffff81d3e304>] kernel_init_freeable+0x1f6/0x20b
[ 0.689287] [<ffffffff81d3d9a7>] ? initcall_blacklist+0xc0/0xc0
[ 0.689287] [<ffffffff8179fab0>] ? rest_init+0x80/0x80
[ 0.689287] [<ffffffff8179fabe>] kernel_init+0xe/0xf0
[ 0.689287] [<ffffffff817b6d98>] ret_from_fork+0x58/0x90
[ 0.689287] [<ffffffff8179fab0>] ? rest_init+0x80/0x80
[ 0.689287] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.689287] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
(Not sure if it's obvious from that, but I'm using the standard Ubuntu 14.04 image)
I've tried taking snapshots and mounting them on new instances, and I've now even deleted the instance and attached the disk to a new one; still the same issue and exactly the same serial output.
I really hope my data has not been hopelessly corrupted. Does anyone have suggestions for recovering data from a persistent disk?
Note that the accepted answer for: Google Compute Engine VM instance: VFS: Unable to mount root fs on unknown-block did not work for me.
I posted this on another question, but this question is worded better, so I'll re-post it here.
What Causes This?
That is the million dollar question. After inspecting my GCE VM, I found out there were 14 different kernels installed, taking up several hundred MB of space. Most of the kernels didn't have a corresponding initrd.img file and were therefore not bootable (including 3.19.0-39-generic).
I certainly never went around trying to install random kernels, and once removed, they no longer appear as available upgrades, so I'm not sure what happened. Seriously, what happened?
Edit: New response from Google Cloud Support.
I received another disconcerting response. This may explain the additional, errant kernels.
"On rare occasions, a VM needs to be migrated from one physical host to another. In such case, a kernel upgrade and security patches might be applied by Google."
How to recover your instance...
After several back-and-forth emails, I finally received a response from support that allowed me to resolve the issue. Be mindful, you will have to change things to match your unique VM.
Take a snapshot of the disk first in case we need to roll back any of the changes below.
Edit the properties of the broken instance to disable this option: "Delete boot disk when instance is deleted"
Delete the broken instance.
IMPORTANT: ensure you do not select the option to delete the boot disk. Otherwise, the disk will be removed permanently!
Start up a new temporary instance.
Attach the broken disk to the temporary instance (its first partition will appear as /dev/sdb1).
When the temporary instance is booted up, do the following:
In the temporary instance:
# Run fsck to fix any disk corruption issues
$ sudo fsck.ext4 -a /dev/sdb1
# Mount the disk from the broken vm
$ sudo mkdir /mnt/sdb
$ sudo mount /dev/sdb1 /mnt/sdb/ -t ext4
# Find out the UUID of the broken disk. In this case, the uuid of sdb1 is d9cae47b-328f-482a-a202-d0ba41926661
$ ls -alt /dev/disk/by-uuid/
lrwxrwxrwx. 1 root root 10 Jan 6 07:43 d9cae47b-328f-482a-a202-d0ba41926661 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Jan 6 05:39 a8cf6ab7-92fb-42c6-b95f-d437f94aaf98 -> ../../sda1
# Update the UUID in grub.cfg (if necessary)
$ sudo vim /mnt/sdb/boot/grub/grub.cfg
Note: This ^^^ is where I deviated from the support instructions.
Instead of modifying all the boot entries to set root=UUID=[uuid character string], I looked for all the entries that set root=/dev/sda1 and deleted them. I also deleted every entry that didn't set an initrd.img file. The top boot entry with correct parameters in my case ended up being 3.19.0-31-generic. But yours may be different.
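If it helps, a quick way to see which entries need attention before editing (assuming the same mount point and grub.cfg path as above):
# Show menu entries that still boot from /dev/sda1
$ grep -n 'root=/dev/sda1' /mnt/sdb/boot/grub/grub.cfg
# List menuentry/linux/initrd lines so entries without an initrd stand out
$ grep -n -E '^\s*(menuentry|linux|initrd)' /mnt/sdb/boot/grub/grub.cfg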
# Flush all changes to disk
$ sudo sync
# Shut down the temporary instance
$ sudo shutdown -h now
Finally, detach the disk from the temporary instance and create a new instance based on the fixed disk. It will hopefully boot.
Assuming it does boot, you have a lot of work to do. If you have half as many unused kernels as I did, you might want to purge the unused ones (especially since some are likely missing a corresponding initrd.img file).
I used the second answer (the terminal-based one) in this askubuntu question to purge the other kernels.
Note: Make sure you don't purge the kernel you booted in with!
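In case that answer disappears too, the terminal-based approach boils down to roughly this (the version string below is only an example; never purge the kernel reported by uname -r):
# See which kernel is currently running
$ uname -r
# List every installed kernel image
$ dpkg -l 'linux-image-*' | grep ^ii
# Purge an unused version (repeat per old kernel), then clean up and refresh grub
$ sudo apt-get purge linux-image-3.19.0-39-generic
$ sudo apt-get autoremove
$ sudo update-grub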
In order to recover your data, you need to create a brand new instance where you can ssh, and attach the corrupted disk to it as a secondary disk. More information can be found in this article. I would suggest taking a snapshot of the corrupted disk before attaching it, for backup purposes.
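For reference, attaching the corrupted disk as a secondary disk can also be done from the gcloud CLI; a sketch with placeholder instance, disk, and zone names:
# Attach the broken disk to a freshly created recovery instance
$ gcloud compute instances attach-disk recovery-instance --disk broken-disk --zone us-central1-a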

Memory usage issues with VPS (ubuntu): MySQL process dies

I'm running a VPS, with specs:
Ubuntu 12.04.5 LTS (GNU/Linux 3.13.0-32-generic x86_64)
512mb RAM
1 CPU
20gb SSD
If you're wondering, it's a DigitalOcean droplet. It's running TS3, LAMP (with WordPress), OpenVPN, BYOBU, and OwnCloud.
Now my problem is MySQL dying on me after 30 minutes to an hour. After a reboot the memory usage is around 54% and MySQL doesn't have a problem, but as the memory usage climbs towards 80-89% I start to get issues.
System load: 0.01 Users logged in: 0
Usage of /: 22.1% of 19.56GB IP address for eth0: *****
Memory usage: 90% IP address for as0t0: *****
Swap usage: 0% IP address for as0t1: *****
Processes: 93
As you can see, the memory usage is VERY high, and I've noticed the trend that the MySQL process dies as the memory usage gets higher. However, the swap usage is 0%.
Is there a way to make MySQL and the other processes use swap?
Would letting MySQL use swap stop it from dying when memory usage gets this high?
After the high memory usage, the process dies and I get this error:
[2002] SQLSTATE[HY000] [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)
The processor load never goes above 25% in most cases. The server also runs on a fast SSD, so using swap wouldn't be a problem, and I don't have that much traffic.
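A quick way to confirm the suspicion that the kernel's OOM killer is what terminates mysqld (an assumption based on the symptoms) is to check the kernel log right after MySQL dies:
# Look for OOM-killer activity around the time MySQL died
dmesg | grep -i -E 'out of memory|oom|killed process'
sudo grep -i 'killed process' /var/log/syslog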
Fixed it by making a 256 MB swap file. MySQL no longer dies after running out of available memory to work in.
After following this tutorial by Etel Sverdlov:
https://www.digitalocean.com/community/tutorials/how-to-add-swap-on-ubuntu-12-04
I was able to make a swap file. I'll copy the tutorial here in case it gets deleted.
How To Add Swap on Ubuntu 12.04
About Linux Swapping
Linux RAM is composed of chunks of memory called pages. To free up pages of RAM, a “linux swap” can occur and a page of memory is copied from the RAM to preconfigured space on the hard disk. Linux swaps allow a system to harness more memory than was originally physically available.
However, swapping does have disadvantages. Because hard disks are much slower than RAM, virtual private server performance may slow down considerably. Additionally, swap thrashing can begin to take place if the system gets swamped by too many pages being swapped in and out.
Check for Swap Space
Before we proceed to set up a swap file, we need to check if any swap files have been enabled on the VPS by looking at the summary of swap usage.
sudo swapon -s
An empty list will confirm that you have no swap files enabled:
Filename Type Size Used Priority
Check the File System
After we know that we do not have a swap file enabled on the virtual server, we can check how much space we have on the server with the df command. The swap file will take up 256 MB. Since we are only using about 8% of /dev/sda, we can proceed.
df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda 20907056 1437188 18421292 8% /
udev 121588 4 121584 1% /dev
tmpfs 49752 208 49544 1% /run
none 5120 0 5120 0% /run/lock
none 124372 0 124372 0% /run/shm
Create and Enable the Swap File
Now it’s time to create the swap file itself using the dd command:
sudo dd if=/dev/zero of=/swapfile bs=1024 count=256k
“of=/swapfile” designates the file’s name. In this case the name is swapfile.
Subsequently, we are going to prepare the swap file by creating a Linux swap area:
sudo mkswap /swapfile
The results display:
Setting up swapspace version 1, size = 262140 KiB
no label, UUID=103c4545-5fc5-47f3-a8b3-dfbdb64fd7eb
Finish up by activating the swap file:
sudo swapon /swapfile
You will then be able to see the new swap file when you view the swap summary.
swapon -s
Filename Type Size Used Priority
/swapfile file 262140 0 -1
This swap will only remain active on the virtual private server until the machine reboots. You can make the swap permanent by adding it to the fstab file.
Open up the file:
sudo nano /etc/fstab
Paste in the following line:
/swapfile none swap sw 0 0
Swappiness should be set to 10. Skipping this step may cause poor performance, whereas setting it to 10 will cause swap to act as an emergency buffer, preventing out-of-memory crashes.
You can do this with the following commands:
echo 10 | sudo tee /proc/sys/vm/swappiness
echo vm.swappiness = 10 | sudo tee -a /etc/sysctl.conf
To prevent the file from being world-readable, you should set up the correct permissions on the swap file:
sudo chown root:root /swapfile
sudo chmod 0600 /swapfile
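To double-check that everything took effect, you can verify the swap and the swappiness value afterwards:
# Confirm the swap is active and the swappiness setting stuck
free -m
cat /proc/sys/vm/swappiness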
All credit to: Etel Sverdlov at: https://www.digitalocean.com/community/tutorials/how-to-add-swap-on-ubuntu-12-04

Mount LVM2 and recover lost data from failed HDD?

An HDD failure occurred, so a new primary HDD was installed and the old HDD was added as a secondary one.
I'm trying to mount the secondary HDD, but I'm getting errors.
I made /media/qwe/.
I then connected over SSH with PuTTY and ran these commands:
root@chicken [/]# mount /dev/sdb2 /media/qwe
mount: unknown filesystem type 'LVM2_member'
But, I got an error.
root@chicken [/]# vgscan
Reading all physical volumes. This may take a while...
Found volume group "VolGroup" using metadata type lvm2
Found volume group "VolGroup" using metadata type lvm2
root@chicken [/]# vgs
VG #PV #LV #SN Attr VSize VFree
VolGroup 1 3 0 wz--n- 1.82t 0
VolGroup 1 3 0 wz--n- 1.82t 0
I use cPanel and WHM.
I am trying to recover the MySQL databases that were lost. I managed to mount the sdb1 bit, but I think that's the boot partition. I don't need that. I need to access the other files!
Any help?
You don't need the file system to get your data back.
Start by taking an image of the failed disk.
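A rough sketch of that approach, assuming the failed disk is /dev/sdb, there is room for the image under /recovery, and the data lives on a logical volume such as lv_root (adjust the names; ddrescue ships in the gddrescue package on Debian/Ubuntu):
# Image the failing disk first; ddrescue reads past bad sectors and logs progress
sudo ddrescue /dev/sdb /recovery/sdb.img /recovery/sdb.log
# Expose the image as a loop device, including its partitions
sudo losetup -fP --show /recovery/sdb.img
# Scan for and activate the LVM volumes found on the image, then list them
sudo vgscan
sudo vgchange -ay
sudo lvs
# Mount the data-bearing logical volume read-only and copy the MySQL files out
sudo mount -o ro /dev/VolGroup/lv_root /media/qwe
Since both disks here use the same volume group name (VolGroup), activation may be ambiguous; in that case rename one group first with vgrename, addressing it by the UUID shown by vgs -o +vg_uuid.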

google compute engine mounting persistant disk issues

I am following this guide https://developers.google.com/compute/docs/troubleshooting#ssherrors, specifically the section about recovering your persistent disk with another VM.
I am trying to follow this part:
mount /dev/disk/by-id/scsi-0Google_PersistentDisk_myinstance-debugging /mnt/myinstance
This is the error I get:
root@debugger:~# mount /dev/disk/by-id/scsi-0Google_PersistentDisk_marty-wll-debugging /mnt/marty-wll
mount: you must specify the filesystem type
I am unsure of the filesystem because these are Google Compute Engine disks, and the original system has already been deleted and its disk attached to another machine, following the Google developers guide I referenced above.
root@debugger:/dev/disk/by-id# parted scsi-0Google_PersistentDisk_marty-wll-debugging -l
Model: Google PersistentDisk (scsi)
Disk /dev/sda: 10.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Number Start End Size Type File system Flags
1 1049kB 10.7GB 10.7GB primary ext4
Model: Google PersistentDisk (scsi)
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Number Start End Size Type File system Flags
1 1049kB 10.7GB 10.7GB primary ext4
This gave me the information that it's "ext4".
Although when I issue the following command, I still get an error:
root@debugger:~# mount -t ext4 /dev/disk/by-id/scsi-0Google_PersistentDisk_marty-wll-debugging /mnt/marty-wll
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
dmesg / syslog said:
[ 2452.205447] EXT4-fs (sdb): VFS: Can't find ext4 filesystem
any ideas?
Thanks for pointing this out, I will update the docs. Try adding -part1 to the end of your device name. This will mount the partition, instead of the disk. For your specific case:
mount /dev/disk/by-id/scsi-0Google_PersistentDisk_myinstance-debugging-part1 /mnt/myinstance
Also, there are cleaner aliases, so this should work as well:
mount /dev/disk/by-id/google-myinstance-debugging-part1 /mnt/myinstance
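If the mount point doesn't exist yet, create it first; for example, with the same names as above:
mkdir -p /mnt/myinstance
mount /dev/disk/by-id/google-myinstance-debugging-part1 /mnt/myinstance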

Starting instance again after power off

How do I start an instance on GCE again after power off?
The instance shows TERMINATED, but its disk type is PERSISTENT.
If I use add instance with the same instance name, it asks me to select a new image, offering only a choice of OS, not my existing disk.
It then fails with:
ERROR: RESOURCE_ALREADY_EXISTS: The resource XXXX already exists
Is there a way to start (or clone) a copy of the instance once it is stopped?
Is there anything similar to AWS stop/start? I don't care about the instance state or scratch disks being saved, just starting it again, since I have the boot disk stored and paid for.
Success! Below is the stop/start procedure, assuming that $PROJECT, $INSTANCE, $DISK and $ZONE are set appropriately:
#--------- stop instance -----
#connect and shutdown
gcutil --project=$PROJECT ssh $INSTANCE
sudo shutdown -h now
# check
gcutil listinstances --project $PROJECT
#delete instance/keep boot disk , use -f to avoid confirmation
gcutil --project=$PROJECT deleteinstance $INSTANCE --nodelete_boot_pd
# check disks
gcutil listdisks --project=$PROJECT
#--------- start new instance -----
# launch instance using the existing disk (has to be in the same zone!)
gcutil --project=$PROJECT addinstance $INSTANCE --disk=$DISK,boot --zone=$ZONE --machine_type=n1-standard-1
#check that it's running
gcutil listinstances --project $PROJECT
You're on the right track. You just need to delete the existing TERMINATED instance before adding it again.
Even though the instance isn't running when it is TERMINATED, the resources (such as Persistent Disk) are still allocated to it.
Also, if this instance was created before December 5th (when Compute Engine went GA), you'll need to add a kernel to the disk or it won't boot. See the transition guide for details.
(For a temporary work around to upgrading the kernel, see this Q/A: My Google Compute Engine instances hang during boot using the v1 API)
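For what it's worth, the same delete-but-keep-the-boot-disk procedure can be written with the newer gcloud CLI; a sketch using the same variables (flags assumed from current gcloud, not from the original answer):
# delete the instance but keep its boot persistent disk
gcloud compute instances delete $INSTANCE --zone $ZONE --keep-disks boot
# recreate an instance on top of the existing boot disk
gcloud compute instances create $INSTANCE --zone $ZONE --machine-type n1-standard-1 --disk name=$DISK,boot=yes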