What does the boot time mean in QEMU? - qemu

When I boot Linux on QEMU, there is a timestamp in the boot log, as follows:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.10.0 (zlp#lab302i-ES) (gcc version 4.9.3 20150626 (Red Hat 4.9.3-2) (GCC) ) #33 PREEMPT Mon Dec 2 14:39:51 CST 2019
[ 0.000000] Config serial console: console=ttyS0,38400n8r
[ 0.000000] bootconsole [early0] enabled
[ 0.000000] CPU revision is: 00018900 (MIPS 5KE)
[ 0.000000] FPU revision is: 00738900
[ 0.000000] Software DMA cache coherency enabled
[ 0.000000] Determined physical RAM map:
[ 0.000000] memory: 0000000000001000 # 0000000000000000 (reserved)
[ 0.000000] memory: 00000000000ef000 # 0000000000001000 (ROM data)
[ 0.000000] memory: 000000000071c000 # 00000000000f0000 (reserved)
[ 0.000000] memory: 000000000f7f4000 # 000000000080c000 (usable)
[ 0.000000] Wasting 28840 bytes for tracking 515 unused pages
[ 0.000000] Reserving 0MB of memory at 0MB for crashkernel
[ 0.000000] Kernel command line: rd_start=0xffffffff80810000 rd_size=16642887 root=/dev/ram0 nokaslr console=ttyS0,38400n8r
[ 0.000000] PID hash table entries: 1024 (order: -1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 32768 (order: 4, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 16384 (order: 3, 131072 bytes)
[ 0.000000] Cache parity protection disabled
[ 0.000000] allocated 262128 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Memory: 252272k/253904k available (4743k kernel code, 1632k reserved, 1899k data, 320k init, 0k highmem)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Preemptible hierarchical RCU implementation.
[ 0.000000] NR_IRQS:256
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] Calibrating delay loop... 1145.06 BogoMIPS (lpj=2236416)
[ 0.074218] pid_max: default: 32768 minimum: 301
[ 0.074218] Security Framework initialized
[ 0.074218] AppArmor: AppArmor disabled by boot time parameter
[ 0.074218] Mount-cache hash table entries: 1024
[ 0.078125] Initializing cgroup subsys memory
[ 0.078125] Initializing cgroup subsys devices
[ 0.078125] Initializing cgroup subsys freezer
[ 0.078125] Initializing cgroup subsys blkio
[ 0.078125] Initializing cgroup subsys perf_event
[ 0.089843] devtmpfs: initialized
[ 0.093750] NET: Registered protocol family 16
...
On a real board, the timestamp indicates the time elapsed since boot. What does it mean in QEMU?
Is it the real time QEMU takes to run the Linux boot code? Is it accurate?
The timestamps seem shorter than on the real board. Is that because QEMU does not perform real I/O operations?
Can I use the QEMU boot log to estimate the real time consumed by each part of the kernel boot on the real board, or at least its proportion of the total?

In QEMU it means guest time as perceived by Linux. How it relates to host time depends on which emulated clock device is used for time measurement.
The guest Linux kernel typically reads a register of that emulated clock device and converts the value into system time, using a clock frequency that is either configured somewhere or measured by the kernel.
Emulated clock devices can expose a clock based on one of the QEMU internal clock types, typically QEMU_CLOCK_VIRTUAL, but they're not limited to it.
When QEMU_CLOCK_VIRTUAL is used, the way it ticks depends on whether QEMU is started with the -icount switch. Without -icount, QEMU_CLOCK_VIRTUAL ticks synchronously with the host clock while the guest CPU is running. With -icount, the switch specifies how many nanoseconds of QEMU_CLOCK_VIRTUAL time each executed guest instruction takes, so QEMU_CLOCK_VIRTUAL ticks proportionally to the number of guest instructions executed by the guest CPU. It also ticks when the guest CPU is not executing guest instructions (e.g. is halted waiting for an interrupt), but I don't know the details.
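As a rough illustration of the -icount case: the QEMU documentation describes -icount shift=N as executing one guest instruction every 2^N ns of virtual time. Under that assumption, the guest-visible time for a given number of executed instructions can be estimated as below. The function name and the instruction count are made up for illustration, and the sketch does not model the time that passes while the guest CPU is halted, nor shift=auto.

```python
def icount_virtual_ns(instructions: int, shift: int) -> int:
    """Virtual nanoseconds elapsed for a number of executed guest instructions.

    Assumes a fixed -icount shift=N, i.e. 2**N ns of virtual time per
    instruction; idle time and shift=auto are not modelled.
    """
    if instructions < 0 or shift < 0:
        raise ValueError("instructions and shift must be non-negative")
    return instructions << shift


# Hypothetical figure: 50 million guest instructions executed during early boot.
seconds = icount_virtual_ns(50_000_000, 3) / 1e9
print(seconds)  # guest seconds at shift=3 (8 ns per instruction)
```

With these numbers the guest clock would advance by 0.4 s regardless of how long the host actually took to emulate those instructions.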
Is it the real time consumed by the QEMU to run the linux boot code?
Is it accurate?
It is related to the time QEMU takes to run the Linux boot code, but it is not exactly that: it depends on the guest kernel configuration, the emulated hardware, and the way QEMU is started.
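To look at how guest time is distributed between log messages, one option is to diff consecutive timestamps. The sketch below does that for a few lines copied from the log in the question. The function name stamp_deltas is made up here, and, as explained above, the numbers are guest time, so they need not match a real board.

```python
import re

# Matches the "[ seconds.micros] message" prefix printed by the kernel.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def stamp_deltas(log_text):
    """Return (timestamp, delta_to_previous_stamped_line, message) per stamped line."""
    rows = []
    previous = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp
        stamp = float(match.group(1))
        delta = 0.0 if previous is None else stamp - previous
        rows.append((stamp, delta, match.group(2)))
        previous = stamp
    return rows


sample = """\
[ 0.000000] Calibrating delay loop... 1145.06 BogoMIPS (lpj=2236416)
[ 0.074218] pid_max: default: 32768 minimum: 301
[ 0.078125] Initializing cgroup subsys memory
[ 0.089843] devtmpfs: initialized
[ 0.093750] NET: Registered protocol family 16
"""

for stamp, delta, message in stamp_deltas(sample):
    print(f"{stamp:10.6f}  +{delta:.6f}  {message}")
```

Each delta is attributed to the line that ends the interval, so the 0.074218 s shown next to pid_max is time that passed after the delay-loop calibration message was printed.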

Related

can't run '/etc/init.d/rcS': No such file or directory

I am trying to emulate a firmware image using QEMU. During boot, I get the following errors:
can't run '/etc/init.d/rcS': No such file or directory
can't open /dev/ttyS0: No such file or directory
can't open /dev/ttyS0: No such file or directory
can't open /dev/ttyS0: No such file or directory
.
.
.
This is the content of the inittab file
# Startup the system
null::sysinit:/etc/init.d/rc.sysinit
# now run any rc scripts
::sysinit:/etc/init.d/rcS
# Put a getty on the serial port
ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100
# Stuff to do before rebooting
null::shutdown:/bin/umount -a -r
It is able to run the rc.sysinit, but not the rcS.
I have checked permissions of the rcS. Also, the filesystem is mounted as read-only cramfs. Could this be causing an issue?
This is the command I am running:
QEMU_AUDIO_DRV=none qemu-system-arm -m 256M -M versatilepb \
  -kernel ~/linux-2.6.23/arch/arm/boot/zImage \
  -append "console=ttyAMA0,115200 root=/dev/ram rdinit=/sbin/init" \
  -initrd ~/tmpcramfs2 \
  -nographic
These are the boot messages obtained on running the command:
Linux version 2.6.23 (hsailer#SvanteArrhenius) (gcc version 4.0.2) #1 Thu May 27 09:31:10 EDT 2021
CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
Machine: ARM-Versatile PB
Memory policy: ECC disabled, Data cache writeback
CPU0: D VIVT write-through cache
CPU0: I cache: 4096 bytes, associativity 4, 32 byte lines, 32 sets
CPU0: D cache: 65536 bytes, associativity 4, 32 byte lines, 512 sets
Built 1 zonelists in Zone order. Total pages: 65024
Kernel command line: console=ttyAMA0,115200 root=/dev/ram rdinit=/sbin/init
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour dummy device 80x30
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 256MB = 256MB total
Memory: 249600KB available (2508K code, 227K data, 100K init)
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
NET: Registered protocol family 16
NET: Registered protocol family 2
Time: timer3 clocksource has been installed.
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd
Freeing initrd memory: 7184K
NetWinder Floating Point Emulator V0.97 (double precision)
Installing knfsd (copyright (C) 1996 okir#monad.swb.de).
JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
JFS: nTxBlock = 2007, nTxLock = 16063
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
CLCD: Versatile hardware, VGA display
Clock CLCDCLK: setting VCO reg params: S=1 R=99 V=98
Console: switching to colour frame buffer device 80x60
Serial: AMBA PL011 UART driver
dev:f1: ttyAMA0 at MMIO 0x101f1000 (irq = 12) is a AMBA/PL011
console [ttyAMA0] enabled
dev:f2: ttyAMA1 at MMIO 0x101f2000 (irq = 13) is a AMBA/PL011
dev:f3: ttyAMA2 at MMIO 0x101f3000 (irq = 14) is a AMBA/PL011
fpga:09: ttyAMA3 at MMIO 0x10009000 (irq = 38) is a AMBA/PL011
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
smc91x.c: v1.1, sep 22 2004 by Nicolas Pitre <nico#cam.org>
eth0: SMC91C11xFD (rev 1) at d098e000 IRQ 25 [nowait]
eth0: Ethernet addr: 52:54:00:12:34:56
armflash.0: Found 1 x32 devices at 0x0 in 32-bit bank
Intel/Sharp Extended Query Table at 0x0031
Using buffer write method
RedBoot partition parsing not available
afs partition parsing not available
armflash: probe of armflash.0 failed with error -22
mice: PS/2 mouse device common for all mice
input: AT Raw Set 2 keyboard as /class/input/input0
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
VFP support v0.3: implementor 41 architecture 1 part 10 variant 9 rev 0
input: ImExPS/2 Generic Explorer Mouse as /class/input/input1
RAMDISK: cramfs filesystem found at block 0
RAMDISK: Loading 7184KiB [1 disk] into ram disk... done.
VFS: Mounted root (cramfs filesystem) readonly.
Freeing init memory: 100K
can't run '/etc/init.d/rcS': No such file or directory
can't open /dev/ttyS0: No such file or directory
can't open /dev/ttyS0: No such file or directory
can't open /dev/ttyS0: No such file or directory
.
.
.
The errors about /dev/ttyS0 are because your inittab is specifying the wrong device name for the serial port for the (emulated) hardware you're running on. Your QEMU command specifies the 'versatilepb' board, whose serial devices are PL011s, which appear in /dev/ as /dev/ttyAMA0, /dev/ttyAMA1, etc. (/dev/ttyS0 is what the serial ports on an x86 PC appear as.) You need to fix that line of the inittab to refer to ttyAMA0 instead.
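The inittab change described above could be scripted roughly as follows. This is only a sketch: the helper name retarget_getty is made up, and since the root filesystem is a read-only cramfs, the edit would have to be applied to the extracted file tree before rebuilding the image.

```python
import re


def retarget_getty(inittab_text, old="ttyS0", new="ttyAMA0"):
    """Rewrite inittab entries that refer to the serial device `old` so they use `new`.

    Comment lines are left alone; only whole-word matches are replaced.
    """
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    fixed = []
    for line in inittab_text.splitlines():
        if line.lstrip().startswith("#"):
            fixed.append(line)
        else:
            fixed.append(pattern.sub(new, line))
    return "\n".join(fixed) + "\n"


original = """\
# Put a getty on the serial port
ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100
"""
print(retarget_getty(original), end="")
```

Both the id field and the getty argument are rewritten, so the resulting entry reads ttyAMA0::respawn:/sbin/getty -L ttyAMA0 115200 vt100.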
For the rcS error, I would suggest you start by double-checking all the things listed in all the responses to this older question.

Instance doesn't boot correctly, hangs on "A start job is running for LSB: Raise network interface.."

My VM was shut down when the trial ended. I have since made a payment and started other instances.
The GCE UI shows this system as successfully booted, but the serial port output shows the following (see image).
Any ideas how to fix this?
Screenshot of Boot Error:
[ 6.895575] ppdev: user-space parallel port driver
[ 6.951588] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 6.993046] AVX version of gcm_enc/dec engaged.
[ 6.996351] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 7.001659] alg: No test for crc32 (crc32-pclmul)
[ OK ] Started LSB: start firewall.
[***] A start job is running for LSB: Raise network interf...17s / no limit)

Google Compute instance won't mount persistent disk, maintains ~100% CPU

During some routine use of my web server (saving posts via WordPress), my instance suddenly jumped up to 400% CPU usage and wouldn't come back down below 100%. Restarting and stopping/starting the instance didn't change anything.
Looking at the last bit of my serial output:
[ 0.678602] md: Waiting for all devices to be available before autodetect
[ 0.679518] md: If you don't use raid, use raid=noautodetect
[ 0.680548] md: Autodetecting RAID arrays.
[ 0.681284] md: Scanned 0 and added 0 devices.
[ 0.682173] md: autorun ...
[ 0.682765] md: ... autorun DONE.
[ 0.683716] VFS: Cannot open root device "sda1" or unknown-block(0,0): error -6
[ 0.685298] Please append a correct "root=" boot option; here are the available partitions:
[ 0.686676] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.688489] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-30-generic #34~14.04.1-Ubuntu
[ 0.689287] Hardware name: Google Google, BIOS Google 01/01/2011
[ 0.689287] ffffea00008ae400 ffff880024ee7db8 ffffffff817af477 000000000000111e
[ 0.689287] ffffffff81a7c6c0 ffff880024ee7e38 ffffffff817a9338 ffff880024ee7dd8
[ 0.689287] ffffffff00000010 ffff880024ee7e48 ffff880024ee7de8 ffff880024ee7e38
[ 0.689287] Call Trace:
[ 0.689287] [<ffffffff817af477>] dump_stack+0x45/0x57
[ 0.689287] [<ffffffff817a9338>] panic+0xc1/0x1f5
[ 0.689287] [<ffffffff81d3e5f3>] mount_block_root+0x210/0x2a9
[ 0.689287] [<ffffffff81d3e822>] mount_root+0x54/0x58
[ 0.689287] [<ffffffff81d3e993>] prepare_namespace+0x16d/0x1a6
[ 0.689287] [<ffffffff81d3e304>] kernel_init_freeable+0x1f6/0x20b
[ 0.689287] [<ffffffff81d3d9a7>] ? initcall_blacklist+0xc0/0xc0
[ 0.689287] [<ffffffff8179fab0>] ? rest_init+0x80/0x80
[ 0.689287] [<ffffffff8179fabe>] kernel_init+0xe/0xf0
[ 0.689287] [<ffffffff817b6d98>] ret_from_fork+0x58/0x90
[ 0.689287] [<ffffffff8179fab0>] ? rest_init+0x80/0x80
[ 0.689287] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.689287] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
(Not sure if it's obvious from that, but I'm using the standard Ubuntu 14.04 image)
I've tried taking snapshots and mounting them on new instances, and now I've even deleted the instance and mounted the disk on to a new one, still the same issue and exactly the same serial output.
I really hope my data has not been hopelessly corrupted. Not sure if anyone has any suggestions on recovering data from a persistent disk?
Note that the accepted answer for: Google Compute Engine VM instance: VFS: Unable to mount root fs on unknown-block did not work for me.
I posted this on another question, but this question is worded better, so I'll re-post it here.
What Causes This?
That is the million dollar question. After inspecting my GCE VM, I found 14 different kernels installed, taking up several hundred MB of space. Most of the kernels didn't have a corresponding initrd.img file and were therefore not bootable (including 3.19.0-39-generic).
I certainly never went around trying to install random kernels, and once removed, they no longer appear as available upgrades, so I'm not sure what happened. Seriously, what happened?
Edit: New response from Google Cloud Support.
I received another disconcerting response. This may explain the additional, errant kernels.
"On rare occasions, a VM needs to be migrated from one physical host to another. In such case, a kernel upgrade and security patches might be applied by Google."
How to recover your instance...
After several back-and-forth emails, I finally received a response from support that allowed me to resolve the issue. Be mindful, you will have to change things to match your unique VM.
Take a snapshot of the disk first in case we need to roll back any of the changes below.
Edit the properties of the broken instance to disable this option: "Delete boot disk when instance is deleted"
Delete the broken instance.
IMPORTANT: ensure not to select the option to delete the boot disk. Otherwise, the disk will get removed permanently!!
Start up a new temporary instance.
Attach the broken disk (this will appear as /dev/sdb1) to the temporary instance
When the temporary instance is booted up, do the following:
In the temporary instance:
# Run fsck to fix any disk corruption issues
$ sudo fsck.ext4 -a /dev/sdb1
# Mount the disk from the broken vm
$ sudo mkdir /mnt/sdb
$ sudo mount /dev/sdb1 /mnt/sdb/ -t ext4
# Find out the UUID of the broken disk. In this case, the uuid of sdb1 is d9cae47b-328f-482a-a202-d0ba41926661
$ ls -alt /dev/disk/by-uuid/
lrwxrwxrwx. 1 root root 10 Jan 6 07:43 d9cae47b-328f-482a-a202-d0ba41926661 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Jan 6 05:39 a8cf6ab7-92fb-42c6-b95f-d437f94aaf98 -> ../../sda1
# Update the UUID in grub.cfg (if necessary)
$ sudo vim /mnt/sdb/boot/grub/grub.cfg
Note: This ^^^ is where I deviated from the support instructions.
Instead of modifying all the boot entries to set root=UUID=[uuid character string], I looked for all the entries that set root=/dev/sda1 and deleted them. I also deleted every entry that didn't set an initrd.img file. The top boot entry with correct parameters in my case ended up being 3.19.0-31-generic. But yours may be different.
# Flush all changes to disk
$ sudo sync
# Shut down the temporary instance
$ sudo shutdown -h now
Finally, detach the HDD from the temporary instance, and create a new instance based off of the fixed disk. It will hopefully boot.
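The manual grub.cfg review described above (looking for boot entries that set root=/dev/sda1 or have no initrd line) could be pre-screened with a small script like the one below. This is a simplified sketch, not a grub.cfg parser: it assumes each menuentry body ends at a line containing only "}", the function name is made up, and the sample configuration is invented for illustration.

```python
def suspicious_entries(grub_cfg_text, bad_root="root=/dev/sda1"):
    """List (title_line, reasons) for menuentry blocks that look unbootable.

    Simplified scan: assumes each menuentry body ends at a line containing only "}".
    """
    findings = []
    title, body = None, []
    for raw in grub_cfg_text.splitlines():
        line = raw.strip()
        if title is None:
            if line.startswith("menuentry "):
                title, body = line, []
        elif line == "}":
            reasons = []
            if not any(l.startswith("initrd") for l in body):
                reasons.append("no initrd line")
            if any(l.startswith("linux") and bad_root in l.split() for l in body):
                reasons.append(f"uses {bad_root}")
            if reasons:
                findings.append((title, reasons))
            title = None
        else:
            body.append(line)
    return findings


# Invented sample; a real grub.cfg has many more lines per entry.
sample_cfg = """\
menuentry 'Ubuntu, with Linux 3.19.0-39-generic' {
    linux /boot/vmlinuz-3.19.0-39-generic root=/dev/sda1 ro
}
menuentry 'Ubuntu, with Linux 3.19.0-31-generic' {
    linux /boot/vmlinuz-3.19.0-31-generic root=UUID=d9cae47b-328f-482a-a202-d0ba41926661 ro
    initrd /boot/initrd.img-3.19.0-31-generic
}
"""

for title, reasons in suspicious_entries(sample_cfg):
    print(title, "->", ", ".join(reasons))
```

The script only reports candidates; deciding which entries to delete or fix is still a manual step, and the snapshot taken at the start remains the fallback.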
Assuming it does boot, you have a lot of work to do. If you have half as many unused kernels as me, then you might want to purge the unused ones (especially since some are likely missing a corresponding initrd.img file).
I used the second answer (the terminal-based one) in this askubuntu question to purge the other kernels.
Note: Make sure you don't purge the kernel you booted in with!
In order to recover your data, you need to create a brand new instance where you can ssh, and attach the corrupted disk to it as a secondary disk. More information can be found in this article. I would suggest taking a snapshot of the corrupted disk before attaching it, for backup purposes.

Hadoop Balancer command WARN messages - threads quota is exceeded

I am trying to run Hadoop balancer command as follows:
hadoop balancer -threshold 1
But I am getting several WARN messages as
Failed to move blk_1073742036_1212 with size=134217728 from 192.168.30.4:50010 to 192.168.30.2:50010 through 192.168.30.4:50010: block move is failed: Not able to receive block 1073742036 from /192.168.10.3:53115 because threads quota is exceeded.
And at the end...
No block has been moved for 5 iterations. Exiting...
Balancing took 4.092883333333333 minutes
I set ulimit values as follows:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 2065455
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 64000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 65535
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
But still I am getting the same error.
Could someone please give me suggestions on this. Appreciate your help.
Question was asked a long time ago, posting an answer for posterity's sake.
The Hadoop balancer has a bug where it prematurely exits iterations. This caused the balancer to be very slow. This was fixed in HDFS-6621 and officially released as part of Apache Hadoop 2.6.0. Since this is a bug in the Balancer itself, it is possible to run an updated version of the Balancer without upgrading your cluster.
Datanodes limit the number of threads used for balancing so as not to eat up all the resources of the cluster/datanode. This is what causes the WARN statement you're seeing. By default the number of threads is 5. This was not configurable prior to Apache Hadoop 2.5.0. HDFS-6595 added the property dfs.datanode.balance.max.concurrent.moves to let you control the number of threads used for balancing. Since this is a datanode-side property, using it will require upgrading your cluster.
If you're using a distribution packaged by a vendor (e.g. Hortonworks, Cloudera, etc.), the mentioned fixes may have been back-patched to an earlier version. Check your vendor's release notes to find out.
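For reference, the property mentioned above would go into hdfs-site.xml on the datanodes. The sketch below just renders such an entry; the helper name is made up and the value 10 is only an example, not a recommendation.

```python
import xml.etree.ElementTree as ET


def hdfs_property(name, value):
    """Build a <property> element in the layout used by hdfs-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return prop


# The value 10 is an arbitrary example.
prop = hdfs_property("dfs.datanode.balance.max.concurrent.moves", 10)
print(ET.tostring(prop, encoding="unicode"))
```

The printed element belongs inside the configuration root element of hdfs-site.xml.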

Why did my kernel config options not get enabled on the first make?

I was solving Task 02 of the Eudyptula Challenge. I had to download the latest kernel source, create a working config, change a kernel config flag, and boot into the newly configured kernel.
I downloaded the source and followed the procedure below:
1) make localmodconfig -> generated .config from my pc config
2) edited .config and enabled the required flag -> CONFIG_LOCALVERSION_AUTO=y.
3) make
4) make modules
5) make modules_install
6) make install
7) update-grub
When I submitted the solution, I got the response "Linus's tree is newer than this, or you forgot to set the requested configuration option :("
However, my kernel is the latest release from Linus, so I didn't update or change anything.
Then I decided to build it again and did the following:
1) make clean
2) make oldconfig
3) make modules
4) make modules_install
5) make install
6) update-grub
I then sent the logs for review. This time the log passed the test.
Here are the two dmesg logs:
1) First log:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.16.0-rc3 (sunil#ubuntu) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #1 SMP Thu Jul 3 00:03:50 PDT 2014
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-rc3 root=UUID=5560b107-9a97-4ca5-8f23-fe1d8798d37b ro quiet splash
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
2) Second log:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.16.0-rc3-00149-g034a0f6-dirty (sunil#ubuntu) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #3 SMP Fri Jul 4 18:29:56 IST 2014
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-rc3-00149-g034a0f6-dirty root=UUID=5560b107-9a97-4ca5-8f23-fe1d8798d37b ro quiet splash
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
So why was it not accepted the first time?
You have obviously modified something in the source code. Otherwise there would be no 'dirty' string in the version.
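With CONFIG_LOCALVERSION_AUTO=y and a git checkout, the kernel build appends git-derived information to the release string: the number of commits since the last tag, the abbreviated commit id prefixed with "g", and "-dirty" when the tree has uncommitted changes. The sketch below splits the two release strings from the logs above along those lines; the regular expression and the function name are my own and only cover this common layout.

```python
import re

# <base>[-<commits since tag>-g<abbreviated commit>][-dirty]
LOCALVERSION = re.compile(
    r"^(?P<base>.+?)"
    r"(?:-(?P<commits>\d{5})-g(?P<commit>[0-9a-f]+))?"
    r"(?P<dirty>-dirty)?$"
)


def split_release(release):
    """Split a kernel release string into base version and git-derived parts."""
    match = LOCALVERSION.match(release)
    if match is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    commits = match.group("commits")
    return {
        "base": match.group("base"),
        "commits_since_tag": int(commits) if commits else None,
        "commit": match.group("commit"),
        "dirty": match.group("dirty") is not None,
    }


for release in ("3.16.0-rc3", "3.16.0-rc3-00149-g034a0f6-dirty"):
    print(release, "->", split_release(release))
```

The first log's release string has no git suffix at all, while the second one carries both the commit information and the dirty marker.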