How to Enable Magic SysRq in Red Hat Linux


The Magic SysRq key sequence is used to collect additional system information to aid in troubleshooting system hangs or panics.

NOTE: Disable ASR (Automatic Server Recovery) so that it will not restart the server before the SysRq information can be captured.

1. Ensure that “Kernel Hacking” has been compiled into the running kernel.
CONFIG_MAGIC_SYSRQ=y is required for SysRq to work. It’s enabled on all RHEL kernels by default.
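
One way to confirm this on a running system is to look for the option in the kernel's build configuration. The sketch below assumes the config file is installed as /boot/config-<kernel version>, which is where Red Hat kernels usually put it. The function name has_magic_sysrq is made up for this example.

```shell
# has_magic_sysrq CONFIG_FILE
# Succeeds (exit status 0) if the given kernel config file contains
# the line CONFIG_MAGIC_SYSRQ=y. The function name is hypothetical.
has_magic_sysrq() {
    grep -q '^CONFIG_MAGIC_SYSRQ=y$' "$1"
}

# Assumed location of the running kernel's config file.
config_file="/boot/config-$(uname -r)"

if [ -r "$config_file" ]; then
    if has_magic_sysrq "$config_file"; then
        echo "Magic SysRq is compiled into this kernel"
    else
        echo "Magic SysRq is NOT compiled into this kernel"
    fi
else
    echo "Cannot read $config_file; check the kernel config another way"
fi
```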

2. Configure Red Hat to boot into runlevel 3. This requires editing /etc/inittab (id:3:initdefault:).
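
To check which default runlevel /etc/inittab currently selects, a small helper like the following can be used. It is only a sketch: the function name default_runlevel is hypothetical, and it assumes the classic SysV inittab layout of id:runlevels:action:process.

```shell
# default_runlevel INITTAB_FILE
# Prints the default runlevel from an inittab-style file by reading the
# second colon-separated field of the first uncommented "initdefault" entry.
default_runlevel() {
    awk -F: '!/^[ \t]*#/ && $3 == "initdefault" { print $2; exit }' "$1"
}

inittab="/etc/inittab"
if [ -r "$inittab" ]; then
    echo "Default runlevel: $(default_runlevel "$inittab")"
fi
```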

3. Enable SysRq in /etc/sysctl.conf (kernel.sysrq = 1). Then run sysctl -p so that the system rereads the sysctl file.
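
The edit to /etc/sysctl.conf can also be scripted. The following is a minimal sketch, not a definitive implementation: the function name enable_sysrq_conf is hypothetical, and it assumes the setting appears at most as a plain "kernel.sysrq = value" line.

```shell
# enable_sysrq_conf CONF_FILE
# Makes sure CONF_FILE ends up with exactly one "kernel.sysrq = 1" line:
# any existing kernel.sysrq assignment is dropped and the setting is appended.
enable_sysrq_conf() {
    conf="$1"
    tmp="${conf}.tmp.$$"
    # Keep every line that is not a kernel.sysrq assignment.
    # grep exits non-zero when no lines are left, so ignore its status.
    grep -v '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf" > "$tmp" || true
    echo 'kernel.sysrq = 1' >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Usage (as root):
#   enable_sysrq_conf /etc/sysctl.conf && sysctl -p
#   cat /proc/sys/kernel/sysrq    # shows the value currently in effect
```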

4. It is recommended that a serial console be set up to capture the SysRq information. Another computer running a terminal emulator, such as minicom or HyperTerminal, is attached to the serial console.

To enable the serial console port, edit the boot loader kernel load statement.

For GRUB this would be in /boot/grub/grub.conf. (kernel /vmlinuz-2.4.9-e.40 ro root=/dev/cciss/c0d0p2 console=ttyS0,9600n8 console=tty0 )
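
After the reboot in a later step, the console settings that the kernel actually received can be read back from /proc/cmdline. The helper below is a sketch with a hypothetical name (list_consoles). It assumes the arguments are separated by single spaces, as in the grub.conf line above.

```shell
# list_consoles "KERNEL COMMAND LINE"
# Prints the value of each console= argument, one per line, in order.
list_consoles() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^console=//p'
}

# Show the consoles the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    list_consoles "$(cat /proc/cmdline)"
fi
```

For the example kernel line above, this would list ttyS0,9600n8 followed by tty0.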

5. Attach the computer running the terminal emulator to the serial console port using a null modem cable. ttyS0 is COM1 and ttyS1 is COM2.
To match the 9600n8 setting in the previous step, the terminal emulator should be set to 9600bps, no parity, 8 data bits, 1 stop bit, hardware flow control.
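
The mode string in the console= option packs the baud rate, parity, and data bits together. As an illustration, the sketch below splits such a string into its parts. The function name describe_serial_mode is hypothetical, and the pattern deliberately handles only the baud/parity/bits form used in this article.

```shell
# describe_serial_mode MODE
# Splits a console mode string such as "9600n8" into baud rate,
# parity letter (n, o or e) and number of data bits.
# Prints nothing if the string does not have that form.
describe_serial_mode() {
    echo "$1" | sed -n 's/^\([0-9][0-9]*\)\([noe]\)\([5-8]\)$/baud=\1 parity=\2 bits=\3/p'
}

describe_serial_mode 9600n8    # prints: baud=9600 parity=n bits=8
```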

6. Reboot the server.
7. There should be text output on the terminal emulator screen during a portion of the Red Hat boot sequence. This verifies that the terminal emulator is set up and connected properly. There will be no keyboard interaction at the terminal emulator window. The terminal emulator is there only to capture the text output of the SysRq commands. All of the SysRq commands will be initiated at the server console.

8. To invoke the Magic SysRq functions, run the following commands:

echo m > /proc/sysrq-trigger
echo t > /proc/sysrq-trigger
echo p > /proc/sysrq-trigger

This will dump debugging information to the file /var/log/messages. If the system hangs during the copy, go to the attached keyboard and press the following keystrokes:

ALT-SysRq-m
ALT-SysRq-t
ALT-SysRq-p

Some keyboards may not have a key labeled SysRq, but the SysRq key is also known as the ‘Print Screen’ key. These keystrokes will dump the debugging information even if the system is otherwise unresponsive. If they do not, you have either a hardware problem or a serious issue with the kernel.

Command keys are as follows:

‘m’ – Will dump current memory information to your console.
‘p’ – Will dump the current registers and flags for each processor to your console. Press ENTER for each processor.
‘t’ – Will dump a list of current tasks and their information to the console.

Run alt-SysRq-p multiple times so that we can be sure to get output from all CPUs on the machine.

Also, run alt-sysrq-m last, as it has a possibility of locking the box up harder than it already is.
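
When the system still accepts commands, the ordering advice above (p several times, then t, with m last) can be sketched as a small script around /proc/sysrq-trigger. The function names, the default of five repeats, and the one-second delay are assumptions chosen for illustration, not values from any official procedure.

```shell
# sysrq_trigger KEY [TRIGGER_FILE]
# Writes a single SysRq command key to the trigger file
# (default: /proc/sysrq-trigger). Needs root on a real system.
sysrq_trigger() {
    key="$1"
    trigger="${2:-/proc/sysrq-trigger}"
    case "$key" in
        [a-z0-9]) printf '%s\n' "$key" > "$trigger" ;;
        *) echo "sysrq_trigger: invalid key '$key'" >&2; return 1 ;;
    esac
}

# collect_hang_info [TRIGGER_FILE] [REPEATS] [DELAY_SECONDS]
# Sends p REPEATS times, then t, and m last.
collect_hang_info() {
    trigger="${1:-/proc/sysrq-trigger}"
    repeats="${2:-5}"
    delay="${3:-1}"
    i=0
    while [ "$i" -lt "$repeats" ]; do
        sysrq_trigger p "$trigger" || return 1
        sleep "$delay"
        i=$((i + 1))
    done
    sysrq_trigger t "$trigger" || return 1
    sysrq_trigger m "$trigger"
}

# Usage (as root): collect_hang_info
```

Each p request reports on the CPU that happens to handle it, which is why the script repeats it with a pause in between rather than sending it once.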

If the system was set up as a netdump client, the SysRq output will also be logged to the log file on the netdump server.

Additional command keys are:

‘r’ – Turns off keyboard raw mode and sets it to XLATE.
‘k’ – Secure Access Key (SAK) – Kills all programs on the current virtual console.
‘b’ – Will immediately reboot the system without syncing or unmounting the disks.
‘o’ – Will power off the system (if configured and supported).
‘s’ – Will attempt to sync all mounted filesystems.
‘u’ – Will attempt to remount all mounted filesystems as read-only.
‘0’ – ‘9’ – Sets the console log level, controlling which kernel messages will be printed to the console. (‘0’, for example, would make it so that only emergency messages such as PANICs or OOPSes are displayed on the console.)
‘e’ – Send a SIGTERM to all processes, except for init.
‘i’ – Send a SIGKILL to all processes, except for init.
‘l’ – Send a SIGKILL to all processes, INCLUDING init.
‘h’ – Will display help.

One can cleanly reboot a hung or frozen system with the following keyboard combinations (provided SysRq is enabled and the system still responds to the keys):

Alt-SysRq-R (take the keyboard out of raw mode)
Alt-SysRq-S (sync unsaved data to disk)
Alt-SysRq-E (send the termination signal to all processes)
Alt-SysRq-I (send the kill signal to all processes)
Alt-SysRq-U (remount all mounted file systems read-only)
Alt-SysRq-B (reboot the system)
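
If no keyboard is available but a shell still responds (for example over ssh), a reduced form of this sequence can be sent through /proc/sysrq-trigger. The sketch below is an assumption-laden variant, not the sequence above: it leaves out r, e and i, because e and i would kill the very shell that is running the script, and r only affects the local keyboard. The function names and the two-second delay are hypothetical.

```shell
# remote_reboot_keys
# Prints the reduced key sequence: sync, remount read-only, reboot.
remote_reboot_keys() {
    echo "s u b"
}

# remote_reboot [TRIGGER_FILE] [DELAY_SECONDS]
# Writes each key to the trigger file, pausing between keys so the
# sync and remount have time to finish. Needs root on a real system.
remote_reboot() {
    trigger="${1:-/proc/sysrq-trigger}"
    delay="${2:-2}"
    for key in $(remote_reboot_keys); do
        echo "$key" > "$trigger" || return 1
        sleep "$delay"
    done
}

# Usage (as root, reboots the machine immediately): remote_reboot
```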

Here are the key sequences to use if you have set up iLO remote console hotkeys similar to mine, or if you are working from the physical alternate console:

Remote iLO console   Physical console (alternate console)   What it should display
------------------   ------------------------------------   ---------------------------------------------
ctrl-T               Ctrl-Alt-SysRq-p                       To display process information
ctrl-U               Ctrl-Alt-SysRq-m                       To display memory information
ctrl-V               Ctrl-Alt-SysRq-t                       To display call trace information
ctrl-W                                                      To switch to a non-graphics alternate console

SysRq also writes the info to /var/log/messages if it is able to do so.
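
To find where each dump starts in the log, one can search for the "SysRq :" header line that precedes the output (as in the sample further down). This is a minimal sketch with a hypothetical function name, and it assumes the header text matches the "SysRq : " form shown in this article.

```shell
# sysrq_events LOG_FILE
# Prints the line number and text of every "SysRq : " header line,
# so the surrounding dump can be located in the log.
sysrq_events() {
    grep -n 'SysRq : ' "$1"
}

# Usage (as root): sysrq_events /var/log/messages
```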
If you are doing this from an iLO Remote console using the hotkeys I defined, you should see something similar to the following when you enter ctrl-T, ctrl-U, and ctrl-V:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

[root@colard root]# SysRq : Show Regs

Pid/TGid: 0/0, comm: swapper
EIP: 0060:[] CPU: 1
EIP is at default_idle [kernel] 0x29 (2.4.21-20.ELsmp)
ESP: 080b:c01091c2 EFLAGS: 00000246 Tainted: P
EAX: 00000000 EBX: c0109100 ECX: c043bc80 EDX: c9b20000
ESI: c9b20000 EDI: c9b20000 EBP: c0109100 DS: 0068 ES: 0068 FS: 0000 GS: 0000
CR0: 8005003b CR2: b729f000 CR3: 376c9900 CR4: 000006f0
Call Trace: [] cpu_idle [kernel] 0x42 (0xc9b21fb0)
[] call_console_drivers [kernel] 0x63 (0xc9b21fc4)
[] printk [kernel] 0x153 (0xc9b21ffc)

Zone:Normal freepages:108783 min: 1279 low: 4544 high: 6304
Zone:HighMem freepages:1209405 min: 255 low: 20990 high: 31485
Free pages: 1321089 (1209405 HighMem)
( Active: 78806/14876, inactive_laundry: 4493, inactive_clean: 0, free: 1321089 )
aa:0 ac:0 id:0 il:0 ic:0 fr:2901
aa:0 ac:0 id:0 il:0 ic:0 fr:2901
aa:0 ac:32699 id:3 il:1 ic:0 fr:108783
aa:18729 ac:27378 id:14873 il:4492 ic:0 fr:1209405
1*4kB 4*8kB 5*16kB 3*32kB 2*64kB 0*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11604kB)
31*4kB 48*8kB 10*16kB 1*32kB 0*64kB 4*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 105*4096kB = 435132kB)
1363*4kB 495*8kB 121*16kB 21*32kB 2*64kB 3*128kB 0*256kB 2*512kB 1*1024kB 1*2048kB 1177*4096kB = 4837620kB)
Swap cache: add 0, delete 0, find 0/0, race 0+0
51200 pages of slabcache
180 pages of kernel stacks
0 lowmem pagetables, 799 highmem pagetables
Free swap: 2044072kB
1572863 pages of RAM
1277945 pages of HIGHMEM
95987 reserved pages
94481 pages shared
0 pages swap cached

[] sys_read [kernel] 0x97 (0xe34d3f94)

bash S 00000002 1776 3221 2619 (NOTLB)
Call Trace: [] schedule [kernel] 0x2f4 (0xe3651e78)
[] vgacon_cursor [kernel] 0xf3 (0xe3651e9c)
[] schedule_timeout [kernel] 0xbc (0xe3651ebc)
[] write_chan [kernel] 0x151 (0xe3651ed4)
[] read_chan [kernel] 0x291 (0xe3651ef4)
[] do_tty_write [kernel] 0x14d (0xe3651f40)
[] tty_read [kernel] 0x114 (0xe3651f74)
[] sys_read [kernel] 0x97 (0xe3651f94)

hpdiags-bin S 00000001 4244 4593 2457 2459 (NOTLB)
Call Trace: [] schedule [kernel] 0x2f4 (0xf701bdc0)
[] schedule_timeout [kernel] 0xbc (0xf701be04)
[] wait_for_connect [kernel] 0x1a8 (0xf701be3c)
[] tcp_accept [kernel] 0x145 (0xf701be98)
[] inet_accept [kernel] 0x35 (0xf701beb4)
[] sys_accept [kernel] 0x86 (0xf701bed4)
[] tcp_listen_start [kernel] 0x191 (0xf701bf2c)
[] inet_listen [kernel] 0xb3 (0xf701bf50)
[] sys_listen [kernel] 0x50 (0xf701bf68)
[] sys_socketcall [kernel] 0xd9 (0xf701bf80)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
