Thursday, December 11, 2008

An Introduction to ARM Assembly Language (Jason Fuller)





Who is this document for?


This document is intended for anyone who occasionally needs to debug compiled ARM code at the assembly language level.


Why would I want to do that?


Because retail builds have compiler optimizations turned on, and compiler optimizations confuse the source-level debugger. For example, the values it displays for your local variables are often wrong because the real values are kept in registers, not on the stack where the debugger knows how to find them.


Registers


The ARM CPU has 16 registers, r0 through r15. The last three are usually referred to by name: sp (r13), lr (r14), and pc (r15).


r0 through r3 are used as general purpose registers, but they are also used to pass the first four parameters into a function. Their values are not guaranteed to be preserved across function calls.


r0 is also used to return the return value from a function.


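r4 through r11 are general purpose registers. It is the responsibility of a called function to preserve these values, i.e., to ensure that the values of these registers are the same when the function exits as when it was entered. The same is true of r13, the stack pointer. (r12 is a scratch register; like r0 through r3, it is not guaranteed to be preserved across function calls.)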


sp is the Stack Pointer.

lr is the Link Register, which stores the return address when a function is called. But note that lr can also be used as a general-purpose register when it is not being used as the link register, so be careful.

pc is the Program Counter, i.e., the instruction pointer.



Condition Flags


There are four condition flags that are set as the result of executing instructions:


N   Negative   Set if result is negative
Z   Zero       Set if result is zero
C   Carry      Set if a carry occurs, or a bit is shifted off the end by a shift instruction
V   Overflow   Set if signed overflow occurs, i.e., the result does not fit in a signed 32-bit value



Instructions


A full list of ARM instructions can be found at http://www.arm.com/pdfs/QRC0001H_rvct_v2.1_arm.pdf, but it's not particularly readable or educational, so I'll go over the most common instructions here.


Throughout this document, I'll use C code in the right column to explain what the assembly instruction in the left column does. I use C syntax simply because everyone is familiar with it.


Instruction C language equivalent

------------- ---------------------------

mov rx, ry rx = ry // Move register ry into rx


mov rx, ry, lsl #5      rx = ry << 5 // Logical Shift Left


mov rx, ry, lsr #6      rx = ry >> 6 // Logical Shift Right


mov rx, #0x12           rx = 0x12 // Move 0x12 into rx


mov rx, #0x21, 28 rx = 0x21 rotated right 28 bits

(see below)


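str rx, [ry, #0x12]     *(DWORD *)((BYTE *)ry + 0x12) = rx

                        // Store rx into memory at ry + 0x12 (a byte offset)



str rx, [ry, rz]        *(DWORD *)((BYTE *)ry + rz) = rx

                        // Store rx into memory at ry + rz (a byte offset)


ldr rx, [ry], -rz       rx = *ry; ry -= rz

                        // Post-indexed: load from ry first, then subtract rz from ry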



cmp rx, ry Compare rx to ry, and set the Condition flags

accordingly, for example, Z = (rx == ry)


cmp rx, #0x12 Compare rx to 0x12.


add rx, ry, #0x12 rx = ry + 0x12



sub rx, ry, #0x12       rx = ry - 0x12


mul rx, ry, rz          rx = ry * rz // mul only takes registers, not constants


orr rx, ry, #2 rx = ry | 2


and rx, ry, #2 rx = ry & 2


bic rx, ry, #5 rx = ry & ~5 // Bit Clear


bx rx rx(); // Jump to the address in rx



A little explanation is in order about instructions that use "shifter operands" such as mov rx, #0x21, 28. It may seem like an odd instruction but is actually quite common, because it allows the compiler to stuff many 32-bit constant values into just 12 bits of an instruction: 8 bits for the constant (0x21) and 4 bits for the rotate amount (28), which can be any even number from 0 to 30 (the 4-bit field holds half the rotation). Without this trick of stuffing the constant into the 32-bit instruction, the compiler would have to load the constant from memory, which is much more expensive.


By the way, the example we've been using:

mov rx, #0x21, 28       rx = 0x21 rotated right 28 bits

is essentially the same as:

rx = 0x21 << (32 - 28)

or

rx = 0x21 << 4
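
To make the rotation concrete, here is a small Python sketch of how a shifter-operand constant is decoded, plus a brute-force check of whether a given constant can be encoded at all. The function names ror32 and encode_immediate are made up for this illustration; they are not part of any ARM tool.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def encode_immediate(constant):
    """Return (imm8, rotation) if the constant fits a shifter operand, else None.

    The constant fits when rotating it LEFT by some even amount leaves
    a value that fits in 8 bits; that amount is the right-rotation to encode.
    """
    for rotation in range(0, 32, 2):                 # only even rotations exist
        imm8 = ror32(constant, (32 - rotation) % 32)  # undo the right rotation
        if imm8 <= 0xFF:
            return imm8, rotation
    return None

print(hex(ror32(0x21, 28)))          # the constant loaded by mov rx, #0x21, 28
print(encode_immediate(0x210))       # fits in a shifter operand
print(encode_immediate(0x12345678))  # does not fit; must be loaded from memory
```

Constants for which encode_immediate returns None are the ones the compiler has to place in memory and load separately.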



PC-relative addressing


Even though the use of shifter operands can sometimes allow the compiler to fit a constant into an instruction, sometimes the constant simply won't fit and must be loaded from memory. Where does the compiler put these constants? Right in the instruction stream, between where one function ends and the next one starts. This allows the compiler to load a constant using "pc-relative addressing", that is, using the program counter as if it were a pointer to data. For example:


ldr r1, [pc, #0x1C] DWORD *pc; r1 = pc[0x1c / 4];


The one catch is that the value of pc used in the pointer arithmetic is not the address of the instruction itself. It is the address of the instruction plus 8. (This is just an artifact of the way the chip works. By the time the instruction actually executes, the pc has already been incremented.) So, for example:


Address Instruction Disassembly

------- ----------- -----------

01F05640 e59f101c ldr r1, [pc, #0x1C]

01F05664 12345678 ??? // The data is at address 1F05640 + 8 + 1C
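
The address arithmetic in this example can be checked with a short Python sketch. It pulls the base register, destination register, and 12-bit offset out of the instruction word e59f101c and adds the offset to the instruction address plus 8. The helper names are hypothetical, and the decoder only handles this one addressing form (ldr with an immediate offset).

```python
def decode_ldr_immediate(word):
    """Extract the fields of interest from an `ldr rd, [rn, #offset]` encoding."""
    return {
        "base_register": (word >> 16) & 0xF,    # 15 means pc
        "dest_register": (word >> 12) & 0xF,
        "add_offset": bool((word >> 23) & 1),   # U bit: 1 = add, 0 = subtract
        "offset": word & 0xFFF,                 # 12-bit byte offset
    }

def literal_address(instruction_address, word):
    """Address of the data a pc-relative ldr reads: instruction + 8 +/- offset."""
    fields = decode_ldr_immediate(word)
    offset = fields["offset"] if fields["add_offset"] else -fields["offset"]
    return instruction_address + 8 + offset

fields = decode_ldr_immediate(0xE59F101C)
print(fields)
print(hex(literal_address(0x01F05640, 0xE59F101C)))
```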


By the way, this is why when you're looking at a disassembly, some instructions will look weird, or show as ???. It's because they're not really instructions, they're data.



Suffixes


B and H

By default, instructions operate on 32-bit words. (Note that this definition of a word is different from the Win32 concept of a WORD, which is 16 bits.) However, an instruction that has the H suffix operates on halfwords (16 bits), and an instruction with the B suffix operates on bytes. For example:


strb r1, [r8, #0x28]    ((BYTE*)r8)[0x28] = (BYTE)r1 // Store the low byte of r1



S

Some instructions take an optional S suffix, which means "update the condition flags based on the result of this instruction".



Conditional suffixes


All ARM instructions are conditional, i.e., they can all be modified by a suffix indicating under what conditions the instructions should be executed. Here are the most common condition suffixes:


Suffix   Meaning                       Condition under which instruction is to execute
------   ---------------------------   -----------------------------------------------
eq       equal                         Z == 1
ne       not equal                     Z == 0
hi       unsigned higher               C == 1 && Z == 0
ls       unsigned lower or same        C == 0 || Z == 1
ge       signed greater or equal       N == V
lt       signed less than              N != V
gt       signed greater than           Z == 0 && N == V
le       signed less than or equal     Z == 1 || N != V


Don't worry too much about the third column. The suffixes work the way you would expect them to. For example:


cmp r1, #0x5

addeq r2, r1, r3 if (r1 == 5) r2 = r1 + r3;

movne r3, #0x18 if (r1 != 5) r3 = 0x18;

movgt r3, #0x18 if (r1 > 5) r3 = 0x18;

bleq Function if (r1 == 5) Function();



If the condition is not true, then the instruction does nothing.
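
If you do want to see how the flags drive the suffixes, here is a rough Python model of cmp and the eight conditions listed above. It is only a sketch: cmp_flags and condition_holds are invented names, and registers are modeled as plain 32-bit integers. Note that le is written with "or", which is how the ARM architecture defines it.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Return (N, Z, C, V) as `cmp a, b` would set them (it computes a - b)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31                        # sign bit of the result
    z = int(result == 0)
    c = int(a >= b)                         # carry set means "no borrow"
    v = ((a ^ b) & (a ^ result)) >> 31      # signed overflow of the subtraction
    return n, z, c, v

CONDITIONS = {
    "eq": lambda n, z, c, v: z == 1,
    "ne": lambda n, z, c, v: z == 0,
    "hi": lambda n, z, c, v: c == 1 and z == 0,
    "ls": lambda n, z, c, v: c == 0 or z == 1,
    "ge": lambda n, z, c, v: n == v,
    "lt": lambda n, z, c, v: n != v,
    "gt": lambda n, z, c, v: z == 0 and n == v,
    "le": lambda n, z, c, v: z == 1 or n != v,
}

def condition_holds(suffix, a, b):
    """Would an instruction with this suffix execute after `cmp a, b`?"""
    return CONDITIONS[suffix](*cmp_flags(a, b))

for suffix in CONDITIONS:
    print(suffix, condition_holds(suffix, 7, 5))
```

Passing 0xFFFFFFFF as the first value shows the signed/unsigned split: it is "lt" 5 when read as -1, but "hi" than 5 when read as an unsigned number.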


Conditional instructions allow the compiler to compile simple "if then else" statements without using any "jump" instructions. For example:


if (r1 == 5)

r2 = 6;

else

r3 = 7;


would become :


cmp r1, #5

moveq r2, #6

movne r3, #7

whereas the x86 compiler (generally) would have to generate two "jump" instructions: one to jump over the "then" clause if the condition was not met, and one to jump over the "else" clause if it was.




Data alignment


The ARM CPU can only access DWORDs in memory that are aligned on addresses that are divisible by 4. Likewise, it can only access 16-bit values on addresses divisible by 2. An unaligned access will result in a Datatype Misalignment exception.




Function Calls


One of the most important mechanisms to understand is how a function call happens.


Step 1 - Parameters:


The caller sets up the parameters. r0 through r3 are used to transfer the first four parameters of a function. If there are more than four, the rest are pushed on the stack. (The rightmost parameter is pushed first).
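
As a rough sketch of this step, the Python below decides where each parameter goes. It assumes every parameter fits in one 32-bit register (64-bit values and structs passed by value are not modeled), and place_arguments is a made-up helper, not a real API.

```python
ARGUMENT_REGISTERS = ["r0", "r1", "r2", "r3"]

def place_arguments(args):
    """Return (register assignments, push order) for a list of parameters."""
    # The first four parameters travel in r0..r3, left to right.
    in_registers = dict(zip(ARGUMENT_REGISTERS, args))
    # The rest go on the stack; the rightmost parameter is pushed first,
    # so the fifth parameter ends up at the lowest address.
    push_order = list(reversed(args[4:]))
    return in_registers, push_order

registers, pushes = place_arguments(["a", "b", "c", "d", "e", "f"])
print(registers)
print(pushes)
```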


Step 2 - Call:


The function is called by executing the Branch and Link instruction:


bl MyFunction           lr = address of the next instruction; MyFunction();


Note how this instruction sets lr to be the address to return to when the function is done.


For C++ virtual method calls, you'll see the bx instruction instead of bl. Bx jumps to the address in a register. For example, if r0 is your C++ "this" pointer:


ldr r2, [r0] r2 = &vtable

ldr r3, [r2, #0xC] r3 = vtable[3] // == fourth method

mov lr, pc              Manually set up lr since we're not using bl (pc reads as this instruction + 8, i.e., the instruction after the bx)

bx r3 Jump to r3, i.e., call the fourth method


Step 3 - Preserve registers:


The first instruction in a function usually looks something like this:


stmdb sp!, {r4 - r6, lr} push(lr); push(r6);

push(r5); push(r4);


This is the "Store Multiple Decrement Before" instruction, which is my favorite CPU instruction of all time. It pushes an entire specified set of registers onto the stack in one instruction. This serves two purposes. First, it preserves the values of whichever of r4 through r11 the function is going to modify (remember it is the responsibility of the called function to preserve these). Second, it safely stores away the return address, lr.


Note that the order in which it pushes the registers may be the opposite of what you expect. The nice thing about this, though, is that if you are looking at a memory dump of the stack, the registers will be in the same order in memory as they are listed in the code.
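
The resulting memory layout can be modeled in a few lines of Python. This is only a sketch: the stack is a dictionary keyed by address, and stmdb and ldmia here are toy functions named after the instructions, with register values invented for the example.

```python
REG_NUMBER = {"r%d" % i: i for i in range(13)}
REG_NUMBER.update({"sp": 13, "lr": 14, "pc": 15})

def stmdb(sp, memory, names, values):
    """Model `stmdb sp!, {names}`: returns the new, lower stack pointer."""
    ordered = sorted(names, key=REG_NUMBER.__getitem__)
    sp -= 4 * len(ordered)
    # The lowest-numbered register always lands at the lowest address.
    for i, name in enumerate(ordered):
        memory[sp + 4 * i] = values[name]
    return sp

def ldmia(sp, memory, names, values):
    """Model `ldmia sp!, {names}`: reloads registers, returns the new sp."""
    ordered = sorted(names, key=REG_NUMBER.__getitem__)
    for i, name in enumerate(ordered):
        values[name] = memory[sp + 4 * i]
    return sp + 4 * len(ordered)

memory = {}
regs = {"r4": 0x44, "r5": 0x55, "r6": 0x66, "lr": 0x11960}
sp = stmdb(0x1000, memory, ["r4", "r5", "r6", "lr"], regs)
for address in sorted(memory):
    print(hex(address), hex(memory[address]))
```

Reading the dump from the lowest address up gives r4, r5, r6, lr, the same order as the register list in the instruction.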


Step 4 - Locals:


Next, the callee decrements the stack pointer in order to reserve space on the stack for its local variables:


sub sp, sp, #0xC sp -= 12;


Note that the size it reserves may not be what you expect from looking at the local variables in the C code. This is because the optimizing compiler may not need to store a local on the stack at all; it may be able to get away with using registers.


Step 5 - Body:


Next, the body of the function is executed. Somewhere along the way, (the optimizing compiler is free to decide where) the compiler sets r0 to the return value of the function.


Step 6 - Return:


If space was allocated on the stack for locals, it is released:

add sp, sp, #0xC


Then the registers that we saved away at the beginning of the function are restored using the "Load Multiple Increment After" instruction:


ldmia sp!, {r4 - r6, lr}    pop(r4); pop(r5); pop(r6); pop(lr);


And finally we jump to the return address:


bx lr



Note, if you are debugging an app compiled for a version of Windows Mobile prior to v5.0, a different code sequence will be used:


ldmia sp!, {r4 - r6, pc}    pop(r4); pop(r5); pop(r6); pop(pc);


Note that this is the same list of registers as used in the stmdb instruction at the beginning of the function, except that now lr has been replaced by pc. So the value of lr when the function was entered, i.e., the return address, is now loaded into pc, the program counter. This has the effect of jumping to the return address. In other words, returning from the function.




The Frame Pointer


The Frame Pointer is not a register; it's just a concept. The Frame Pointer is the value of the stack pointer while the body of a function is executing. In other words, it's a pointer to a stack frame. What's in a stack frame? Well, from steps 3 and 4 above, a stack frame contains local variables, followed by the registers that the function needed to preserve.


Every function call in a call stack has a frame pointer. Platform Builder will show you the frame pointers if you right-click in the Call Stack window and check "Frame Pointer". In Visual Studio 2005, you need to double-click on the frame in the Call Stack window, then look at the value of "sp" in the Registers window. (If the Registers window says "No Data Available", right-click the Registers window and check "Device Registers".)




Finding local variables


Now that you understand the basics of ARM assembly, you can use them to do some cool debugging tricks. For example, the debugger often can't give you the value of local variables in a retail build, because the debugger assumes locals are on the stack, and the optimizing compiler often keeps them in registers instead. But, by looking at the disassembly window, and figuring out what the disassembly is doing and how it relates to the C source code, you can find where the compiler is storing your local variables.


Understanding the optimized code the compiler generates can be tricky, and there's no cookbook to finding your locals, but here's one tip:


Before a function is called, the parameters have to be loaded into r0, r1, r2, and r3. So it's fairly easy to look at what the compiler is loading into these registers and to match them up to the C code. For example:


RECT rc;

GetWindowRect(hWnd, &rc);


becomes:


add r1, sp, #0x20 // This tells us that rc is

// located at sp + 0x20


mov r0, r4 // This tells us that hWnd was in r4

// before this snippet of code

bl GetWindowRect



Finding local variables in other stack frames


Now let's tackle a harder problem. Suppose you want to know the value of a local that lives in a function that is not at the top of the call stack? For example, suppose your call stack looks like this:


Generic.exe!MyRegisterClass

Generic.exe!InitInstance

Generic.exe!WinMain

Generic.exe!WinMainCRTStartup


And you want to know what the value of hInstance was in WinMain. The first step is to do what we did before: read the disassembly and figure out where the value was before InitInstance was called:


int WINAPI WinMain(HINSTANCE hInstance,

HINSTANCE hPrevInstance,

LPTSTR lpCmdLine,

int nCmdShow)

{

00011960 stmdb sp!, {r4, lr}

00011964 sub sp, sp, #0x1C

00011968 mov r4, r0

MSG msg;


// Perform application initialization:

if (!InitInstance(hInstance, nCmdShow))

0001196C mov r1, r3

00011970 bl |InitInstance ( 117f8h )|



Since hInstance is the first parameter to WinMain, hInstance must have been in r0 when WinMain was entered. The mov r4,r0 instruction tells you that hInstance was in r4 when InitInstance was called. Of course, it's not still in r4 by the time we got to MyRegisterClass. So how do you figure out what r4 used to be?


Remember that it is the responsibility of a called function to preserve the values of r4 through r11 and r13. So, one solution is to just step in the debugger back out to WinMain and then look at r4. But there are a number of reasons why this might not be possible: you may be looking at a post-mortem Watson dump; the debugger might be misbehaving; you might need to do more investigation before you step back out and lose your current state, etc. So then what do you do?


Since it is the responsibility of a called function to preserve r4 – r11, that means that one of the functions in the call stack (above the function you care about) must have preserved your r4. So, starting at WinMain (since that's where your local lives) walk up the stack one frame at a time looking for a function that preserved r4 on the stack.


So we start at InitInstance. Find the beginning of the function:


BOOL InitInstance(HINSTANCE hInstance, int nCmdShow)

{

000117F8 stmdb sp!, {r4 - r7, lr}

000117FC sub sp, sp, #0x75, 30


And there it is. InitInstance preserves r4. But where exactly did it put r4? Remember what the stack looks like:


MyRegisterClass locals MyRegisterClass frame pointer

MyRegisterClass preserved registers

InitInstance locals InitInstance frame pointer

InitInstance preserved registers

WinMain locals WinMain frame pointer

WinMain preserved registers


So first you need to find the frame pointer for InitInstance, since InitInstance is the function that pushed r4. Platform Builder's call stack window will give you this. Or, if you are using Visual Studio 2005, double-click InitInstance's frame in the Call Stack window, then look at the value of sp in the Registers window. (If for some reason you don't have a debugger, you can even figure out the frame pointer yourself by starting with the current value of the stack pointer and looking at the prolog of each function in the stack to see how big each frame is.)


So now we know InitInstance's frame pointer, which points to its locals. The "sub sp,sp,#0x75,30" instruction in InitInstance's prolog tells us that InitInstance has 0x1D4 bytes of locals (0x1D4 == 0x75 << (32-30)). So InitInstance's preserved registers start at the frame pointer + 0x1d4. And since r4 is the first of the preserved registers, the saved r4 lives at InitInstance's frame pointer + 0x1d4. And there you have it, you found the value of WinMain's local hInstance variable.


By the way, if you reach the very top of the call stack, and none of the functions preserved r4, that means that none of them trash r4, and so the current value of r4 is what you are looking for.
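
The address arithmetic from this walkthrough can be sketched in Python. Here frame_pointer stands for the frame pointer of whichever function's prolog pushed the register you are after; the value 0x2002F000 is invented for the example, and saved_register_address is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def saved_register_address(frame_pointer, locals_size, saved_registers, wanted):
    """Address of a register pushed by a prolog of the form
    stmdb sp!, {saved_registers} followed by sub sp, sp, #locals_size.

    saved_registers must be listed in ascending register order, which is
    also the order they occupy in memory above the locals.
    """
    return frame_pointer + locals_size + 4 * saved_registers.index(wanted)

locals_size = ror32(0x75, 30)      # from sub sp, sp, #0x75, 30
frame_pointer = 0x2002F000         # hypothetical value read from the debugger
pushed = ["r4", "r5", "r6", "r7", "lr"]

print(hex(locals_size))
print(hex(saved_register_address(frame_pointer, locals_size, pushed, "r4")))
```

The same function also gives the slot holding the saved return address: ask for "lr" instead of "r4".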



Wednesday, November 26, 2008

my VIM setting

" All system-wide defaults are set in $VIMRUNTIME/debian.vim (usually just
" /usr/share/vim/vimcurrent/debian.vim) and sourced by the call to :runtime
" you can find below. If you wish to change any of those settings, you should
" do it in this file (/etc/vim/vimrc), since debian.vim will be overwritten
" everytime an upgrade of the vim packages is performed. It is recommended to
" make changes after sourcing debian.vim since it alters the value of the
" 'compatible' option.

" This line should not be removed as it ensures that various options are
" properly set to work with the Vim-related packages available in Debian.
runtime! debian.vim

" Uncomment the next line to make Vim more Vi-compatible
" NOTE: debian.vim sets 'nocompatible'. Setting 'compatible' changes numerous
" options, so any other options should be set AFTER setting 'compatible'.
"set compatible

" Vim5 and later versions support syntax highlighting. Uncommenting the next
" line enables syntax highlighting by default.
syntax on

" If using a dark background within the editing area and syntax highlighting
" turn on this option as well
"set background=dark

" Uncomment the following to have Vim jump to the last position when
" reopening a file
if has("autocmd")
au BufReadPost * if line("'\"") > 0 && line("'\"") <= line("$")
\| exe "normal g'\"" | endif
endif

" Uncomment the following to have Vim load indentation rules according to the
" detected filetype. Per default Debian Vim only load filetype specific
" plugins.
if has("autocmd")
filetype indent on
endif

" MY SETTING
set wm=8 " set wrapmargin
set nohls " turn off highlight on search
set et " turn on expand tab
" AUTO-COMMANDS
" " for Makefiles
" " added some special formatting in Makefiles
autocmd BufEnter ?akefile* set noet ts=8 sw=8 nocindent list lcs=tab:>-,trail:x
" for source code
autocmd BufEnter *.cpp,*.h,*.c,*.java,*.pl set et ts=3 sw=3 cindent
" change the filetype
autocmd BufEnter *.pro,*.prolog set et ts=3 sw=3 cindent ft=prolog
" for html
autocmd BufEnter *.html set et ts=3 sw=3 wm=8 nocindent

"set softtabstop=3
"set shiftwidth=3

" The following are commented out as they cause vim to behave a lot
" differently from regular Vi. They are highly recommended though.
set showcmd " Show (partial) command in status line.
"set showmatch " Show matching brackets.
"set ignorecase " Do case insensitive matching
set smartcase " Do smart case matching
"set incsearch " Incremental search
"set autowrite " Automatically save before commands like :next and :make
"set hidden " Hide buffers when they are abandoned
"set mouse=a " Enable mouse usage (all modes) in terminals

set foldmethod=syntax
set foldlevel=100
" added for taglists
let Tlist_Show_One_File=1
let Tlist_Exit_OnlyWindow=1
"
let g:winManagerWindowLayout='FileExplorer|TagList'
nmap wm :WMToggle<cr>
" Source a global configuration file if available
" XXX Deprecated, please move your changes here in /etc/vim/vimrc
if filereadable("/etc/vim/vimrc.local")
source /etc/vim/vimrc.local
endif

Big endian vs. little endian: which one is better?

http://www.cs.umass.edu/~Verts/cs32/endian.html

Which is Better?

You may see a lot of discussion about the relative merits of the two
formats, mostly religious arguments based on the relative merits of
the PC versus the Mac. Both formats have their advantages and
disadvantages.

In "Little Endian" form, assembly language instructions for picking up
a 1, 2, 4, or longer byte number proceed in exactly the same way for
all formats: first pick up the lowest order byte at offset 0. Also,
because of the 1:1 relationship between address offset and byte number
(offset 0 is byte 0), multiple precision math routines are
correspondingly easy to write.

In "Big Endian" form, by having the high-order byte come first, you
can always test whether the number is positive or negative by looking
at the byte at offset zero. You don't have to know how long the number
is, nor do you have to skip over any bytes to find the byte containing
the sign information. The numbers are also stored in the order in
which they are printed out, so binary to decimal routines are
particularly efficient.
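
Both claims are easy to try out with Python's struct module. This is just a quick sketch with an arbitrary value; nothing in it is specific to the PC or the Mac.

```python
import struct

value = 0x12345678
little = struct.pack("<I", value)   # low-order byte stored first
big = struct.pack(">I", value)      # high-order byte stored first

print(little.hex())
print(big.hex())

# Little endian: reading 1, 2, or 4 bytes always starts at offset 0
# and yields the low-order part of the number.
for width in (1, 2, 4):
    print(width, hex(int.from_bytes(little[:width], "little")))

def is_negative_big_endian(data):
    """Big endian: the sign bit is in the byte at offset 0, whatever the length."""
    return bool(data[0] & 0x80)

print(is_negative_big_endian(struct.pack(">h", -2)))   # 2-byte number
print(is_negative_big_endian(struct.pack(">q", -2)))   # 8-byte number
print(is_negative_big_endian(struct.pack(">i", 2)))
```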

Monday, November 24, 2008

install puppy linux on thumb drive without cdrom


Puppy Linux can run in live CD mode and can also boot from USB. To make a
"Linux that can be taken anywhere" system, I want to install it on a thumb
drive. The standard method is to download the .iso file, burn a bootable CD,
and run the install app after booting from the CD. But the problem is: I don't
have a writable CD. I just want to install it on the USB disk.

It is simple:

1. Format the thumb drive with a FAT filesystem.
I am not sure whether it must be FAT16, but a "no bootable operating system"
error occurred when I used FAT32 instead. On Linux the command is
mkdosfs -F 16 /dev/sdb1 (if /dev/sdb1 is the thumb drive). If you are using
Windows, the tools described at the following link may be needed:
http://www.althack.com/2006/03/10/how-to-run-linux-on-a-usb-drive/

2. get the .iso file and extract it, and copy all the following files
to thumb drive.

-r--r--r-- 1 root root 2048 2008-11-02 10:22 boot.cat
-rw-r--r-- 1 root root 1008 2008-10-18 15:03 boot.msg
-rw-r--r-- 1 root root 1268722 2008-11-02 10:22 initrd.gz
-rw-rw-r-- 1 1026 1026 12241 2008-11-01 13:34 isolinux.bin
-rw-r--r-- 1 root root 112 2008-11-02 10:22 isolinux.cfg
-rwxr--r-- 1 root root 95563776 2008-11-02 10:22 pup_411.sfs
-rw-r--r-- 1 root root 1627180 2008-11-02 10:18 vmlinuz

3. Rename isolinux.cfg to syslinux.cfg and delete the text "pmedia=cd"
from it.

4. run syslinux /dev/sdb1 (if this is the thumb drive)

5. reboot with BIOS configured to "boot from USB disk"

Other notes:
Booting is a bit slow, but the system runs fast once it is up, because the
whole image is loaded into RAM during boot and Linux runs from a RAM disk.
The storage medium, however, is a USB flash device, so saving data can take a
long time, and flash should not be written to too frequently. I don't
currently have time to look into how the system is optimized for flash media.
Things to look into:
a. udev may be needed, because it creates the filesystem and device nodes
only in RAM, not on the USB disk.
b. check the block device driver
c. x.org?
d. how logs are created
e. anything special for package management

Anyway, it is a good and useful Linux distribution.

Friday, November 21, 2008

Linux Serial Console HOWTO

http://www.vanemery.com/Linux/Serial/serial-console.html


Introduction

Have you ever needed to connect a dumb terminal (like a Wyse 50) to a Linux host? Do you need to login to a Linux server from a laptop to perform administrative functions, because there is no monitor or keyboard attached to the server? If you are accustomed to administering routers, switches, or firewalls in this manner, then you may be interested in doing the same with some of your GNU/Linux hosts. This HOWTO will explain, step-by-step, how to setup a serial console for Red Hat 9, although most of it should apply to other distributions as well.

Why did I write this document? Although there are lots of documents available on the Internet dealing with Linux serial ports, most of them seemed to be either out of date, or focused on modem dial-in/dial-out. I wanted concise documentation on how to setup simple terminal access via RS-232-C serial ports for Red Hat 9.

Assumptions/Setup

I was using Red Hat 9 for this test. My test machine consisted of:

  • Motherboard: Gigabyte Technology GA-7VA motherboard (Rev. 2.0)
  • Chipset: VIA KT400A
  • CPU: AMD-K7 (Duron 1400)
  • RAM: 256MB DDR333
  • Serial Ports: 2 built-in ports with 16550A UARTs, DB-9 male
  • Linux kernel: 2.4.20-24.9

Step 1: Check your system's serial support

First, let's make sure that your operating system recognizes serial ports in your hardware. You should make a visual inspection and make sure that you have one or more serial ports on your motherboard or add-in PCI card. Most motherboards have two built-in ports, which are called COM1: and COM2: in the DOS/Windows world. You may need to enable them in BIOS before the OS can recognize them. After your system boots, you can check for serial ports with the following commands:

[root@oscar root]# dmesg | grep tty
ttyS0 at 0x03f8 (irq = 4) is a 16550A
ttyS1 at 0x02f8 (irq = 3) is a 16550A

[root@oscar root]# setserial -g /dev/ttyS[01]
/dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
/dev/ttyS1, UART: 16550A, Port: 0x02f8, IRQ: 3

As you can see, the two built-in serial ports are /dev/ttyS0 and /dev/ttyS1.


Step 2: Configure your inittab to support serial console logins

The /etc/inittab file must be reconfigured to allow serial console logins. You will note that the mingetty daemon is used to listen for virtual consoles (like the 6 that run by default with your keyboard and monitor). You will need to configure agetty or mgetty to listen on the serial ports, because they are capable of responding to input on physical serial ports. In the past, I have used both full-featured gettys. In this document, I will only discuss agetty, since it is already included in the default Red Hat 9 installation. It handles console/dumb terminal connections as well as dial-in modem connections.

What is a getty?

A getty is a program that opens a tty port, prompts for a login name, and runs the /bin/login command. It is normally invoked by init.

Before you edit /etc/inittab, which is a very important config file, you should make a backup copy:

[root@oscar etc]# cp /etc/inittab /etc/inittab.org 

The required /etc/inittab additions are the lines in the "Run agetty on COM1/ttyS0 and COM2/ttyS1" block:

id:3:initdefault:

# System initialization.
si::sysinit:/etc/rc.d/rc.sysinit

l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6

# Trap CTRL-ALT-DELETE
ca::ctrlaltdel:/sbin/shutdown -t3 -r now

pf::powerfail:/sbin/shutdown -f -h +2 "Power Failure; System Shutting Down"

# If power was restored before the shutdown kicked in, cancel it.
pr:12345:powerokwait:/sbin/shutdown -c "Power Restored; Shutdown Cancelled"

# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6

# Run agetty on COM1/ttyS0 and COM2/ttyS1
s0:2345:respawn:/sbin/agetty -L -f /etc/issueserial 9600 ttyS0 vt100
s1:2345:respawn:/sbin/agetty -L -f /etc/issueserial 38400 ttyS1 vt100
#s1:2345:respawn:/sbin/agetty -L -i 38400 ttyS1 vt100


# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon


agetty options explained:

  • -L    force line to be local line with no need for carrier detect (when you have no modem).
  • -f    alternative /etc/issue file. This is what a user sees at the login prompt.
  • -i    do not display any messages at the login prompt.
  • 9600    serial line rate in bps. Set this to your dumb terminal or terminal emulator line rate.
  • ttyS0    this is the serial port identifier.
  • vt100    is the terminal emulation. You can use others, but VT100 is the most common or "standard". Another widely used terminal type is VT102.
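
To see how init reads one of these lines, here is a small Python sketch that splits an inittab entry into its four colon-separated fields (id, runlevels, action, process). parse_inittab_entry is a made-up helper for illustration; it is not how init itself is implemented.

```python
def parse_inittab_entry(line):
    """Split an inittab entry of the form id:runlevels:action:process."""
    # Split at most three times so colons inside the command are left alone.
    entry_id, runlevels, action, process = line.strip().split(":", 3)
    return {
        "id": entry_id,
        "runlevels": list(runlevels),   # "2345" -> ["2", "3", "4", "5"]
        "action": action,
        "process": process,
    }

entry = parse_inittab_entry(
    "s0:2345:respawn:/sbin/agetty -L -f /etc/issueserial 9600 ttyS0 vt100"
)
print(entry["id"], entry["runlevels"], entry["action"])
print(entry["process"].split())
```

The "respawn" action is what makes init start a fresh agetty every time a login session on the port ends.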

Possible serial line rates (sometimes called baud rates) for the 16550A UART:

  • 110 bps
  • 300 bps
  • 1200 bps
  • 2400 bps
  • 4800 bps
  • 9600 bps
  • 19,200 bps
  • 38,400 bps
  • 57,600 bps
  • 115,200 bps

I have tried all of these line rates. 9600 bps is generally O.K., and is a very common setting for networking hardware. 38,400 bps is the speed of the standard Linux console, so it is my second choice. If your dumb terminal or terminal emulator cannot handle 38,400 bps, then try 19,200 bps: it is reasonably speedy and you will not be annoyed.

Here was my custom issue file, /etc/issueserial. It uses escape sequences defined in the agetty manpage to add some useful information, such as the serial port number, line speed, and how many users are currently logged on:

 
Oscar
Connected on \l at \b bps
\U

Now, you must activate the changes that you made in /etc/inittab. This is done with the following command, which forces the init process to re-read the configuration file:

[root@oscar root]# init q 

Now, let's make sure that the agetty process is listening on the serial ports:

[root@oscar root]$ ps -ef | grep agetty
root 958 1 0 Dec13 ttyS0 00:00:00 /sbin/agetty -L -f /etc/issueserial 9600 ttyS0 vt100
root 1427 1 0 Dec13 ttyS1 00:00:00 /sbin/agetty -L -f /etc/issueserial 38400 ttyS1 vt100


Step 3: Test serial port login with an external dumb terminal or terminal emulator



I have tested this setup with a WYSE dumb terminal, a Linux laptop running Minicom, and Windows 2000/XP laptops running HyperTerminal. They all worked just fine.

Terminal settings:  should be 9600, N, 8, 1. Terminal emulation should be set to VT100 or VT102. Turn flow control off. If you want to use the 38,400 bps serial port on ttyS1, then your settings should be adjusted to 38400, N, 8, 1.
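
For a feel of what these settings mean in practice, here is a back-of-the-envelope Python sketch. With N, 8, 1 framing each character costs 10 bits on the wire (1 start bit, 8 data bits, no parity bit, 1 stop bit). The 80x24 screen size is an assumption chosen for the example, and the function names are invented.

```python
def chars_per_second(bps, data_bits=8, parity_bits=0, stop_bits=1):
    """Characters per second on an asynchronous serial line."""
    bits_per_char = 1 + data_bits + parity_bits + stop_bits   # 1 start bit
    return bps / bits_per_char

def seconds_to_fill_screen(bps, columns=80, rows=24):
    """Time to repaint a full text screen, ignoring escape sequences."""
    return (columns * rows) / chars_per_second(bps)

for rate in (9600, 19200, 38400):
    print(rate, chars_per_second(rate), seconds_to_fill_screen(rate))
```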

Cable:  To connect a laptop to the serial port on the Linux host, you need to have a null-modem cable. The purpose of a null-modem cable is to permit two RS-232 DTE devices to communicate with each other without modems between them. While you can construct this yourself, a good, sturdy manufactured null-modem cable is inexpensive and will last longer.

If you insist on making the cable yourself, then check out Nullmodem.Com for the wiring and pinout diagram.

Connectors:  Motherboard serial ports are typically male DB-9 connectors, but some serial ports use DB-25 connectors. You may need some DB-9 to DB-25 converters or gender-changers in order to connect to your terminal. For a typical laptop to server connection, a DB-9 null-modem cable should be sufficient.

Here is what you should see on the dumb terminal or terminal emulator:

Oscar
Connected on ttyS1 at 38400 bps
3 users

oscar.vanemery.com login:

Note:  If you want to be able to login via serial console as the root user, you will need to edit the /etc/securetty config file. The entries to add are ttyS0 and ttyS1:

console
ttyS0
ttyS1

vc/1
vc/2
vc/3
vc/4
vc/5
vc/6
vc/7
vc/8
vc/9
vc/10
vc/11
tty1
tty2
tty3
tty4
tty5
tty6
tty7
tty8
tty9
tty10
tty11


Step 4: Modifying the agetty settings

If you want to change the baud rate or some other agetty setting, you will need to perform these 3 steps:

  1. Modify the /etc/inittab configuration file
  2. Activate the config change by forcing init to re-read the config file
  3. Restart the agetty daemons

Here is an example of steps 2 and 3:

[root@oscar root]# init q
[root@oscar root]# pkill agetty


Optional:  Configure serial port as THE system console

You can use options in /etc/grub.conf to redirect console output to one of your serial ports. This can be handy if you do not have a keyboard or monitor available for the Linux host in question. You can also see all of the bootup and shutdown messages from your terminal. In this example, we will make the /dev/ttyS1 port be the console. The text to add to the config file is the console=ttyS1,38400 argument at the end of the kernel line:

# grub.conf generated by anaconda
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Red Hat Linux (2.4.20-24.9)
root (hd0,0)
kernel /vmlinuz-2.4.20-24.9 ro root=LABEL=/ console=ttyS1,38400
initrd /initrd-2.4.20-24.9.img

Now, if you drop your system into single user mode with the "init 1" command, you will still be able to administer the system from your serial-connected terminal. No monitor or keyboard is required!

Warning!:   The kudzu hardware detection program may "choke" on boot when the serial port becomes the console, instead of the video adapter. To remedy this situation, you should disable kudzu (assuming that your hardware is configured properly and won't be changing). This is how you would do that:

[root@oscar root]# chkconfig kudzu off
[root@oscar root]# chkconfig --list kudzu
kudzu 0:off 1:off 2:off 3:off 4:off 5:off 6:off

You should also know how to break into the Grub bootloader during system startup and edit the kernel line. By deleting the console argument from the kernel line, you can boot the system with the standard console, which uses the video card and attached keyboard. You have been warned!


Conclusion

Now, you should be able to login from the serial ports on your GNU/Linux host. This could be useful for maintenance or for serving a whole room full of dumb terminals. In the future, I will investigate a PCI multiport serial card in the latter role.

Have fun!


Saturday, November 15, 2008

The max number of threads that can be created in Linux

/proc/sys/kernel/threads-max

kernel/fork.c: fork_init()
max_threads = mempages / ( 8 * THREAD_SIZE / PAGE_SIZE);
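
As a rough illustration of the formula above, here is a minimal C sketch. The function name max_threads_estimate and its parameters are made up for this example; the kernel derives mempages, THREAD_SIZE and PAGE_SIZE itself, and the numbers in the usage note are assumed values, not measurements.

```c
#include <assert.h>

/* Mirrors the expression quoted from fork_init():
 *   max_threads = mempages / (8 * THREAD_SIZE / PAGE_SIZE)
 * The divisor is the number of pages taken by 8 thread stacks. */
unsigned long max_threads_estimate(unsigned long mempages,
                                   unsigned long thread_size,
                                   unsigned long page_size)
{
    assert(page_size != 0);
    unsigned long pages_per_8_stacks = 8 * thread_size / page_size;
    assert(pages_per_8_stacks != 0);
    return mempages / pages_per_8_stacks;
}
```

For example, with 131072 pages of 4096 bytes (512 MB) and an assumed THREAD_SIZE of 8192, the divisor is 16 and the estimate is 8192 threads.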

Itanium C++ ABI: Exception Handling


http://www.codesourcery.com/public/cxx-abi/abi-eh.html


Itanium C++ ABI: Exception Handling ($Revision: 1.22 $)


Introduction

In this document, we define the C++ exception handling ABI, at three levels:
  1. the base ABI, interfaces common to all languages and implementations;
  2. the C++ ABI, interfaces necessary for interoperability of C++ implementations; and
  3. the specification of a particular runtime implementation.

This specification is based on the general model described roughly in the Itanium Software Conventions and Runtime Architecture Guide. However, the Level I (base ABI) specification here contradicts that document in some particulars, and is being proposed as a modification. That document describes a framework which can be used by an arbitrary implementation, with a complete definition of the stack unwind mechanism, but no significant constraints on the language-specific processing. In particular, it is not sufficient to guarantee that two object files compiled by different C++ compilers could interoperate, e.g. throwing an exception in one of them and catching it in the other.

In Section I below, we will elaborate missing details from this base document, largely in the form of specifying the APIs to be used in accessing the language-independent stack unwind facilities, namely the unwind descriptor tables and the personality routines. This specification should be implemented by any Itanium psABI-compliant system.

In Section II below, we will specify the API of the C++ exception handling facilities, specifically for raising and catching exceptions. These APIs should be implemented by any C++ system compliant with the Itanium C++ ABI. Note that the level II and level III specifications are not completed at this time.


Definitions

The descriptions below make use of the following definitions:

landing pad
A section of user code intended to catch, or otherwise clean up after, an exception. It gains control from the exception runtime via the personality routine, and after doing the appropriate processing either merges into the normal user code or returns to the runtime by resuming or raising a new exception.


Base Documents

This document is based on the C++ ABI for Itanium, and the Level II specification below is considered to be part of that document (Chapter 4). See Base Documents in that document for further references.


Level I. Base ABI

This section defines the Unwind Library interface, expected to be provided by any Itanium psABI-compliant system. This is the interface on which the C++ ABI exception-handling facilities are built. We assume as a basis the unwind descriptor tables described in the base Itanium Software Conventions & Runtime Architecture Guide. Our focus here will be on the APIs for accessing those structures.

It is intended that nothing in this section be specific to C++, though some parts are clearly intended to support C++ features.

The unwinding library interface consists of at least the following routines:

    _Unwind_RaiseException,
    _Unwind_Resume,
    _Unwind_DeleteException,
    _Unwind_GetGR,
    _Unwind_SetGR,
    _Unwind_GetIP,
    _Unwind_SetIP,
    _Unwind_GetRegionStart,
    _Unwind_GetLanguageSpecificData,
    _Unwind_ForcedUnwind

In addition, two datatypes are defined (_Unwind_Context and _Unwind_Exception) to interface a calling runtime (such as the C++ runtime) and the above routines. All routines and interfaces behave as if defined extern "C". In particular, the names are not mangled. All names defined as part of this interface have a "_Unwind_" prefix.

Lastly, a language and vendor specific personality routine will be stored by the compiler in the unwind descriptor for the stack frames requiring exception processing. The personality routine is called by the unwinder to handle language-specific tasks such as identifying the frame handling a particular exception.

1.1 Exception Handler Framework

Reasons for Unwinding

There are two major reasons for unwinding the stack:

  • exceptions, as defined by languages that support them (such as C++)
  • "forced" unwinding (such as caused by longjmp or thread termination).
The interface described here tries to keep both similar. There is a major difference, however.

  • In the case an exception is thrown, the stack is unwound while the exception propagates, but it is expected that the personality routine for each stack frame knows whether it wants to catch the exception or pass it through. This choice is thus delegated to the personality routine, which is expected to act properly for any type of exception, whether "native" or "foreign". Some guidelines for "acting properly" are given below.

  • During "forced unwinding", on the other hand, an external agent is driving the unwinding. For instance, this can be the longjmp routine. This external agent, not each personality routine, knows when to stop unwinding. The fact that a personality routine is not given a choice about whether unwinding will proceed is indicated by the _UA_FORCE_UNWIND flag.

To accommodate these differences, two different routines are proposed. _Unwind_RaiseException performs exception-style unwinding, under control of the personality routines. _Unwind_ForcedUnwind, on the other hand, performs unwinding, but gives an external agent the opportunity to intercept calls to the personality routine. This is done using a proxy personality routine that intercepts calls to the personality routine, letting the external agent override the defaults of the stack frame's personality routine.

As a consequence, it is not necessary for each personality routine to know about any of the possible external agents that may cause an unwind. For instance, the C++ personality routine need deal only with C++ exceptions (and possibly disguising foreign exceptions), but it does not need to know anything specific about unwinding done on behalf of longjmp or pthreads cancellation.

The Unwind Process

The standard ABI exception handling / unwind process begins with the raising of an exception, in one of the forms mentioned above. This call specifies an exception object and an exception class.

The runtime framework then starts a two-phase process:

  • In the search phase, the framework repeatedly calls the personality routine, with the _UA_SEARCH_PHASE flag as described below, first for the current PC and register state, and then unwinding a frame to a new PC at each step, until the personality routine reports either success (a handler found in the queried frame) or failure (no handler) in all frames. It does not actually restore the unwound state, and the personality routine must access the state through the API.

  • If the search phase reports failure, e.g. because no handler was found, it will call terminate() rather than commence phase 2.

    If the search phase reports success, the framework restarts in the cleanup phase. Again, it repeatedly calls the personality routine, with the _UA_CLEANUP_PHASE flag as described below, first for the current PC and register state, and then unwinding a frame to a new PC at each step, until it gets to the frame with an identified handler. At that point, it restores the register state, and control is transferred to the user landing pad code.

Each of these two phases uses both the unwind library and the personality routines, since the validity of a given handler and the mechanism for transferring control to it are language-dependent, but the method of locating and restoring previous stack frames is language independent.

A two-phase exception-handling model is not strictly necessary to implement C++ language semantics, but it does provide some benefits. For example, the first phase allows an exception-handling mechanism to dismiss an exception before stack unwinding begins, which allows resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised). While C++ does not support resumptive exception handling, other languages do, and the two-phase model allows C++ to coexist with those languages on the stack.

Note that even with a two-phase model, we may execute each of the two phases more than once for a single exception, as if the exception was being thrown more than once. For instance, since it is not possible to determine if a given catch clause will rethrow or not without executing it, the exception propagation effectively stops at each catch clause, and if it needs to restart, restarts at phase 1. This process is not needed for destructors (cleanup code), so the phase 1 can safely process all destructor-only frames at once and stop at the next enclosing catch clause.

For example, if the first two frames unwound contain only cleanup code, and the third frame contains a C++ catch clause, the personality routine in phase 1 does not indicate that it found a handler for the first two frames. It must do so for the third frame, because it is unknown how the exception will propagate out of this third frame, e.g. by rethrowing the exception or throwing a new one in C++.
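
The two-phase walk described above can be modelled in a few lines of C. This is only a toy simulation under stated assumptions: struct frame_model and the three functions are hypothetical names invented for this sketch, and no real unwind tables or personality routines are involved. Frame 0 is the frame closest to the throw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one stack frame; not part of the ABI. */
struct frame_model {
    int has_cleanup;  /* frame owns destructors that must run */
    int has_handler;  /* frame has a catch clause for the exception */
    int cleaned;      /* set by phase 2 once the cleanup has run */
};

/* Phase 1: search outward for the first frame that reports a handler.
 * Nothing is modified. Returns the frame index, or -1 if none is found. */
int search_phase(const struct frame_model *frames, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (frames[i].has_handler)
            return (int)i;
    }
    return -1;
}

/* Phase 2: run cleanups in every frame below the handler frame.
 * Returns the number of frames that were cleaned up. */
int cleanup_phase(struct frame_model *frames, int handler_index)
{
    int cleaned = 0;
    for (int i = 0; i < handler_index; i++) {
        if (frames[i].has_cleanup) {
            frames[i].cleaned = 1;
            cleaned++;
        }
    }
    return cleaned;
}

/* Returns the handler frame index, or -1 when the search fails
 * (the point at which a C++ runtime would call terminate()). */
int simulate_throw(struct frame_model *frames, size_t count, int *cleaned_out)
{
    assert(frames != NULL && cleaned_out != NULL);
    *cleaned_out = 0;
    int handler = search_phase(frames, count);
    if (handler < 0)
        return -1;  /* phase 2 never starts, the stack is untouched */
    *cleaned_out = cleanup_phase(frames, handler);
    return handler;
}
```

With two cleanup-only frames followed by a frame with a catch clause, simulate_throw returns 2 and reports two cleanups, matching the example in the text. With no handler at all, it returns -1 and no cleanup runs.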

The API specified by the Itanium psABI for implementing this framework is described in the following sections.

1.2 Data Structures

Reason Codes

The unwind interface uses reason codes in several contexts to identify the reasons for failures or other actions, defined as follows:

    typedef enum {
        _URC_NO_REASON = 0,
        _URC_FOREIGN_EXCEPTION_CAUGHT = 1,
        _URC_FATAL_PHASE2_ERROR = 2,
        _URC_FATAL_PHASE1_ERROR = 3,
        _URC_NORMAL_STOP = 4,
        _URC_END_OF_STACK = 5,
        _URC_HANDLER_FOUND = 6,
        _URC_INSTALL_CONTEXT = 7,
        _URC_CONTINUE_UNWIND = 8
    } _Unwind_Reason_Code;

The interpretations of these codes are described below.

Exception Header

The unwind interface uses a pointer to an exception header object as its representation of an exception being thrown. In general, the full representation of an exception object is language- and implementation-specific, but it will be prefixed by a header understood by the unwind interface, defined as follows:

    typedef void (*_Unwind_Exception_Cleanup_Fn)
            (_Unwind_Reason_Code reason,
             struct _Unwind_Exception *exc);

    struct _Unwind_Exception {
        uint64                       exception_class;
        _Unwind_Exception_Cleanup_Fn exception_cleanup;
        uint64                       private_1;
        uint64                       private_2;
    };

An _Unwind_Exception object must be double-word aligned. The first two fields are set by user code prior to raising the exception, and the latter two should never be touched except by the runtime.

The exception_class field is a language- and implementation-specific identifier of the kind of exception. It allows a personality routine to distinguish between native and foreign exceptions, for example. By convention, the high 4 bytes indicate the vendor (for instance HP\0\0), and the low 4 bytes indicate the language. For the C++ ABI described in this document, the low four bytes are C++\0.
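
To make the byte layout concrete, here is a small C sketch that packs a vendor tag and a language tag into one 8-byte class value. The helper names make_exception_class and is_cxx_class are invented for this illustration and are not part of the unwind interface.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 4-byte tags into the 8-byte exception class: vendor in the
 * high 4 bytes, language in the low 4 bytes. Exactly 4 bytes are read
 * from each argument, so embedded NUL bytes are allowed. */
uint64_t make_exception_class(const char vendor[4], const char language[4])
{
    assert(vendor != 0 && language != 0);
    uint64_t cls = 0;
    for (int i = 0; i < 4; i++)
        cls = (cls << 8) | (uint8_t)vendor[i];
    for (int i = 0; i < 4; i++)
        cls = (cls << 8) | (uint8_t)language[i];
    return cls;
}

/* True when the low 4 bytes are 'C', '+', '+', '\0'. */
int is_cxx_class(uint64_t cls)
{
    return (uint32_t)(cls & 0xFFFFFFFFu) == 0x432B2B00u;
}
```

For the vendor HP\0\0 and the language C++\0, make_exception_class yields 0x48500000432B2B00. A personality routine could use a check like is_cxx_class to decide whether an exception uses the common C++ ABI or is foreign.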

The exception_cleanup routine is called whenever an exception object needs to be destroyed by a different runtime than the runtime which created the exception object, for instance if a Java exception is caught by a C++ catch handler. In such a case, a reason code (see above) indicates why the exception object needs to be deleted:

  • _URC_FOREIGN_EXCEPTION_CAUGHT = 1: This indicates that a different runtime caught this exception. Nested foreign exceptions, or rethrowing a foreign exception, result in undefined behaviour.

  • _URC_FATAL_PHASE1_ERROR = 3: The personality routine encountered an error during phase 1, other than the specific error codes defined.

  • _URC_FATAL_PHASE2_ERROR = 2: The personality routine encountered an error during phase 2, for instance a stack corruption.

    NOTE: Normally, all errors should be reported during phase 1 by returning from _Unwind_RaiseException. However, landing pad code could cause stack corruption between phase 1 and phase 2. For a C++ exception, the runtime should call terminate() in that case.

The private unwinder state (private_1 and private_2) in an exception object should be neither read by nor written to by personality routines or other parts of the language-specific runtime. It is used by the specific implementation of the unwinder on the host to store internal information, for instance to remember the final handler frame between unwinding phases.

In addition to the above information, a typical runtime such as the C++ runtime will add language-specific information used to process the exception. This is expected to be a contiguous area of memory after the _Unwind_Exception object, but this is not required as long as the matching personality routines know how to deal with it, and the exception_cleanup routine de-allocates it properly.

Unwind Context

The _Unwind_Context type is an opaque type used to refer to a system-specific data structure used by the system unwinder. This context is created and destroyed by the system, and passed to the personality routine during unwinding.

    struct _Unwind_Context 

1.3 Throwing an Exception

_Unwind_RaiseException
    _Unwind_Reason_Code _Unwind_RaiseException
            (struct _Unwind_Exception *exception_object);

Raise an exception, passing along the given exception object, which should have its exception_class and exception_cleanup fields set. The exception object has been allocated by the language-specific runtime, and has a language-specific format, except that it must contain an _Unwind_Exception struct (see Exception Header above). _Unwind_RaiseException does not return, unless an error condition is found (such as no handler for the exception, bad stack format, etc.). In such a case, an _Unwind_Reason_Code value is returned. Possibilities are:

  • _URC_END_OF_STACK: The unwinder encountered the end of the stack during phase 1, without finding a handler. The unwind runtime will not have modified the stack. The C++ runtime will normally call uncaught_exception() in this case.

  • _URC_FATAL_PHASE1_ERROR: The unwinder encountered an unexpected error during phase 1, e.g. stack corruption. The unwind runtime will not have modified the stack. The C++ runtime will normally call terminate() in this case.

If the unwinder encounters an unexpected error during phase 2, it should return _URC_FATAL_PHASE2_ERROR to its caller. In C++, this will usually be __cxa_throw, which will call terminate().

NOTE: The unwind runtime will likely have modified the stack (e.g. popped frames from it) or register context, or landing pad code may have corrupted them. As a result, the caller of _Unwind_RaiseException can make no assumptions about the state of its stack or registers.

_Unwind_ForcedUnwind
    typedef _Unwind_Reason_Code (*_Unwind_Stop_Fn)
            (int version,
             _Unwind_Action actions,
             uint64 exceptionClass,
             struct _Unwind_Exception *exceptionObject,
             struct _Unwind_Context *context,
             void *stop_parameter);

    _Unwind_Reason_Code _Unwind_ForcedUnwind
            (struct _Unwind_Exception *exception_object,
             _Unwind_Stop_Fn stop,
             void *stop_parameter);

Raise an exception for forced unwinding, passing along the given exception object, which should have its exception_class and exception_cleanup fields set. The exception object has been allocated by the language-specific runtime, and has a language-specific format, except that it must contain an _Unwind_Exception struct (see Exception Header above).

Forced unwinding is a single-phase process (phase 2 of the normal exception-handling process). The stop and stop_parameter parameters control the termination of the unwind process, instead of the usual personality routine query. The stop function parameter is called for each unwind frame, with the parameters described for the usual personality routine below, plus an additional stop_parameter.

When the stop function identifies the destination frame, it transfers control (according to its own, unspecified, conventions) to the user code as appropriate without returning, normally after calling _Unwind_DeleteException. If not, it should return an _Unwind_Reason_Code value as follows:

  • _URC_NO_REASON: This is not the destination frame. The unwind runtime will call the frame's personality routine with the _UA_FORCE_UNWIND and _UA_CLEANUP_PHASE flags set in actions, and then unwind to the next frame and call the stop function again.

  • _URC_END_OF_STACK: In order to allow _Unwind_ForcedUnwind to perform special processing when it reaches the end of the stack, the unwind runtime will call it after the last frame is rejected, with a NULL stack pointer in the context, and the stop function must catch this condition (i.e. by noticing the NULL stack pointer). It may return this reason code if it cannot handle end-of-stack.

  • _URC_FATAL_PHASE2_ERROR: The stop function may return this code for other fatal conditions, e.g. stack corruption.
If the stop function returns any reason code other than _URC_NO_REASON, the stack state is indeterminate from the point of view of the caller of _Unwind_ForcedUnwind. Rather than attempt to return, therefore, the unwind library should return _URC_FATAL_PHASE2_ERROR to its caller.

NOTE: Example: longjmp_unwind()

The expected implementation of longjmp_unwind() is as follows. The setjmp() routine will have saved the state to be restored in its customary place, including the frame pointer. The longjmp_unwind() routine will call _Unwind_ForcedUnwind with a stop function that compares the frame pointer in the context record with the saved frame pointer. If equal, it will restore the setjmp() state as customary, and otherwise it will return _URC_NO_REASON or _URC_END_OF_STACK.
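
The decision logic of such a stop function might look like the following C sketch. Everything here is a stand-in: struct context_model replaces the opaque _Unwind_Context, the saved frame pointer is assumed to arrive through stop_parameter, and STOP_TARGET_REACHED is a hypothetical marker for the point where a real stop function would restore the setjmp() state and never return.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; only the first two values come from the reason codes. */
enum stop_result {
    STOP_NO_REASON = 0,        /* same value as _URC_NO_REASON */
    STOP_END_OF_STACK = 5,     /* same value as _URC_END_OF_STACK */
    STOP_TARGET_REACHED = 100  /* hypothetical: real code would not return */
};

/* Hypothetical view of what the stop function reads from the context. */
struct context_model {
    const void *stack_pointer;  /* NULL once the last frame was rejected */
    const void *frame_pointer;  /* frame pointer of the frame being visited */
};

/* stop_parameter is assumed to carry the frame pointer saved by setjmp(). */
enum stop_result longjmp_stop_sketch(const struct context_model *context,
                                     void *stop_parameter)
{
    assert(context != NULL);
    if (context->stack_pointer == NULL)
        return STOP_END_OF_STACK;    /* ran off the end of the stack */
    if (context->frame_pointer == stop_parameter)
        return STOP_TARGET_REACHED;  /* destination frame found */
    return STOP_NO_REASON;           /* keep unwinding */
}
```

The end-of-stack test comes first so that the NULL stack pointer case is never mistaken for a frame match.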

NOTE: If a future requirement for two-phase forced unwinding were identified, an alternate routine could be defined to request it, and an actions parameter flag defined to support it.

_Unwind_Resume
    void _Unwind_Resume (struct _Unwind_Exception *exception_object); 

Resume propagation of an existing exception e.g. after executing cleanup code in a partially unwound stack. A call to this routine is inserted at the end of a landing pad that performed cleanup, but did not resume normal execution. It causes unwinding to proceed further.

NOTE 1: _Unwind_Resume should not be used to implement rethrowing. To the unwinding runtime, the catch code that rethrows was a handler, and the previous unwinding session was terminated before entering it. Rethrowing is implemented by calling _Unwind_RaiseException again with the same exception object.

NOTE 2: This is the only routine in the unwind library which is expected to be called directly by generated code: it will be called at the end of a landing pad in a "landing-pad" model.

1.4 Exception Object Management

_Unwind_DeleteException
    void _Unwind_DeleteException
            (struct _Unwind_Exception *exception_object);

Deletes the given exception object. If a given runtime resumes normal execution after catching a foreign exception, it will not know how to delete that exception. Such an exception will be deleted by calling _Unwind_DeleteException. This is a convenience function that calls the function pointed to by the exception_cleanup field of the exception header.
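
A minimal C sketch of that convenience function follows, assuming the header layout given in the Exception Header section. The types and functions below (exception_header, demo_exception, demo_allocate, demo_cleanup, delete_exception_sketch) are local, hypothetical mirrors written for this example and do not use a real unwind library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { REASON_FOREIGN_EXCEPTION_CAUGHT = 1 };  /* _URC_FOREIGN_EXCEPTION_CAUGHT */

struct exception_header;
typedef void (*cleanup_fn)(int reason, struct exception_header *exc);

/* Local mirror of the _Unwind_Exception layout described earlier. */
struct exception_header {
    uint64_t   exception_class;
    cleanup_fn exception_cleanup;
    uint64_t   private_1;
    uint64_t   private_2;
};

/* Hypothetical language-specific exception: header first, payload after. */
struct demo_exception {
    struct exception_header header;
    int payload;
};

int demo_cleanup_calls = 0;  /* counts cleanup invocations, for illustration */

/* The owning runtime knows the full layout, so it can free the object. */
void demo_cleanup(int reason, struct exception_header *exc)
{
    (void)reason;
    demo_cleanup_calls++;
    /* header is the first member, so both pointers share one address */
    free((struct demo_exception *)exc);
}

struct exception_header *demo_allocate(int payload)
{
    struct demo_exception *e = calloc(1, sizeof *e);
    if (e == NULL)
        return NULL;
    e->header.exception_cleanup = demo_cleanup;
    e->payload = payload;
    return &e->header;
}

/* Sketch of the convenience function: a foreign runtime only sees the
 * header and delegates destruction to the owner's cleanup routine. */
void delete_exception_sketch(struct exception_header *exc)
{
    assert(exc != NULL);
    if (exc->exception_cleanup != NULL)
        exc->exception_cleanup(REASON_FOREIGN_EXCEPTION_CAUGHT, exc);
}
```

The caller never needs to know the size or layout of the full exception object, which is why a runtime that catches a foreign exception can still dispose of it safely.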

1.5 Context Management

These functions are used for communicating information about the unwind context (i.e. the unwind descriptors and the user register state) between the unwind library and the personality routine and landing pad. They include routines to read or set the context record images of registers in the stack frame corresponding to a given unwind context, and to identify the location of the current unwind descriptors and unwind frame.

_Unwind_GetGR
    uint64 _Unwind_GetGR
            (struct _Unwind_Context *context, int index);

This function returns the 64-bit value of the given general register. The register is identified by its index: 0 to 31 are for the fixed registers, and 32 to 127 are for the stacked registers.

During the two phases of unwinding, only GR1 has a guaranteed value, which is the Global Pointer (GP) of the frame referenced by the unwind context. If the register has its NAT bit set, the behaviour is unspecified.

_Unwind_SetGR
    void _Unwind_SetGR
            (struct _Unwind_Context *context,
             int index,
             uint64 new_value);

This function sets the 64-bit value of the given register, identified by its index as for _Unwind_GetGR. The NAT bit of the given register is reset.

The behaviour is guaranteed only if the function is called during phase 2 of unwinding, and applied to an unwind context representing a handler frame, for which the personality routine will return _URC_INSTALL_CONTEXT. In that case, only registers GR15, GR16, GR17, GR18 should be used. These scratch registers are reserved for passing arguments between the personality routine and the landing pads.

_Unwind_GetIP
    uint64 _Unwind_GetIP
            (struct _Unwind_Context *context);

This function returns the 64-bit value of the instruction pointer (IP).

During unwinding, the value is guaranteed to be the address of the bundle immediately following the call site in the function identified by the unwind context. This value may be outside of the procedure fragment for a function call that is known to not return (such as _Unwind_Resume).

_Unwind_SetIP
    void _Unwind_SetIP
            (struct _Unwind_Context *context,
             uint64 new_value);

This function sets the value of the instruction pointer (IP) for the routine identified by the unwind context.

The behaviour is guaranteed only when this function is called for an unwind context representing a handler frame, for which the personality routine will return _URC_INSTALL_CONTEXT. In this case, control will be transferred to the given address, which should be the address of a landing pad.

_Unwind_GetLanguageSpecificData
    uint64 _Unwind_GetLanguageSpecificData
            (struct _Unwind_Context *context);

This routine returns the address of the language-specific data area for the current stack frame.

NOTE: This routine is not strictly required: the data could be accessed through _Unwind_GetIP using the documented format of the UnwindInfoBlock, but since this work has been done for finding the personality routine in the first place, it makes sense to cache the result in the context. We could also pass it as an argument to the personality routine.

_Unwind_GetRegionStart
    uint64 _Unwind_GetRegionStart
            (struct _Unwind_Context *context);

This routine returns the address of the beginning of the procedure or code fragment described by the current unwind descriptor block.

This information is required to access any data stored relative to the beginning of the procedure fragment. For instance, a call site table might be stored relative to the beginning of the procedure fragment that contains the calls. During unwinding, the function returns the start of the procedure fragment containing the call site in the current stack frame.

1.6 Personality Routine

    _Unwind_Reason_Code (*__personality_routine)
            (int version,
             _Unwind_Action actions,
             uint64 exceptionClass,
             struct _Unwind_Exception *exceptionObject,
             struct _Unwind_Context *context);

The personality routine is the function in the C++ (or other language) runtime library which serves as an interface between the system unwind library and language-specific exception handling semantics. It is specific to the code fragment described by an unwind info block, and it is always referenced via the pointer in the unwind info block, and hence it has no psABI-specified name.

1.6.1 Parameters

The personality routine parameters are as follows:

version
Version number of the unwinding runtime, used to detect a mis-match between the unwinder conventions and the personality routine, or to provide backward compatibility. For the conventions described in this document, version will be 1.

actions
Indicates what processing the personality routine is expected to perform, as a bit mask. The possible actions are described below.

exceptionClass
An 8-byte identifier specifying the type of the thrown exception. By convention, the high 4 bytes indicate the vendor (for instance HP\0\0), and the low 4 bytes indicate the language. For the C++ ABI described in this document, the low four bytes are C++\0.

NOTE: This is not a null-terminated string. Some implementations may use no null bytes.

exceptionObject
The pointer to a memory location recording the necessary information for processing the exception according to the semantics of a given language (see the Exception Header section above).

context
Unwinder state information for use by the personality routine. This is an opaque handle used by the personality routine in particular to access the frame's registers (see the Unwind Context section above).

return value
The return value from the personality routine indicates how further unwind should happen, as well as possible error conditions. See the following section.

1.6.2 Personality Routine Actions

The actions argument to the personality routine is a bitwise OR of one or more of the following constants:

    typedef int _Unwind_Action;
    static const _Unwind_Action _UA_SEARCH_PHASE = 1;
    static const _Unwind_Action _UA_CLEANUP_PHASE = 2;
    static const _Unwind_Action _UA_HANDLER_FRAME = 4;
    static const _Unwind_Action _UA_FORCE_UNWIND = 8;

_UA_SEARCH_PHASE
Indicates that the personality routine should check if the current frame contains a handler, and if so return _URC_HANDLER_FOUND, or otherwise return _URC_CONTINUE_UNWIND. _UA_SEARCH_PHASE cannot be set at the same time as _UA_CLEANUP_PHASE.

_UA_CLEANUP_PHASE
Indicates that the personality routine should perform cleanup for the current frame. The personality routine can perform this cleanup itself, by calling nested procedures, and return _URC_CONTINUE_UNWIND. Alternatively, it can setup the registers (including the IP) for transferring control to a "landing pad", and return _URC_INSTALL_CONTEXT.

_UA_HANDLER_FRAME
During phase 2, indicates to the personality routine that the current frame is the one which was flagged as the handler frame during phase 1. The personality routine is not allowed to change its mind between phase 1 and phase 2, i.e. it must handle the exception in this frame in phase 2.

_UA_FORCE_UNWIND
During phase 2, indicates that no language is allowed to "catch" the exception. This flag is set while unwinding the stack for longjmp or during thread cancellation. User-defined code in a catch clause may still be executed, but the catch clause must resume unwinding with a call to _Unwind_Resume when finished.
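
The way a personality routine might dispatch on these flags can be sketched in C as below. This is a simplified model under assumptions, not the C++ personality routine: the constants are local mirrors of the values listed above, and the two boolean parameters stand in for whatever the routine would learn by parsing the frame's language-specific data.

```c
#include <assert.h>

/* Local mirrors of the action flags and reason codes listed above. */
enum {
    UA_SEARCH_PHASE = 1,
    UA_CLEANUP_PHASE = 2,
    UA_HANDLER_FRAME = 4,
    UA_FORCE_UNWIND = 8
};
enum {
    URC_HANDLER_FOUND = 6,
    URC_INSTALL_CONTEXT = 7,
    URC_CONTINUE_UNWIND = 8
};

/* frame_has_handler and frame_has_cleanup are hypothetical inputs that a
 * real routine would derive from the frame's language-specific data. */
int personality_sketch(int actions, int frame_has_handler, int frame_has_cleanup)
{
    /* The two phase flags are mutually exclusive. */
    assert(!((actions & UA_SEARCH_PHASE) && (actions & UA_CLEANUP_PHASE)));

    if (actions & UA_SEARCH_PHASE)
        return frame_has_handler ? URC_HANDLER_FOUND : URC_CONTINUE_UNWIND;

    assert(actions & UA_CLEANUP_PHASE);

    /* The frame flagged in phase 1 must handle the exception in phase 2,
     * unless the unwind is forced, in which case nothing may catch it. */
    if ((actions & UA_HANDLER_FRAME) && !(actions & UA_FORCE_UNWIND))
        return URC_INSTALL_CONTEXT;

    /* Otherwise enter a cleanup landing pad if there is one. */
    if (frame_has_cleanup)
        return URC_INSTALL_CONTEXT;

    return URC_CONTINUE_UNWIND;
}
```

Before returning URC_INSTALL_CONTEXT, a real routine would also set up the registers and the IP for the landing pad, which this sketch leaves out.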

1.6.3 Transferring Control to a Landing Pad

If the personality routine determines that it should transfer control to a landing pad (in phase 2), it may set up registers (including IP) with suitable values for entering the landing pad (e.g. with landing pad parameters), by calling the context management routines above. It then returns _URC_INSTALL_CONTEXT.

Prior to executing code in the landing pad, the unwind library restores registers not altered by the personality routine, using the context record, to their state in that frame before the call that threw the exception, as follows. All registers specified as callee-saved by the base ABI are restored, as well as scratch registers GR15, GR16, GR17 and GR18 (see below). Except for those exceptions, scratch (or caller-saved) registers are not preserved, and their contents are undefined on transfer. The accessibility of registers in the frame will be restored to that at the point of call, i.e. the same logical registers will be accessible, but their mappings to physical registers may change. Further, the state of stacked registers beyond the current frame is unspecified, i.e. they may be either in physical registers or on the register stack.

The landing pad can either resume normal execution (as, for instance, at the end of a C++ catch), or resume unwinding by calling _Unwind_Resume and passing it the exceptionObject argument received by the personality routine. _Unwind_Resume will never return.

_Unwind_Resume should be called if and only if the personality routine did not return _URC_HANDLER_FOUND during phase 1. As a result, the unwinder can allocate resources (for instance memory) and keep track of them in the exception object reserved words. It should then free these resources before transferring control to the last (handler) landing pad. It does not need to free the resources before entering non-handler landing pads, since _Unwind_Resume will ultimately be called.

The landing pad may receive arguments from the runtime, typically passed in registers set using _Unwind_SetGR by the personality routine. For a landing pad that can call to _Unwind_Resume, one argument must be the exceptionObject pointer, which must be preserved to be passed to _Unwind_Resume.

The landing pad may receive other arguments, for instance a switch value indicating the type of the exception. Four scratch registers are reserved for this use (GR15, GR16, GR17 and GR18.)
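
The register setup for a landing pad can be illustrated with a mock context in C. The sketch below does not call the real _Unwind_SetGR or _Unwind_SetIP: struct landing_context_model and the *_sketch functions are hypothetical stand-ins, and the choice of GR15 for the exception object and GR16 for the switch value is an assumption made only for this example.

```c
#include <assert.h>
#include <stdint.h>

enum { URC_INSTALL_CONTEXT = 7 };
enum { GR_COUNT = 128, FIRST_ARG_GR = 15, LAST_ARG_GR = 18 };

/* Hypothetical stand-in for the opaque _Unwind_Context. */
struct landing_context_model {
    uint64_t gr[GR_COUNT];  /* general register images */
    uint64_t ip;            /* instruction pointer image */
};

/* Mock setter restricted to the four scratch registers reserved for
 * landing pad arguments. Returns 0 on success, -1 for other registers. */
int set_gr_sketch(struct landing_context_model *ctx, int index, uint64_t value)
{
    assert(ctx != 0);
    if (index < FIRST_ARG_GR || index > LAST_ARG_GR)
        return -1;
    ctx->gr[index] = value;
    return 0;
}

void set_ip_sketch(struct landing_context_model *ctx, uint64_t value)
{
    assert(ctx != 0);
    ctx->ip = value;
}

/* What a personality routine might do in phase 2 before asking the
 * unwinder to transfer control. The register assignment is assumed. */
int enter_landing_pad_sketch(struct landing_context_model *ctx,
                             uint64_t exception_object,
                             uint64_t switch_value,
                             uint64_t landing_pad)
{
    set_gr_sketch(ctx, 15, exception_object);  /* needed later for resume */
    set_gr_sketch(ctx, 16, switch_value);      /* selects the catch clause */
    set_ip_sketch(ctx, landing_pad);
    return URC_INSTALL_CONTEXT;
}
```

Passing the exception object pointer in a register is what lets a cleanup landing pad hand it back when it resumes unwinding.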

1.6.4 Rules for Correct Inter-Language Operation

The following rules must be observed for correct operation between languages and/or runtimes from different vendors:

An exception which has an unknown class must not be altered by the personality routine. The semantics of foreign exception processing depend on the language of the stack frame being unwound. This covers in particular how exceptions from a foreign language are mapped to the native language in that frame.

If a runtime resumes normal execution, and the caught exception was created by another runtime, it should call _Unwind_DeleteException. This is true even if it understands the exception object format (such as would be the case between different C++ runtimes).

A runtime is not allowed to catch an exception if the _UA_FORCE_UNWIND flag was passed to the personality routine.

Example: Foreign Exceptions in C++. In C++, foreign exceptions can be caught by a catch(...) statement. They can also be caught as if they were of a __foreign_exception class, defined in <exception>. The __foreign_exception may have subclasses, such as __java_exception and __ada_exception, if the runtime is capable of identifying some of the foreign languages.

The behavior is undefined in the following cases:

  • __foreign_exception catch argument is accessed in any way (including taking its address).

  • __foreign_exception is active at the same time as another exception (either there is a nested exception while catching the foreign exception, or the foreign exception was itself nested).

  • uncaught_exception(), set_terminate(), set_unexpected(), terminate(), or unexpected() is called at a time a foreign exception exists (for example, calling set_terminate() during unwinding of a foreign exception).

All these cases might involve accessing C++ specific content of the thrown exception, for instance to chain active exceptions.

Otherwise, a catch block catching a foreign exception is allowed:

  • to resume normal execution, thereby stopping propagation of the foreign exception and deleting it, or

  • to rethrow the foreign exception. In that case, the original exception object must be unaltered by the C++ runtime.

A catch-all block may be executed during forced unwinding. For instance, a longjmp may execute code in a catch(...) during stack unwinding. However, if this happens, unwinding will proceed at the end of the catch-all block, whether or not there is an explicit rethrow.

Setting the low 4 bytes of exception class to C++\0 is reserved for use by C++ runtimes compatible with the common C++ ABI.
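
As a rough illustration of the reservation above, the following sketch tests whether a 64-bit exception class ends in C++\0. The helper names are hypothetical and are not part of any runtime. The byte order (first character in the most significant byte) is an assumption that matches how common runtimes build the constant, and "GNUCC++\0" is used only as a sample value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack eight characters into a 64-bit exception class.
// Assumption: the first character lands in the most significant byte.
inline std::uint64_t make_exception_class(const char (&s)[9]) {
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = (value << 8) | static_cast<unsigned char>(s[i]);
    }
    return value;
}

// Hypothetical helper: true when the low 4 bytes spell 'C' '+' '+' '\0'.
inline bool is_cxx_abi_exception_class(std::uint64_t exception_class) {
    return (exception_class & 0xFFFFFFFFu) == 0x432B2B00u;
}

// Usage: a C++ class value passes, a differently tagged one does not.
inline void exception_class_example() {
    assert(is_cxx_abi_exception_class(make_exception_class("GNUCC++\0")));
    assert(!is_cxx_abi_exception_class(make_exception_class("GNUCJAVA")));
}
```

A personality routine could use a check of this kind to decide whether the C++ specific header fields may be interpreted at all.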


Level II: C++ ABI

2.1 Introduction

The second level of specification is the minimum required to allow interoperability in the sense described above. This level requires agreement on:

  • Standard runtime initialization, e.g. pre-allocation of space for out-of-memory exceptions.

  • The layout of the exception object created by a throw and processed by a catch clause.

  • When and how the exception object is allocated and destroyed.

  • The API of the personality routine, i.e. the parameters passed to it, the logical actions it performs, and any results it returns (either function results to indicate success, failure, or continue, or changes in global or exception object state), for both the phase 1 handler search and the phase 2 cleanup/unwind.

  • How control is ultimately transferred back to the user program at a catch clause or other resumption point. That is, will the last personality routine transfer control directly to the user code resumption point, or will it return information to the runtime allowing the latter to do so?

  • Multithreading behavior.

2.2 Data Structures

2.2.1 C++ Exception Objects

A complete C++ exception object consists of a header, which is a wrapper around an unwind object header with additional C++ specific information, followed by the thrown C++ exception object itself. The structure of the header is as follows:

       struct __cxa_exception {
           std::type_info *    exceptionType;
           void (*exceptionDestructor) (void *);
           unexpected_handler  unexpectedHandler;
           terminate_handler   terminateHandler;
           __cxa_exception *   nextException;

           int                 handlerCount;
           int                 handlerSwitchValue;
           const char *        actionRecord;
           const char *        languageSpecificData;
           void *              catchTemp;
           void *              adjustedPtr;

           _Unwind_Exception   unwindHeader;
       };

The fields in the exception object are as follows:

  • The exceptionType field encodes the type of the thrown exception. The exceptionDestructor field contains a function pointer to a destructor for the type being thrown, and may be NULL. These pointers must be stored in the exception object since non-polymorphic and built-in types can be thrown.

  • The fields unexpectedHandler and terminateHandler contain pointers to the unexpected and terminate handlers at the point where the exception is thrown. The ISO C++ Final Draft International Standard [lib.unexpected] (18.6.2.4) states that the handlers to be used are those active immediately after evaluating the throw argument. If destructors change the active handlers during unwinding, the new values are not used until unwinding is complete.

  • The nextException field is used to create a linked list of exceptions (per thread).

  • The handlerCount field contains a count of how many handlers have caught this exception object. It is also used to determine exception life-time (see Section ??? [was 11.12]).

  • The handlerSwitchValue, actionRecord, languageSpecificData, catchTemp, and adjustedPtr fields cache information that is best computed during pass 1, but useful during pass 2. By storing this information in the exception object, the cleanup phase can avoid re-examining action records. These fields are reserved for use of the personality routine for the stack frame containing the handler to be invoked.

  • The unwindHeader structure is used to allow correct operation of exceptions in the presence of multiple languages or multiple runtimes for the same language. The _Unwind_Exception type is described in Section 1.2.

By convention, a __cxa_exception pointer points at the C++ object representing the exception being thrown, immediately following the header. The header structure is accessed at a negative offset from the __cxa_exception pointer. This layout allows consistent treatment of exception objects from different languages (or different implementations of the same language), and allows future extensions of the header structure while maintaining binary compatibility.
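
The negative-offset convention can be sketched with a simplified stand-in for the header. MockHeader and the mock_ functions below are hypothetical names invented for illustration. They are not the real __cxa_exception or the runtime's allocation routines, and the header carries only two fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical, simplified header. Over-aligned so that the thrown object
// placed directly after it is suitably aligned for any ordinary type.
struct alignas(std::max_align_t) MockHeader {
    void (*destructor)(void*);  // may be null, as for built-in types
    int handlerCount;
};

// Allocate header and thrown object in one block; return the object address.
inline void* mock_allocate_exception(std::size_t thrown_size) {
    void* raw = std::malloc(sizeof(MockHeader) + thrown_size);
    if (raw == nullptr) std::abort();
    MockHeader* header = new (raw) MockHeader{nullptr, 0};
    return header + 1;  // first byte past the header
}

// Recover the header by stepping back one header from the object pointer.
inline MockHeader* mock_header_of(void* thrown_object) {
    return static_cast<MockHeader*>(thrown_object) - 1;
}

// Release the whole block, starting from the header address.
inline void mock_free_exception(void* thrown_object) {
    std::free(mock_header_of(thrown_object));
}

// Usage: store an int as the "thrown object" and reach the header behind it.
inline void header_layout_example() {
    void* object = mock_allocate_exception(sizeof(int));
    *static_cast<int*>(object) = 42;
    mock_header_of(object)->handlerCount = 1;
    assert(*static_cast<int*>(object) == 42);
    assert(mock_header_of(object)->handlerCount == 1);
    mock_free_exception(object);
}
```

Because user code only ever sees the object pointer, fields can be added at the front of the header without moving anything the thrown object or its handlers depend on.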

Version information is not required, since the general unwind library framework specifies an exception class identifier, which will change should the layout of the exception object change significantly.

2.2.2 Caught Exception Stack

Each thread in a C++ program has access to an object of the following class:

       struct __cxa_eh_globals {
           __cxa_exception *   caughtExceptions;
           unsigned int        uncaughtExceptions;
       };

The fields of this structure are defined as follows:

  • The caughtExceptions field is a list of the active exceptions, organized as a stack with the most recent first, linked through the nextException field of the exception header.

  • The uncaughtExceptions field is a count of uncaught exceptions, for use by the C++ library uncaught_exceptions() routine.

This information is maintained on a per-thread basis. Thus, caughtExceptions is a list of exceptions thrown and caught by the current thread, and uncaughtExceptions is a count of exceptions thrown and not yet caught by the current thread. (This includes rethrown exceptions, which may still have active handlers, but are not considered caught.)

The __cxa_eh_globals for the current thread can be obtained by using either of the APIs:

  • __cxa_eh_globals *__cxa_get_globals(void) : 
    Return a pointer to the __cxa_eh_globals structure for the current thread, initializing it if necessary.

  • __cxa_eh_globals *__cxa_get_globals_fast(void) : 
    Return a pointer to the __cxa_eh_globals structure for the current thread, assuming that at least one prior call to __cxa_get_globals has been made from the current thread.
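
The per-thread bookkeeping described in this section might look roughly like the sketch below. Every name in the sketch namespace is hypothetical, and the structures are trimmed to the fields used. It assumes each exception is caught at most once and ignores rethrow, foreign exceptions, and destruction of the exception object, all of which a real runtime has to handle.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical, trimmed-down stand-ins for the structures described above.
struct Exception {
    Exception* nextException = nullptr;
    int handlerCount = 0;
};

struct EhGlobals {
    Exception* caughtExceptions = nullptr;
    unsigned int uncaughtExceptions = 0;
};

// One instance per thread, created on first use by that thread.
inline EhGlobals* get_globals() {
    thread_local EhGlobals globals;
    return &globals;
}

// An exception has been thrown and not yet caught.
inline void note_thrown() { ++get_globals()->uncaughtExceptions; }

// A handler is entered: push the exception and count it as caught.
inline void note_caught(Exception* e) {
    EhGlobals* g = get_globals();
    e->nextException = g->caughtExceptions;
    g->caughtExceptions = e;
    ++e->handlerCount;
    --g->uncaughtExceptions;
}

// A handler exits: pop the most recent exception once no handler uses it.
inline void note_handler_exit() {
    EhGlobals* g = get_globals();
    Exception* e = g->caughtExceptions;
    if (e != nullptr && --e->handlerCount == 0) {
        g->caughtExceptions = e->nextException;
        e->nextException = nullptr;
    }
}

}  // namespace sketch
```

Catching a second exception while the first handler is still running leaves the newer exception at the head of caughtExceptions, with the older one reachable through nextException.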

2.3 Standard Runtime Initialization

2.4 Throwing an Exception

This section specifies the process by which the C++ generated code and runtime library throw an exception, transferring control to the unwind library for handling.

2.4.1 Overview of Throw Processing

In broad outline, a possible implementation of the processing necessary to throw an exception includes the following steps:

  • Call __cxa_allocate_exception to create an exception object (see Section 2.4.2).

  • Evaluate the thrown expression, and copy it into the buffer returned by __cxa_allocate_exception, possibly using a copy constructor. If evaluation of the thrown expression exits by throwing an exception, that exception will propagate instead of the expression itself. Cleanup code must ensure that __cxa_free_exception is called on the just allocated exception object. (If the copy constructor itself exits by throwing an exception, terminate() is called.)

  • Call __cxa_throw to pass the exception to the runtime library (see Section 2.4.3). __cxa_throw never returns.

Based on this outline, throwing an object X as in:

	throw X; 
will produce code approximating the template:
	// Allocate -- never throws:
	temp1 = __cxa_allocate_exception(sizeof(X));

	// Construct the exception object:
	#if COPY_ELISION
	  [evaluate X into temp1]
	#else
	  [evaluate X into temp2]
	  copy-constructor(temp1, temp2)
	  // Landing Pad L1 if this throws