A blog about programming topics, mostly JVM, Linux kernel, and x86 related things.

Friday, November 9, 2007

Design of the LIR

So here's a diagram what the low-level intermediate representation (LIR) looks like:

The compiler only works on complete methods so you can think of struct compilation_unit as basically the same as a method. Instruction selector translates statement lists and expression trees into low-level struct insns that resemble instructions of the target machine architecture. Each instruction has zero or more struct operands that can refer to zero or more machine registers (think memory base + index addressing mode).

To glue the code emission phase, which works on struct insns, and the register allocator, I have added a struct register_info which is embedded in struct operand. A struct register_info knows its parent struct live_interval which is the basic unit of register allocation. When an interval is split, the changed struct register_infos are updated accordingly after which one struct var_info can effectively reside in multiple machine registers if spilling is needed during register allocation.

If all goes well, I probably can move on to interval splitting next week...

Friday, November 2, 2007

Data Structures for Live Interval Splitting

I am still trying to figure out how to do live interval splitting. While the actual splitting algorithm seems easy enough, I haven't yet figured out how to arrange the data structures (struct insn, struct var_info, and struct live_range). Right now, each struct insn has a one or more pointers to struct var_info which has a member ->reg that represents the machine register. This doesn't work with live interval splitting because even if two or more instructions operate on the same temporary, they may operate on different machine registers. That's because splitting requires register spilling and the same register is not necessarily available for both intervals (because of, for example, fixed register intervals).

One option is to move the ->reg member to struct operand (we need to have two actually, because of base + index addressing mode) and traverse all instructions after register allocation to figure out the machine register. Seems bit ugly though.