Monday, July 13, 2009

compile compile compile

I've been procrastinating a lot about the shader compiler project.

I seem to always find some excuse to turn on the TV, watch some web site, etc. I keep looking for distractions because the compiler thing is really not much fun !

I like computers because they are very linear, logic, unambiguous.. but high level programming languages aren't like that.
Sure, they are all well defined syntactically, but they are also very much a human interface.
What makes an high level language so powerful is also what makes it painful to implement 8)

Currently, I do most of my work around trees.. or rather the AST (Abstract Syntax Tree) that comes out of the parser.

The nice thing about functions in a shader is that they they get inlined. This seems to be the general rule, and it's what I decided to go for.
Inlining can bring to larger code, but simplifies things a lot as one doesn't need to keep a stack with values to save and restore before and after function calls.
I think that RenderMan SL inlines because it specifically does not support recursive functions.. which would require a stack system.

About working with trees, here is a simple example:

c = a + b
becomes

=
 c
 +
  a
  b
..where indentation determines parenting.
So, "c" and "+" are both child of "=". This makes it easy to then convert even complex operations utilizing basic instructions.

The assembly generator would scan the tree, going down to the deepest operator and working its way up:

  ;+ (tmp)
  ; a
  ; b
  add tmp, a, b

  ;=
  ; c
  ; + (tmp)
  mov c, tmp
..two operators, two instructions.

Here is a what happens with a function being called by a shader:

float dude()
{
  float  c = 2;
  return c + 1;
}

surface test5()
{
  Oi = dude();
}
The intermediate tree looks like this:

// "dude" is unused and kept in the tree only to use as a base for
// inlining
float
dude
(
  {
    float
    =
      c
      2
    return
    +
      c
      1

// "test5" is the shader, where all code is really generated
surface
test5
(
  {
    // this is "dude" being inlined
    (
      {
        float
        =
          c
          2
        =    // assignment derived from "return"
          Oi // destination for the return value
          +
            c
            1
"dude" gets inlined, and
return c + 1
becomes
Oi = c + 1

..and the resulting pseudo-asm is..


surface test5
  mov $s0 2     // c = 2
  add $s1 $s0 1 // tmp = c + 1
  mov Oi $s1    // Oi = tmp
  ret

..all but global variables (Oi, in RSL is a global and an output variable), become registers.
I prefix registers with $ and "s" in the name stands for scalar, as opposed to vector or matrix registers.

Still, there are many complications. Up-conversion is one thing that bugs me. That is, how a scalar can become a vector:

vector a = 1;
..this is very practical, but one more complication. I somewhat implemented this, but there is going to be more about it, I'm sure.

Another issue is the built in functions. For example the function "distance":

float a = distance( c, d );
For these I started building some definitions that get automatically included. So, "distance" gets its own defintion.

float distance( vector c, vector d )
{
 float tmp;
 __distance( tmp, c, d );
 return tmp;
}
..here the function "distance" explicitly calls the intrinsic "__distance". The intrinsic has a 1:1 correspondence to the asm/VM opcode and it should be easy to produce asm from that.

So..

__distance( tmp, c, d )
..should become..

__distance $s0, $v0, $v1

Intrinsics are not in yet, and they'll be next, once I'm done with function calls.. almost there !

Then conditionals should be interesting. I've already implemented some conditionals.. but things get complicated with shaders, because of the concept of varying and uniform types.. another topic for the future !

zzzzzzzzzzzzz