Sunday, 27 July 2014


Just been playing with edge equations this morning.

Here it's recursively determining the fill area of the triangle, red is no content, green is all fill, blue is partial fill. Dunno how useful it is in this form but it looks nifty.

If the 3 edge equation results for each corner of a tile are turned into bits then the equations for each case are simple bit logic.

        int ec0 = edgeCode(e, x0, y0);
        int ec1 = edgeCode(e, x0 + tsize, y0);
        int ec2 = edgeCode(e, x0, y0 + tsize);
        int ec3 = edgeCode(e, x0 + tsize, y0 + tsize);

        int and = ec0 & ec1 & ec2 & ec3;
        int orr = ec0 | ec1 | ec2 | ec3;

        all_filled = and == 7;
        all_empty = orr != 7;

Rather than rely on floating point compare (aka subtract) which adds further latency to the calculation and thus cannot be directly pipelined, I form form the edgeCode directly using integer arithmetic.

public static int edgeCode(float[] e, float x, float y) {
        float v0 = x * e[0] + y * e[1] + e[2];
        float v1 = x * e[3] + y * e[4] + e[5];
        float v2 = x * e[6] + y * e[7] + e[8];
        int c0 = Float.floatToRawIntBits(v0);
        int c1 = Float.floatToRawIntBits(v1);
        int c2 = Float.floatToRawIntBits(v2);

        return (c0 >>> 31) | ((c1 >>> 31) << 1) | ((c2 >>> 31) << 2);

(>>> is a LSR op in Java).

Since epiphany (and most decent ISAs) share float and int registers the above is going to translate directly into clean machine code. This stuff might need to live on the ARM too and is SIMDable.

Actually there's a bunch of optimisations possible that reduce that instruction count and if using power-of two tile sizes and fixed-point arithmetic everything can be reduced to simple integer addition; but I haven't explored that yet.

Obviously this is working toward toward one important requirement: the renderer will have to tile to take advantage of the LDS, and it also needs pipelineable/simdable algorithms. But that's enough for this weekend, things to do ...

No comments: