Friday, 17 October 2014

A sprung spring

Obviously pruned at the right time of year this time. Also got rid of most of the black spot, but not all.

One way ...

The other way ...

The biggest of the lot (bit past its prime, and doesn't get enough sun, but wow getta load of the size of that).

And that's only half of them. About half of the total are scented and 'smelling roses'. The blurry purple one at the back of the first shot has a rich musk scent and there's a couple more of those.

I can't take credit for the choice of cultivars as they came with the house when I bought it, but one tries to keep them in shape and so far this year they're doing rather well. I did kill one last year but i think it was wild stock and not much to worry about.

Wednesday, 15 October 2014

SFA

Have been taking a break from hacking lately, so not much to talk about around here.

Well partly that is because I started back at work a month ago. Not really getting into that either and it's been pretty tiring. I don't feel like i'm getting anywhere but looking back I delivered a good pile of stuff in a very short time so my frame of reference is just out.

Also did a bit of reading, played some games, threw a grand final party - although the game was a bit of a washout, cleaned the house for that, did some gardening and pruned the roses (which are really fantastic right now with big bunches of hand-sized flowers). Nearly bought a PS4 for driveclub during Bathurst last Sunday but BigW were out of the bundle I was after - I guess I will now wait for the next reasonable deal because i'm still after it. If it was something i worked on I would consider the game+engine a masterpiece of engineering although obviously not the broken online part - I would really be interested to know how they fucked it up and whether it's a technology issue, but I guess we'll never know. Not being fully functional after a week points to a pretty large architectural or technology problem and not just a typo somewhere though. I guess writing a twitter+facebook+realtime racing thing isn't all that easy after all.

Been doing a few semi-interesting things with JavaFX for work but not really enough to blog about so far.

'later.

Wednesday, 10 September 2014

Little gpu bits

I've mostly been taking it easy - i'm not going to be on leave forever (unfortunately) - but i've tried a couple of little things on the gpu code.

First I tried creating a tile-based implementation for the ARM/host version but this runs about 1/2 the speed of the line-oriented one. Not that I really optimised it but that's a lot to make up and i don't see the point; it's a convenient test-bed for experimenting though.

Then I tried creating tile-accurate indexing rather than using the bounding box. This improves the output a small amount on the purely arm version but takes a hit on the epiphany backend since the hit to the arm-side code exceeds the gains on the epiphany-side. It will depend on the workload and it might be worth it for larger triangles. Then again maybe the index isn't helping as much as I thought.

I also started (re)reading about some lighting stuff but didn't get very far.

Feeling pretty lazy today too.

Update: But not too lazy to poke a bit more it seems.

I made a "slight improvement" to the ARM based tile renderer and now it's a bit faster (10%) than the line-based one with a specific test-case. Being lazy the first time I was just processing the tile row by row rather than performing the rasteriser pass across the whole tile first and then processing the fragments afterwards. This just helps the compiler keep more setup data in registers for each loop and is closer to how i'm doing it on the epiphany.
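A rough sketch of that two-pass arrangement, assuming an 8x8 tile and three edge equations stepped incrementally. The names `tile_edges`, `rasterise_tile` and `shade_fragments` are made up for illustration and the flat-colour fill is only a placeholder for the real fragment processing; it is not the actual renderer code.

```c
#include <stddef.h>

#define TILE_W 8
#define TILE_H 8

/* Hypothetical edge set: values at the tile's top-left pixel plus per-pixel steps. */
typedef struct {
    float v[3];   /* edge equation values at (0,0) of the tile */
    float dx[3];  /* change per step in x */
    float dy[3];  /* change per step in y */
} tile_edges;

/* Pass 1: rasterise the whole tile, recording the index of each covered pixel.
 * frags must have room for TILE_W * TILE_H entries. Returns the fragment count. */
static size_t rasterise_tile(const tile_edges *e, unsigned char *frags)
{
    size_t n = 0;
    for (int y = 0; y < TILE_H; y++) {
        float v0 = e->v[0] + (float)y * e->dy[0];
        float v1 = e->v[1] + (float)y * e->dy[1];
        float v2 = e->v[2] + (float)y * e->dy[2];
        for (int x = 0; x < TILE_W; x++) {
            if ((v0 >= 0) & (v1 >= 0) & (v2 >= 0))
                frags[n++] = (unsigned char)(y * TILE_W + x);
            v0 += e->dx[0];
            v1 += e->dx[1];
            v2 += e->dx[2];
        }
    }
    return n;
}

/* Pass 2: process the collected fragments (flat colour here, as a placeholder). */
static void shade_fragments(const unsigned char *frags, size_t n,
                            unsigned int colour, unsigned int *tile)
{
    for (size_t i = 0; i < n; i++)
        tile[frags[i]] = colour;
}
```

Each loop only needs its own setup data live at once, which is the register-pressure point made above.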

Update: Haven't been able to get into it this last week. I think hayfever season is starting and even before the symptoms hit it just seems to wreck my sleep more than normal. Been really tired/lethargic and not really feeling like doing anything - it just feels like all i'm doing each day is hanging around waiting to escape from it into the unconsciousness of sleep again. Today I even feel like i'm "coming down with something" although i'm pretty sure i'm not and it's just some hayfever related nonsense. I've done a little gardening at least - preparing some garden beds, putting in a few seeds, and rejuvenating some pots.

But as a bit of a puzzle a few days ago I tried to see if i could get the rasteriser loop any faster. I think I can get the inner loop down to 8 cycles with some unrolling, double load/stores and some constant preloads. The previous best was 10 cycles but i'm not sure this new version is practical.

This came out of playing with the idea of breaking the work up into squares (4x4 or 8x8) rather than rows. This has overheads due to performing the edge tests multiple times outside of each pixel test but also reduces the overheads of calculating over the bounding box. But it's one of those things I need a solid afternoon to try out by coding it up.

These tile tests also allow one to determine full coverage outside of the loop - which removes the need for the edge testing calculations at all. So I tried to see if that could save anything in the inner loop; but so far the latency from the z buffer testing has prevented any gains being made. Even assuming I could pipeline that away I think I can only save 1 cycle.
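As a sketch of the sort of tile test meant here: since each edge equation is linear, its minimum and maximum over a square tile occur at the corners, so coverage can be classified before any per-pixel work. The function name, the `coverage_t` enum and the parameter layout are assumptions for illustration only.

```c
typedef enum { TILE_EMPTY, TILE_PARTIAL, TILE_FULL } coverage_t;

/* v[i]  : edge equation i evaluated at the tile's top-left pixel
 * dx/dy : per-pixel steps of each edge equation
 * size  : tile side in pixels (e.g. 4 or 8)
 * TILE_PARTIAL is conservative: a tile can be classed partial and still
 * turn out to contain no covered pixels. */
static coverage_t classify_tile(const float v[3], const float dx[3],
                                const float dy[3], int size)
{
    float span = (float)(size - 1);
    int full = 1;

    for (int i = 0; i < 3; i++) {
        float ex = dx[i] * span;
        float ey = dy[i] * span;
        /* smallest and largest value of this edge over the four corners */
        float lo = v[i] + (ex < 0 ? ex : 0) + (ey < 0 ? ey : 0);
        float hi = v[i] + (ex > 0 ? ex : 0) + (ey > 0 ? ey : 0);

        if (hi < 0)
            return TILE_EMPTY;   /* whole tile is outside this edge */
        if (lo < 0)
            full = 0;            /* this edge crosses the tile */
    }
    return full ? TILE_FULL : TILE_PARTIAL;
}
```

For a `TILE_FULL` result the inner loop can skip the edge tests entirely and only needs the z-buffer test, which is where the latency issue mentioned above comes in.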

I also toyed with creating an integer rasteriser that stores the framebuffer internally using bytes. For a flat shaded/z-buffered/non-blended triangle I think I can get that down to 7 cycles per pixel (and that's rendered, not just converted to fragments). Is that even useful? Who knows. But to test that idea out I need to work on a new design which will take another solid afternoon as well.

Boycott nvidia, cuda?

nvidia has taken the nuclear option to sue every other gpu maker in existence (apart from ati/amd with which they already have cross-licensing agreements i guess).

Patent trolling is usually the last gasp of a failing business. Which implies that despite (or because of) their overpriced hardware they are failing as a manufacturer; GPUs are now commodity items and the margins no longer exist to run their type of high-margin business.

Patents are a cruel abomination which distort the workings of a "free market"; they directly codify rentier behaviour which costs society both economic and technological progress. The only beneficiaries are the unproductive leeches of society at the cost of everyone else.

If you're an engineer or scientist who is currently using or considering cuda for your work I suggest you reconsider both to protest this failure of a strategy and to protect the future value of your work.

Just for nvidia to consider this strategy shows they are not long for this world and choosing to use such a single-supplier would be foolhardy.

Saturday, 6 September 2014

Damocles is Mercenary

Oh boy ...

Seems that Just Add Water was working on a Damocles game!

"We've spent some time on pre-production, coming up with the overall direction, both visually, and as a story, as it's not a straight Damocles remake, it's using parts of the entire Mercenary story arc," JAW boss Stewart Gilray told VG247 today.

Despite the pre-production and close working relationship with original coder Paul Woakes, the project is currently on hold at JAW.

"We've had to shelve it for the moment unfortunately but it's something we are massively excited about coming back to," he confirmed.

Well I hope they don't shelve it for too long. Damocles was the only Amiga game I ever bought so it has a pretty big spot in my heart.

Games

I haven't been playing games much lately - hacking on code is more rewarding and satisfying and if i'm stuck or had enough or too tired I've been reading junk on the net, or watching a tiny bit of TV.

I like the new Doctor Who and i'm pleased they let him keep his "independent" accent. Although I'm sure i'm not alone in thinking of his character from The Thick of It. "Missy" is still the batshit-crazy HR chick from Green Wing too - which was a great character and it was a nice surprise to see she wasn't just a one-off for the first show (I think the whole 'heaven' and his intro as 'i'm over 2000 years old' may turn into some sort of connection with a particular fictional sandal-wearing character from that era; well probably not but it would be interesting if they did). The americanised torchwood OTOH is just not really very good ... but what can you do eh - the original wasn't really very good either if we're honest but the dumbed-down MacGyver-science stuff is a pretty shitful and unnecessary addition to the show. "Gwen" is a bit of a yummy mummy though ;-)

Back to games - since I haven't been playing much i'm kind of not sure why i'm terribly interested in these but there are some games coming up that are looking pretty sweet all the same.

DRIVECLUB

Evolution make great car games and I was always a fan of the way they handled hills and long-range views adding a bit of flair from their origins in flight simulators (at least that's what i understand from reading it somewhere). The amount of processing power available now is just staggering and allowing for some really amazing graphics and world simulation.

It will be interesting to see how the social aspect works. People just seem to love that kinda shit for some reason and racing games seem like a good fit due to their competitive nature and accessibility and that repetitive play continually improves your times.

Hmmm ... I still have a copy of Motorstorm Apocalypse I haven't got around to opening yet, amongst half a dozen other games. I preferred the WRC games for the most part but the loading times were always shit - that's another big "next generation" thing they seem to have addressed in DRIVECLUB.

The Tomorrow Children

Visually stunning and aesthetically unique - something that simply wasn't possible just a few years ago because neither the hardware nor the mathematics existed. I just wish all that async compute stuff filtered down to the APUs (faster).

Like part of DRIVECLUB the multiplayer is not fully synchronous. Everyone occupies the same persistent world in real(ish?)-time but they don't have the scalability problem of trying to render 500 dolls on screen at the same time by simply not showing other people unless they're interacting with the global state (e.g. they 'fade in' to pick up something, then fade out taking the something with them). This is a neat technological solution to the scalability issue but also addresses the confrontational aspect of most "traditional" multi-player games. Although player vs player games are quite popular a lot of people don't like them, me included, and this is one of many games adopting a different approach.

I'm not sure it will be the sort of game I would play because it looks like it will suck too much time and due to the multiplayer persistence force you to be constantly active and involved; but graphically and technically there is a lot of cool stuff going on there.

No Man's Sky. Or in Irish apparently "nomans-sky".

Technically very interesting again. This is probably something previously possible but nobody dared to try on quite this scale - or never managed to get the algorithms good enough to pull it off (assuming Hello Games can). Obviously i've been playing with noise lately which would be one of the underlying building blocks of making this work. There's absolutely no "random" in the noise algorithms although they are intended to appear that way.

I think it's rather cool that although it is a persistent shared galaxy "the mathematical chances of ever meeting anyone else is approximately zero and therefore anyone you meet is simply a product of an over-active imagination" - to paraphrase a certain book.

Actually the imagination runs wild on this one with the potential scope of the game - whole galaxy which can never be fully explored, galaxy-wide civilisations, traders, pirates, conflicts, archaeological remains and relics, forgotten settlements or downtrodden settlers, whole planets to roam. Reality might not be quite so fantastical but i'm still interested to see what sort of game they come up with and the universe their algorithms will create, and the potential is there for expansion toward loftier goals for years to come. I was interested to read that they have procedural room generation as well - will we be able to actually (finally) go inside every building we see? If you can do it for one there's nothing to stop you doing it for them all. That alone would be revolutionary.

I'm guessing that the main "impediment" to reaching the end-game will be the fuel required to travel from star to star and thus the main driving mechanism for the whole game will be acquiring that fuel. So my guess is the gameplay will revolve around performing tasks which earn the dough which can be used to buy the fuel to travel forward, but you're always limited in how far you can go. Although only 1 in 100 planets will have "advanced" life my guess is the barren ones will have rare/valuable minerals to be mined that will make them worth visiting too. If each solar system has one particular resource of interest and the livability of planets is based on the chemistry of the solar system it would make sense that resource abundance would also be correlated.

If they're smart they'll sell fuel canisters for real money - although I personally think it's pretty stupid to buy such "virtual goods" myself if people are dumb enough with their money then why not let them spend it? This would literally turn the game into a virtual tourism simulator which isn't such a despicable idea. Actually it will be interesting to see if they actually do this - although it could potentially imbalance most games of this type with so many planets in the galaxy it's effectively impossible to "wreck" this game that way.

Since you don't actually meet people my guess is the main 'multiplayer' aspect of the game will take place either IRL - via screenshots, blogs, faecebook, twatter, youtube, twitch and so on, or a similar in-game universal comms/atlas system. vidphones and "subspace" communications? That was definitely a mainstay in 30-60s sci-fi. Since they plan on some sort of multiplayer later on, and since you can't physically meet, my guess is it will have to be some sort of tele-holo-deck type thing to fit in the game world.

TBH it's really hard to imagine that their geography/flora/fauna/architecture/spaceship/route algorithms will provide enough variety to satisfy punters but it's really hard to imagine the sort of big numbers they're playing with (actually it's impossible). Basing the models on the way the real world works - using an alternative but consistent chemistry and physics - at least has the potential to be just as wild and varied.

I think there were a couple of other interesting things coming up but they slip my mind for the moment.

If these more novel games get any success (or even if they don't) i'm sure more will try which will just further add to the breadth of the gameosphere.

Too noisy

Been playing a bit with simplex noise. Interesting how much you can create from the same basic function and kind of cathartic and easy on the brain.

The following were all created with the same basic noise. 4 frequencies are combined in different ways. I'm using Z as an animator so they all smoothly animate.

Blobs of liquid. This uses max(). The frequency is the same for each layer but the amplitude is altered. Note that this is purely 2D and the depth appears due to attenuation.

Smoke or writhing organic mass. This uses max(abs()) and a lower frequency.

Lava lamp ringlets. Scales to an integer and selects one of the bits from the integer. Again the depth is from attenuation and scaling in this case.

Friesian cow-hide. Or a coastline. Or a burning piece of paper. Threshold with multiple frequencies. Works very nicely as a blending mask for image transition.
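The four combinations above might look something like the sketch below. Everything concrete in it is an assumption: `demo_noise` is a deterministic stand-in and not simplex noise, the per-layer offset in `blobs`, the frequency and amplitude ratios, and the 8-bit scaling in `ringlets` are all picked for illustration, and the attenuation step is left out. Only the combining operations (max, max of abs, bit select, threshold) come from the descriptions.

```c
typedef float (*noise3_fn)(float x, float y, float z);

/* Triangle wave in [-1,1] with period 4; helper for the stand-in below. */
static float tri(float t)
{
    t -= 4.0f * (float)(int)(t / 4.0f);
    if (t < 0) t += 4.0f;
    return t < 2.0f ? t - 1.0f : 3.0f - t;
}

/* Stand-in only: NOT simplex noise, just something deterministic in [-1,1]. */
static float demo_noise(float x, float y, float z)
{
    return tri(x + z) * tri(1.3f * y - z);
}

/* "Blobs": max() over 4 layers at the same frequency with falling amplitude.
 * The per-layer offset is an assumption so that the layers differ. */
static float blobs(noise3_fn noise, float x, float y, float z)
{
    float best = noise(x, y, z), amp = 1.0f;
    for (int i = 1; i < 4; i++) {
        amp *= 0.75f;
        float n = amp * noise(x + 17.0f * (float)i, y, z);
        if (n > best) best = n;
    }
    return best;
}

/* "Smoke": max(abs()) over 4 layers starting from a lower frequency. */
static float smoke(noise3_fn noise, float x, float y, float z)
{
    float best = 0.0f, freq = 0.25f;
    for (int i = 0; i < 4; i++) {
        float n = noise(x * freq, y * freq, z);
        if (n < 0) n = -n;
        if (n > best) best = n;
        freq *= 2.0f;
    }
    return best;
}

/* "Ringlets": scale to an 8-bit integer and pick out a single bit. */
static int ringlets(noise3_fn noise, float x, float y, float z, int bit)
{
    int i = (int)((noise(x, y, z) * 0.5f + 0.5f) * 255.0f);
    if (i < 0) i = 0;
    if (i > 255) i = 255;
    return (i >> bit) & 1;
}

/* "Cow-hide": threshold a sum of 4 frequencies; usable as a blend mask. */
static int cowhide(noise3_fn noise, float x, float y, float z, float threshold)
{
    float sum = 0.0f, freq = 1.0f, amp = 1.0f;
    for (int i = 0; i < 4; i++) {
        sum += amp * noise(x * freq, y * freq, z);
        freq *= 2.0f;
        amp *= 0.5f;
    }
    return sum > threshold;
}
```

Passing the frame time as `z` animates each of them, as with the Z animator mentioned above.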

I've mostly been playing with the hash function to try and create something epiphany efficient whilst still working sufficiently well. For 3D noise the current candidate uses 3 lookup tables to provide a basic hash of the x/y/z locations and they are combined using floating point multiplies and/or other bit ops. I'm only using random values which works most of the time although a better choice should be possible. It may be worth just going back to the permutation array of the original code as I realised I can implement that in only 256 bytes if i need to. I still don't know how it will run on the machine as the simplex setup code is pretty expensive too but I haven't looked at how to optimise it yet. Originally I was looking at the 2D noise because it was simpler but as 3D noise is just so much more useful I will target that instead.
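For reference, a minimal sketch of the permutation-array fallback using a single 256-byte table, with the index masked instead of doubling the table. It is not the 3-table candidate described above. The names `perm_init` and `hash3` and the LCG-driven shuffle are assumptions; any fixed permutation of 0..255 would do.

```c
#include <stdint.h>

static uint8_t perm[256];   /* the whole table: 256 bytes */

/* Fill perm[] with a shuffled 0..255 using a simple LCG (illustrative choice). */
static void perm_init(uint32_t seed)
{
    for (int i = 0; i < 256; i++)
        perm[i] = (uint8_t)i;
    for (int i = 255; i > 0; i--) {
        seed = seed * 1664525u + 1013904223u;
        int j = (int)((seed >> 16) % (uint32_t)(i + 1));
        uint8_t t = perm[i];
        perm[i] = perm[j];
        perm[j] = t;
    }
}

/* Nested lookups: each stage is masked to 8 bits so one table is enough.
 * The result is fully deterministic and repeats every 256 units per axis. */
static unsigned hash3(int x, int y, int z)
{
    unsigned h = perm[(unsigned)z & 255u];
    h = perm[((unsigned)y + h) & 255u];
    h = perm[((unsigned)x + h) & 255u];
    return h;
}
```

The masking costs an extra AND per stage compared with the usual doubled 512-entry table, which may or may not matter next to the simplex setup cost.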

I also created a 16-element spherical set of vectors for the base noise gradients. First I used an inscribed cube and some others I made up but then I finally found the code by Jon Leech (hint: it's at the bottom of the page!) which models electron repulsion to try to evenly space the points across the sphere. This does create a nicer result. 16 is used since it's a lot easier to calculate the modulus of 16 than it is for 12.
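The modulus point in code: with 16 gradients the index is a single AND, whereas 12 would need a real remainder. The table below is only a placeholder (the familiar 12 cube-edge directions padded to 16 by repeating four), not the repulsion-spaced set described above, and `grad_dot` is a made-up name.

```c
/* Placeholder gradient set: 12 cube-edge directions padded to 16.
 * NOT the evenly spaced spherical set; values are for illustration only. */
static const float grad16[16][3] = {
    { 1,  1,  0}, {-1,  1,  0}, { 1, -1,  0}, {-1, -1,  0},
    { 1,  0,  1}, {-1,  0,  1}, { 1,  0, -1}, {-1,  0, -1},
    { 0,  1,  1}, { 0, -1,  1}, { 0,  1, -1}, { 0, -1, -1},
    { 1,  1,  0}, {-1,  1,  0}, { 0, -1,  1}, { 0, -1, -1},
};

/* Select a gradient from the hash and dot it with the offset vector.
 * hash & 15 replaces hash % 16; a 12-entry table would need hash % 12. */
static float grad_dot(unsigned hash, float dx, float dy, float dz)
{
    const float *g = grad16[hash & 15u];
    return g[0] * dx + g[1] * dy + g[2] * dz;
}
```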

I do see patterns showing up particularly with the ringlet algorithm - lines at 45 degrees showing up as you zoom out - but this shows up for the original too. The noise is definitely not zero-mean. If I average over many frames I get fairly regular blobs at 45 degrees showing up also - although they are at 90 degrees to the ones that show up zoomed out, but again this is also present with the original Simplex noise hash function and gradient set.

Thursday, 4 September 2014

"Sentinel Saves Single Cycle Shocker!"

Whilst writing the last post I was playing with a tiny fragment to see how just testing the edge equations separate to the zbuffer loop would fare. It's a bit poor actually as even the simplest of loops will still require 9 cycles and so doesn't save anything - although one wouldn't be testing every location like this so it's pretty much irrelevant.

Irrelevant or not I did see an opportunity to save a single cycle ...

If one looks at this loop, it is performing 3 edge-equation positivity tests, and if those fail it then has to perform a loop-bounds test.

0000015e:       orr.l     r3,r18,r19      /|                  1                                             |
00000162:       fadd.l    r18,r18,r24     \|                  1234                                          |
00000166:       add.s     r0,r0,#0x0001   /|                   1                                            |
00000168:       fadd.l    r19,r19,r25     \|                   1234                                         |
0000016c:       orr.l     r3,r3,r20       /|                    1                                           |
00000170:       fadd.l    r20,r20,r26     \|                    1234                                        |
00000174:       bgte.s    0x0000017a       |                     1                                          |
00000176:       sub.s     r2,r0,r2         |                      1                                         |
00000178:       bne.s     0x0000015e       |                       1                                        |

Or in C.

  for (int x=x0; x < x1; x++) {
     if ((v0 >= 0) & (v1 >= 0) & (v2 >= 0))
       return x;
     v0 += v0x;
     v1 += v1x;
     v2 += v2x;
  }

Problem is it needs two branches and a specific comparison check.

Can a cycle be saved somehow?

No doubt, don't need Captain Obvious and the Rhetorical Brigadettes to work that one out.

Just as with the edge tests the sign bit of the loop counter can be used too: it can be combined with these so only one test-and-branch is needed in the inner loop. After the loop is finished the actual cause of loop termination can be tested separately and the required x value recovered.

It's not a sentinel; it's just combining logic that needs to be uncombined and tested post-loop, in a similar way to a sentinel.

000001ba:       orr.l     r3,r18,r19      /|                  1                                             |
000001be:       fadd.l    r18,r18,r24     \|                  1234                                          |
000001c2:       sub.s     r0,r0,#0x0001   /|                   1                                            |
000001c4:       fadd.l    r19,r19,r25     \|                   1234                                         |
000001c8:       orr.l     r3,r3,r20       /|                    1                                           |
000001cc:       fadd.l    r20,r20,r26     \|                    1234                                        |
000001d0:       orr.s     r1,r0,r3         |                     1                                          |
000001d2:       blt.s     0x000001ba       |                      1                                         |

Or something more or less the same in C with the pro/epilogues and probably broken edge cases:

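  int ix = x1 - x0 - 1;
  // keep going while any edge value is still negative and pixels remain
  while (((v0 < 0) | (v1 < 0) | (v2 < 0)) & (ix >= 0)) {
    ix -= 1;
    v0 += v0x;
    v1 += v1x;
    v2 += v2x;
  }
  // ix counts down from the right-hand end, so convert it back to x
  if (ix >= 0) {
    return x1 - 1 - ix;
  }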
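To check the idea rather than the cycle count, here is a self-contained sketch that puts the two-test loop and a single-test loop side by side so they can be compared on the same inputs. The function names and the `-1` "nothing found" return value are assumptions added for the example; the edge values are plain floats as in the listings.

```c
/* Reference: edge-sign test plus a separate loop-bounds test per pixel.
 * Returns the first x in [x0, x1) where all three edges are >= 0, else -1. */
static int first_covered_two_tests(int x0, int x1,
                                   float v0, float v1, float v2,
                                   float v0x, float v1x, float v2x)
{
    for (int x = x0; x < x1; x++) {
        if ((v0 >= 0) & (v1 >= 0) & (v2 >= 0))
            return x;
        v0 += v0x;
        v1 += v1x;
        v2 += v2x;
    }
    return -1;
}

/* Combined: the remaining-pixel count is folded into the same condition,
 * so the loop body has a single test-and-branch. Which condition ended
 * the loop is worked out afterwards. */
static int first_covered_one_test(int x0, int x1,
                                  float v0, float v1, float v2,
                                  float v0x, float v1x, float v2x)
{
    int ix = x1 - x0 - 1;   /* pixels remaining after the current one */

    while (((v0 < 0) | (v1 < 0) | (v2 < 0)) & (ix >= 0)) {
        ix -= 1;
        v0 += v0x;
        v1 += v1x;
        v2 += v2x;
    }
    /* ix < 0 means the pixels ran out; otherwise recover x from the count */
    return ix >= 0 ? x1 - 1 - ix : -1;
}
```

Whether a compiler actually turns the second form into one branch is a separate question; the hand-written assembly is what gets the cycle back.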

I suppose it's more a case of "Pretty Perverse Post Pontificates Pointlessly!"

Or perhaps it's just another pointless end to another pointless day.

Might go read till I fall asleep, hopefully it doesn't take long.