The notes are there, but here's a little more on the array stuff.
Basically it just lets you pass a Java array instead of a Buffer to the read/write buffer/image commands. It supports non-blocking operations too.
I was looking through the aparapi source and noticed it was using GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical in places where I didn't think you should. So I tried using those to pin the array, rather than the Get<Type>ArrayElements calls which I tried last time (and decided not to include). This should be more efficient because the *Elements calls all seem to allocate memory and create a copy.
A really dumb test case bears this out, and I'm getting a 3-4x performance improvement. Yay! I think that's enough to include it in the API now, so I did - even though it bloats the API considerably, with 7 overloaded entry points for each such method (Buffer, byte, short, int, long, float, double).
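To give an idea of what that bloat looks like, here is a rough sketch of one such method and its 7 overloads. The class and method names (`QueueSketch`, `enqueueWriteBuffer`) are made up for illustration and are not the real zcl signatures; each overload just reports which path was picked.

```java
import java.nio.Buffer;

// Hypothetical stand-in for a command queue; not the real zcl class.
final class QueueSketch {
    // One entry point per source type. A real binding would hand each
    // of these to a matching native method; here they only name the path.
    static String enqueueWriteBuffer(Buffer src)   { return "Buffer"; }
    static String enqueueWriteBuffer(byte[] src)   { return "byte[]"; }
    static String enqueueWriteBuffer(short[] src)  { return "short[]"; }
    static String enqueueWriteBuffer(int[] src)    { return "int[]"; }
    static String enqueueWriteBuffer(long[] src)   { return "long[]"; }
    static String enqueueWriteBuffer(float[] src)  { return "float[]"; }
    static String enqueueWriteBuffer(double[] src) { return "double[]"; }
}
```

The compiler picks the overload from the static type of the argument, so the caller just passes whatever array it has. Multiply that by every read/write buffer/image command and it adds up quickly.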
But there's something of a problem: because the non-blocking commands are asynchronous, the array has to stay pinned for an indeterminate time. The JNI docs describe this as undesirable, but without deeper knowledge it's hard to know whether it matters on real JVMs.
I'm using an event listener to find out when a non-blocking job is complete and then releasing the array as soon as possible, but this adds measurable overhead and other potential complications. I may try to implement a more explicit management mechanism and see if that makes a lot of difference, but it would have to be significant to be worth the extra hassle involved in using such an API.
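The release-on-completion idea can be sketched in plain Java. This is only a model of the bookkeeping, not the actual zcl code: `PinSketch`, `enqueueWrite` and the `pinned` counter are hypothetical, and a `CompletableFuture` stands in for the OpenCL event that the listener would be attached to.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: counts how many arrays are currently "pinned".
final class PinSketch {
    static final AtomicInteger pinned = new AtomicInteger();

    // 'event' is a stand-in for the completion event of a non-blocking
    // command. The array counts as pinned from enqueue until it fires.
    static CompletableFuture<Void> enqueueWrite(float[] data,
                                                CompletableFuture<Void> event) {
        pinned.incrementAndGet();                       // pin on enqueue
        return event.thenRun(pinned::decrementAndGet);  // release in the listener
    }
}
```

The point is that the release happens whenever the event fires, not when the enqueue call returns, so nothing on the Java side controls how long the pin lasts.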
But until I have a use for zcl, ... it might all just be on hold for the moment, because ...
Something I mentioned earlier was revisiting the elf-loader code with a different set of goals. And so I'm thinking of pausing this OpenCL stuff and moving in that direction. Right now, apart from OpenCL, there's no way to access the Epiphany chips from Java, and I for one have no interest in doing any frontend stuff in C (there is some work on the Sumatra thing but that could be a while and serves a different purpose anyway).
The more I think about it, the more it seems like the elf-loader code should provide a pretty good basis for a decent Epiphany runtime: one that still lets you write the low-level code on the device, but gives you an easier way to manage that code than a fugly linker script and an object format that was designed for a system with processes running in virtual memory.
Quite a bit of work though, so I've gotta be peachy keen to get it started.
Update: I found out that I was completely wrong about the way aparapi uses Get/ReleasePrimitiveArrayCritical() and, as I originally thought, holding a critical array across JNI calls and/or threads is probably not a very good idea at all, even if it works. It basically just locks GC completely (and not much else). So I will probably have to resort to using a temporary malloc() buffer to honour the API for non-blocking transfers. That's if I don't just revert it all out of existence. I may not even be able to use critical access for the blocking transfers, due to supporting Java native kernels and various notification callbacks.
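For what it's worth, the temporary-buffer approach looks roughly like this when modelled on the Java side. It is only a sketch under assumptions: `StagingSketch` and `stage` are hypothetical names, and a direct `ByteBuffer` stands in for the malloc() buffer that the native code would allocate.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model of staging an array for a non-blocking write.
final class StagingSketch {
    // Copy the array into memory outside the Java heap, so the
    // transfer can outlive the call without pinning the array.
    static ByteBuffer stage(float[] src) {
        ByteBuffer tmp = ByteBuffer
            .allocateDirect(src.length * Float.BYTES)
            .order(ByteOrder.nativeOrder());
        tmp.asFloatBuffer().put(src);  // the one extra copy
        return tmp;                    // to be freed when the transfer completes
    }
}
```

The array is free to move or change as soon as `stage` returns, at the cost of the copy that pinning was supposed to avoid.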