Sunday, 2 March 2014

OpenCL binding - a bit more tweaking

Had another bit of a look at the OpenCL binding I was working on. I wasn't happy that some of the public interfaces still use long[] arrays to represent intptr_t arrays, especially for property lists. So I made it a bit more Java-ish. It's still a bit clumsy but it's about as good as it's going to get.

    public static native CLContext createContext(long[] properties,
            CLDevice[] devices,
            CLContextNotify notify) throws CLRuntimeException;
Becomes:
    public static native CLContext createContext(CLContextProperty[] properties,
            CLDevice[] devices,
            CLContextNotify notify) throws CLRuntimeException;
Properties all inherit from a base class:
public class CLProperty {
    protected final long tag;
    protected final long value;

    protected CLProperty(long tag, long value) {
        this.tag = tag;
        this.value = value;
    }
}
This is so the JNI code only needs to deal with one type of object. Then I have factory methods for the various property types.
public class CLContextProperty extends CLProperty {

    ...

    public static CLContextProperty CL_CONTEXT_PLATFORM(CLPlatform platform) {
        return new CLContextProperty(CL.CL_CONTEXT_PLATFORM, platform.p);
    }

    ...

}
Although I might make the names more java-friendly.

So based on the createContext interfaces above, one changes:

  cl = createContext(new long[] { CL.CL_CONTEXT_PLATFORM, platform.p, 0 },
                     new CLDevice[] { dev },
                     null);
to:
  cl = createContext(new CLContextProperty[] { CLContextProperty.CL_CONTEXT_PLATFORM(platform) },
                     new CLDevice[] { dev },
                     null);
It's not like it saves typing but it is type-safe, and you don't have to remember to put the closing 0 tag on the end of the list. Perhaps the factory methods should sit on CLContext for that matter.
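
To illustrate why the closing 0 no longer has to be remembered, here is a minimal sketch of how a property array could be flattened into the tag/value list that OpenCL expects. In the binding this step would live in the JNI code; the `Property` and `PropertyLists` classes and the `flatten` helper are hypothetical stand-ins for illustration, not part of the binding.

```java
import java.util.Arrays;

// Hypothetical stand-in for CLProperty: an immutable tag/value pair.
class Property {
    final long tag;
    final long value;

    Property(long tag, long value) {
        this.tag = tag;
        this.value = value;
    }
}

class PropertyLists {
    // Flatten { (tag, value), ... } into { tag, value, ..., 0 }.
    // A null or empty array yields just the terminator.
    static long[] flatten(Property[] properties) {
        int count = properties == null ? 0 : properties.length;
        long[] list = new long[count * 2 + 1];

        for (int i = 0; i < count; i++) {
            list[i * 2] = properties[i].tag;
            list[i * 2 + 1] = properties[i].value;
        }
        // The last slot is left at 0, which terminates the list.
        return list;
    }

    public static void main(String[] args) {
        // 0x1084 is CL_CONTEXT_PLATFORM in cl.h; the handle value is made up.
        Property[] props = { new Property(0x1084, 0x1234L) };
        System.out.println(Arrays.toString(flatten(props)));
    }
}
```

Because the terminator is added in exactly one place, callers can't forget it.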

Callbacks, Leaks, Lambdas

Another part I looked into implementing was the callback methods from C to Java, such as the one passed to createContext or buildProgram.

This is mostly straightforward - just pass a hook function to the OpenCL call which locates an environment and invokes the callback function on an interface. There is no need to support a 'user data' field for the java side, so that is just used to pass a global reference to the interface itself.

If one considers the generic interface used for build callbacks:

public interface CLNotify<T> {
    public void notify(T source);
}
The C hook is relatively straightforward ...
static void build_notify_hook(cl_program prog, void *data) {
    jobject jnotify = data;
    jobject source;
    JNIEnv *env;
    jvalue arg;

    // Use the current thread's environment, attaching the thread if it has none.
    if ((*vm)->GetEnv(vm, (void **)&env, JNI_VERSION_1_4) != JNI_OK
        && (*vm)->AttachCurrentThread(vm, (void **)&env, NULL) != JNI_OK) {
        fprintf(stderr, "Unable to attach java environment\n");
        return;
    }

    // Wrap the raw handle in a new CLProgram holder object.
    arg.j = (jlong)prog;
    source = (*env)->NewObjectA(env, classid[PROGRAM], new_p[PROGRAM], &arg);
    if (!source)
        return;

    arg.l = source;
    (*env)->CallVoidMethodA(env, jnotify, CLNotify_notify, &arg);
}
(FIXME: this may need to detach the thread also). (FIXME: this may need to de-ref jnotify)

One notices that the callback simply creates a new CLProgram object instance to pass the pointer to Java. This means that OpenCL handles may map to more than one Java object: this goes some way to validating my decision to stick with simple holder objects rather than trying to keep some data copied to the Java side. Although it wouldn't be that difficult to track object instances if necessary: instead of calling NewObject(), invoke a factory method which handles the object instances. Albeit at the cost of duplicating the reference tree in Java.
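
A rough sketch of what such a factory method might look like on the Java side, assuming a simple holder keyed on the native pointer. `Handle`, `HandleRegistry` and `forPointer` are hypothetical names for illustration; nothing like this exists in the binding yet.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder object: just wraps the native pointer.
class Handle {
    final long p;

    Handle(long p) {
        this.p = p;
    }
}

class HandleRegistry {
    // Weak references so the registry does not keep holders alive by itself.
    private static final Map<Long, WeakReference<Handle>> instances = new HashMap<>();

    // The C hook would call this static method instead of NewObject().
    static synchronized Handle forPointer(long p) {
        WeakReference<Handle> ref = instances.get(p);
        Handle h = ref == null ? null : ref.get();

        if (h == null) {
            // First sighting of this pointer, or the old holder was collected.
            h = new Handle(p);
            instances.put(p, new WeakReference<>(h));
        }
        return h;
    }
}
```

This sketch never purges stale map entries; a real version would probably need a ReferenceQueue or an explicit release call, which is part of the duplicated bookkeeping mentioned above.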

Another bonus I didn't realise is that the way lambdas are implemented allows these to be used from the Java side without the JNI needing to know anything about it. I think I did read about this at some point but it's been a while and I forgot about it. I had a look at a disassembly of the class file and it's just using invokedynamic to create an interface object which is just a function pointer rather than having to create an instance of an abstract class.

So e.g. this works:

  prog.buildProgram(new CLDevice[]{dev}, null,
    (CLProgram source) -> {
        System.out.printf("Build notify, status = %d\nlog:\n",
            source.getBuildInfoInt(dev, CL_PROGRAM_BUILD_STATUS));
        System.out.println(source.getBuildInfoString(dev, CL_PROGRAM_BUILD_LOG));
    });
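
The same point in a standalone form, runnable without the binding: the caller only ever sees the interface, so an anonymous class, a lambda and a method reference are interchangeable. `Notify`, `NotifyDemo` and `fire` are hypothetical names; `fire` just stands in for the native side invoking the callback.

```java
import java.util.ArrayList;
import java.util.List;

// Same shape as the CLNotify interface above, renamed to keep the sketch standalone.
interface Notify<T> {
    void notify(T source);
}

class NotifyDemo {
    static final List<String> log = new ArrayList<>();

    // Stands in for the native side: it only ever sees the interface.
    static <T> void fire(Notify<T> callback, T source) {
        callback.notify(source);
    }

    static void record(String source) {
        log.add("method-ref:" + source);
    }

    public static void main(String[] args) {
        // Pre-lambda style: an anonymous class.
        fire(new Notify<String>() {
            @Override
            public void notify(String source) {
                log.add("anonymous:" + source);
            }
        }, "prog");
        // Lambda and method reference satisfy the same interface.
        fire((String source) -> log.add("lambda:" + source), "prog");
        fire(NotifyDemo::record, "prog");
        System.out.println(log);
    }
}
```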

The one very big caveat for all of the above ... is that I haven't worked out a clean way to avoid leaking the notify instance object. This is because the OpenCL api specifies that these callback functions may be invoked asynchronously and/or from other threads.

Thinking aloud:

For the specific case of clBuildProgram and friends it looks like the notify function is only ever (and always) called once and I can thus deref the interface in the hook routine. If I pass both the CLProgram object and CLNotify interface to the hook routine I can keep the CLProgram instance unique anyway ... (And to be honest I'm not sure how useful this mechanism is to start with since it's easier just to compile synchronously and check the return code / exception).

But CLContext has its own notify function too, which needs to live as long as the CLContext, so I can't use the same trick there. At first I thought of creating a set/remove listener interface that just keyed everything off the pointer value and tracked the listeners in Java. But that doesn't work because presumably it's possible to get a callback without ever getting a context. I guess I could use the listener itself as a key and provide a static native clearNotify() method which must be called explicitly, but it gets a bit messy for a few reasons.

struct notify_info {
  int id;
  jobject jnotify;
};

clCreateContext(..., jobject jnotify) {
...
  lock {
   info = malloc();
   info.id = getsequence();
   info.jnotify = NewGlobalRef(jnotify);
   listeners.add(info);
  }
...
  clCreateContext(..., create_context_hook, (void *)info.id);
...
}

create_context_hook(..., void *data) {
  int id = (int)data;

  lock {
     info = listeners.find(id);
     if (info) {
        ... invoke info.jnotify;
     }
  }
}

clear_context_notify(..., jobject jnotify) {
   lock {
      info = listeners.find(jnotify);
      if (info) {
         deleteGlobalRef(info.jnotify);
         listeners.remove(info);
      }
   }
}
Yeah, messy. A bunch of it could be (synchronized) static Java methods, but it just isn't particularly elegant either way.
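
For comparison, a sketch of the Java-side variant of the same bookkeeping, under the assumption that the C hook only carries the integer id and calls back into a static method. `ContextNotify`, `ContextListeners` and the single String argument are hypothetical simplifications; the real callback interface in the binding may look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical listener type, simplified to a single String argument.
interface ContextNotify {
    void notify(String what);
}

class ContextListeners {
    private static final Map<Integer, ContextNotify> listeners = new HashMap<>();
    private static int sequence = 0;

    // Called before clCreateContext(); the returned id is what gets passed as user data.
    static synchronized int add(ContextNotify notify) {
        int id = ++sequence;
        listeners.put(id, notify);
        return id;
    }

    // Called from the C hook with the id recovered from the user data pointer.
    static void dispatch(int id, String what) {
        ContextNotify notify;
        synchronized (ContextListeners.class) {
            notify = listeners.get(id);
        }
        // Invoke outside the lock so a slow listener can't block add/clear.
        if (notify != null)
            notify.notify(what);
    }

    // Explicit removal, keyed on the listener itself; returns true if it was registered.
    static synchronized boolean clear(ContextNotify notify) {
        return listeners.values().removeIf(l -> l == notify);
    }
}
```

The map holds a strong reference to each listener, so the C side would not need a global reference at all, but the explicit clear() call is still required to avoid the leak. Unlike the pseudocode above, this version invokes the listener outside the lock, which is a design choice rather than a requirement.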

Again I'm not sure how useful implementing this precise interface is anyway: it may just as well do to implement a completely separate system which funnels all events through a global event handler mechanism.
