Friday, 8 December 2017

jjmpeg?

Well i've had reason to visit jjmpeg again for something and although it's still doing the job, it's a very very long way behind in version support (0.10.x?). I've added a couple of things here and there (recently AVFormatContext.open_input so I could open compressed webcam streams) but i'm not particularly interested in dropping another release.

But ... along the way I started looking into writing a new version that will be up to date with current ffmpeg. It's a pretty slow burner and i'm going to be pretty busy with something (relatively interesting, moderately related) for the next couple of months.

But regardless here are a few what-if's should I continue with the effort.

  • The old generator + garabage collection support required 4 classes per object.

    1. Abstract autogenerated native WeakReference based accessors with native-oriented methods (passing `pointers').
    2. Manually written native accessors as above and the glue to make it all work.
    3. Abstract autogenerated public accessors with public-oriented methods (passing objects).
    4. Manually written public accessors as above and the glue to make it all work.

    Whilst most of it is autogenerated the generator sucks to maintain and it's a bit of a mess. I've also since learnt that cutting down the number of classes is desirable.

    So instead i'll use the "CObject" mechanism with the WeakReference being a simple native pointer holder object which also knows how to free it. In this case at most 2 custom classes are required - one for autogenerated code (if that happens) and any helper/custom code.

    A few things require reflection going this route but the overheads should be acceptable.

  • Native memory was wrapped in native ByteBuffer objects.

    Originally the goal was to have java just access fields directly but in practice this wasn't practical as the structures change depending on the compile options so you end up with both the C and Java code being system specific, and the Java code requires a compiler to implement it (C being handled by gcc). A side-goal was to make the Java library bit-size independent without resoring to long - although that's all ByteBuffer uses.

    Because the objects are just wrapped on the pointer there is the possibility that multiple objects can be created to reference the same underlying C object (e.g. getStreams().get(0) repeated). Whilst this isn't as bad as it sounds one has to ensure the objects aren't holding any of their own state java-side for example. It also turns out that a direct ByteBuffer isn't terribly fast either from the Java side or looking up from the C side (not sure why on the latter).

    CObject just uses a long directly, which also precludes the likelyhood of poking around C internals by accident or otherwise. It also ensures unqiue objects reference unique pointers - this requires some overhead but it isn't onerous.

  • Using two concrete classes per object allowed the internal details of passing pointers (ByteBuffer) around to be hidden from the luser.

    • But it requires a lot of scaffolding! The same method written at least 2 times!
    • Although the C call gets a ByteBuffer directly, looking up the host pointer still requires a JNIEnv callback.

    CObject likewise uses an accessor to retrieve the native pointer, but because it's the super-class of all objects the objects can simply be passed in directly. That is native methods just look like java methods and so there is no need for any trampolines between method interfaces. It does require a bit more support on the JNI side when returning objects, but it's trivial code.

    An alternative would be to use a long and pass that around but then you still need the public/native separation and all the hassle that entails.

  • A lot of the current binding is autogenerated.

    Once the generator was written this was fairly easy to maintain but getting the generator complete enough was a lot of work. The biggest issue is that the api just isn't very consistent, and some just don't map very nicely to a Java api. Things such as out parameters - or worse, absolute snot like like AVDictionary that should never have existed (onya libav!).

    Each case required special-case code in the generator, often extra support code, and sometimes a fall-back to manually writing the whole lot.

    Working with zcl - and in that case the OpenCL apis are much cleaner and consistent - I discovered it was somewhat less work just to do it manually and not really any harder to maintain afterwards. At least in the case of the original ffmpeg the inconistency was simply because it wasn't originally intended as a public library, and I suspect the newer versions might be a bit better.

    I'm still undecided about simple data accessors as a good case can be made for saving the typing if there are many to write. So perhaps they could still be autogenerated (to an abstract super class as now), or they could be parameterised like they are in zcl (i.e. internally getInt(field) with public wrappers). Another half and half option would be to use the C preprocessor to do most of the ugly work and still write the Java headers by hand. Probably the last one.

Does anyone else care?

No comments: