## Friday, 8 August 2014

### lambdas & streams

As part of the experiment with the histogram equalisation stuff I started writing a utility library for playing with images for prototyping code.

One thing I was curious about was whether I could use streams to simplify the prototyping by saving having to type and retype the typical processing loop:

```
for (int y = 0; y < image.height; y++) {
    for (int x = 0; x < image.width; x++) {
        // do something with pixel (x, y)
    }
}
```

When I'm prototyping stuff I type this in ... a lot.

I had mixed results.

Because I wanted to support arbitrary 2D subregions of an image, which might itself be a mapping of an arbitrary 2D subregion, I had to create my own 'spliterator' to do the work. After a couple of aborted attempts I came up with one that just turns the width×height range into a linear stream and then maps each linear index back to the local (x,y) when retrieving the pixel values (I tried to avoid the divide at first, but made a pig's breakfast of the maths).

It lets me write something like this to calculate the histogram over a sub-range of an image:

```
Image2D img;
int[] hist = new int[256];   // one bin per 8-bit value; int so the counts can't overflow

img.bytes(0, 0, 16, 16).forEach((v) -> {
    hist[v] += 1;
});
```
Ok, so far so good. It's not necessarily the best way to do it - it can't be parallelised for instance, but this is fine, it saves a few keystrokes and it lets one access a whole bunch of stream functionality "for free".
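
For reference, here is a minimal sketch of what such a spliterator might look like. It is not the code from my library: the class name `RegionSpliterator`, the `bytes()` helper and the row-major `byte[]` plus `stride` layout are all assumptions made for the example. It walks a linear index over the region and uses a divide and a modulo to get back to (x,y).

```java
import java.util.Spliterator;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

// Hypothetical sketch: streams the unsigned byte values of a 2D sub-region.
class RegionSpliterator implements Spliterator.OfInt {
    private final byte[] pixels; // assumed row-major backing store
    private final int stride;    // pixels per row of the backing store
    private final int x0, y0;    // top-left corner of the sub-region
    private final int width;     // width of the sub-region
    private int index;           // next linear index inside the region
    private final int end;       // one past the last linear index

    RegionSpliterator(byte[] pixels, int stride, int x0, int y0, int width, int height) {
        this(pixels, stride, x0, y0, width, 0, width * height);
    }

    private RegionSpliterator(byte[] pixels, int stride, int x0, int y0,
                              int width, int index, int end) {
        this.pixels = pixels;
        this.stride = stride;
        this.x0 = x0;
        this.y0 = y0;
        this.width = width;
        this.index = index;
        this.end = end;
    }

    @Override
    public boolean tryAdvance(IntConsumer action) {
        if (index >= end)
            return false;
        // Map the linear index back to image coordinates.
        int x = x0 + index % width;
        int y = y0 + index / width;
        index++;
        action.accept(pixels[y * stride + x] & 0xff);
        return true;
    }

    @Override
    public Spliterator.OfInt trySplit() {
        // Hand the first half of the remaining range to a new spliterator.
        int mid = (index + end) >>> 1;
        if (mid <= index)
            return null;
        RegionSpliterator prefix =
            new RegionSpliterator(pixels, stride, x0, y0, width, index, mid);
        index = mid;
        return prefix;
    }

    @Override
    public long estimateSize() {
        return end - index;
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | NONNULL;
    }

    // Hypothetical stand-in for img.bytes(x, y, width, height).
    static IntStream bytes(byte[] pixels, int stride, int x0, int y0, int width, int height) {
        return StreamSupport.intStream(
            new RegionSpliterator(pixels, stride, x0, y0, width, height), false);
    }
}
```

Because the split only ever cuts the linear range in two, the same class works unchanged if the stream is switched to parallel.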

The problem is with images you normally want to write to them or modify them. So you're back to just using a loop, or maybe a custom foreach which supplies coordinates to a lambda function: again this is fine but then you don't get any of the stream functionality for free here (although as in the next section: it's good enough?). You could just use an IntStream, ... but that doesn't really save any typing over a for loop.
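
The "custom foreach which supplies coordinates to a lambda" could be as small as the sketch below. The names `PixelLoop`, `XYConsumer` and `forEachXY` are made up for the example, and a flat row-major `byte[]` stands in for the image class.

```java
// Hypothetical helper: hides the double loop behind a lambda taking (x, y).
class PixelLoop {
    @FunctionalInterface
    interface XYConsumer {
        void accept(int x, int y);
    }

    // Calls action once per pixel of the given sub-region, row by row.
    static void forEachXY(int x0, int y0, int width, int height, XYConsumer action) {
        for (int y = y0; y < y0 + height; y++)
            for (int x = x0; x < x0 + width; x++)
                action.accept(x, y);
    }

    // Example use: write an inverted copy of an 8-bit image.
    static byte[] invert(byte[] src, int width, int height) {
        byte[] dst = new byte[src.length];
        forEachXY(0, 0, width, height, (x, y) ->
            dst[y * width + x] = (byte) (255 - (src[y * width + x] & 0xff)));
        return dst;
    }
}
```

It saves the typing and can write to the destination, but as noted there's no stream on the other end of it.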

Staying within the confines of the existing IntStream type for the sake of argument, the solution is a little clumsy. One first has to create a class which implements the functions required to be used as a collector.

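```
static class ByteArray {
    byte[] data;

    // accumulator: append one value (body omitted)
    public void add(int b);
    // combiner: append everything another ByteArray collected (body omitted)
    public void addAll(ByteArray b);
}
```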
With that in place it can be used to collect the results of the calculation, in this case a pixel mapping from one set of intensity values to another.
```
byte[] pixels = src.bytes()
    .map((int v) -> map[v])
    .collect(ByteArray::new, ByteArray::add, ByteArray::addAll)
    .data;

Image2D dst = new ByteImage(src.width, src.height, pixels);
```

This can run in parallel: the downside is that each stage needs to allocate its own buffers and then allocate copies of these up to the final result. Probably works but yeah, it's not that pretty or efficient.
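
To make the allocation cost concrete, here is one way the `ByteArray` bodies could be filled in. This is only a sketch under my own assumptions: the wrapper class `CollectDemo`, the `remap()` helper, the `size` field and `toArray()` are not in the outline above (they replace the bare `.data` access so the buffer can grow by doubling), and a plain `byte[]` stands in for the image.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class CollectDemo {
    // Growable byte buffer usable with IntStream.collect().
    static class ByteArray {
        byte[] data = new byte[16];
        int size;

        // Accumulator: append one value, doubling the buffer when full.
        public void add(int b) {
            if (size == data.length)
                data = Arrays.copyOf(data, size * 2);
            data[size++] = (byte) b;
        }

        // Combiner: append the contents of another partial result.
        public void addAll(ByteArray o) {
            int need = size + o.size;
            if (need > data.length)
                data = Arrays.copyOf(data, Math.max(need, data.length * 2));
            System.arraycopy(o.data, 0, data, size, o.size);
            size = need;
        }

        // Trim to the number of bytes actually collected.
        byte[] toArray() {
            return Arrays.copyOf(data, size);
        }
    }

    // Pass every pixel through a 256-entry lookup table, in parallel.
    static byte[] remap(byte[] src, byte[] map) {
        return IntStream.range(0, src.length)
            .parallel()
            .map(i -> map[src[i] & 0xff] & 0xff)
            .collect(ByteArray::new, ByteArray::add, ByteArray::addAll)
            .toArray();
    }
}
```

Every `addAll()` and the final `toArray()` is a copy, which is exactly the overhead complained about above. The result stays in pixel order because the source stream is ordered and the combiner is always called as left-then-right.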

### Indexed Stream

So I thought about it a little and perhaps a solution is to create another type of stream which indexes over the values. Some of the API usage gets a bit fatter if you want to use some of the basic stream facilities like sums and so on: but that's what .map() is for. I think it can get away without having to allocate the indexing object for each iteration: it is only needed when the stream range is split.

```
class IndexedInt {
    int value;
    int x;
    int y;
}

dst = new ByteImage(src.width, src.height);
src.bytes().forEach((ii) -> {
    dst.set(ii.x, ii.y, ii.value);
});
```

I dunno, I suppose that's better than a double-for-loop, once the not-insignificant scaffolding is in place.

Actually; why bother even passing the value in this case, it may as well just be calculating indices. It doesn't really make any difference to the code and having a general purpose 2D indexer is somewhat useful.

```
class Index2D {
    int x;
    int y;
}

dst = new ByteImage(src.width, src.height);
Index2D.range(0, 0, src.width, src.height)
    .forEach((ii) -> {
        dst.set(ii.x, ii.y, src.get(ii.x, ii.y));
    });
```

Some of the functionality is a little less concise but the simplicity of the above is probably worth it.

```
double average = Index2D.range(0, 0, src.width, src.height)
    .mapToInt((ii) -> src.get(ii.x, ii.y))
    .average()
    .getAsDouble();
```

Much of that could be hidden in helper functions and the external interface could remain an IntStream, for cases where the pixel locations are not required.
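
A quick way to try the `Index2D.range()` idea without writing a spliterator is to build it on top of `IntStream.range()`. Note this is a swapped-in shortcut, not the design described above: it allocates one `Index2D` per pixel rather than only on splits. The argument order `(x0, y0, width, height)`, the `average()` helper and the flat `byte[]` with `stride` are assumptions for the example.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical sketch of a general purpose 2D indexer.
class Index2D {
    final int x;
    final int y;

    Index2D(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Stream of every (x, y) in the region, row by row.
    static Stream<Index2D> range(int x0, int y0, int width, int height) {
        return IntStream.range(0, width * height)
            .mapToObj(i -> new Index2D(x0 + i % width, y0 + i / width));
    }

    // Helper that hides the indexing: average of the unsigned values in a region.
    static double average(byte[] pixels, int stride, int x0, int y0, int width, int height) {
        return range(x0, y0, width, height)
            .mapToInt(ii -> pixels[ii.y * stride + ii.x] & 0xff)
            .average()
            .orElse(0);
    }
}
```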

Seems like a lot of work just to get a free parallelisable 'sum' function though? The implementing classes still need a bunch of boilerplate/helpers and they could have just implemented most of that themselves. I don't find the fork/join approach to parallelisation (which is what the streams code uses) to be very efficient either.

But this is my first real look at it: experiments ongoing.

### Parallel histogram

I mentioned earlier that the histogram calculation using a forEach isn't parallelisable as is (one could add a synchronized block inside the loop but one would have to be both naive and stupid to do so).

It can be parallelised using a collector. TBH it's a lot of boilerplate for such a simple function but the algorithm is identical to the one you would use in OpenCL even if it doesn't look the same.

First, one needs the class to hold the results and intermediate results.

```
class Histogram {

    int[] hist;

    public Histogram() {
        hist = new int[256];   // one bin per 8-bit value
    }

    public void add(int value) {
        hist[value] += 1;
    }

    // merge another partial histogram into this one
    public void addHistogram(Histogram o) {
        for (int i = 0; i < hist.length; i++)
            hist[i] += o.hist[i];
    }
}
```

And then the code:

```
int[] hist;
Image2D img;

hist = img.bytes().parallel()
    .collect(Histogram::new, Histogram::add, Histogram::addHistogram)
    .hist;
```

*shrug*. I guess it saves some typing?
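
For anyone wanting to try the collector without the image library, the sketch below runs the same idea over a plain `byte[]`. The `HistogramDemo` wrapper and the `histogram()` helper are made up for the example, and the 256 bins assume unsigned 8-bit pixels.

```java
import java.util.stream.IntStream;

class HistogramDemo {
    // Holds one partial (or the final) histogram.
    static class Histogram {
        final int[] hist = new int[256];

        void add(int value) {
            hist[value] += 1;
        }

        void addHistogram(Histogram o) {
            for (int i = 0; i < hist.length; i++)
                hist[i] += o.hist[i];
        }
    }

    // Each worker fills its own Histogram; the partial results are then summed.
    static int[] histogram(byte[] pixels) {
        return IntStream.range(0, pixels.length)
            .parallel()
            .map(i -> pixels[i] & 0xff)
            .collect(Histogram::new, Histogram::add, Histogram::addHistogram)
            .hist;
    }
}
```

No locking is needed because no two threads ever touch the same `Histogram` until the merge step.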