Current Design
In the current implementation, iterating through an image is controlled by that image's grid. Each grid maintains a set of parameters which are used to create the iterator when an image is iterated through. So, if we consider an example.
In [8]: img = Image(N.zeros((5, 10, 15))) In [9]: for slice in img: ...: print slice.shape ...: ...: (10, 15) (10, 15) (10, 15) (10, 15) (10, 15)
We see that when an Image is used as an iterator, it by default iterates over the slices of the image. We can change the behavior of the iterator by setting iterator parameters on the image's grid. For example, to iterate over the last axis, we can do:
In [11]: img.grid.set_iter_param("axis", 2)
In [12]: for slice in img:
....: print slice.shape
....:
....:
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
(5, 10)
More complicated iteration methods can be obtained by setting various other parameters. They all involve calling img.grid.set_iter_param and then using the image itself as an iterator.
Issues with current design
- the dirty bomb problem: By relying on image state, we expose the entire codebase (plus 3rd party code) to the dangers of a single rogue image object (one that is misconfigured, or configured in an unexpected way). Since a single image object may travel arbitrarily far in the codebase and any other code that depends on it (3rd party users/developers), and since every bit of code that uses an object has a certain small chance of modifying it (if it's mutable) in an unexpected way, the chances of misconfiguration grows from slim to certain as more people develop on top of the library. This is an example of the general problem of rampant mutability in Python (which can be avoided with a development approach favoring object copying over mutation). Avoiding object mutation is a way to localize the potential damage caused by a bit of careless code, which we may not control, making the system more robust overall.
- lod violation: The Law of Demeter (LoD) for object oriented programming (http://www.ccs.neu.edu/home/lieber/LoD.html ) states that well behaved objects only talk to themselves and their immediate neighbors (where "talk" means "call a method of" and "neighbors" means other objects it has a direct reference to). Observing LoD is a great way to prevent interfaces from becoming strongly coupled with their implementations. In this case the violation comes when calling a method on an image's grid (i.e., the image is the direct neighbor; its grid is not, yet we're talking to the grid anyway). Ideally, the grid is a private attribute of image, and the image itself will have a set_iter_params method (assuming we stick with the path of mutation), which may or may not delegate to a grid internally, but who cares, since it won't change the calling interface. Then we can refactor the grid/iterator implementation at will without affecting client code.
Proposed Design
An alternate design (advocated here) is to have iterator classes which are explicitly used when iterating over an image is required. So, for example, to iterate by slices over the 2nd axis, the code would be
for slice in SliceIterator(img, axis=2): ...
In particular, having the image iterators being complete separate entities decouples them from the images and grids. This in turn will allow us to clarify the required interfaces of each of these classes and how they are used. This is an important step in solving the ImageMemoryOrdering problem.
The pros and cons for this method are:
Pros
- no image mutation (and be sure that the iterator objects themselves are also immutable)
- no LoD violation
Cons
- burdens the client with details: where do you import Slice_Iterator from? ...or was it Slicer, slices, or slice_iterator? is it a class or function? was image the first or second argument...? it seems perfectly reasonable to delegate slicing and dicing to a separate hierarchy of iterator classes. but then hiding those details behind a facade of image methods would make the interface more usable by optimizing the common use case.
Common Use Case
For example, we will clearly want to iterate over slices:
for slice in image.slices(): ...
We can pick a default slice axis that makes sense, which can be tweaked with an arg: image.slices(axis=2). Other common use cases might be: image.volumes(), image.voxels(), or a more general image.bricks(shape=(x,y,z), axis=n). There is no need to include the word "iterator" in each method name since we expect the results to be iterable anyway. Of course it should still be possible to import and directly use the iterator classes themselves, for power users, and those in the know:
for arbitrary_data in image.iterate(SomeCrazyIterator(...)):
...
Regarding using iterators to set the values of an image: data should never be fed into the next method of an iterator, since it is supposed to be a data source by definition. The push (vs pull) pattern muddies the waters. It seems like an abusive overloading of the iterator interface for something it's not intended to do. Also, if we're trying to chain iterators together, the incompatability between push and pull paradigms could cause confusion and bugs, basically creating a situation where we can construct push-chains and pull-chains, but require some third code where they meet in the middle to pull from the pull-chain and push into the push-chain. That's three new kinds of things. Rather than pushing data into an image (mutation again), we could let the image pull data into itself out of another iterator. That is, an image can be defined as the result of actualizing a given iterator. This would be a new way to construct an image object (rather than reading from a file or raw data stream).
There needs to be a way to specify what destination bricks/parcels the source bricks/parcels should go into when creating such a new image, and that the computation to calculate those slices is identical to the code already implemented in the iterator objects. Maybe the iterator objects should be considered as iteration definitions rather than actual iterators, so they could be asked to produce a sequence of arrays of slice objects (built-in Python slice objects). Then the destination image can use these to set its internal data array with results from the source iterator. Again a facade of image constructors can be written to hide the details for common cases. For example, to construct an image B in which the x-slices (if x is the default slice axis) are the y-slices of image A:
B = Image.from_slices(A.slices(axis=2))
or to turn A's y-slices into B's z-slices:
B = Image.from_slices(A.slices(axis=2), axis=3)
(assuming we use Fortran axis indices and the first is 1). then we could have Image.from_volumes, Image.from_voxels, etc., and in the most general case:
B = Image.from_iterator(A.iterate(OneIterator()), SomeOtherIterator(...))
Iterator objects should also be able to report their element shape, so that general constructor like this can barf if the source and destination don't match.
