| 154 | | As a regular user of MaskedArray, I became increasingly frustrated with the subclassing of masked arrays (even if I can only blame my inexperience). I needed to develop a class of arrays that could store some additional information along with numerical values, while keeping the possibility for missing data (picture storing a series of dates along with measurements). I started to implement such a class, but then quickly realized that any additional information disappeared when processing these subarrays (for example, adding a constant value to a subarray would erase its dates). I ended up writing the equivalent of numpy.core.ma for my particular class, ufuncs included. Everything went fine until I needed to subclass my new class, when more problems showed up: some attributes of the new subclass were lost during processing. I identified the culprit as MaskedArray, which returns masked ndarrays when I expected masked arrays of my class. I was preparing myself to rewrite numpy.core.ma when I forced myself to learn how to subclass ndarrays. As I became more familiar with the {{{__new__}}} and {{{__array_finalize__}}} methods, I started to wonder why masked arrays were objects, and not ndarrays, and whether it wouldn't be more convenient for subclassing if they did behave like regular ndarrays. |
| 155 | | |
| 156 | | The attachment is what I eventually come up with. The main differences with the initial {{{numpy.core.ma}}} package are that {{{MaskedArray}}} is now a subclass of {{{ndarray}}} and that the {{{_data}}} section can now be any subclass of {{{ndarray}}} (well, it should work in most cases, some tweaking might required here and there). Apart from a couple of issues listed below, the behavior of the new {{{MaskedArray}} class reproduces the old one. It is quite likely to be significantly slower, though: I was more interested into a clear organization than in performance, so I tended to use wrappers liberally. I'm sure we can improve that rather easily. Note that I didn't try to time any methods. |
| 157 | | I also attach a unittest suite (here), modeled after the standard numpy one, along with some utiliies for testing (here). The old {{{test_ma}}} can also be run with the new package but it does fail in some places, see below. |
| | 154 | As a regular user of {{{MaskedArray}}}, I became increasingly frustrated with the subclassing of masked arrays (even if I can only blame my inexperience). I needed to develop a class of arrays that could store some additional information along with numerical values, while keeping the possibility for missing data (picture storing a series of dates along with measurements). I started to implement such a class, but then quickly realized that any additional information disappeared when processing these subarrays (for example, adding a constant value to a subarray would erase its dates). I ended up writing the equivalent of {{{numpy.core.ma}}} for my particular class, ufuncs included. Everything went fine until I needed to subclass my new class, when more problems showed up: some attributes of the new subclass were lost during processing. I identified the culprit as {{{MaskedArray}}}, which returns masked ndarrays when I expected masked arrays of my class. I was preparing myself to rewrite {{{numpy.core.ma}}} when I forced myself to learn how to subclass ndarrays. As I became more familiar with the {{{__new__}}} and {{{__array_finalize__}}} methods, I started to wonder why masked arrays were plain objects rather than ndarrays, and whether it wouldn't be more convenient for subclassing if they behaved like regular ndarrays. |
| | 155 | |
| | 156 | The new {{{maskedarray}}} [attachment:maskedarray.py package ] is what I eventually came up with. The main differences from the initial {{{numpy.core.ma}}} package are that {{{MaskedArray}}} is now a subclass of {{{ndarray}}} and that the {{{_data}}} section can now be any subclass of {{{ndarray}}} (well, it should work in most cases; some tweaking might be required here and there). Apart from a couple of issues listed below, the behavior of the new {{{MaskedArray}}} class reproduces the old one. It is quite likely to be significantly slower, though: I was more interested in a clear organization than in performance, so I tended to use wrappers liberally. I'm sure we can improve that rather easily. Note that I didn't try to time any methods. |
| | 157 | I also attach a unittest suite ([attachment:test_maskedarray.py here]), modeled after the standard numpy one, along with some utilities for testing ([attachment:masked_testutils.py here]). The old {{{test_ma}}} can also be run with the new package, but it fails in some places (see below). |