Changes between Version 21 and Version 22 of MaskedArray

Timestamp:
10/16/06 01:25:11 (7 years ago)
Author:
pierregm
Comment:

--

  • MaskedArray

    v21 v22  
    152152= An alternative implementation of MaskedArray = 
    153153 
    154 As a regular user of MaskedArray, I became increasingly frustrated with the subclassing of masked arrays (even if I can only blame my inexperience). I needed to develop a class of arrays that could store some additional information along with numerical values, while keeping the possibility for missing data (picture storing a series of dates along with measurements). I started to implement such a class, but then quickly realized that any additional information disappeared when processing these subarrays (for example, adding a constant value to a subarray would erase its dates). I ended up writing the equivalent of numpy.core.ma for my particular class, ufuncs included. Everything went fine until I needed to subclass my new class, when more problems showed up: some attributes of the new subclass were lost during processing. I identified the culprit as MaskedArray, which returns masked ndarrays when I expected masked arrays of my class. I was preparing myself to rewrite numpy.core.ma when I forced myself to learn how to subclass ndarrays. As I became more familiar with the {{{__new__}}} and {{{__array_finalize__}}} methods, I started to wonder why masked arrays were objects, and not ndarrays, and whether it wouldn't be more convenient for subclassing if they did behave like regular ndarrays. 
    155  
    156 The attachment is what I eventually come up with. The main differences with the initial {{{numpy.core.ma}}} package are that {{{MaskedArray}}} is now a subclass of {{{ndarray}}} and that the {{{_data}}} section can now be any subclass of {{{ndarray}}} (well, it should work in most cases, some tweaking might required here and there). Apart from a couple of issues listed below, the behavior of the new {{{MaskedArray}} class reproduces the old one. It is quite likely to be significantly slower, though: I was more interested into a clear organization than in performance, so I tended to use wrappers liberally. I'm sure  we can improve that rather easily. Note that I didn't try to time any methods. 
    157 I also attach a unittest suite (here), modeled after the standard numpy one, along with some utiliies for testing (here). The old {{{test_ma}}} can also be run with the new package but it does fail in some places, see below. 
     154As a regular user of {{{MaskedArray}}}, I became increasingly frustrated with the subclassing of masked arrays (even if I can only blame my inexperience). I needed to develop a class of arrays that could store some additional information along with numerical values, while keeping the possibility of missing data (picture storing a series of dates along with measurements). I started to implement such a class, but quickly realized that any additional information disappeared when processing these subarrays (for example, adding a constant value to a subarray would erase its dates). I ended up writing the equivalent of {{{numpy.core.ma}}} for my particular class, ufuncs included. Everything went fine until I needed to subclass my new class, when more problems showed up: some attributes of the new subclass were lost during processing. I identified the culprit as {{{MaskedArray}}}, which returns masked ndarrays when I expected masked arrays of my class. I was preparing myself to rewrite {{{numpy.core.ma}}} when I forced myself to learn how to subclass ndarrays. As I became more familiar with the {{{__new__}}} and {{{__array_finalize__}}} methods, I started to wonder why masked arrays were plain objects rather than ndarrays, and whether it wouldn't be more convenient for subclassing if they behaved like regular ndarrays. 
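The role of {{{__new__}}} and {{{__array_finalize__}}} mentioned above can be illustrated with a minimal sketch. It is not taken from the attached package: the class name {{{DatedArray}}} and its {{{dates}}} attribute are hypothetical, chosen to mirror the "dates along with measurements" example. It only shows how an ndarray subclass can keep an extra attribute when a constant is added.

```python
import numpy as np


class DatedArray(np.ndarray):
    """Hypothetical ndarray subclass that carries a `dates` attribute."""

    def __new__(cls, values, dates=None):
        # View the input as an instance of this class, then attach the extra info.
        obj = np.asarray(values, dtype=float).view(cls)
        obj.dates = dates
        return obj

    def __array_finalize__(self, obj):
        # NumPy calls this for explicit construction (obj is None),
        # for views, and for arrays created from this class (e.g. ufunc results).
        if obj is None:
            return
        # Copy the attribute from the source array, if it has one.
        self.dates = getattr(obj, "dates", None)


series = DatedArray([1.0, 2.0, 3.0],
                    dates=["2006-10-14", "2006-10-15", "2006-10-16"])
shifted = series + 10.0  # the result is still a DatedArray with its dates
```

Without {{{__array_finalize__}}}, {{{shifted}}} would still be a {{{DatedArray}}} but would have no {{{dates}}} attribute. Note that this sketch copies the attribute as-is: slicing the array does not slice {{{dates}}} along with it.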
     155 
     156The new {{{maskedarray}}} [attachment:maskedarray.py package] is what I eventually came up with. The main differences with the initial {{{numpy.core.ma}}} package are that {{{MaskedArray}}} is now a subclass of {{{ndarray}}} and that the {{{_data}}} section can now be any subclass of {{{ndarray}}} (well, it should work in most cases; some tweaking might be required here and there). Apart from a couple of issues listed below, the behavior of the new {{{MaskedArray}}} class reproduces the old one. It is quite likely to be significantly slower, though: I was more interested in a clear organization than in performance, so I tended to use wrappers liberally. I'm sure we can improve that rather easily. Note that I didn't try to time any methods. 
     157I also attach a unittest suite ([attachment:test_maskedarray.py here]), modeled after the standard numpy one, along with some utilities for testing ([attachment:masked_testutils.py here]). The old {{{test_ma}}} can also be run with the new package, but it fails in some places; see below. 
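As a rough illustration of the two design points above (a masked array that is itself an {{{ndarray}}}, and a {{{_data}}} part that may be an {{{ndarray}}} subclass), the sketch below uses the {{{numpy.ma}}} module shipped with current NumPy releases, not the attached package, whose interface may differ. The {{{Tagged}}} class is a hypothetical placeholder subclass.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical do-nothing ndarray subclass, used only for illustration."""


# Build the underlying data as an instance of the subclass.
raw = np.arange(4.0).view(Tagged)

# Wrap it in a masked array, hiding the second element.
masked = np.ma.array(raw, mask=[False, True, False, False])

# The masked array is an ndarray, and its data part keeps the subclass type.
is_ndarray = isinstance(masked, np.ndarray)
data_type = type(masked.data)
```

Here {{{is_ndarray}}} is true and {{{data_type}}} is {{{Tagged}}}, which is the kind of behavior the paragraph above describes; whether extra attributes of the subclass survive every operation is a separate question that this sketch does not address.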
    158158 
    159159=== Main differences === 
     
    208208 
    209209   
    210 Please note that it's still a work in progress (even if it seems to work quite OK when I use it). Suggestions, comments, improvements and general feedback are more than welcome ! 
    211  
    212  
    213  
    214  
     210Please note that this is still a work in progress (even if it seems to work quite OK when I use it). Suggestions, comments, improvements and general feedback are more than welcome! Finally, I'd like to thank Paul, Travis and Sacha for the original masked array package: without you, I would never have started this (it might be argued that I shouldn't have anyway, but that's another story...) 
     211 
     212 
     213 
     214