Changes between Version 13 and Version 14 of MaskedArray

Timestamp:
02/28/06 18:47:21 (7 years ago)
Author:
sasha
Comment:

moved missing features section up front

  • MaskedArray

    v13 v14  
    13 13     will now be redundant because nomask provides full array interface. For example "m is None or not sometrue(m)" 
    14 14     can now be written as "not m.any()". 
     15 
     16= Missing features (work in progress) = 
     17Some current features of numpy are not yet implemented for ma, either because they were introduced to numpy only recently (e.g. {{{ndim}}}?), or because they were never adapted to ma in the first place (e.g. the {{{mlab}}} package). As Paul Dubois noted, it does not make sense to extend the handling of missing values to all numpy features (a typical example would be the FFT package). However, ma is still invaluable in many cases, and it's unfortunate that its use is currently somewhat limited. 
     18 
     19A non-exhaustive list of missing features is presented below. The features are grouped by the kind of problem they pose, each with a naive suggestion for solving it. More features will be added as I run into them. 
     20 
     21=== Case 1 === 
     22  The function would work with masked arrays if it called {{{ma.asarray}}} instead of {{{numeric.asarray}}}, which it currently calls. A fix could be to add a {{{mask=False_}}} property by default to any {{{ndarray}}} and get rid of the {{{MaskedArray}}} class? 
     23  * '''{{{diff}}}''' 
     24 
     25 
     26=== Case 2 === 
     27  The function can be applied to the data part alone once missing values are adequately filled. If needed, the masked version is obtained easily by applying the initial mask to the result. A {{{use_missing}}} option could be introduced to either use the missing values (the output would be masked) or discard them (default option?). 
     28  * '''{{{ndim}}}''': The masked array could inherit {{{ndim}}} from its {{{data}}} part. Implemented in changeset:2185. 
     29  * '''{{{std, var}}}''': An example implementation of the function (not the method) {{{std}}} is given [attachment:ma_examples.py there]. 
     30  * '''{{{trace}}}''': Fill with 0 if {{{use_missing}}} is False. 
     31  * '''{{{cumprod, cumsum}}}''':   
     32    * {{{use_missing=True}}}: The output is masked for indices [''i''...''N''], where ''i'' is the index of the first missing value, and ''N'' the number of data points (including missing ones). 
     33    * {{{use_missing=False}}}: Fill the initial missing values with 1 for {{{cumprod}}} or 0 for {{{cumsum}}}. 
     34  
     35=== Case 3 === 
     36  The function must be applied to both the data part and the mask. I expect this is the case for most of the functions in '''{{{shape_base}}}''' and '''{{{index_tricks}}}'''. As an illustration, a quick and dirty adaptation of the concatenator {{{r_}}} can be: 
     37{{{ 
     38#!python 
     39mar_ = lambda seq:ma.array(data=[s.data for s in seq],mask=[s.mask for s in seq]) 
     40}}} 
     41  * '''{{{swapaxes, squeeze}}}''' 
     42 
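A slightly more careful variant of the one-liner above, as a sketch only: it concatenates the data and the masks separately along the first axis, and it expands {{{nomask}}} into a full boolean array first so that inputs without a mask do not break the mask concatenation. It assumes the present-day {{{numpy.ma}}} helpers {{{getdata}}} and {{{getmaskarray}}}, and it handles a plain sequence of arrays, not the slice syntax of {{{r_}}}.

```python
import numpy as np
import numpy.ma as ma

def mar_(seq):
    """Concatenate masked arrays along axis 0 (illustrative sketch)."""
    arrays = [ma.asarray(s) for s in seq]
    # Apply the same operation to the data part ...
    data = np.concatenate([ma.getdata(s) for s in arrays])
    # ... and to the mask; getmaskarray turns nomask into an all-False array.
    mask = np.concatenate([ma.getmaskarray(s) for s in arrays])
    return ma.array(data, mask=mask)

a = ma.array([1, 2], mask=[False, True])
b = ma.array([3, 4])          # no mask at all
joined = mar_([a, b])
```

The same "apply to data, apply to mask, recombine" structure is what {{{swapaxes}}} and {{{squeeze}}} would need.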
     43=== Case 4 === 
     44  The trickiest case, where missing values must remain masked during the process. 
     45 
     46  * '''{{{median}}}''': The two functions in [attachment:ma_examples.py this attachment] (seem to) work well for 1-D and 2-D arrays. The problem gets more complex for higher dimensions. 
     47 
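For the 1-D case, one way to keep missing values out of the computation is to drop them before taking the median. The sketch below is not the code from the attachment; the helper name {{{masked_median_1d}}} is hypothetical, and the choice to return {{{ma.masked}}} when nothing is left is an assumption.

```python
import numpy as np
import numpy.ma as ma

def masked_median_1d(a):
    """Median of the unmasked entries of a 1-D masked array (sketch)."""
    a = ma.asarray(a)
    valid = a.compressed()      # 1-D array of the unmasked values only
    if valid.size == 0:
        return ma.masked        # assumption: all-missing input stays missing
    return float(np.median(valid))

m = masked_median_1d(ma.array([1.0, 100.0, 3.0, 5.0],
                              mask=[False, True, False, False]))
```

A 2-D input could be handled by applying the helper row by row; for higher dimensions the bookkeeping of which axis is reduced becomes the hard part, as noted above.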
    15 48 
    16 49 ---- 
     
    114 147  * Can arrays be used as truth values directly? 
    115 148 
    116 == A non-exhaustive list of features yet missing == 
    117 Some current features of numpy are not yet implemented for ma, either because they were introduced to numpy only recently (e.g. {{{ndim}}}?), or because they were never adapted to ma in the first place (e.g. the {{{mlab}}} package). As Paul Dubois noted, it does not make sense to extend the handling of missing values to all numpy features (a typical example would be the FFT package). However, ma is still invaluable in many cases, and it's unfortunate that its use is currently somewhat limited. 
    118  
    119 A non-exhaustive list of missing features is presented below. The features are grouped by the kind of problem they pose, each with a naive suggestion for solving it. More features will be added as I run into them. 
    120  
    121 === Case 1 === 
    122   The function would work with masked arrays if it called {{{ma.asarray}}} instead of {{{numeric.asarray}}}, which it currently calls. A fix could be to add a {{{mask=False_}}} property by default to any {{{ndarray}}} and get rid of the {{{MaskedArray}}} class? 
    123   * '''{{{diff}}}''' 
    124  
    125  
    126 === Case 2 === 
    127   The function can be applied to the data part alone once missing values are adequately filled. If needed, the masked version is obtained easily by applying the initial mask to the result. A {{{use_missing}}} option could be introduced to either use the missing values (the output would be masked) or discard them (default option?). 
    128   * '''{{{ndim}}}''': The masked array could inherit {{{ndim}}} from its {{{data}}} part. Implemented in changeset:2185. 
    129   * '''{{{std, var}}}''': An example implementation of the function (not the method) {{{std}}} is given [attachment:ma_examples.py there]. 
    130   * '''{{{trace}}}''': Fill with 0 if {{{use_missing}}} is False. 
    131   * '''{{{cumprod, cumsum}}}''':   
    132     * {{{use_missing=True}}}: The output is masked for indices [''i''...''N''], where ''i'' is the index of the first missing value, and ''N'' the number of data points (including missing ones). 
    133     * {{{use_missing=False}}}: Fill the initial missing values with 1 for {{{cumprod}}} or 0 for {{{cumsum}}}. 
    134   
    135 === Case 3 === 
    136   The function must be applied to both the data part and the mask. I expect this is the case for most of the functions in '''{{{shape_base}}}''' and '''{{{index_tricks}}}'''. As an illustration, a quick and dirty adaptation of the concatenator {{{r_}}} can be: 
    137 {{{ 
    138 #!python 
    139 mar_ = lambda seq:ma.array(data=[s.data for s in seq],mask=[s.mask for s in seq]) 
    140 }}} 
    141   * '''{{{swapaxes, squeeze}}}''' 
    142  
    143 === Case 4 === 
    144   The trickiest case, where missing values must remain masked during the process. 
    145  
    146   * '''{{{median}}}''': The two functions in [attachment:ma_examples.py this attachment] (seem to) work well for 1-D and 2-D arrays. The problem gets more complex for higher dimensions. 
    147