| | 15 | |
| | 16 | = Missing features (work in progress) = |
| | 17 | Some current features of numpy are not yet implemented for ma, either because they were introduced to numpy only recently (e.g. {{{ndim}}}?) or because they were never adapted to ma in the first place (e.g. the {{{mlab}}} package). As Paul Dubois noted, it does not make sense to extend the handling of missing values to every numpy feature (the FFT package is a typical example). However, ma is still invaluable in many cases, and it is unfortunate that its use is currently somewhat limited. |
| | 18 | |
| | 19 | A non-exhaustive list of missing features is presented below. The features are grouped by the kind of problem they pose, each with a naive suggestion for solving it. More features will be added as I run into them. |
| | 20 | |
| | 21 | === Case 1 === |
| | 22 | The function would work with masked arrays if it called {{{ma.asarray}}} instead of {{{numeric.asarray}}} (as is currently the case). One possible fix would be to give every {{{ndarray}}} a default {{{mask=False_}}} property and get rid of the {{{MaskedArray}}} class? |
| | 23 | * '''{{{diff}}}''' |
| | 24 | |
| | 25 | |
| | 26 | === Case 2 === |
| | 27 | The function can be applied to the data part alone once missing values are adequately filled. If needed, the masked version is easily obtained by applying the initial mask to the result. A {{{use_missing}}} option could be introduced either to take missing values into account (the output would then be masked) or to discard them (default option?). |
| | 28 | * '''{{{ndim}}}''': The masked array could inherit {{{ndim}}} from its {{{data}}} part. Implemented in changeset:2185. |
| | 29 | * '''{{{std, var}}}''': An example implementation of the function (not the method) {{{std}}} is given [attachment:ma_examples.py there]. |
| | 30 | * '''{{{trace}}}''': Fill with 0 if {{{use_missing}}} is False. |
| | 31 | * '''{{{cumprod, cumsum}}}''': |
| | 32 | * {{{use_missing=True}}}: The output is masked for indices [''i''...''N''], where ''i'' is the index of the first missing value and ''N'' is the number of data points (including missing ones). |
| | 33 | * {{{use_missing=False}}}: Fill the initial missing values with 1 for {{{cumprod}}} or 0 for {{{cumsum}}}. |
| | 34 | |
| | 35 | === Case 3 === |
| | 36 | The function must be applied to both the data part and the mask. I expect this is the case for most of the functions in '''{{{shape_base}}}''' and '''{{{index_tricks}}}'''. As an illustration, a quick and dirty adaptation of the concatenator {{{r_}}} could be: |
| | 37 | {{{ |
| | 38 | #!python |
| | 39 | mar_ = lambda seq:ma.array(data=[s.data for s in seq],mask=[s.mask for s in seq]) |
| | 40 | }}} |
| | 41 | * '''{{{swapaxes, squeeze}}}''' |
| | 42 | |
| | 43 | === Case 4 === |
| | 44 | The trickiest case, where missing values must remain masked throughout the process. |
| | 45 | |
| | 46 | * '''{{{median}}}''': The two functions in [attachment:ma_examples.py this attachment] seem to work well for 1-D and 2-D arrays. The problem gets more complex for higher dimensions. |
| | 47 | |
| 116 | | == A non-exhaustive list of features yet missing == |
| 117 | | Some current features of numpy are not yet implemented for ma, either because they were introduced to numpy only recently (e.g. {{{ndim}}}?) or because they were never adapted to ma in the first place (e.g. the {{{mlab}}} package). As Paul Dubois noted, it does not make sense to extend the handling of missing values to every numpy feature (the FFT package is a typical example). However, ma is still invaluable in many cases, and it is unfortunate that its use is currently somewhat limited. |
| 118 | | |
| 119 | | A non-exhaustive list of missing features is presented below. The features are grouped by the kind of problem they pose, each with a naive suggestion for solving it. More features will be added as I run into them. |
| 120 | | |
| 121 | | === Case 1 === |
| 122 | | The function would work with masked arrays if it called {{{ma.asarray}}} instead of {{{numeric.asarray}}} (as is currently the case). One possible fix would be to give every {{{ndarray}}} a default {{{mask=False_}}} property and get rid of the {{{MaskedArray}}} class? |
| 123 | | * '''{{{diff}}}''' |
| 124 | | |
| 125 | | |
| 126 | | === Case 2 === |
| 127 | | The function can be applied to the data part alone once missing values are adequately filled. If needed, the masked version is easily obtained by applying the initial mask to the result. A {{{use_missing}}} option could be introduced either to take missing values into account (the output would then be masked) or to discard them (default option?). |
| 128 | | * '''{{{ndim}}}''': The masked array could inherit {{{ndim}}} from its {{{data}}} part. Implemented in changeset:2185. |
| 129 | | * '''{{{std, var}}}''': An example implementation of the function (not the method) {{{std}}} is given [attachment:ma_examples.py there]. |
| 130 | | * '''{{{trace}}}''': Fill with 0 if {{{use_missing}}} is False. |
| 131 | | * '''{{{cumprod, cumsum}}}''': |
| 132 | | * {{{use_missing=True}}}: The output is masked for indices [''i''...''N''], where ''i'' is the index of the first missing value and ''N'' is the number of data points (including missing ones). |
| 133 | | * {{{use_missing=False}}}: Fill the initial missing values with 1 for {{{cumprod}}} or 0 for {{{cumsum}}}. |
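The two {{{use_missing}}} behaviours suggested for {{{cumsum}}} and {{{cumprod}}} can be sketched as follows. This is only an illustration under assumptions: it uses the {{{numpy.ma}}} module of current NumPy releases rather than the ma package discussed here, it handles 1-D input only, and the helper name {{{masked_cumulative}}} and its {{{op}}} argument are hypothetical.

```python
import numpy as np
import numpy.ma as ma


def masked_cumulative(a, op="sum", use_missing=False):
    """Hypothetical helper: cumulative sum or product of a 1-D masked array."""
    a = ma.asarray(a)
    mask = ma.getmaskarray(a)
    # Neutral element: 0 for a sum, 1 for a product.
    neutral, func = (0, np.cumsum) if op == "sum" else (1, np.cumprod)
    # Apply the plain numpy function to the filled data part.
    result = func(a.filled(neutral))
    if use_missing:
        # From the first missing value onwards, every entry is unknown.
        return ma.array(result, mask=np.logical_or.accumulate(mask))
    # Missing values were replaced by the neutral element, i.e. discarded.
    return ma.array(result)
```

With {{{use_missing=False}}} the missing entries simply do not contribute; with {{{use_missing=True}}} the running mask makes every result at or after the first missing value masked.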
| 134 | | |
| 135 | | === Case 3 === |
| 136 | | The function must be applied to both the data part and the mask. I expect this is the case for most of the functions in '''{{{shape_base}}}''' and '''{{{index_tricks}}}'''. As an illustration, a quick and dirty adaptation of the concatenator {{{r_}}} could be: |
| 137 | | {{{ |
| 138 | | #!python |
| 139 | | mar_ = lambda seq:ma.array(data=[s.data for s in seq],mask=[s.mask for s in seq]) |
| 140 | | }}} |
| 141 | | * '''{{{swapaxes, squeeze}}}''' |
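A slightly more defensive variant of the same idea, applying the operation separately to the data and to the mask, might look like the sketch below. It assumes the {{{numpy.ma}}} module of current NumPy releases and 1-D inputs; it concatenates each part explicitly and tolerates inputs that carry no mask. The function is an illustration, not the implementation proposed on this page.

```python
import numpy as np
import numpy.ma as ma


def mar_(seq):
    """Illustrative masked counterpart of r_ for a sequence of 1-D arrays."""
    # Concatenate the data parts.
    data = np.concatenate([ma.getdata(s) for s in seq])
    # getmaskarray always returns a full boolean array, even when an
    # input has no mask, so the two concatenations stay aligned.
    mask = np.concatenate([ma.getmaskarray(s) for s in seq])
    return ma.array(data, mask=mask)
```

Current NumPy releases also provide {{{numpy.ma.concatenate}}}, which covers this particular case directly.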
| 142 | | |
| 143 | | === Case 4 === |
| 144 | | The trickiest case, where missing values must remain masked throughout the process. |
| 145 | | |
| 146 | | * '''{{{median}}}''': The two functions in [attachment:ma_examples.py this attachment] seem to work well for 1-D and 2-D arrays. The problem gets more complex for higher dimensions. |
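For the 1-D case, the idea can be sketched as below. This is not the code from the attachment: it is a minimal illustration assuming the {{{numpy.ma}}} module of current NumPy releases, and the name {{{masked_median_1d}}} is hypothetical. Missing values cannot be filled with a neutral element here, so they are dropped before the median is taken.

```python
import numpy as np
import numpy.ma as ma


def masked_median_1d(a):
    """Hypothetical helper: median of the unmasked entries of a 1-D array."""
    a = ma.asarray(a)
    # compressed() returns only the valid (unmasked) values.
    valid = a.compressed()
    if valid.size == 0:
        # Nothing valid to work with: the result itself is missing.
        return ma.masked
    return np.median(valid)
```

Higher dimensions are harder because each row or column may hold a different number of valid entries. Current NumPy releases provide {{{numpy.ma.median}}} with an {{{axis}}} argument for that purpose.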
| 147 | | |