= Mlabwrap proxy objects =

== The problem with proxied objects ==

In mlabwrap, matlab structs are represented on the python side by {{{MlabObjectProxy}}} objects.
The python variable {{{ps}}} in the following example is an {{{MlabObjectProxy}}} that proxies a matlab struct
of the form {{{struct('a', [11, 12])}}}.

{{{
#!python
>>> ps = mlab.struct('a', [11, 12])
>>> ps
<MlabObjectProxy of matlab-class: 'struct'; internal name: 'PROXY_VAL6__'; has parent: no>
    a: [2x1 double]


>>> ps.a
array([[ 11.],
       [ 12.]])
>>> ps.a[1]
array([ 12.])
}}}

Trouble arises when we attempt indexed assignment to the array field {{{a}}} of the struct proxy object {{{ps}}}. For example:

{{{
#!python
>>> ps.a[1] = 13
>>> ps.a
array([[ 11.],
       [ 12.]])
}}}

This is not the desired effect. We would like {{{ps.a[1]}}} to equal {{{13}}} on both the python side and the matlab side.

The problem results from how python interprets attribute references written in dot notation.

Python interprets the expression {{{ps.a = 3}}} as {{{ps.__setattr__('a', 3)}}} and, when normal attribute lookup finds no {{{a}}} on the object, {{{x = ps.a}}} as {{{x = ps.__getattr__('a')}}}. For a matlab proxy object the actual data is stored in the matlab workspace, so mlabwrap overrides {{{__setattr__}}} and {{{__getattr__}}} on the proxy class to call into matlab transparently. `x = ps.a` therefore calls the proxy's `__getattr__` method, which fetches the data from the matlab workspace variable and copies it into a python array. Similarly, `ps.a = 3` takes the data (`3`) from python and copies it into the variable in the matlab workspace.

In the example above, python interprets {{{ps.a[1] = 13}}} as {{{ps.__getattr__('a').__setitem__(1, 13)}}}. Here {{{ps.__getattr__('a')}}} asks matlab for the value of {{{ps.a}}} and returns a python array that is a copy of the matlab array. The new value is therefore set in the python copy, which is then discarded, and never reaches the matlab workspace array that the data came from.
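
The copy semantics described above can be reproduced without matlab. The following is a minimal sketch, not mlabwrap's actual implementation: the names {{{WORKSPACE}}} and {{{CopyingProxy}}} are hypothetical, a dict stands in for the matlab workspace, and plain lists stand in for arrays.

```python
import copy

# Hypothetical stand-in for the matlab workspace: variable name -> value.
WORKSPACE = {"a": [11.0, 12.0]}


class CopyingProxy:
    """Hypothetical proxy: every attribute read returns a copy of the remote value."""

    def __getattr__(self, name):
        # Reached for `p.a` because normal lookup finds no such attribute;
        # mimics fetching the value from matlab and converting it.
        try:
            return copy.deepcopy(WORKSPACE[name])
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Reached for `p.a = value`; mimics copying the value into matlab.
        WORKSPACE[name] = copy.deepcopy(value)


p = CopyingProxy()

# Equivalent to p.__getattr__('a').__setitem__(1, 13.0): only a temporary copy changes.
p.a[1] = 13.0
after_indexed = list(WORKSPACE["a"])

# Workaround under this model: fetch, modify locally, write the whole field back.
tmp = p.a
tmp[1] = 13.0
p.a = tmp
after_writeback = list(WORKSPACE["a"])
```

In this sketch {{{after_indexed}}} still holds the original values, while {{{after_writeback}}} holds the updated ones, because only whole-attribute assignment goes through {{{__setattr__}}}.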

== A solution ==

The problem above arises because proxy objects keep their data in matlab, whereas matlab values that can be fully converted, such as arrays, are stored in python. One solution is therefore to convert all matlab variables fully to python objects. Matlab structs and objects can be converted to numpy recarrays, doing away with {{{MlabObjectProxy}}} objects altogether.

When we pull something across from matlab into python, we simply return a recarray containing the same data. When we call a matlab function on a python variable, we just copy the variable's data into matlab. We don't proxy anything.

 * Matlab matrices are already converted into numpy arrays.

Currently both matlab structs and objects are proxied in mlabwrap. Instead:

 * Matlab structs can be converted into numpy recarrays. Recarrays are essentially isomorphic to matlab structs. Their attributes are ordered, allowing for arrays of records that may be indexed by position and attribute. Their attributes may themselves be recarrays, allowing for arbitrary recursive structures.

 * Matlab objects in memory or on disk are in fact structs of their fields along with a special attribute that defines their class name. Hence matlab objects can also be converted to recarrays. Method calls should not require special provisions, since in matlab method calls use the same syntax as function calls. Which function or method is called depends on the object passed: calling `method1(o)` in matlab causes matlab to identify `o` as an instance of class `someclass` and then to look in the `someclass` class directory for a function `method1`; if none exists, matlab looks for `method1` on the global path. To give another example, suppose object {{{o}}} has a method {{{some_method(self, arg1)}}}. In matlab we call this method on {{{o}}} with {{{some_method(o, 3);}}}.
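
The lookup order described in the last bullet can be sketched in python. This is only an illustration of the rule as stated above, not of how matlab or mlabwrap implement it: {{{CLASS_DIRS}}}, {{{GLOBAL_PATH}}}, {{{call}}} and the {{{class_name}}} key are all hypothetical names, and dicts stand in for converted structs and objects.

```python
# Hypothetical stand-ins for functions that matlab would find on disk.
def someclass_method1(obj):
    return "method1 from the someclass directory"


def global_method1(obj):
    return "method1 from the global path"


def global_method2(obj, arg1):
    return "method2 from the global path, arg1=%r" % (arg1,)


CLASS_DIRS = {"someclass": {"method1": someclass_method1}}
GLOBAL_PATH = {"method1": global_method1, "method2": global_method2}


def call(name, obj, *args):
    """Look in the class directory of obj first, then on the global path."""
    class_methods = CLASS_DIRS.get(obj.get("class_name"), {})
    func = class_methods.get(name, GLOBAL_PATH.get(name))
    if func is None:
        raise NameError("undefined function %r" % name)
    return func(obj, *args)


o = {"class_name": "someclass", "field1": 1}  # an object: fields plus a class name
s = {"field1": 1}                             # a plain struct: no class name

from_class = call("method1", o)    # found in the class directory
from_global = call("method1", s)   # no class, so the global path is used
fallback = call("method2", o, 3)   # class has no method2, falls back to the global path
```

Because dispatch depends only on the class name carried by the first argument, a fully converted object needs nothing beyond that name to be passed back to the right matlab method.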

A quick review of numpy recarrays:

{{{
#!python
>>> import numpy
>>> ra_child1 = numpy.rec.array([1, 2, 3], names=['a', 'b', 'c'])
>>> ra_child1
recarray((1, 2, 3),
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])
>>> ra_child1['a']
1
>>> ra_child1['c']
3
>>> ra_child2 = numpy.rec.array([5, 6], names=['x', 'y'])
>>> desc = numpy.dtype({'names' : ['child', 'n'], 'formats': ['object', 'float']})
>>> ra_parent = numpy.array([(ra_child1, 3.2), (ra_child2, 4.5)], dtype=desc)
>>> ra_parent
array([((1, 2, 3), 3.2000000000000002), ((5, 6), 4.5)],
      dtype=[('child', '|O4'), ('n', '<f8')])
>>> ra_parent[0]
((1, 2, 3), 3.2000000000000002)
>>> ra_parent[1]
((5, 6), 4.5)
>>> ra_parent['child']
array([(1, 2, 3), (5, 6)], dtype=object)
>>> ra_parent['n']
array([ 3.2,  4.5])
>>> ra_parent[0]['child']['a']
1
>>> ra_parent[1]['child']['x']
5
}}}
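
To connect this back to the original problem, here is a small sketch of how a fully converted struct would behave under indexed assignment. It assumes the struct {{{struct('a', [11, 12])}}} is represented as a one-element recarray whose field {{{a}}} holds two doubles; that layout is an assumption made for illustration, not something mlabwrap defines.

```python
import numpy

# Assumed local representation of struct('a', [11, 12]):
# one record whose field 'a' is a length-2 array of doubles.
desc = numpy.dtype([('a', 'f8', (2,))])
ps = numpy.array([([11.0, 12.0],)], dtype=desc).view(numpy.recarray)

# ps.a is a view onto the record's own storage, not a copy fetched from elsewhere,
# so indexed assignment changes ps itself.
ps.a[0, 1] = 13.0

updated = ps['a'].tolist()
```

Since the data now lives in python, the assignment sticks on the python side. Nothing is written to matlab at this point; under the proposal above, the updated data would be copied across the next time {{{ps}}} is passed to a matlab function.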

== Problems with the solution ==

This last type of conversion, matlab objects to python recarrays, still has problems. In particular, using recarrays to represent matlab objects by their underlying structs results in some non-matlab behavior on the python side.

In matlab we cannot directly access an object's fields from outside the class. Instead we must either call an accessor method on the object to return a field, or overload the object's {{{subsref}}} method. {{{subsref}}} is called by matlab when dot notation is used on an object, so overloading it lets us use dot notation to access an object's fields. However, since the dot notation is then really calling an accessor method, that method can return anything it likes. We can't count on the dot notation to return the field it names. For example, in matlab we cannot count on
{{{
x = o.field1;
}}}
actually setting {{{x}}} to the value of {{{o}}}'s {{{field1}}}, since {{{o.field1}}} might call a method that returns {{{o}}}'s {{{field2}}}, or {{{8}}}.

On the python side, if {{{po}}} is {{{o}}}'s python representation, we can definitely count on
{{{
#!python
x = po.field1
}}}
setting {{{x}}} to the value of {{{o}}}'s {{{field1}}}, since {{{po}}} is just a recarray of {{{o}}}'s fields.

(More: add real funny subsref example.)
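
Until a real matlab example is added, the difference can be imitated in python. In this sketch the class {{{MatlabLikeObject}}} is hypothetical and its {{{__getattr__}}} merely plays the role of an overloaded {{{subsref}}}; {{{types.SimpleNamespace}}} stands in for the recarray {{{po}}}.

```python
from types import SimpleNamespace


class MatlabLikeObject:
    """Hypothetical analogue of a matlab object with an overloaded subsref."""

    def __init__(self, field1, field2):
        # The real fields are kept out of reach of plain dot access.
        self._fields = {"field1": field1, "field2": field2}

    def __getattr__(self, name):
        # Plays the role of subsref: it may return anything it likes.
        if name.startswith("_"):
            raise AttributeError(name)
        if name == "field1":
            return self._fields["field2"]  # deliberately returns a different field
        return 8                           # any other name yields a constant


o = MatlabLikeObject(field1=1, field2=2)

# Converted representation: just the stored fields, with no accessor logic.
po = SimpleNamespace(field1=1, field2=2)

matlab_like = o.field1     # goes through the accessor
python_like = po.field1    # reads the stored field directly
```

Here {{{o.field1}}} and {{{po.field1}}} disagree even though both were built from the same field values, which is exactly the mismatch described above.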

A proposed solution to this problem (with the proposed solution) is to embrace this difference in behavior and make it clearly defined. On the python side, whenever a call is made through the {{{mlab.}}} namespace, the user can expect matlab-like behavior. A few examples:
{{{
#!python
>>> A = mlab.eye(4)
>>> B = mlab.inv(A)
>>> x = mlab.some_accessor_method(po)
}}}
Otherwise, python behavior should be expected. For example, {{{po.field1}}} returns the stored field directly, while {{{mlab.some_accessor_method(po)}}} returns whatever the matlab method chooses to return.