Ticket #709 (closed defect: fixed)

Opened 4 years ago

Last modified 17 months ago

ValueError: array dimensions are not compatible for copy

Reported by: nils Owned by: somebody
Priority: high Milestone: 0.10.0
Component: scipy.linalg Version: devel
Keywords: Cc:

Description

python 2dof.py

>>> scipy.__version__
'0.7.0.dev4557'
>>> import numpy
>>> numpy.__version__
'1.2.0.dev5516'
Traceback (most recent call last):
  File "2dof.py", line 38, in <module>
    w,v = eig(A,B)
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 156, in eig
    return _geneig(a1,b,left,right,overwrite_a,overwrite_b)
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 91, in _geneig
    vr = _make_complex_eigvecs(w, vr, t)
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 39, in _make_complex_eigvecs
    vnew.real = numpy.take(vin,ind[::2],1)
ValueError: array dimensions are not compatible for copy

Attachments

2dof.py (0.9 KB) - added by nils 4 years ago.
ugly-hack.patch (4.7 KB) - added by pv 4 years ago.
Fixes the problem in one way (3). Not probably tested well enough.
2dof.f (1.6 KB) - added by pv 4 years ago.
Pure-Fortran testcase for LAPACK concerning this problem

Change History

Changed 4 years ago by nils

Changed 4 years ago by pv

Triaging:

Minimal testcase:

from scipy.linalg import eig

c1 = -87584./9801
c2 = 50./99

A = [[1, 0,  0,  0],
     [0, 1,  0,  0],
     [0, 0,  c1, 0],
     [0, 0,  0, c1]]

B = [[0, 0,  1,   0],
     [0, 0,  0,   1],
     [1, 0,  0, -c2],
     [0, 1, c2,   0]]

eig(A,B)

The generalized eigenvalue pair (alpha, beta) as given by LAPACK GGEV are

alpha [ 0. +2.79469606e+00j  0. -2.79469606e+00j  0. +0.00000000e+00j 0. -2.18645248e-16j]
beta  [      8.59238852e-01       8.59238852e-01       0.00000000e+00      7.95804395e-17]

The actual generalized eigenvalue is lambda = alpha/beta, and here there is a division by zero, and a division by near zero.

What should eig return here? I note that Octave's QZ returns here

octave:19> qz(A,B)
ans =

   0.0000 + 3.2525i
   0.0000 - 3.2525i
   0.0000 - 2.7475i

ie. it omits the 0/0 "eigenvalues".

OTOH, scipy's eig sometimes returns NaNs?, eg. for the following ill-posed problem:

>>> eig([[1,0],[0,0]], [[1,0],[0,0]])
(array([  1. +0.j,  NaN NaNj]), array([[ 1.+0.j,  0.+0.j],
       [ 0.+0.j,  1.+0.j]]))

So, I guess, this ticket amounts to a bug in _make_complex_eigvecs not always handling NaNs? properly.

Changed 4 years ago by nils

I run your minimal testcase. No problem here:

>>> w
array([ 0.+3.25252525j,  0.-3.25252525j,  0.+2.74747475j,  0.-2.74747475j])
>>> vr
array([[ 0.00000000+1.j        ,  0.00000000-1.j        ,
         1.00000000+0.j        ,  1.00000000-0.j        ],
       [-1.00000000-0.j        , -1.00000000+0.j        ,
         0.00000000-1.j        ,  0.00000000+1.j        ],
       [ 0.30745342+0.j        ,  0.30745342-0.j        ,
         0.00000000-0.36397059j,  0.00000000+0.36397059j],
       [ 0.00000000+0.30745342j,  0.00000000-0.30745342j,
        -0.36397059+0.j        , -0.36397059-0.j        ]])

Octave's output is strange. The entries of the matrices A and B are real. Therefore the eigenvalues of the pencil are either real or appear as conjugate pairs.

Changed 4 years ago by pv

For which value of *p* (in the loop of 2dof.py) your original 2dof.py fails? Or does it fail at all? This seems like a rounding issue, so the exact value of *p* may well vary from one LAPACK installation to another.

Which Lapack are you using? I'm using the Netlib one (3.1.1-0.4ubuntu1) as shipped by Ubuntu, with Netlib blas, also as shipped by Ubuntu.

But this seems like a LAPACK issue to me: the ALPHAI vector returned from LAPACK is already bogus. One of the two eigenvalues in the latter complex pair is missing, and LAPACK fails to notice this. The problem is apparently somehow ill-conditioned.

I'm not sure what we should do:

  1. Bailing out with LinAlgError? is one possibility: I think we can detect this particular error by looking for ALPHAI(i+1) < 0 preceded by ALPHAI(i) <= 0 in the vector of eigenvalues.
  1. Or maybe we should attempt to fix bogus zero ALPHAIs by setting zeros following positives and zeros preceding negatives manually.
  1. Or just accept that LAPACK is flawed and return as much as we can without trying to correct its errors.

A patch ("ugly-hack.patch") doing 3) is attached: it deobfuscates the _make_complex_eigvecs routine and makes it to work also in cases where the eigenvalue vector is malformed. It won't rescue clobbered eigenvalues, but will handle cases where one of the two entries in the eigenvalue vector is malformed. The vector corresponding to the "NaN" eigenvalue will be garbage, though.

I'm not sure how well Scipy is covered by unit tests, so I'm a bit afraid committing this before it's been better tested... And we'll pick one of the three options.

If someone manages to reduce this to a simple Fortran test case, maybe also upstream LAPACK might be interested in fixing this? It doesn't seem like this is the intended behaviour.

Changed 4 years ago by pv

Fixes the problem in one way (3). Not probably tested well enough.

Changed 4 years ago by nils

I am using the source distribution lapack-3.1.1 and ATLAS3.8.2.

The first error in 2dof.py appears when *p*=1.6666666666666667. {{{Traceback (most recent call last):

File "2dof.py", line 38, in <module>

w,v = eig(A,B)

File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 156, in eig

return _geneig(a1,b,left,right,overwrite_a,overwrite_b)

File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 91, in _geneig

vr = _make_complex_eigvecs(w, vr, t)

File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 39, in _make_complex_eigvecs

vnew.real = numpy.take(vin,ind[::2],1)

ValueError?: array dimensions are not compatible for copy

Omega

1.6666666666666667 }}}

Changed 4 years ago by pv

Pure-Fortran testcase for LAPACK concerning this problem

Changed 4 years ago by nils

f77 2dof.f -o 2dof -L/data/home/nwagner/src/ATLAS3.8.2/Linux_Intel_Xeon/lib/ -llapack -lf77blas -latlas
 Checking for missing eigenvalue. If you see no
 further output, you don't have this problem.
 Missing (?) complex pair for omega =  1.86868687
 1  0.  1.47904779  0.303787824
 2  0. -1.47904779  0.303787824
 3  0.  0.  0.
 4  0. -1.00480791E-15  8.8817842E-16

Changed 4 years ago by nils

On a 32-bit machine and ATLAS3.7.11

g77 2dof.f -o 2dof -L/usr/local/lib/atlas -llapack -lf77blas -latlas
./2dof
 Checking for missing eigenvalue. If you see no
 further output, you don't have this problem.
 Missing (?) complex pair for omega =  1.31313131
 1  0.  1.91032895  0.442909991
 2  0. -1.91032895  0.442909991
 3  0.  0.  0.
 4  0. -2.60436313E-16  1.54390389E-16
 Missing (?) complex pair for omega =  2.67676768
 1  0.  3.93404812E-16  6.93008476E-17
 2  0. -2.68886246E-17  4.73660824E-18
 3  0.  0.  0.
 4  0. -1.21185329E-16  3.74917111E-16

Changed 4 years ago by pv

Asked about it here: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=2&t=807&start=0&st=0&sk=t&sd=a some time ago but got no answer. Going to try if the Lapack devs answer mail better...

Changed 3 years ago by cdavid

What's the status on this, Pauli ?

Changed 3 years ago by pv

In the end, I forgot to send the mail :/ Sent now, and CC'd to scipy-dev.

Anyway, this is starting to look like a subtle compiler/optimization bug:

# lapack 3.2, with only -g flag in make.inc OPTS
$ gfortran --version
4.3.2-1ubuntu11
$ make clean
...
$ make lib
...
$ gfortran -o 2dof 2dof.f lapack_LINUX.a -lblas-3
$ ./2dof 
Checking for missing eigenvalue. If you see no
further output, you don't have this problem.

No bug there. But,

# lapack 3.2, with -O2 flag in make.inc OPTS
$ make clean
...
$ make lib
$ gfortran -o 2dof 2dof.f lapack_LINUX.a -lblas-3
$ ./2dof 
 Checking for missing eigenvalue. If you see no
 further output, you don't have this problem.
 Missing (?) complex pair for omega =  0.85858585858585856     
           1   0.0000000000000000        0.0000000000000000        0.0000000000000000     
           2   0.0000000000000000      -1.56357833001198695E-016  4.05220561966457282E-017
           3   0.0000000000000000       5.05904065833670526E-016  2.36247653384591416E-016
           4   0.0000000000000000      -5.12056639374098328E-016  2.39120789141678003E-016
 Missing (?) complex pair for omega =   2.6767676767676769     
           1   0.0000000000000000        1.0350059973275267       0.18232312052566751     
           2   0.0000000000000000       -1.0350059973275267       0.18232312052566751     
           3   0.0000000000000000        0.0000000000000000        0.0000000000000000     
           4   0.0000000000000000      -1.21185328887430225E-016  3.74917111245487433E-016

Lapack 3.1.1 has -O3 as the default, and also this produces an error (whereas -g doesn't). Ubuntu compiles by default with -O3.

On another machine, 64-bit instead of 32-bit, with gfortran 4.3.1, I always get an error independent of optimization flags.

Changed 3 years ago by jarrod.millman

  • cc millman@… removed

Changed 3 years ago by pv

  • status changed from new to needs_decision

Changed 17 months ago by pv

  • status changed from needs_decision to closed
  • resolution set to fixed
  • milestone changed from 0.9.0 to 0.10.0

This is a bug in LAPACK, but I nevertheless added a workaround for it in r7002, as the LAPACK guys don't seem to be in a hurry to fix it -- the bug is somehow related to which optimization flags LAPACK is compiled with.

(The old implementation of _make_complex_eigvecs was also a bit wonky, and it makes sense to clean it up at the same time.)

Note: See TracTickets for help on using tickets.