PossibleOptimizationAreas/FillDiscussion

Version 1 (modified by sasha, 7 years ago)

moved ndarray.fill discussion to a separate page

Timings before the patch; x.fill(1) is slower than x += 1 for every type tested:

> python -m timeit -s "from numpy import zeros; x = zeros(10000,'b')" "x.fill(1)"
10000 loops, best of 3: 69.5 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'h')" "x.fill(1)"
10000 loops, best of 3: 66.1 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'i')" "x.fill(1)"
10000 loops, best of 3: 66.3 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'d')" "x.fill(1)"
10000 loops, best of 3: 73.2 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'b')" "x += 1"
10000 loops, best of 3: 58 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'h')" "x += 1"
10000 loops, best of 3: 33.7 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'i')" "x += 1"
10000 loops, best of 3: 33.6 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'d')" "x += 1"
10000 loops, best of 3: 36.9 usec per loop
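
The command-line runs above can also be driven from one script, which makes it easier to rerun the comparison across all four type codes. This is only a sketch: the helper names best_usec and run and the loop count are illustrative choices, not part of the original discussion, and absolute numbers will vary by machine and NumPy version.

```python
import timeit

TYPECODES = ("b", "h", "i", "d")  # the four type codes timed above
STATEMENTS = ("x.fill(1)", "x += 1")


def best_usec(stmt, typecode, size=10000, number=1000, repeat=3):
    """Best per-loop time in microseconds, like `python -m timeit` reports."""
    # The array is created in the setup string, exactly as in the -s option,
    # so that `x` is a local name inside timeit's generated timing function.
    setup = f"from numpy import zeros; x = zeros({size}, {typecode!r})"
    timer = timeit.Timer(stmt, setup=setup)
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


def run(number=1000):
    """Time every (type code, statement) pair and return the results."""
    return {
        (code, stmt): best_usec(stmt, code, number=number)
        for code in TYPECODES
        for stmt in STATEMENTS
    }


if __name__ == "__main__":
    for (code, stmt), usec in run().items():
        print(f"{code!r:4} {stmt:10} {usec:8.2f} usec per loop")
```

Building the array in the setup string rather than passing it in through globals is deliberate: an augmented assignment such as x += 1 makes x local to the timed function, so it has to be bound there first.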

The attached patch results in the following timings:

> python -m timeit -s "from numpy import zeros; x = zeros(10000,'b')" "x.fill(1)"
100000 loops, best of 3: 4.55 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'h')" "x.fill(1)"
100000 loops, best of 3: 12 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'i')" "x.fill(1)"
100000 loops, best of 3: 12.4 usec per loop
> python -m timeit -s "from numpy import zeros; x = zeros(10000,'d')" "x.fill(1)"
100000 loops, best of 3: 13 usec per loop

Note the roughly 15x improvement in the 'b' case (69.5 usec down to 4.55 usec). The 'h', 'i', and 'd' cases improve by between 5x and 6x.
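
For reference, the per-type speedup of x.fill(1) can be worked out directly from the timings listed on this page. The snippet below is just that arithmetic; the dictionary names are illustrative.

```python
# Per-loop times in microseconds for x.fill(1), copied from the runs above.
before = {"b": 69.5, "h": 66.1, "i": 66.3, "d": 73.2}
after = {"b": 4.55, "h": 12.0, "i": 12.4, "d": 13.0}

# Speedup is the old time divided by the new time for the same type code.
speedup = {code: before[code] / after[code] for code in before}

for code, ratio in speedup.items():
    print(f"{code!r}: {ratio:.1f}x")
# 'b': 15.3x
# 'h': 5.5x
# 'i': 5.3x
# 'd': 5.6x
```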
