| Version 24 (modified by rgommers, 2 months ago) |
|---|
The Numpy and Scipy GSoC projects are run under the umbrella of the Python Software Foundation (PSF). The PSF info for GSoC 2013 can be found at http://wiki.python.org/moin/SummerOfCode/2013. In particular, look at what is expected of students: http://wiki.python.org/moin/SummerOfCode/Expectations. Note that it's important to start discussing your project idea with the community and your potential mentor early; don't wait until the last week! Also, don't wait until close to the application deadline to submit your first pull request to Numpy/Scipy on GitHub; getting used to the workflow and reworking your patch in response to review comments may take more time than you expect.
Summer of Code 2013 Ideas for Numpy & Scipy
Consistent empty array handling in Numpy and Scipy
<describe>
Full support for building Scipy with Bento
Scipy can currently be built in two ways: with numpy.distutils and with Bento. Building Scipy with Bento is the way of the future: it's much faster and has better support for Scipy's complex build requirements. However, Bento support is relatively new and not yet complete. Possible goals of this project are:
- Robust builds on at least Windows, Linux and OS X.
- Support straightforward building against multiple BLAS/LAPACK implementations (Netlib BLAS/LAPACK, ATLAS, Intel MKL, OpenBLAS).
- Unify templating tools in build scripts. This could mean adding a small templating library to Bento itself.
- Set up a continuous integration server that tests Bento builds for Python 2.x & 3.x on various platforms.
- Improve reporting of user build configurations. This would help a lot in diagnosing build-related problems that users encounter.
Ideas from previous years that may still be relevant
Improve missing data (NA) support in Numpy
Numpy 1.7 includes, for the first time, support for missing data, implemented in a way similar to how it works in R. There is much more to do in this direction, for example:
- Size - reducing memory use requires bit masks and a decision that masks take only two values.
- Speed - this requires support in the ufunc loops.
- Functions - isna needs companion functions, such as isanyna(a, axis=1).
- More support in current functions.
- Implement NA support for relevant projects which depend on numpy (for example pandas).
- Continue the work on Fwrap started by Kurt Smith in GSoC 2009.
- Improve datasource and integrate it into all Numpy/Scipy I/O code: http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/lib/_datasource.py
- Statistics Cleanup
- scipy.ndimage: Rewrite in Python where possible, port to Cython elsewhere. Decide on a consistent coordinate framework. As a bonus, fix boundary issues.
  - Leverage patches from the CellProfiler developers for this; see e.g. the patches with ticket #945.
- Automatic Differentiation - there are existing implementations and work in progress (algopy, openopt)
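
To make the `isanyna(a, axis=1)` suggestion from the missing data (NA) list above more concrete, here is a minimal sketch of what such a helper might compute. Note the assumptions: `isanyna` is not an existing Numpy function, and the sketch uses `numpy.ma` masked arrays as a stand-in for NA-aware arrays, so it only illustrates the intended semantics rather than the NA implementation itself.

```python
import numpy as np


def isanyna(a, axis=None):
    """Hypothetical helper: report whether any element along `axis` is missing.

    Assumption: "missing" is modeled here with a numpy.ma mask, not with
    the NA mechanism discussed in the list above.
    """
    # getmaskarray always returns a full boolean array, even for
    # unmasked input, so ordinary ndarrays are handled as well.
    mask = np.ma.getmaskarray(a)
    return mask.any(axis=axis)


# Two rows; only the first row contains a missing value.
data = np.ma.masked_array(
    [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]],
    mask=[[False, True, False],
          [False, False, False]],
)

rows_with_na = isanyna(data, axis=1)
print(rows_with_na)  # [ True False]
```

A reduction of this kind answers "which rows contain at least one missing value" in a single call, which is presumably the convenience the list item is asking for.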
