| Version 2 (modified by rkern, 7 years ago) |
|---|
Statistics Review
To ensure the high quality of scipy, we are declaring April and May to be Statistics Review Months. By the end of May, we want the community to have thoroughly reviewed every function in source:trunk/Lib/stats/stats.py and source:trunk/Lib/stats/morestats.py, at least. Peer review seems to be working for the scientific community; it will probably work for us, too.
A ticket has been created for each function. As you review a function, you can add your comments to its ticket. If you don't have SVN commit access, you can also add patches to fix the function. Ticket comments should primarily add information; disagreements and back-and-forth discussion should happen on the scipy-dev mailing list. All of the review tickets have a ticket_type of review and the milestone Statistics Review Months.
At the end of May, all of the functions that have not yet passed review will be moved to scipy.sandbox.stats or removed entirely, as appropriate. If the Statistics Review Months work out well, we'll move on to other Review Months. Among other things, we will treat the probability distributions separately from the Statistics Review Months just to keep the work manageable. Special Functions Month is also an appealing target.
Another possibility is to have month-long sprints with a focus on implementing a lot of new functionality in easily digestible chunks. For example, we might have LAPACK Month, where we provide f2py wrappers for every important subroutine in LAPACK.
Thank you for contributing to SciPy and making it a rock-solid numerical library!
Review Checklist
These items should all be true for a function to pass review.
1. The function works. Sometimes, you just have to state the obvious.
2. The function has a complete docstring. The docstring should clearly describe what the function does, what each argument should be, and what the function returns. If necessary, it should also contain a reference. Since many of these functions are named after prolific people, the docstring should be descriptive enough to disambiguate between all of the procedures that might have a similar name. Between the docstring and the reference, even an inexperienced user should not have to go Googling to find out what the function does.
3. One or more unit tests sufficiently exercise the function. Per item 1, these tests should pass.
4. The algorithm is appropriate. It doesn't have to be the best algorithm for the task (see below), just one that is good enough.
5. The code is clean and uses modern idioms.
6. The function is part of the public API or is used by a function in the public API. Many of the functions are internal support functions; they should be documented as such. There may even be stragglers that aren't used at all.
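As a rough illustration of items 2 and 3, the sketch below shows one way a function with a complete docstring and accompanying unit tests might look. The function geometric_mean and its tests are hypothetical examples written for this page, not code taken from stats.py or morestats.py, and the sketch uses only the Python standard library to stay self-contained.

```python
import math
import unittest


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers.

    The geometric mean of n values is the n-th root of their product.
    It is computed here as exp(mean(log(x))) so that the intermediate
    product cannot overflow.

    Parameters
    ----------
    values : sequence of float
        One or more strictly positive numbers.

    Returns
    -------
    float
        The geometric mean of `values`.

    Raises
    ------
    ValueError
        If `values` is empty or contains a non-positive number.
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


class TestGeometricMean(unittest.TestCase):
    def test_known_value(self):
        # sqrt(2 * 8) == 4
        self.assertAlmostEqual(geometric_mean([2.0, 8.0]), 4.0)

    def test_single_value(self):
        # The geometric mean of one number is that number.
        self.assertAlmostEqual(geometric_mean([5.0]), 5.0)

    def test_rejects_bad_input(self):
        with self.assertRaises(ValueError):
            geometric_mean([])
        with self.assertRaises(ValueError):
            geometric_mean([1.0, -1.0])
```

The docstring states what the function computes, what the argument must be, what is returned, and how bad input is handled, so a reader does not need to consult the source. The tests cover a hand-checkable value, a trivial case, and the error paths.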
Questions to Answer
While each of the above statements should be true, the following questions should simply be answered during the review process. The actual answer will not prevent the function from passing review, though.
1. Are there better or alternative algorithms? For example, there is more than one way to compute means and variances. We don't have to use those other algorithms, but it would be good to document what they are and what tradeoffs they entail.
2. Is the functionality complete? "The function does what it says it can do," is sufficient for it to pass review, but the review process may suggest obvious extensions; these should be documented if not implemented.
3. What else depends on this function?
4. Is there a friendly set of examples or a tutorial for using the function? Unit tests alone don't count, although some of them can be easily adapted for a tutorial.
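To make question 1 concrete, here is a minimal sketch of two well-known ways to compute a sample variance: the two-pass formula and Welford's single-pass update. The function names are chosen for illustration only, and neither is claimed to be the algorithm currently used in stats.py.

```python
def variance_two_pass(values):
    """Sample variance (n - 1 denominator) using two passes over the data."""
    values = list(values)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    # First pass: the mean. Second pass: squared deviations from it.
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_welford(values):
    """Sample variance (n - 1 denominator) using Welford's one-pass update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("need at least two values")
    return m2 / (n - 1)
```

The tradeoff a reviewer might document: the two-pass version is simple but needs all of the data available for a second pass, while Welford's update works on a stream and keeps only three running values. Both avoid the cancellation problems of the naive "sum of squares minus square of the sum" formula.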
