Ticket #605 (closed defect: fixed)

Opened 3 years ago

Last modified 2 years ago

Incorrect behaviour of numpy.histogram

Reported by: Elby Owned by: somebody
Priority: normal Milestone: 1.2.0
Component: numpy.lib Version: none
Keywords: Cc:

Description

The behavior of numpy.histogram is not consistent with its docstring:

  • The docstring says that, when a range argument is given, values outside this range are allocated to the closest bin.
  • In fact, values below the range are simply ignored.

There was a discussion on this subject on the scipy.user mailing list: http://groups.google.com/group/scipy-user/browse_frm/thread/3b3166e2200f846b/d6040fb6b659c6dd?hl=fr&lnk=gst&q=histogram#d6040fb6b659c6dd

IMHO, the current behavior of numpy.histogram, which treats values below the range as outliers, is not what a newcomer would expect, and it should be clearly documented.

Besides, the user should be able to choose what to do with values outside the range: simply ignoring them is not a good idea in most of the cases I've seen.
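
For readers who want the "closest bin" behavior, one option is to clip the data to the range before binning. The following is only a minimal sketch, not part of the ticket or the attached patch: the sample data and variable names are illustrative assumptions, and it relies on nothing beyond numpy.clip and numpy.histogram as they exist in present-day NumPy.

```python
import numpy as np

# Illustrative data (an assumption): -1.0 and 3.0 lie outside the range [0, 2].
data = np.array([-1.0, 0.2, 0.7, 1.5, 3.0])
lo, hi = 0.0, 2.0

# Move every out-of-range value onto the nearest range boundary,
# so that it is counted in the first or last bin instead of being dropped.
clipped = np.clip(data, lo, hi)

counts, edges = np.histogram(clipped, bins=2, range=(lo, hi))
print(counts)  # every input value is counted in some bin
```

Clipping keeps the choice with the caller: skipping the np.clip call leaves the outliers uncounted.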

Attachments

histogram.patch (8.4 KB) - added by dhuard 2 years ago.
A new histogram function (breakage warning).

Change History

  Changed 3 years ago by Elby

  • summary changed from Incorerdct behaviour of numpy.hiistogram to Incorerdct behaviour of numpy.histogram

  Changed 3 years ago by Elby

  • summary changed from Incorerdct behaviour of numpy.histogram to Incorrect behaviour of numpy.histogram

  Changed 2 years ago by jarrod.millman

  • severity changed from normal to blocker

Changed 2 years ago by dhuard

A new histogram function (breakage warning).

  Changed 2 years ago by dhuard

The patch makes sure that:

  • outliers are not counted.
  • normalization is done correctly, that is, sum(hist*diff(bins))==1.

Also, I added support for weights for good measure.

It breaks the current behavior in the following ways:

  • returns all nbins+1 bin edges instead of only the left bin edges.
  • upper outliers are not put in the closest bin.
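
The points listed above can be checked with a short script. This is a sketch under stated assumptions rather than the patch itself: it uses the present-day numpy.histogram signature, where the density keyword plays the role of the normalization option discussed in this ticket, and the sample data and weights are made up for illustration.

```python
import numpy as np

# Illustrative data (an assumption): -1.0 and 3.0 fall outside the range [0, 2].
data = np.array([-1.0, 0.2, 0.7, 1.5, 3.0])
nbins = 2

# Outliers on both sides are not counted, and nbins + 1 edges are returned.
counts, edges = np.histogram(data, bins=nbins, range=(0.0, 2.0))

# Normalization check from the ticket: sum(hist * diff(bins)) == 1.
density, _ = np.histogram(data, bins=nbins, range=(0.0, 2.0), density=True)
area = np.sum(density * np.diff(edges))

# Weights: each in-range sample contributes its weight instead of 1.
weights = np.array([10.0, 1.0, 2.0, 3.0, 10.0])
weighted, _ = np.histogram(data, bins=nbins, range=(0.0, 2.0), weights=weights)

print(counts, edges, area, weighted)
```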

  Changed 2 years ago by pv

I think this was fixed by dhuard in r5085, r5086, r5087 and r5088.

  Changed 2 years ago by dhuard

It's fixed, but I'd like someone to review what I've done.

  Changed 2 years ago by peridot

Replying to dhuard:

It's fixed, but I'd like someone to review what I've done.

Looks good to me; the docstring doesn't mention that you can use a sequence of left bin edges and new=False, but maybe that's just as well.

  Changed 2 years ago by charris

  • milestone changed from 1.1.0 to 1.2.0

  Changed 2 years ago by dhuard

This can be closed. The current ticket for the follow-up on the histogram change is #797.

  Changed 2 years ago by charris

  • status changed from new to closed
  • resolution set to fixed

Closed as requested. See #797 for follow-up.
