[SciPy-dev] Presentation of pymachine, a python package for machine learning

David Cournapeau david@ar.media.kyoto-u.ac...
Mon May 14 20:48:52 CDT 2007


Peter Skomoroch wrote:
> I followed some of the discussion around datasets in the previous 
> threads.  As you mentioned, it might make sense to make some larger 
> datasets available separately or as cran-style optional installs, but 
> I think it would also be a big plus for scipy to have some more 
> smaller built-in datasets of various types.
I agree, but there is the problem of licensing. I don't know what the 
legal status of data is: if I copy data from a book, I guess it is 
copyrighted like the rest of the book. But then there are some famous 
datasets (e.g. Old Faithful, iris, etc.) which are available in 
different software packages under different licenses.

Copying the datasets of R (in r-base) would be useful, but R is 
distributed under the GPL, so they cannot be included in scipy, at least 
if the datasets themselves are also covered by the GPL. Unfortunately, I 
don't know the legal details of those cases (the status of data in a 
package licensed under the GPL).

Anyway, if I have some useful datasets, I will certainly put them in a 
separate package, so that they can be used outside pymachine.

David

