- Evolutionary Learning on Structured Data for Artificial Neural Networks, Honours Thesis, July 2002 [PS.GZ 320kB] [PDF 712kB]
These data files were used to test my evolutionary machine learning algorithm. They were collected from a number of sources; details of each source are provided in my Honours thesis.
- nParity training data
600 random sets of bit values labelled with the parity of the set (Odd or Even). Created by me.
- Binary search tree training data
500 random binary trees labelled as binary search trees (Positive) or not (Negative). Created by Kee Siong Ng.
- Predictive toxicology challenge processed carcinogenicity data
351 molecules classified as carcinogenic (Positive) or not (Negative). This data has been processed significantly from the data originally provided for the PTC; see the PTC homepage for details about the original dataset. It is a cut-down version of the data on the carcinogenicity of molecules in female rats.
- Musk-1 and Musk-2 data
92 and 102 molecules, respectively, classified as smelling of musk or not. They are described by Dietterich et al. in "Solving the multiple instance problem with axis-parallel rectangles", Artificial Intelligence, 89:31-71, 1997.
- Two spirals classification data
I generated this data to conform (to the best of my knowledge) to the two spirals machine learning problem. It consists of points on two intertwined spirals that each go around the origin three times.
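
For readers who want to regenerate something similar, here is a minimal sketch of how two intertwined three-turn spirals might be produced. It is not the code used to create the file above. The function name `two_spirals`, the number of points per spiral, the radius range, and the Positive/Negative labels are all assumptions chosen for illustration, and the second spiral is simply taken to be the first reflected through the origin.

```python
import math


def two_spirals(points_per_spiral=97, turns=3.0, min_radius=0.5, max_radius=6.5):
    """Return a list of (x, y, label) tuples for two intertwined spirals.

    All parameter defaults and the label strings are illustrative
    assumptions, not values taken from the original data file.
    """
    data = []
    for i in range(points_per_spiral):
        # t runs from 0.0 at the inner end to 1.0 at the outer end.
        t = i / (points_per_spiral - 1)
        angle = t * turns * 2.0 * math.pi
        # Start at min_radius so the two spirals never share a point.
        radius = min_radius + t * (max_radius - min_radius)
        x = radius * math.cos(angle)
        y = radius * math.sin(angle)
        data.append((x, y, "Positive"))
        # The second spiral is the first reflected through the origin.
        data.append((-x, -y, "Negative"))
    return data


if __name__ == "__main__":
    for x, y, label in two_spirals()[:4]:
        print(f"{x:.4f} {y:.4f} {label}")
```

Because the radius grows linearly with the angle and the second spiral is offset by half a turn, the two classes interleave on every ray from the origin, which is what makes the problem hard for a classifier.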