One common problem in machine learning is the "curse of dimensionality". When the input data has too many attributes, many ML algorithms become very inefficient, and some stop working well altogether (e.g. in nearest-neighbor computation, data points in a high-dimensional space are nearly equidistant from one another, so the "nearest" neighbor is barely closer than the farthest one). It is also quite possible that the attributes we selected are interdependent. If so, we may be able to extract a smaller subset of...
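
The nearest-neighbor remark above can be checked with a small experiment. The following is a minimal sketch, not a rigorous benchmark: it assumes points drawn uniformly at random from the unit hypercube and Euclidean distance, and the function name `relative_contrast` is a hypothetical helper introduced here for illustration. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (max_dist - min_dist) / min_dist from one query point to all others.

    Assumption for illustration: points are sampled uniformly from [0, 1]^n_dims.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query = points[0]
    # Euclidean distance from the query point to every other point.
    dists = [math.dist(query, p) for p in points[1:]]
    return (max(dists) - min(dists)) / min(dists)


# Compare the contrast as the number of attributes (dimensions) grows.
for n_dims in (2, 10, 100, 1000):
    print(n_dims, round(relative_contrast(200, n_dims), 3))
```

Under these assumptions the printed contrast shrinks as the dimension grows, which is the sense in which the points become "pretty much" equidistant: the nearest and farthest neighbors end up at almost the same distance, so ranking by distance carries little information.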