When dealing with high-dimensional data, a lot can be gained, in terms of both computational time and precision, by only considering the most important features. Many feature selection methods are based on the assumption that important features are highly correlated with their corresponding classes, but mainly uncorrelated with each other. Often, this assumption can help eliminate redundancies and produce good predictors using only a small subset of features. However, when the predictability depends on interactions between the features, such methods will fail to produce satisfactory results. Also, since the suitability of the selected features depends on the learning algorithm in which they will be used, correlation-based filter methods might not be optimal when using genetic programs as the final classifiers, as they fail to capture the possibly complex relationships that are expressible by the genetic programming rules. In this thesis a method that can find important features, both independently and dependently discriminative, is introduced. This method works by performing two different types of permutation tests that classifies each of the features as either irrelevant, independently predictive or dependently predictive. The proposed method directly evaluates the suitability of the features with respect to the learning algorithm in question. Also, in contrast to computationally expensive wrapper methods that require several subsets of features to be evaluated, a feature classification can be obtained after only one single pass, even though the time required does equal the training time of the classifier. The evaluation shows that the attributes chosen by the permutation tests always yield a classifier at least as good as the one obtained when all attributes are used during training - and often better. The proposed method also fares well when compared to other attribute selection methods such as RELIEFF and CFS.