Brute force for popular models, Feature Reduction and Scaling Techniques for Classification [Python]
In the field of machine learning, it is often advised not to rely solely on ML for every problem you encounter. For instance, rule number one of Martin Zinkevich, a research scientist at Google, is “Don’t be afraid to launch a product without machine learning.”
“If the solution meets the requirements with simple rules or heuristics then there is no reason to use ML” (Kurtis Pykes). However, in this discussion we will temporarily set these rules aside and explore the possibility of building an all-encompassing ML solution that can be reused in many situations, purely for fun!
Importing Libraries: We will begin by importing the necessary libraries from scikit-learn. The code snippet imports the required modules, including dataset loading, model selection, preprocessing, feature selection, classification models, and performance metrics. Additionally, we suppress any warning messages to ensure clean output.
Remember this is just for fun!
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA, NMF
from sklearn.feature_selection import SelectKBest, chi2, RFECV, SelectFromModel
from sklearn.linear_model import LogisticRegression, Lasso
from sklearn.svm import SVC, LinearSVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier, ExtraTreesClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
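from sklearn.metrics import accuracy_score, confusion_matrix
import numpy as np
import warnings

# Suppress warning messages to keep the output clean
warnings.filterwarnings("ignore")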
Loading the Dataset and Initializing the Best Accuracy and Best Model Trackers:
The Iris dataset is loaded using the load_iris function from sklearn.datasets. For beginners: this is a small, already cleaned dataset that sklearn provides for practice. You can substitute your own data here. The features are stored in the variable X, and the corresponding labels (the true classes that we will compare our predictions to) are stored in y.
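# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target
# Initialize the trackers for the best result found so far
best_accuracy = 0.0
best_model_name = None
best_confusion_matrix = None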
Splitting the Dataset: To evaluate the models, we split the dataset into training and testing sets using the train_test_split function from sklearn.model_selection. The training set contains 40% of the data, while the testing set contains 60%. This split lets us train the models on one subset of the data and evaluate them on samples the models have never seen.
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.6, random_state=42
)
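As a quick sanity check, the sketch below repeats the split and prints the resulting shapes. The second split with stratify=y is not part of the original code; it is an optional variation, shown here only as an assumption about what you might want if the class balance of the training set matters.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same split as in the article: 60% of the 150 samples are held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.6, random_state=42
)
print("train:", X_train.shape, "test:", X_test.shape)

# Optional variation (not in the original): keep class proportions in both sets
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.6, random_state=42, stratify=y
)
print("stratified train class counts:", np.bincount(y_train_s))
```

With only 60 training samples, a stratified split guarantees that each of the three classes is equally represented during training.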
Defining Models: We define a dictionary of popular classification models: logistic regression, support vector machine, random forest, AdaBoost, gradient boosting, k-nearest neighbors, Gaussian naive Bayes, decision tree, multi-layer perceptron, and extra trees. Each model is initialized with its default parameters.
# Define a dictionary of popular models
models = {
    "Logistic Regression": LogisticRegression(),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(),
    "AdaBoost": AdaBoostClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(),
    "Multi-layer Perceptron": MLPClassifier(),
    "Extra Trees": ExtraTreesClassifier()
}
Feature Reduction Techniques: In this section, we explore several feature reduction techniques. The three main ones are Principal Component Analysis (PCA), SelectKBest, and Recursive Feature Elimination with cross-validation (RFECV); the dictionary below also includes Non-negative Matrix Factorization (NMF) and two SelectFromModel selectors built on Lasso and LinearSVC. These techniques reduce the dimensionality of the feature space while trying to retain the most informative features.
For PCA, we iterate over a range of values for n_components, the number of principal components to keep. For SelectKBest, we iterate over a range of values for k, the number of top-scoring features to select according to a statistical test (chi2 here). Lastly, for RFECV, we iterate over a range of values for cv, the number of cross-validation folds used during feature elimination.
# Define a dictionary of feature reduction techniques
feature_reduction_techniques = {
    "PCA": PCA(),
    "NMF": NMF(),
    "SelectKBest": SelectKBest(chi2),
    "Recursive Feature Elimination": RFECV(estimator=LogisticRegression()),
    "Lasso": SelectFromModel(estimator=Lasso()),
    "Linear SVC": SelectFromModel(estimator=LinearSVC())
}
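The parameter sweep described above (n_components for PCA, k for SelectKBest, cv for RFECV) could be sketched as a generator like the one below. This is only an illustration: iter_reduced_features is a hypothetical name, the parameter ranges are assumptions, and unlike the helper called later in the article it takes y_train explicitly, because SelectKBest and RFECV need the labels to fit.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFECV, SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def iter_reduced_features(reducer, X_train, X_test, y_train):
    """Yield (params, X_train_reduced, X_test_reduced) for each parameter setting.

    Hypothetical helper: the ranges below are assumptions, not the author's values.
    """
    n_features = X_train.shape[1]
    if isinstance(reducer, PCA):
        grid = [{"n_components": n} for n in range(1, n_features + 1)]
    elif isinstance(reducer, SelectKBest):
        grid = [{"k": k} for k in range(1, n_features + 1)]
    elif isinstance(reducer, RFECV):
        grid = [{"cv": cv} for cv in range(2, 6)]
    else:
        grid = [{}]  # any other technique is tried once with its own settings

    for params in grid:
        # Work on a fresh copy so earlier fits never leak into later ones
        candidate = clone(reducer).set_params(**params)
        # Fit on the training data only, then apply the same transform to the test data
        X_train_reduced = candidate.fit_transform(X_train, y_train)
        X_test_reduced = candidate.transform(X_test)
        yield params, X_train_reduced, X_test_reduced


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.6, random_state=42
)

pca_runs = list(iter_reduced_features(PCA(), X_train, X_test, y_train))
kbest_runs = list(iter_reduced_features(SelectKBest(chi2), X_train, X_test, y_train))
rfecv_runs = list(
    iter_reduced_features(
        RFECV(estimator=LogisticRegression(max_iter=1000)), X_train, X_test, y_train
    )
)

for params, X_tr, X_te in pca_runs:
    print("PCA", params, "->", X_tr.shape, X_te.shape)
```

Fitting each reducer on the training set alone and reusing it for the test set keeps information from the test samples out of the selection step.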
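Feature Scaling Techniques: Next, we explore two feature scaling techniques: StandardScaler and MinMaxScaler. StandardScaler rescales each feature to zero mean and unit variance, while MinMaxScaler maps each feature to a fixed range ([0, 1] by default). Putting the features on a comparable scale can improve the convergence and performance of scale-sensitive models.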
# Define a dictionary of feature scaling techniques
feature_scaling_techniques = {
    "StandardScaler": StandardScaler(),
    "MinMaxScaler": MinMaxScaler()
}
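To see what the two scalers actually do, here is a minimal sketch on a tiny made-up array. The name iter_scaled_features and the demo values are assumptions for illustration only; the point is that each scaler is fitted on the training data and then applied unchanged to the test data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def iter_scaled_features(scalers, X_train, X_test):
    """Yield (name, X_train_scaled, X_test_scaled) for every scaler (hypothetical helper)."""
    for name, scaler in scalers.items():
        fitted = clone(scaler).fit(X_train)  # statistics come from the training set only
        yield name, fitted.transform(X_train), fitted.transform(X_test)


scalers = {"StandardScaler": StandardScaler(), "MinMaxScaler": MinMaxScaler()}

# Made-up demo data: two features on very different scales
X_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_demo = np.array([[4.0, 40.0]])

scaled = {
    name: (X_tr, X_te)
    for name, X_tr, X_te in iter_scaled_features(scalers, X_train_demo, X_test_demo)
}

for name, (X_tr, X_te) in scaled.items():
    print(name)
    print("  train:", X_tr.round(3).tolist())
    print("  test: ", X_te.round(3).tolist())
```

Note that the scaled test values can fall outside the range seen during training (for example, above 1 with MinMaxScaler), because the scaler only knows the training minimum and maximum.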
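Model Evaluation: We iterate through the feature reduction techniques, scaling techniques, and classification models. For each combination, we apply the reduction and scaling techniques to the training and testing sets. Then, we train the model, make predictions on the scaled test set, and compute the confusion matrix with the confusion_matrix function from sklearn.metrics. The accuracy is the sum of the diagonal of that matrix divided by the total number of test samples, which is the same value that accuracy_score returns. The helper functions apply_feature_reduction and apply_feature_scaling used in the loop are not listed in this article.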
for reduction_name, reduction_model in feature_reduction_techniques.items():
    for X_train_reduced, X_test_reduced in apply_feature_reduction(reduction_model, X_train, X_test):
        for scaling_name, X_train_scaled, X_test_scaled in apply_feature_scaling(feature_scaling_techniques, X_train_reduced, X_test_reduced):
            for model_name, model in models.items():
                # Train the model
                model.fit(X_train_scaled, y_train)
                # Make predictions on the scaled test set
                y_pred = model.predict(X_test_scaled)
                confusion_mat = confusion_matrix(y_test, y_pred)
                # Calculate accuracy from the confusion matrix
                accuracy = np.trace(confusion_mat) / np.sum(confusion_mat)
                # Update the best model and accuracy if necessary
                if accuracy > best_accuracy:
                    best_accuracy = accuracy
                    best_model_name = model_name
                    best_reduction_name = reduction_name
                    best_scaling_name = scaling_name
                    best_confusion_matrix = confusion_mat
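The accuracy computed in the loop comes from the confusion matrix: correct predictions sit on the diagonal, so the trace divided by the total count is the fraction of correct predictions. The small sketch below, with made-up labels, checks that this matches accuracy_score.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration: 5 of the 7 predictions are correct
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred)

# Diagonal entries count the samples whose predicted class equals the true class
acc_from_cm = np.trace(cm) / np.sum(cm)
acc_direct = accuracy_score(y_true, y_pred)

print(cm)
print("from confusion matrix:", acc_from_cm)
print("from accuracy_score:  ", acc_direct)
```

Keeping the confusion matrix around has the advantage that you can later inspect which classes get confused, which a single accuracy number hides.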
There are multiple nested for loops so that we can go through every reduction technique and try different parameter values for each of them.
Throughout the iterations, we track the maximum accuracy achieved and the corresponding best model. Once all combinations have been evaluated, we print the results, including the feature reduction technique, feature scaling technique, best model, and accuracy.
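To make the bookkeeping concrete, here is a trimmed-down, self-contained sketch of the whole search. It is not the author's implementation: it uses itertools.product in place of the nested loops, only a few techniques with fixed parameters (n_components=2 and k=2 are assumptions), and a dictionary named best to hold the winning combination.

```python
from itertools import product

from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.6, random_state=42
)

# A reduced set of techniques so the sketch stays short
reducers = {"PCA": PCA(n_components=2), "SelectKBest": SelectKBest(chi2, k=2)}
scalers = {"StandardScaler": StandardScaler(), "MinMaxScaler": MinMaxScaler()}
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
}

best = {"accuracy": 0.0}
combinations = product(reducers.items(), scalers.items(), models.items())
for (r_name, reducer), (s_name, scaler), (m_name, model) in combinations:
    # Fresh copies so every combination starts from unfitted estimators
    reducer, scaler, model = clone(reducer), clone(scaler), clone(model)

    # Reduce first, then scale, fitting both steps on the training data only
    X_tr = scaler.fit_transform(reducer.fit_transform(X_train, y_train))
    X_te = scaler.transform(reducer.transform(X_test))

    accuracy = accuracy_score(y_test, model.fit(X_tr, y_train).predict(X_te))
    if accuracy > best["accuracy"]:
        best = {
            "reduction": r_name,
            "scaling": s_name,
            "model": m_name,
            "accuracy": accuracy,
        }

print("Feature reduction:", best["reduction"])
print("Feature scaling:  ", best["scaling"])
print("Best model:       ", best["model"])
print("Accuracy:          {:.3f}".format(best["accuracy"]))
```

Because the strict comparison keeps the first combination that reaches the top accuracy, ties are resolved by iteration order rather than by any preference between techniques.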
Conclusion: In this article, we explored different feature reduction and scaling techniques using scikit-learn. By combining these techniques with various classification models, we aimed to find the best model and its associated accuracy for the Iris dataset. Feature reduction and scaling can noticeably affect classification performance, especially for models that are sensitive to the scale of their inputs. The code provided can be modified and extended to explore other datasets and combinations of techniques, allowing students, curious practitioners, or even researchers to identify the most effective approach for their specific classification tasks.
For the code please check:
github.com/Barood-cmd/BruteforceML/
Thanks for your time! Hopefully I will see you again.