Medical Image Analysis for Brain Tumor Detection Using Convolutional Neural Networks (CNN) in Python
Description:
This project applies deep learning to medical image analysis, classifying brain MRI images into two categories: "Tumor" and "No Tumor." A Convolutional Neural Network (CNN) extracts features and performs the classification. The dataset, sourced from Kaggle, consists of 253 images. The two classes are not equally represented: the stratified 20% test split contains 31 tumor and 20 non-tumor images, which is why class weights are applied during training.
Key Features:
Data Preprocessing: The dataset is loaded, converted to RGB, resized to 150×150 pixels, and normalized to the range [0, 1] for better model performance.
Data Augmentation: To improve generalization on a small dataset, training images are augmented with rotation, width and height shifts, shear, zoom, and horizontal flips.
Model Architecture: The model is built using a CNN architecture consisting of convolutional layers (Conv2D) for feature extraction, MaxPooling2D layers for down-sampling, and dense layers for classification.
Class Imbalance Handling: The model incorporates class weights during training to mitigate the imbalance between tumor and non-tumor classes.
Performance Evaluation: The model's performance is evaluated using test data, with accuracy, classification report, and confusion matrix generated for detailed performance insights. An ROC curve and AUC score are also visualized for further assessment.
Model Saving: The trained model is saved for future predictions and deployment.
Achievements:
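Initial Accuracy: Achieved an accuracy of 90.00% on the first 30 test images (27 of 30 correct).
Extended Accuracy: On the full test set of 51 images, accuracy was reported as 86.27%; the run recorded in this notebook reaches 88.24%. The 30 images are a subset of the same 51-image test set, so the difference reflects sample size and run-to-run variation rather than additional data.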
Confusion Matrix: Visualized the confusion matrix to assess model performance across both classes (Tumor and No Tumor).
Image Visualization: Displayed predictions on 51 and 30 test images to further assess model accuracy and performance.
Conclusion:
This project shows that a small Convolutional Neural Network can classify brain MRI images as tumor or non-tumor with roughly 88% accuracy on a 51-image held-out test set. Data augmentation and class weighting help the model cope with a small, imbalanced dataset. With validation on larger and more varied data, such a system could potentially support automated brain tumor detection in clinical settings, giving doctors an additional tool for decision-making.
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
from sklearn.utils import class_weight # Ensure class_weight is correctly imported
from PIL import Image
# Load and visualize images with and without tumor
tumor_image_path = "brain_tumor_dataset/Tumor/Y92.png" # Replace with the actual path to your image
no_tumor_image_path = "brain_tumor_dataset/No Tumor/8 no.jpg" # Replace with the actual path to your image
tumor_image = Image.open(tumor_image_path)
no_tumor_image = Image.open(no_tumor_image_path)
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(tumor_image)
plt.title("Brain with Tumor")
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(no_tumor_image)
plt.title("Brain without Tumor")
plt.axis("off")
plt.show()
# Assumed dataset structure
# brain_tumor_dataset/
# ├── Tumor/
# └── No Tumor/
data_dir = "brain_tumor_dataset" # Replace with your dataset path
categories = ["Tumor", "No Tumor"]
data = []
labels = []
for category in categories:
    path = os.path.join(data_dir, category)
    class_label = categories.index(category)  # 0 for "Tumor", 1 for "No Tumor"
    for img_name in os.listdir(path):
        try:
            img_path = os.path.join(path, img_name)
            # Force 3 channels and a fixed size so every array has shape (150, 150, 3)
            img = Image.open(img_path).convert("RGB").resize((150, 150))
            data.append(np.array(img))
            labels.append(class_label)
        except Exception as e:
            print(f"Error loading image {img_name}: {e}")
data = np.array(data)
labels = np.array(labels)
# Normalize the data
data = data / 255.0
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42, stratify=labels)
# Convert labels to categorical
y_train_cat = to_categorical(y_train, num_classes=2)
y_test_cat = to_categorical(y_test, num_classes=2)
# Compute class weights to handle class imbalance
class_weights = class_weight.compute_class_weight(
    'balanced', classes=np.unique(y_train), y=y_train
)
class_weights_dict = {i: class_weights[i] for i in range(len(class_weights))}
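The 'balanced' heuristic used above weights each class by n_samples / (n_classes * class_count), so the rarer class gets the larger weight. The sketch below reproduces that formula by hand with NumPy. The helper name `balanced_class_weights` and the label vector are hypothetical; the 31/20 counts are borrowed from the test-set support in the classification report purely for illustration, and the training split has different counts.

```python
import numpy as np

# Hypothetical label vector: 0 = "Tumor", 1 = "No Tumor".
# The 31/20 counts are illustrative only (taken from the test-set support).
y = np.array([0] * 31 + [1] * 20)


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


weights = balanced_class_weights(y)
print(weights)
```

With these counts the minority class ("No Tumor") receives a weight above 1 and the majority class a weight below 1, so errors on the rarer class contribute more to the loss.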
# Data Augmentation
datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.3,
    horizontal_flip=True,
    fill_mode="nearest"
)
datagen.fit(X_train)
# Ensure the generator can repeat data
train_generator = datagen.flow(X_train, y_train_cat, batch_size=32)
# Calculate steps per epoch (202 training images after the 80/20 split, batch size 32)
steps_per_epoch = len(X_train) // 32 # 202 // 32 = 6; floor division leaves out the final partial batch
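The arithmetic behind the step count can be checked without loading any images. This is a minimal sketch that assumes only the figures given in this notebook (253 images, test_size=0.2, batch size 32); the variable names are illustrative.

```python
import math

total_images = 253   # dataset size stated in the description
test_fraction = 0.2  # test_size passed to train_test_split
batch_size = 32

# For a float test_size, scikit-learn rounds the test split up
n_test = math.ceil(total_images * test_fraction)
n_train = total_images - n_test

steps_floor = n_train // batch_size            # full batches only
steps_ceil = math.ceil(n_train / batch_size)   # also counts the partial batch
leftover = n_train - steps_floor * batch_size  # images in the partial batch

print(n_train, n_test, steps_floor, steps_ceil, leftover)
```

Floor division gives the 6 steps seen in the training log; rounding up instead would give 7 steps and include the remaining partial batch in every epoch.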
# Model architecture
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(2, activation="softmax")
])
# Compile the model
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# Train the model with class weights
history = model.fit(
    train_generator,  # Use the generator here
    validation_data=(X_test, y_test_cat),
    epochs=20,
    steps_per_epoch=steps_per_epoch,  # Use calculated steps per epoch
    verbose=1,
    class_weight=class_weights_dict  # Apply class weights during training
)
Epoch 1/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 3s 529ms/step - accuracy: 0.6782 - loss: 0.6257 - val_accuracy: 0.8627 - val_loss: 0.5145
Epoch 2/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.5312 - loss: 0.7340 - val_accuracy: 0.8627 - val_loss: 0.5201
Epoch 3/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 719ms/step - accuracy: 0.6615 - loss: 0.6169 - val_accuracy: 0.7451 - val_loss: 0.5369
Epoch 4/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.6250 - loss: 0.6531 - val_accuracy: 0.8235 - val_loss: 0.5354
Epoch 5/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 675ms/step - accuracy: 0.6861 - loss: 0.6314 - val_accuracy: 0.7451 - val_loss: 0.5237
Epoch 6/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 70ms/step - accuracy: 0.5312 - loss: 0.8124 - val_accuracy: 0.8431 - val_loss: 0.5137
Epoch 7/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 771ms/step - accuracy: 0.6759 - loss: 0.5780 - val_accuracy: 0.8235 - val_loss: 0.5054
Epoch 8/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 71ms/step - accuracy: 0.6000 - loss: 0.5554 - val_accuracy: 0.6667 - val_loss: 0.5250
Epoch 9/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 760ms/step - accuracy: 0.6789 - loss: 0.6112 - val_accuracy: 0.8824 - val_loss: 0.4621
Epoch 10/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 68ms/step - accuracy: 0.5000 - loss: 0.5694 - val_accuracy: 0.8235 - val_loss: 0.4786
Epoch 11/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 673ms/step - accuracy: 0.5627 - loss: 0.6393 - val_accuracy: 0.8627 - val_loss: 0.5188
Epoch 12/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.5625 - loss: 0.6918 - val_accuracy: 0.8235 - val_loss: 0.5275
Epoch 13/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 644ms/step - accuracy: 0.7669 - loss: 0.5693 - val_accuracy: 0.8627 - val_loss: 0.5329
Epoch 14/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.6250 - loss: 0.6478 - val_accuracy: 0.8824 - val_loss: 0.5368
Epoch 15/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 668ms/step - accuracy: 0.6878 - loss: 0.6137 - val_accuracy: 0.8824 - val_loss: 0.5424
Epoch 16/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.6250 - loss: 0.5935 - val_accuracy: 0.8824 - val_loss: 0.5437
Epoch 17/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 660ms/step - accuracy: 0.6904 - loss: 0.6051 - val_accuracy: 0.8039 - val_loss: 0.5489
Epoch 18/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 62ms/step - accuracy: 0.6250 - loss: 0.6301 - val_accuracy: 0.8627 - val_loss: 0.5498
Epoch 19/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 640ms/step - accuracy: 0.6230 - loss: 0.5832 - val_accuracy: 0.8824 - val_loss: 0.5523
Epoch 20/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 63ms/step - accuracy: 0.6250 - loss: 0.6549 - val_accuracy: 0.8824 - val_loss: 0.5536
# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test_cat)
print(f"Test Loss: {loss:.4f}, Test Accuracy: {accuracy:.4f}")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step - accuracy: 0.8903 - loss: 0.5464
Test Loss: 0.5536, Test Accuracy: 0.8824
# Classification Report
y_pred = np.argmax(model.predict(X_test), axis=1)
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=categories))
2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 269ms/step

Classification Report:
              precision    recall  f1-score   support

       Tumor       0.88      0.94      0.91        31
    No Tumor       0.89      0.80      0.84        20

    accuracy                           0.88        51
   macro avg       0.88      0.87      0.87        51
weighted avg       0.88      0.88      0.88        51
# Confusion Matrix
conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=categories, yticklabels=categories)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
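The per-class figures in the classification report follow directly from the confusion matrix. The notebook output does not print the matrix counts, so the values in this sketch are an assumption: they are reconstructed from the reported recall (0.94 on 31 tumor images, 0.80 on 20 non-tumor images). The helper `per_class_metrics` is a hypothetical name used for illustration.

```python
import numpy as np

# Rows = actual, columns = predicted, in the order ["Tumor", "No Tumor"].
# Counts are reconstructed from the classification report (an assumption).
conf = np.array([[29, 2],
                 [4, 16]])


def per_class_metrics(cm):
    """Compute precision, recall, F1 per class and overall accuracy."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)  # column sums = predicted counts
    recall = tp / cm.sum(axis=1)     # row sums = actual counts
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy


precision, recall, f1, accuracy = per_class_metrics(conf)
for name, p, r, f in zip(["Tumor", "No Tumor"], precision, recall, f1):
    print(f"{name:9s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.4f}")
```

Under these reconstructed counts, 4 non-tumor scans are predicted as tumor and 2 tumor scans are predicted as non-tumor.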
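# ROC and AUC Visualization
# Column 1 is the predicted probability of class 1 ("No Tumor"), which roc_curve treats as the positive class
y_prob = model.predict(X_test)[:, 1]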
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}", color="darkorange")
plt.plot([0, 1], [0, 1], color="navy", linestyle="--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 158ms/step
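One way to read the AUC score is as the fraction of (positive, negative) pairs in which the positive example receives the higher score. The sketch below computes it that way with NumPy instead of integrating the ROC curve. The function name `auc_by_pairs` is hypothetical, and the labels and scores are toy values made up for illustration, not model output. Note that in this notebook label 1 is "No Tumor", so that is the positive class for the ROC curve above.

```python
import numpy as np


def auc_by_pairs(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every positive vs. every negative
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


# Toy labels and scores, made up for illustration
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_by_pairs(toy_labels, toy_scores)
print(toy_auc)
```

Three of the four positive/negative pairs are ordered correctly in the toy data, which gives an AUC of 0.75.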
# Visualize predictions for each test image
correct_preds = 0
incorrect_preds = 0
plt.figure(figsize=(12, 12))
for i in range(30):  # Display first 30 images from test set
    ax = plt.subplot(5, 6, i + 1)  # A 5 x 6 grid holds exactly 30 images
    ax.imshow(X_test[i])
    pred_label = categories[np.argmax(model.predict(X_test[i:i+1]))]
    true_label = categories[y_test[i]]
    ax.set_title(f"True: {true_label}\nPred: {pred_label}", fontsize=6)
    ax.axis("off")
    # Count correct and incorrect predictions
    if pred_label == true_label:
        correct_preds += 1
    else:
        incorrect_preds += 1
# Ensure the plot is fully rendered before saving
plt.tight_layout()
# Save the image grid of predictions as PNG
plt.savefig("prediction_results_30_images.png", bbox_inches='tight', pad_inches=0.1)
plt.close() # Close the plot to avoid memory issues
# Print the accuracy on the first 30 test images
total_images = 30
accuracy_images = (correct_preds / total_images) * 100
print(f"Accuracy based on the first 30 images: {accuracy_images:.2f}%")
print(f"Correct Predictions: {correct_preds}, Incorrect Predictions: {incorrect_preds}")
Accuracy based on the first 30 images: 90.00%
Correct Predictions: 27, Incorrect Predictions: 3
# Save accuracy plot (as an additional chart, showing accuracy based on 30 images)
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Correct", "Incorrect"], [correct_preds, incorrect_preds], color=["green", "red"])
ax.set_title(f"Prediction Results (Accuracy: {accuracy_images:.2f}%)")
ax.set_ylabel("Number of Predictions")
ax.set_xlabel("Prediction Type")
# Save the accuracy plot as PNG
plt.tight_layout()
plt.savefig("accuracy_results.png")
plt.close() # Close the plot to avoid memory issues
# Save model
model.save("brain_tumor_detection_model.h5")
print("\nModel saved as brain_tumor_detection_model.h5!")
WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`.
Model saved as brain_tumor_detection_model.h5!