Medical Image Analysis for Brain Tumor Detection Using Convolutional Neural Networks (CNN) in Python

Description:

This project focuses on medical image analysis for detecting brain tumors using deep learning techniques. The task is to classify brain MRI images into two categories: "Tumor" and "No Tumor." A Convolutional Neural Network (CNN) is used to extract features and perform the classification. The dataset, sourced from Kaggle, consists of 253 images, with more tumor than non-tumor cases, so the class imbalance has to be accounted for during training.

Key Features:

  • Data Preprocessing: The dataset is loaded, resized, and normalized for better model performance.

  • Data Augmentation: To compensate for the small dataset and improve model generalization, augmentation techniques such as rotation, shifting, shearing, zoom, and horizontal flipping are applied during training.

  • Model Architecture: The model is built using a CNN architecture consisting of convolutional layers (Conv2D) for feature extraction, MaxPooling2D layers for down-sampling, and dense layers for classification.

  • Class Imbalance Handling: The model incorporates class weights during training to mitigate the imbalance between tumor and non-tumor classes.

  • Performance Evaluation: The model is evaluated on held-out test data; accuracy, a classification report, and a confusion matrix provide detailed performance insights, and an ROC curve with its AUC score is plotted for further assessment.

  • Model Saving: The trained model is saved for future predictions and deployment.

Achievements:

  • 30-Image Sample Accuracy: Achieved 90.00% accuracy on the first 30 test images (27 correct, 3 incorrect).

  • Full Test-Set Accuracy: Reached 88.24% accuracy across all 51 held-out test images, consistent with the results on the smaller sample.

  • Confusion Matrix: Visualized the confusion matrix to assess model performance across both classes (Tumor and No Tumor).

  • Image Visualization: Displayed true and predicted labels for the first 30 test images to qualitatively assess where the model succeeds and fails.

Conclusion:

This project demonstrates the effectiveness of Convolutional Neural Networks for classifying brain MRI images as tumor or non-tumor, reaching 88.24% accuracy on the held-out test set. By combining deep learning with data augmentation and class weighting to address the class imbalance, the model performs reliable image classification on this small medical dataset. Such a system could potentially support automated brain tumor detection in clinical settings, providing doctors with a valuable decision-support tool.

In [43]:
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
from sklearn.utils import class_weight  # used to compute class weights for the imbalanced dataset
from PIL import Image
In [97]:
# Load and visualize images with and without tumor

tumor_image_path = "brain_tumor_dataset/Tumor/Y92.png"  # Replace with the actual path to your image
no_tumor_image_path = "brain_tumor_dataset/No Tumor/8 no.jpg"  # Replace with the actual path to your image

tumor_image = Image.open(tumor_image_path)
no_tumor_image = Image.open(no_tumor_image_path)

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(tumor_image)
plt.title("Brain with Tumor")
plt.axis("off")

plt.subplot(1, 2, 2)
plt.imshow(no_tumor_image)
plt.title("Brain without Tumor")
plt.axis("off")
plt.show()
[Figure: example MRI scans, "Brain with Tumor" (left) and "Brain without Tumor" (right)]
In [53]:
# Assumed dataset structure:
# brain_tumor_dataset/
# ├── Tumor/
# └── No Tumor/

data_dir = "brain_tumor_dataset"  # Replace with your dataset path
categories = ["Tumor", "No Tumor"]

data = []
labels = []

for category in categories:
    path = os.path.join(data_dir, category)
    class_label = categories.index(category)  # 0 for Tumor, 1 for No Tumor
    for img_name in os.listdir(path):
        try:
            img_path = os.path.join(path, img_name)
            img = Image.open(img_path).convert("RGB").resize((150, 150))
            data.append(np.array(img))
            labels.append(class_label)
        except Exception as e:
            print(f"Error loading image {img_name}: {e}")

data = np.array(data)
labels = np.array(labels)
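Before splitting the data, it is worth checking how unbalanced the two classes actually are, since this is what motivates the class-weighting step later on. A minimal sketch using the labels array built above:

# Count how many images fall into each class (0 = Tumor, 1 = No Tumor)
unique_classes, counts = np.unique(labels, return_counts=True)
for class_idx, count in zip(unique_classes, counts):
    print(f"{categories[class_idx]}: {count} images")
print(f"Total: {len(labels)} images")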
In [55]:
# Normalize the data

data = data / 255.0

# Train-test split

X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42, stratify=labels)

# Convert labels to categorical

y_train_cat = to_categorical(y_train, num_classes=2)
y_test_cat = to_categorical(y_test, num_classes=2)

# Compute class weights to handle class imbalance

class_weights = class_weight.compute_class_weight(
    'balanced', classes=np.unique(y_train), y=y_train
)
class_weights_dict = {i: class_weights[i] for i in range(len(class_weights))}
In [57]:
# Data Augmentation

datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.3,
    horizontal_flip=True,
    fill_mode="nearest"
)
datagen.fit(X_train)

# Build an augmented training generator (it loops over the training data indefinitely)

train_generator = datagen.flow(X_train, y_train_cat, batch_size=32)
In [59]:
# Calculate steps per epoch (202 training images, batch size 32)

steps_per_epoch = len(X_train) // 32  # 202 // 32 = 6, matching the 6 steps per epoch shown below
In [63]:
# Model architecture

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(2, activation="softmax")
])

# Compile the model

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
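Before training, the architecture described above can be inspected layer by layer with Keras's built-in summary; a one-line sketch:

# Print layer output shapes and parameter counts
model.summary()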
In [67]:
# Train the model with class weights
history = model.fit(
    train_generator,  # Use the generator here
    validation_data=(X_test, y_test_cat),
    epochs=20,
    steps_per_epoch=steps_per_epoch,  # Use calculated steps per epoch
    verbose=1,
    class_weight=class_weights_dict  # Apply class weights during training
)
Epoch 1/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 3s 529ms/step - accuracy: 0.6782 - loss: 0.6257 - val_accuracy: 0.8627 - val_loss: 0.5145
Epoch 2/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.5312 - loss: 0.7340 - val_accuracy: 0.8627 - val_loss: 0.5201
Epoch 3/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 719ms/step - accuracy: 0.6615 - loss: 0.6169 - val_accuracy: 0.7451 - val_loss: 0.5369
Epoch 4/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.6250 - loss: 0.6531 - val_accuracy: 0.8235 - val_loss: 0.5354
Epoch 5/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 675ms/step - accuracy: 0.6861 - loss: 0.6314 - val_accuracy: 0.7451 - val_loss: 0.5237
Epoch 6/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 70ms/step - accuracy: 0.5312 - loss: 0.8124 - val_accuracy: 0.8431 - val_loss: 0.5137
Epoch 7/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 771ms/step - accuracy: 0.6759 - loss: 0.5780 - val_accuracy: 0.8235 - val_loss: 0.5054
Epoch 8/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 71ms/step - accuracy: 0.6000 - loss: 0.5554 - val_accuracy: 0.6667 - val_loss: 0.5250
Epoch 9/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 760ms/step - accuracy: 0.6789 - loss: 0.6112 - val_accuracy: 0.8824 - val_loss: 0.4621
Epoch 10/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 68ms/step - accuracy: 0.5000 - loss: 0.5694 - val_accuracy: 0.8235 - val_loss: 0.4786
Epoch 11/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 673ms/step - accuracy: 0.5627 - loss: 0.6393 - val_accuracy: 0.8627 - val_loss: 0.5188
Epoch 12/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.5625 - loss: 0.6918 - val_accuracy: 0.8235 - val_loss: 0.5275
Epoch 13/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 644ms/step - accuracy: 0.7669 - loss: 0.5693 - val_accuracy: 0.8627 - val_loss: 0.5329
Epoch 14/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.6250 - loss: 0.6478 - val_accuracy: 0.8824 - val_loss: 0.5368
Epoch 15/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 668ms/step - accuracy: 0.6878 - loss: 0.6137 - val_accuracy: 0.8824 - val_loss: 0.5424
Epoch 16/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.6250 - loss: 0.5935 - val_accuracy: 0.8824 - val_loss: 0.5437
Epoch 17/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 660ms/step - accuracy: 0.6904 - loss: 0.6051 - val_accuracy: 0.8039 - val_loss: 0.5489
Epoch 18/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 62ms/step - accuracy: 0.6250 - loss: 0.6301 - val_accuracy: 0.8627 - val_loss: 0.5498
Epoch 19/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 640ms/step - accuracy: 0.6230 - loss: 0.5832 - val_accuracy: 0.8824 - val_loss: 0.5523
Epoch 20/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 63ms/step - accuracy: 0.6250 - loss: 0.6549 - val_accuracy: 0.8824 - val_loss: 0.5536
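Because model.fit returned a history object, the training and validation curves can also be plotted to see how accuracy and loss evolved over the 20 epochs. A minimal sketch using the metric keys Keras records by default:

# Plot training vs. validation accuracy and loss across epochs
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history["accuracy"], label="train accuracy")
ax1.plot(history.history["val_accuracy"], label="val accuracy")
ax1.set_xlabel("Epoch")
ax1.set_ylabel("Accuracy")
ax1.legend()
ax2.plot(history.history["loss"], label="train loss")
ax2.plot(history.history["val_loss"], label="val loss")
ax2.set_xlabel("Epoch")
ax2.set_ylabel("Loss")
ax2.legend()
plt.tight_layout()
plt.show()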
In [69]:
# Evaluate the model

loss, accuracy = model.evaluate(X_test, y_test_cat)
print(f"Test Loss: {loss:.4f}, Test Accuracy: {accuracy:.4f}")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step - accuracy: 0.8903 - loss: 0.5464
Test Loss: 0.5536, Test Accuracy: 0.8824
In [71]:
# Classification Report

y_pred = np.argmax(model.predict(X_test), axis=1)
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=categories))
2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 269ms/step

Classification Report:
              precision    recall  f1-score   support

       Tumor       0.88      0.94      0.91        31
    No Tumor       0.89      0.80      0.84        20

    accuracy                           0.88        51
   macro avg       0.88      0.87      0.87        51
weighted avg       0.88      0.88      0.88        51

In [73]:
# Confusion Matrix

conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=categories, yticklabels=categories)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
[Figure: confusion matrix heatmap for the Tumor and No Tumor classes]
In [75]:
# ROC and AUC Visualization

y_prob = model.predict(X_test)[:, 1]  # probability of class 1 ("No Tumor"), which roc_curve treats as the positive class
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}", color="darkorange")
plt.plot([0, 1], [0, 1], color="navy", linestyle="--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 158ms/step
[Figure: ROC curve with the diagonal chance line and the AUC value in the legend]
In [77]:
# Visualize predictions for each test image

correct_preds = 0
incorrect_preds = 0
plt.figure(figsize=(12, 12))
for i in range(30):  # Display first 30 images from test set
    ax = plt.subplot(10, 10, i + 1)
    ax.imshow(X_test[i])
    pred_label = categories[np.argmax(model.predict(X_test[i:i+1]))]
    true_label = categories[y_test[i]]
    ax.set_title(f"True: {true_label}\nPred: {pred_label}", fontsize=6)
    ax.axis("off")
    
    # Count correct and incorrect predictions
    if pred_label == true_label:
        correct_preds += 1
    else:
        incorrect_preds += 1

# Tighten the subplot spacing
plt.tight_layout()
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 82ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 116ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
[Figure: grid of the first 30 test images with their true and predicted labels]
In [79]:
# Save the image grid of predictions as PNG
# (With the inline backend, the figure from the previous cell may already be closed;
#  calling plt.savefig in the same cell as the plotting code is more reliable.)

plt.savefig("prediction_results_30_images.png", bbox_inches='tight', pad_inches=0.1)
plt.close()  # Close the plot to avoid memory issues
In [89]:
# Print the best accuracy results

total_images = 30
accuracy_images = (correct_preds / total_images) * 100
print(f"Accuracy based on the first 30 images: {accuracy_images:.2f}%")
print(f"Correct Predictions: {correct_preds}, Incorrect Predictions: {incorrect_preds}")
Accuracy based on the first 30 images: 90.00%
Correct Predictions: 27, Incorrect Predictions: 3
In [91]:
# Bar chart of correct vs. incorrect predictions on the first 30 test images

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Correct", "Incorrect"], [correct_preds, incorrect_preds], color=["green", "red"])
ax.set_title(f"Prediction Results (Accuracy: {accuracy_images:.2f}%)")
ax.set_ylabel("Number of Predictions")
ax.set_xlabel("Prediction Type")
Out[91]:
Text(0.5, 0, 'Prediction Type')
[Figure: bar chart of correct vs. incorrect predictions, with the 90.00% accuracy in the title]
In [93]:
# Save the accuracy plot as PNG

plt.tight_layout()
plt.savefig("accuracy_results.png")
plt.close()  # Close the plot to avoid memory issues
In [95]:
# Save model

model.save("brain_tumor_detection_model.h5")
print("\nModel saved as brain_tumor_detection_model.h5!")
WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`. 
Model saved as brain_tumor_detection_model.h5!
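The warning above notes that the HDF5 format is considered legacy; the model can also be saved in the native Keras format, and either file can be reloaded later to classify a new scan. A minimal sketch (the .keras file name and the example image path are illustrative assumptions):

from tensorflow.keras.models import load_model

# Optionally save in the native Keras format as well, as the warning recommends
model.save("brain_tumor_detection_model.keras")

# Reload the trained model
loaded_model = load_model("brain_tumor_detection_model.h5")

# Preprocess a new image exactly as during training: RGB, 150x150, scaled to [0, 1]
img = Image.open("path/to/new_scan.jpg").convert("RGB").resize((150, 150))  # hypothetical path
x = np.expand_dims(np.array(img) / 255.0, axis=0)

probs = loaded_model.predict(x)[0]
print(f"Predicted class: {categories[np.argmax(probs)]} (confidence: {np.max(probs):.2f})")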