Medical Image Analysis for Brain Tumor Detection Using Convolutional Neural Networks (CNN) in Python¶

Description:¶

This project focuses on medical image analysis for detecting brain tumors using deep learning techniques. The task is to classify brain MRI images into two categories: "Tumor" and "No Tumor." A Convolutional Neural Network (CNN) is used to extract features and perform classification. The dataset, sourced from Kaggle, consists of 253 images, with equal representation of both tumor and non-tumor cases.

Key Features:¶

  • Data Preprocessing: The dataset is loaded, resized, and normalized for better model performance.

  • Data Augmentation: To address class imbalance and enhance model generalization, image augmentation techniques like rotation, zoom, and horizontal flip are applied.

  • Model Architecture: The model is built using a CNN architecture consisting of convolutional layers (Conv2D) for feature extraction, MaxPooling2D layers for down-sampling, and dense layers for classification.

  • Class Imbalance Handling: The model incorporates class weights during training to mitigate the imbalance between tumor and non-tumor classes.

  • Performance Evaluation: The model's performance is evaluated using test data, with accuracy, classification report, and confusion matrix generated for detailed performance insights. An ROC curve and AUC score are also visualized for further assessment.

  • Model Saving: The trained model is saved for future predictions and deployment.

Achievements:¶

  • Initial Accuracy: Achieved an accuracy of 90.00% on a sample of 30 images.

  • Extended Accuracy: After extending to 51 images, the accuracy reached 86.27%, demonstrating robustness with increased data.

  • Confusion Matrix: Visualized the confusion matrix to assess model performance across both classes (Tumor and No Tumor).

  • Image Visualization: Displayed predictions on 30 and 51 test images to further assess model accuracy and performance.

Conclusion:¶

This project demonstrates the effectiveness of Convolutional Neural Networks for classifying brain tumor images with high accuracy. By utilizing deep learning techniques, data augmentation, and addressing class imbalance, the model successfully performs image classification for medical diagnosis. The system can potentially be used for automated brain tumor detection in clinical settings, providing doctors with a valuable tool for decision-making.

In [35]:
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
from sklearn.utils import class_weight  # Ensure class_weight is correctly imported
from PIL import Image
In [87]:
# Load and visualize images with and without tumor

tumor_image_path = "brain_tumor_dataset/Tumor/Y20.jpg"  # Replace with the actual path to your image
no_tumor_image_path = "brain_tumor_dataset/No Tumor/8 no.jpg"  # Replace with the actual path to your image

tumor_image = Image.open('brain_tumor_dataset/Tumor/Y20.jpg')
no_tumor_image = Image.open('brain_tumor_dataset/No Tumor/8 no.jpg')

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(tumor_image)
plt.title("Brain with Tumor")
plt.axis("off")

plt.subplot(1, 2, 2)
plt.imshow(no_tumor_image)
plt.title("Brain without Tumor")
plt.axis("off")
plt.show()
No description has been provided for this image
In [89]:
# Assuming dataset structure
# data/
# ├── Tumor/
# └── No_Tumor/

data_dir = "brain_tumor_dataset"  # Replace with your dataset path
categories = ["Tumor", "No Tumor"]

data = []
labels = []

for category in categories:
    path = os.path.join(data_dir, category)
    class_label = categories.index(category)  # 0 for Tumor, 1 for No_Tumor
    for img_name in os.listdir(path):
        try:
            img_path = os.path.join(path, img_name)
            img = Image.open(img_path).convert("RGB").resize((150, 150))
            data.append(np.array(img))
            labels.append(class_label)
        except Exception as e:
            print(f"Error loading image {img_name}: {e}")

data = np.array(data)
labels = np.array(labels)
In [91]:
# Normalize the data

data = data / 255.0

# Train-test split

X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42, stratify=labels)

# Convert labels to categorical

y_train_cat = to_categorical(y_train, num_classes=2)
y_test_cat = to_categorical(y_test, num_classes=2)

# Compute class weights to handle class imbalance

class_weights = class_weight.compute_class_weight(
    'balanced', classes=np.unique(y_train), y=y_train
)
class_weights_dict = {i: class_weights[i] for i in range(len(class_weights))}
In [93]:
# Data Augmentation

datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.3,
    horizontal_flip=True,
    fill_mode="nearest"
)
datagen.fit(X_train)

# Ensure the generator can repeat data

train_generator = datagen.flow(X_train, y_train_cat, batch_size=32)
In [95]:
# Calculate steps per epoch (for 253 images and batch size 32)

steps_per_epoch = len(X_train) // 32  # This will be 253 // 32 = 7
In [99]:
# Model architecture

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(2, activation="softmax")
])

# Compile the model

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
In [103]:
# Train the model with class weights

history = model.fit(
    train_generator,  # Use the generator here
    validation_data=(X_test, y_test_cat),
    epochs=20,
    steps_per_epoch=steps_per_epoch,  # Use calculated steps per epoch
    verbose=1,
    class_weight=class_weights_dict  # Apply class weights during training
)
Epoch 1/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 589ms/step - accuracy: 0.5933 - loss: 0.6191 - val_accuracy: 0.7059 - val_loss: 0.4916
Epoch 2/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.5312 - loss: 0.8031 - val_accuracy: 0.7451 - val_loss: 0.4843
Epoch 3/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 643ms/step - accuracy: 0.6702 - loss: 0.6120 - val_accuracy: 0.8824 - val_loss: 0.4633
Epoch 4/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.5938 - loss: 0.6640 - val_accuracy: 0.8235 - val_loss: 0.4976
Epoch 5/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 737ms/step - accuracy: 0.5854 - loss: 0.6567 - val_accuracy: 0.8627 - val_loss: 0.5196
Epoch 6/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 56ms/step - accuracy: 0.5938 - loss: 0.6201 - val_accuracy: 0.8824 - val_loss: 0.5096
Epoch 7/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 630ms/step - accuracy: 0.6644 - loss: 0.6314 - val_accuracy: 0.8824 - val_loss: 0.4847
Epoch 8/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 73ms/step - accuracy: 0.5312 - loss: 0.7247 - val_accuracy: 0.8824 - val_loss: 0.4827
Epoch 9/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 833ms/step - accuracy: 0.5992 - loss: 0.6254 - val_accuracy: 0.8235 - val_loss: 0.4677
Epoch 10/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.7188 - loss: 0.5955 - val_accuracy: 0.7647 - val_loss: 0.4753
Epoch 11/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 707ms/step - accuracy: 0.6872 - loss: 0.6146 - val_accuracy: 0.8824 - val_loss: 0.4070
Epoch 12/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 62ms/step - accuracy: 0.7188 - loss: 0.5903 - val_accuracy: 0.8824 - val_loss: 0.4026
Epoch 13/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 691ms/step - accuracy: 0.6616 - loss: 0.6375 - val_accuracy: 0.8824 - val_loss: 0.3987
Epoch 14/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.9000 - loss: 0.4299 - val_accuracy: 0.9020 - val_loss: 0.3984
Epoch 15/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 633ms/step - accuracy: 0.7297 - loss: 0.6092 - val_accuracy: 0.7255 - val_loss: 0.5088
Epoch 16/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.6875 - loss: 0.5857 - val_accuracy: 0.5294 - val_loss: 0.6490
Epoch 17/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 741ms/step - accuracy: 0.5221 - loss: 0.7306 - val_accuracy: 0.8627 - val_loss: 0.5118
Epoch 18/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.6875 - loss: 0.6031 - val_accuracy: 0.8039 - val_loss: 0.5213
Epoch 19/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 659ms/step - accuracy: 0.7431 - loss: 0.6233 - val_accuracy: 0.8431 - val_loss: 0.5121
Epoch 20/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.6562 - loss: 0.7531 - val_accuracy: 0.8627 - val_loss: 0.5138
In [105]:
# Evaluate the model

loss, accuracy = model.evaluate(X_test, y_test_cat)
print(f"Test Loss: {loss:.4f}, Test Accuracy: {accuracy:.4f}")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 140ms/step - accuracy: 0.8772 - loss: 0.5103
Test Loss: 0.5138, Test Accuracy: 0.8627
In [107]:
# Classification Report

y_pred = np.argmax(model.predict(X_test), axis=1)
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=categories))
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 194ms/step

Classification Report:
              precision    recall  f1-score   support

       Tumor       0.83      0.97      0.90        31
    No Tumor       0.93      0.70      0.80        20

    accuracy                           0.86        51
   macro avg       0.88      0.83      0.85        51
weighted avg       0.87      0.86      0.86        51

In [109]:
# Confusion Matrix

conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=categories, yticklabels=categories)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
No description has been provided for this image
In [111]:
# ROC and AUC Visualization

y_prob = model.predict(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}", color="darkorange")
plt.plot([0, 1], [0, 1], color="navy", linestyle="--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 124ms/step
No description has been provided for this image
In [113]:
# Adjust the number of images based on the size of X_test

num_images_to_display = min(90, len(X_test))  # Display up to 90 images, but not more than available in X_test

correct_preds = 0
incorrect_preds = 0
plt.figure(figsize=(18, 18))  # Adjusted to fit 90 images in a grid
for i in range(num_images_to_display):  # Display up to the number of available test images
    ax = plt.subplot(10, 10, i + 1)  # Adjusted to 10 rows, 10 columns for up to 100 images
    ax.imshow(X_test[i])
    pred_label = categories[np.argmax(model.predict(X_test[i:i+1]))]
    true_label = categories[y_test[i]]
    ax.set_title(f"True: {true_label}\nPred: {pred_label}", fontsize=6)
    ax.axis("off")
    
    # Track the best accuracy images
    if pred_label == true_label:
        correct_preds += 1
    else:
        incorrect_preds += 1

# Ensure the plot is fully rendered before saving

plt.tight_layout()
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 123ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 45ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 165ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step
No description has been provided for this image
In [115]:
# Save the image grid of predictions as PNG

plt.savefig("prediction_results_90_images.png", bbox_inches='tight', pad_inches=0.1)
plt.close()  # Close the plot to avoid memory issues
In [117]:
# Print the best accuracy results

accuracy_images = (correct_preds / num_images_to_display) * 100
print(f"Accuracy based on the first {num_images_to_display} images: {accuracy_images:.2f}%")
print(f"Correct Predictions: {correct_preds}, Incorrect Predictions: {incorrect_preds}")
Accuracy based on the first 51 images: 86.27%
Correct Predictions: 44, Incorrect Predictions: 7
In [119]:
# Save accuracy plot (as an additional chart, showing accuracy based on the images displayed)

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Correct", "Incorrect"], [correct_preds, incorrect_preds], color=["green", "red"])
ax.set_title(f"Prediction Results (Accuracy: {accuracy_images:.2f}%)")
ax.set_ylabel("Number of Predictions")
ax.set_xlabel("Prediction Type")
Out[119]:
Text(0.5, 0, 'Prediction Type')
No description has been provided for this image
In [121]:
# Save the accuracy plot as PNG

plt.tight_layout()
plt.savefig("accuracy_results_90_images.png")
plt.close()  # Close the plot to avoid memory issues
In [123]:
# Save model

model.save("brain_tumor_detection_model.h5")
print("\nModel saved as brain_tumor_detection_model.h5!")
WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`. 
Model saved as brain_tumor_detection_model.h5!