Medical Image Analysis for Brain Tumor Detection Using Convolutional Neural Networks (CNN) in Python
Description:
This project applies deep learning to medical image analysis: it classifies brain MRI images into two categories, "Tumor" and "No Tumor", using a Convolutional Neural Network (CNN) for feature extraction and classification. The dataset, sourced from Kaggle, consists of 253 images. The two classes are not equally represented: the stratified 51-image test split contains 31 tumor and 20 non-tumor images.
Key Features:
Data Preprocessing: The dataset is loaded, resized, and normalized for better model performance.
Data Augmentation: To improve generalization on this small dataset, training images are augmented with rotation, width and height shifts, shear, zoom, and horizontal flips.
Model Architecture: The model is built using a CNN architecture consisting of convolutional layers (Conv2D) for feature extraction, MaxPooling2D layers for down-sampling, and dense layers for classification.
Class Imbalance Handling: The model incorporates class weights during training to mitigate the imbalance between tumor and non-tumor classes.
Performance Evaluation: The model's performance is evaluated using test data, with accuracy, classification report, and confusion matrix generated for detailed performance insights. An ROC curve and AUC score are also visualized for further assessment.
Model Saving: The trained model is saved for future predictions and deployment.
Achievements:
Initial Accuracy: Achieved an accuracy of 90.00% on a sample of 30 images.
Extended Accuracy: On the full 51-image test set, accuracy was 86.27% (44 of 51 predictions correct).
Confusion Matrix: Visualized the confusion matrix to assess model performance across both classes (Tumor and No Tumor).
Image Visualization: Displayed predictions on 30 and 51 test images to further assess model accuracy and performance.
Conclusion:
This project shows that a small Convolutional Neural Network can classify brain MRI images as tumor or non-tumor with 86.27% accuracy on a 51-image test set. It combines deep learning with data augmentation and class weighting to handle the imbalanced classes. With validation on larger and more varied data, such a system could potentially support automated brain tumor detection in clinical settings and give doctors a useful tool for decision-making.
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
from sklearn.utils import class_weight # Ensure class_weight is correctly imported
from PIL import Image
# Load and visualize one image with a tumor and one without
tumor_image_path = "brain_tumor_dataset/Tumor/Y20.jpg"  # Replace with the actual path to your image
no_tumor_image_path = "brain_tumor_dataset/No Tumor/8 no.jpg"  # Replace with the actual path to your image
tumor_image = Image.open(tumor_image_path)
no_tumor_image = Image.open(no_tumor_image_path)
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(tumor_image)
plt.title("Brain with Tumor")
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(no_tumor_image)
plt.title("Brain without Tumor")
plt.axis("off")
plt.show()
# Assuming dataset structure
# brain_tumor_dataset/
# ├── Tumor/
# └── No Tumor/
data_dir = "brain_tumor_dataset" # Replace with your dataset path
categories = ["Tumor", "No Tumor"]
data = []
labels = []
for category in categories:
    path = os.path.join(data_dir, category)
    class_label = categories.index(category)  # 0 for Tumor, 1 for No Tumor
    for img_name in os.listdir(path):
        try:
            img_path = os.path.join(path, img_name)
            # Force 3 channels and a fixed size so every array has shape (150, 150, 3)
            img = Image.open(img_path).convert("RGB").resize((150, 150))
            data.append(np.array(img))
            labels.append(class_label)
        except Exception as e:
            print(f"Error loading image {img_name}: {e}")
data = np.array(data)
labels = np.array(labels)
# Normalize the data
data = data / 255.0
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42, stratify=labels)
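The sizes of the two splits follow from simple arithmetic. The sketch below assumes that scikit-learn rounds a fractional test size up, which agrees with the 51 test images reported later in the notebook; `split_sizes` is a hypothetical helper written for illustration, not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_fraction):
    """Return (n_train, n_test), assuming the test size is rounded up."""
    n_test = math.ceil(test_fraction * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


# 253 images with test_size=0.2, as in the call above
n_train, n_test = split_sizes(253, 0.2)
print(f"train: {n_train}, test: {n_test}")
```

With `stratify=labels`, each split keeps roughly the same tumor to non-tumor ratio as the full dataset.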
# Convert labels to categorical
y_train_cat = to_categorical(y_train, num_classes=2)
y_test_cat = to_categorical(y_test, num_classes=2)
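As a rough illustration of what the one-hot conversion does, and of how `np.argmax` later turns softmax outputs back into class indices, here is a dependency-free sketch. The helpers `one_hot` and `argmax_rows` are hypothetical stand-ins written for this example, not the Keras or NumPy implementations.

```python
def one_hot(labels, num_classes):
    """Encode integer labels as one-hot rows, e.g. 1 -> [0.0, 1.0]."""
    return [[1.0 if k == label else 0.0 for k in range(num_classes)]
            for label in labels]


def argmax_rows(rows):
    """Return the index of the largest value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# 0 = Tumor, 1 = No Tumor, matching the order of `categories`
encoded = one_hot([0, 1, 1, 0], num_classes=2)
decoded = argmax_rows(encoded)
print(encoded)
print(decoded)
```

The round trip recovers the original labels, which is why the integer `y_test` can be compared directly against `argmax` predictions.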
# Compute class weights to handle class imbalance
class_weights = class_weight.compute_class_weight(
    'balanced', classes=np.unique(y_train), y=y_train
)
class_weights_dict = {i: class_weights[i] for i in range(len(class_weights))}
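The 'balanced' setting weights each class by n_samples / (n_classes * class_count), so the rarer class gets the larger weight. The sketch below reproduces that formula in plain Python. The notebook does not print the class counts of the training split, so the example borrows the 31 / 20 counts of the test split purely for illustration; `balanced_class_weights` is a hypothetical helper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * count_of_that_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Illustrative counts only: 31 tumor (0) and 20 non-tumor (1) labels
example_labels = [0] * 31 + [1] * 20
weights = balanced_class_weights(example_labels)
print(weights)
```

During training, each sample's loss is multiplied by the weight of its class, so mistakes on the minority class cost more.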
# Data Augmentation
datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.3,
    horizontal_flip=True,
    fill_mode="nearest"
)
datagen.fit(X_train)
# Ensure the generator can repeat data
train_generator = datagen.flow(X_train, y_train_cat, batch_size=32)
# Calculate steps per epoch (202 training images after the 80/20 split, batch size 32)
steps_per_epoch = len(X_train) // 32  # This will be 202 // 32 = 6, matching the 6/6 in the training log
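The arithmetic behind the step count can be checked in a few lines. This sketch assumes 202 training images (253 total minus the 51 held out for testing) and compares floor division with rounding up; the variable names are illustrative.

```python
import math

n_train = 202      # assumed: 253 images minus the 51-image test split
batch_size = 32

floor_steps = n_train // batch_size              # full batches only
ceil_steps = math.ceil(n_train / batch_size)     # includes the final partial batch
leftover = n_train - floor_steps * batch_size    # images outside the full batches

print(f"floor: {floor_steps} steps, ceil: {ceil_steps} steps, leftover: {leftover} images")
```

Floor division covers only full batches, so the last 10 images of each pass fall outside those 6 steps; rounding up to 7 steps would cover every training image once per epoch.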
# Model architecture
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(2, activation="softmax")
])
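To see how the feature maps shrink through this stack and where most of the parameters sit, the shapes can be traced by hand. The sketch below assumes the Keras defaults of 'valid' padding and stride 1 for the convolutions and non-overlapping 2x2 pooling; `conv_out` and `pool_out` are hypothetical helpers.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


size, channels = 150, 3
total_params = 0
for filters in (32, 64, 128):
    # Each filter has 3*3*channels weights plus one bias
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels          # length of the Flatten output
dense_params = (flat + 1) * 128        # Dense(128): weights plus biases
total_params += dense_params
total_params += (128 + 1) * 2          # Dense(2) output layer

print(f"flatten: {flat}, dense(128) params: {dense_params}, total: {total_params}")
```

Under these assumptions nearly all of the parameters belong to the first dense layer, which is one reason the Dropout layer placed after it helps on such a small dataset.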
# Compile the model
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# Train the model with class weights
history = model.fit(
    train_generator,  # Use the generator here
    validation_data=(X_test, y_test_cat),
    epochs=20,
    steps_per_epoch=steps_per_epoch,  # Use calculated steps per epoch
    verbose=1,
    class_weight=class_weights_dict  # Apply class weights during training
)
Epoch 1/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 589ms/step - accuracy: 0.5933 - loss: 0.6191 - val_accuracy: 0.7059 - val_loss: 0.4916
Epoch 2/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.5312 - loss: 0.8031 - val_accuracy: 0.7451 - val_loss: 0.4843
Epoch 3/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 643ms/step - accuracy: 0.6702 - loss: 0.6120 - val_accuracy: 0.8824 - val_loss: 0.4633
Epoch 4/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.5938 - loss: 0.6640 - val_accuracy: 0.8235 - val_loss: 0.4976
Epoch 5/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 737ms/step - accuracy: 0.5854 - loss: 0.6567 - val_accuracy: 0.8627 - val_loss: 0.5196
Epoch 6/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 56ms/step - accuracy: 0.5938 - loss: 0.6201 - val_accuracy: 0.8824 - val_loss: 0.5096
Epoch 7/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 630ms/step - accuracy: 0.6644 - loss: 0.6314 - val_accuracy: 0.8824 - val_loss: 0.4847
Epoch 8/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 73ms/step - accuracy: 0.5312 - loss: 0.7247 - val_accuracy: 0.8824 - val_loss: 0.4827
Epoch 9/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 5s 833ms/step - accuracy: 0.5992 - loss: 0.6254 - val_accuracy: 0.8235 - val_loss: 0.4677
Epoch 10/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.7188 - loss: 0.5955 - val_accuracy: 0.7647 - val_loss: 0.4753
Epoch 11/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 707ms/step - accuracy: 0.6872 - loss: 0.6146 - val_accuracy: 0.8824 - val_loss: 0.4070
Epoch 12/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 62ms/step - accuracy: 0.7188 - loss: 0.5903 - val_accuracy: 0.8824 - val_loss: 0.4026
Epoch 13/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 691ms/step - accuracy: 0.6616 - loss: 0.6375 - val_accuracy: 0.8824 - val_loss: 0.3987
Epoch 14/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 58ms/step - accuracy: 0.9000 - loss: 0.4299 - val_accuracy: 0.9020 - val_loss: 0.3984
Epoch 15/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 633ms/step - accuracy: 0.7297 - loss: 0.6092 - val_accuracy: 0.7255 - val_loss: 0.5088
Epoch 16/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.6875 - loss: 0.5857 - val_accuracy: 0.5294 - val_loss: 0.6490
Epoch 17/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 741ms/step - accuracy: 0.5221 - loss: 0.7306 - val_accuracy: 0.8627 - val_loss: 0.5118
Epoch 18/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 60ms/step - accuracy: 0.6875 - loss: 0.6031 - val_accuracy: 0.8039 - val_loss: 0.5213
Epoch 19/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 4s 659ms/step - accuracy: 0.7431 - loss: 0.6233 - val_accuracy: 0.8431 - val_loss: 0.5121
Epoch 20/20
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.6562 - loss: 0.7531 - val_accuracy: 0.8627 - val_loss: 0.5138
# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test_cat)
print(f"Test Loss: {loss:.4f}, Test Accuracy: {accuracy:.4f}")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 140ms/step - accuracy: 0.8772 - loss: 0.5103
Test Loss: 0.5138, Test Accuracy: 0.8627
# Classification Report
y_pred = np.argmax(model.predict(X_test), axis=1)
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=categories))
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 194ms/step

Classification Report:
              precision    recall  f1-score   support

       Tumor       0.83      0.97      0.90        31
    No Tumor       0.93      0.70      0.80        20

    accuracy                           0.86        51
   macro avg       0.88      0.83      0.85        51
weighted avg       0.87      0.86      0.86        51
# Confusion Matrix
conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=categories, yticklabels=categories)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
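The heatmap itself is not preserved in this text, but the counts can be inferred from the classification report: a recall of 0.97 on 31 tumor images and 0.70 on 20 non-tumor images corresponds to 30 and 14 correct predictions. The sketch below treats that inferred matrix as an assumption and recomputes the report's metrics from it; `per_class_metrics` is a hypothetical helper.

```python
# Rows are actual classes, columns are predicted classes.
# Counts inferred from the classification report, not read from the plot.
conf = [[30, 1],
        [6, 14]]
names = ["Tumor", "No Tumor"]


def per_class_metrics(conf, k):
    """Precision, recall, and F1 for class k of a square confusion matrix."""
    tp = conf[k][k]
    fn = sum(conf[k]) - tp                  # actual k, predicted something else
    fp = sum(row[k] for row in conf) - tp   # predicted k, actually something else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


accuracy = sum(conf[k][k] for k in range(len(conf))) / sum(map(sum, conf))
for k, name in enumerate(names):
    p, r, f1 = per_class_metrics(conf, k)
    print(f"{name:>8}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.4f}")
```

Read this way, six of the seven errors are non-tumor scans predicted as tumor, while one tumor scan is predicted as non-tumor.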
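# ROC and AUC Visualization
# Column 1 of the softmax output is the predicted probability of class 1 ("No Tumor"),
# which roc_curve treats as the positive class because y_test encodes it as 1
y_prob = model.predict(X_test)[:, 1]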
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}", color="darkorange")
plt.plot([0, 1], [0, 1], color="navy", linestyle="--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()
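One way to read the AUC score is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes that quantity directly on a small made-up example; `pairwise_auc` is a hypothetical helper, and the labels and scores are toy values rather than model outputs.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data: two negatives and two positives
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.2f}")
```

An AUC of 0.5 corresponds to the dashed diagonal in the plot (random ranking), and 1.0 to perfect separation of the two classes.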
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 124ms/step
# Adjust the number of images based on the size of X_test
num_images_to_display = min(90, len(X_test)) # Display up to 90 images, but not more than available in X_test
correct_preds = 0
incorrect_preds = 0
plt.figure(figsize=(18, 18))  # Sized for a 10 x 10 grid of thumbnails
for i in range(num_images_to_display):  # Display up to the number of available test images
    ax = plt.subplot(10, 10, i + 1)  # 10 rows, 10 columns: room for up to 100 images
    ax.imshow(X_test[i])
    pred_label = categories[np.argmax(model.predict(X_test[i:i+1]))]
    true_label = categories[y_test[i]]
    ax.set_title(f"True: {true_label}\nPred: {pred_label}", fontsize=6)
    ax.axis("off")
    # Count correct and incorrect predictions
    if pred_label == true_label:
        correct_preds += 1
    else:
        incorrect_preds += 1
# Apply a tight layout before saving the figure
plt.tight_layout()
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step
[... one similar 1/1 progress line for each remaining test image ...]
# Save the image grid of predictions as PNG
plt.savefig("prediction_results_90_images.png", bbox_inches='tight', pad_inches=0.1)
plt.close() # Close the plot to avoid memory issues
# Print the accuracy over the displayed images
accuracy_images = (correct_preds / num_images_to_display) * 100
print(f"Accuracy based on the first {num_images_to_display} images: {accuracy_images:.2f}%")
print(f"Correct Predictions: {correct_preds}, Incorrect Predictions: {incorrect_preds}")
Accuracy based on the first 51 images: 86.27%
Correct Predictions: 44, Incorrect Predictions: 7
# Save accuracy plot (as an additional chart, showing accuracy based on the images displayed)
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Correct", "Incorrect"], [correct_preds, incorrect_preds], color=["green", "red"])
ax.set_title(f"Prediction Results (Accuracy: {accuracy_images:.2f}%)")
ax.set_ylabel("Number of Predictions")
ax.set_xlabel("Prediction Type")
Text(0.5, 0, 'Prediction Type')
# Save the accuracy plot as PNG
plt.tight_layout()
plt.savefig("accuracy_results_90_images.png")
plt.close() # Close the plot to avoid memory issues
# Save model
model.save("brain_tumor_detection_model.h5")
print("\nModel saved as brain_tumor_detection_model.h5!")
WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`.
Model saved as brain_tumor_detection_model.h5!