Monday, June 29, 2026

Naïve Bayes Algorithm in Machine Learning Using Python

🟦 Program Aim

Aim:

To implement the Gaussian Naïve Bayes Algorithm using Python and predict whether a patient has Diabetes or is Healthy based on their blood sugar level.

🟩 Algorithm Used

Gaussian Naïve Bayes (GaussianNB)

🟨 Problem Statement

A hospital wants to predict whether a patient is Healthy or has Diabetes based on the patient's Blood Sugar Level.

🟪 Step 1: Import Required Library


from sklearn.naive_bayes import GaussianNB

Explanation

sklearn is the import name of the scikit-learn library.
naive_bayes is the module that contains Naïve Bayes algorithms.
GaussianNB is used for continuous numerical data (e.g., blood sugar, age, height, weight).

🟦 Step 2: Create the Training Dataset


X = [
    [85],
    [90],
    [95],
    [140],
    [150],
    [160]
]

Explanation

X represents the input feature (Independent Variable).

Each value is the patient's Blood Sugar Level (mg/dL).

Patient	Blood Sugar
Patient 1	85
Patient 2	90
Patient 3	95
Patient 4	140
Patient 5	150
Patient 6	160

The algorithm uses these values for learning.

🟩 Step 3: Create the Output Labels


y = [
    "Healthy",
    "Healthy",
    "Healthy",
    "Diabetes",
    "Diabetes",
    "Diabetes"
]

Explanation

y represents the target variable (Dependent Variable).

Blood Sugar	Output
85	Healthy
90	Healthy
95	Healthy
140	Diabetes
150	Diabetes
160	Diabetes

The algorithm learns the relationship between blood sugar levels and health status.

🟨 Step 4: Create the Gaussian Naïve Bayes Model


model = GaussianNB()

Explanation

This line creates an object of the Gaussian Naïve Bayes classifier.

The model is now ready to be trained.

🟪 Step 5: Train the Model


model.fit(X, y)

Explanation

The fit() method trains the model on the training data.

X = Input data (Blood Sugar)
y = Output labels (Healthy / Diabetes)

During training, the model:

Estimates the prior probability of each class from how often it appears in y.
Estimates the mean and variance of the blood sugar values for each class.
Stores these values so that Bayes' Theorem can be applied when predict() is called.

🟦 Step 6: Predict for a New Patient


prediction = model.predict([[145]])

Explanation

The patient's blood sugar level is 145 mg/dL.

The model calculates:

Posterior probability of Healthy, given a reading of 145
Posterior probability of Diabetes, given a reading of 145

It selects the class with the higher posterior probability.
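
The two probabilities described above can be printed with predict_proba(), which is part of the GaussianNB class. The sketch below retrains on the same six readings so that it runs on its own.

```python
from sklearn.naive_bayes import GaussianNB

X = [[85], [90], [95], [140], [150], [160]]
y = ["Healthy", "Healthy", "Healthy", "Diabetes", "Diabetes", "Diabetes"]

model = GaussianNB().fit(X, y)

# predict_proba returns one row per input; columns follow model.classes_
probabilities = model.predict_proba([[145]])[0]

for label, p in zip(model.classes_, probabilities):
    print(f"P({label} | blood sugar = 145) = {p:.6f}")
```

predict() simply returns the label whose column in this row is largest.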

🟩 Step 7: Display the Prediction


print("Prediction =", prediction[0])

Explanation

prediction is returned as a NumPy array with one entry per input row.

Using [0] retrieves the first (and only) predicted result.

Expected Output:


Prediction = Diabetes
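
Because predict() returns one label per input row, several patients can be classified in a single call. A minimal sketch follows; the three new readings (88, 145, 155) are made up for illustration and are not part of the original dataset.

```python
from sklearn.naive_bayes import GaussianNB

X = [[85], [90], [95], [140], [150], [160]]
y = ["Healthy", "Healthy", "Healthy", "Diabetes", "Diabetes", "Diabetes"]

model = GaussianNB().fit(X, y)

# Hypothetical new readings, one row per patient
new_patients = [[88], [145], [155]]
predictions = model.predict(new_patients)

# predictions[i] is the label for new_patients[i]
for reading, label in zip(new_patients, predictions):
    print(f"Blood sugar {reading[0]} -> {label}")
```

With a single input row, the array has exactly one element, which is why the program reads prediction[0].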

🟥 Step 8: Complete Python Program


# Import Gaussian Naïve Bayes
from sklearn.naive_bayes import GaussianNB

# Training Data (Blood Sugar Levels)
X = [
    [85],
    [90],
    [95],
    [140],
    [150],
    [160]
]

# Output Labels
y = [
    "Healthy",
    "Healthy",
    "Healthy",
    "Diabetes",
    "Diabetes",
    "Diabetes"
]

# Create Model
model = GaussianNB()

# Train Model
model.fit(X, y)

# Predict New Patient
prediction = model.predict([[145]])

# Display Result
print("Prediction =", prediction[0])

🟦 Sample Output


Prediction = Diabetes

🟩 Step-by-Step Workflow


Start
   │
   ▼
Import GaussianNB
   │
   ▼
Create Training Dataset (X)
   │
   ▼
Create Output Labels (y)
   │
   ▼
Create GaussianNB Model
   │
   ▼
Train Model using fit()
   │
   ▼
Enter New Blood Sugar Value
   │
   ▼
Predict using predict()
   │
   ▼
Display Prediction
   │
   ▼
End
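
The workflow above can also be wrapped in two small functions, so that the model is trained once and reused for any new reading. This is only a sketch: the names train_model and predict_health_status are hypothetical helpers introduced here, not part of scikit-learn.

```python
from sklearn.naive_bayes import GaussianNB


def train_model():
    """Train GaussianNB on the six-patient dataset from this post."""
    X = [[85], [90], [95], [140], [150], [160]]
    y = ["Healthy", "Healthy", "Healthy", "Diabetes", "Diabetes", "Diabetes"]
    return GaussianNB().fit(X, y)


def predict_health_status(model, blood_sugar):
    """Return the predicted label for one blood sugar reading (mg/dL)."""
    # predict() expects a 2-D input: one row per patient, one column per feature
    return str(model.predict([[blood_sugar]])[0])


model = train_model()
print("Prediction =", predict_health_status(model, 145))
```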

🟨 Line-by-Line Explanation

Line	Code	Description
1	`from sklearn.naive_bayes import GaussianNB`	Imports the Gaussian Naïve Bayes classifier.
2	`X = [...]`	Creates the input feature (blood sugar values).
3	`y = [...]`	Creates the output labels (Healthy/Diabetes).
4	`model = GaussianNB()`	Creates the Naïve Bayes model.
5	`model.fit(X, y)`	Trains the model using the training data.
6	`prediction = model.predict([[145]])`	Predicts the class for a new patient.
7	`print("Prediction =", prediction[0])`	Displays the predicted class.

🟪 Why Gaussian Naïve Bayes?

Gaussian Naïve Bayes is suitable because the feature (blood sugar level) is a continuous numerical value.

Examples of continuous data include:

Blood Sugar
Age
Height
Weight
Salary
Temperature
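
The word "Gaussian" refers to the bell-curve density that the model assumes for each feature within each class. The sketch below reproduces the calculation for a reading of 145 in plain Python, without scikit-learn. It assumes equal priors of 0.5 (three patients per class) and leaves out the tiny variance-smoothing term that GaussianNB adds, so the numbers are an approximation of what the library computes. The helper names are hypothetical.

```python
import math


def mean_and_variance(values):
    """Population mean and variance (divide by n, not n - 1)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def gaussian_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    exponent = -((x - mean) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)


healthy = [85, 90, 95]
diabetes = [140, 150, 160]
x = 145
prior = 0.5  # assumption: both classes are equally frequent in the training data

# Bayes' Theorem numerator for each class: prior * likelihood
score_healthy = prior * gaussian_pdf(x, *mean_and_variance(healthy))
score_diabetes = prior * gaussian_pdf(x, *mean_and_variance(diabetes))

# Normalise so the two posteriors sum to 1
total = score_healthy + score_diabetes
print("P(Healthy | 145)  =", score_healthy / total)
print("P(Diabetes | 145) =", score_diabetes / total)
```

A reading of 145 lies close to the Diabetes mean of 150 and far from the Healthy mean of 90, so the Diabetes score dominates.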

🟦 Advantages

✔ Easy to implement
✔ Fast training and prediction
✔ Works well with small datasets
✔ Handles continuous numerical data
✔ Effective for classification problems

🟥 Limitations

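❌ Assumes all features are conditionally independent given the class.
❌ Assumes each feature follows a normal (Gaussian) distribution within each class.
❌ Performance may decrease if features are highly correlated.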
❌ Sensitive to the quality of training data.

🟩 Applications

🏥 Disease Diagnosis
📧 Spam Email Detection
😊 Sentiment Analysis
📰 News Classification
🌐 Language Detection
💳 Fraud Detection

📝 Viva Questions

What is Naïve Bayes?
Why is it called Naïve?
What is Gaussian Naïve Bayes?
What is the purpose of fit()?
What is the purpose of predict()?
What is the difference between Gaussian, Multinomial, and Bernoulli Naïve Bayes?
Why is prediction[0] used?
Which Python library provides the Naïve Bayes algorithm?

🎯 Key Points for Exams

Algorithm: Gaussian Naïve Bayes
Library: scikit-learn (module sklearn.naive_bayes)
Model Class: GaussianNB
Training Method: fit()
Prediction Method: predict()
Input: Continuous numerical values
Output: Predicted class (e.g., Healthy or Diabetes)

⭐ One-Line Revision

Gaussian Naïve Bayes is a supervised machine learning algorithm that uses Bayes' Theorem to classify continuous numerical data, assuming that each feature is normally distributed within a class and that the features are conditionally independent given the class.

Bijan Krishna Paul