Monday, June 29, 2026

Association Rule Mining (Apriori Algorithm) Using Python

 


Note: Association Rule Mining is an Unsupervised Machine Learning technique. It is mainly used for Market Basket Analysis to discover relationships between items frequently purchased together.


🟦 Program Aim

Aim:

To implement the Association Rule Mining (Apriori Algorithm) using Python and identify products that are frequently purchased together.


🟩 Algorithm Used

Apriori Algorithm


🟨 Problem Statement

A supermarket wants to analyze customer shopping patterns. By examining previous transactions, the store aims to identify products that are frequently purchased together. This information helps improve product placement, cross-selling, and promotional strategies.


🟪 Step 1: Install Required Library

Install the mlxtend package (only once).

pip install mlxtend

Explanation

  • mlxtend stands for Machine Learning Extensions.
  • It provides the Apriori algorithm and functions for generating association rules.

🟦 Step 2: Import Required Libraries

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

Explanation

  • pandas → Used to create and manipulate data.
  • TransactionEncoder → Converts transaction data into a True/False matrix.
  • apriori() → Finds frequent itemsets.
  • association_rules() → Generates association rules from frequent itemsets.

🟩 Step 3: Create the Transaction Dataset

transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"]
]

Explanation

Each inner list represents one customer's shopping basket.

Customer  Purchased Items
1         Milk, Bread, Butter
2         Milk, Bread
3         Milk, Butter
4         Bread, Butter
5         Milk, Bread, Butter, Eggs
6         Bread, Eggs
7         Milk, Eggs

🟨 Step 4: Convert Transactions into Binary Format

encoder = TransactionEncoder()

encoded_data = encoder.fit(transactions).transform(transactions)

df = pd.DataFrame(encoded_data, columns=encoder.columns_)

Explanation

The apriori() function in mlxtend expects a one-hot encoded DataFrame (True/False or 1/0), with one row per transaction and one column per item.

The dataset becomes:

Bread  Butter  Eggs   Milk
True   True    False  True
True   False   False  True
False  True    False  True
True   True    False  False
True   True    True   True
True   False   True   False
False  False   True   True
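To see what the encoding step does conceptually, the sketch below builds the same True/False matrix with plain Python. It is an illustration only, not mlxtend's implementation, and the helper name one_hot_encode is made up for this example.

```python
transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"],
]

def one_hot_encode(baskets):
    """Hypothetical helper: return (sorted item names, list of True/False rows)."""
    # Collect every distinct item and sort alphabetically so the column order is stable.
    columns = sorted({item for basket in baskets for item in basket})
    # One row per basket: True where the basket contains the column's item.
    rows = [[item in basket for item in columns] for basket in baskets]
    return columns, rows

columns, rows = one_hot_encode(transactions)
print(columns)
for row in rows:
    print(row)
```

The columns come out in alphabetical order (Bread, Butter, Eggs, Milk), which is why Milk appears last in the matrix even though it is listed first in most baskets.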

🟦 Step 5: Display the Dataset

print(df)

Explanation

Displays the converted transaction matrix used for mining frequent itemsets.


🟩 Step 6: Find Frequent Itemsets

frequent_items = apriori(df, min_support=0.3, use_colnames=True)

print(frequent_items)

Explanation

  • min_support = 0.3 means an itemset must appear in at least 30% of all transactions. With 7 transactions, 0.3 × 7 = 2.1, so an itemset needs to occur in at least 3 baskets.
  • use_colnames=True displays product names instead of column indices.

Example Output:

Support  Itemsets
0.71     {Milk}
0.71     {Bread}
0.57     {Butter}
0.43     {Eggs}
0.43     {Milk, Bread}
0.43     {Milk, Butter}
0.43     {Bread, Butter}
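Support can also be checked by hand: count the baskets that contain every item of the itemset and divide by the total number of baskets. The following is a minimal sketch under that definition, using the same seven baskets; the function name support is chosen for this example and is not part of mlxtend.

```python
transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"],
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset.issubset(basket))
    return hits / len(baskets)

for items in (["Milk"], ["Eggs"], ["Milk", "Bread"], ["Milk", "Eggs"]):
    print(items, round(support(items, transactions), 2))
```

{Milk, Eggs} occurs in only 2 of 7 baskets (about 0.29), so it falls below min_support = 0.3 and is not reported as frequent.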

🟨 Step 7: Generate Association Rules

rules = association_rules(
    frequent_items,
    metric="confidence",
    min_threshold=0.7
)

print(rules)

Explanation

This step generates association rules using:

  • Metric = Confidence
  • Minimum Confidence = 70%

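Example Rule:

Butter → Milk

Meaning:

Customers buying Butter are likely to buy Milk as well: 3 of the 4 baskets that contain Butter also contain Milk, a confidence of 75%. By contrast, Milk → Bread has a confidence of only 60% (3 of the 5 baskets with Milk), so it does not pass the 70% threshold.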
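The confidence of a rule A → B is support(A and B together) divided by support(A). As a rough sketch of the filtering this step performs, the code below checks every ordered pair of single items in the same seven baskets. It is a simplification: it only considers one-item antecedents and consequents, and the function names are invented for this example rather than taken from mlxtend.

```python
from itertools import permutations

transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"],
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset.issubset(basket) for basket in baskets) / len(baskets)

def pair_rules(baskets, min_support=0.3, min_confidence=0.7):
    """Hypothetical helper: (antecedent, consequent, confidence) for single-item rules."""
    items = sorted({item for basket in baskets for item in basket})
    rules = []
    for a, b in permutations(items, 2):
        pair_support = support([a, b], baskets)
        if pair_support < min_support:
            continue  # the pair is not frequent, so no rule is built from it
        confidence = pair_support / support([a], baskets)
        if confidence >= min_confidence:
            rules.append((a, b, round(confidence, 2)))
    return rules

for a, b, conf in pair_rules(transactions):
    print(f"{a} -> {b}  confidence={conf}")
```

On this dataset only Butter → Bread and Butter → Milk reach 70% confidence; every other pair rule stays at 60% or is dropped because the pair is not frequent.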


🟥 Step 8: Display Selected Columns

print(rules[['antecedents',
             'consequents',
             'support',
             'confidence',
             'lift']])

Explanation

This displays the most important measures:

Antecedent  Consequent  Support  Confidence  Lift
Butter      Bread       0.43     0.75        1.05
Butter      Milk        0.43     0.75        1.05

🟪 Complete Python Program

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each inner list is one customer's shopping basket.
transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"]
]

# One-hot encode the baskets: one row per transaction, one column per item.
encoder = TransactionEncoder()
encoded_data = encoder.fit(transactions).transform(transactions)
df = pd.DataFrame(encoded_data, columns=encoder.columns_)

print("Transaction Dataset")
print(df)

# Keep itemsets that appear in at least 30% of the transactions.
frequent_items = apriori(df, min_support=0.3, use_colnames=True)

print("\nFrequent Itemsets")
print(frequent_items)

# Build rules whose confidence is at least 70%.
rules = association_rules(frequent_items, metric="confidence", min_threshold=0.7)

print("\nAssociation Rules")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])

🟩 Sample Output

Transaction Dataset

   Bread  Butter   Eggs   Milk
0   True    True  False   True
1   True   False  False   True
2  False    True  False   True
3   True    True  False  False
4   True    True   True   True
5   True   False   True  False
6  False   False   True   True

Frequent Itemsets

support  itemsets
0.71     {Milk}
0.71     {Bread}
0.57     {Butter}
0.43     {Eggs}
0.43     {Milk, Bread}

...

Association Rules

Butter → Bread

Butter → Milk

🟦 Step-by-Step Working of the Algorithm

Transaction Data
↓
Convert into Binary Matrix
↓
Apply Apriori Algorithm
↓
Find Frequent Itemsets
↓
Generate Association Rules
↓
Display Support, Confidence & Lift
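The "Apply Apriori Algorithm" step works level by level: frequent itemsets of size k are joined into candidates of size k + 1, and a candidate is kept only if all of its size-k subsets are frequent and its own support meets the threshold. The code below is a compact teaching sketch of that idea in plain Python. It is not how mlxtend implements apriori(), and the name apriori_sketch is made up for this example.

```python
from itertools import combinations

transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"],
]

def apriori_sketch(baskets, min_support):
    """Level-wise search: frequent k-itemsets seed the (k+1)-candidates."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Level 1: every single item that meets the threshold.
    items = sorted({item for b in baskets for item in b})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        # Join step: union pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Prune step: every k-subset must be frequent, then check the support itself.
        current = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k))
            and support(c) >= min_support
        ]
        k += 1
    return frequent

result = apriori_sketch(transactions, min_support=0.3)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

With min_support = 0.3 the search stops after size 2: the only size-3 candidate, {Bread, Butter, Milk}, appears in just 2 of 7 baskets.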

🟨 Important Terms

Term              Description
Support           Fraction of all transactions that contain the itemset.
Confidence        Probability that customers who buy item A also buy item B, calculated as support(A and B) / support(A).
Lift              Measures the strength of the relationship between two items, calculated as confidence(A → B) / support(B). A lift value greater than 1 indicates a positive association.
Frequent Itemset  A group of items whose support is at least the chosen min_support.
Association Rule  A rule showing the relationship between two or more items (e.g., Milk → Bread).
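The three measures can be computed directly from their definitions. The sketch below does this for two rules on the same seven baskets, so the numbers can be compared with the library output. The helper rule_metrics is a hypothetical name used only here.

```python
transactions = [
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread"],
    ["Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Bread", "Butter", "Eggs"],
    ["Bread", "Eggs"],
    ["Milk", "Eggs"],
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset.issubset(basket) for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return round(both, 2), round(confidence, 2), round(lift, 2)

print(rule_metrics(["Butter"], ["Milk"], transactions))  # (0.43, 0.75, 1.05)
print(rule_metrics(["Milk"], ["Bread"], transactions))   # (0.43, 0.6, 0.84)
```

Butter → Milk has a lift slightly above 1, a weak positive association. Milk → Bread has a lift below 1, meaning Bread is bought slightly less often in baskets with Milk than in baskets overall.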

🌍 Real-Life Applications

  • 🛒 Market Basket Analysis
  • 🛍 Product Recommendation Systems
  • 🏪 Store Shelf Arrangement
  • 💳 Banking Product Recommendations
  • 🎬 Movie Recommendation Systems
  • 🌐 E-commerce Websites (Amazon, Flipkart)
  • 🍔 Restaurant Combo Offers

🎯 Viva Questions

  1. What is Association Rule Mining?
  2. What is the Apriori Algorithm?
  3. Define Support, Confidence, and Lift.
  4. What is a Frequent Itemset?
  5. Why is TransactionEncoder used?
  6. What is the purpose of min_support?
  7. What is the purpose of min_threshold in association rules?
  8. Give two real-life applications of Association Rule Mining.

⭐ One-Line Revision

Association Rule Mining uses the Apriori algorithm to discover frequently occurring item combinations and generate rules such as "If a customer buys Milk, they are also likely to buy Bread."
