Experiment 1: Linear Regression Using Python
🎯 Aim
To implement the Linear Regression algorithm using Python and predict the value of a dependent variable based on an independent variable.
📖 Theory
Linear Regression is one of the simplest Supervised Machine Learning algorithms. It is used to predict continuous numerical values by finding a best-fit straight line between the input (independent variable) and the output (dependent variable).
It assumes a linear relationship between the variables.
Mathematical Equation
$$Y = mX + c$$
Where:
- Y = Predicted Output (Dependent Variable)
- X = Input (Independent Variable)
- m = Slope of the Line
- c = Intercept
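The text above does not say how m and c are chosen. The usual choice, and the one scikit-learn's LinearRegression makes, is ordinary least squares: pick the line that minimizes the sum of squared vertical distances to the data points. For n data points (x_i, y_i) with means x̄ and ȳ (symbols introduced here for illustration only), the standard closed-form solution for the line Y = mX + c is:

```latex
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
```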
🌍 Real-Life Example
A company wants to predict an employee's salary based on their years of experience.
| Experience (Years) | Salary (₹) |
|---|---|
| 1 | 25,000 |
| 2 | 30,000 |
| 3 | 35,000 |
| 4 | 45,000 |
| 5 | 50,000 |
| 6 | 60,000 |
| 7 | 65,000 |
| 8 | 70,000 |
Now, we want to predict the salary of an employee with 9 years of experience.
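Before turning to scikit-learn, it can help to fit the line by hand once. The following is a minimal sketch in plain Python, assuming the ordinary least squares formulas (slope = covariance of X and Y divided by variance of X, intercept = mean of Y minus slope times mean of X). The variable names are illustrative and not part of the experiment.

```python
# Fit Y = mX + c by hand with ordinary least squares (no libraries needed).
experience = [1, 2, 3, 4, 5, 6, 7, 8]
salary = [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000]

n = len(experience)
mean_x = sum(experience) / n
mean_y = sum(salary) / n

# Slope: sum of products of deviations divided by sum of squared X deviations.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experience, salary))
s_xx = sum((x - mean_x) ** 2 for x in experience)
m = s_xy / s_xx

# Intercept: the fitted line always passes through (mean_x, mean_y).
c = mean_y - m * mean_x

# Estimate the salary for 9 years of experience.
predicted = m * 9 + c

print(f"m = {m:.2f}")                  # m = 6785.71
print(f"c = {c:.2f}")                  # c = 16964.29
print(f"predicted = {predicted:.2f}")  # predicted = 78035.71
```

The library-based steps that follow should arrive at the same line, so these numbers are a useful cross-check.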
🪜 Step-by-Step Algorithm
Step 1️⃣ Import Required Libraries
Import the necessary libraries.
import pandas as pd
from sklearn.linear_model import LinearRegression
Explanation
- pandas → Used to create and manage datasets.
- LinearRegression → Imports the Linear Regression model from scikit-learn.
Step 2️⃣ Create the Dataset
data = {
    "Experience": [1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000]
}
df = pd.DataFrame(data)
Explanation
We create a simple dataset using a Python dictionary.
The dataset has two columns:
- Experience → Independent Variable (X)
- Salary → Dependent Variable (Y)
The data is converted into a DataFrame for easy processing.
Step 3️⃣ Display the Dataset
print(df)
Output
   Experience  Salary
0           1   25000
1           2   30000
2           3   35000
3           4   45000
4           5   50000
5           6   60000
6           7   65000
7           8   70000
Step 4️⃣ Separate Input and Output Variables
X = df[["Experience"]]
y = df["Salary"]
Explanation
Machine Learning models require:
- X → Input Features (Independent Variable)
- y → Target Variable (Dependent Variable)
Here:
X = Experience
y = Salary
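The double brackets in `df[["Experience"]]` matter: scikit-learn expects the features as a two-dimensional table (rows × columns), while the target can be one-dimensional. A small sketch to check the shapes, rebuilding the same dataset so the block runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "Experience": [1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000],
})

X = df[["Experience"]]  # double brackets -> DataFrame, 2-D
y = df["Salary"]        # single brackets -> Series, 1-D

print(type(X).__name__, X.shape)  # DataFrame (8, 1)
print(type(y).__name__, y.shape)  # Series (8,)
```

Writing `df["Experience"]` for X would give a one-dimensional Series, which `fit()` rejects for the feature matrix.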
Step 5️⃣ Create the Linear Regression Model
model = LinearRegression()
Explanation
This creates an untrained Linear Regression model with default settings.
At this stage, the model has not learned anything from the data.
Step 6️⃣ Train the Model
model.fit(X, y)
Explanation
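The fit() method trains the model on X and y.
During training, the model:
- Reads all the training data
- Calculates the best-fit line by minimizing the sum of squared differences between the actual and predicted salaries
- Finds the slope (m)
- Finds the intercept (c)
The model is now ready for prediction.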
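After training, the learned slope and intercept can be read from the model's `coef_` and `intercept_` attributes. A minimal sketch, repeating the dataset so the block is self-contained (the names `m` and `c` simply mirror the equation's symbols):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "Experience": [1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000],
})

# fit() returns the fitted model, so creation and training can be chained.
model = LinearRegression().fit(df[["Experience"]], df["Salary"])

m = model.coef_[0]    # one coefficient per feature; here only "Experience"
c = model.intercept_  # value of the line at Experience = 0

print(f"slope m = {m:.2f}")      # slope m = 6785.71
print(f"intercept c = {c:.2f}")  # intercept c = 16964.29
```

Read this way, each additional year of experience adds roughly ₹6,786 to the estimated salary for this dataset.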
Step 7️⃣ Predict Salary
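# Use the same column name as in training so scikit-learn does not warn about feature names
experience = pd.DataFrame({"Experience": [9]})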
prediction = model.predict(experience)
Explanation
We ask the model:
"Predict the salary of an employee with 9 years of experience."
The predict() function uses the learned line to estimate the salary.
Step 8️⃣ Display the Prediction
print("Predicted Salary =", prediction[0])
Sample Output
Predicted Salary = 78035.71
(Shown rounded to two decimal places; the unrounded value printed by the code has more digits.)
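predict() accepts several inputs at once, and each result is simply m × X + c. The sketch below, which rebuilds the model so it runs on its own, predicts for 9 and 10 years and checks the first value against the equation. It passes a DataFrame with the training column name, which avoids the feature-name warning that recent scikit-learn versions emit for plain lists.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "Experience": [1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000],
})
model = LinearRegression().fit(df[["Experience"]], df["Salary"])

# One row per employee to predict for.
new_data = pd.DataFrame({"Experience": [9, 10]})
predictions = model.predict(new_data)

# The same value computed directly from Y = mX + c.
manual = model.coef_[0] * 9 + model.intercept_

for years, value in zip(new_data["Experience"], predictions):
    print(f"{years} years -> {value:.2f}")
print(f"manual check for 9 years -> {manual:.2f}")
```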
💻 Complete Python Program
# Step 1: Import Libraries
import pandas as pd
from sklearn.linear_model import LinearRegression
# Step 2: Create Dataset
data = {
    "Experience": [1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000]
}
df = pd.DataFrame(data)
# Step 3: Display Dataset
print("Dataset:")
print(df)
# Step 4: Separate Input and Output
X = df[["Experience"]]
y = df["Salary"]
# Step 5: Create Model
model = LinearRegression()
# Step 6: Train Model
model.fit(X, y)
# Step 7: Predict Salary (same column name as in training)
experience = pd.DataFrame({"Experience": [9]})
prediction = model.predict(experience)
# Step 8: Display Result
print("\nPredicted Salary for 9 years experience = ₹", round(prediction[0], 2))
🔄 Workflow
Start
│
▼
Import Libraries
│
▼
Create Dataset
│
▼
Display Dataset
│
▼
Separate X and y
│
▼
Create Linear Regression Model
│
▼
Train Model using fit()
│
▼
Predict using predict()
│
▼
Display Prediction
│
▼
End
📌 Explanation of Important Functions
| Function | Purpose |
|---|---|
| pd.DataFrame() | Creates a table from data |
| LinearRegression() | Creates the regression model |
| fit(X, y) | Trains the model using the dataset |
| predict() | Predicts the output for new input |
✅ Advantages
- Simple and easy to implement
- Fast training and prediction
- Easy to interpret results
- Works well for linear relationships
❌ Limitations
- Only models linear relationships
- Sensitive to outliers
- Performance decreases if data is non-linear
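The sensitivity to outliers is easy to see numerically. The sketch below refits the line after replacing the last salary with a made-up extreme value (₹2,00,000, a hypothetical figure chosen only for illustration) and compares the slopes. `fit_line` is an illustrative helper, not a library function; it applies the ordinary least squares formulas in plain Python.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


experience = [1, 2, 3, 4, 5, 6, 7, 8]
salary = [25000, 30000, 35000, 45000, 50000, 60000, 65000, 70000]

# Same data, but the last salary is replaced by a hypothetical outlier.
salary_with_outlier = salary[:-1] + [200000]

slope_clean, _ = fit_line(experience, salary)
slope_outlier, _ = fit_line(experience, salary_with_outlier)

print(f"slope without outlier: {slope_clean:.2f}")
print(f"slope with outlier:    {slope_outlier:.2f}")
```

A single unusual point more than doubles the slope here, because squared errors give large deviations a disproportionate influence on the fit.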
🌍 Applications
- Salary Prediction
- House Price Prediction
- Sales Forecasting
- Stock Trend Analysis
- Weather Forecasting
- Business Revenue Prediction
⭐ Memory Trick
Import Libraries
↓
Create Dataset
↓
Separate X and y
↓
Create Model
↓
Train using fit()
↓
Predict using predict()
↓
Display Result
Easy Formula to Remember:
Import → Data → X & y → Model → Fit → Predict → Output
🎓 Viva Questions
- What is Linear Regression?
- Why is it called a supervised learning algorithm?
- What are the independent and dependent variables?
- What is the purpose of fit()?
- What is the purpose of predict()?
- What is the equation of a regression line?
- What is the role of X and y?
- Give two real-life applications of Linear Regression.