Salary Prediction Using Liner Regression

Link to Github repo

This beginner-level machine learning project aims to predict an employee’s salary based on their years of experience using Simple Linear Regression. The goal is to build an interpretable model, visualize the relationship between experience and salary, and make predictions for new data points.

Context

Imagine you are working with an HR analytics team at a hiring consultancy firm. The firm has collected historical data of employees including their years of experience and the corresponding salaries. The management wants to use this data to predict the expected salary for a new candidate based on their work experience. This would help in making fair salary offers and in budgeting for hiring plans.

Data Cleaning image

Why is this important?

  • - Helps HR teams make data-driven salary decisions based on historical trends.
  • - Assists candidates and hiring managers in setting realistic salary expectations.
  • - Builds foundational understanding of regression analysis and model interpretability.

Packages Used

  • - Numpy
  • - Pandas
  • - Matplotlib
  • - SciKitlearn

Steps Taken

  • - Loading & Exploring Dataset
  • - visualizing the Data
  • - Split Data into Train & Test
  • - Train Linear Regression
  • - Predicting on Test Data
  • - Visulaizing Regression Line on Salary vs Experience
  • - Evaluate the Model

Conclusion

In this project, we worked with a simulated dataset resembling historical employee records, where the goal was to predict salaries based on years of experience. Imagine this scenario within an HR analytics team at a hiring consultancy firm — the firm needs a simple and effective tool to estimate salary expectations for new candidates using existing employee data.

We built a linear regression model that learned the relationship between experience (in months) and salary (in thousands). After training and evaluating the model on 1,000 records, we achieved an R² score of 0.6264, indicating the model could explain over 62% of the variation in salary using just one feature. With an average prediction error of about 4,000, the model provides a solid baseline for salary estimation.

This project demonstrates how even a simple regression model can be used by HR teams to support fair and data-driven salary offers, improve budgeting accuracy, and streamline hiring processes.



Project Image Gallery

Project
Project
Project
Project