###### Daily Dose 015 | Probability and Statistics

#### How do you define a Least Squares Linear Regression line?

I know that for me, in college, being asked to find the Least Squares Linear Regression line of a data set often left me scratching my head.

I didn’t really get the need for the tables and strict formulas; I was more comfortable in the realm of derivations and hard science.

But I was wrong: much of what we do as engineers in the real world couldn’t be done without relying on the discipline of ENGINEERING PROBABILITY & STATISTICS. We use those tables and strict formulas to make better decisions. We use them to strengthen our analysis, test our data, and assess risk.

We use them to design and manufacture robust products, detect problems and understand how variations affect performance.

All this to say, there’s a reason we as engineers must be comfortable in this realm.

In this lesson, we jump into a problem that is covered in the subject of ENGINEERING PROBABILITY & STATISTICS. Specifically, we will be learning how to define a LEAST SQUARES LINEAR REGRESSION LINE.

###### Key Definition

## What is LEAST SQUARES LINEAR REGRESSION?

The method of LEAST SQUARES is a standard approach in REGRESSION ANALYSIS, or LINEAR REGRESSION ANALYSIS, used to approximate the solution of sets of equations in which there are more equations than unknowns.

LEAST SQUARES means that the overall solution minimizes the sum of the squares of the residuals made in the results of every single equation.
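To make that definition concrete, here is a small illustration (a hypothetical sketch with made-up data, not from the lesson itself): it compares the sum of squared residuals of the least-squares line against an arbitrary alternative line, showing that the least-squares fit produces the smaller sum.

```python
# Hypothetical data set, roughly following y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

def ssr(m, b):
    """Sum of squared residuals for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Least-squares slope and intercept, using the standard formulas
n = len(xs)
m = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / (
    n * sum(x * x for x in xs) - sum(xs) ** 2
)
b = (sum(ys) - m * sum(xs)) / n

print(ssr(m, b))      # the minimum sum of squared residuals
print(ssr(2.5, 0.0))  # any other line gives a larger sum
```

No other choice of slope and intercept can beat the first number; that minimization is exactly what "least squares" means.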

###### The Process

## What is the least squares process?

As engineers, we need some way to predict the future as best we can, based on what we know from the past.

This is where linear regression comes in: a statistical method that allows us to find relationships between variable data.

Imagine that you have two variables, one that you can control (independent) and one that you observe (dependent).

Now, imagine fitting a straight line that best represents their relationship.

This isn’t any ordinary line – it’s the least squares linear regression line.

It’s crafted with precision to minimize the sum of the squares of vertical distances (errors) from each data point to the line.

Using this line, we will be better equipped to predict how certain independent variables will perform in real world applications.

Being asked to define a least squares linear regression line may seem complicated at first, but it doesn’t need to be.

There is a process that can be broken down and carried out in even the most complex scenarios. Let’s run through the steps:

**Gather Data:**

Read through the problem statement and collect all the paired data points.

This data may be presented in a number of different ways:

- A table of values
- A scatter plot
- Simple words

The goal is to determine the independent and dependent variables and the data that is presented for each.

**Understand the Goal:**

The aim is to find a straight line, represented by `y` = `m` * `x` + `b`, where:

- `y` = predicted value
- `m` = slope of the line
- `x` = independent variable’s value
- `b` = y-intercept

**Calculate the Slope (m):**

Using the formula:

`m` = (`n` * `Σxy` – `Σx` * `Σy`) / (`n` * `Σx²` – (`Σx`)²)

Where:

- `n` = number of data points
- `Σxy` = sum of the product of paired data points
- `Σx` and `Σy` = sum of x values and y values respectively
- `Σx²` = sum of the squared x values

**Calculate the Y-intercept (b):**

Using the formula:

`b` = (`Σy` – `m` * `Σx`) / `n`

**Formulate the Regression Line:**

With `m` and `b` calculated, the least squares regression line is:

`y` = `m` * `x` + `b`

**Evaluate Goodness of Fit:**

Use the coefficient of determination, `r²`, to measure how well the line fits the data.

A value close to 1 indicates a good fit.

**Apply and Predict:**

With the regression line defined, you can predict the dependent variable’s value for any given independent variable.
And with all of that being stated, check out the video and see how we can go about solving this type of problem in the most efficient manner.

As always, we are here to help, Prepineer