Please read the Introduction to Linear Regression before continuing to read this Polynomial Regression article.

In brief, *regression in Machine Learning (ML)* is a type of *supervised learning algorithm* used to predict continuous numerical values from input data. It models the relationship between independent variables (the features in the training data) and a dependent variable (the target in the training data) by fitting a function to the data. The predicted values are continuous, meaning they can take any numerical value rather than one of a fixed set of categories.

Examples of regression algorithms include *Linear Regression*, *Polynomial Regression*, and *Decision Tree Regression*. Linear regression is the most basic of these and the easiest to understand because it is just *a simple math formula that draws a straight line* to predict the output (y) from the input (X). X can consist of a single value (a single feature, or 'Simple Linear Regression') or multiple values (multiple features, or 'Multiple Linear Regression').

The Simple Linear Regression formula is:

y = (X * coefficient) + intercept
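
As a minimal sketch of how 'coefficient' and 'intercept' can be found for one feature, the example below uses the ordinary least-squares formulas in plain Python. The function names `fit_simple_linear` and `predict` and the data points are made up for illustration.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = (x * coefficient) + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # coefficient (slope) = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    coefficient = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - coefficient * mean_x
    return coefficient, intercept


def predict(x, coefficient, intercept):
    """Apply the Simple Linear Regression formula to one input value."""
    return (x * coefficient) + intercept


# Made-up data that lies exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

coefficient, intercept = fit_simple_linear(xs, ys)
print("coefficient:", coefficient, "intercept:", intercept)
print("prediction for x = 5:", predict(5.0, coefficient, intercept))
```

Because the sample points sit exactly on a line, the fit recovers that line; with noisy real data the result is the line with the smallest squared error.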

The Multiple Linear Regression formula, where each feature X_{i} has its own coefficient, is:

y = (X_{1} * coef_{1}) + (X_{2} * coef_{2}) + (X_{3} * coef_{3}) + ... + (X_{n} * coef_{n}) + intercept
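
With several features, the sum of feature-times-coefficient terms is a dot product. The sketch below assumes NumPy is available and uses `numpy.linalg.lstsq` to fit the coefficients. The data, the "true" coefficients (3 and -2), and the intercept (10) are invented for illustration.

```python
import numpy as np

# Made-up data: 5 samples, 2 features (X_1, X_2), generated from
# y = (X_1 * 3) + (X_2 * -2) + 10
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 5.0],
    [4.0, 3.0],
    [5.0, 8.0],
])
y = X @ np.array([3.0, -2.0]) + 10.0

# Append a column of ones so the intercept is fitted as one more coefficient
X_with_ones = np.column_stack([X, np.ones(len(X))])
solution, *_ = np.linalg.lstsq(X_with_ones, y, rcond=None)
coefs, intercept = solution[:-1], solution[-1]


def predict(features, coefs, intercept):
    """Sum of (feature * coefficient) over all features, plus the intercept."""
    return features @ coefs + intercept


print("coefficients:", coefs, "intercept:", intercept)
print("prediction for (6, 4):", predict(np.array([6.0, 4.0]), coefs, intercept))
```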

Polynomial regression is used *when the relationship between the independent variable and the dependent variable is not linear*. Real data often cannot be modeled adequately by a straight line. Polynomial regression is a special case of Multiple Linear Regression in which the powers of a feature are treated as separate features. *One example of the formula*, for a single feature X and degree n:

y = (X^{1} * coef_{1}) + (X^{2} * coef_{2}) + (X^{3} * coef_{3}) + ... + (X^{n} * coef_{n}) + intercept
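
To illustrate why polynomial regression counts as a special case of Multiple Linear Regression, the sketch below builds one column per power of a single feature (X, X squared, and so on) and then fits an ordinary linear model on those columns. It assumes NumPy is available. The data and the chosen `degree` of 2 are made up for illustration.

```python
import numpy as np

degree = 2  # highest power of X used as a feature (an illustrative choice)

# Made-up single-feature data generated from y = (X * -2) + (X^2 * 3) + 5
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = (x * -2.0) + (x ** 2 * 3.0) + 5.0

# One column per power: X^1, X^2, ..., X^degree, plus ones for the intercept
columns = [x ** p for p in range(1, degree + 1)]
design = np.column_stack(columns + [np.ones_like(x)])

# From here on it is a plain linear least-squares fit on the new columns
solution, *_ = np.linalg.lstsq(design, y, rcond=None)
coefs, intercept = solution[:-1], solution[-1]


def predict(x_new, coefs, intercept):
    """Evaluate the fitted polynomial at a single input value."""
    powers = np.array([x_new ** p for p in range(1, len(coefs) + 1)])
    return powers @ coefs + intercept


print("coefficients:", coefs, "intercept:", intercept)
print("prediction for x = 4:", predict(4.0, coefs, intercept))
```

The model stays linear in its coefficients; only the input columns are non-linear in X, which is why the same fitting routine works.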

Describing the math formula in depth is outside the scope of this article, but a graph can help us visualize and understand how Polynomial Regression can fit better than Linear Regression.

Note: in the graph, the 'intercept' is the value of 'y' at which the line crosses the vertical axis (the y-axis, where X = 0).

Example scenarios for AI prediction using a regression algorithm:

- A single feature, **body weight**, to predict a *person's shoe size*.
- Multiple features, **city (location)** and **house size**, to predict *house price*.
- Multiple features, **car brand**, **year (age)** and **total mileage**, to predict *used-car price*.
- Multiple features, **total years of work experience**, **Python language skill score**, **C/C++ language skill score** and **JavaScript skill score**, to predict *software developer salary*.

A simple Polynomial Regression demo using *Python code in a Jupyter Notebook* is available on GitHub: Polynomial Regression

Some benefits of using Polynomial regression instead of Linear regression:

- **Capturing Non-Linear Relationships:** Linear regression assumes a linear relationship between variables, but real-world relationships can be more complex. Polynomial regression can capture non-linear patterns in the data, providing a better fit.
- **Flexibility:** Polynomial regression allows a more flexible curve to be fitted to the data. By introducing higher-degree polynomial terms, the model can accommodate more complex relationships between variables.
- **Improved Accuracy:** When the relationship between variables is non-linear, polynomial regression can provide more accurate predictions than linear regression.
- **Better Model Interpretation:** Although polynomial regression introduces complexity, it can also provide insight into the curvature and shape of the relationship between variables, which may be valuable for interpretation.

Disadvantages of using polynomial regression instead of linear regression:

- **Overfitting:** the model fits the training data too closely and predicts poorly on real-world data, especially with higher-degree polynomial terms.
- **More complex:** choosing a 'degree' for the polynomial terms that generalizes well takes more care.
  - When creating a Polynomial Regression model, we need to define an integer value for 'degree'.
  - In some cases a smaller 'degree' value may underfit, while a higher 'degree' value may overfit.
  - Tip: *scikit-learn* has a function *r2_score()* that can be used to compare which 'degree' value suits the data. The best possible score is 1.0, and a higher value (closer to 1.0) means a more accurate fit. The score usually falls between 0.0 and 1.0, but it can be negative when the model predicts worse than always predicting the mean of the target. Compute the score on data that was not used for training, otherwise a higher 'degree' will almost always look better.
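
One possible way to compare 'degree' values with `r2_score()` is sketched below. It assumes scikit-learn and NumPy are installed and uses scikit-learn's `PolynomialFeatures`, `LinearRegression`, `make_pipeline` and `train_test_split`. The noisy cubic data, the 70/30 split and the range of degrees tried are assumptions chosen only for illustration, not taken from the demo notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up noisy data that follows a cubic curve
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(0, 1.0, size=200)

# Hold out 30% of the data so each degree is judged on unseen samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for degree in range(1, 7):
    # PolynomialFeatures creates the power columns; LinearRegression fits them
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    # r2_score takes the true values first, then the predictions
    scores[degree] = r2_score(y_test, model.predict(X_test))

for degree, score in scores.items():
    print(f"degree={degree}  test R^2={score:.3f}")

best_degree = max(scores, key=scores.get)
print("best degree on the test split:", best_degree)
```

On data like this, the straight line (degree 1) should score clearly lower than degree 3. Scores for higher degrees tend to level off or drop, which is the sign to stop increasing 'degree'.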