Jason's Notebook

❯

❯

Regression Analysis

❯

Training Regression Models

Training Regression Models

Jan 12, 20261 min read

1. Splitting the Data

Dataset is divided into 2 or more subsets
- Training set (70% - 80%)
  - used to fit the regression model by learning relationships between independent and dependent variables
- Testing Set (20% - 30%)
  - used to evaluate the model’s generalization performance on unseen data
  - helps assess whether the model is overfitting or underfitting

2. Model Training

trained using training dataset by minimizing a loss function
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
steps:
1. Selecting a regression algorithm (Linear Regression, Random Forest Regression, Support Vector Regression)
2. Determining the best-fit parameters through methods like
  - gradient descent (linear models)
  - tree-based optimization (decision trees)
3. Applying regularization techniques like these to prevent overfitting
  - Ridge (L2)
  - Lasso (L1)

3. Model Testing

model is tested on the test database to measure its performance
- ensures the model can make accurate predictions on unseen data
Helps Identify:
- Overfitting:
  - when the model performs well on training data but poorly on testing data
- Underfitting:
  - when the model fails to capture important patterns
  - leading to poor performance on both training and test data

Graph View

1. Splitting the Data
2. Model Training
3. Model Testing

Backlinks

Regression Analysis

Created with Quartz v4.5.1 © 2026

© Copyright Jason Wang

GitHub
Jason's Portfolio Website