L1 & L2

L1 & L2 regularization

L1 (Lasso) and L2 (Ridge) regularization are used to address overfitting by adding a penalty on large weights to the model's loss.

L1 Regularization (Lasso): Adds the sum of the absolute values of all weights in the model to the loss function.
- Encourages sparsity: some weights can be driven exactly to zero, effectively removing unimportant features.
- Preferred when feature selection is desired.
L2 Regularization (Ridge): Adds the sum of the squared values of all weights to the loss function.
- Discourages excessively large weights but generally doesn't drive them to zero, resulting in a smoother weight distribution.
- The penalty is differentiable everywhere, so optimization tends to be more stable and efficient; it is a common default choice.
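
One way to see why the two penalties behave differently is to look at their gradients. The sketch below uses a small hand-picked weight vector `w` (an illustrative assumption, not taken from any real model) and asks autograd for the gradient of each penalty.

```python
import torch

# Hypothetical weight vector chosen only for illustration.
w = torch.tensor([-2.0, -0.01, 0.5, 3.0], requires_grad=True)

l1_penalty = w.abs().sum()      # sum of |w_i|
l2_penalty = w.square().sum()   # sum of w_i^2

(l1_grad,) = torch.autograd.grad(l1_penalty, w)
(l2_grad,) = torch.autograd.grad(l2_penalty, w)

print(l1_grad)  # equals sign(w): same magnitude for small and large weights
print(l2_grad)  # equals 2 * w: shrinks toward zero as the weight shrinks
```

The L1 gradient keeps pushing a small weight such as `-0.01` toward zero with full strength, which is what makes exact zeros likely. The L2 gradient fades as the weight gets smaller, so weights shrink but rarely reach zero.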

$\text{Loss} = \text{loss} + \text{Regularization}$
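
Written out for both penalties, the regularized loss could look like the following. The symbols $w_i$ (individual weights) and $\lambda_1, \lambda_2$ (non-negative penalty strengths) are notation assumed here for illustration; setting one of them to zero recovers pure L2 or pure L1 regularization.

$$
\text{Loss} = \text{loss} + \lambda_1 \sum_i |w_i| + \lambda_2 \sum_i w_i^2
$$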

Implementing in PyTorch

Using `weight_decay`

Some of PyTorch's optimizers, such as SGD and Adam, implement L2 regularization internally; `weight_decay` sets the strength of the L2 penalty.
A higher value of weight_decay leads to stronger regularization.

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
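
To make the link between `weight_decay` and an explicit L2 penalty concrete, the sketch below takes one SGD step in two ways and compares the resulting parameters. The toy `nn.Linear` model, the random data, and the MSE loss are assumptions made only for this check. `torch.optim.SGD` adds `weight_decay * param` to each gradient, which matches an explicit penalty of `0.5 * weight_decay * sum(param ** 2)`.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)
model_a = nn.Linear(3, 1)            # hypothetical toy model
model_b = copy.deepcopy(model_a)     # identical starting weights
x, y = torch.randn(8, 3), torch.randn(8, 1)
lr, wd = 0.01, 0.001

# (a) Built-in weight decay handled by the optimizer.
opt_a = torch.optim.SGD(model_a.parameters(), lr=lr, weight_decay=wd)
opt_a.zero_grad()
nn.functional.mse_loss(model_a(x), y).backward()
opt_a.step()

# (b) No weight decay; the L2 penalty is added to the loss by hand.
opt_b = torch.optim.SGD(model_b.parameters(), lr=lr)
opt_b.zero_grad()
penalty = sum(p.square().sum() for p in model_b.parameters())
(nn.functional.mse_loss(model_b(x), y) + 0.5 * wd * penalty).backward()
opt_b.step()

# Both routes should give the same parameters up to floating-point error.
same = all(
    torch.allclose(pa, pb, atol=1e-6)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same)
```

Note the factor of 0.5: a given `weight_decay` value corresponds to half that value as the coefficient on the squared weights. This comparison is for plain SGD. With adaptive optimizers the two approaches can differ, and `torch.optim.AdamW` applies weight decay decoupled from the gradient.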

Manual

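import torch
from torch import nn

# Assumption: the original leaves loss_function undefined; MSE is a placeholder.
loss_function = nn.MSELoss()

def L1_regularization(model):
    # Sum of the absolute values of all parameters
    l1_reg = 0.
    for param in model.parameters():
        l1_reg += torch.sum(torch.abs(param))
    return l1_reg

def L2_regularization(model):
    # Sum of the squared values of all parameters
    l2_reg = 0.
    for param in model.parameters():
        l2_reg += torch.sum(torch.square(param))
    return l2_reg

def loss(model, pred, target, l1_weight=0., l2_weight=0.):
    # Data loss plus the weighted L1 and L2 penalties
    base_loss = loss_function(pred, target)
    l1_penalty = L1_regularization(model) * l1_weight
    l2_penalty = L2_regularization(model) * l2_weight
    return base_loss + l1_penalty + l2_penalty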
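
A manually regularized loss of this kind might be used in a single training step as sketched below. The toy model, random data, `nn.MSELoss` criterion, and the helper names `l1_penalty` / `l2_penalty` are all assumptions introduced for illustration.

```python
import torch
from torch import nn

def l1_penalty(model):
    # Sum of absolute values over every parameter tensor.
    return sum(p.abs().sum() for p in model.parameters())

def l2_penalty(model):
    # Sum of squared values over every parameter tensor.
    return sum(p.square().sum() for p in model.parameters())

torch.manual_seed(0)
model = nn.Linear(4, 1)                      # hypothetical toy model
criterion = nn.MSELoss()                     # assumed data loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(16, 4), torch.randn(16, 1)

l1_weight, l2_weight = 1e-3, 1e-3            # illustrative strengths

optimizer.zero_grad()
data_loss = criterion(model(x), y)
total_loss = (
    data_loss
    + l1_weight * l1_penalty(model)
    + l2_weight * l2_penalty(model)
)
total_loss.backward()                        # gradients include both penalties
optimizer.step()
```

Because the penalties are part of the loss tensor, autograd handles their gradients automatically; `weight_decay` should then be left at its default of zero to avoid applying the L2 penalty twice.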
