PyTorch Tutorial

0. Foreword

The copyright belongs to the original author, and I am only using it for learning and sharing purposes.

Here are the original author’s links and related resources:

Video: https://youtube.com/playlist?list=PLJV_el3uVTsOePyfmkfivYZ7Rqr2nMk3W

Course Home: https://speech.ee.ntu.edu.tw/~hylee/ml/2023-spring.php

Github: https://github.com/Fafa-DL/Lhy_Machine_Learning

PyTorch Official Document: https://pytorch.org/docs/stable/index.html

1. Background: Prerequisites & What is PyTorch?

2. Training

2.1 Dataset & DataLoader

Dataset: stores data samples and expected values.
DataLoader: groups data into batches and enables multiprocessing.

dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size, shuffle=True)

Customize MyDataset:

from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, file):
        # Read data & preprocess
        self.data = ...

    def __getitem__(self, index):
        # Returns one sample at a time
        return self.data[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.data)
dataset = MyDataset(file)

dataloader = DataLoader(dataset, batch_size=5, shuffle=False)
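
As a quick sanity check, you can iterate over the DataLoader to see how samples are grouped into batches. A minimal sketch, assuming each sample returned by __getitem__ is a tensor of the same shape:

for batch in dataloader:   # each iteration yields one batch of up to 5 samples
    print(batch.shape)     # e.g. torch.Size([5, ...]) for a full batch
    break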

3. Tensors

3.1 Tensors

  • High-dimensional matrices (arrays)

3.2 Shape of Tensors

  • Check with .shape

Note: dim in PyTorch == axis in NumPy
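
For example, summing over the first dimension uses dim in PyTorch and axis in NumPy. A small sketch:

import numpy as np
import torch

a = np.ones((2, 3))
t = torch.ones(2, 3)
print(a.sum(axis=0))  # NumPy: array([2., 2., 2.])
print(t.sum(dim=0))   # PyTorch: tensor([2., 2., 2.])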

3.3 Creating Tensors

  • Directly from data (list or numpy.ndarray)
x = torch.tensor([[1, -1], [-1, 1]])
x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))
  • Tensor of constant zeros & ones
x = torch.zeros([2, 2])
x = torch.ones([1, 2, 5])

3.4 Common Operations

Common arithmetic functions are supported, such as:

  • Addition

    z = x + y
  • Subtraction

    z = x - y
  • Power

    y = x.pow(2)
  • Summation

    y = x.sum()
  • Mean

    y = x.mean()
  • Transpose: transpose two specified dimensions

    >>> x = torch.zeros([2, 3])
    >>> x.shape
    torch.Size([2, 3])
    >>> x = x.transpose(0, 1)
    >>> x.shape
    torch.Size([3, 2])

  • Squeeze: remove the specified dimension with length = 1
    >>> x = torch.zeros([1, 2, 3])
    >>> x.shape
    torch.Size([1, 2, 3])
    >>> x = x.squeeze(0)
    >>> x.shape
    torch.Size([2, 3])

Tip: the 0 in x.squeeze(0) specifies which dimension to remove.

  • Unsqueeze: add a new dimension of length 1
    >>> x = torch.zeros([2, 3])
    >>> x.shape
    torch.Size([2, 3])
    >>> x = x.unsqueeze(1)
    >>> x.shape
    torch.Size([2, 1, 3])

Tip: the 1 in x.unsqueeze(1) specifies where the new dimension is inserted.

  • Cat: concatenate multiple tensors
    >>> x = torch.zeros([2, 1, 3])
    >>> y = torch.zeros([2, 3, 3])
    >>> z = torch.zeros([2, 2, 3])
    >>> w = torch.cat([x, y, z], dim=1)
    >>> w.shape
    torch.Size([2, 6, 3])

A common way to initialize parameter values:

def normal(shape):
    # assumes a `device` variable, e.g. 'cpu' or 'cuda'
    return torch.randn(size=shape, device=device) * 0.01

Note: scaling by 0.01 keeps the initial values small, reducing their variance.

3.5 Data Type

  • Using different data types for model and data will cause errors.
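
For example, model parameters are float32 by default, while tensors created from NumPy arrays are often float64; casting the data with .float() avoids the mismatch. A minimal sketch:

import numpy as np
import torch

layer = torch.nn.Linear(2, 1)          # parameters are float32 by default
x = torch.from_numpy(np.ones((1, 2)))  # float64, since NumPy defaults to double
# layer(x) would raise a dtype error here
y = layer(x.float())                   # cast to float32 before the forward pass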

3.6 PyTorch v.s. NumPy

  • Similar attributes

  • Many functions have the same names as well
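
A couple of parallels, as a small sketch:

import numpy as np
import torch

a = np.zeros((2, 3))
t = torch.zeros(2, 3)
print(a.shape, t.shape)       # both expose .shape
print(a.reshape(3, 2).shape)  # NumPy reshape
print(t.reshape(3, 2).shape)  # PyTorch uses the same method name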

3.7 Device

  • Tensors & modules are computed on the CPU by default

Use .to() to move tensors to the appropriate device.

  • CPU

    x = x.to('cpu')
  • GPU

    x = x.to('cuda')

3.8 Device(GPU)

  • Check if your computer has an NVIDIA GPU

    torch.cuda.is_available()
  • Multiple GPUs: specify ‘cuda:0’, ‘cuda:1’, ‘cuda:2’, …
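
A common idiom is to pick the device once, based on availability, and reuse it. A minimal sketch:

import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.zeros(2, 3).to(device)  # moved to the GPU if one is available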

3.9 Gradient Calculation

>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward()
>>> x.grad
tensor([[ 2.,  0.],
        [-2.,  2.]])

4. torch.nn: Models, Loss Functions

4.1 Network Layers

  • Linear Layer(Fully-connected Layer)
    nn.Linear(in_features, out_features)
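
For instance, nn.Linear(32, 64) maps the last dimension of its input from 32 features to 64. A small sketch:

import torch

layer = torch.nn.Linear(32, 64)
x = torch.randn(8, 32)  # a batch of 8 samples with 32 features each
print(layer(x).shape)   # torch.Size([8, 64])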



4.2 Network Parameters

>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape
torch.Size([64, 32])
>>> layer.bias.shape
torch.Size([64])

4.3 Non-Linear Activation Functions

  • Sigmoid Activation
    nn.Sigmoid()
m = torch.nn.Sigmoid()
x1 = torch.arange(-10, 10 + 1, 0.1)
y1 = m(x1)

  • ReLU Activation
    nn.ReLU()
m = torch.nn.ReLU()
x1 = torch.arange(-10, 10 + 1, 0.1)
y1 = m(x1)

4.4 Build your own neural network

import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        """
        Initialize your model & define layers
        """
        super(MyModel, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        """
        Compute output of your NN
        """
        return self.net(x)

The following, more explicit definition has the same effect:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out

4.5 Loss Functions

  • Mean Squared Error (for regression tasks)

    criterion = nn.MSELoss()
  • Cross Entropy (for classification tasks)

    criterion = nn.CrossEntropyLoss()
  • loss = criterion(model_output, expected_value)
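
As a quick illustration, nn.CrossEntropyLoss takes raw logits and integer class labels. A minimal sketch:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 3)           # model output: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])  # expected values: class indices
loss = criterion(logits, labels)
print(loss.item())                   # a single scalar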

5. torch.optim: Optimization

  • Gradient-based optimization algorithms that adjust network parameters to reduce error.

  • E.g. Stochastic Gradient Descent (SGD)

optimizer = torch.optim.SGD(model.parameters(), lr, momentum=0)
  • For every batch of data:
  1. Call optimizer.zero_grad() to reset gradients of model parameters.
  2. Call loss.backward() to backpropagate gradients of prediction loss.
  3. Call optimizer.step() to adjust model parameters.

6. Entire Procedure

6.1 Neural Network Training Setup

dataset = MyDataset(file)                             # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)        # put dataset into DataLoader
model = MyModel().to(device)                          # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss()                              # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)  # set optimizer

6.2 Neural Network Training Loop

for epoch in range(n_epochs):              # iterate n_epochs
    model.train()                          # set model to train mode
    for x, y in tr_set:                    # iterate through the dataloader
        optimizer.zero_grad()              # set gradient to zero
        x, y = x.to(device), y.to(device)  # move data to device (cpu/cuda)
        pred = model(x)                    # forward pass (compute output)
        loss = criterion(pred, y)          # compute loss
        loss.backward()                    # compute gradient (backpropagation)
        optimizer.step()                   # update model with optimizer

6.3 Neural Network Validation Loop

model.eval()                                  # set model to evaluation mode
total_loss = 0
for x, y in dv_set:                           # iterate through the dataloader
    x, y = x.to(device), y.to(device)         # move data to device (cpu/cuda)
    with torch.no_grad():                     # disable gradient calculation
        pred = model(x)                       # forward pass (compute output)
        loss = criterion(pred, y)             # compute loss
    total_loss += loss.cpu().item() * len(x)  # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)   # compute averaged loss

6.4 Neural Network Testing Loop

model.eval()                      # set model to evaluation mode
preds = []
for x in tt_set:                  # iterate through the dataloader
    x = x.to(device)              # move data to device (cpu/cuda)
    with torch.no_grad():         # disable gradient calculation
        pred = model(x)           # forward pass (compute output)
        preds.append(pred.cpu())  # collect prediction

7. Save/load models

  • Save

    torch.save(model.state_dict(), path)
  • Load

    ckpt = torch.load(path)
    model.load_state_dict(ckpt)
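
Note that load_state_dict restores parameters into an existing model instance, so construct the model first. A minimal sketch:

model = MyModel()            # same architecture as when saving
ckpt = torch.load(path)
model.load_state_dict(ckpt)
model.eval()                 # switch to evaluation mode before inference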

8. More About PyTorch

  • Useful github repositories using PyTorch
