PyTorch Tutorial
0. Foreword
The copyright belongs to the original author; these notes are shared only for learning purposes.
Here are the author's original address and related links.
Video: https://youtube.com/playlist?list=PLJV_el3uVTsOePyfmkfivYZ7Rqr2nMk3W
Course Home: https://speech.ee.ntu.edu.tw/~hylee/ml/2023-spring.php
Github: https://github.com/Fafa-DL/Lhy_Machine_Learning
PyTorch Official Document: https://pytorch.org/docs/stable/index.html
1. Background: Prerequisites & What is PyTorch?
2. Dataset & Dataloader
Dataset: stores data samples and their expected values (labels).
DataLoader: groups data into batches and enables multiprocessing.
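The original code block was not preserved; here is a minimal sketch of how the two are typically wired together (the batch size is illustrative, and `MyDataset` is defined below):

```python
from torch.utils.data import DataLoader

dataset = MyDataset(file)                                      # custom Dataset (see below)
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)  # shuffle=True for training, False for testing
```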
Customize MyDataset:
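The class body was lost in extraction; a minimal sketch of the three methods a custom `Dataset` must provide (the data-reading step is a placeholder):

```python
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, file):
        # read data and preprocess (placeholder)
        self.data = ...

    def __getitem__(self, index):
        # return one sample at a time
        return self.data[index]

    def __len__(self):
        # return the size of the dataset
        return len(self.data)
```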
3. Tensors
3.1 Tensors
- High-dimensional matrices (arrays)
3.2 Shape of Tensors
- Check with `.shape`

Note: `dim` in PyTorch == `axis` in NumPy
3.3 Creating Tensors
- Directly from data (`list` or `numpy.ndarray`)
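The original snippet was lost; a minimal sketch using the standard constructors (the values are illustrative):

```python
import numpy as np
import torch

x = torch.tensor([[1, -1], [-1, 1]])                # from a nested list
x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))  # from a numpy.ndarray
```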
- Tensor of constant zeros & ones
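A minimal sketch (the shapes are illustrative):

```python
x = torch.zeros([2, 2])    # tensor of all zeros with shape [2, 2]
x = torch.ones([1, 2, 5])  # tensor of all ones with shape [1, 2, 5]
```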
3.4 Common Operations
Common arithmetic functions are supported, such as:
- Addition: `z = x + y`
- Subtraction: `z = x - y`
- Power: `y = x.pow(2)`
- Summation: `y = x.sum()`
- Mean: `y = x.mean()`
- Transpose: transpose two specified dimensions

```python
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])
```
- Squeeze: remove a specified dimension with length = 1

```python
>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0)
>>> x.shape
torch.Size([2, 3])
```

Tip: the `0` in `x.squeeze(0)` specifies dimension 0.
- Unsqueeze: expand a new dimension

```python
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)
>>> x.shape
torch.Size([2, 1, 3])
```

Tip: the `1` in `x.unsqueeze(1)` specifies dimension 1.
- Cat: concatenate multiple tensors

```python
>>> x = torch.zeros([2, 1, 3])
>>> y = torch.zeros([2, 3, 3])
>>> z = torch.zeros([2, 2, 3])
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3])
```
Common initialization values:
Note: the `0.01` scaling factor keeps the initial values small, reducing their variance.
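The original snippet was lost; given the note about `0.01`, a plausible sketch is small random weight initialization. This reconstruction is an assumption, not the author's original code:

```python
# hypothetical: initialize weights with small random values;
# scaling by 0.01 reduces the variance of the initial weights
w = torch.randn(64, 32) * 0.01
```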
3.5 Data Type
- Using different data types for model and data will cause errors.
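A minimal illustration of such a mismatch (the layer sizes are illustrative): `nn.Linear` parameters are `float32` by default, so feeding the layer an integer tensor fails until the data is cast.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)      # parameters are float32 by default
x = torch.tensor([1, 2, 3])  # int64 tensor
# model(x) would raise a dtype RuntimeError; cast the data first:
y = model(x.float())
```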
3.6 PyTorch vs. NumPy
- Similar attributes
- Many functions have the same names as well
3.7 Device
- Tensors and modules are computed on the CPU by default.
- Use `.to()` to move a tensor to the appropriate device:

```python
x = x.to('cpu')   # CPU
x = x.to('cuda')  # GPU
```
3.8 Device (GPU)
Check if your computer has an NVIDIA GPU:

```python
torch.cuda.is_available()
```

Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', …
3.9 Gradient Calculation
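The original code block was lost; a minimal sketch of automatic differentiation (the tensor values are illustrative):

```python
import torch

x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)  # track gradients on x
z = x.pow(2).sum()  # z is the sum of squared elements of x
z.backward()        # compute dz/dx via backpropagation
print(x.grad)       # tensor([[ 2.,  0.], [-2.,  2.]]), i.e. 2*x
```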
4. torch.nn: Models, Loss Functions
4.1 Network Layers
- Linear layer (fully-connected layer): `nn.Linear(in_features, out_features)`
4.2 Network Parameters
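The original snippet was lost; a minimal sketch inspecting a layer's parameters (the feature sizes are illustrative):

```python
import torch.nn as nn

layer = nn.Linear(32, 64)
print(layer.weight.shape)  # torch.Size([64, 32]), i.e. (out_features, in_features)
print(layer.bias.shape)    # torch.Size([64])
```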
4.3 Non-Linear Activation Functions
- Sigmoid activation: `nn.Sigmoid()`
- ReLU activation: `nn.ReLU()`
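The original snippets were lost; a minimal sketch applying both activations (the input values are illustrative):

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.0, 0.0, 1.0])
print(nn.Sigmoid()(x))  # tensor([0.2689, 0.5000, 0.7311])
print(nn.ReLU()(x))     # tensor([0., 0., 1.])
```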
4.4 Build your own neural network
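The original definitions were lost in extraction; a minimal sketch of a model built with `nn.Sequential` (the layer sizes are illustrative):

```python
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # define the layers as one sequential block
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        # compute the output of the network
        return self.net(x)
```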
Both definitions have the same effect:
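A sketch of the equivalent explicit version, assuming the same illustrative sizes:

```python
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # define each layer separately
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        # apply the layers one by one
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
```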
4.5 Loss Functions
Mean Squared Error (for regression tasks):

```python
criterion = nn.MSELoss()
```

Cross Entropy (for classification tasks):

```python
criterion = nn.CrossEntropyLoss()
```

In both cases, the loss is computed from the model output and the expected value:

```python
loss = criterion(model_output, expected_value)
```
5. torch.optim: Optimization
Gradient-based optimization algorithms that adjust network parameters to reduce error.
E.g. Stochastic Gradient Descent (SGD)
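The original snippet was lost; a minimal sketch (the learning rate is illustrative, and `model` is assumed to be defined as in section 4.4):

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0)
```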
- For every batch of data:
  1. Call `optimizer.zero_grad()` to reset the gradients of the model parameters.
  2. Call `loss.backward()` to backpropagate the gradients of the prediction loss.
  3. Call `optimizer.step()` to adjust the model parameters.
6. Entire Procedure
6.1 Neural Network Training Setup
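The original setup code was lost; a minimal sketch assuming the `MyDataset` and `MyModel` classes sketched earlier (`file` and `device` are placeholders; the batch size, loss, and learning rate are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

dataset = MyDataset(file)                                  # read data via MyDataset
tr_set = DataLoader(dataset, batch_size=16, shuffle=True)  # put the dataset into a DataLoader
model = MyModel().to(device)                               # construct the model and move it to the device
criterion = nn.MSELoss()                                   # set the loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # set the optimizer
```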
6.2 Neural Network Training Loop
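The loop body was lost; a minimal sketch of one common pattern, using the setup above (`n_epochs` is a placeholder):

```python
for epoch in range(n_epochs):
    model.train()                          # set the model to training mode
    for x, y in tr_set:                    # iterate through the DataLoader
        optimizer.zero_grad()              # reset gradients
        x, y = x.to(device), y.to(device)  # move data to the device
        pred = model(x)                    # forward pass
        loss = criterion(pred, y)          # compute loss
        loss.backward()                    # backpropagation
        optimizer.step()                   # update the parameters
```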
6.3 Neural Network Validation Loop
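A minimal sketch, assuming a validation DataLoader `dv_set` built like `tr_set`:

```python
model.eval()                                      # set the model to evaluation mode
total_loss = 0
with torch.no_grad():                             # disable gradient tracking
    for x, y in dv_set:
        x, y = x.to(device), y.to(device)
        pred = model(x)                           # forward pass
        loss = criterion(pred, y)                 # compute loss
        total_loss += loss.cpu().item() * len(x)  # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)       # average loss over the whole set
```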
6.4 Neural Network Testing Loop
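A minimal sketch, assuming a test DataLoader `tt_set` that yields inputs only (no labels):

```python
model.eval()                          # set the model to evaluation mode
preds = []
with torch.no_grad():                 # disable gradient tracking
    for x in tt_set:
        x = x.to(device)
        pred = model(x)               # forward pass
        preds.append(pred.cpu())      # collect the predictions
```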
7. Save/load models
Save:

```python
torch.save(model.state_dict(), path)
```

Load:

```python
ckpt = torch.load(path)
model.load_state_dict(ckpt)
```
8. More About PyTorch
- Useful GitHub repositories using PyTorch:
- Huggingface Transformers (transformer models: BERT, GPT, …)
- Fairseq (sequence modeling for NLP & speech)
- ESPnet (speech recognition, translation, synthesis, …)