Other PyTorch Tools

PyTorch has a number of other useful features and utilities, as well as a few companion libraries. Most of these are beyond the scope of this course, but this brief section outlines a few of them and points toward additional resources.

PyTorch Linear Algebra: torch.linalg

PyTorch’s linear algebra subpackage, torch.linalg, is similar to NumPy’s numpy.linalg. It provides operations such as matrix inversion and least-squares solvers that work with PyTorch’s autograd. This means that you can, for example, calculate eigenvalues or perform a linear regression as one step in a nonlinear model, and PyTorch will track the gradients through these operations correctly during optimization.

See the torch.linalg Reference API and this Linear Algebra chapter from an online Scientific Computing in Python course by Brad Nelson for some additional information.
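As a minimal sketch of what "respecting autograd" means in practice, the example below places a linear regression (solved with torch.linalg.solve via the normal equations) and an eigenvalue calculation (torch.linalg.eigvalsh) inside a differentiable computation. The simulated data, the scale parameter, and the loss are assumptions made up for illustration; they are not part of the course material.

```python
import torch

torch.manual_seed(0)

# Simulated regression data (hypothetical, for illustration only).
X = torch.randn(50, 3, dtype=torch.float64)
y = torch.randn(50, 1, dtype=torch.float64)

# A learnable parameter that sits upstream of the linear algebra step.
scale = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)
Xs = scale * X

# Linear regression via the normal equations: solve (Xs^T Xs) b = Xs^T y.
coef = torch.linalg.solve(Xs.T @ Xs, Xs.T @ y)

# Any scalar function of the coefficients can serve as a loss.
loss = coef.pow(2).sum()
loss.backward()
print('d(loss)/d(scale):', scale.grad.item())

# Eigenvalues of a symmetric matrix are differentiable as well.
A = torch.randn(4, 4, dtype=torch.float64)
S = ((A + A.T) / 2).requires_grad_()
largest = torch.linalg.eigvalsh(S)[-1]  # eigvalsh returns ascending order
largest.backward()
print('gradient shape:', S.grad.shape)
```

Because the gradient flows through the solve, an optimizer could update scale (or any parameters that produce X) exactly as it would for an ordinary neural network layer.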

Train/Test/Validation Splitting

The torch.utils.data subpackage includes a function called random_split for randomly splitting a dataset into subsets, such as train, test, and validation sets.

import torch
from torch.utils.data import random_split

# Simulate a dataset of 1000 observations and 8 features:
n = 1000
dataset = torch.randn(n, 8)

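# Split the dataset into three subsets by fraction. (Fractional lengths are
# accepted by recent PyTorch versions; integer lengths such as
# (650, 200, 150) also work.)
(train, test, val) = random_split(dataset, (0.65, 0.2, 0.15))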

print('train:', len(train))
print(' test:', len(test))
print('  val:', len(val))
train: 650
 test: 200
  val: 150

The returned objects aren’t actually the submatrices; rather, each one is a Subset object that stores a reference to the original dataset along with the indices of the rows selected for that subset.

print(type(train))
print(len(train.dataset))
print(len(train.indices))
<class 'torch.utils.data.dataset.Subset'>
1000
650
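Since a Subset only stores indices, it can help to see how to get the actual data back out. The sketch below shows two options under the same simulated setup as above: indexing the original tensor with the stored indices, and wrapping the Subset in a DataLoader for batching. The seeded generator and the batch size of 50 are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, random_split

# Same simulated dataset as above: 1000 observations, 8 features.
dataset = torch.randn(1000, 8)

# A seeded generator makes the random split reproducible.
gen = torch.Generator().manual_seed(0)
train, test, val = random_split(dataset, (0.65, 0.2, 0.15), generator=gen)

# Option 1: materialize the submatrix by indexing the original tensor.
train_matrix = train.dataset[train.indices]
print(train_matrix.shape)

# Option 2: iterate over the subset in shuffled mini-batches.
loader = DataLoader(train, batch_size=50, shuffle=True)
first_batch = next(iter(loader))
print(first_batch.shape)
```

The first option is convenient for small tensors that fit in memory, while the DataLoader route is the usual choice inside a training loop.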

PyTorch Distributions

There is a whole subpackage of PyTorch for probability distributions and calculations related to them. You can find more information in the API Reference for torch.distributions. Let’s look at a quick example of an exponential distribution.

import matplotlib.pyplot as plt
import torch
from torch.distributions import Exponential

# Exponential distribution with rate parameter 1.5.
exp_dist = Exponential(1.5)

# Plot the PDF of the exponential distribution.
x = torch.linspace(0, 10, 500)
# PyTorch distributions provide a cdf method but no direct PDF method; instead,
# log_prob returns the log of the density, which is the quantity more commonly
# used in optimization. Exponentiate it to recover the PDF.
y = torch.exp(exp_dist.log_prob(x))
plt.plot(x, y, 'k-')

plt.show()
[Figure: PDF of the exponential distribution with rate 1.5, plotted for x from 0 to 10.]
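To illustrate why log probabilities are the natural quantity for optimization, here is a minimal sketch that recovers the rate of an exponential distribution from samples by minimizing the negative log-likelihood. The sample size, the Adam optimizer, the learning rate, and the log-rate parameterization are all assumptions chosen for illustration rather than a recommended recipe.

```python
import torch
from torch.distributions import Exponential

torch.manual_seed(0)

# Draw samples from an exponential distribution with a known rate.
true_dist = Exponential(1.5)
samples = true_dist.sample((2000,))

# Optimize the log of the rate so the rate itself stays positive.
log_rate = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([log_rate], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    dist = Exponential(log_rate.exp())
    # Negative mean log-likelihood of the samples under the current rate.
    loss = -dist.log_prob(samples).mean()
    loss.backward()
    optimizer.step()

fitted_rate = log_rate.exp().item()
# For the exponential distribution, the maximum-likelihood rate is 1 / mean.
closed_form = (1.0 / samples.mean()).item()
print('fitted rate:', fitted_rate)
print('closed form:', closed_form)
```

The fitted rate should land close to the closed-form estimate, which gives a quick sanity check that gradients flow through log_prob as expected.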

Torch Hub

Torch Hub allows researchers to publish and share models with one another. PyTorch provides an interface for loading models from Torch Hub, the torch.hub module, which we will see in the next section.