[Solved] PyTorch DataLoader: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow

by
Ali Hasan
llama-cpp-python multiprocessing pytorch pytorch-dataloader

Quick Fix: Convert the list of numpy arrays to a single numpy array with numpy.array() first, then create the tensor from that array, for example with torch.from_numpy(array).

The Problem:

When training a model with PyTorch’s DataLoader(), a UserWarning reports that creating a tensor from a list of numpy.ndarrays is extremely slow, and suggests converting the list to a single numpy.ndarray with numpy.array() before creating the tensor. The question is whether the DataLoader() dataset should be converted to a tensor, or whether there is a better way to resolve the warning.

The Solutions:

Solution 1: Convert to Tensor using `torch.from_numpy()`

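To resolve the warning, change the line that builds the tensor from a list of numpy arrays so that the list is combined into a single array first. torch.from_numpy() accepts only a numpy.ndarray, not a list, so the np.array() conversion (with numpy imported as np) has to come before it:

labels = torch.from_numpy(np.array(labels))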
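As a minimal sketch of the conversion described above: the `labels` list here is a made-up stand-in for a real batch, and the arrays are assumed to share the same shape and dtype. `np.stack` is used in place of `np.array` because it raises an error when the shapes differ, rather than building an object array.

```python
import numpy as np
import torch

# Hypothetical batch: a list of equally shaped numpy arrays.
labels = [np.array([0.0, 1.0], dtype=np.float32) for _ in range(4)]

# Combine the list into one contiguous ndarray first.
# np.stack raises ValueError if the arrays do not all have the same shape.
labels_array = np.stack(labels)

# Wrap the single ndarray as a tensor (no per-element conversion).
labels_tensor = torch.from_numpy(labels_array)

print(labels_tensor.shape)  # torch.Size([4, 2])
print(labels_tensor.dtype)  # torch.float32
```

If the arrays in a batch have different lengths, they would need padding or truncation to a common shape before this step.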

Q&A

What is the correct way to create a tensor from a numpy array?

tensor = torch.from_numpy(array)
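One detail worth knowing when choosing this call: torch.from_numpy() does not copy the data, so the tensor and the source array share memory, while torch.tensor() makes an independent copy. The short sketch below illustrates the difference with a made-up array.

```python
import numpy as np
import torch

array = np.arange(3, dtype=np.int64)  # [0, 1, 2]

# from_numpy shares memory with the source array and keeps its dtype.
shared = torch.from_numpy(array)

# torch.tensor copies the data, so later changes to `array` do not affect it.
copied = torch.tensor(array)

array[0] = 100

print(shared[0].item())  # 100, the change is visible through the shared tensor
print(copied[0].item())  # 0, the copy is unaffected
```

Use torch.tensor(array) or torch.from_numpy(array).clone() when the array will be modified later and the tensor must stay unchanged.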

Where is the problem coming from?

It originates in the sentence_transformers library.