How to save a YOLOv8 model after some training on a custom dataset to continue the training later?

by
Ali Hasan

Quick Fix: Ultralytics YOLOv8 saves checkpoints automatically during training, so there is no need to call torch.save() and torch.load() yourself. To continue an interrupted run, load its last checkpoint with the YOLO class and pass resume=True. Here’s an example:

from ultralytics import YOLO

# Train the model; last.pt and best.pt are written to runs/detect/train/weights/ automatically
model = YOLO('yolov8x.pt')
model.train(data='/image_datasets/Website_Screenshots.v1-raw.yolov8/data.yaml', epochs=50)  # epochs=50 is an example value

# Load the last checkpoint of the interrupted run
model = YOLO('runs/detect/train/weights/last.pt')

# Continue training from where the run stopped
model.train(resume=True)

The Problem:

The user is training a YOLOv8 model on a custom dataset in Google Colab and wants to save the model periodically so that training can be resumed later. Attempts to save the model or its weights with PyTorch’s save methods have been unsuccessful.

The Solutions:

Solution 1: Save the best weights

The best weights are automatically stored in the runs/detect/train/weights directory as best.pt. When continuing training, pass best.pt to the model instead of yolov8x.pt.
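Ultralytics numbers repeated runs (train, train2, train3, ...), so the newest best.pt is not always in the first folder. The following is a minimal sketch of one way to locate it and start a new run from it. The helper names find_latest_weights and continue_from_weights are hypothetical, and the runs/detect default is taken from the path above.

```python
from pathlib import Path
from typing import Optional


def find_latest_weights(runs_dir: str = "runs/detect", name: str = "best.pt") -> Optional[Path]:
    """Return the most recently modified `name` file under runs_dir/*/weights/, or None."""
    candidates = list(Path(runs_dir).glob(f"*/weights/{name}"))
    if not candidates:
        return None
    # Pick the file that was written last, regardless of the run folder's name.
    return max(candidates, key=lambda p: p.stat().st_mtime)


def continue_from_weights(weights: Path, data_yaml: str, epochs: int):
    """Start a new training run initialised from previously saved weights."""
    # Imported lazily so find_latest_weights works even where ultralytics is not installed.
    from ultralytics import YOLO

    model = YOLO(str(weights))
    return model.train(data=data_yaml, epochs=epochs)
```

Calling find_latest_weights() and, if it returns a path, passing that path to continue_from_weights together with your data.yaml and an epoch count starts a fresh run from the saved weights. This begins a new run rather than resuming the old one's optimizer state and epoch counter.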

Solution 2: Resume Training Using Load() Function

To save and continue training a YOLOv8 model on a custom dataset, utilize the load() function. After the initial training epoch, use the following steps:

  1. Save the model checkpoints: During training, YOLOv8 automatically creates model checkpoints in the run's output folder (results/run in this example; the Ultralytics default is runs/detect/train). These checkpoints include both model weights and training metadata.

  2. Resume training: To continue training, instantiate a new YOLOv8 model using the same YAML configuration file as before. Then, load the saved model checkpoint using the load() function:

    model = YOLO('yolov8x.yaml').load('results/run/weights/best_train.pt')  # replace 'best_train.pt' with the latest checkpoint file
    model.train(data='/image_datasets/Website_Screenshots.v1-raw.yolov8/data.yaml', epochs=1)  # Continue training for additional epochs
    

By default, YOLOv8 uses automatic mixed precision (AMP) training. To disable it and train in full FP32 precision, which can be more numerically stable, pass amp=False to model.train():

model.train(data='/image_datasets/Website_Screenshots.v1-raw.yolov8/data.yaml', epochs=1, amp=False)

Solution 3: Utilize the backup method

The runs directory is not always available afterwards, for example when a Colab session is disconnected, so it helps to keep a backup. Divide the training process into shorter segments so that a single segment does not run into GPU usage limits, and copy the runs directory to your drive after each one. You can then continue training or make predictions using the best.pt or last.pt models stored in the backup directory.
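The backup strategy can be sketched with the standard library alone. The helper names split_epochs and backup_runs are hypothetical, and the Drive path in the usage note is an assumption about where Google Drive is mounted in Colab.

```python
import shutil
from pathlib import Path
from typing import List


def split_epochs(total_epochs: int, segment_size: int) -> List[int]:
    """Split a total epoch budget into segments of at most segment_size epochs."""
    if total_epochs < 0 or segment_size <= 0:
        raise ValueError("total_epochs must be >= 0 and segment_size must be > 0")
    full, rest = divmod(total_epochs, segment_size)
    return [segment_size] * full + ([rest] if rest else [])


def backup_runs(runs_dir: str, backup_dir: str) -> Path:
    """Copy the whole runs directory to backup_dir, overwriting files from earlier backups."""
    src = Path(runs_dir)
    dst = Path(backup_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"runs directory not found: {src}")
    # dirs_exist_ok (Python 3.8+) lets repeated backups reuse the same target folder.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, split_epochs(100, 30) gives four segments of 30, 30, 30 and 10 epochs. After each segment finishes, a call such as backup_runs('runs', '/content/drive/MyDrive/yolo_backup/runs') would copy best.pt and last.pt to Drive, assuming Drive is mounted at /content/drive.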

Q&A

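How do I save a YOLOv8 model after some training on a custom dataset to continue the training later?

Training saves the model automatically in the run's output folder inside the working directory (results/run in the example above; runs/detect/train by default).

Where are the trained weights saved?

In the runs/detect/train/weights directory, as best.pt (best weights) and last.pt (most recent checkpoint).

Is the runs directory always available?

No. A workaround is to split training into segments of a few epochs each and back up the runs directory after each segment.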
