SAIL-VOS Dataset

Download

[Recommended] Use the download script here to download the dataset, including the RGB frames, the segmentation masks, and the MS COCO-style JSON annotation files.


Alternatively, you can download the dataset from the following links:

Training set

RGB frames: part 1, part 2, part 3, part 4, part 5, part 6, part 7, part 8, part 9, part 10, part 11, part 12, part 13, part 14, part 15, part 16, part 17

Segmentation: download

JSON annotation file: download

JSON annotation file (24 classes): download

Validation set

RGB frames: part 1, part 2, part 3, part 4, part 5

Segmentation: download

JSON annotation file: download

JSON annotation file (24 classes): download


If you use the download script, you can skip the following steps: the script automatically extracts the files for you.

To untar the downloaded tar files, run

cat sailvos_train.tar.* | tar xvf -
cat sailvos_val.tar.* | tar xvf -

Then unzip the files by running

unzip sailvos_train.zip
unzip sailvos_val.zip
unzip sailvos_train_annotations.zip
unzip sailvos_val_annotations.zip

Dataset Overview

The SAIL-VOS dataset contains the RGB frames, the visible masks (modal masks) and the amodal masks.

After you unzip the files, you will see folders that correspond to video sequences.

.
├── ah_1_mcs_1               
├── ah_3a_mcs_3   
├── ah_3a_mcs_6                 
└── ...

Under each video sequence, you will see two folders, images/ and visible/, which store the RGB frames and the instance-level visible segmentation masks, respectively. Other folders with a 4-digit prefix store the amodal segmentation masks; the 4-digit prefix is the object ID, which is consistent with the IDs used in the visible masks stored in visible/.

ah_1_mcs_1  
├── images          # RGB frames          
├── visible         # visible masks
├── 0001_*          # amodal masks of the object with object id = 1            
└── ...
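The folder naming convention above can be turned into a small lookup. The following is only a sketch: the helper name amodal_dirs is hypothetical, and it assumes that every amodal folder starts with a 4-digit object ID followed by an underscore and that each object ID has a single folder per sequence.

```python
import re
from pathlib import Path

# Assumption: amodal folders are named "<4-digit object ID>_<rest>".
# Folders such as images/ and visible/ do not match and are skipped.
AMODAL_DIR = re.compile(r"^(\d{4})_")


def amodal_dirs(sequence_dir):
    """Map object ID -> amodal mask folder for one video sequence."""
    result = {}
    for child in sorted(Path(sequence_dir).iterdir()):
        match = AMODAL_DIR.match(child.name)
        if child.is_dir() and match:
            # Assumption: one folder per object ID; a later duplicate
            # would overwrite an earlier one.
            result[int(match.group(1))] = child
    return result
```

Calling amodal_dirs('ah_1_mcs_1') would then return a dictionary whose key 1 points at the folder whose name starts with 0001_.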

The folder images stores the RGB frames.

images  
├── 000000.bmp      # the first frame
├── 000001.bmp      # the second frame       
├── 000002.bmp      # the third frame 
└── ...

The folder visible stores the visible masks.

visible  
├── 000000.npy      # the visible segmentation of the first frame
├── 000001.npy      # the visible segmentation of the second frame
├── 000002.npy      # the visible segmentation of the third frame
└── ...
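Since images/ and visible/ use the same frame numbering in the listings above, frames and visible masks can be paired by file name. The sketch below assumes that a frame and its mask share the same 6-digit stem (for example 000000.bmp and 000000.npy); the helper name frame_pairs is hypothetical.

```python
from pathlib import Path


def frame_pairs(sequence_dir):
    """Return (image_path, visible_mask_path) pairs sorted by frame name."""
    seq = Path(sequence_dir)
    # Index the visible masks by stem, e.g. "000000" -> visible/000000.npy
    masks = {p.stem: p for p in (seq / "visible").glob("*.npy")}
    pairs = []
    for image in sorted((seq / "images").glob("*.bmp")):
        mask = masks.get(image.stem)
        if mask is not None:  # skip frames without a matching mask
            pairs.append((image, mask))
    return pairs
```

Frames that have no matching mask file are skipped rather than reported, which keeps the sketch short; a stricter loader might raise an error instead.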

To load a visible mask in Python:

import numpy as np
m = np.load('ah_1_mcs_1/visible/000000.npy')

To find the visible mask of the object with object ID = 1,

import numpy as np
m = np.load('ah_1_mcs_1/visible/000000.npy')
m1 = (m == 1)  # boolean array: True where object 1 is visible
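The same comparison extends to all objects in a frame. The sketch below lists the object IDs present in a visible mask and counts their visible pixels. It assumes that the value 0 marks background pixels, which is not stated above and should be checked against the data; the helper names are hypothetical.

```python
import numpy as np


def visible_object_ids(mask, background=0):
    """Return the sorted object IDs that appear in a visible mask array."""
    # Assumption: `background` (default 0) is not an object ID.
    return [int(i) for i in np.unique(mask) if i != background]


def visible_pixel_counts(mask, background=0):
    """Map each object ID in the mask to its number of visible pixels."""
    ids, counts = np.unique(mask, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts) if i != background}
```

Both helpers take the array returned by np.load, such as m in the snippet above.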

The amodal segmentation annotations of the object with object ID = 1 are in the folder 0001_*.

ah_1_mcs_1  
├── 0001_Ped_000000225514697_000000000172546_00   # amodal masks for object with ID=1
│   ├── 000000.png   # amodal masks for object with ID=1 in the first frame
│   ├── 000001.png   # amodal masks for object with ID=1 in the second frame
│   ├── 000002.png   # amodal masks for object with ID=1 in the third frame
│   └── ...
├── 0008_Ped_000002602752943_000000000000002_00   # amodal masks for object with ID=8
├── 0010_Ped_000003254803008_000000000172802_00   # amodal masks for object with ID=10
└── ...

Example Python code to read an amodal segmentation mask:

from PIL import Image
import numpy as np
am = Image.open('ah_1_mcs_1/0001_Ped_000000225514697_000000000172546_00/000000.png')
am = np.asarray(am)
print(np.unique(am)) # [0, 255]
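Combining the two mask types gives the occluded part of an object, i.e. the pixels that belong to the amodal mask but are not visible. The sketch below is an illustration rather than part of the dataset tooling: it assumes that the amodal mask is a single-channel array in which nonzero pixels belong to the object, and that it has the same height and width as the visible mask. The helper names are hypothetical.

```python
import numpy as np


def occluded_region(visible, amodal, object_id):
    """Boolean array: True where the object exists but is not visible."""
    visible_obj = np.asarray(visible) == object_id
    # Assumption: nonzero amodal pixels (255 in the example) are the object.
    amodal_obj = np.asarray(amodal) > 0
    return amodal_obj & ~visible_obj


def occlusion_rate(visible, amodal, object_id):
    """Fraction of the amodal mask that is occluded (0.0 if it is empty)."""
    amodal_area = int((np.asarray(amodal) > 0).sum())
    if amodal_area == 0:
        return 0.0
    return int(occluded_region(visible, amodal, object_id).sum()) / amodal_area
```

With m and am loaded as in the snippets above, occluded_region(m, am, 1) would give the hidden part of the object with ID 1 in the first frame, provided the assumptions hold.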