datasets #

Classes:

Name           Description
MNISTC         Corrupted MNIST image classification dataset.
CIFAR10C       Corrupted CIFAR10 image classification dataset.
CIFAR100C      Corrupted CIFAR100 image classification dataset.
TinyImageNet   TinyImageNet image classification dataset.
TinyImageNetC  Corrupted TinyImageNet image classification dataset.

MNISTC #

MNISTC(
    root: Path | str = None,
    transform: Callable | None = None,
    target_transform: Callable | None = None,
    corruptions: list[str] = [
        "brightness",
        "canny_edges",
        "dotted_line",
        "fog",
        "glass_blur",
        "impulse_noise",
        "motion_blur",
        "rotate",
        "scale",
        "shear",
        "shot_noise",
        "spatter",
        "stripe",
        "translate",
        "zigzag",
    ],
    download: bool = False,
)

Bases: VisionDataset

Corrupted MNIST image classification dataset.

Contains 10,000 test images for each of the 15 corruptions. From the paper "MNIST-C: A Robustness Benchmark for Computer Vision".

Parameters:

Name Type Description Default
root Path | str

Root directory of the dataset.

None
transform Callable | None

Transform to apply to the data.

None
target_transform Callable | None

Transform to apply to the targets.

None
corruptions list[str]

List of corruptions to apply to the data.

['brightness', 'canny_edges', 'dotted_line', 'fog', 'glass_blur', 'impulse_noise', 'motion_blur', 'rotate', 'scale', 'shear', 'shot_noise', 'spatter', 'stripe', 'translate', 'zigzag']
download bool

If true, downloads the dataset from the internet and puts it in the root directory.

False
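
As an illustration of how `transform` and `target_transform` are typically used, here is a minimal sketch that follows the usual torchvision `VisionDataset` convention: `transform` receives the image and `target_transform` receives the label when an item is fetched. `ToyDataset` and its sample values are hypothetical stand-ins, not part of this library, and the real `__getitem__` may differ.

```python
from typing import Any, Callable, Optional


class ToyDataset:
    """Hypothetical stand-in showing where the two transforms are applied."""

    def __init__(
        self,
        data: list,
        targets: list,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None,
    ) -> None:
        self.data = data
        self.targets = targets
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, index: int) -> tuple:
        img, target = self.data[index], self.targets[index]
        # Each transform is optional and only touches its own half of the pair.
        if self.transform is not None:
            img = self.transform(img)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return img, target


toy = ToyDataset(
    data=[[0, 128, 255]],
    targets=[3],
    transform=lambda img: [p / 255 for p in img],  # scale pixels to [0, 1]
    target_transform=lambda t: t + 10,  # shift label ids (arbitrary example)
)
sample_img, sample_target = toy[0]
```

Leaving either argument as `None` returns that half of the sample unchanged.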

Methods:

Name Description
download

Attributes:

Name Type Description
base_folder
corruptions
data
filename
sub_folder
targets
url
zip_md5

base_folder #

base_folder = Path('MNIST-C/raw')

corruptions #

corruptions = corruptions

data #

data = concatenate(
    [
        load(
            root
            / sub_folder
            / corruption
            / "test_images.npy"
        )
        for corruption in corruptions
    ],
    axis=0,
)
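
The `data` expression above reads one `test_images.npy` file per requested corruption and stacks the results along the first axis. The sketch below only reproduces the path construction, using the documented `sub_folder` value. The helper name `corruption_image_paths` and the `"data"` root are hypothetical, and `root` is assumed to be the directory the archive was extracted into.

```python
from pathlib import PurePosixPath

# Value taken from the sub_folder attribute documented for MNISTC.
sub_folder = PurePosixPath("mnist_c")


def corruption_image_paths(root: str, corruptions: list) -> list:
    """Return one test_images.npy path per requested corruption, in order."""
    return [
        PurePosixPath(root) / sub_folder / corruption / "test_images.npy"
        for corruption in corruptions
    ]


# Hypothetical root directory and a two-corruption subset.
paths = corruption_image_paths("data", ["fog", "rotate"])
```

Because the arrays are concatenated in list order, requesting two corruptions yields 20,000 images: the first 10,000 from `fog`, the next 10,000 from `rotate`.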

filename #

filename = 'mnist_c.zip'

sub_folder #

sub_folder = Path('mnist_c')

targets #

targets = astype(int64)

url #

url = 'https://zenodo.org/record/3239543/files/mnist_c.zip'

zip_md5 #

zip_md5 = '4b34b33045869ee6d424616cd3a65da3'
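
A checksum such as `zip_md5` is normally compared against the MD5 digest of the downloaded archive before it is extracted. The following is a minimal sketch of that check using only the standard library. The helper names `md5_of_file` and `archive_is_intact` are hypothetical, and this is not necessarily how the library performs its own verification.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path: Path, expected_md5: str) -> bool:
    """True only if the file exists and its digest matches the expected one."""
    return path.is_file() and md5_of_file(path) == expected_md5


# Demonstration on a throwaway file rather than the real archive.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.zip"
    demo.write_bytes(b"hello")
    demo_ok = archive_is_intact(demo, hashlib.md5(b"hello").hexdigest())
    # The real zip_md5 cannot match a 5-byte dummy file.
    demo_bad = archive_is_intact(demo, "4b34b33045869ee6d424616cd3a65da3")
```

Reading in chunks keeps memory use flat even for archives of several hundred megabytes.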

download #

download() -> None

CIFAR10C #

CIFAR10C(
    root: Path | str = None,
    transform: Callable | None = None,
    target_transform: Callable | None = None,
    corruptions: list[str] = [
        "brightness",
        "contrast",
        "defocus_blur",
        "elastic_transform",
        "fog",
        "frost",
        "gaussian_blur",
        "gaussian_noise",
        "glass_blur",
        "impulse_noise",
        "jpeg_compression",
        "motion_blur",
        "pixelate",
        "saturate",
        "shot_noise",
        "snow",
        "spatter",
        "speckle_noise",
        "zoom_blur",
    ],
    shift_severity: int = 5,
    download: bool = False,
)

Bases: VisionDataset

Corrupted CIFAR10 image classification dataset.

Contains 10,000 test images for each corruption. From the paper "Benchmarking Neural Network Robustness to Common Corruptions and Perturbations".

Parameters:

Name Type Description Default
root Path | str

Root directory of the dataset.

None
transform Callable | None

Transform to apply to the data.

None
target_transform Callable | None

Transform to apply to the targets.

None
corruptions list[str]

List of corruptions to apply to the data.

['brightness', 'contrast', 'defocus_blur', 'elastic_transform', 'fog', 'frost', 'gaussian_blur', 'gaussian_noise', 'glass_blur', 'impulse_noise', 'jpeg_compression', 'motion_blur', 'pixelate', 'saturate', 'shot_noise', 'snow', 'spatter', 'speckle_noise', 'zoom_blur']
shift_severity int

Severity of the corruption to apply. Must be an integer between 1 and 5.

5
download bool

If true, downloads the dataset from the internet and puts it in the root directory.

False
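
The constraint on `shift_severity` (an integer between 1 and 5) can be checked up front before any file is read. This is a minimal sketch of such a guard. The function name `check_shift_severity` and the exception types are assumptions, since this page does not document how the library reports an invalid value.

```python
def check_shift_severity(shift_severity: int) -> int:
    """Return the severity unchanged if it is an int in [1, 5], else raise."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(shift_severity, bool) or not isinstance(shift_severity, int):
        raise TypeError(
            f"shift_severity must be an int, got {type(shift_severity).__name__}"
        )
    if not 1 <= shift_severity <= 5:
        raise ValueError(
            f"shift_severity must be between 1 and 5, got {shift_severity}"
        )
    return shift_severity


default_severity = check_shift_severity(5)
```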

Methods:

Name Description
download

Attributes:

Name Type Description
base_folder
corruption_data_checksums
corruptions
data
filename
shift_severity
sub_folder
targets
tgz_md5
url

base_folder #

base_folder = Path('CIFAR10-C/raw')

corruption_data_checksums #

corruption_data_checksums = {
    "fog": "7b397314b5670f825465fbcd1f6e9ccd",
    "jpeg_compression": "2b9cc4c864e0193bb64db8d7728f8187",
    "zoom_blur": "6ea8e63f1c5cdee1517533840641641b",
    "speckle_noise": "ef00b87611792b00df09c0b0237a1e30",
    "glass_blur": "7361fb4019269e02dbf6925f083e8629",
    "spatter": "8a5a3903a7f8f65b59501a6093b4311e",
    "shot_noise": "3a7239bb118894f013d9bf1984be7f11",
    "defocus_blur": "7d1322666342a0702b1957e92f6254bc",
    "elastic_transform": "9421657c6cd452429cf6ce96cc412b5f",
    "gaussian_blur": "c33370155bc9b055fb4a89113d3c559d",
    "frost": "31f6ab3bce1d9934abfb0cc13656f141",
    "saturate": "1cfae0964219c5102abbb883e538cc56",
    "brightness": "0a81ef75e0b523c3383219c330a85d48",
    "snow": "bb238de8555123da9c282dea23bd6e55",
    "gaussian_noise": "ecaf8b9a2399ffeda7680934c33405fd",
    "motion_blur": "fffa5f852ff7ad299cfe8a7643f090f4",
    "contrast": "3c8262171c51307f916c30a3308235a8",
    "impulse_noise": "2090e01c83519ec51427e65116af6b1a",
    "labels": "c439b113295ed5254878798ffe28fd54",
    "pixelate": "0f14f7e2db14288304e1de10df16832f",
}

corruptions #

corruptions = corruptions

data #

data = concatenate(
    [
        load(root / sub_folder / (corruption + ".npy"))[
            (shift_severity - 1) * 10000 : shift_severity * 10000
        ]
        for corruption in corruptions
    ],
    axis=0,
)
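
The slice in `data` is meant to select the block of 10,000 images that belongs to the chosen severity. The sketch below spells out that index arithmetic, assuming each per-corruption file stacks the five severity levels in ascending order (10,000 images per level). The names `severity_slice` and `IMAGES_PER_SEVERITY` are hypothetical, and a plain list of indices stands in for the image array.

```python
IMAGES_PER_SEVERITY = 10_000  # test images per corruption and severity level


def severity_slice(shift_severity: int, block: int = IMAGES_PER_SEVERITY) -> slice:
    """Slice covering the images of one severity level (1-based)."""
    # The parentheses matter: (s - 1) * block, not s - (1 * block).
    return slice((shift_severity - 1) * block, shift_severity * block)


# Stand-in for one corruption file: 5 severity blocks stacked back to back.
indices = list(range(5 * IMAGES_PER_SEVERITY))
level_1 = indices[severity_slice(1)]
level_5 = indices[severity_slice(5)]
```

With the default `shift_severity = 5`, the slice covers indices 40,000 through 49,999 of each file, i.e. the last block.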

filename #

filename = 'CIFAR-10-C.tar'

shift_severity #

shift_severity = shift_severity

sub_folder #

sub_folder = Path('CIFAR-10-C')

targets #

targets = astype(int64)

tgz_md5 #

tgz_md5 = '56bf5dcef84df0e2308c6dcbcbbd8499'

url #

url = (
    "https://zenodo.org/record/2535967/files/CIFAR-10-C.tar"
)

download #

download() -> None

CIFAR100C #

CIFAR100C(
    root: Path | str = None,
    transform: Callable | None = None,
    target_transform: Callable | None = None,
    corruptions: list[str] = [
        "brightness",
        "contrast",
        "defocus_blur",
        "elastic_transform",
        "fog",
        "frost",
        "gaussian_blur",
        "gaussian_noise",
        "glass_blur",
        "impulse_noise",
        "jpeg_compression",
        "motion_blur",
        "pixelate",
        "saturate",
        "shot_noise",
        "snow",
        "spatter",
        "speckle_noise",
        "zoom_blur",
    ],
    shift_severity: int = 5,
    download: bool = False,
)

Bases: CIFAR10C

Corrupted CIFAR100 image classification dataset.

From the paper "Benchmarking Neural Network Robustness to Common Corruptions and Perturbations".

Parameters:

Name Type Description Default
root Path | str

Root directory of the dataset.

None
transform Callable | None

Transform to apply to the data.

None
target_transform Callable | None

Transform to apply to the targets.

None
corruptions list[str]

List of corruptions to apply to the data.

['brightness', 'contrast', 'defocus_blur', 'elastic_transform', 'fog', 'frost', 'gaussian_blur', 'gaussian_noise', 'glass_blur', 'impulse_noise', 'jpeg_compression', 'motion_blur', 'pixelate', 'saturate', 'shot_noise', 'snow', 'spatter', 'speckle_noise', 'zoom_blur']
shift_severity int

Severity of the corruption to apply. Must be an integer between 1 and 5.

5
download bool

If true, downloads the dataset from the internet and puts it in the root directory.

False

Methods:

Name Description
download

Attributes:

Name Type Description
base_folder
corruption_data_checksums
corruptions
data
filename
shift_severity
sub_folder
targets
tgz_md5
url

base_folder #

base_folder = Path('CIFAR100-C/raw')

corruption_data_checksums #

corruption_data_checksums = {
    "fog": "4efc7ebd5e82b028bdbe13048e3ea564",
    "jpeg_compression": "c851b7f1324e1d2ffddeb76920576d11",
    "zoom_blur": "0204613400c034a81c4830d5df81cb82",
    "speckle_noise": "e3f215b1a0f9fd9fd6f0d1cf94a7ce99",
    "glass_blur": "0bf384f38e5ccbf8dd479d9059b913e1",
    "spatter": "12ccf41d62564d36e1f6a6ada5022728",
    "shot_noise": "b0a1fa6e1e465a747c1b204b1914048a",
    "defocus_blur": "d923e3d9c585a27f0956e2f2ad832564",
    "elastic_transform": "a0792bd6581f6810878be71acedfc65a",
    "gaussian_blur": "5204ba0d557839772ef5a4196a052c3e",
    "frost": "3a39c6823bdfaa0bf8b12fe7004b8117",
    "saturate": "c0697e9fdd646916a61e9c312c77bf6b",
    "brightness": "f22d7195aecd6abb541e27fca230c171",
    "snow": "0237be164583af146b7b144e73b43465",
    "gaussian_noise": "ecc4d366eac432bdf25c024086f5e97d",
    "motion_blur": "732a7e2e54152ff97c742d4c388c5516",
    "contrast": "322bb385f1d05154ee197ca16535f71e",
    "impulse_noise": "3b3c210ddfa0b5cb918ff4537a429fef",
    "labels": "bb4026e9ce52996b95f439544568cdb2",
    "pixelate": "96c00c60f144539e14cffb02ddbd0640",
}

corruptions #

corruptions = corruptions

data #

data = concatenate(
    [
        load(root / sub_folder / (corruption + ".npy"))[
            (shift_severity - 1) * 10000 : shift_severity * 10000
        ]
        for corruption in corruptions
    ],
    axis=0,
)

filename #

filename = 'CIFAR-100-C.tar'

shift_severity #

shift_severity = shift_severity

sub_folder #

sub_folder = Path('CIFAR-100-C')

targets #

targets = astype(int64)

tgz_md5 #

tgz_md5 = '11f0ed0f1191edbf9fa23466ae6021d3'

url #

url = "https://zenodo.org/record/3555552/files/CIFAR-100-C.tar"

download #

download() -> None

TinyImageNet #

TinyImageNet(
    root: str | Path,
    train: bool = True,
    transform: Callable | None = None,
    target_transform: Callable | None = None,
    download: bool = False,
)

Bases: VisionDataset

TinyImageNet image classification dataset.

The training dataset contains 100,000 images of 200 classes (500 for each class) downsized to 64x64 color images. The test set has 10,000 images (50 for each class).

Parameters:

Name Type Description Default
root str | Path

Root directory of the dataset.

required
train bool

If True, creates dataset from training data, otherwise from test data.

True
transform Callable | None

Transform to apply to the data.

None
target_transform Callable | None

Transform to apply to the targets.

None
download bool

If true, downloads the dataset from the internet and puts it in the root directory.

False

Methods:

Name Description
download

Attributes:

Name Type Description
base_folder
class_to_idx
data
filename
targets
tgz_md5
url

base_folder #

base_folder = Path('TinyImageNet/raw')

class_to_idx #

class_to_idx = {classes[i]: i for i in range(len(classes))}
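
`class_to_idx` maps each class name to its position in the `classes` list, which gives every class a stable integer label. Here is a minimal sketch of that mapping. The class names are hypothetical placeholders, and sorting the names first is an assumption made so that the indices are reproducible.

```python
# Hypothetical class names; the real dataset has 200 classes.
classes = sorted(["class_c", "class_a", "class_b"])

# Same mapping as the documented expression, written with enumerate.
class_to_idx = {name: i for i, name in enumerate(classes)}

# The inverse mapping is handy for turning predictions back into names.
idx_to_class = {i: name for name, i in class_to_idx.items()}
```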

data #

data = []

filename #

filename = 'tiny-imagenet-200.zip'

targets #

targets = [item[1] for item in data]

tgz_md5 #

tgz_md5 = '90528d7ca1a48142e341f4ef8d21d0de'

url #

url = 'http://cs231n.stanford.edu/tiny-imagenet-200.zip'

download #

download() -> None

TinyImageNetC #

TinyImageNetC(
    root: Path | str = None,
    transform: Callable | None = None,
    target_transform: Callable | None = None,
    corruptions: list[str] = [
        "brightness",
        "contrast",
        "defocus_blur",
        "elastic_transform",
        "fog",
        "frost",
        "gaussian_blur",
        "gaussian_noise",
        "glass_blur",
        "impulse_noise",
        "jpeg_compression",
        "motion_blur",
        "pixelate",
        "saturate",
        "shot_noise",
        "snow",
        "spatter",
        "speckle_noise",
        "zoom_blur",
    ],
    shift_severity: int = 5,
    download: bool = False,
)

Bases: VisionDataset

Corrupted TinyImageNet image classification dataset.

Contains 10,000 64x64 color test images for each corruption (200 classes, 50 images per class). From the paper "Benchmarking Neural Network Robustness to Common Corruptions and Perturbations".

Parameters:

Name Type Description Default
root Path | str

Root directory of the dataset.

None
transform Callable | None

Transform to apply to the data.

None
target_transform Callable | None

Transform to apply to the targets.

None
corruptions list[str]

List of corruptions to apply to the data.

['brightness', 'contrast', 'defocus_blur', 'elastic_transform', 'fog', 'frost', 'gaussian_blur', 'gaussian_noise', 'glass_blur', 'impulse_noise', 'jpeg_compression', 'motion_blur', 'pixelate', 'saturate', 'shot_noise', 'snow', 'spatter', 'speckle_noise', 'zoom_blur']
shift_severity int

Severity of the corruption to apply. Must be an integer between 1 and 5.

5
download bool

If true, downloads the dataset from the internet and puts it in the root directory.

False

Methods:

Name Description
download

Download the dataset.

Attributes:

Name Type Description
base_folder
corruptions
data
filename
shift_severity
targets
tgz_md5
url

base_folder #

base_folder = Path('TinyImageNet-C/raw')

corruptions #

corruptions = corruptions

data #

data = []

filename #

filename = [
    "Tiny-ImageNet-C.tar",
    "Tiny-ImageNet-C-extra.tar",
]

shift_severity #

shift_severity = shift_severity

targets #

targets = []

tgz_md5 #

tgz_md5 = [
    "f9c9a9dbdc11469f0b850190f7ad8be1",
    "0db0588d243cf403ef93449ec52b70eb",
]

url #

url = 'https://zenodo.org/record/8206060/files/'
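
Unlike the other classes, `TinyImageNetC` lists two archives and two checksums, and `url` is a base address rather than a full file link. The sketch below shows one plausible way these attributes combine. It assumes each download URL is the base `url` followed by a file name, and that `filename[i]` pairs with `tgz_md5[i]` by position. Neither assumption is confirmed by this page.

```python
# Values copied from the attributes documented for TinyImageNetC.
url = "https://zenodo.org/record/8206060/files/"
filename = [
    "Tiny-ImageNet-C.tar",
    "Tiny-ImageNet-C-extra.tar",
]
tgz_md5 = [
    "f9c9a9dbdc11469f0b850190f7ad8be1",
    "0db0588d243cf403ef93449ec52b70eb",
]

# Assumption: archives and checksums are matched by list position.
downloads = [(url + name, md5) for name, md5 in zip(filename, tgz_md5)]
```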

download #

download() -> None

Download the dataset.