Datasets

Dataset(name[, base_dir, pre_transform, ...])

Base dataset class.
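
The bracketed parameters in the signatures on this page are optional. As a rough illustration only, the sketch below defines a hypothetical `ToyDataset` whose constructor mirrors `Dataset(name[, base_dir, pre_transform, ...])`. The class, its defaults, and the assumption that `pre_transform` is a callable applied once to each raw sample at load time are all illustrative and are not taken from the library.

```python
from pathlib import Path
from typing import Callable, Optional


class ToyDataset:
    """Hypothetical stand-in, NOT the library's Dataset class."""

    def __init__(
        self,
        name: str,
        base_dir: Optional[str] = None,            # assumed default: ./data
        pre_transform: Optional[Callable] = None,  # assumed: applied once per sample
    ):
        self.name = name
        self.root = Path(base_dir or "data") / name
        raw = self._load_raw()
        self.samples = [pre_transform(s) for s in raw] if pre_transform else raw

    def _load_raw(self):
        # Placeholder data; a real dataset would read files under self.root.
        return [{"num_nodes": n} for n in (3, 5, 8)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


def add_flag(sample):
    # Example pre_transform: return a copy with one extra field.
    return {**sample, "processed": True}


# Only `name` is required; the other arguments may be omitted.
ds = ToyDataset("toy", base_dir="/tmp/datasets", pre_transform=add_flag)
plain = ToyDataset("toy")
```

Nothing is written to disk here; `base_dir` only determines the path stored in `root`.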

KarateClubDataset()

Zachary's Karate Club network dataset from the "An Information Flow Model for Conflict and Fission in Small Groups" paper.

PlanetoidDataset(name[, split, ...])

The citation network datasets "Cora", "CiteSeer" and "PubMed" from the "Revisiting Semi-Supervised Learning with Graph Embeddings" paper.

QM7bDataset([base_dir])

QM7b dataset from the "MoleculeNet: A Benchmark for Molecular Machine Learning" paper, consisting of 7,211 molecules with 14 regression targets.

TUDataset(name[, cleaned, base_dir])

A collection of over 120 benchmark datasets for graph classification and regression, made available by TU Dortmund University.

SuperPixelDataset(name, split[, ...])

MNIST and CIFAR10 superpixel datasets for graph classification tasks, converted from the original MNIST and CIFAR10 images.

OGBDataset(name[, split, base_dir])

Datasets from the Open Graph Benchmark (OGB), a collection of realistic, large-scale, and diverse benchmarks for machine learning on graphs.

EllipticBitcoinDataset([base_dir, ...])

The Elliptic Bitcoin dataset of Bitcoin transactions from the "Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics" paper.

MovieLens100K(base_dir[, transform, ...])

The MovieLens 100K heterogeneous rating dataset, assembled by GroupLens Research from the MovieLens web site, consisting of movies (1,682 nodes) and users (943 nodes) with 100K ratings between them.
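
To make "heterogeneous" concrete: the graph has two node types (users and movies), and rating edges run between them. The sketch below stores a few made-up ratings in plain Python containers. The type names, the `"rates"` relation, and the layout are illustrative assumptions, not the structure `MovieLens100K` actually returns; only the node counts come from the description above.

```python
from collections import defaultdict

# Node counts as stated in the description above.
NUM_NODES = {"user": 943, "movie": 1682}

# Made-up (user_id, movie_id, rating) triples, for illustration only.
ratings = [(0, 10, 5), (0, 42, 3), (1, 10, 4), (942, 1681, 2)]

# Edges keyed by (source type, relation, destination type).
edges = {("user", "rates", "movie"): ratings}


def validate(edges, num_nodes):
    """Check that every edge endpoint is a valid index for its node type."""
    for (src_type, _, dst_type), triples in edges.items():
        for src, dst, _ in triples:
            if not (0 <= src < num_nodes[src_type]):
                raise ValueError(f"bad {src_type} index: {src}")
            if not (0 <= dst < num_nodes[dst_type]):
                raise ValueError(f"bad {dst_type} index: {dst}")


def mean_rating_per_movie(triples):
    """Average the ratings on the edges pointing at each movie node."""
    totals = defaultdict(lambda: [0, 0])  # movie_id -> [sum, count]
    for _, movie, rating in triples:
        totals[movie][0] += rating
        totals[movie][1] += 1
    return {movie: s / c for movie, (s, c) in totals.items()}


validate(edges, NUM_NODES)
means = mean_rating_per_movie(edges[("user", "rates", "movie")])
```

Keying edges by a (source type, relation, destination type) triple keeps user and movie indices in separate ranges, so user 10 and movie 10 are different nodes.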

IMDB([base_dir, transform, pre_transform])

A subset of the Internet Movie Database (IMDB), as collected in the "MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding" paper.

DBLP([base_dir, transform, pre_transform])

A subset of the DBLP computer science bibliography website, as collected in the "MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding" paper.