mlx_graphs.datasets.PlanetoidDataset

class mlx_graphs.datasets.PlanetoidDataset(name: Literal['cora', 'citeseer', 'pubmed'], split: Literal['public', 'full', 'geom-gcn'] = 'public', without_self_loops: bool = True, base_dir: str | None = None)

Bases: Dataset

The citation network datasets "Cora", "CiteSeer" and "PubMed" from the paper "Revisiting Semi-Supervised Learning with Graph Embeddings". Nodes represent documents and edges represent citation links. The training, validation and test splits are given as boolean masks with one entry per node.

The implementation is similar to the corresponding dataset in PyG (PyTorch Geometric).

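Parameters:

name (Literal['cora', 'citeseer', 'pubmed']) – Name of the dataset to load.
split (Literal['public', 'full', 'geom-gcn']) – Type of train/validation/test split to use. Defaults to 'public'.
without_self_loops (bool) – Whether to remove self-loops from the graph. Defaults to True.
base_dir (str | None) – Base directory for the dataset files. Defaults to None.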

Example:

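from mlx_graphs.datasets import PlanetoidDataset

# Load the Cora citation network (a single graph).
dataset = PlanetoidDataset("cora")
print(dataset)
# cora(num_graphs=1)

# The only graph in the dataset holds edges, features, labels and split masks.
print(dataset[0])
# GraphData(
#     edge_index(shape=(2, 10556), int32)
#     node_features(shape=(2708, 1433), float32)
#     node_labels(shape=(2708,), int32)
#     train_mask(shape=(2708,), bool)
#     val_mask(shape=(2708,), bool)
#     test_mask(shape=(2708,), bool))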
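
The split masks described above are typically used to restrict a metric to one subset of nodes. The following is a minimal sketch of that idea, not part of the mlx_graphs API: masked_accuracy is a hypothetical helper, and the labels, predictions and masks are toy Python lists standing in for node_labels, model outputs, train_mask, val_mask and test_mask (they are not real Cora values).

```python
def masked_accuracy(predictions, labels, mask):
    """Accuracy computed only over the nodes where mask is True."""
    # Keep one correct/incorrect flag per node selected by the mask.
    selected = [p == y for p, y, m in zip(predictions, labels, mask) if m]
    if not selected:
        raise ValueError("mask selects no nodes")
    return sum(selected) / len(selected)


# Toy stand-ins for a 5-node graph (hypothetical data).
labels = [0, 1, 2, 1, 0]
predictions = [0, 1, 1, 1, 2]
train_mask = [True, True, False, False, False]
val_mask = [False, False, True, False, False]
test_mask = [False, False, False, True, True]

train_acc = masked_accuracy(predictions, labels, train_mask)
val_acc = masked_accuracy(predictions, labels, val_mask)
test_acc = masked_accuracy(predictions, labels, test_mask)
print(train_acc, val_acc, test_acc)
```

Each mask has the same length as the number of nodes, so the same predictions can be scored on any split without copying or reordering the graph data.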