mlx_graphs.datasets.IMDB

class mlx_graphs.datasets.IMDB(base_dir: str | None = None, transform: Callable | None = None, pre_transform: Callable | None = None)

Bases: Dataset

A subset of the Internet Movie Database (IMDB), as collected in the “MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding” paper. IMDB is a heterogeneous graph with three node types: movies (4,278 nodes), actors (5,257 nodes), and directors (2,081 nodes). Each movie belongs to one of three classes (action, comedy, drama) according to its genre. The features of a movie are a bag-of-words representation of its plot keywords.

Parameters:
  • base_dir (Optional[str]) – Directory where the dataset should be saved. (default: None)

  • transform (Optional[Callable]) – A function/transform that takes in a HeteroGraphData object and returns a transformed version. It is applied every time the data object is accessed. (default: None)

  • pre_transform (Optional[Callable]) – A function/transform that takes in a HeteroGraphData object and returns a transformed version. It is applied once, before the data object is saved to disk. (default: None)
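
Example:

The difference between transform and pre_transform is only in when the callable runs, which the following minimal sketch tries to make concrete. ToyDataset, count_pre, count_access and load_imdb are hypothetical names introduced for illustration; ToyDataset is a stand-in that mimics the timing described above and is not how mlx_graphs implements its datasets. load_imdb uses only the constructor signature documented on this page; it is defined but not called here, since calling it requires mlx_graphs to be installed and may download data.

```python
from typing import Any, Callable, Optional


class ToyDataset:
    """Hypothetical stand-in that mimics the documented hook timing only."""

    def __init__(
        self,
        raw_graphs: list,
        transform: Optional[Callable[[Any], Any]] = None,
        pre_transform: Optional[Callable[[Any], Any]] = None,
    ):
        self.transform = transform
        # pre_transform runs once per graph, at the point where a real
        # dataset would write its processed data to disk.
        if pre_transform is not None:
            raw_graphs = [pre_transform(g) for g in raw_graphs]
        self._graphs = raw_graphs

    def __len__(self) -> int:
        return len(self._graphs)

    def __getitem__(self, idx: int) -> Any:
        graph = self._graphs[idx]
        # transform runs again on every access; its result is not cached.
        return self.transform(graph) if self.transform is not None else graph


# Count how often each hook is called.
calls = {"pre_transform": 0, "transform": 0}


def count_pre(graph: Any) -> Any:
    calls["pre_transform"] += 1
    return graph


def count_access(graph: Any) -> Any:
    calls["transform"] += 1
    return graph


toy = ToyDataset([{"name": "imdb"}], transform=count_access, pre_transform=count_pre)
for _ in range(3):
    _ = toy[0]  # three accesses, so three transform calls


def load_imdb(base_dir: Optional[str] = None):
    """Load the real dataset (assumes mlx_graphs is installed)."""
    from mlx_graphs.datasets import IMDB  # imported lazily on purpose

    return IMDB(base_dir=base_dir, transform=count_access)
```

After the loop, count_pre has run once and count_access three times. Under the behaviour documented above, expensive one-off preprocessing therefore fits pre_transform, while cheap or randomized per-access changes fit transform.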