mlx_graphs.datasets.EllipticBitcoinDataset

class mlx_graphs.datasets.EllipticBitcoinDataset(base_dir: str | None = None, pre_transform: Callable | None = None, transform: Callable | None = None)

Bases: Dataset

The Elliptic Bitcoin dataset of Bitcoin transactions from the “Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics” paper.

EllipticBitcoinDataset maps Bitcoin transactions to real entities belonging to licit categories (exchanges, wallet providers, miners, licit services, etc.) versus illicit ones (scams, malware, terrorist organizations, ransomware, Ponzi schemes, etc.).

The graph contains 203,769 nodes (transactions) and 234,355 directed edges (payment flows). Two percent of the nodes (4,545) are labelled as illicit and twenty-one percent (42,019) are labelled as licit. The labels of the remaining transactions are unknown.
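As a quick check of the figures above, the following sketch derives the number of unlabelled transactions and the share of each class from the stated counts. It uses only the numbers quoted in this description; the variable names are illustrative and are not part of the mlx_graphs API.

```python
# Counts quoted in the dataset description above.
num_nodes = 203_769
num_illicit = 4_545
num_licit = 42_019

# Everything that is neither licit nor illicit has an unknown label.
num_unknown = num_nodes - num_illicit - num_licit

# Share of each class, as a percentage rounded to one decimal place.
class_share = {
    name: round(100 * count / num_nodes, 1)
    for name, count in (
        ("illicit", num_illicit),
        ("licit", num_licit),
        ("unknown", num_unknown),
    )
}

print(num_unknown)
print(class_share)
```

Roughly three quarters of the nodes carry no label, so only the labelled minority is usable as supervised training targets.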

Parameters:
  • base_dir (Optional[str]) – Directory in which to store the dataset files. Defaults to the local directory .mlx_graphs_data/.

  • pre_transform (Optional[Callable]) – A function/transform that takes in a GraphData object and returns a transformed version. The data will be transformed before being saved to disk.

  • transform (Optional[Callable]) – A function/transform that takes in a GraphData object and returns a transformed version. The data object will be transformed before every access.
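The difference between pre_transform (applied once, before saving) and transform (applied on every access) can be illustrated with a minimal sketch. ToyGraph and ToyDataset below are hypothetical stand-ins written for this example; they are not the mlx_graphs implementation and only mimic the behaviour described in the parameter list. With the real class, the same two callables would be passed through the pre_transform and transform arguments shown in the signature.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ToyGraph:
    """Hypothetical stand-in for GraphData: holds only node features."""
    node_features: List[float]


class ToyDataset:
    """Illustrative stand-in that mimics the documented transform semantics."""

    def __init__(
        self,
        graphs: List[ToyGraph],
        pre_transform: Optional[Callable] = None,
        transform: Optional[Callable] = None,
    ):
        # pre_transform runs once, at the point where data would be saved.
        self._stored = [pre_transform(g) if pre_transform else g for g in graphs]
        self._transform = transform

    def __getitem__(self, idx: int) -> ToyGraph:
        # transform runs on every access, on top of the stored data.
        graph = self._stored[idx]
        return self._transform(graph) if self._transform else graph


calls = {"pre_transform": 0, "transform": 0}


def scale_features(graph: ToyGraph) -> ToyGraph:
    """Example pre_transform: double every node feature."""
    calls["pre_transform"] += 1
    return replace(graph, node_features=[2.0 * x for x in graph.node_features])


def shift_features(graph: ToyGraph) -> ToyGraph:
    """Example transform: add one to every node feature."""
    calls["transform"] += 1
    return replace(graph, node_features=[x + 1.0 for x in graph.node_features])


dataset = ToyDataset(
    [ToyGraph([1.0, 2.0])],
    pre_transform=scale_features,
    transform=shift_features,
)
first = dataset[0]
second = dataset[0]

print(first.node_features, second.node_features)
print(calls)
```

Deterministic, expensive preprocessing is a natural fit for pre_transform, since its result is stored; cheap or randomized steps fit transform, since they are re-applied each time an item is read.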