NaturalIdPartitioner

class NaturalIdPartitioner(partition_by: str)

Bases: Partitioner

Partitioner for a dataset that can be divided by a column with partition ids.

Parameters:

partition_by (str) – The name of the column whose unique values identify the partitions.

Examples

"flwrlabs/shakespeare" dataset

>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import NaturalIdPartitioner
>>>
>>> partitioner = NaturalIdPartitioner(partition_by="character_id")
>>> fds = FederatedDataset(dataset="flwrlabs/shakespeare",
...                        partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)

"sentiment140" (aka Twitter) dataset

>>> from flwr_datasets import FederatedDataset
>>> from flwr_datasets.partitioner import NaturalIdPartitioner
>>>
>>> partitioner = NaturalIdPartitioner(partition_by="user")
>>> fds = FederatedDataset(dataset="sentiment140",
...                        partitioners={"train": partitioner})
>>> partition = fds.load_partition(0)

Methods

is_dataset_assigned()

Check if a dataset has been assigned to the partitioner.

load_partition(partition_id)

Load a single partition corresponding to a single partition_id.

Attributes

dataset

Dataset property.

num_partitions

Total number of partitions.

partition_id_to_natural_id

Mapping from partition id to the corresponding natural id present in the dataset.

property dataset: Dataset

Dataset property.

is_dataset_assigned() → bool

Check if a dataset has been assigned to the partitioner.

This method returns True if a dataset is already set for the partitioner; otherwise, it returns False.

Returns:

dataset_assigned – True if a dataset is assigned, otherwise False.

Return type:

bool
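
The check described above can be pictured with a small stand-in class. This is a minimal sketch of the pattern, not the library's code: `SketchPartitioner` is a hypothetical name, the dataset is a plain list of dicts, and raising `AttributeError` when reading an unassigned dataset is an assumption made for illustration.

```python
from typing import List, Optional


class SketchPartitioner:
    """Hypothetical stand-in for a partitioner; not the library class."""

    def __init__(self) -> None:
        # No dataset is attached until the caller assigns one.
        self._dataset: Optional[List[dict]] = None

    @property
    def dataset(self) -> List[dict]:
        # Assumption: reading the dataset before assignment is an error.
        if self._dataset is None:
            raise AttributeError("No dataset has been assigned yet.")
        return self._dataset

    @dataset.setter
    def dataset(self, value: List[dict]) -> None:
        self._dataset = value

    def is_dataset_assigned(self) -> bool:
        # True once a dataset has been set, False otherwise.
        return self._dataset is not None


partitioner = SketchPartitioner()
before = partitioner.is_dataset_assigned()
partitioner.dataset = [{"user": "alice"}]
after = partitioner.is_dataset_assigned()
```

Calling `is_dataset_assigned()` first lets calling code avoid touching `dataset` before anything has been attached.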

load_partition(partition_id: int) → Dataset

Load a single partition corresponding to a single partition_id.

The partition is selected by a unique integer assigned to each natural id present in the partition_by column of the dataset.

Parameters:

partition_id (int) – The index that corresponds to the requested partition.

Returns:

dataset_partition – A single dataset partition.

Return type:

Dataset
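
The relationship between integer partition ids and natural ids can be illustrated without the library. The following is a minimal pure-Python sketch, not the library's implementation: the helper names `assign_partition_ids` and `select_partition`, the sample rows, and the choice to number natural ids in order of first appearance are all assumptions made for illustration.

```python
from typing import Dict, List


def assign_partition_ids(rows: List[dict], partition_by: str) -> Dict[int, str]:
    """Give each distinct value in the `partition_by` column an integer id.

    Assumption: ids follow the order of first appearance; the library may
    order them differently.
    """
    seen: Dict[str, None] = {}
    for row in rows:
        # dicts preserve insertion order, so this keeps first-appearance order.
        seen.setdefault(str(row[partition_by]), None)
    return dict(enumerate(seen))


def select_partition(rows: List[dict], partition_by: str, partition_id: int) -> List[dict]:
    """Return every row whose natural id is mapped to `partition_id`."""
    mapping = assign_partition_ids(rows, partition_by)
    natural_id = mapping[partition_id]
    return [row for row in rows if str(row[partition_by]) == natural_id]


# Made-up rows standing in for a dataset with a "user" column.
rows = [
    {"user": "alice", "text": "a1"},
    {"user": "bob", "text": "b1"},
    {"user": "alice", "text": "a2"},
]
mapping = assign_partition_ids(rows, "user")
partition_0 = select_partition(rows, "user", 0)
```

In this sketch every row belonging to the same natural id lands in the same partition, so the number of partitions equals the number of distinct values in the column.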

property num_partitions: int

Total number of partitions.

property partition_id_to_natural_id: Dict[int, str]

Mapping from partition id to the corresponding natural id present in the dataset.

Natural ids are the unique values in the partition_by column of the dataset.
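
A common need is the reverse lookup: finding the partition id for a known natural id. The sketch below assumes only the `Dict[int, str]` shape given in the signature; the dictionary contents are made-up values for illustration, not real output from any dataset.

```python
from typing import Dict

# Hypothetical mapping with the Dict[int, str] shape; values are invented.
partition_id_to_natural_id: Dict[int, str] = {
    0: "HAMLET",
    1: "OPHELIA",
    2: "HORATIO",
}

# Invert the mapping; this is safe because natural ids are unique values.
natural_id_to_partition_id: Dict[str, int] = {
    natural_id: partition_id
    for partition_id, natural_id in partition_id_to_natural_id.items()
}

ophelia_partition_id = natural_id_to_partition_id["OPHELIA"]
```

The resulting integer could then be passed as `partition_id` to `load_partition`.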