

class Column

Enumeration of column types. Supported types are: TEXT, CATEGORICAL, BOOLEAN, NUMERIC, INTEGER, DATETIME, DATE, TIME, COORDINATES.

class Table


def __init__(columns: dict[str, Column | ForeignKey | PrimaryKey] | None = None,
**kwargs: Column | ForeignKey | PrimaryKey) -> None

Class representing a table in a relational dataset.


  • columns - Dictionary mapping each feature column to a Column object and each foreign key column to a ForeignKey object. A single optional PrimaryKey may be specified.
  • **kwargs - Additional Column, PrimaryKey or ForeignKey objects.
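
As a rough illustration of the two ways to pass columns, the sketch below uses minimal stand-in classes that mirror only the signatures documented here, so it runs without the library installed. The enum member values, the merging of `columns` with `**kwargs`, the `ValueError` raised on a second primary key, and the `players`/`seasons` tables are all assumptions made for the example, not documented behavior.

```python
from enum import Enum


class Column(Enum):
    """Stand-in for the documented enumeration (member values are made up)."""
    TEXT = "text"
    CATEGORICAL = "categorical"
    BOOLEAN = "boolean"
    NUMERIC = "numeric"
    INTEGER = "integer"
    DATETIME = "datetime"
    DATE = "date"
    TIME = "time"
    COORDINATES = "coordinates"


class PrimaryKey:
    """Stand-in marker for a primary key column."""


class ForeignKey:
    """Stand-in for a foreign key that refers to a parent table."""

    def __init__(self, parent: str) -> None:
        self.parent = parent


class Table:
    """Stand-in table: merges `columns` with keyword arguments (assumed)."""

    def __init__(self, columns=None, **kwargs) -> None:
        self.columns = {**(columns or {}), **kwargs}
        n_pk = sum(isinstance(c, PrimaryKey) for c in self.columns.values())
        if n_pk > 1:
            # Assumed behavior: the text only says a single optional PrimaryKey is allowed.
            raise ValueError("At most one PrimaryKey may be specified")


# Hypothetical table with every column passed through the dictionary.
players = Table({"id": PrimaryKey(), "name": Column.TEXT, "born": Column.DATE})

# Hypothetical table mixing the dictionary with keyword arguments.
seasons = Table(
    {"player_id": ForeignKey(parent="players")},
    year=Column.INTEGER,
    league=Column.CATEGORICAL,
)
```

Keyword arguments are convenient when column names are valid Python identifiers; the dictionary form is needed for names that are not (for example, names containing spaces).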

class Schema


def __init__(tables: dict[str, Table] | None = None, **kwargs: Table) -> None

Schema of a relational dataset.


  • tables - Dictionary associating each data table with a Table object. Tables are reordered so that the schema graph can be traversed from its roots to its leaves.
  • **kwargs - Additional Table objects.
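
To make the root-to-leaf ordering concrete, here is a minimal sketch using the standard-library `graphlib` module. It is not the library's implementation; the table names and the `parents` mapping (each table mapped to the parent tables its foreign keys refer to) are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: table name -> parent tables referenced by its foreign keys.
parents = {
    "orders": {"customers", "products"},
    "customers": set(),
    "order_notes": {"orders"},
    "products": set(),
}

# TopologicalSorter expects node -> predecessors, so parents are emitted before children.
ordered = list(TopologicalSorter(parents).static_order())

# Check the defining property: every parent precedes each of its children.
position = {name: i for i, name in enumerate(ordered)}
for child, table_parents in parents.items():
    for parent in table_parents:
        assert position[parent] < position[child]
```

Root tables (those with no foreign keys) come first, and a table only appears after every table it refers to. The relative order of unrelated tables is not fixed by this property.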

class PrimaryKey

Primary key of a table in a relational dataset.

class ForeignKey


def __init__(parent: str) -> None

Foreign key of a relational dataset, namely a relation between two tables.


  • parent - Parent table (the table to which the key refers).

class RelationalData


def __init__(data: Data, schema: Schema) -> None

Relational data structure.


  • data - A dictionary with table names as keys and pandas.DataFrame objects as values.
  • schema - A Schema object.


def split(ratio: float,
reset_index: bool = False,
rng: np.random.Generator | int | None = None) -> tuple[RD, RD]

Split the input data, applying the given ratio to each root table.


  • ratio - Split ratio.
  • reset_index - Whether to reset the index of the resulting dataframes.
  • rng - Random state. If an int, it is used as the seed; if None, the seed is chosen randomly.


Returns a tuple with the two splits.
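
The idea of splitting at the level of a root table can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: rows are dictionaries rather than dataframes, there is one root and one child table, the first split receives the `ratio` fraction of root rows, and child rows follow their parent. The helper `split_by_root` and the sample data are hypothetical.

```python
import random


def split_by_root(root_rows, child_rows, fk, ratio, seed=None):
    """Split root rows by `ratio`, then send each child row with its parent."""
    rng = random.Random(seed)
    ids = [row["id"] for row in root_rows]
    rng.shuffle(ids)
    n_first = round(len(ids) * ratio)
    first_ids = set(ids[:n_first])

    def part(keep_first):
        # Keep rows whose (parent) id falls on the requested side of the split.
        roots = [r for r in root_rows if (r["id"] in first_ids) == keep_first]
        children = [c for c in child_rows if (c[fk] in first_ids) == keep_first]
        return {"root": roots, "child": children}

    return part(True), part(False)


# Hypothetical data: 10 customers, each with two orders.
customers = [{"id": i} for i in range(10)]
orders = [{"id": 2 * i + k, "customer_id": i} for i in range(10) for k in range(2)]

first, second = split_by_root(customers, orders, "customer_id", ratio=0.8, seed=0)
print(len(first["root"]), len(second["root"]))    # 8 2
print(len(first["child"]), len(second["child"]))  # 16 4
```

Splitting on root rows rather than on every table independently keeps each child row in the same split as the row it refers to, so no foreign key is left dangling in either split.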