Models

class TabularModel

TabularModel.build

@classmethod
def build(cls: type['TabularModel'],
preproc: TabularPreproc,
size: str | Size | TabularModelSize,
dropout: float | None = 0.01,
block_size: int | None = None) -> 'TabularModel'

Build a tabular model that generates synthetic relational tabular data.

Arguments:

  • preproc - A TabularPreproc object.
  • size - The size configuration of the model. Either a Size object, a TabularModelSize object, or a string representation of one.
  • dropout - The dropout probability.
  • block_size - Maximum sequence length that the model can process. If None, no maximum sequence length is imposed.

Returns:

A TabularModel instance.

TabularModel.generate

def generate(n_samples: int,
batch_size: int = 0,
temp: float = 1.0) -> RelationalData

Generate synthetic relational data.

Arguments:

  • n_samples - Desired number of samples in the root table.
  • batch_size - Batch size used during generation. If 0, generate all data in a single batch.
  • temp - Temperature parameter for sampling.

Returns:

A RelationalData object.
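
The batch_size convention above (0 means a single batch) can be illustrated with a minimal sketch. The helper plan_batches is hypothetical and not part of the library; it only shows how n_samples root-table rows might be split into batches under that convention.

```python
def plan_batches(n_samples: int, batch_size: int = 0) -> list[int]:
    """Return the size of each generation batch (hypothetical helper)."""
    if n_samples < 0 or batch_size < 0:
        raise ValueError("n_samples and batch_size must be non-negative")
    if n_samples == 0:
        return []
    if batch_size == 0:
        # Documented convention: 0 means everything goes in one batch.
        return [n_samples]
    # Full batches first, then one smaller batch for the remainder, if any.
    full, rest = divmod(n_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])
```

A smaller batch_size generally trades speed for a lower peak memory footprint, which is the usual reason to set it to a non-zero value.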

class TextModel

TextModel.build

@classmethod
def build(cls: type['TextModel'],
preproc: TextPreproc,
size: str | Size | TextModelSize,
block_size: int,
dropout: float | None = 0.01) -> 'TextModel'

Build a text model that generates synthetic text columns for a table that is part of a relational structure.

Arguments:

  • preproc - A TextPreproc object.
  • size - The size configuration of the model. Either a Size object, a TextModelSize object, or a string representation of one.
  • block_size - Maximum text sequence length that the model can process.
  • dropout - The dropout probability.

Returns:

A TextModel instance.

TextModel.generate

def generate(data: RelationalData,
batch_size: int = 0,
max_text_len: int = 0,
temp: float = 1.0) -> RelationalData

Generate text columns in the current table.

Arguments:

  • data - A RelationalData object containing synthetic data.
  • batch_size - Batch size used during generation. If 0, generate all data in a single batch.
  • max_text_len - Maximum length for the generated text. If 0, the maximum possible value is used.
  • temp - Temperature parameter for sampling.

Returns:

A RelationalData object.

class Size

Enumeration class representing different model sizes. Supported sizes are: SMALL, MEDIUM and LARGE.
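
As an illustration of how an enumeration with a string representation can be resolved, here is a minimal stand-in. SizeSketch and parse_size are hypothetical names; the member values and the case-insensitive parsing are assumptions, since the source lists only the member names SMALL, MEDIUM and LARGE.

```python
from enum import Enum


class SizeSketch(Enum):
    """Hypothetical stand-in for Size; the string values are assumed."""
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def parse_size(value: "SizeSketch | str") -> SizeSketch:
    """Accept either an enum member or its name as a string (assumed behavior)."""
    if isinstance(value, SizeSketch):
        return value
    try:
        # Look the member up by name, ignoring case and surrounding spaces.
        return SizeSketch[value.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown size: {value!r}") from None
```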

class TabularModelSize

Model size for TabularModel objects.

Arguments:

  • n_layers - Number of internal layers.
  • h - Number of heads.
  • d - Size of the internal dimension.

TabularModelSize.from_size

@classmethod
def from_size(cls, size: Size | str) -> 'TabularModelSize'

Create an instance based on a given Size or its string representation.

Arguments:

  • size - A Size or a str representing a Size.

class TextModelSize

Model size for TextModel objects.

Arguments:

  • n_layers - Number of internal layers.
  • h - Number of heads.
  • d - Size of the internal dimension.

TextModelSize.from_size

@classmethod
def from_size(cls, size: Size | str) -> 'TextModelSize'

Create an instance based on a given Size or its string representation.

Arguments:

  • size - A Size or a str representing a Size.
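
The from_size pattern shared by TabularModelSize and TextModelSize can be sketched as a classmethod that maps a preset to concrete n_layers, h and d values. Everything below is a hypothetical stand-in: the class names, the preset table and especially the numbers are invented for illustration and do not reflect the library's actual presets.

```python
from dataclasses import dataclass
from enum import Enum


class SizeSketch(Enum):
    """Hypothetical stand-in for Size."""
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


@dataclass(frozen=True)
class ModelSizeSketch:
    """Hypothetical stand-in for TabularModelSize / TextModelSize."""
    n_layers: int  # number of internal layers
    h: int         # number of heads
    d: int         # size of the internal dimension

    @classmethod
    def from_size(cls, size: "SizeSketch | str") -> "ModelSizeSketch":
        # Accept either the enum member or its name as a string.
        if isinstance(size, str):
            size = SizeSketch[size.upper()]
        return cls(**_PRESETS[size])


# Illustrative numbers only; the real presets are not given in the source.
_PRESETS = {
    SizeSketch.SMALL: dict(n_layers=2, h=2, d=64),
    SizeSketch.MEDIUM: dict(n_layers=4, h=4, d=128),
    SizeSketch.LARGE: dict(n_layers=8, h=8, d=256),
}
```

In this sketch a preset and an explicit configuration are interchangeable: ModelSizeSketch.from_size("small") and ModelSizeSketch(n_layers=2, h=2, d=64) produce equal objects, which mirrors how the build methods accept either a size preset or a full size object.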