# Benchmarks

In this section, we present benchmarks that illustrate the resources consumed by standard training on specific CPU and GPU machines, for several datasets and model sizes.
## CPU training and generation

In this section, we present benchmarks for typical CPU training and synthesis times. Except for the batch size, all parameters were left at their default values. All tests were conducted on an AWS EC2 c6i.4xlarge instance, equipped with 16 CPU cores and 32 GB of RAM.
### Dataset Adult

The UCI Adult dataset, also known as the Census Income dataset, is a single-table dataset with 15 columns and 29,305 records.
Model Size | Batch Size | Training Time (s) | Training Steps | Time per step (s) | Generation Time (s) |
---|---|---|---|---|---|
Small | 256 | 320 | 4200 | 0.076 | 5 |
Medium | 256 | 425 | 3200 | 0.133 | 6 |
Large | 256 | 608 | 2600 | 0.233 | 8 |
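As a sanity check, the reported time per step is simply the total training time divided by the number of training steps; the small discrepancies come from rounding of the reported totals:

```python
# Reported rows for the Adult dataset: (training_time_s, steps, time_per_step_s)
rows = {
    "Small": (320, 4200, 0.076),
    "Medium": (425, 3200, 0.133),
    "Large": (608, 2600, 0.233),
}

for size, (total_s, steps, per_step) in rows.items():
    # Time per step = total training time / number of steps.
    print(f"{size}: {total_s / steps:.3f} s/step (reported: {per_step})")
```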
### Dataset Basket

We consider a subset of the BasketballMen dataset:

- `players`: parent table
  - 4,556 records
  - primary key: `playerID`
  - feature columns: `pos`, `height`, `weight`, `college`, `race`, `birthCity`, `birthState`, `birthCountry`
- `season`: child table
  - 21,415 records
  - foreign key: `playerID`
  - feature columns: `year`, `stint`, `tmID`, `lgID`, `GP`, `points`, `GS`, `assists`, `steals`, `minutes`
- `all_stars`: child table
  - 1,487 records
  - foreign key: `playerID`
  - feature columns: `conference`, `league_id`, `points`, `rebounds`, `assists`, `blocks`
Model Size | Batch Size | Training Time (s) | Training Steps | Time per step (s) | Generation Time (s) |
---|---|---|---|---|---|
Small | 64 | 1923 | 5800 | 0.332 | 8 |
Medium | 64 | 2339 | 3800 | 0.616 | 16 |
Large | 64 | 3312 | 2800 | 1.183 | 18 |
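The parent–child structure above can be summarized as plain data. The following dict is purely illustrative (it does not correspond to any specific SDK's configuration format); table, key, and column names are taken from the schema above:

```python
# Illustrative description of the BasketballMen subset's schema.
schema = {
    "players": {
        "role": "parent",
        "primary_key": "playerID",
        "features": ["pos", "height", "weight", "college", "race",
                     "birthCity", "birthState", "birthCountry"],
    },
    "season": {
        "role": "child",
        "foreign_key": "playerID",
        "features": ["year", "stint", "tmID", "lgID", "GP",
                     "points", "GS", "assists", "steals", "minutes"],
    },
    "all_stars": {
        "role": "child",
        "foreign_key": "playerID",
        "features": ["conference", "league_id", "points",
                     "rebounds", "assists", "blocks"],
    },
}

# Both child tables reference the parent table's primary key.
assert all(schema[t]["foreign_key"] == schema["players"]["primary_key"]
           for t in ("season", "all_stars"))
```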
### Dataset Airbnb

The Airbnb Open Data dataset includes information about Airbnb listings in New York City. In its original form, it consists of a single table, but we find it natural to rearrange it into two tables:

- `host`: parent table
  - 33,712 records
  - primary key: `host_id`
  - feature columns: `host_name`, `calculated_host_listings_count`
- `listings`: child table
  - 43,885 records
  - primary key: `id`
  - foreign key: `host_id`
  - feature columns: `neighbourhood_group`, `neighbourhood`, `latitude`, `longitude`, `room_type`, `price`, `minimum_nights`, `number_of_reviews`, `last_review`, `reviews_per_month`, `availability_365`
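The rearrangement into a `host` parent table and a `listings` child table can be sketched with pandas. The toy DataFrame below is a hypothetical stand-in for the original single table (column names taken from the schema above, values invented):

```python
import pandas as pd

# Hypothetical single-table input with host-level and listing-level columns mixed.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "host_id": [10, 10, 20],
    "host_name": ["Ann", "Ann", "Bo"],
    "calculated_host_listings_count": [2, 2, 1],
    "price": [120, 80, 95],  # stand-in for the remaining listing features
})

# Parent table: one row per host, carrying the host-level columns.
host = (df[["host_id", "host_name", "calculated_host_listings_count"]]
        .drop_duplicates("host_id")
        .reset_index(drop=True))

# Child table: one row per listing, keeping host_id as the foreign key.
listings = df.drop(columns=["host_name", "calculated_host_listings_count"])

print(len(host), len(listings))  # 2 hosts, 3 listings
```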
Model Size | Batch Size | Training Time (s) | Training Steps | Time per step (s) | Generation Time (s) |
---|---|---|---|---|---|
Small | 32 | 1892 | 18200 | 0.104 | 24 |
Medium | 32 | 2933 | 16400 | 0.179 | 29 |
Large | 32 | 3526 | 11200 | 0.315 | 39 |
## Single- and Multi-GPU training

In this section, we present benchmarks for typical CPU and GPU training times, measured on a virtual machine with 4x L40S GPUs and a 32-core AMD EPYC 9354 CPU. They illustrate the speed-up that single- and multi-GPU training can provide over CPU training.
### Dataset Berka

The Berka dataset is a collection of financial information from a Czech bank. We consider the following tables:

- `account`: parent table
  - 4,050 records
  - primary key: `account_id`
  - feature columns: `district_id`, `frequency`, `date`
- `order`: child table
  - 5,822 records
  - primary key: `order_id`
  - foreign key: `account_id`
  - feature columns: `bank_to`, `account_to`, `amount`, `k_symbol`
- `loan`: child table
  - 606 records
  - primary key: `loan_id`
  - foreign key: `account_id`
  - feature columns: `date`, `amount`, `duration`, `payments`, `status`
- `trans`: child table
  - 388,249 records
  - primary key: `trans_id`
  - foreign key: `account_id`
  - feature columns: `date`, `type`, `operation`, `amount`, `balance`, `k_symbol`, `bank`
  - maximum 100 transactions per client
The following parameters were used for all runs:
- batch size: 512
- training steps: 20,000
Model Size | CPU: Time per step (s) | CPU: RAM (GiB) | GPU: Time per step (s) | GPU: VRAM (GiB) | 4x GPUs: Time per step (s) | 4x GPUs: VRAM max per GPU (GiB) |
---|---|---|---|---|---|---|
Small | 18.6 | 5.6 | 0.63 | 3.4 | 0.26 | 4.6 |
Medium | 34.5 | 7.9 | 1.09 | 6.4 | 0.32 | 10.4 |
Large | 63.7 | 14.2 | 1.91 | 15.5 | 0.50 | 16.0 |
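The speed-ups implied by the table can be computed directly, as the CPU time per step divided by the GPU time per step:

```python
# Per-step times (s) from the table above: (CPU, 1x GPU, 4x GPUs).
times = {
    "Small": (18.6, 0.63, 0.26),
    "Medium": (34.5, 1.09, 0.32),
    "Large": (63.7, 1.91, 0.50),
}

for size, (cpu, gpu1, gpu4) in times.items():
    # Speed-up relative to CPU training.
    print(f"{size}: 1x GPU {cpu / gpu1:.0f}x, 4x GPUs {cpu / gpu4:.0f}x")
```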
### Dataset Porto

The Porto dataset contains taxi trajectories recorded over one year (from 2013/07/01 to 2014/06/30) in the city of Porto, Portugal. The original dataset consists of a single table, with the column `POLYLINE` containing all the GPS coordinates of each trip. We prefer to split the data into two tables, with a parent table `trip` containing some trip features, and a child table `trajectory` containing each individual GPS coordinate as a single row in the `coord` column:

- `trip`: parent table
  - 100,000 records
  - primary key: `TRIP_ID`
  - feature columns: `TAXI_ID`, `CALL_TYPE`, `TIMESTAMP`
- `trajectory`: child table
  - 4,377,175 records
  - foreign key: `TRIP_ID`
  - feature columns: `coord`
  - maximum 100 GPS records per trip
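The split described above can be sketched with pandas, unnesting `POLYLINE` so that each GPS point becomes one row of the child table. The toy DataFrame is a hypothetical stand-in for the original data (column names from the schema above, values invented):

```python
import pandas as pd

# Hypothetical input: each trip's GPS points stored as a list in POLYLINE.
df = pd.DataFrame({
    "TRIP_ID": [1, 2],
    "TAXI_ID": [7, 9],
    "POLYLINE": [[(-8.61, 41.14), (-8.62, 41.15)], [(-8.58, 41.16)]],
})

# Parent table: one row per trip with the trip-level features.
trip = df[["TRIP_ID", "TAXI_ID"]]

# Child table: one row per GPS coordinate, keyed by TRIP_ID.
trajectory = (df[["TRIP_ID", "POLYLINE"]]
              .explode("POLYLINE")
              .rename(columns={"POLYLINE": "coord"})
              .reset_index(drop=True))

print(len(trip), len(trajectory))  # 2 trips, 3 coordinates
```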
The following parameters were used for all runs:
- batch size: 2,048
- training steps: 20,000
Model Size | CPU: Time per step (s) | CPU: RAM (GiB) | GPU: Time per step (s) | GPU: VRAM (GiB) | 4x GPUs: Time per step (s) | 4x GPUs: VRAM max per GPU (GiB) |
---|---|---|---|---|---|---|
Small | 12.0 | 6.7 | 0.37 | 3.2 | 0.26 | 4.4 |
Medium | 24.6 | 9.5 | 0.70 | 5.9 | 0.22 | 7.1 |
Large | 48.7 | 14.9 | 1.30 | 11.4 | 0.35 | 12.6 |