View details and evaluate synthesis

The Aindo Platform lets you visualize both real and synthetic data, along with details about the synthesis process. It also provides evaluation metrics for assessing the quality and the degree of privacy protection of the generated synthetic datasets. The sections below give an overview of the viewing, visualization, and evaluation features.

Quick details in list page

On the synthesis list page, clicking on a synthesis will display a side panel with general information about the synthesis, including scores, source information, data types, and model settings used in the synthesis.

1-details.png

View page

On the synthesis list page, double-clicking on a synthesis takes you to the view page of the selected synthesis, where you can see the raw data.

The first section of the page displays two important metrics for assessing the quality of the synthesis:

  • Privacy score: Indicates how well generated synthetic data protects the privacy of real data subjects on a scale from 0 (all private information leaked) to 10 (no private information leaked). The privacy score is based on similarity metrics, a widely accepted approach to measuring privacy protection in synthetic and anonymized datasets.

  • Data accuracy: Indicates how well the synthetic data preserves the properties of the real data source on a scale from 0 (no information preserved) to 10 (all information relevant to analytics, software testing, research, and AI development preserved). The data accuracy score is an aggregation of multiple widely accepted data fidelity and utility metrics.

1-scores-&-charts.png
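
This guide does not state which similarity metrics feed the privacy score. As a rough illustration of the general idea only, the sketch below uses one commonly used similarity check: for each synthetic record, compare its distance to the closest training record with its distance to the closest record in a holdout set that the model never saw. If synthetic records are not systematically closer to the training data than to the holdout data (a share near 0.5), that suggests the generator is not copying individual real records. The function names, the toy data, and the choice of Euclidean distance are all assumptions made for this example and are not the platform's actual computation.

```python
import math


def nearest_distance(record, dataset):
    """Euclidean distance from `record` to its closest row in `dataset`."""
    return min(math.dist(record, row) for row in dataset)


def share_closer_to_training(synthetic, training, holdout):
    """Fraction of synthetic rows strictly closer to a training row than to any holdout row."""
    closer = sum(
        1
        for row in synthetic
        if nearest_distance(row, training) < nearest_distance(row, holdout)
    )
    return closer / len(synthetic)


# Toy numeric data (hypothetical values, assumed to be on comparable scales).
training = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
holdout = [(0.5, 0.5), (1.5, 1.5), (3.0, 3.0)]
synthetic = [(0.1, 0.0), (1.4, 1.5), (2.9, 3.0), (1.0, 1.1)]

share = share_closer_to_training(synthetic, training, holdout)
print(f"Share of synthetic rows closer to training data: {share:.2f}")
```

In this toy example, two of the four synthetic rows are closer to the training set, so the share is 0.50. How such a value would be mapped onto the platform's 0 to 10 scale is not covered here.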

The second section is similar to the source view page. Each column header provides additional information, such as the column type and a ‘shield’ icon.

  • A green shield icon indicates that the original column contained sensitive data that has been synthesized.
  • A gray shield icon indicates that the original column contained sensitive data, but the user chose not to synthesize it.

Under each column, you will find a small chart summarizing key statistics specific to that column.

2-scores-&-charts.png

Executions

To view and manage multiple runs of the synthetic data generation, use the button to access the execution history side panel, where you can create, rename, and delete executions of a specific synthesis.

1-executions-panel.png

You can run the same synthesis multiple times, which is useful if the source data changes over time or if you need to generate additional data.

Execution timeline

The execution timeline dialog shows the various steps and statuses related to generating synthetic data for the current run.

The steps involved in generating a synthesis are:

  • Data loading
  • Preprocessing
  • Building the Generative AI model
  • Training the Generative AI model
  • Synthesis (Output of the Generative AI model)
  • Report generation
  • Storing the newly synthesized data

1-timeline.png
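
The steps listed above run in order, so a timeline can be thought of as an ordered list of steps, each with a status. The sketch below models that idea. The `Step` names mirror the list in this section, while the `Status` values and the `timeline` helper are hypothetical and used only for illustration, since this guide does not enumerate the statuses the platform reports.

```python
from enum import Enum


class Step(Enum):
    """Generation steps, in the order given in this guide."""
    DATA_LOADING = "Data loading"
    PREPROCESSING = "Preprocessing"
    MODEL_BUILDING = "Building the Generative AI model"
    MODEL_TRAINING = "Training the Generative AI model"
    SYNTHESIS = "Synthesis"
    REPORT_GENERATION = "Report generation"
    STORAGE = "Storing the newly synthesized data"


class Status(Enum):
    """Hypothetical statuses; the platform's real status names may differ."""
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


def timeline(current_index, failed=False):
    """Return (step, status) pairs for a run whose active step is `current_index`.

    Steps before the active one are DONE, later ones are PENDING, and the
    active one is RUNNING unless `failed` is set.
    """
    result = []
    for index, step in enumerate(Step):
        if index < current_index:
            status = Status.DONE
        elif index == current_index:
            status = Status.FAILED if failed else Status.RUNNING
        else:
            status = Status.PENDING
        result.append((step, status))
    return result


# Example: a run that is currently training the model (index 3).
for step, status in timeline(3):
    print(f"{step.value:<40} {status.value}")
```

Because each step depends on the output of the previous one, a failure at any step leaves every later step pending in this model.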

Report

The report is a PDF containing the statistics needed to thoroughly assess the quality of the synthetic data compared to the original data.

1-report.png

Quick details synthesis

Clicking on the ‘info’ icon button will display a side panel with general information about the synthetic dataset, the synthetic data types, and the synthesis settings.

1-details.png

Quick details column

Clicking on a column will display a side panel with more detailed information about the selected column.

3-col-details.png

Quick details table

Clicking on a table will display a side panel with more detailed information about the selected table.

2-table-details.png