DataFrameComparison.summary

DataFrameComparison.summary(
show_perfect_column_matches: bool = True,
top_k_column_changes: int = 0,
sample_k_rows_only: int = 0,
show_sample_primary_key_per_change: bool = False,
left_name: str = Side.LEFT,
right_name: str = Side.RIGHT,
slim: bool = False,
hidden_columns: list[str] | None = None,
) → Summary

Generate a summary of all aspects of the comparison.

Parameters:
  • show_perfect_column_matches – Whether to include column matches in the summary even if the column match rate is 100%. Setting this to False is useful when comparing very wide data frames.

  • top_k_column_changes – The maximum number of column value changes to display in the summary for each column with a match rate below 100%. When enabling this feature, make sure that no sensitive data is leaked.

  • sample_k_rows_only – The number of rows to show in the “Rows left/right only” section of the summary. Only the primary key of each row is printed. If 0 (default), no rows are shown. An error is raised if a positive number is provided and any of the primary key columns is also in hidden_columns.

  • show_sample_primary_key_per_change – Whether to show a sample primary key per column change in the summary. If False (default), no primary key values are shown. A sample primary key can only be shown if top_k_column_changes is greater than 0, as each sample primary key is linked to a specific column change. An error is raised if this is True and any of the primary key columns is also in hidden_columns.

  • left_name – Custom display name for the left data frame.

  • right_name – Custom display name for the right data frame.

  • slim – Whether to generate a slim summary. In slim mode, the summary is as concise as possible, only showing sections that contain differences. As the structure of the summary can vary, it should only be used by advanced users who are familiar with the summary format.

  • hidden_columns – Columns for which no values are printed, e.g. because they contain sensitive information.

Returns:

A summary which can be printed or written to a file.
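
The interaction between hidden_columns and the two options that print primary key values can be sketched as follows. This is a minimal illustration of the documented rules only, not the library's implementation: the helper name check_summary_options, its primary_key argument, and the use of ValueError are assumptions made for the example, since the page does not state which exception type is raised.

```python
from __future__ import annotations


def check_summary_options(
    primary_key: list[str],
    hidden_columns: list[str] | None = None,
    sample_k_rows_only: int = 0,
    show_sample_primary_key_per_change: bool = False,
) -> None:
    """Hypothetical helper mirroring the documented hidden_columns rules.

    Raises ValueError (an assumed exception type) when an option would
    print primary key values although a primary key column is hidden.
    """
    hidden = set(hidden_columns or [])
    hidden_keys = [col for col in primary_key if col in hidden]
    if not hidden_keys:
        # No primary key column is hidden, so nothing can conflict.
        return
    if sample_k_rows_only > 0:
        raise ValueError(
            f"sample_k_rows_only > 0 would print hidden primary key columns: {hidden_keys}"
        )
    if show_sample_primary_key_per_change:
        raise ValueError(
            f"show_sample_primary_key_per_change would print hidden primary key columns: {hidden_keys}"
        )


# Hiding a non-key column is compatible with both options.
check_summary_options(["id"], hidden_columns=["email"], sample_k_rows_only=5)

# Hiding a primary key column is fine as long as no key values are printed.
check_summary_options(["id"], hidden_columns=["id"])
```

Under these assumptions, combining a hidden primary key column with either sample_k_rows_only > 0 or show_sample_primary_key_per_change=True is rejected before any values are rendered.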