DataFrameComparison.joined_unequal

DataFrameComparison.joined_unequal(
*subset: str,
select: Literal['all', 'subset'] | list[str] = 'all',
lazy: Literal[True],
) -> LazyFrame
DataFrameComparison.joined_unequal(
*subset: str,
select: Literal['all', 'subset'] | list[str] = 'all',
lazy: Literal[False] = False,
) -> DataFrame

Return the rows of both data frames that can be joined and have at least one mismatching value in any of the columns in subset.

Parameters:
  • subset – The columns to check for mismatches. If not provided, all common columns are used. Must only contain common columns.

  • select – Which columns should be selected in the result. “all” (default) selects all columns. “subset” selects only the primary key and the columns from subset in the compared data frames. Providing a list of strings behaves the same as “subset” but additionally selects the columns in the list from the compared data frames. The list must only contain common columns.

  • lazy – If True, return a lazy frame. Otherwise, return an eager frame (default).

Returns:

A data frame or lazy frame containing the rows that can be joined and have at least one mismatching value across the specified columns.

Raises:

ValueError – If any of the provided columns are not common columns.

Columns which are not used for joining have a suffix _left for the left data frame and a suffix _right for the right data frame.
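
Example:

The following is a minimal plain-Python sketch of the behaviour described above. It is not the library's implementation and does not use its API. The function name joined_unequal_sketch, the list-of-dicts row representation, and the explicit primary_key argument are all hypothetical and exist only for illustration. The sketch assumes that both inputs are non-empty, that the primary key is unique in each input, and that "common columns" means the non-key columns present on both sides. The column order of the result is also an assumption.

```python
from __future__ import annotations

from typing import Any

Row = dict[str, Any]


def joined_unequal_sketch(
    left: list[Row],
    right: list[Row],
    primary_key: list[str],
    *subset: str,
    select: str | list[str] = "all",
) -> list[Row]:
    """Illustrate the documented semantics with plain dicts (hypothetical helper)."""
    # Assumption: "common columns" are the non-key columns present on both sides.
    left_cols = [c for c in left[0] if c not in primary_key]
    right_cols = [c for c in right[0] if c not in primary_key]
    common = [c for c in left_cols if c in right_cols]

    # Both subset and a list passed as select may only name common columns.
    extras = select if isinstance(select, list) else []
    invalid = [c for c in [*subset, *extras] if c not in common]
    if invalid:
        raise ValueError(f"Not common columns: {invalid}")

    # Without an explicit subset, every common column is checked.
    check = list(subset) or common

    if select == "all":
        keep_left, keep_right = left_cols, right_cols
    else:
        # "subset" keeps the checked columns; a list adds its own columns.
        keep_left = keep_right = list(dict.fromkeys(check + extras))

    right_by_key = {tuple(r[k] for k in primary_key): r for r in right}
    result = []
    for l_row in left:
        r_row = right_by_key.get(tuple(l_row[k] for k in primary_key))
        if r_row is None:
            continue  # row cannot be joined
        if all(l_row[c] == r_row[c] for c in check):
            continue  # no mismatch in the checked columns
        out = {k: l_row[k] for k in primary_key}
        out.update({f"{c}_left": l_row[c] for c in keep_left})
        out.update({f"{c}_right": r_row[c] for c in keep_right})
        result.append(out)
    return result


left = [
    {"id": 1, "price": 10, "qty": 1},
    {"id": 2, "price": 20, "qty": 2},
    {"id": 3, "price": 30, "qty": 3},
]
right = [
    {"id": 1, "price": 10, "qty": 1},
    {"id": 2, "price": 25, "qty": 2},
    {"id": 4, "price": 40, "qty": 4},
]

# Only id=2 is present on both sides and differs (in "price").
print(joined_unequal_sketch(left, right, ["id"], "price", select="subset"))
```

With these inputs, restricting the check to "qty" yields no rows, because the only joinable rows (id 1 and id 2) agree on that column. On an actual comparison object, the corresponding call would follow the signature above, for example joined_unequal("price", select="subset").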