[FEATURE]: Option to skip checks quietly and improve logging of skipped checks #953

@hski-bayer

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

The ability to skip checks, introduced with #608, instead of stopping check execution when a check is invalid because of missing columns, is great. I love this idea. It simplifies our check process a lot, because it removes the need to pre-select which checks are applicable to a data frame.

But if a check is skipped, all rows of the data frame are added to the invalid data frame and a log entry is added to either the _errors or the _warnings column. This is not useful behaviour.

A skipped check should not be treated as invalidating a row. Rows should not be added to the invalid data frame just because of skipped checks.

Also, these log entries are not easy to identify in the array in the _errors or _warnings column. You have to parse the message attribute and test whether the text starts with "Check evaluation skipped due to". There should be a structured way to identify these log entries.
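To illustrate the workaround described above, here is a minimal sketch of the text matching it requires. It models the entries of one row's _errors array as plain Python dicts rather than Spark structs, and the helper names and the second sample entry are made up for illustration; only the message prefix comes from this issue.

```python
from typing import List, Optional

# Prefix quoted in this issue; matching on it is the only way to spot skipped checks today.
SKIP_PREFIX = "Check evaluation skipped due to"


def is_skipped_entry(entry: dict) -> bool:
    """Return True if the entry reports a skipped check (decided by text match only)."""
    return (entry.get("message") or "").startswith(SKIP_PREFIX)


def real_issues(entries: Optional[List[dict]]) -> List[dict]:
    """Drop skipped-check entries and keep only genuine check results."""
    return [e for e in (entries or []) if not is_skipped_entry(e)]


# Sample data: the first entry mirrors this issue, the second is a hypothetical real failure.
row_errors = [
    {
        "name": "dmo_ae_faae_fraction",
        "message": "Check evaluation skipped due to invalid check filter: 'AECATTT = 'FRACTION''",
    },
    {"name": "subject_not_null", "message": "Column 'SUBJECTNAME' value is null"},
]

remaining = real_issues(row_errors)
```

This works only as long as the message wording never changes, which is why a structured marker would be preferable.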

Proposed Solution

Prio 1: Provide a configuration option to skip quietly: no entry is added to the _errors or _warnings column for skipped checks, and only rows with true issues are included in the invalid data frame, e.g.

valid_df, invalid_df = dq_engine.apply_checks_by_metadata_and_split(test_df, checks, ref_dfs=ref_dfs, skip_quietly=True)

Prio 2 (in addition to or instead of Prio 1): Introduce a new output data frame that contains just the list of skipped checks, similar to the current log messages on the command line:
08:49:37 WARN [d.l.dqx.manager] Skipping check 'dmo_ae_faae_fraction' due to invalid check filter: 'AECATTT = 'FRACTION''

valid_df, invalid_df, skipped_checks_df = dq_engine.apply_checks_by_metadata_and_split(test_df, checks, ref_dfs=ref_dfs)
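As a rough idea of what such a skipped-checks output could contain, the sketch below collapses per-row log entries into one record per skipped check. It is not DQX code: entries are modeled as plain dicts, `summarize_skipped` and the record layout (`name`, `reason`, `filter`) are hypothetical, and detection still relies on the message prefix because no structured marker exists yet.

```python
from typing import Dict, Iterable, List, Optional

# Prefix quoted in this issue, with a trailing space so the reason can be cut out cleanly.
SKIP_PREFIX = "Check evaluation skipped due to "


def summarize_skipped(rows: Iterable[Optional[List[dict]]]) -> List[dict]:
    """Collapse the entries of all rows into one record per skipped check name."""
    summary: Dict[str, dict] = {}
    for entries in rows:
        for entry in entries or []:
            message = entry.get("message") or ""
            if not message.startswith(SKIP_PREFIX):
                continue  # a genuine check result, not a skip notice
            # Keep the first occurrence; a skipped check repeats identically on every row.
            summary.setdefault(
                entry["name"],
                {
                    "name": entry["name"],
                    "reason": message[len(SKIP_PREFIX):],
                    "filter": entry.get("filter"),
                },
            )
    return list(summary.values())


# The same skipped entry is attached to every row, as described in the problem statement.
skipped_entry = {
    "name": "dmo_ae_faae_fraction",
    "message": "Check evaluation skipped due to invalid check filter: 'AECATTT = 'FRACTION''",
    "filter": "AECATTT = 'FRACTION'",
}
skipped_checks = summarize_skipped([[skipped_entry], [skipped_entry], None])
```

One record per check, rather than one per row, matches the granularity of the command-line warning shown above.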

Prio 3: Add a new attribute skipped = true to the log entry in _errors or _warnings so that these entries can be identified clearly, without having to look for a message attribute starting with "Check evaluation skipped due to" (this should be done in any case).

{
  "name": "dmo_ae_faae_fraction",
  "skipped": true,
  "message": "Check evaluation skipped due to invalid check filter: 'AECATTT = 'FRACTION''",
  "columns": ["SUBJECTNAME", "AEGRPID", "AECAT"],
  "filter": "AECATTT = 'FRACTION'",
  "function": "foreign_key",
  "run_time": "2025-11-28T08:49:37.886Z",
  "run_id": "c033c831-9b1e-44c3-9562-063bc0dac94c",
  "user_metadata": {}
}  
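With a flag like the one in the entry above, consumers could separate skipped checks from real issues without any text parsing. The sketch below shows that idea on plain dicts; the `skipped` attribute is the one proposed here and does not exist in current output, and `split_entries` and `has_real_issues` are hypothetical helper names.

```python
from typing import List, Optional, Tuple


def split_entries(entries: Optional[List[dict]]) -> Tuple[List[dict], List[dict]]:
    """Partition entries into (real issues, skipped checks) using the proposed flag."""
    issues: List[dict] = []
    skipped: List[dict] = []
    for entry in entries or []:
        # Entries without the attribute are treated as real issues.
        (skipped if entry.get("skipped", False) else issues).append(entry)
    return issues, skipped


def has_real_issues(entries: Optional[List[dict]]) -> bool:
    """A row should count as problematic only if at least one entry is not a skip."""
    issues, _ = split_entries(entries)
    return bool(issues)


# An entry shaped like the proposed example, reduced to the fields used here.
only_skipped = [
    {
        "name": "dmo_ae_faae_fraction",
        "skipped": True,
        "message": "Check evaluation skipped due to invalid check filter: 'AECATTT = 'FRACTION''",
    }
]
row_flagged = has_real_issues(only_skipped)
```

A row whose only entries are skipped checks would then stay out of the invalid data frame, which is the behaviour requested under Prio 1.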

Additional Context

No response

Labels: enhancement (New feature or request)
