fix: ensure aggregate checks run on original input DataFrame #35

flitzpiepe93 · 2025-05-19T17:14:01Z

📍 Context
Previously, aggregate checks such as the schema check failed unexpectedly: row-level validations added internal columns (_dq_errors, _dq_passed) to the DataFrame before schema validation ran, so the validated frame no longer matched the expected schema.

🛠 What’s fixed
This MR ensures that all aggregate checks are executed on the original, unmodified input DataFrame. This prevents false failures caused by internal metadata columns.
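
The ordering issue can be sketched in plain Python. This is an illustration only, not the library's implementation: rows are modelled as dicts instead of a real DataFrame, and `run_row_checks`, `schema_check`, `validate_buggy`, and `validate_fixed` are hypothetical names. Only the column names `_dq_errors` and `_dq_passed` come from this PR.

```python
from typing import Any

Row = dict[str, Any]
Frame = list[Row]  # stand-in for a DataFrame: one dict per row


def run_row_checks(frame: Frame, not_null: list[str]) -> Frame:
    """Return a copy of the frame with the internal result columns appended."""
    annotated = []
    for row in frame:
        errors = [f"{col} is null" for col in not_null if row.get(col) is None]
        annotated.append({**row, "_dq_errors": errors, "_dq_passed": not errors})
    return annotated


def schema_check(frame: Frame, expected: set[str]) -> bool:
    """Aggregate check: every row must expose exactly the expected columns."""
    return all(set(row) == expected for row in frame)


def validate_buggy(frame: Frame, expected: set[str], not_null: list[str]):
    """Assumed pre-fix ordering: the schema check sees the internal columns."""
    annotated = run_row_checks(frame, not_null)
    return annotated, schema_check(annotated, expected)


def validate_fixed(frame: Frame, expected: set[str], not_null: list[str]):
    """Post-fix ordering: the schema check runs on the original input."""
    schema_ok = schema_check(frame, expected)
    annotated = run_row_checks(frame, not_null)
    return annotated, schema_ok


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
expected_columns = {"id", "name"}

_, buggy_ok = validate_buggy(rows, expected_columns, ["name"])
annotated_rows, fixed_ok = validate_fixed(rows, expected_columns, ["name"])
```

With the pre-fix ordering the schema check reports a failure even though the input is valid, because `_dq_errors` and `_dq_passed` show up as unexpected columns. With the fixed ordering the same input passes, and the row-level results are still attached to the output.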

🧪 Test coverage
Includes an end-to-end test verifying that a valid schema passes when combined with other checks like NullCheck.
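
The shape of such a test might look roughly like the following. It is a sketch under stated assumptions, not the test added in this PR: `validate` is a hypothetical stand-in for the engine that works on a list of dicts, and the not-null rule stands in for NullCheck.

```python
def validate(rows, expected_columns, not_null_columns):
    """Hypothetical engine stand-in: returns (annotated_rows, schema_ok)."""
    # The aggregate check runs first, against the untouched input rows.
    schema_ok = all(set(row) == set(expected_columns) for row in rows)

    # Row-level checks then append the internal result columns to a copy.
    annotated = []
    for row in rows:
        errors = [col for col in not_null_columns if row[col] is None]
        annotated.append({**row, "_dq_errors": errors, "_dq_passed": not errors})
    return annotated, schema_ok


def test_valid_schema_passes_with_null_check():
    rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]

    annotated, schema_ok = validate(rows, ["id", "name"], ["name"])

    # The schema is valid, so the aggregate check must pass even though
    # a row-level check ran in the same validation.
    assert schema_ok
    # The row-level check still flags the row with the null value.
    assert [row["_dq_passed"] for row in annotated] == [True, False]
    # The caller's input is left unmodified.
    assert "_dq_errors" not in rows[0]


test_valid_schema_passes_with_null_check()
```

The key assertion is the first one: a valid schema stays valid regardless of which row-level checks are configured alongside it.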

🙏 Thanks
Thanks to @tongqqiu for reporting this! Your feedback helped fix this issue quickly.

codecov · 2025-05-19T17:17:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

fix: ensure aggregate checks run on original input DataFrame

d5563d9

flitzpiepe93 self-assigned this May 19, 2025

flitzpiepe93 added the bug Something isn't working label May 19, 2025

flitzpiepe93 merged commit 4dbd7a5 into main May 19, 2025
8 checks passed

flitzpiepe93 deleted the fix/input-of-aggregate-check branch May 19, 2025 17:17
