datadiff 0.4.7 - faithful coverage, lazy pointblank report, wide-table performance #3
Merged
Pull request overview
This PR updates datadiff to 0.4.7 with a focus on making the lazy Pointblank HTML report preserve real failure extracts (including downloadable CSV extracts) while still listing all checks via the coverage table, and adds regression tests + release metadata updates.
Changes:
- Enhance report generation to optionally build the report agent “on top of” a real interrogated Pointblank agent so failing columns keep their real extracts.
- Add tests asserting extract carry-over and that exported HTML contains failing values + CSV download affordance.
- Bump package version to 0.4.7 and update release notes / CRAN-related DESCRIPTION quoting.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| R/report.R | Adds real-agent merge path for report building (extract remapping) and threads this into lazy render + HTML export. |
| tests/testthat/test-report.R | Adds regression tests for extract preservation and HTML export content on failures. |
| NEWS.md | Documents 0.4.7 changes (lazy report fidelity, thresholds, docs notes). |
| DESCRIPTION | Bumps version and adjusts quoting for CRAN conventions. |
Comment on lines 38 to +41

    n <- nrow(coverage)

    if (!is.null(real_agent) && nrow(real_agent$validation_set) > 0L) {
      rvs <- real_agent$validation_set
Comment on lines +58 to +61

    for (i in seq_len(n)) {
      col <- coverage$column[i]
      j <- match(col, real_cols)
      if (!is.na(j)) {
Comment on lines +96 to +102

    new_vs <- dplyr::bind_rows(rows)
    new_vs$i <- seq_len(nrow(new_vs))
    new_vs$assertion_type <- ifelse(coverage$check == "col_exists",
                                    "col_exists", "col_vals_equal")
    real_agent$validation_set <- new_vs
    real_agent$extracts <- new_extracts
    return(real_agent)
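The excerpts above re-key the real agent's extracts so that they line up with the full list of checks. A minimal sketch of that remapping idea follows, using plain data frames and lists instead of real pointblank objects. The sample data, and the assumption that extracts are stored in a list keyed by the step index as a character string, are illustrative only and are not taken from `R/report.R`.

```r
# Hypothetical stand-ins: plain data frames and lists, not pointblank objects.
coverage <- data.frame(
  column = c("id", "amount", "label"),
  check  = c("col_exists", "col_vals_equal", "col_vals_equal"),
  stringsAsFactors = FALSE
)

# Assume the real agent only created a step for the failing column "amount"
# and stored its failing rows under the step index "1".
real_cols     <- c("amount")
real_extracts <- list("1" = data.frame(row = c(2L, 5L), amount = c(10.5, NA)))

# Position of each coverage column among the real steps
# (NA when the column had no real step, i.e. it passed).
j <- match(coverage$column, real_cols)

new_extracts <- list()
for (i in which(!is.na(j))) {
  old_key <- as.character(j[i])
  if (!is.null(real_extracts[[old_key]])) {
    # Re-key the extract so it lines up with row i of the full check list.
    new_extracts[[as.character(i)]] <- real_extracts[[old_key]]
  }
}

names(new_extracts)
```

In this sketch the extract moves from key `"1"` (its position among the real steps) to key `"2"` (the position of `amount` in the full coverage list), which is what lets a report that lists every check still show the real failing rows.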
…0.4.7 - R/report.R: the lazy HTML report starts from the real interrogated agent and keeps, for each failing column, the number of failing rows, the cells, and the CSV extract, while listing all checks (build_report_agent remaps agent$extracts; warn_at/stop_at threaded through). - DESCRIPTION: Version 0.4.7; quoting of 'pointblank'/'YAML' (CRAN convention). - NEWS.md: 0.4.7 section. - tests/testthat/test-report.R: tests for merging real extracts + HTML.
datadiff 0.4.7
Makes comparisons much faster on wide tables, while keeping full visibility
into what is tested and the detail of the errors.
Performance
- compare_datasets_from_yaml() computes the verdict in a single vectorized pass over boolean columns. When everything passes, the per-column pointblank agent (one step per column, roughly quadratic) is short-circuited. On failure, only the columns actually at fault receive a step.
- all_passed and the failing cells are strictly unchanged.
- add_tolerance_columns() rewritten to avoid quadratic growth of the data.frame (a single bind).
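As a rough illustration of a single vectorized pass over boolean columns, the base R sketch below derives a verdict and the list of failing columns. The data frame `ok` and its column names are hypothetical and are not datadiff's internal representation.

```r
# Hypothetical input: one logical column per compared column,
# TRUE where the two datasets agree on that row.
ok <- data.frame(
  id_ok     = c(TRUE, TRUE, TRUE, TRUE),
  amount_ok = c(TRUE, FALSE, TRUE, FALSE),
  label_ok  = c(TRUE, TRUE, TRUE, TRUE)
)

# One vectorized pass: count failures per column.
n_failed <- colSums(!as.matrix(ok))

# Overall verdict and the columns that would still need a detailed step.
all_passed   <- all(n_failed == 0)
failing_cols <- names(n_failed)[n_failed > 0]
```

Here only `amount_ok` ends up in `failing_cols`, so a detailed per-column step would be needed for that column alone. When `all_passed` is `TRUE`, no per-column step is needed at all.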
Test visibility
- result$coverage: one row per check performed (column, type, n rows, n_failed, status), always produced, even on success.
- result$summary: aggregate counters. summary$all_passed always matches all_passed.
HTML report
- result$reponse remains a real pointblank agent (all_passed() / get_data_extracts() work). Printing it renders, on demand, a full pointblank report listing all checks.
- Failing columns keep their failing rows, affected cells, and a downloadable CSV extract.
- datadiff_report_html(res, file = ) to export the report.
Documentation / compliance
- CRAN compliance (R CMD check --as-cran: 0/0/0; win-builder checked).
- @noRd.
- Test suite: 695 passing (including 40 equivalence guards for verdict + failing cells).
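
The shape of the `result$coverage` table described above can be sketched in base R as follows. The column names (`column`, `type`, `n_rows`, `n_failed`, `status`), the status labels, and the `summary_counts` list are assumptions chosen to mirror the description, not the package's actual schema.

```r
# Hypothetical per-row comparison results, one logical column per check.
ok <- data.frame(
  id_ok     = c(TRUE, TRUE, TRUE, TRUE),
  amount_ok = c(TRUE, FALSE, TRUE, FALSE),
  label_ok  = c(TRUE, TRUE, TRUE, TRUE)
)

n_failed <- colSums(!as.matrix(ok))

# One row per check, produced whether or not anything failed.
coverage <- data.frame(
  column   = sub("_ok$", "", names(ok)),
  type     = "col_vals_equal",
  n_rows   = nrow(ok),
  n_failed = as.integer(n_failed),
  status   = unname(ifelse(n_failed == 0, "pass", "fail")),
  stringsAsFactors = FALSE
)

# Aggregate counters derived from the same table, so the summary
# cannot disagree with the per-check rows.
summary_counts <- list(
  n_checks        = nrow(coverage),
  n_failed_checks = sum(coverage$status == "fail"),
  all_passed      = all(coverage$status == "pass")
)
```

Deriving the counters from the coverage rows is one simple way to keep a summary-level `all_passed` consistent with the per-check results.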