datadiff 0.4.7 - faithful coverage, lazy pointblank report, wide-table performance #3

Merged
VincentGuyader merged 1 commit into main from release-0.4.7 on Jun 14, 2026

@VincentGuyader

Member

datadiff 0.4.7

Makes comparisons much faster on wide tables while keeping full visibility
into what is tested and the detail of any errors.

Performance

  • compare_datasets_from_yaml() computes the verdict in a single vectorized
    pass over boolean columns. When everything passes, the per-column
    pointblank agent (one step per column, roughly quadratic) is bypassed. On
    failure, only the columns actually at fault get a step. all_passed and
    the failing cells are strictly unchanged.
  • add_tolerance_columns() rewritten to avoid quadratic growth of the
    data.frame (a single bind).
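
The two ideas above can be sketched in base R. This is only an illustration under assumptions, not datadiff's internal code: it assumes the comparison has already produced one logical column per check with no NA values, and the names `ok_*`, `verdict_one_pass()`, `add_cols_one_bind()` and the `__tol` suffix are hypothetical.

```r
# Hypothetical input: one logical column per check, TRUE = the cell passes.
checks <- data.frame(
  ok_id    = c(TRUE, TRUE, TRUE),
  ok_price = c(TRUE, FALSE, TRUE),
  ok_qty   = c(TRUE, TRUE, TRUE)
)

# One vectorized pass: count failures per column, then derive the verdict.
# Only the columns listed in `failing` would need a detailed validation step.
verdict_one_pass <- function(checks) {
  n_failed <- colSums(!as.matrix(checks))
  list(
    all_passed = all(n_failed == 0),
    failing    = names(n_failed)[n_failed > 0]
  )
}

v <- verdict_one_pass(checks)

# Build every new column first, then attach them with a single cbind(),
# instead of growing the data.frame one column at a time inside a loop.
add_cols_one_bind <- function(df, cols, tol) {
  if (length(cols) == 0L) return(df)
  new_cols <- lapply(cols, function(col) rep(tol, nrow(df)))
  names(new_cols) <- paste0(cols, "__tol")  # hypothetical naming scheme
  cbind(df, as.data.frame(new_cols))
}

out <- add_cols_one_bind(data.frame(price = c(1, 2)), "price", 0.01)
```

Here `v$all_passed` is `FALSE` and `v$failing` is `"ok_price"`, so only that column would need a per-column step. The single `cbind()` copies the data.frame once, rather than once per added column.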

Test visibility

  • New result$coverage: one row per check performed (column, type, number of
    rows, n_failed, status), always produced, even on success.
  • New result$summary: aggregated counters. summary$all_passed always
    matches all_passed.
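
As a rough illustration of what a coverage table and its summary could look like, here is a base R sketch. The column names, the status labels and the helpers `build_coverage()` and `build_summary()` are assumptions made for the example. They are not the package's actual schema.

```r
# Hypothetical input: one logical column per check, TRUE = the cell passes.
checks <- data.frame(
  id    = c(TRUE, TRUE, TRUE),
  price = c(TRUE, FALSE, FALSE)
)

# One row per check performed, produced whether or not the check passes.
build_coverage <- function(checks, type = "col_vals_equal") {
  n_failed <- unname(colSums(!as.matrix(checks)))
  data.frame(
    column   = names(checks),
    type     = type,
    n_rows   = nrow(checks),
    n_failed = n_failed,
    status   = ifelse(n_failed == 0, "pass", "fail"),
    stringsAsFactors = FALSE
  )
}

# Aggregated counters; all_passed is derived from the same rows as the
# coverage table, so the two cannot disagree.
build_summary <- function(coverage) {
  list(
    n_checks   = nrow(coverage),
    n_passed   = sum(coverage$status == "pass"),
    n_failed   = sum(coverage$status == "fail"),
    all_passed = all(coverage$status == "pass")
  )
}

coverage <- build_coverage(checks)
summary_counts <- build_summary(coverage)
```

Deriving the summary from the coverage rows, rather than computing it separately, is one simple way to keep `all_passed` consistent between the two.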

HTML report

  • result$reponse remains a real pointblank agent (all_passed() /
    get_data_extracts() work). Printing it renders, on demand, a full
    pointblank report listing all checks.
  • For a failing column, the report keeps the real detail: number of failing
    rows, affected cells, and an extract downloadable as CSV.
  • New datadiff_report_html(res, file = ) to export the report.

Documentation / compliance

  • Vignette and README updated (coverage, summary, report), normalized to ASCII.
  • DESCRIPTION: software names in single quotes ('pointblank', 'YAML'),
    CRAN compliance (R CMD check --as-cran: 0 errors, 0 warnings, 0 notes;
    checked on win-builder).
  • Internal helpers marked @noRd.

Test suite: 695 passing (including 40 equivalence guards on the verdict and the failing cells).

Copilot AI left a comment

Pull request overview

This PR updates datadiff to 0.4.7 with a focus on making the lazy Pointblank HTML report preserve real failure extracts (including downloadable CSV extracts) while still listing all checks via the coverage table, and adds regression tests + release metadata updates.

Changes:

  • Enhance report generation to optionally build the report agent “on top of” a real interrogated Pointblank agent so failing columns keep their real extracts.
  • Add tests asserting extract carry-over and that exported HTML contains failing values + CSV download affordance.
  • Bump package version to 0.4.7 and update release notes / CRAN-related DESCRIPTION quoting.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| R/report.R | Adds real-agent merge path for report building (extract remapping) and threads this into lazy render + HTML export. |
| tests/testthat/test-report.R | Adds regression tests for extract preservation and HTML export content on failures. |
| NEWS.md | Documents 0.4.7 changes (lazy report fidelity, thresholds, docs notes). |
| DESCRIPTION | Bumps version and adjusts quoting for CRAN conventions. |

Comment thread R/report.R
Comment on lines 38 to +41
n <- nrow(coverage)

if (!is.null(real_agent) && nrow(real_agent$validation_set) > 0L) {
  rvs <- real_agent$validation_set

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread R/report.R
Comment on lines +58 to +61
for (i in seq_len(n)) {
  col <- coverage$column[i]
  j <- match(col, real_cols)
  if (!is.na(j)) {

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread R/report.R
Comment on lines +96 to +102
new_vs <- dplyr::bind_rows(rows)
new_vs$i <- seq_len(nrow(new_vs))
new_vs$assertion_type <- ifelse(coverage$check == "col_exists",
                                "col_exists", "col_vals_equal")
real_agent$validation_set <- new_vs
real_agent$extracts <- new_extracts
return(real_agent)
…0.4.7

- R/report.R: the lazy HTML report starts from the real interrogated agent and
  keeps, for each failing column, the number of failing rows, the cells and
  the CSV extract, while listing all checks (build_report_agent
  remaps agent$extracts; warn_at/stop_at threaded through).
- DESCRIPTION: Version 0.4.7; quoting of 'pointblank'/'YAML' (CRAN convention).
- NEWS.md: 0.4.7 section.
- tests/testthat/test-report.R: tests for merging real extracts + HTML.
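
The extract remapping mentioned in the commit message can be illustrated with a small base R sketch. It is written under assumptions and is not the code in R/report.R: it assumes extracts are held in a list keyed by step index as a character string, and `remap_extracts()` and the sample data are hypothetical. The point is only the re-keying. The real agent has steps for failing columns only, while the report lists every check, so each extract has to move from its position in the real agent to the position of the same column in the coverage table.

```r
# Hypothetical stand-ins for the real objects.
coverage <- data.frame(
  column = c("id", "price", "qty"),
  stringsAsFactors = FALSE
)
real_cols <- c("price")  # columns that received a step in the real agent
real_extracts <- list(
  "1" = data.frame(row = c(2L, 3L), price = c(9.5, 7))  # failing rows, step 1
)

# Move each extract from its step index in the real agent (j) to the index
# of the same column in the full coverage listing (i).
remap_extracts <- function(coverage, real_cols, real_extracts) {
  new_extracts <- list()
  for (i in seq_len(nrow(coverage))) {
    j <- match(coverage$column[i], real_cols)  # NA if the column had no real step
    if (!is.na(j)) {
      ex <- real_extracts[[as.character(j)]]
      if (!is.null(ex)) new_extracts[[as.character(i)]] <- ex
    }
  }
  new_extracts
}

new_extracts <- remap_extracts(coverage, real_cols, real_extracts)
```

In this example the extract stored under `"1"` ends up under `"2"`, the position of `price` in the coverage table. Columns without a real step (`id`, `qty`) get no extract.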

VincentGuyader merged commit 82a5fd3 into main on Jun 14, 2026
5 checks passed