Clean up download filenames #2320

Merged
nikhilb4a merged 2 commits into main from nikhil/2293-clean-up-filenames
May 26, 2026

@nikhilb4a
Contributor

Closes #2293

  1. The initial ticket was to change the bundle hash file's suffix from .sha256sum to .txt so that the downloaded file can be opened easily by the OS.
  2. The filenames were also hard to parse because the UTC ISO format appends +00:00. I replaced that offset with Z to indicate UTC. I noticed that "Add UTC timezone to all timestamps" (#903) changed election_timestamp_name to use the UTC ISO format, but I don't think we need that here: the value is used only as a filename and is not displayed on the frontend, which is what that PR was protecting against. Feel free to correct me if I'm missing something, or to suggest a different filename format.
  3. On my OS (macOS), Finder also displays colons as /, which made the filenames even harder to parse, so I removed the remaining colon.
  4. I removed .zip from the hash filename because it made the name harder to read and didn't seem to add much.
  5. The outer bundle's download filename also differed between local dev and staging/prod. Local dev used file.name (set by the route handler and not used in staging/prod), while staging/prod rely on the filename in file.storage_path (set by the task), as parsed by the browser. The two used slightly different formats, so the values differed. I aligned them to share the same filename, now set in both places by the worker. I think this is fine and leaves things more consistent.
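
As a rough illustration of items 2 and 3, the timestamp change can be sketched as below. This is a minimal sketch, not the actual code in server/util/csv_download.py: the helper name `filename_timestamp` is hypothetical, and it assumes the input is a timezone-aware datetime.

```python
from datetime import datetime, timezone


def filename_timestamp(dt: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHHMMZ (UTC, no colons, no seconds).

    Hypothetical helper for illustration; assumes `dt` is timezone-aware.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H%MZ")


created_at = datetime(2026, 5, 26, 18, 59, 45, 594897, tzinfo=timezone.utc)

# Old style: ISO 8601 with an explicit offset, which contains ":" and "+".
old_style = created_at.isoformat(timespec="minutes")  # "2026-05-26T18:59+00:00"

# New style: compact and filesystem-friendly, with "Z" marking UTC.
new_style = filename_timestamp(created_at)  # "2026-05-26T1859Z"
```

Converting to UTC before formatting keeps the trailing Z truthful even if a caller passes a datetime in another timezone.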

Example local file downloaded on mac (before):

Outer zip - candidate_totals_2026-05-26T18_59_45.594897+00_00.zip
Inner zip - Georgia-2-2026-05-26T18/59+00/00-candidate-totals.zip
Hash - Georgia-2-2026-05-26T18/59+00/00-candidate-totals.zip.sha256sum

Example local file downloaded on mac (after):

Outer zip - Georgia-test-2026-05-26T2058Z-candidate-totals_bundle.zip
Inner zip - Georgia-test-2026-05-26T2058Z-candidate-totals.zip
Hash - Georgia-test-2026-05-26T2058Z-candidate-totals-sha256-hash.txt

Example staging file downloaded on mac (before):

Outer zip - Georgia-test-2026-05-26T20_12+00_00-manifests_bundle.zip
Inner zip - Georgia-test-2026-05-26T20/12+00/00-manifests.zip
Hash - Georgia-test-2026-05-26T20/12+00/00-manifests.zip.sha256sum

Expected staging file downloaded on mac (after):

Outer zip - Georgia-test-2026-05-26T2058Z-manifests_bundle.zip
Inner zip - Georgia-test-2026-05-26T2058Z-manifests.zip
Hash - Georgia-test-2026-05-26T2058Z-manifests-sha256-hash.txt
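
The "after" names above all share one base name and differ only in their suffix. A minimal sketch of that scheme follows; the `BundleFilenames` type and `bundle_filenames` function are hypothetical names used for illustration, not identifiers from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleFilenames:
    """The three related filenames for one downloadable bundle."""

    outer_zip: str
    inner_zip: str
    hash_file: str


def bundle_filenames(base_name: str) -> BundleFilenames:
    """Derive all three filenames from a single shared base name.

    Hypothetical helper; the suffixes mirror the "after" examples above.
    """
    return BundleFilenames(
        outer_zip=f"{base_name}_bundle.zip",
        inner_zip=f"{base_name}.zip",
        # No ".zip" in the hash filename, and a ".txt" suffix so the OS
        # can open it directly.
        hash_file=f"{base_name}-sha256-hash.txt",
    )


names = bundle_filenames("Georgia-test-2026-05-26T2058Z-candidate-totals")
```

Deriving every name from one base string is also what keeps local dev and staging/prod in agreement, since no second code path builds its own variant.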

Open to feedback for sure

nikhilb4a and others added 2 commits May 26, 2026 21:11
Switch election/jurisdiction timestamp names to a filesystem-safe
YYYY-MM-DDTHHMMZ format (no colons, no microseconds) so downloaded
filenames are readable across all OSes.

For the audit inputs bundle, drop the route handler's separate isoformat
placeholder name and let the background task write both bundle.file.name
and storage_path from the same outer_filename, so local-dev downloads
match production. Also drop the redundant .zip in the hash filename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add FILENAME_DATETIME_REGEX and scrub_filename_datetime for the new
YYYY-MM-DDTHHMMZ format used in Content-Disposition headers, and switch
the six filename assertions in test_reports/test_ballots/test_batches
to use it. The original scrub_datetime is unchanged so report-body
timestamps (still full isoformat) continue to be scrubbed correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
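
A scrubber of the kind described in the second commit might look roughly like this. The names FILENAME_DATETIME_REGEX and scrub_filename_datetime come from the commit message, but the pattern, the "DATETIME" placeholder, and the example header are assumptions; the real code in server/tests/helpers.py may differ.

```python
import re

# Assumed pattern for the YYYY-MM-DDTHHMMZ filename format.
FILENAME_DATETIME_REGEX = re.compile(r"\d{4}-\d{2}-\d{2}T\d{4}Z")


def scrub_filename_datetime(text: str) -> str:
    """Replace filename-style timestamps with a fixed placeholder.

    The "DATETIME" placeholder is an assumption made for this sketch.
    """
    return FILENAME_DATETIME_REGEX.sub("DATETIME", text)


# Hypothetical Content-Disposition header value, for illustration only.
header = 'attachment; filename="audit-report-Test-Audit-2026-05-26T2058Z.csv"'
scrubbed = scrub_filename_datetime(header)

# A full isoformat timestamp does not match this pattern, so a separate
# scrubber is still needed for timestamps in report bodies.
iso_text = "Created at 2026-05-26T20:58:00+00:00"
iso_unchanged = scrub_filename_datetime(iso_text)
```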
Contributor

@jonahkagan jonahkagan left a comment

I don't really have context on this feature - I think @arsalansufi built it. But seems fine to me

Contributor

Copilot AI left a comment

Pull request overview

This PR standardizes and simplifies download filenames (CSV reports and batch file bundles), addressing OS-unfriendly characters (e.g., : and +00:00) and making SHA256 hash downloads more discoverable by switching to a .txt file.

Changes:

  • Update timestamp-in-filename formatting to YYYY-MM-DDTHHMMZ for UTC, avoiding +00:00 and colons.
  • Rename generated batch bundle hash file to a clearer *-sha256-hash.txt naming scheme.
  • Align local-dev and staging/prod outer bundle download filenames by setting File.name from the background worker; update tests to scrub the new filename datetime format.
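
For context on the hash rename, here is a minimal sketch of writing a SHA-256 digest next to a zip file under the new naming scheme. The function name `write_sha256_sidecar` is hypothetical, and the file contents (a bare hex digest plus a newline) are an assumption; the actual code in server/api/batch_files.py may format the file differently.

```python
import hashlib
import tempfile
from pathlib import Path


def write_sha256_sidecar(zip_path: Path) -> Path:
    """Write the SHA-256 hex digest of `zip_path` to a .txt file beside it.

    Hypothetical helper; the content format is an assumption.
    """
    digest = hashlib.sha256(zip_path.read_bytes()).hexdigest()
    # Path.stem drops the trailing ".zip", so the hash filename omits it.
    sidecar = zip_path.with_name(f"{zip_path.stem}-sha256-hash.txt")
    sidecar.write_text(digest + "\n")
    return sidecar


with tempfile.TemporaryDirectory() as tmp:
    zip_path = Path(tmp) / "Georgia-test-2026-05-26T2058Z-manifests.zip"
    zip_path.write_bytes(b"example bytes, not a real zip")
    sidecar = write_sha256_sidecar(zip_path)
    sidecar_name = sidecar.name
    sidecar_contents = sidecar.read_text()
```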

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| server/util/csv_download.py | Changes timestamp formatting used in downloadable CSV filenames to a filesystem-friendly UTC format. |
| server/api/batch_files.py | Updates bundle File.name handling and renames the hash sidecar file to a .txt filename. |
| server/tests/helpers.py | Adds a scrubber for the new filename datetime format used in Content-Disposition assertions. |
| server/tests/batch_comparison/test_batches.py | Updates Content-Disposition assertions to use the new scrubber. |
| server/tests/api/test_reports.py | Updates report download filename assertions for the new timestamp format. |
| server/tests/api/test_ballots.py | Updates ballot retrieval list download filename assertions for the new timestamp format. |
| .basedpyright/baseline.json | Updates the basedpyright baseline to reflect the changed typing diagnostics. |

Comment thread server/api/batch_files.py
@arsalansufi
Contributor

@eventualbuddha built this! But overall, many possible options here. And these updates are in line with what I was imagining. Just something to make it easier for Georgia to open and view the hash

@nikhilb4a
Contributor Author

nikhilb4a commented May 26, 2026

The change from the initial ticket, making the hash file easier to open, was straightforward, yeah. Dropping .zip was a small addition just for readability. Beyond that, I thought it would be good to update the filenames created by election_timestamp_name, which is also used for the audit report and the discrepancy report. @jonahkagan had seen your PR that set that value (but back in 2020).

@nikhilb4a nikhilb4a merged commit 8396aa9 into main May 26, 2026
6 checks passed
@nikhilb4a nikhilb4a deleted the nikhil/2293-clean-up-filenames branch May 26, 2026 21:59

Development

Successfully merging this pull request may close these issues.

File downloads: Put the hash for manifests and ctbb in a dedicated .txt file for visibility

4 participants