Skip to content

Add types for read_metadata()#1923

Merged
victorlin merged 2 commits intomasterfrom
victorlin/io-metadata-types
Feb 20, 2026
Merged

Add types for read_metadata()#1923
victorlin merged 2 commits intomasterfrom
victorlin/io-metadata-types

Conversation

@victorlin
Copy link
Copy Markdown
Member

@victorlin victorlin commented Oct 27, 2025

Description of proposed changes

For type checking.

I considered adding types for read_csv_with_index_col() but decided against it, as it relies on fragile imports from pandas._typing.¹

¹ https://github.com/pandas-dev/pandas-stubs/blob/64e978cbce28de05ff38105c169387283b6c95c2/pandas-stubs/io/parsers/readers.pyi#L39

Related issue(s)

Follow-up to #1917 (review)

Checklist

  • Automated checks pass
  • Check if you need to add a changelog message
  • Check if you need to add tests
  • Check if you need to update docs

@victorlin victorlin self-assigned this Oct 27, 2025
@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 27, 2025

Codecov Report

❌ Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.45%. Comparing base (57c36c4) to head (9312413).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
augur/io/metadata.py 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1923      +/-   ##
==========================================
- Coverage   74.46%   74.45%   -0.02%     
==========================================
  Files          82       82              
  Lines        9098     9102       +4     
  Branches     1853     1855       +2     
==========================================
+ Hits         6775     6777       +2     
  Misses       2020     2020              
- Partials      303      305       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@victorlin victorlin force-pushed the victorlin/io-metadata-types branch from eaa9c3e to 682af0f Compare February 18, 2026 21:50
@victorlin victorlin force-pushed the victorlin/io-metadata-types branch 2 times, most recently from c0f7b24 to cfbbf99 Compare February 18, 2026 23:09
@victorlin victorlin changed the title Add types for read_metadata() and read_csv_with_index_col() Add types for read_metadata() Feb 18, 2026
@victorlin victorlin mentioned this pull request Feb 18, 2026
4 tasks
Comment thread augur/io/metadata.py
dtype: dict[str, Any] | str | None = ...,
) -> pd.DataFrame: ...

@overload
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost of overloaded signatures allows us to infer the return type based on the chunk_size type, yeah? I'm somewhat ambivalent about this as I find overload types a little confusing, but I presume this has been useful to you when writing scripts etc?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I added this to address the pyright error in #1961:

metadata = read_metadata(

dates, metadata[METADATA_DATE_COLUMN],

error: "__getitem__" method not defined on type "Generator[Unknown, None, None]" (reportIndexIssue)

Comment thread augur/io/metadata.py
import csv
import os
from typing import Iterable, Sequence
from typing import Any, Iterable, Iterator, Sequence, overload
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered adding types for read_csv_with_index_col() but decided against it, as it relies on fragile imports from pandas._typing.¹

I don't really know the state of python typing, how does this tend to work for non-core libraries? The TS community has been really proactive about publishing separate type libraries for most common JS-only packages, although ensuring the versions are in-sync has been a little painful. Pandas clearly has some types (pd.DataFrame), do they hide FilePath, ReadCsvBuffer etc away because the types aren't stable?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas-stubs is basically the same thing (separate type library) and it's official, so sync is not much of an issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why pandas._typing is hidden away, but it's explicitly noted in docs.

pandas.api.typing should be stable. v3 includes FilePath and ReadCsvBuffer, but we still target compatibility with v1.

This is cleaner and prevents false-positive 'call-overload' Mypy errors.
For type checking.

I considered adding types for read_csv_with_index_col() but decided
against it, as it relies on fragile imports from pandas._typing.¹

¹ https://github.com/pandas-dev/pandas-stubs/blob/64e978cbce28de05ff38105c169387283b6c95c2/pandas-stubs/io/parsers/readers.pyi#L39
@victorlin victorlin force-pushed the victorlin/io-metadata-types branch from cfbbf99 to 9312413 Compare February 18, 2026 23:54
@victorlin victorlin marked this pull request as ready for review February 19, 2026 19:07
Base automatically changed from victorlin/types to master February 20, 2026 02:00
@victorlin victorlin merged commit a974866 into master Feb 20, 2026
32 checks passed
@victorlin victorlin deleted the victorlin/io-metadata-types branch February 20, 2026 02:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants