Proposal: Allow non-overlapping duplicate service_ids in calendar.txt #584

@DomeQdev

Describe the problem

Currently, modeling a single conceptual trip (e.g., the 8:15 AM train from City A to City B) that operates on a complex, fragmented calendar requires creating multiple unique service_ids. For each of these service_ids, a separate, nearly identical row must be created in trips.txt.

This leads to significant data redundancy and several issues:

  • The trips.txt file (and consequently stop_times.txt) becomes unnecessarily large, as one conceptual journey is split into many trip_ids.
  • The fact that these separate trip entries all represent the same recurring service is lost. This is especially problematic for services that have a consistent identity (like a named train line) but a very fragmented schedule.

The core issue is that calendar.txt requires each service_id to be unique, which effectively enforces a "one continuous date range per service_id" rule. That rule doesn't align with the operational reality of many transport services, especially in rail.

Use cases

Consider a train with a complex, non-contiguous schedule common in European rail systems:
"Runs: Aug 31-Sep 4 (daily); Sep 19-27 (on Mon, Wed, Fri, Sat); Oct 6-25 (on Mon, Wed, Fri, Sat, Sun)"

To model this today using calendar.txt alone, a producer must create 3 unique service_ids and 3 corresponding rows in trips.txt, even though it's the exact same train service.

Current trips.txt:

route_id service_id trip_id ...
route_123 service_A trip_001 ...
route_123 service_B trip_002 ...
route_123 service_C trip_003 ...

This is inefficient. The desired state is to define this entire complex schedule under a single service_id and have only one corresponding trip entry.

Proposed solution

Modify the GTFS specification to allow multiple entries for the same service_id in calendar.txt under one strict condition:

The [start_date, end_date] ranges for any given service_id MUST NOT overlap.
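
As an illustration of how a validator could enforce this condition, here is a minimal sketch. The function name `find_overlapping_ranges` and the tuple layout of the input rows are assumptions made for this example, not part of any existing GTFS tooling. Ranges are treated as inclusive on both ends, matching how start_date and end_date are defined in calendar.txt.

```python
from datetime import date


def find_overlapping_ranges(rows):
    """Return conflicts among rows of (service_id, start_date, end_date).

    Hypothetical helper: each conflict is (service_id, earlier_range, later_range).
    Ranges are inclusive, so two ranges sharing a single day count as overlapping.
    """
    by_service = {}
    for service_id, start, end in rows:
        by_service.setdefault(service_id, []).append((start, end))

    conflicts = []
    for service_id, ranges in by_service.items():
        ranges.sort()  # sort by start_date, then end_date
        # After sorting, any overlap implies an overlap between neighbours,
        # so comparing adjacent pairs is enough to detect a violation.
        for (s1, e1), (s2, e2) in zip(ranges, ranges[1:]):
            if s2 <= e1:
                conflicts.append((service_id, (s1, e1), (s2, e2)))
    return conflicts


valid_rows = [
    ("train_815_service", date(2023, 8, 31), date(2023, 9, 4)),
    ("train_815_service", date(2023, 9, 19), date(2023, 9, 27)),
    ("train_815_service", date(2023, 10, 6), date(2023, 10, 25)),
]
# A fourth range that starts on the last day of the first one.
invalid_rows = valid_rows + [
    ("train_815_service", date(2023, 9, 4), date(2023, 9, 10)),
]
```

Because only neighbouring ranges are compared, the sketch reports at least one conflict whenever any exist, but not necessarily every overlapping pair.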

When a consumer application parses the feed, it should treat all entries for a single service_id as a logical UNION (OR). A trip associated with that service_id is considered active if the date falls within any of its defined date ranges and matches the day-of-week flags for that specific range.

Example of the proposed calendar.txt:
Using the schedule from the use case, calendar.txt would look like this:

service_id         monday  tuesday  wednesday  thursday  friday  saturday  sunday  start_date  end_date
train_815_service  1       1        1          1         1       1         1       20230831    20230904
train_815_service  1       0        1          0         1       1         0       20230919    20230927
train_815_service  1       0        1          0         1       1         1       20231006    20231025
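
To make the consumer-side union logic concrete, here is a rough sketch of how the check could look. The helper names (`load_calendar`, `is_service_active`) are invented for this example, and the CSV text simply restates the rows above. It is one possible reading of the proposal, not a reference implementation.

```python
import csv
import io
from datetime import date

# Column order matches date.weekday(), where Monday is 0.
WEEKDAY_COLUMNS = ("monday", "tuesday", "wednesday", "thursday",
                   "friday", "saturday", "sunday")

PROPOSED_CALENDAR = """\
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
train_815_service,1,1,1,1,1,1,1,20230831,20230904
train_815_service,1,0,1,0,1,1,0,20230919,20230927
train_815_service,1,0,1,0,1,1,1,20231006,20231025
"""


def parse_gtfs_date(value):
    """Parse a GTFS YYYYMMDD string into a date."""
    return date(int(value[0:4]), int(value[4:6]), int(value[6:8]))


def load_calendar(text):
    """Read calendar.txt content into a list of dict rows (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


def is_service_active(rows, service_id, day):
    """True if ANY row for service_id covers `day` and has its weekday flag set."""
    weekday_column = WEEKDAY_COLUMNS[day.weekday()]
    for row in rows:
        if row["service_id"] != service_id:
            continue
        in_range = (parse_gtfs_date(row["start_date"]) <= day
                    <= parse_gtfs_date(row["end_date"]))
        # The weekday flags are evaluated per row, not merged across rows.
        if in_range and row[weekday_column] == "1":
            return True
    return False


rows = load_calendar(PROPOSED_CALENDAR)
```

With these rows, Wednesday 2023-09-20 is active, Thursday 2023-09-21 is not (the flag is 0 in that range), and 2023-09-10 is not (it falls in a gap between ranges). Exceptions from calendar_dates.txt would presumably still be applied on top of this result, as they are today.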

Consequence for trips.txt:
This would allow the trips.txt file to be simplified to a single, logical entry:

route_id service_id trip_id ...
route_123 train_815_service trip_815 ...

This solution directly addresses the problem of data duplication for producers while remaining relatively simple for consumers to implement, as the logic is a straightforward union of non-overlapping date ranges. It is far more concise than listing potentially hundreds of individual dates in calendar_dates.txt for services that run in fragmented multi-week blocks.
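
For comparison, this sketch expands the same schedule into the calendar_dates.txt form that a producer would need today to keep a single service_id. The function name `expand_to_calendar_dates` and the tuple layout are assumptions for illustration. exception_type 1 marks a date on which service is added.

```python
from datetime import date, timedelta

# (start_date, end_date, Monday..Sunday flags), restating the example schedule.
RANGES = [
    (date(2023, 8, 31), date(2023, 9, 4), (1, 1, 1, 1, 1, 1, 1)),
    (date(2023, 9, 19), date(2023, 9, 27), (1, 0, 1, 0, 1, 1, 0)),
    (date(2023, 10, 6), date(2023, 10, 25), (1, 0, 1, 0, 1, 1, 1)),
]


def expand_to_calendar_dates(service_id, ranges):
    """Return calendar_dates.txt lines (with header) listing every active date."""
    lines = ["service_id,date,exception_type"]
    for start, end, flags in ranges:
        day = start
        while day <= end:  # end_date is inclusive
            if flags[day.weekday()]:
                lines.append(f"{service_id},{day:%Y%m%d},1")
            day += timedelta(days=1)
    return lines


calendar_dates_lines = expand_to_calendar_dates("train_815_service", RANGES)
```

For this small example the expansion produces 25 date rows versus 3 rows in the proposed calendar.txt. The gap grows with the length of each block, which is where the proposal would save the most.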


Labels: Change type: Functional; GTFS Schedule
