EMSL: Provide Data in Ingest Directory and Organize ETL Scripts #11

@vchendrix

Implement EMSL data ingest per standardized requirements:

  • Create an ingest/emsl subfolder.
  • Place all EMSL data files in this directory. Each file must be a JSON list (enclosed in square brackets), named using the convention emsl_00001.json, emsl_00002.json, etc.
  • Ensure each file is limited to ~25 MB and contains only complete records.
  • All files must conform to the latest release schema.
  • Document and implement a file splitting strategy if necessary.
  • All ETL scripts for EMSL should be placed in contrib/emsl.
  • Ensure each file can be validated independently of the others.
  • Document the EMSL ingest process, folder structure, file format, and splitting strategy.
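
One possible splitting strategy for the requirements above, as a minimal sketch rather than the agreed implementation: serialize records one at a time, start a new file whenever the next record would push the current file past the size limit, and never split a record across files. The names `split_records` and `check_file`, the reading of "~25 MB" as 25 * 1024 * 1024 bytes, and the assumption that every record is a JSON object are all hypothetical. Validation against the release schema is not covered here.

```python
import json
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # assumed interpretation of "~25 MB"


def split_records(records, out_dir, max_bytes=MAX_BYTES, prefix="emsl"):
    """Write records as numbered JSON-list files, each at most max_bytes.

    A single record larger than max_bytes still gets a file of its own,
    because records are never split.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    chunk = []      # serialized records for the file being built
    chunk_size = 2  # bytes for the enclosing "[" and "]"

    def flush():
        nonlocal chunk, chunk_size
        if not chunk:
            return
        path = out_dir / f"{prefix}_{len(paths) + 1:05d}.json"
        path.write_bytes(b"[" + b",".join(chunk) + b"]")
        paths.append(path)
        chunk, chunk_size = [], 2

    for record in records:
        encoded = json.dumps(record).encode("utf-8")
        added = len(encoded) + (1 if chunk else 0)  # +1 for the comma
        if chunk and chunk_size + added > max_bytes:
            flush()
            added = len(encoded)
        chunk.append(encoded)
        chunk_size += added
    flush()
    return paths


def check_file(path, max_bytes=MAX_BYTES):
    """Check one file on its own: size limit and a top-level list of objects."""
    path = Path(path)
    if path.stat().st_size > max_bytes:
        return False
    try:
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
    except json.JSONDecodeError:
        return False  # truncated or otherwise malformed JSON
    return isinstance(data, list) and all(isinstance(r, dict) for r in data)
```

Because each output file is a complete JSON list, `check_file` can be run on any single file in ingest/emsl without reading its neighbours.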

Labels: documentation, enhancement