56 changes: 56 additions & 0 deletions .github/workflows/validate-pr.yml
@@ -0,0 +1,56 @@
name: "validate-pr"

on:
  pull_request:

permissions:
  contents: "read"

concurrency:
  group: "validate-pr-${{ github.event.pull_request.number || github.run_id }}"
  cancel-in-progress: true

defaults:
  run:
    shell: "bash -euxo pipefail {0}"

env:
  PYTHONUNBUFFERED: "1"

jobs:
  validate:
    runs-on: "ubuntu-24.04"

    steps:
      - name: "Checkout PR code"
        uses: "actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683" # v4.2.2
        with:
          repository: "${{ github.event.pull_request.head.repo.full_name }}"
          ref: "${{ github.event.pull_request.head.ref }}"
          fetch-depth: 0
          fetch-tags: true

      - name: "Install system dependencies"
        run: "sudo apt-get install genometools python3 --yes -qq >/dev/null"

      - name: "Install Python dependencies"
        run: "pip3 install -r requirements.txt"

      - name: "Validate GFF files"
        run: "./scripts/validate-gff 'data/'"

      - name: "Rebuild (validation only)"
        id: "rebuild"
        continue-on-error: true
        run: "./scripts/rebuild --input-dir 'data/' --output-dir 'data_output/' --no-pull --allow-dirty"

      - name: "Upload validation artifacts"
        if: "always()"
        uses: "actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02" # v4.6.2
        with:
          name: "validation-output"
          path: "data_output/"

      - name: "Fail if rebuild or validation found errors"
        if: "steps.rebuild.outcome == 'failure'"
        run: "exit 1"
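
The last three steps of this workflow implement a deferred failure: the rebuild is allowed to fail, artifacts are uploaded regardless, and only then is the job failed. For local reproduction, that control flow can be sketched in Python. This is a minimal sketch and not part of the PR: `run_deferred`, `default_runner`, and the injectable `runner` parameter are hypothetical names, and only the two commands are copied from the workflow.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from the workflow steps above. Everything else in this
# sketch (function names, the injectable runner) is hypothetical.
VALIDATE_CMD = ["./scripts/validate-gff", "data/"]
REBUILD_CMD = [
    "./scripts/rebuild",
    "--input-dir", "data/",
    "--output-dir", "data_output/",
    "--no-pull",
    "--allow-dirty",
]


def default_runner(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def run_deferred(runner: Callable[[Sequence[str]], int] = default_runner) -> int:
    """Mimic the job: validation fails fast, a rebuild failure is deferred."""
    # The validation step has no continue-on-error, so a failure stops here.
    code = runner(VALIDATE_CMD)
    if code != 0:
        return code
    # The rebuild step uses continue-on-error, so only record its outcome.
    rebuild_code = runner(REBUILD_CMD)
    # (The workflow uploads data_output/ at this point, whatever the outcome.)
    # Final step: report the deferred rebuild failure.
    return 1 if rebuild_code != 0 else 0
```

Calling `run_deferred()` from the repository root runs both scripts with the real `subprocess` runner; passing a stub runner makes the control flow testable without the scripts.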
@@ -0,0 +1,22 @@
## Unreleased

- Remove invalid `qc.missingData.scoreWeight` and `qc.mixedSites.scoreWeight`

## 2025-09-09T12:13:13Z

Add schema definition URL to `pathogen.json`. This is a purely technical change, for the convenience of dataset authors. The data itself is not modified.

## 2025-03-26T14:04:54Z

- Clarified dataset name as PRRSV-1 ORF5
- Removed 'X' marking Lelystad as a vaccine


## 2025-03-26T11:47:13Z

Fix GFF3 format issues in genome annotation


## 2025-03-12T21:00:00Z

Initial release.
@@ -0,0 +1,30 @@
# Betaarterivirus europensis 1 (PRRSV-1) based on reference M96262.2 (Lelystad)


## Dataset attributes

| attribute | value |
| -------------------- | ---------------------------------------- |
| name | PRRSV-1 ORF5 Lineages, Yim-im & Zhang 2025 Vet Microbiol |
| refName | LEYPOLYENV |
| refAccession | M96262.2 |
| refProtein | AAA46278.1 |


## Scope of this dataset

This dataset follows [Yim-im et al., 2025](https://doi.org/10.1016/j.vetmic.2025.110413), which defines PRRSV-1 lineages on a global scale. This
nomenclature builds upon the following publications:
1. [Shi et al., 2010](https://doi.org/10.1016/j.virusres.2010.08.014)
2. [Balka et al., 2018](https://doi.org/10.1038/s41598-018-26036-w)
3. [Lee et al., 2023](https://doi.org/10.3390/pathogens12060757)

## Authors and contacts

Maintainer: [Michael Zeller](mailto:mazeller@iastate.edu?subject=[Nextclade]%20PRRSV1)

For questions regarding the dataset, please contact the corresponding author of [Yim-im et al., 2025](https://doi.org/10.1016/j.vetmic.2025.110413).

## What is a Nextclade dataset

Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
Binary file not shown.
@@ -0,0 +1,2 @@
>lelystad_orf5
ATGAGATGTTCTCACAAATTGGGGCGTTTCTTGACTCCGCACTCTTGCTTCTGGTGGCTTTTTTTGCTGTGTACCGGCTTGTCCTGGTCCTTTGCCGATGGCAACGGCGACAGCTCGACATACCAATACATATATAACTTGACGATATGCGAGCTGAATGGGACCGACTGGTTGTCCAGCCATTTTGGTTGGGCAGTCGAGACCTTTGTGCTTTACCCGGTTGCCACTCATATCCTCTCACTGGGTTTTCTCACAACAAGCCATTTTTTTGACGCGCTCGGTCTCGGCGCTGTATCCACTGCAGGATTTGTTGGCGGGCGGTACGTACTCTGCAGCGTCTACGGCGCTTGTGCTTTCGCAGCGTTCGTATGTTTTGTCATCCGTGCTGCTAAAAATTGCATGGCCTGCCGCTATGCCCGTACCCGGTTTACCAACTTCATTGTGGACGACCGGGGGAGAGTTCATCGATGGAAGTCTCCAATAGTGGTAGAAAAATTGGGCAAAGCCGAAGTCGATGGCAACCTCGTCACCATCAAACATGTCGTCCTCGAAGGGGTTAAAGCTCAACCCTTGACGAGGACTTCGGCTGAGCAATGGGAGGCCTAG
@@ -0,0 +1,4 @@
##gff-version 3
##sequence-region lelystad_orf5 1 606
lelystad_orf5	Geneious	extracted region	1	606	.	+	.	Name=Extracted region from LEYPOLYENV
lelystad_orf5	Geneious	gene	1	606	.	+	0	Name=ORF5 CDS
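
The annotation above can be sanity-checked against its own `##sequence-region` pragma. The sketch below assumes tab-delimited columns, as the GFF3 specification requires, and numeric coordinates. `check_gff` is a hypothetical helper for illustration, not the repository's `scripts/validate-gff`. The multiple-of-three check assumes the `gene` feature spans one complete, uninterrupted coding region, as ORF5 does here.

```python
def check_gff(text: str) -> list[str]:
    """Return a list of problems found in a GFF3 document (empty if none)."""
    problems = []
    regions = {}  # seqid -> (start, end) taken from ##sequence-region pragmas
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) != 4:
                problems.append(f"line {lineno}: malformed ##sequence-region")
                continue
            _, seqid, start, end = parts
            regions[seqid] = (int(start), int(end))
            continue
        if line.startswith("#"):
            continue  # other pragmas and comments
        cols = line.split("\t")
        if len(cols) != 9:
            problems.append(f"line {lineno}: expected 9 columns, got {len(cols)}")
            continue
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])
        if start > end:
            problems.append(f"line {lineno}: start {start} > end {end}")
        if seqid in regions:
            lo, hi = regions[seqid]
            if start < lo or end > hi:
                problems.append(f"line {lineno}: feature outside {lo}..{hi}")
        # Assumption: gene/CDS features here are complete coding regions.
        if ftype in ("gene", "CDS") and (end - start + 1) % 3 != 0:
            problems.append(f"line {lineno}: length not a multiple of 3")
    return problems


example = (
    "##gff-version 3\n"
    "##sequence-region lelystad_orf5 1 606\n"
    "lelystad_orf5\tGeneious\tgene\t1\t606\t.\t+\t0\tName=ORF5 CDS\n"
)
print(check_gff(example))
```

A line whose columns were collapsed to single spaces would be reported as not having 9 columns, which is the most common way a hand-edited GFF3 file breaks.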
@@ -0,0 +1,54 @@
{
  "$schema": "https://raw.githubusercontent.com/nextstrain/nextclade/refs/heads/release/packages/nextclade-schemas/input-pathogen-json.schema.json",
  "schemaVersion": "3.0.0",
  "files": {
    "reference": "lelystad_orf5.fasta",
    "pathogenJson": "pathogen.json",
    "genomeAnnotation": "lelystad_orf5.gff",
    "examples": "sequences.fasta",
    "readme": "README.md",
    "changelog": "CHANGELOG.md",
    "treeJson": "tree.json"
  },
  "alignmentParams": {
    "penaltyGapOpen": 8,
    "penaltyGapOpenInFrame": 12,
    "penaltyGapOpenOutOfFrame": 14,
    "gapAlignmentSide": "left",
    "minSeedCover": 0.01,
    "minLength": 400
  },
  "attributes": {
    "name": "PRRSV-1 ORF5",
    "reference name": "Porcine reproductive and respiratory syndrome virus 1, ORF5 cds",
    "reference accession": "M96262.2"
  },
  "qc": {
    "missingData": {
      "enabled": true,
      "missingDataThreshold": 2000,
      "scoreBias": 500
    },
    "snpClusters": {
      "enabled": true,
      "windowSize": 100,
      "clusterCutOff": 6,
      "scoreWeight": 50
    },
    "mixedSites": {
      "enabled": true,
      "mixedSitesThreshold": 15
    },
    "frameShifts": {
      "enabled": true,
      "scoreWeight": 20
    },
    "stopCodons": {
      "enabled": true,
      "scoreWeight": 50
    }
  },
  "version": {
    "tag": "unreleased"
  }
}
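
The changelog entry about removing `qc.missingData.scoreWeight` and `qc.mixedSites.scoreWeight` can be turned into a small regression check. This is a minimal sketch that assumes only what the changelog states, namely that these two dotted paths should be absent. `has_path` and `find_present_paths` are hypothetical helpers, and this is not a substitute for validating against the Nextclade schema referenced in `$schema`.

```python
import json

# The two paths named as invalid in the changelog.
INVALID_PATHS = ("qc.missingData.scoreWeight", "qc.mixedSites.scoreWeight")


def has_path(doc, dotted):
    """Return True if every key along a dotted path exists in nested dicts."""
    node = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def find_present_paths(doc, paths=INVALID_PATHS):
    """List the given dotted paths that are present in the document."""
    return [p for p in paths if has_path(doc, p)]


# A trimmed copy of the "qc" section shown above.
sample = json.loads("""
{
  "qc": {
    "missingData": {"enabled": true, "missingDataThreshold": 2000, "scoreBias": 500},
    "mixedSites": {"enabled": true, "mixedSitesThreshold": 15}
  }
}
""")
print(find_present_paths(sample))
```

For the file in this diff the check finds nothing to report; a `pathogen.json` that reintroduced either key would have that path listed.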