Welcome to domainMASSTs

This repository contains the code and data for the different domain-specific MASSTs currently under development in the Dorrestein Lab at UC San Diego. This includes microbeMASST, plantMASST, tissueMASST, microbiomeMASST, and foodMASST. Aggregated search outputs can be generated and visualized using metadataMASST.

The code for the standalone web applications, which let users search one spectrum at a time, can be found in GNPS_MASST.

Standalone Web Apps:

Publications associated with the search tools:

Batch search of multiple spectra against all domainMASSTs

Running jobs.py uses the Fast Search API to execute a batch search of multiple MS/MS spectra against the data currently indexed in GNPS/MassIVE, Metabolomics Workbench, MetaboLights, and NORMAN, and generates outputs for all listed domainMASSTs simultaneously.

A series of interactive HTML tree files, one for each domain-specific MASST, ending with _domain.html (e.g., _microbe.html).
A series of JSON files for the different trees (e.g., _microbe.json).
A _matches.tsv file. This contains all the scans in the currently indexed data that match your searched spectrum, including samples that are not part of the curated domain-specific MASSTs.
A _library.tsv file. This contains a list of spectra from the GNPS libraries that match your spectrum of interest, which enables a Level 2 annotation according to the Metabolomics Standards Initiative.
A _datasets.tsv file. This contains the number of unique samples matching your searched spectrum in each currently indexed dataset.
A series of _count_domain.tsv files, containing information on the matches found for each domain-specific MASST.
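As a rough illustration of the naming scheme above, the sketch below lists the file names one might expect for a given output prefix. The helper `expected_outputs` is hypothetical, and the domain suffixes other than `microbe` are assumptions derived from the tool names; the real file names may also carry a per-spectrum identifier, so check what your run actually produces.

```python
from pathlib import Path

# Assumed domain suffixes; only "microbe" is confirmed by the examples above.
DOMAINS = ["microbe", "plant", "tissue", "microbiome", "food"]


def expected_outputs(prefix, domains=DOMAINS):
    """Return the output paths suggested by the naming scheme for one prefix."""
    base = Path(prefix)
    names = []
    for domain in domains:
        # Per-domain tree (HTML and JSON) and count table.
        names.append(f"{base.name}_{domain}.html")
        names.append(f"{base.name}_{domain}.json")
        names.append(f"{base.name}_count_{domain}.tsv")
    # Tables that are not tied to a single domain.
    for suffix in ("matches", "library", "datasets"):
        names.append(f"{base.name}_{suffix}.tsv")
    return [base.parent / name for name in names]
```

Calling `expected_outputs("output_directory/output_prefix")` gives a list that can be compared against the directory contents to spot missing files.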

Execute batch run

Open jobs.py and add entries to the files list as ("input_directory/input_file", "output_directory/output_prefix").
Check and adjust the search parameters, such as minimum cosine score, m/z tolerance, and minimum number of matching peaks, based on your research question.
Run jobs.py
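The list entries described in the first step could look like the sketch below. The paths are placeholders, and `check_entries` is a hypothetical helper that is not part of jobs.py; it only verifies that each entry is an (input file, output prefix) pair whose input has one of the extensions mentioned in this README.

```python
from pathlib import Path

# Placeholder entries in the ("input file", "output prefix") form described above.
files = [
    ("input_directory/input_file.mgf", "output_directory/output_prefix"),
    ("input_directory/usi_list.tsv", "output_directory/usi_prefix"),
]

# Input types named in this README: MGF spectra, or USI lists as CSV/TSV.
ALLOWED_SUFFIXES = {".mgf", ".csv", ".tsv"}


def check_entries(entries):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    for i, entry in enumerate(entries):
        if not (isinstance(entry, tuple) and len(entry) == 2):
            problems.append(f"entry {i}: expected a 2-tuple, got {entry!r}")
            continue
        input_file, output_prefix = entry
        if Path(input_file).suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"entry {i}: unsupported input type {input_file!r}")
        if not output_prefix:
            problems.append(f"entry {i}: empty output prefix")
    return problems
```

An empty list from `check_entries(files)` means every entry has the expected shape; it does not check that the input files exist.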

Note:

The input can be either a single .mgf file (generated via MZmine or by the molecular networking workflow in GNPS) or a list of USIs provided as a .csv or .tsv file.
Run jobs.py a couple of times with the option skip_existing=True until no new output is generated. Some entries may fail because of the Fast Search API, but subsequent re-runs should catch all possible matches. (This should no longer be an issue.)
Please make sure to use Python 3.10.
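The re-run advice above amounts to repeating the batch until the output directory stops growing. A minimal sketch, assuming the batch can be wrapped in a callable (`run_once` is a hypothetical stand-in for however you launch jobs.py with skip_existing=True) and that new results appear as new files under the output directory:

```python
from pathlib import Path


def count_files(output_dir):
    """Count regular files below output_dir (0 if it does not exist yet)."""
    root = Path(output_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.rglob("*") if p.is_file())


def rerun_until_stable(run_once, output_dir, max_runs=5):
    """Call run_once() until a run adds no new files; return the number of runs."""
    previous = count_files(output_dir)
    for run in range(1, max_runs + 1):
        run_once()
        current = count_files(output_dir)
        if current == previous:
            return run  # this run produced nothing new
        previous = current
    return max_runs
```

The `max_runs` cap is an arbitrary safeguard so that a persistently failing entry cannot loop forever.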

Lineages

The lineages folder contains the complete lineage information for each NCBI taxonomy ID used in microbeMASST and plantMASST. These tools currently cover the following number of taxa at each rank:

Tool	Kingdom	Phylum	Class	Order	Family	Genus	Species	Strain
microbeMASST	8	20	48	124	278	561	1379	542
plantMASST	1	1	11	81	319	1796	3712	NA

How to cite?

Please cite the following paper: microbeMASST: a taxonomically informed mass spectrometry search tool for microbial metabolomics data
