ImmuneWatch DETECT

Accurately annotate the epitope specificity of your T-cell receptors

Install

ImmuneWatch DETECT is compatible with Linux and macOS. Docker must be installed on your system as a prerequisite. For Docker installation guidelines, refer to the official documentation.

Once Docker is set up, install the tool with the following command:

docker pull public.ecr.aws/q5z5g5i0/detect:latest && wget https://imw-public.s3.eu-central-1.amazonaws.com/detect/imw_detect && chmod +x imw_detect

Run

To run ImmuneWatch DETECT, execute the minimal command below from your command line. It reads your TCR repertoire data, retrains the algorithm on the specified database, and generates predictions.

Note that a valid license is required for operation. To obtain a license, please request access through our website.

./imw_detect \
-i repertoire.tsv \
-o predictions.tsv \
-d imwdb \
-l license
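
For scripted runs, the same invocation can be assembled programmatically. The following is a minimal sketch, not part of the tool: the helper name build_detect_command and its parameter names are hypothetical, and it uses only the flags documented on this page. It only builds the argument list and does not run anything.

```python
from typing import List, Optional


def build_detect_command(
    input_file: str,
    output_file: str,
    database: str = "imwdb",
    license_arg: str = "license",
    epitope: Optional[str] = None,
    executable: str = "./imw_detect",
) -> List[str]:
    """Assemble an imw_detect argument list (hypothetical helper).

    Only the flags shown on this page are used: -i, -o, -d, -l and the
    optional --epitope.
    """
    cmd = [
        executable,
        "-i", input_file,
        "-o", output_file,
        "-d", database,
        "-l", license_arg,
    ]
    if epitope is not None:
        cmd += ["--epitope", epitope]
    return cmd


cmd = build_detect_command("repertoire.tsv", "predictions.tsv")
print(" ".join(cmd))
```

To execute the command, the list could be passed to subprocess.run(cmd, check=True), which raises an error if the tool exits with a non-zero status.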

Arguments

-i, --inputfile

ImmuneWatch DETECT supports multiple TCR repertoire file formats, detailed in the table below. Ensure that each file contains a header row with at least the required columns. Common delimited formats, including CSV and TSV, are accepted.

Single-cell pairing is supported by including one of the following columns: cell_id, clone_id or barcode. Values in these columns must not be empty. It is also possible to annotate both single-cell and bulk data within the same file.

Supported File Formats

| File Format | Required Columns |
| --- | --- |
| AIRR | junction_aa, v_call, j_call |
| Adaptive Biotech ImmunoSEQ | aminoAcid, vGeneName, jGeneName |
| Adaptive Biotech ImmunoSEQ v4 | amino_acid, v_resolved, j_resolved |
| MiXCR | aaSeqCDR3, allVHitsWithScore, allJHitsWithScore |
| 10x Genomics | cdr3, v_gene, j_gene, is_cell |
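
Before running the tool, it can be useful to check which supported format a file's header satisfies. The sketch below is an illustration only and says nothing about how ImmuneWatch DETECT detects formats internally: the function names are hypothetical, the column sets are copied from the Supported File Formats table, and a tab delimiter is assumed.

```python
import csv
import io

# Required columns per supported format, copied from the table on this page.
REQUIRED_COLUMNS = {
    "AIRR": {"junction_aa", "v_call", "j_call"},
    "Adaptive Biotech ImmunoSEQ": {"aminoAcid", "vGeneName", "jGeneName"},
    "Adaptive Biotech ImmunoSEQ v4": {"amino_acid", "v_resolved", "j_resolved"},
    "MiXCR": {"aaSeqCDR3", "allVHitsWithScore", "allJHitsWithScore"},
    "10x Genomics": {"cdr3", "v_gene", "j_gene", "is_cell"},
}


def read_header(text, delimiter="\t"):
    """Return the column names from the first row of delimited text."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))


def matching_formats(header):
    """Return the formats whose required columns all appear in the header."""
    present = set(header)
    return [name for name, cols in REQUIRED_COLUMNS.items() if cols <= present]


sample = "junction_aa\tv_call\tj_call\nCASSIRSSYEQYF\tTRBV19*02\tTRBJ2-7*01\n"
formats = matching_formats(read_header(sample))
print(formats)  # ['AIRR']
```

An empty result means the header is missing at least one required column for every supported format.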

Example AIRR

| junction_aa | v_call | j_call |
| --- | --- | --- |
| CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 |
| CARNTGNQFYF | TRAV24*01 | TRAJ49*01 |

Example AIRR Single Cell

| clone_id | junction_aa | v_call | j_call |
| --- | --- | --- | --- |
| 1 | CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 |
| 1 | CAGDDQGGKLIF | TRAV27*01 | TRAJ23*01 |
| 2 | CARNTGNQFYF | TRAV24*01 | TRAJ49*01 |
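
To see which chains share an identifier in a single-cell file, rows can be grouped by the pairing column. This is a minimal sketch using the example data above: the function name group_chains is hypothetical, a tab delimiter is assumed, and how ImmuneWatch DETECT itself pairs chains is not described on this page. Rows with an empty identifier are rejected, in line with the requirement that these values are not empty.

```python
import csv
import io
from collections import defaultdict

# The AIRR single-cell example from this page, as tab-separated text.
SAMPLE = (
    "clone_id\tjunction_aa\tv_call\tj_call\n"
    "1\tCASSIRSSYEQYF\tTRBV19*02\tTRBJ2-7*01\n"
    "1\tCAGDDQGGKLIF\tTRAV27*01\tTRAJ23*01\n"
    "2\tCARNTGNQFYF\tTRAV24*01\tTRAJ49*01\n"
)


def group_chains(text, id_column="clone_id"):
    """Group junction_aa sequences by the pairing column (hypothetical helper)."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        cell = (row.get(id_column) or "").strip()
        if not cell:
            raise ValueError(f"empty {id_column} on line {line_no}")
        groups[cell].append(row["junction_aa"])
    return dict(groups)


pairs = group_chains(SAMPLE)
for clone, chains in pairs.items():
    print(clone, chains)
```

Here clone 1 carries two chains and clone 2 carries one.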

-o, --outputfile

The output of ImmuneWatch DETECT is provided in TSV format and contains the AIRR standard columns junction_aa, v_call, and j_call. A small example of the output file, including the header and a few sample rows, follows the column descriptions below.

ImmuneWatch DETECT adds two extra columns:

  • Epitope: The epitope from the provided database that the TCR is most likely to recognize. If no corresponding epitope is identified, the value is 'None'.
  • Score: The binding score, ranging from 0 (no binding) to 1 (highest binding), which indicates the likelihood that the TCR recognizes the annotated epitope. For most purposes, when using the IMWdb as the database, we recommend a score cut-off of 0.2 for determining reliable predictions. This recommendation also applies when using the --epitope argument. Detailed information on scoring can be found here.

| junction_aa | v_call | j_call | Epitope | Score |
| --- | --- | --- | --- | --- |
| CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 | GILGFVFTL | 0.3987 |
| CARNTGNQFYF | TRAV24*01 | TRAJ49*01 | NLVPMVATV | 0.2836 |
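
The recommended cut-off can be applied to the output file with a few lines of Python. The sketch below is a hedged illustration: the function name is hypothetical, the cut-off is treated as inclusive (the documentation does not say whether 0.2 itself counts), and the third sample row is borrowed from the --epitope example on this page to show a score below the cut-off.

```python
import csv
import io

SCORE_CUTOFF = 0.2  # recommended cut-off when using the IMWdb

SAMPLE_OUTPUT = (
    "junction_aa\tv_call\tj_call\tEpitope\tScore\n"
    "CASSIRSSYEQYF\tTRBV19*02\tTRBJ2-7*01\tGILGFVFTL\t0.3987\n"
    "CARNTGNQFYF\tTRAV24*01\tTRAJ49*01\tNLVPMVATV\t0.2836\n"
    # Row taken from the --epitope example, used here as a low-scoring case.
    "CASSLLAGPYNEQFF\tTRBV19*01\tTRBJ2-1*01\tGILGFVFTL\t0.1966\n"
)


def reliable_predictions(text, cutoff=SCORE_CUTOFF):
    """Keep rows with an annotated epitope and a score at or above the cut-off."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        row
        for row in rows
        # Rows without an annotation ('None') are skipped before the score is read.
        if row["Epitope"] != "None" and float(row["Score"]) >= cutoff
    ]


kept = reliable_predictions(SAMPLE_OUTPUT)
for row in kept:
    print(row["junction_aa"], row["Epitope"], row["Score"])
```

To read a real output file, replace io.StringIO(text) with an open file handle for predictions.tsv.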

Explainability

When using the IMWdb database, the output file will include an additional column: Reference TCRs. These are TCRs that support the given epitope annotation, each listed with the DOI of the publication in which the TCR-epitope pair was reported.

| junction_aa | v_call | j_call | Epitope | Score | Reference TCRs |
| --- | --- | --- | --- | --- | --- |
| CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 | GILGFVFTL | 0.3987 | [('CASSSRSSYEQYF', '10.1073/pnas.1603106113')] |
| CARNTGNQFYF | TRAV24*01 | TRAJ49*01 | NLVPMVATV | 0.2836 | [('CAFNTGNQFYF', '10.4049/jimmunol.1303147'), ('CASNTGNQFYF', '10.1016/j.celrep.2017.03.072')] |
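
In the examples above, the Reference TCRs cell looks like a Python-style list of (sequence, DOI) tuples. Assuming that serialisation holds for the whole column, which this page does not state explicitly, a cell can be parsed safely with ast.literal_eval. The function name below is hypothetical.

```python
import ast


def parse_reference_tcrs(cell):
    """Parse a 'Reference TCRs' cell into a list of (junction_aa, doi) tuples.

    Assumes the cell is a Python-style list of 2-tuples, as in the examples
    on this page. ast.literal_eval only evaluates literals, never code.
    """
    refs = ast.literal_eval(cell)
    return [(str(seq), str(doi)) for seq, doi in refs]


cell = (
    "[('CAFNTGNQFYF', '10.4049/jimmunol.1303147'), "
    "('CASNTGNQFYF', '10.1016/j.celrep.2017.03.072')]"
)
references = parse_reference_tcrs(cell)
for seq, doi in references:
    # A DOI resolves through the doi.org resolver.
    print(seq, "https://doi.org/" + doi)
```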

Additional Epitope Information: Antigen and Species

When using the IMWdb or VDJdb databases, the output file will include two additional columns: Antigen and Species.

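| junction_aa | v_call | j_call | Epitope | Score | Antigen | Species |
| --- | --- | --- | --- | --- | --- | --- |
| CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 | GILGFVFTL | 0.3987 | M | InfluenzaA |
| CARNTGNQFYF | TRAV24*01 | TRAJ49*01 | NLVPMVATV | 0.2836 | pp65 | CMV |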

-d, --database

ImmuneWatch DETECT is designed to work with the IMWdb. However, the core program is versatile enough to start from any TCR-Epitope data, allowing users to leverage their own training datasets to annotate TCR specificity. With such datasets, the same level of prediction quality cannot be guaranteed. Drop us an email if you would like to discuss this further or see your data included.

Supported File Formats

For those opting to use their own TCR-Epitope datasets, ImmuneWatch DETECT currently supports data in the AIRR (Adaptive Immune Receptor Repertoire) format. Common delimited formats, including CSV and TSV, are accepted.

| File Format | Required Columns |
| --- | --- |
| AIRR | junction_aa, v_call, j_call, epitope |

If you do not have your own TCR-Epitope data, we have curated a list of recommended databases that are compatible with ImmuneWatch DETECT. To use one of them for predictions, follow its download instructions and pass the corresponding database argument, both listed below.

| Database | Download | Argument |
| --- | --- | --- |
| IMWdb | ImmuneWatch's own database. Use this for best performance. No additional download necessary | -d imwdb |
| VDJdb | wget https://github.com/antigenomics/vdjdb-db/releases/download/2023-06-01/vdjdb-2023-06-01.zip && unzip vdjdb-2023-06-01.zip -d vdjdb | -d vdjdb/vdjdb.txt |

Note that it is the responsibility of each user to ensure that they comply with the terms and conditions of any external database before downloading and using it. We strongly advise you to review these terms and conditions carefully to ensure full compliance.


--epitope

The optional --epitope argument switches ImmuneWatch DETECT from identifying the most likely binding epitope for your TCRs to calculating the binding score of each TCR against a specific epitope you provide. This allows for targeted analysis of the interaction between your TCR repertoire and a particular epitope of interest.

The structure of the output file matches the standard output format, with the columns junction_aa, v_call, j_call, Epitope and Score. However, when using the --epitope argument, there are two notable differences:

  • The Epitope column always contains the epitope specified via the --epitope argument.
  • The Score range is extended to run from -1 to 1. A negative score indicates that the predicted target space of the TCR does not contain the query epitope, with a degree of confidence reflected by the magnitude of the (negative) score.

--epitope GILGFVFTL

| junction_aa | v_call | j_call | Epitope | Score |
| --- | --- | --- | --- | --- |
| CASSIRSSYEQYF | TRBV19*02 | TRBJ2-7*01 | GILGFVFTL | 0.3987 |
| CASSLLAGPYNEQFF | TRBV19*01 | TRBJ2-1*01 | GILGFVFTL | 0.1966 |
| CASGPLLLMTNEQFF | TRBV12-4*01 | TRBJ2-1*01 | GILGFVFTL | -0.3432 |
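
The three rows above fall into three different bands of the extended score range. The sketch below shows one way to label them. The function name and the label wording are illustrative choices rather than tool output, and the 0.2 cut-off is again treated as inclusive by assumption.

```python
def interpret_epitope_score(score, cutoff=0.2):
    """Label a --epitope score (hypothetical helper, labels are illustrative)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the documented range -1 to 1")
    if score < 0:
        # Negative: the query epitope is outside the TCR's predicted target space.
        return "outside predicted target space"
    if score >= cutoff:
        return "reliable prediction"
    return "below recommended cut-off"


examples = [
    ("CASSIRSSYEQYF", 0.3987),
    ("CASSLLAGPYNEQFF", 0.1966),
    ("CASGPLLLMTNEQFF", -0.3432),
]
labels = [interpret_epitope_score(score) for _, score in examples]
for (junction, score), label in zip(examples, labels):
    print(junction, score, label)
```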

Unseen Epitope Predictions?

ImmuneWatch DETECT is an algorithm that falls into the seen-epitope category: annotating a TCR with a given epitope generally requires the database to contain training data for that epitope. However, predictions for unseen epitopes are also supported to a limited extent. When an epitope is similar to an epitope in the database, ImmuneWatch DETECT can make predictions for it. You can use the check-epitope-support command to verify whether your epitope of interest is supported by the IMWdb.


Citation

When using ImmuneWatch DETECT please cite as follows:

ImmuneWatch DETECT, Version 1.0. Developed by ImmuneWatch BV. 2024. Available at: "https://www.immunewatch.com/detect"