IgQUAST manual

1. What is IgQUAST?
    1.1. Input
    1.2. Output
2. Installation
3. IgQUAST usage
    3.1. Input options
    3.2. Output options
    3.3. Performed scenarios
    3.4. Miscellaneous options
    3.5. Examples
    3.6. Output files
4. Citations
5. Feedback and bug reports

1. What is IgQUAST?

IgQUAST (ImmunoGlobulin QUality ASsessment Tool) is a tool for quality assessment of adaptive immune repertoires. It can be used for benchmarking adaptive immune repertoire construction tools and for estimating the quality of constructed repertoires. IgQUAST performs both reference-based and reference-free analysis.

1.1. Input

IgQUAST takes the following as input: an initial Rep-seq library, a constructed repertoire, and a reference repertoire.

The initial Rep-seq library should be in FASTA or FASTQ format. Reads should be properly cropped (they should start at the beginning of the V gene and end at the end of the J gene), reads obtained from the negative strand should be reversed, and contaminating reads should be filtered out. Cropping, strand correction, and contamination filtering can be performed by the VJFinder tool from the IgRepertoireConstructor package (see the IgRepertoireConstructor manual for a description of the VJFinder input and output formats).

The analyzed constructed and reference repertoires should each be represented by two files: repertoire sequences in FASTA format and a read-to-cluster map (RCM) file in a special format. See the IgRepertoireConstructor manual for a comprehensive description of the repertoire format. If you have only one of these files, the tool can reconstruct the other from the available information (use the --reconstruct option).
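
Before running IgQUAST it can be handy to confirm that a read file really is FASTA or FASTQ. The following is a minimal sketch, not part of IgQUAST: the helper names open_maybe_gzipped and detect_read_format are hypothetical, and the check relies only on the usual convention that FASTA records start with ">" and FASTQ records start with "@".

```python
import gzip


def open_maybe_gzipped(path):
    """Open a plain or gzipped text file for reading (hypothetical helper)."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt")
    return open(path, "rt")


def detect_read_format(path):
    """Return "fasta" or "fastq" based on the first non-empty line."""
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            break  # first non-empty line is neither kind of header
    raise ValueError("%s does not look like FASTA or FASTQ" % path)
```

This only inspects the first record header; it does not verify cropping, strand, or contamination filtering.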

1.2. Output

IgQUAST reports plots and metrics. Plots are reported in PNG, PDF, and SVG formats. Metrics are reported in text (brief) and JSON (full) formats.

2. Installation

IgQUAST is a part of the IgRepertoireConstructor package. See the IgRepertoireConstructor manual for the installation instructions.

Please verify your IgQUAST installation before the first run:

    ./igquast.py --test

If the installation succeeded, you will find the following information at the end of the log:

  Thank you for using IgQUAST!
  Log was written to igquast_test/igquast.log
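
The check above can also be scripted. This is a small sketch rather than an IgQUAST feature: the function name log_reports_success and the tail_lines parameter are assumptions, and the only fact taken from this manual is the closing message that a successful run writes at the end of the log.

```python
def log_reports_success(log_path, marker="Thank you for using IgQUAST!", tail_lines=5):
    """Return True if the closing message appears in the last lines of the log.

    tail_lines is an arbitrary illustrative window, not an IgQUAST setting.
    """
    with open(log_path) as handle:
        tail = handle.readlines()[-tail_lines:]
    return any(marker in line for line in tail)
```

For example, log_reports_success("igquast_test/igquast.log") could be called after the test run.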

3. IgQUAST usage

To run IgQUAST, type:
    
    ./igquast.py [options] -s <initial reads> -c <constructed repertoire FASTA> -C <constructed repertoire RCM> -r <reference repertoire FASTA> -R <reference repertoire RCM> -o <output dir for plots>
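
When IgQUAST is called from a pipeline, the command line above can be assembled programmatically. The sketch below is one possible wrapper, not something shipped with IgQUAST: build_igquast_command and run_igquast are hypothetical names, and the script path "./igquast.py" is assumed to be valid in the current directory. Only flags described in this manual are used; the reference repertoire arguments are optional because reference-free analysis does not need them.

```python
import subprocess


def build_igquast_command(initial_reads, constructed_fa, constructed_rcm, output_dir,
                          reference_fa=None, reference_rcm=None,
                          extra_options=(), script="./igquast.py"):
    """Build the argument list for an IgQUAST run (hypothetical helper)."""
    command = [script]
    command += list(extra_options)  # e.g. ["--reference-free"]
    command += ["-s", initial_reads, "-c", constructed_fa, "-C", constructed_rcm]
    if reference_fa is not None:
        command += ["-r", reference_fa]
    if reference_rcm is not None:
        command += ["-R", reference_rcm]
    command += ["-o", output_dir]
    return command


def run_igquast(command):
    """Run the command and return its exit code."""
    return subprocess.call(command)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.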
    

3.1. Input options

-c / --constructed-repertoire <constructed repertoire FASTA>
FASTA file with constructed repertoire sequences. Can be gzipped.

-C / --constructed-rcm <constructed repertoire RCM>
RCM file with constructed repertoire read-to-cluster map. Can be gzipped.

-r / --reference-repertoire <reference repertoire FASTA>
FASTA file with reference repertoire sequences. Can be gzipped.

-R / --reference-rcm <reference repertoire RCM>
RCM file with reference repertoire read-to-cluster map. Can be gzipped.

-s / --initial-reads <initial reads>
Initial Rep-seq reads in FASTA or FASTQ format. Can be gzipped.

--reconstruct | --no-reconstruct
Whether to reconstruct missing repertoire files when possible. Disabled by default.

3.2. Output options

-o / --output-dir <output dir>
Output directory (required).

--text <text report file>
File for text report output. Default: <output dir>/igquast.txt.

--json <JSON report file>
File for JSON report output. Default: <output dir>/igquast.json.

-F / --figure-format <figure format(s)>
Figure format(s) for plots. Allowed values are png, pdf, and svg. Several values can be passed, separated by commas. An empty string disables plotting. Default value is png,pdf,svg.
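
The rules for the --figure-format value can be illustrated with a short validator. This is a sketch of the behaviour described above, not IgQUAST's actual argument parser; the names ALLOWED_FIGURE_FORMATS and parse_figure_formats are hypothetical, and stripping whitespace around each item is an assumption.

```python
ALLOWED_FIGURE_FORMATS = ("png", "pdf", "svg")


def parse_figure_formats(value="png,pdf,svg"):
    """Split a comma-separated format string into a list of formats.

    An empty string yields an empty list, meaning no plots are produced.
    """
    if value.strip() == "":
        return []
    formats = [item.strip() for item in value.split(",")]
    unknown = [item for item in formats if item not in ALLOWED_FIGURE_FORMATS]
    if unknown:
        raise ValueError("unsupported figure format(s): %s" % ", ".join(unknown))
    return formats
```

A wrapper script could use such a check to reject a mistyped format before starting a long run.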

3.3. Performed scenarios

--repertoire-to-repertoire-matching | --no-repertoire-to-repertoire-matching
Whether to perform repertoire-to-repertoire matching. Enabled by default.

--partition-based | --no-partition-based
Enable/disable partition-based metrics and plots. Enabled by default.

--reference-free | --no-reference-free
Enable/disable reference-free metrics and plots. Disabled by default.

--export-bad-clusters | --no-export-bad-clusters
Whether to export untrustworthy clusters during reference-free analysis. Disabled by default.

3.4. Algorithm parameters

--reference-size-cutoff <positive integer>
Cutoff for reference cluster size. Smaller reference clusters are discarded during repertoire-to-repertoire comparison. Default value is 5.
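
The effect of the cutoff can be sketched as follows. This is an illustration under stated assumptions, not IgQUAST's implementation: the read-to-cluster map is modelled as a plain dictionary from read name to cluster name, cluster size is taken to be the number of reads assigned to the cluster, and clusters whose size is at least the cutoff are assumed to be kept. The function names are hypothetical.

```python
from collections import Counter


def cluster_sizes(read_to_cluster):
    """Count how many reads are assigned to each cluster."""
    return Counter(read_to_cluster.values())


def clusters_passing_cutoff(read_to_cluster, size_cutoff=5):
    """Return {cluster: size} for clusters with size >= size_cutoff.

    Assumption: "smaller" clusters are those with size strictly below the cutoff.
    """
    sizes = cluster_sizes(read_to_cluster)
    return {cluster: size for cluster, size in sizes.items() if size >= size_cutoff}
```

With the default cutoff of 5, a cluster supported by two reads would be dropped, while a cluster supported by five reads would be retained under these assumptions.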

3.4. Miscellaneous options

--test
Run IgQUAST on the toy test dataset. The test run is equivalent to the following command line:
    
    ./igquast.py -s igquast_test_dataset/test/input_reads.fa.gz -c igquast_test_dataset/igrec/final_repertoire.fa.gz -C igquast_test_dataset/igrec/final_repertoire.rcm -r igquast_test_dataset/test/repertoire.fa.gz -R igquast_test_dataset/test/repertoire.rcm -o igquast_test_test
    
-h / --help
Show help and exit.

3.5. Examples

Perform reference-free analysis only:
    
    ./igquast.py -s igquast_test_dataset/test/input_reads.fa.gz -c igquast_test_dataset/test/igrec_bad/final_repertoire.fa.gz -C igquast_test_dataset/test/igrec_bad/final_repertoire.rcm -o igquast_test --reference-free
    
Do not plot figures, make reports only:
    
    ./igquast.py -s igquast_test_dataset/test/input_reads.fa.gz -c igquast_test_dataset/test/igrec/final_repertoire.fa.gz -C igquast_test_dataset/test/igrec/final_repertoire.rcm -r igquast_test_dataset/test/repertoire.fa.gz -R igquast_test_dataset/test/repertoire.rcm --figure-format= -o igquast_test
    

3.6. Output files

IgQUAST creates the output directory (its name is specified by the -o option) and writes its output files there. The files for the reports (in text and JSON formats) are specified by the corresponding options (--text and --json). Some files may be absent depending on the provided input. Note that reference-free analysis is disabled by default because it is very time- and memory-consuming. Use the --reference-free option to enable it.
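
Because some output files may be absent, a downstream script may want to check what was actually written before reading it. The sketch below assumes the default report names from section 3.2 (igquast.txt and igquast.json) and the log file igquast.log in the output directory; locate_reports and load_json_report are hypothetical helper names, and no assumption is made about the keys inside the JSON report.

```python
import json
import os


def locate_reports(output_dir, text_report=None, json_report=None):
    """Return {name: path} for the report and log files that exist."""
    paths = {
        "text": text_report or os.path.join(output_dir, "igquast.txt"),
        "json": json_report or os.path.join(output_dir, "igquast.json"),
        "log": os.path.join(output_dir, "igquast.log"),
    }
    return {name: path for name, path in paths.items() if os.path.isfile(path)}


def load_json_report(path):
    """Load the full JSON report as a Python object."""
    with open(path) as handle:
        return json.load(handle)
```

If --text or --json was used to redirect a report, pass the same path as text_report or json_report.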

4. Citations

Alexander Shlemov, Sergey Bankevich, Andrey Bzikadze, Dmitriy M. Chudakov, Yana Safonova, and Pavel A. Pevzner. Reconstructing antibody repertoires from error-prone immunosequencing datasets (submitted)

5. Feedback and bug reports

Your comments, bug reports, and suggestions are very welcome. They will help us to further improve IgQUAST.

If you have any trouble running IgQUAST, please send us the log file from the output directory.

Address for communications: igtools_support@googlegroups.com.