Description

Subworkflow that optionally preprocesses amino acid FASTA sequences with seqkit (seq, replace, rmdup), computes sequence statistics with seqfu stats before and after preprocessing, and exports MultiQC-compatible statistics and software versions.
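The control flow described above can be illustrated with a small pure-Python sketch. It is not the subworkflow's implementation and does not call seqkit or seqfu. The helper names (`parse_fasta`, `seq_stats`, `preprocess`, `run`) are hypothetical. The specific clean-up steps (uppercasing, stripping a trailing `*`, a minimum-length filter, dropping duplicate sequences) are assumptions chosen for illustration; the actual seqkit arguments are set by the pipeline configuration and are not stated here.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (sequence id, amino acid sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (id, sequence) pairs, joining wrapped lines."""
    records: List[Record] = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the id.
            name, chunks = (line[1:].split() or [""])[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def seq_stats(records: List[Record]) -> Dict[str, float]:
    """Basic length statistics (a stand-in for a stats tool's output)."""
    lengths = [len(seq) for _, seq in records]
    total = sum(lengths)
    return {
        "count": len(lengths),
        "total_length": total,
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "mean_length": total / len(lengths) if lengths else 0.0,
    }


def preprocess(records: List[Record], min_length: int = 1) -> List[Record]:
    """Assumed clean-up: normalise, length-filter, drop duplicate sequences."""
    seen = set()
    kept: List[Record] = []
    for name, seq in records:
        seq = seq.upper().rstrip("*")  # assumption: trailing stop symbol removed
        if len(seq) < min_length or seq in seen:
            continue
        seen.add(seq)
        kept.append((name, seq))
    return kept


def run(records: List[Record], skip_preprocessing: bool = False):
    """Return the final records plus statistics before (and after) clean-up."""
    stats = {"before": seq_stats(records)}
    if skip_preprocessing:
        # Only the initial statistics are computed; input passes through.
        return records, stats
    cleaned = preprocess(records)
    stats["after"] = seq_stats(cleaned)
    return cleaned, stats
```

With `skip_preprocessing=True` the sketch returns the input unchanged together with only the initial statistics, mirroring the behaviour described for the `skip_preprocessing` input.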

Input

name
description
pattern

ch_fasta

FASTA file of amino acid sequences.
Structure: [ val(meta), [ path(fasta) ] ]

skip_preprocessing

If true, skip the seqkit-based preprocessing steps and only compute the initial sequence statistics.

Output

name
description
pattern

fasta

Final amino acid FASTA file: the preprocessed file, or the original input if preprocessing is skipped.

multiqc_files

Statistics file for MultiQC.

versions

File listing the software versions used in the subworkflow.