.. _list_of_scripts:

List of Modules and their Options
*********************************

The following sections show a list the available scripts that can be used in the
configuration file, along with their options for customization.


Contamination
=============

The name to use in the configuration file is ``contamination`` and the
:ref:`contamination_table` table shows its configuration.

.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _contamination_table:

.. table:: List of options for the **contamination** script.

    +------------------------------+------------+-----------------------------+
    | Option                       | Type       | Description                 |
    +==============================+============+=============================+
    | ``--raw-dir``                | ``STRING`` | Directory containing the raw|
    |                              |            | data (one file per sample,  |
    |                              |            | where the name of the file  |
    |                              |            | (minus the extension) is the|
    |                              |            | sample identification       |
    |                              |            | number.                     |
    +------------------------------+------------+-----------------------------+
    | ``--colsample``              | ``STRING`` | The sample column.          |
    +------------------------------+------------+-----------------------------+
    | ``--colmarker``              | ``STRING`` | The marker column.          |
    +------------------------------+------------+-----------------------------+
    | ``--colbaf``                 | ``STRING`` | The B allele frequency      |
    |                              |            | column.                     |
    +------------------------------+------------+-----------------------------+
    | ``--colab1``                 | ``STRING`` | The AB Allele 1 column.     |
    +------------------------------+------------+-----------------------------+
    | ``--colab2``                 | ``STRING`` | The AB Allele 2 column.     |
    +------------------------------+------------+-----------------------------+
    | ``--sge``                    |            | Use SGE for parallelization.|
    +------------------------------+------------+-----------------------------+
    | ``--sge-walltime``           | ``STRING`` | The walltime for the job to |
    |                              |            | run on the cluster. Do not  |
    |                              |            | use if you are not required |
    |                              |            | to specify a walltime for   |
    |                              |            | your jobs on your cluster   |
    |                              |            | (*e.g.*                     |
    |                              |            | '``qsub -lwalltime=1:0:0``' |
    |                              |            | on the cluster).            |
    +------------------------------+------------+-----------------------------+
    | ``--sge-nodes``              | ``INT``    | The number of nodes and the |
    |                              |            | number of processor per     |
    |                              |            | nodes to use (*e.g.*        |
    |                              |            | '``qsub -lnodes=X:ppn=Y``'  |
    |                              |            | on the cluster, where X is  |
    |                              |            | the number of nodes and Y is|
    |                              |            | the number of processor to  |
    |                              |            | use. Do  not use if you are |
    |                              |            | not required to specify the |
    |                              |            | number of nodes  for your   |
    |                              |            | jobs on the cluster.        |
    +------------------------------+------------+-----------------------------+
    | ``--sample-per-run-for-sge`` | ``INT``    | The number of sample to run |
    |                              |            | for a single SGE job.       |
    +------------------------------+------------+-----------------------------+

The name of the standalone script is ``pyGenClean_check_contamination``.


.. _duplicated_samples_options:

Duplicated Samples
==================

The name to use in the configuration file is ``duplicated_samples`` and the
:ref:`duplicated_samples_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _duplicated_samples_table:

.. table:: List of options for the **duplicated_samples** script.

    +------------------------------------+-----------+-------------------------+
    | Option                             | Type      | Description             |
    +====================================+===========+=========================+
    | ``--sample-completion-threshold``  | ``FLOAT`` | The completion          |
    |                                    |           | threshold to consider a |
    |                                    |           | replicate when choosing |
    |                                    |           | the best replicates and |
    |                                    |           | for creating the        |
    |                                    |           | composite samples.      |
    |                                    |           | [default: 0.9]          |
    +------------------------------------+-----------+-------------------------+
    | ``--sample-concordance-threshold`` | ``FLOAT`` | The concordance         |
    |                                    |           | threshold to consider a |
    |                                    |           | replicate when choosing |
    |                                    |           | the best replicates and |
    |                                    |           | for creating the        |
    |                                    |           | composite samples.      |
    |                                    |           | [default: 0.97]         |
    +------------------------------------+-----------+-------------------------+

The name of the standalone script is ``pyGenClean_duplicated_samples``.


Duplicated Markers
==================

The name to use in the configuration file is ``duplicated_snps`` and the
:ref:`duplicated_markers_table` table shows its configuration.

.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _duplicated_markers_table:

.. table:: List of options for the **duplicated_snps** script.

    +---------------------------------+-----------+--------------------------+
    | Option                          | Type      | Description              |
    +=================================+===========+==========================+
    | ``--snp-completion-threshold``  | ``FLOAT`` | The completion threshold |
    |                                 |           | to consider a replicate  |
    |                                 |           | when choosing the best   |
    |                                 |           | replicates and for       |
    |                                 |           | composite creation.      |
    |                                 |           | [default: 0.9]           |
    +---------------------------------+-----------+--------------------------+
    | ``--snp-concordance-threshold`` | ``FLOAT`` | The concordance          |
    |                                 |           | threshold to consider a  |
    |                                 |           | replicate when choosing  |
    |                                 |           | the best replicates and  |
    |                                 |           | for composite creation.  |
    |                                 |           | [default: 0.98]          |
    +---------------------------------+-----------+--------------------------+
    | ``--frequency_difference``      | ``FLOAT`` | The maximum difference   |
    |                                 |           | in frequency between     |
    |                                 |           | duplicated markers       |
    |                                 |           | [default: 0.05]          |
    +---------------------------------+-----------+--------------------------+

The name of the standalone script is ``pyGenClean_duplicated_snps``.


Clean No Call and Only Heterozygous Markers
===========================================

The name to use in the configuration file is ``noCall_hetero_snps`` and there
are no customization possible.

The name of the standalone script is ``pyGenClean_clean_noCall_hetero_snps``.


Sample Missingness
==================

The name to use in the configuration file is ``sample_missingness`` and the
:ref:`sample_missingness_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _sample_missingness_table:

.. table:: List of options for the **sample_missingness** script.

    +------------+-----------+------------------------------------------------+
    | Option     | Type      | Description                                    |
    +============+===========+================================================+
    | ``--mind`` | ``FLOAT`` | The missingness threshold (remove samples with |
    |            |           | more than x percent missing genotypes).        |
    |            |           | [Default: 0.100]                               |
    +------------+-----------+------------------------------------------------+

The name of the standalone script is ``pyGenClean_sample_missingness``.


Marker Missingness
==================

The name to use in the configuration file is ``snp_missingness`` and the
:ref:`snp_missingness_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _snp_missingness_table:

.. table:: List of options for the **snp_missingness** script.

    +------------+-----------+---------------------------------------------+
    | Option     | Type      | Description                                 |
    +============+===========+=============================================+
    | ``--geno`` | ``FLOAT`` | The missingness threshold (remove SNPs with |
    |            |           | more than x percent missing genotypes).     |
    |            |           | [Default: 0.020]                            |
    +------------+-----------+---------------------------------------------+

The name of the standalone script is ``pyGenClean_snp_missingness``.


Sex Check
=========

The name to use in the configuration file is ``sex_check`` and the
:ref:`sex_check_table` table shows its configuration.


.. tabularcolumns:: p{6.3cm}Lp{7.5cm}
.. _sex_check_table:

.. table:: List of options for the **sex_check** script.

    +---------------------------+------------+---------------------------------+
    | Option                    | Type       | Description                     |
    +===========================+============+=================================+
    | ``--femaleF``             | ``FLOAT``  | The female F threshold.         |
    |                           |            | [default: < 0.300000]           |
    +---------------------------+------------+---------------------------------+
    | ``--maleF``               | ``FLOAT``  | The male F threshold.           |
    |                           |            | [default: > 0.700000]           |
    +---------------------------+------------+---------------------------------+
    | ``--nbChr23``             | ``INT``    | The minimum number of markers   |
    |                           |            | on chromosome 23 before         |
    |                           |            | computing Plink's sex check     |
    |                           |            | [default: 50]                   |
    +---------------------------+------------+---------------------------------+
    | ``--gender-plot``         |            | Create the gender plot          |
    |                           |            | (summarized chr Y intensities   |
    |                           |            | in function of summarized chr X |
    |                           |            | intensities) for problematic    |
    |                           |            | samples. Not used by default.   |
    +---------------------------+------------+---------------------------------+
    | ``--sex-chr-intensities`` | ``FILE``   | A file containing alleles       |
    |                           |            | intensities for each of the     |
    |                           |            | markers located on the X and Y  |
    |                           |            | chromosome for the gender plot. |
    +---------------------------+------------+---------------------------------+
    | ``--gender-plot-format``  | ``STRING`` | The output file format for the  |
    |                           |            | gender plot (png, ps, or pdf    |
    |                           |            | formats are available).         |
    |                           |            | [default: png]                  |
    +---------------------------+------------+---------------------------------+
    | ``--lrr-baf``             |            | Create the LRR and BAF plot for |
    |                           |            | problematic samples. Not used   |
    |                           |            | by default.                     |
    +---------------------------+------------+---------------------------------+
    | ``--lrr-baf-raw-dir``     | ``DIR``    | Directory or list of            |
    |                           |            | directories containing          |
    |                           |            | information about every samples |
    |                           |            | (BAF and LRR).                  |
    +---------------------------+------------+---------------------------------+
    | ``--lrr-baf-format``      | ``STRING`` | The output file format for the  |
    |                           |            | LRR and BAF plot (png, ps or    |
    |                           |            | pdf formats are available).     |
    |                           |            | [default: png]                  |
    +---------------------------+------------+---------------------------------+
    | ``--lrr-baf-dpi``         | ``INT``    | The pixel density of the        |
    |                           |            | figure(s) (DPI).                |
    +---------------------------+------------+---------------------------------+

The name of the standalone script is ``pyGenClean_sex_check``. If you want to
redo the BAF and LRR plot or the gender plot, you can use the
``pyGenClean_baf_lrr_plot`` and ``pyGenClean_gender_plot`` scripts,
respectively.


Plate Bias
==========

The name to use in the configuration file is ``plate_bias`` and the
:ref:`plate_bias_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _plate_bias_table:

.. table:: List of options for the **plate_bias** script.

    +------------------+-----------+-----------------------------------------+
    | Option           | Type      | Description                             |
    +==================+===========+=========================================+
    | ``--loop-assoc`` | ``FILE``  | The file containing the plate           |
    |                  |           | organization of each samples. Must      |
    |                  |           | contains three column (with no header): |
    |                  |           | famID, indID and plateName.             |
    +------------------+-----------+-----------------------------------------+
    | ``--pfilter``    | ``FLOAT`` | The significance threshold used for the |
    |                  |           | plate effect. [default: 1.0e-07]        |
    +------------------+-----------+-----------------------------------------+

The name of the standalone script is ``pyGenClean_plate_bias``.


Heterozygous Haploid
====================

The name to use in the configuration file is ``remove_heterozygous_haploid`` and
there are no customization possible.

The name of the standalone script is ``pyGenClean_remove_heterozygous_haploid``.


Related Samples
===============

The name to use in the configuration file is ``find_related_samples`` and the
:ref:`find_related_samples_table` table shows its configuration.


.. tabularcolumns:: p{5.1cm}Lp{7.5cm}
.. _find_related_samples_table:

.. table:: List of options for the **find_related_samples** script.

    +-----------------------------+------------+-------------------------------+
    | Option                      | Type       | Description                   |
    +=============================+============+===============================+
    | ``--genome-only``           |            | Only create the genome file.  |
    |                             |            | Not selected by default.      |
    +-----------------------------+------------+-------------------------------+
    | ``--min-nb-snp``            | ``INT``    | The minimum number of markers |
    |                             |            | needed to compute IBS values. |
    |                             |            | [Default: 10000]              |
    +-----------------------------+------------+-------------------------------+
    | ``--indep-pairwise``        | ``INT``    | Three numbers: window size,   |
    |                             | ``INT``    | window shift and the r2       |
    |                             | ``FLOAT``  | threshold. [default: ['50',   |
    |                             |            | '5', '0.1']]                  |
    +-----------------------------+------------+-------------------------------+
    | ``--maf``                   | ``FLOAT``  | Restrict to SNPs with MAF >=  |
    |                             |            | threshold. [default: 0.05]    |
    +-----------------------------+------------+-------------------------------+
    | ``--ibs2-ratio``            | ``FLOAT``  | The initial IBS2* ratio (the  |
    |                             |            | minimum value to show in the  |
    |                             |            | plot. [default: 0.8]          |
    +-----------------------------+------------+-------------------------------+
    | ``--sge``                   |            | Use SGE for parallelization.  |
    +-----------------------------+------------+-------------------------------+
    | ``--sge-walltime``          | ``STRING`` | The time limit (for clusters).|
    |                             |            | Do not use if you are not     |
    |                             |            | required to specify a walltime|
    |                             |            | for your jobs on your cluster |
    |                             |            | (e.g. ``-lwalltime=1:0:0`` on |
    |                             |            | the cluster). Allow enough    |
    |                             |            | time for proper job           |
    |                             |            | completion.                   |
    +-----------------------------+------------+-------------------------------+
    | ``--sge-nodes``             | ``INT``    | The number of nodes and the   |
    |                             | ``INT``    | number of processor per nodes |
    |                             |            | to use (e.g. ``qsub           |
    |                             |            | -lnodes=X:ppn=Y`` on the      |
    |                             |            | cluster, where X is the number|
    |                             |            | of nodes and Y is the number  |
    |                             |            | of processor to use. Do not   |
    |                             |            | use if you are not required to|
    |                             |            | specify the number of nodes   |
    |                             |            | for your jobs on the cluster. |
    |                             |            | Allow enough ressources for   |
    |                             |            | proper job completion.        |
    +-----------------------------+------------+-------------------------------+
    | ``--line-per-file-for-sge`` | ``INT``    | The number of line per file   |
    |                             |            | for SGE task array.           |
    |                             |            | [default: 100]                |
    +-----------------------------+------------+-------------------------------+

The name of the standalone script is ``pyGenClean_find_related_samples``. Even
though randomly choosing a subset of related samples is done automatically, you
can use the ``pyGenClean_merge_related_samples`` to perform it again.


Ethnicity
=========

The name to use in the configuration file is ``check_ethnicity`` and the
:ref:`check_ethnicity_table` table shows its configuration.


.. tabularcolumns:: p{5.1cm}Lp{7.5cm}
.. _check_ethnicity_table:

.. table:: List of options for the **check_ethnicity** script.

    +-----------------------------+------------+-------------------------------+
    | Option                      | Type       | Description                   |
    +=============================+============+===============================+
    | ``--skip-ref-pops``         |            | Perform the MDS computation,  |
    |                             |            | but skip the three reference  |
    |                             |            | panels.                       |
    +-----------------------------+------------+-------------------------------+
    | ``--ceu-bfile``             | ``FILE``   | The input file prefix (will   |
    |                             |            | find the plink binary files   |
    |                             |            | by appending the prefix to    |
    |                             |            | the .bim, .bed and .fam       |
    |                             |            | files, respectively.) for the |
    |                             |            | CEU population.               |
    +-----------------------------+------------+-------------------------------+
    | ``--yri-bfile``             | ``FILE``   | The input file prefix (will   |
    |                             |            | find the plink binary files   |
    |                             |            | by appending the prefix to    |
    |                             |            | the .bim, .bed and .fam       |
    |                             |            | files, respectively.) for the |
    |                             |            | CEU population.               |
    +-----------------------------+------------+-------------------------------+
    | ``--jpt-chb-bfile``         | ``FILE``   | The input file prefix (will   |
    |                             |            | find the plink binary files   |
    |                             |            | by appending the prefix to    |
    |                             |            | the .bim, .bed and .fam       |
    |                             |            | files, respectively.) for the |
    |                             |            | JPT-CHB population.           |
    +-----------------------------+------------+-------------------------------+
    | ``--min-nb-snp``            | ``FILE``   | The minimum number of markers |
    |                             |            | needed to compute IBS values. |
    |                             |            | [Default: 8000]               |
    +-----------------------------+------------+-------------------------------+
    | ``--indep-pairwise``        | ``INT``    | Three numbers: window size,   |
    |                             | ``INT``    | window shift and the r2       |
    |                             | ``FLOAT``  | threshold. [default: ['50',   |
    |                             |            | '5', '0.1']]                  |
    +-----------------------------+------------+-------------------------------+
    | ``--maf``                   | ``INT``    | Restrict to SNPs with MAF >=  |
    |                             |            | threshold. [default: 0.05]    |
    +-----------------------------+------------+-------------------------------+
    | ``--sge``                   |            | Use SGE for parallelization.  |
    +-----------------------------+------------+-------------------------------+
    | ``--sge-walltime``          | ``STRING`` | The time limit (for clusters).|
    |                             |            | Do not use if you are not     |
    |                             |            | required to specify a walltime|
    |                             |            | for your jobs on your cluster |
    |                             |            | (e.g. ``-lwalltime=1:0:0`` on |
    |                             |            | the cluster). Allow enough    |
    |                             |            | time for proper job           |
    |                             |            | completion.                   |
    +-----------------------------+------------+-------------------------------+
    | ``--sge-nodes``             | ``INT``    | The number of nodes and the   |
    |                             | ``INT``    | number of processor per nodes |
    |                             |            | to use (e.g. ``qsub           |
    |                             |            | -lnodes=X:ppn=Y`` on the      |
    |                             |            | cluster, where X is the number|
    |                             |            | of nodes and Y is the number  |
    |                             |            | of processor to use. Do not   |
    |                             |            | use if you are not required to|
    |                             |            | specify the number of nodes   |
    |                             |            | for your jobs on the cluster. |
    |                             |            | Allow enough ressources for   |
    |                             |            | proper job completion.        |
    +-----------------------------+------------+-------------------------------+
    | ``--ibs-sge-walltime``      | ``STRING`` | The time limit (for clusters) |
    |                             |            | for the IBS jobs. Do not use  |
    |                             |            | if you are not required to    |
    |                             |            | specify a walltime for your   |
    |                             |            | jobs on your cluster (e.g.    |
    |                             |            | ``-lwalltime=1:0:0`` on the   |
    |                             |            | cluster). Allow enough time   |
    |                             |            | for proper job completion.    |
    +-----------------------------+------------+-------------------------------+
    | ``--ibs-sge-nodes``         | ``INT``    | The number of nodes and the   |
    |                             | ``INT``    | number of processor per nodes |
    |                             |            | to use for the IBS jobs (e.g. |
    |                             |            | ``qsub                        |
    |                             |            | -lnodes=X:ppn=Y`` on the      |
    |                             |            | cluster, where X is the number|
    |                             |            | of nodes and Y is the number  |
    |                             |            | of processor to use. Do not   |
    |                             |            | use if you are not required to|
    |                             |            | specify the number of nodes   |
    |                             |            | for your jobs on the cluster. |
    |                             |            | Allow enough ressources for   |
    |                             |            | proper job completion.        |
    +-----------------------------+------------+-------------------------------+
    | ``--line-per-file-for-sge`` | ``INT``    | The number of line per file   |
    |                             |            | for SGE task array.           |
    |                             |            | [default: 100]                |
    +-----------------------------+------------+-------------------------------+
    | ``--nb-components``         | ``INT``    | The number of component to    |
    |                             |            | compute. [default: 10]        |
    +-----------------------------+------------+-------------------------------+
    | ``--outliers-of``           | ``STRING`` | Finds the outliers of this    |
    |                             |            | population. [default: CEU]    |
    +-----------------------------+------------+-------------------------------+
    | ``--multiplier``            | ``FLOAT``  | To find the outliers, we look |
    |                             |            | for more than x times the     |
    |                             |            | cluster standard deviation.   |
    |                             |            | [default: 1.9]                |
    +-----------------------------+------------+-------------------------------+
    | ``--xaxis``                 | ``STRING`` | The component to use for the  |
    |                             |            | X axis. [default: C1]         |
    +-----------------------------+------------+-------------------------------+
    | ``--yaxis``                 | ``STRING`` | The component to use for the  |
    |                             |            | Y axis. [default: C2]         |
    +-----------------------------+------------+-------------------------------+
    | ``--format``                | ``STRING`` | The output file format (png,  |
    |                             |            | ps, pdf, or X11 formats are   |
    |                             |            | available). [default: png]    |
    +-----------------------------+------------+-------------------------------+
    | ``--title``                 | ``STRING`` | The title of the MDS plot.    |
    |                             |            | [default: C2 in function of   |
    |                             |            | C1 - MDS]                     |
    +-----------------------------+------------+-------------------------------+
    | ``--xlabel``                | ``STRING`` | The label of the X axis.      |
    |                             |            | [default: C1]                 |
    +-----------------------------+------------+-------------------------------+
    | ``--ylabel``                | ``STRING`` | The label of the Y axis.      |
    |                             |            | [default: C2]                 |
    +-----------------------------+------------+-------------------------------+
    | ``--create-scree-plot``     |            | Computes Eigenvalues and      |
    |                             |            | creates a scree plot.         |
    +-----------------------------+------------+-------------------------------+
    | ``--scree-plot-title``      | ``STRING`` | The main title of the scree   |
    |                             |            | plot                          |
    +-----------------------------+------------+-------------------------------+

The name of the standalone script is ``pyGenClean_check_ethnicity``. If you want
to redo the outlier detection using a different multiplier, have a look at the
``pyGenClean_find_outliers`` script. If you want to redo any MDS plot, have a
look at the ``pyGenClean_plot_MDS`` script. If you want to compute the
*Eigenvectors* using the ``smartpca`` tool, have a look at the
``pyGenClean_plot_eigenvalues`` script.


Minor Allele Frequency of Zero
==============================

The name to use in the configuration file is ``flag_maf_zero`` and there
are no customization possible.

The name of the standalone script is ``pyGenClean_flag_maf_zero``.


Hardy Weinberg Equilibrium
==========================

The name to use in the configuration file is ``flag_hw`` and the
:ref:`flag_hw_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _flag_hw_table:

.. table:: List of options for the **flag_hw** script.

    +-----------+-----------+-------------------------------------------+
    | Option    | Type      | Description                               |
    +===========+===========+===========================================+
    | ``--hwe`` | ``FLOAT`` | The Hardy-Weinberg equilibrium threshold. |
    |           |           | [default: 1e-4]                           |
    +-----------+-----------+-------------------------------------------+

The name of the standalone script is ``pyGenClean_flag_hw``.


Subsetting the Data
===================

The name to use in the configuration file is ``subset`` and the
:ref:`subset_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _subset_table:

.. table:: List of options for the **subset** script.

    +---------------+----------+--------------------------------------------+
    | Option        | Type     | Description                                |
    +===============+==========+============================================+
    | ``--exclude`` | ``FILE`` | A file containing SNPs to exclude from the |
    |               |          | data set.                                  |
    +---------------+----------+--------------------------------------------+
    | ``--extract`` | ``FILE`` | A file containing SNPs to extract from the |
    |               |          | data set.                                  |
    +---------------+----------+--------------------------------------------+
    | ``--remove``  | ``FILE`` | A file containing samples (FID and IID) to |
    |               |          | remove from the data set.                  |
    +---------------+----------+--------------------------------------------+
    | ``--keep``    | ``FILE`` | A file containing samples (FID and IID) to |
    |               |          | keep from the data set.                    |
    +---------------+----------+--------------------------------------------+

The name of the standalone script is ``pyGenClean_subset_data``.


Comparison with a Gold Standard
===============================

The name to use in the configuration file is ``compare_gold_standard`` and the
:ref:`compare_gold_standard_table` table shows its configuration.


.. tabularcolumns:: p{6.6cm}Lp{7.5cm}
.. _compare_gold_standard_table:

.. table:: List of options for the **compare_gold_standard** script.

    +------------------------+----------+--------------------------------------+
    | Option                 | Type     | Description                          |
    +========================+==========+======================================+
    | ``--gold-bfile``       | ``FILE`` | The input file prefix (will find the |
    |                        |          | plink binary files by appending the  |
    |                        |          | prefix to the .bim, .bed and .fam    |
    |                        |          | files, respectively.) for the Gold   |
    |                        |          | Standard .                           |
    +------------------------+----------+--------------------------------------+
    | ``--same-samples``     | ``FILE`` | A file containing samples which are  |
    |                        |          | present in both the gold standard    |
    |                        |          | and the source panel. One line by    |
    |                        |          | identity and tab separated. For each |
    |                        |          | row, first sample is Gold Standard,  |
    |                        |          | second is source panel.              |
    +------------------------+----------+--------------------------------------+
    | ``--source-manifest``  | ``FILE`` | The illumina marker manifest.        |
    +------------------------+----------+--------------------------------------+
    | ``--source-alleles``   | ``FILE`` | A file containing the source alleles |
    |                        |          | (TOP). Two columns (separated by     |
    |                        |          | tabulation, one with the marker      |
    |                        |          | name, the other with the alleles     |
    |                        |          | (separated by space). No header.     |
    +------------------------+----------+--------------------------------------+
    | ``--sge``              |          | Use SGE for parallelization.         |
    +------------------------+----------+--------------------------------------+
    | ``--do-not-flip``      |          | Do not flip SNPs. WARNING: only use  |
    |                        |          | this option only if the Gold         |
    |                        |          | Standard was generated using the     |
    |                        |          | same chip (hence, flipping is        |
    |                        |          | unnecessary).                        |
    +------------------------+----------+--------------------------------------+
    | ``--use-marker-names`` |          | Use marker names instead of (chr,    |
    |                        |          | position). WARNING: only use this    |
    |                        |          | options only if the Gold Standard    |
    |                        |          | was generated using the same chip    |
    |                        |          | (hence, they have the same marker    |
    |                        |          | names).                              |
    +------------------------+----------+--------------------------------------+

The name of the standalone script is ``pyGenClean_compare_gold_standard``.