pyGenClean.NoCallHetero package
For more information about how to use this module, refer to the Clean No Call and Only Heterozygous Markers Module.
Module contents
Submodules
pyGenClean.NoCallHetero.clean_noCall_hetero_snps module
-
exception pyGenClean.NoCallHetero.clean_noCall_hetero_snps.ProgramError(msg)
Bases: exceptions.Exception
An Exception raised in case of a problem.
Parameters: msg (str) – the message to print to the user before exiting.
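As an illustration of the pattern described above, here is a minimal sketch of an exception class that carries a user-facing message. It is written for Python 3 (where the base class is the built-in Exception rather than exceptions.Exception), and the message attribute is an assumption, not a confirmed detail of the pyGenClean implementation.

```python
class ProgramError(Exception):
    """Illustrative sketch of an exception raised in case of a problem."""

    def __init__(self, msg):
        super().__init__(msg)
        # Assumed attribute: the message to print to the user before exiting.
        self.message = str(msg)

    def __str__(self):
        return self.message
```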
-
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.checkArgs(args)
Checks the arguments and options.
Parameters: args (argparse.Namespace) – an object containing the options of the program.
Returns: True if everything was OK.
If there is a problem with an option, a ProgramError exception is raised, a message is printed to sys.stderr, and the program exits with code 1.
-
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.main(argString=None)
The main function of the module.
Parameters: argString (list) – the options.
These are the steps:
- Prints the options.
- Reads the tfam and tped files and finds all heterozygous and all failed markers (processTPEDandTFAM()).
-
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.parseArgs(argString=None)
Parses the command line options and arguments.
Parameters: argString (list) – the options.
Returns: An argparse.Namespace object created by the argparse module. It contains the values of the different options.

Options   Type     Description
--tfile   string   The input file prefix (Plink tfile).
--out     string   The prefix of the output files.

Note
No option check is done here (except for the one automatically done by argparse). Those need to be done elsewhere (see checkArgs()).
-
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.processTPEDandTFAM(tped, tfam, prefix)
Processes the TPED and TFAM files.
Copies the original tfam file into prefix.tfam. Then, it reads the tped file and keeps in memory two sets containing the markers that are all failed or that contain only heterozygous genotypes.
It creates two output files, prefix.allFailed and prefix.allHetero, containing the markers that are all failed and all heterozygous, respectively.
Note
Markers located on the mitochondrial chromosome that are all heterozygous are not removed.
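The classification described above can be sketched as follows. This is not the pyGenClean implementation: the function name find_failed_and_hetero and the mito_codes parameter are hypothetical, and the sketch assumes a tab-separated TPED file in which each genotype is written as two space-separated alleles ("A G"), with "0 0" as the no-call. It also assumes that no-calls are ignored when testing whether a marker is all heterozygous, and that the mitochondrial chromosome is coded as "26" or "MT".

```python
def find_failed_and_hetero(tped_lines, mito_codes=("26", "MT")):
    """Return (all_failed, all_hetero) sets of marker names (sketch)."""
    all_failed = set()
    all_hetero = set()
    for line in tped_lines:
        fields = line.rstrip("\r\n").split("\t")
        # TPED columns: chromosome, marker name, genetic distance, position,
        # then one genotype per sample.
        chrom, name = fields[0], fields[1]
        genotypes = fields[4:]

        # Assumption: "0 0" is the no-call genotype.
        called = [g for g in genotypes if g != "0 0"]
        if not called:
            all_failed.add(name)
            continue

        # Mitochondrial markers are never reported as all heterozygous.
        if chrom in mito_codes:
            continue

        # A genotype is heterozygous when its first and last alleles differ.
        if all(g[0] != g[-1] for g in called):
            all_hetero.add(name)
    return all_failed, all_hetero
```

The two returned sets correspond to the contents one would expect in prefix.allFailed and prefix.allHetero; writing them to disk is left out of the sketch.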
pyGenClean.NoCallHetero.heterozygosity_plot module
-
exception pyGenClean.NoCallHetero.heterozygosity_plot.ProgramError(msg)
Bases: exceptions.Exception
An Exception raised in case of a problem.
Parameters: msg (str) – the message to print to the user before exiting.
-
pyGenClean.NoCallHetero.heterozygosity_plot.checkArgs(args)
Checks the arguments and options.
Parameters: args (argparse.Namespace) – an object containing the options of the program.
Returns: True if everything was OK.
If there is a problem with an option, a ProgramError exception is raised, a message is printed to sys.stderr, and the program exits with code 1.
-
pyGenClean.NoCallHetero.heterozygosity_plot.compute_heterozygosity(in_prefix, nb_samples)
Computes the heterozygosity ratio of samples (from tped).
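One plausible way to compute a per-sample heterozygosity ratio from TPED lines is sketched below. The definition used here (heterozygous genotypes divided by called genotypes) is an assumption, since the entry above does not spell it out; the function name heterozygosity_ratios is hypothetical, and the input is assumed to be tab-separated with "A G"-style genotypes and "0 0" as the no-call. The sketch takes the lines directly rather than a file prefix.

```python
def heterozygosity_ratios(tped_lines, nb_samples):
    """Return one heterozygosity ratio per sample (illustrative sketch)."""
    nb_hetero = [0] * nb_samples
    nb_called = [0] * nb_samples
    for line in tped_lines:
        # Skip the four leading TPED columns to keep only the genotypes.
        genotypes = line.rstrip("\r\n").split("\t")[4:]
        if len(genotypes) != nb_samples:
            raise ValueError(
                "expected {} genotypes, found {}".format(
                    nb_samples, len(genotypes)
                )
            )
        for i, genotype in enumerate(genotypes):
            if genotype == "0 0":
                # Assumption: no-calls do not count towards the ratio.
                continue
            nb_called[i] += 1
            if genotype[0] != genotype[-1]:
                nb_hetero[i] += 1
    # Samples without any called genotype get NaN instead of dividing by 0.
    return [
        h / c if c else float("nan")
        for h, c in zip(nb_hetero, nb_called)
    ]
```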
-
pyGenClean.NoCallHetero.heterozygosity_plot.compute_nb_samples(in_prefix)
Checks the number of samples.
Parameters: in_prefix (str) – the prefix of the input file.
Returns: the number of samples in prefix.fam.
-
pyGenClean.NoCallHetero.heterozygosity_plot.is_heterozygous(genotype)
Tells if a genotype “A B” is heterozygous.
Parameters: genotype (str) – the genotype to test for heterozygosity.
Returns: True if the genotype is heterozygous, False otherwise.
The genotype must contain two alleles, separated by a space. The function compares the first allele (genotype[0]) with the last one (genotype[-1]).
>>> is_heterozygous("A A")
False
>>> is_heterozygous("G C")
True
>>> is_heterozygous("0 0")  # No call is not heterozygous.
False
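The comparison described above fits in a single expression. The sketch below follows the description (first character against last character) and reproduces the three documented results; the name is_heterozygous_sketch is used to make clear that this is an illustration, not the library source.

```python
def is_heterozygous_sketch(genotype):
    """Tell whether a genotype such as "A B" is heterozygous (sketch)."""
    # Compare the first allele (first character) with the last one.
    return genotype[0] != genotype[-1]
```

Because only the first and last characters are compared, this approach is reliable for single-character alleles. A genotype with multi-character alleles, such as "AT A", would be reported as not heterozygous.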
-
pyGenClean.NoCallHetero.heterozygosity_plot.main(argString=None)
The main function of the module.
Parameters: argString (list) – the options.
These are the steps:
- Prints the options.
- Checks the number of samples in the tfam file (compute_nb_samples()).
- Computes the heterozygosity rate (compute_heterozygosity()).
- Saves the heterozygosity data (in out.het).
- Plots the heterozygosity rate (plot_heterozygosity()).
-
pyGenClean.NoCallHetero.heterozygosity_plot.parseArgs(argString=None)
Parses the command line options and arguments.
Parameters: argString (list) – the options.
Returns: An argparse.Namespace object created by the argparse module. It contains the values of the different options.

Options     Type     Description
--tfile     string   The prefix of the transposed file.
--boxplot   bool     Draw a boxplot instead of a histogram.
--format    string   The output file format.
--bins      int      The number of bins for the histogram.
--xlim      float    The limit of the x axis.
--ymax      float    The maximal Y value.
--out       string   The prefix of the output files.

Note
No option check is done here (except for the one automatically done by argparse). Those need to be done elsewhere (see checkArgs()).
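The option table above maps naturally onto argparse. The following sketch shows one way such a parser could be declared; the function name parse_args_sketch is hypothetical, and the choices made here (making --tfile required, leaving every default as None, treating --xlim as a single float) are assumptions, since the table lists only names, types and descriptions.

```python
import argparse


def parse_args_sketch(arg_string=None):
    """Parse the documented options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(
        description="Plots the heterozygosity rate distribution (sketch)."
    )
    # Assumption: the input prefix is mandatory.
    parser.add_argument("--tfile", type=str, required=True,
                        help="The prefix of the transposed file.")
    parser.add_argument("--boxplot", action="store_true",
                        help="Draw a boxplot instead of a histogram.")
    # Assumption: no defaults are set, so missing options are None.
    parser.add_argument("--format", type=str,
                        help="The output file format.")
    parser.add_argument("--bins", type=int,
                        help="The number of bins for the histogram.")
    parser.add_argument("--xlim", type=float,
                        help="The limit of the x axis.")
    parser.add_argument("--ymax", type=float,
                        help="The maximal Y value.")
    parser.add_argument("--out", type=str,
                        help="The prefix of the output files.")
    # With arg_string=None, argparse reads sys.argv[1:].
    return parser.parse_args(arg_string)
```

Passing a list such as ["--tfile", "data", "--boxplot"] returns a Namespace whose attributes can then be validated separately, which matches the split between parsing and checking noted above.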
-
pyGenClean.NoCallHetero.heterozygosity_plot.plot_heterozygosity(heterozygosity, options)
Plots the heterozygosity rate distribution.
Parameters:
- heterozygosity (numpy.array) – the heterozygosity data.
- options (argparse.Namespace) – the options.
Plots a histogram or a boxplot of the heterozygosity distribution.
-
pyGenClean.NoCallHetero.heterozygosity_plot.safe_main()
A safe version of the main function (that catches ProgramError).
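A wrapper of this kind can be sketched as below. Everything here is illustrative: the stand-in ProgramError and main definitions, the main_func and stream parameters, and the choice to return an exit code instead of exiting are assumptions made so the example runs on its own. How the real safe_main reports the error is not stated in this entry.

```python
import sys


class ProgramError(Exception):
    """Stand-in for the module's ProgramError (assumption)."""


def main():
    """Hypothetical main function that fails with a ProgramError."""
    raise ProgramError("no input file")


def safe_main_sketch(main_func=main, stream=sys.stderr):
    """Run main_func and turn a ProgramError into an exit code (sketch)."""
    try:
        main_func()
    except ProgramError as error:
        # Report the problem to the user instead of showing a traceback.
        print(error, file=stream)
        return 1
    return 0
```

A caller would typically pass the result to sys.exit(), for example sys.exit(safe_main_sketch()).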
-
pyGenClean.NoCallHetero.heterozygosity_plot.save_heterozygosity(heterozygosity, samples, out_prefix)
Saves the heterozygosity data.
Parameters:
- heterozygosity (numpy.array) – the heterozygosity data.
- samples (list of tuples of str) – the list of samples.
- out_prefix (str) – the prefix of the output files.
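To make the parameters above concrete, here is a minimal sketch that writes one line per sample to out_prefix + ".het". The column layout (family ID, individual ID, heterozygosity, tab-separated, no header) is an assumption, as is the interpretation of each sample tuple as a (family ID, individual ID) pair; the actual file format is not described in this entry. A plain list of floats stands in for the numpy.array.

```python
def save_heterozygosity_sketch(heterozygosity, samples, out_prefix):
    """Write one tab-separated line per sample and return the file path."""
    path = out_prefix + ".het"
    with open(path, "w") as output_file:
        # Assumption: samples and heterozygosity values share the same order.
        for (fam_id, ind_id), value in zip(samples, heterozygosity):
            output_file.write("\t".join([fam_id, ind_id, str(value)]) + "\n")
    return path
```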
