pyGenClean.NoCallHetero package¶
For more information about how to use this module, refer to the Clean No Call and Only Heterozygous Markers Module.
Module contents¶
Submodules¶
pyGenClean.NoCallHetero.clean_noCall_hetero_snps module¶
- 
exception pyGenClean.NoCallHetero.clean_noCall_hetero_snps.ProgramError(msg)[source]¶
- Bases: exceptions.Exception
  An Exception raised in case of a problem.
  Parameters: msg (str) – the message to print to the user before exiting.
- 
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.checkArgs(args)[source]¶
- Checks the arguments and options.
  Parameters: args (argparse.Namespace) – an object containing the options of the program.
  Returns: True if everything was OK.
  If there is a problem with an option, an exception is raised using the ProgramError class, a message is printed to sys.stderr, and the program exits with code 1.
- 
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.main(argString=None)[source]¶
- The main function of the module.
  Parameters: argString (list) – the options.
  These are the steps:
  - Prints the options.
  - Reads the tfam and tped files and finds all heterozygous and all failed markers (processTPEDandTFAM()).
 
- 
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.parseArgs(argString=None)[source]¶
- Parses the command line options and arguments.
  Parameters: argString (list) – the options.
  Returns: An argparse.Namespace object created by the argparse module. It contains the values of the different options.
  Options:
  - --tfile (string) – The input file prefix (Plink tfile).
  - --out (string) – The prefix of the output files.
  Note: No option check is done here (except for the one automatically done by argparse). Those need to be done elsewhere (see checkArgs()).
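The split between parsing and checking described above can be illustrated with a minimal sketch. It is not the module's actual code: the names parse_args and check_args, the default value for --out, the required flag on --tfile, and the file-existence check are all assumptions made for illustration. Only the two option names and the ProgramError idea come from this page.

```python
import argparse
import os


class ProgramError(Exception):
    """Hypothetical stand-in for the module's ProgramError."""

    def __init__(self, msg):
        super().__init__(msg)
        self.message = str(msg)


def parse_args(arg_string=None):
    """Parse the two documented options; no validation happens here."""
    parser = argparse.ArgumentParser(description="Sketch of the parser.")
    # 'required' and the default below are assumptions, not documented facts.
    parser.add_argument("--tfile", type=str, required=True,
                        help="The input file prefix (Plink tfile).")
    parser.add_argument("--out", type=str, default="out",
                        help="The prefix of the output files.")
    return parser.parse_args(arg_string)


def check_args(args):
    """Validate the parsed options (assumed check: input files exist)."""
    for suffix in (".tped", ".tfam"):
        file_name = args.tfile + suffix
        if not os.path.isfile(file_name):
            raise ProgramError("{}: no such file".format(file_name))
    return True
```

Keeping validation out of the parser means a caller can build the namespace from a list of strings and decide separately how to report a bad option.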
- 
pyGenClean.NoCallHetero.clean_noCall_hetero_snps.processTPEDandTFAM(tped, tfam, prefix)[source]¶
- Processes the TPED and TFAM files.
  Parameters:
  - tped – the tped file.
  - tfam – the tfam file.
  - prefix – the prefix of the output files.
  Copies the original tfam file into prefix.tfam. Then, it reads the tped file and keeps in memory two sets containing the markers that are all failed or that contain only heterozygous genotypes.
  It creates two output files, prefix.allFailed and prefix.allHetero, containing the markers that are all failed and all heterozygous, respectively.
  Note: Heterozygous-only markers located on the mitochondrial chromosome are not removed.
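The marker classification described above might look roughly like the following sketch. It makes several assumptions that this page does not confirm: the function name classify_markers is hypothetical, no-call genotypes are ignored when deciding whether the remaining genotypes are all heterozygous, and the mitochondrial chromosome is coded as 26 or MT (the usual Plink codes). It works on an iterable of TPED lines rather than on files.

```python
def classify_markers(tped_lines, mito_codes=("26", "MT")):
    """Return (all_failed, all_hetero) sets of marker names.

    Each TPED line holds: chromosome, marker name, genetic distance,
    position, then two alleles per sample (whitespace separated).
    """
    all_failed, all_hetero = set(), set()
    for line in tped_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip empty or truncated lines
        chrom, name = fields[0], fields[1]
        alleles = fields[4:]
        # Pair up consecutive alleles: one (a, b) tuple per sample.
        genotypes = list(zip(alleles[0::2], alleles[1::2]))
        # Assumption: a genotype with a "0" allele counts as a no call.
        called = [g for g in genotypes if "0" not in g]
        if not called:
            all_failed.add(name)
        elif chrom not in mito_codes and all(a != b for a, b in called):
            # Mitochondrial markers are deliberately never flagged.
            all_hetero.add(name)
    return all_failed, all_hetero
```

A marker with a mix of no calls and heterozygous genotypes is flagged as all heterozygous under the assumption stated above; a stricter rule would require every sample to be called.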
pyGenClean.NoCallHetero.heterozygosity_plot module¶
- 
exception pyGenClean.NoCallHetero.heterozygosity_plot.ProgramError(msg)[source]¶
- Bases: exceptions.Exception
  An Exception raised in case of a problem.
  Parameters: msg (str) – the message to print to the user before exiting.
- 
pyGenClean.NoCallHetero.heterozygosity_plot.checkArgs(args)[source]¶
- Checks the arguments and options.
  Parameters: args (argparse.Namespace) – an object containing the options of the program.
  Returns: True if everything was OK.
  If there is a problem with an option, an exception is raised using the ProgramError class, a message is printed to sys.stderr, and the program exits with code 1.
- 
pyGenClean.NoCallHetero.heterozygosity_plot.compute_heterozygosity(in_prefix, nb_samples)[source]¶
- Computes the heterozygosity ratio of samples (from tped). 
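This page does not define the heterozygosity ratio, so the sketch below assumes one plausible definition: for each sample, the number of heterozygous genotypes divided by the number of called (non-missing) genotypes. The function name and the use of an in-memory list of TPED lines are illustrative choices, not the module's API.

```python
def heterozygosity_per_sample(tped_lines, nb_samples):
    """Return one heterozygosity ratio per sample (assumed definition).

    ratio = heterozygous genotypes / called genotypes; NaN if no calls.
    """
    nb_hetero = [0] * nb_samples
    nb_called = [0] * nb_samples
    for line in tped_lines:
        alleles = line.split()[4:]  # drop chrom, name, cM, position
        if len(alleles) != 2 * nb_samples:
            raise ValueError("unexpected number of alleles on a line")
        for i in range(nb_samples):
            a, b = alleles[2 * i], alleles[2 * i + 1]
            if a == "0" or b == "0":
                continue  # no call: excluded from the denominator
            nb_called[i] += 1
            if a != b:
                nb_hetero[i] += 1
    return [h / c if c else float("nan")
            for h, c in zip(nb_hetero, nb_called)]
```

Passing nb_samples explicitly, as the documented signature does, lets the function detect lines whose allele count does not match the sample file.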
- 
pyGenClean.NoCallHetero.heterozygosity_plot.compute_nb_samples(in_prefix)[source]¶
- Checks the number of samples.
  Parameters: in_prefix (str) – the prefix of the input file.
  Returns: the number of samples in prefix.fam.
- 
pyGenClean.NoCallHetero.heterozygosity_plot.is_heterozygous(genotype)[source]¶
- Tells if a genotype “A B” is heterozygous.
  Parameters: genotype (str) – the genotype to test for heterozygosity.
  Returns: True if the genotype is heterozygous, False otherwise.
  The genotype must contain two alleles, separated by a space. It then compares the first allele (genotype[0]) with the last one (genotype[-1]).
    >>> is_heterozygous("A A")
    False
    >>> is_heterozygous("G C")
    True
    >>> is_heterozygous("0 0")  # No call is not heterozygous.
    False
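A minimal sketch consistent with the description above (comparing the first and last characters of the genotype string) is shown here. It is a reconstruction from the documented behaviour, not a copy of the module's source, and it assumes single-character alleles.

```python
def is_heterozygous(genotype):
    """Return True if the two alleles of an "A B" genotype differ.

    Only the first and last characters are compared, so this sketch
    assumes single-character alleles separated by one space.
    """
    return genotype[0] != genotype[-1]
```

Because only two characters are compared, a multi-character allele pair such as "TA AT" would be reported as homozygous; input with such alleles would need the string split on the space instead.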
- 
pyGenClean.NoCallHetero.heterozygosity_plot.main(argString=None)[source]¶
- The main function of the module.
  Parameters: argString (list) – the options.
  These are the steps:
  - Prints the options.
  - Checks the number of samples in the tfam file (compute_nb_samples()).
  - Computes the heterozygosity rate (compute_heterozygosity()).
  - Saves the heterozygosity data (in out.het).
  - Plots the heterozygosity rate (plot_heterozygosity()).
 
- 
pyGenClean.NoCallHetero.heterozygosity_plot.parseArgs(argString=None)[source]¶
- Parses the command line options and arguments.
  Parameters: argString (list) – the options.
  Returns: An argparse.Namespace object created by the argparse module. It contains the values of the different options.
  Options:
  - --tfile (string) – The prefix of the transposed file.
  - --boxplot (bool) – Draw a boxplot instead of a histogram.
  - --format (string) – The output file format.
  - --bins (int) – The number of bins for the histogram.
  - --xlim (float) – The limit of the x axis.
  - --ymax (float) – The maximal Y value.
  - --out (string) – The prefix of the output files.
  Note: No option check is done here (except for the one automatically done by argparse). Those need to be done elsewhere (see checkArgs()).
- 
pyGenClean.NoCallHetero.heterozygosity_plot.plot_heterozygosity(heterozygosity, options)[source]¶
- Plots the heterozygosity rate distribution.
  Parameters:
  - heterozygosity (numpy.array) – the heterozygosity data.
  - options (argparse.Namespace) – the options.
  Plots a histogram or a boxplot of the heterozygosity distribution.
- 
pyGenClean.NoCallHetero.heterozygosity_plot.safe_main()[source]¶
- A safe version of the main function (that catches ProgramError). 
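The idea of a safe wrapper can be sketched as follows. The ProgramError class, the wrapper's parameters, and the choice to return an exit code instead of calling sys.exit directly are all assumptions made so that the example is self-contained. The module's own safe_main takes no arguments.

```python
import sys


class ProgramError(Exception):
    """Hypothetical stand-in for the module's ProgramError."""

    def __init__(self, msg):
        super().__init__(msg)
        self.message = str(msg)


def failing_main():
    """Hypothetical main function that always reports a problem."""
    raise ProgramError("example failure")


def safe_main(main_func, stream=sys.stderr):
    """Run main_func; report a ProgramError instead of a traceback.

    Returns 0 on success and 1 if a ProgramError was caught.
    """
    try:
        main_func()
    except ProgramError as e:
        print("ERROR: {}".format(e.message), file=stream)
        return 1
    return 0
```

A script entry point could then end with sys.exit(safe_main(failing_main)), so the user sees a one-line error message and a non-zero exit status.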
- 
pyGenClean.NoCallHetero.heterozygosity_plot.save_heterozygosity(heterozygosity, samples, out_prefix)[source]¶
- Saves the heterozygosity data.
  Parameters:
  - heterozygosity (numpy.array) – the heterozygosity data.
  - samples (list of tuples of str) – the list of samples.
  - out_prefix (str) – the prefix of the output files.
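As an illustration of how these three parameters fit together, the sketch below writes one line per sample to out_prefix.het. The .het suffix follows the main function's description on this page. The tab-separated column layout, the assumption that each sample tuple is (family ID, individual ID), and the function name are hypothetical.

```python
def save_heterozygosity_sketch(heterozygosity, samples, out_prefix):
    """Write one tab-separated line per sample to out_prefix + ".het".

    Assumed layout: family ID, individual ID, heterozygosity value.
    """
    with open(out_prefix + ".het", "w") as out_file:
        for (fam_id, ind_id), value in zip(samples, heterozygosity):
            print(fam_id, ind_id, value, sep="\t", file=out_file)
```

Any sequence of numbers works for the heterozygosity argument here, including a numpy.array as in the documented signature.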
 
