pyplink
pyplink is a Python 3 parser and writer for binary Plink files, released under
the MIT licence. Unlike existing parsers (and Plink itself), pyplink does not
load the BED file into memory, making it possible to work with extremely large
files (e.g. 1000 Genomes Phase 3 files).
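To illustrate why a BED file does not have to be loaded in full, here is a minimal sketch of random access to a single marker, based on the published Plink SNP-major BED layout (a 3-byte header, then a fixed-size block of 2-bit genotype codes per marker). This is not pyplink's own implementation; the function name `read_marker`, the label strings, and the in-memory demo bytes are hypothetical and exist only for illustration.

```python
import io
import math

# Plink BED header: two magic bytes followed by the SNP-major flag.
BED_HEADER = bytes([0x6C, 0x1B, 0x01])

# Meaning of each 2-bit genotype code (index = code value).
# The labels are illustrative names, not pyplink output values.
CODES = ("hom_a1", "missing", "het", "hom_a2")


def read_marker(bed_file, marker_index, n_samples):
    """Read one marker from a SNP-major BED stream by seeking to its block."""
    # Each byte holds four samples, so every marker uses a fixed block size.
    bytes_per_marker = math.ceil(n_samples / 4)

    # Jump straight to the marker; nothing before it needs to be read.
    bed_file.seek(len(BED_HEADER) + marker_index * bytes_per_marker)
    block = bed_file.read(bytes_per_marker)
    if len(block) != bytes_per_marker:
        raise ValueError("marker index is past the end of the file")

    calls = []
    for sample in range(n_samples):
        byte = block[sample // 4]
        # The first sample of each byte sits in the two low-order bits.
        code = (byte >> (2 * (sample % 4))) & 0b11
        calls.append(CODES[code])
    return calls


# Hypothetical in-memory "file": 2 markers for 6 samples (2 bytes per marker).
demo = io.BytesIO(BED_HEADER + bytes([0xDC, 0x0F, 0xE7, 0x0F]))
first_marker = read_marker(demo, 0, 6)
second_marker = read_marker(demo, 1, 6)
```

Because every marker occupies the same number of bytes, the cost of reading one marker in this sketch depends on the number of samples, not on the number of markers in the file.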
For more information on how to use pyplink, refer to the API documentation.
The snippet below shows the most common usage of the module.
from pyplink import PyPlink

with PyPlink("plink_file_prefix") as bed:
    # Getting the BIM and FAM
    bim = bed.get_bim()
    fam = bed.get_fam()

    # Iterating over all loci
    for loci_name, genotypes in bed:
        pass

    # Getting the genotypes of a single marker (numpy.ndarray)
    genotypes = bed.get_geno_marker("rs12345")
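As a possible follow-up to the snippet above, the sketch below summarises a genotype vector once it has been retrieved. It assumes that genotypes use an additive coding (0, 1 or 2 copies of the counted allele) with -1 marking a missing call; that coding, and which allele is counted, should be checked against the API documentation. The helper name `summarize_genotypes` is hypothetical and is not part of pyplink.

```python
def summarize_genotypes(genotypes, missing=-1):
    """Return (call_rate, allele_frequency) for one marker.

    Assumes additive coding (0, 1, 2) with `missing` as the no-call value;
    this coding is an assumption, not something stated in the text above.
    """
    values = [int(g) for g in genotypes]
    called = [g for g in values if g != missing]

    # Fraction of samples with a non-missing genotype.
    call_rate = len(called) / len(values) if values else 0.0

    # Each called sample carries two alleles.
    allele_frequency = sum(called) / (2 * len(called)) if called else None
    return call_rate, allele_frequency


# Works on any iterable of integers, including a numpy.ndarray.
call_rate, allele_frequency = summarize_genotypes([0, 1, 2, -1, 1, 0])
```

The helper only iterates over its input, so it could be called as `summarize_genotypes(genotypes)` inside the loop over loci without keeping more than one marker in memory at a time.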