ancIBD.IO.h5_load

Functions to load data from an HDF5 file. Author: Harald Ringbauer, 2021

Module Contents

Functions

get_idx_iid(f, sample[, unique])

Return the index of sample among the samples in hdf5 f

get_idx_iid_exact(f, sample[, unique])

Return the index of sample among the samples in hdf5 f

get_coverage(f, j)

Get Coverage of sample j in hdf5 f

get_markers_good(f, j[, output, cutoff, ploidy])

Get markers

get_genos(f[, iid, min_gp, output, phased, exact])

Return Genotypes and Map of pairs at intersection with GP>cutoff.

load_individual_h5([path_h5, min_gp, chs, iid, output])

Load individual data from set of chromosomal hdf5s.

get_genos_pairs(f[, sample1, sample2, cutoff, output, ...])

Return Genotypes and Map of pairs at intersection with GP>cutoff.

opp_homos(g1, g2)

Return opposing homozygotes

get_opp_homos_f([f_path, iid1, iid2, ch, cutoff, ...])

Return opposing homozygotes boolean array and map array at intersection with GP>cutoff.

get_opp_homos_X(f_path, iid1, iid2[, ploidy, cutoff, ...])

Return opposing homozygotes boolean array and map array at intersection with GP>cutoff. This function is tailored for the X chromosome.

get_diff_gt_f(f_path, iid1, iid2, ch[, cutoff, ...])

Return differing-genotype boolean array and map array at intersection of maxGP > cutoff

ancIBD.IO.h5_load.get_idx_iid(f, sample, unique=True)

Return the index of sample among the samples in hdf5 f

ancIBD.IO.h5_load.get_idx_iid_exact(f, sample, unique=True)

Return the index of sample among the samples in hdf5 f
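The two lookup functions above differ only by name, and other functions in this module expose an exact flag ("Whether IID has to be an exact match"). The sketch below illustrates one plausible reading of that distinction on a plain list of sample IDs. The helper find_sample_indices is hypothetical, and treating a non-exact lookup as substring matching and unique as "require exactly one hit" are assumptions, not documented ancIBD behavior.

```python
def find_sample_indices(samples, query, exact=True, unique=True):
    """Return positions in `samples` that match `query`.

    Hypothetical helper for illustration only (not the ancIBD code).
    exact=True  -> whole-string equality
    exact=False -> substring match (assumed meaning of a non-exact lookup)
    unique=True -> raise unless exactly one sample matches (assumed)
    """
    if exact:
        hits = [i for i, s in enumerate(samples) if s == query]
    else:
        hits = [i for i, s in enumerate(samples) if query in s]
    if unique and len(hits) != 1:
        raise ValueError(
            f"expected exactly one match for {query!r}, found {len(hits)}"
        )
    return hits


samples = ["SUC002", "SUC006", "R26.SG", "SUC006.SG"]
print(find_sample_indices(samples, "SUC002"))  # [0]
print(find_sample_indices(samples, "SUC006", exact=False, unique=False))  # [1, 3]
```

Under this reading, a non-exact lookup of "SUC006" is ambiguous in the example list, which is the situation a uniqueness check would guard against.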

ancIBD.IO.h5_load.get_coverage(f, j)

Get Coverage of sample j in hdf5 f

ancIBD.IO.h5_load.get_markers_good(f, j, output=True, cutoff=0.99, ploidy=2)

Get markers

ancIBD.IO.h5_load.get_genos(f, iid='SUC002', min_gp=0.98, output=True, phased=False, exact=True)

Return Genotypes and Map of pairs at intersection with GP>cutoff. phased: Whether to return an [l x 2] phased vector or an [l] vector of the number of derived alleles. exact: Whether the IID has to be an exact match
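The two return shapes named by phased can be related with a small sketch. It assumes that each row of the [l x 2] form holds the two haplotype alleles of one marker coded as 0 (ancestral) or 1 (derived), so that the [l] form is the row sum. The helper derived_counts is hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
def derived_counts(phased):
    """Collapse an [l x 2] list of phased alleles (0/1) into [l] derived-allele counts."""
    return [a + b for a, b in phased]


# Four markers, two haplotypes each (assumed 0 = ancestral, 1 = derived).
phased = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(derived_counts(phased))  # [0, 1, 1, 2]
```

The count form discards phase: the second and third markers differ in the phased form but both collapse to 1.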

ancIBD.IO.h5_load.load_individual_h5(path_h5='/n/groups/reich/hringbauer/git/hapBLOCK/data/hdf5/1240k_v43/ch', min_gp=0.98, chs=range(1, 23), iid='SUC002', output=False)

Load individual data from set of chromosomal hdf5s.
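The default path_h5 ends in "ch" and chs defaults to range(1, 23), which suggests that one file per autosome is located by appending the chromosome number to the prefix. The sketch below shows that naming pattern only. The helper chromosome_paths, the ".h5" suffix, and the example directory are assumptions for illustration, not confirmed details of the loader.

```python
def chromosome_paths(path_h5, chs=range(1, 23), suffix=".h5"):
    """Build one file path per chromosome from a common prefix.

    Hypothetical helper: the suffix and the prefix+number layout are assumed.
    """
    return [f"{path_h5}{ch}{suffix}" for ch in chs]


paths = chromosome_paths("./data/hdf5/example/ch", chs=range(1, 4))
print(paths)
# ['./data/hdf5/example/ch1.h5', './data/hdf5/example/ch2.h5', './data/hdf5/example/ch3.h5']
```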

ancIBD.IO.h5_load.get_genos_pairs(f, sample1='SUC006', sample2='R26.SG', cutoff=0.98, output=True, phased=False, exact=False, ploidy=(2, 2))

Return Genotypes and Map of pairs at intersection with GP>cutoff. phased: Whether to return an [l x 2] phased vector or an [l] vector of the number of derived alleles. exact: Whether the IID has to be an exact match
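The phrase "intersection with GP>cutoff" can be illustrated as a per-marker mask. The sketch assumes that GP holds three genotype probabilities per marker and that a marker is kept only when the largest probability exceeds the cutoff in both samples, in line with the "maxGP > cutoff" wording used for get_diff_gt_f. The helper confident_in_both is hypothetical.

```python
def confident_in_both(gp1, gp2, cutoff=0.98):
    """Return a boolean mask of markers whose max GP exceeds `cutoff` in both samples."""
    return [max(p1) > cutoff and max(p2) > cutoff for p1, p2 in zip(gp1, gp2)]


# Genotype probabilities (P(0), P(1), P(2)) for three markers in two samples.
gp1 = [(0.99, 0.01, 0.0), (0.5, 0.5, 0.0), (0.0, 0.0, 1.0)]
gp2 = [(1.0, 0.0, 0.0), (0.99, 0.01, 0.0), (0.9, 0.1, 0.0)]
mask = confident_in_both(gp1, gp2)
print(mask)  # [True, False, False]
```

Genotypes and map positions would then be subset with the same mask so that both samples are compared on an identical set of markers.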

ancIBD.IO.h5_load.opp_homos(g1, g2)

Return opposing homozygotes
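As a minimal sketch of what "opposing homozygotes" means, assume diploid genotypes coded as the number of derived alleles (0, 1 or 2). A marker is then an opposing homozygote when one individual is 0 and the other is 2. The helper opposing_homozygotes is hypothetical and uses plain lists; it is not the library's implementation.

```python
def opposing_homozygotes(g1, g2):
    """Flag markers where one diploid genotype is 0 and the other is 2."""
    return [(a == 0 and b == 2) or (a == 2 and b == 0) for a, b in zip(g1, g2)]


g1 = [0, 2, 1, 2, 0]
g2 = [2, 0, 2, 2, 0]
print(opposing_homozygotes(g1, g2))  # [True, True, False, False, False]
```

Heterozygous markers never count, since a genotype of 1 shares an allele with either homozygote.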

ancIBD.IO.h5_load.get_opp_homos_f(f_path='/n/groups/reich/hringbauer/git/hapBLOCK/data//hdf5/1240k_v43/ch', iid1='SUC006', iid2='R26.SG', ch=3, cutoff=0.99, output=True, exact=False)

Return opposing homozygotes boolean array and map array at intersection with GP>cutoff.

ancIBD.IO.h5_load.get_opp_homos_X(f_path, iid1, iid2, ploidy=(2, 2), cutoff=0.99, output=True, exact=False)

Return opposing homozygotes boolean array and map array at intersection with GP>cutoff. This function is tailored for the X chromosome. Be careful with the ploidy difference between males and females. The ploidy argument should be a tuple of two integers, indicating the ploidy of iid1 and iid2, respectively.
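To see why the ploidy tuple matters on the X chromosome, consider a male (one copy) compared with a female (two copies). The sketch below assumes genotypes are coded as derived-allele counts, so a haploid sample carries only 0 or 1 and its "all derived" state is 1 rather than 2. The helper opposing_homozygotes_ploidy and this encoding are assumptions for illustration, not the documented behavior of get_opp_homos_X.

```python
def opposing_homozygotes_ploidy(g1, g2, ploidy=(2, 2)):
    """Flag markers where one sample carries only ancestral alleles (count 0)
    and the other only derived alleles (count equal to its ploidy)."""
    p1, p2 = ploidy
    return [(a == 0 and b == p2) or (a == p1 and b == 0) for a, b in zip(g1, g2)]


male = [0, 1, 1, 0]    # haploid X: counts are 0 or 1
female = [2, 0, 1, 0]  # diploid X: counts are 0, 1 or 2
print(opposing_homozygotes_ploidy(male, female, ploidy=(1, 2)))
# [True, True, False, False]
```

With the default ploidy=(2, 2), the male genotype 1 at the second marker would be read as heterozygous and the opposing state would be missed, which is the pitfall the docstring warns about.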

ancIBD.IO.h5_load.get_diff_gt_f(f_path, iid1, iid2, ch, cutoff=0.99, output=True, exact=False)

Return differing-genotype boolean array and map array at intersection of maxGP > cutoff