ancIBD.IO.h5_qc

Functions to QC imputed data (in HDF5 format). Author: Harald Ringbauer, 2021

Module Contents

Functions

get_gp_df([path_h5, chs, iids, cutoffs])

Check Genotype Probabilities for list of [iids] and list of [chs].

plot_gp([ch, iid, path_h5, figsize, color, savefolder])

Plot Max. Genotype Posteriors for [iid] on [ch].

ancIBD.IO.h5_qc.get_gp_df(path_h5='', chs=[1], iids=['iid1'], cutoffs=[0.5, 0.6, 0.7, 0.8, 0.9])

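Check genotype probabilities (GP) for a list of individuals [iids] and a list of chromosomes [chs]. For each chromosome and iid, calculate the fraction of maximum genotype probabilities that fall below each value in [cutoffs]. A fraction >0.01 for the cutoff 0.5 seems highly problematic and should be flagged! Returns a summary dataframe covering each chromosome and iid.

path_h5: Path of the HDF5 file up to the chromosome number.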
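The statistic described above can be illustrated with a minimal sketch. This is not the ancIBD implementation: the helper name `max_gp_fractions` and the toy data are hypothetical, and the sketch assumes the genotype probabilities for one individual are already in memory as one row of three values per site, rather than being read from an HDF5 file.

```python
def max_gp_fractions(gp_rows, cutoffs=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Map each cutoff to the fraction of sites whose maximum
    genotype probability is strictly below that cutoff.

    gp_rows: one row per site, each holding the genotype
    probabilities for that site (assumed layout, see lead-in).
    """
    if not gp_rows:
        raise ValueError("need at least one site")
    # Maximum genotype probability at every site.
    max_gp = [max(row) for row in gp_rows]
    n_sites = len(max_gp)
    return {c: sum(m < c for m in max_gp) / n_sites for c in cutoffs}


# Hypothetical toy data: four sites, probabilities for genotypes 0/1/2.
toy_gp = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
    [0.05, 0.30, 0.65],
]

fractions = max_gp_fractions(toy_gp)

# Rule of thumb from the description above: a fraction >0.01
# at cutoff 0.5 is a warning sign.
flagged = fractions[0.5] > 0.01
```

Here one of the four toy sites has a maximum probability below 0.5, so the sample would be flagged. Real imputed data has far more sites per chromosome, which is why a threshold as small as 0.01 is meaningful.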

ancIBD.IO.h5_qc.plot_gp(ch=1, iid='Sz2', path_h5='/mnt/archgen/users/hringbauer/data/lango.2021.may/hdf5/ch', figsize=(9, 2.5), color='maroon', savefolder='')

Plot Max. Genotype Posteriors for [iid] on [ch].