ancIBD.IO.h5_duplicates

Functions to detect duplicate samples in an imputed HDF5 file. Author: Harald Ringbauer, 2023

Module Contents

Functions

get_match_df([path_h5, ch, iids, gp_cutoff])

Compute the diploid match rate for all pairs of samples in the list iids.

get_fraction_identical(f[, sample1, sample2, ...])

Get the fraction of identical genotype configurations.

get_exact_idx_iid(f, sample[, unique])

Return the index of sample in the HDF5 file f.

get_markers_good(f, j[, output, cutoff])

Get markers

ancIBD.IO.h5_duplicates.get_match_df(path_h5='./data/hdf5/1240k_v54.1/ch', ch=3, iids=[], gp_cutoff=0.98)

Compute the diploid match rate for all pairs of samples in the list iids.
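
The pairwise structure of such a comparison can be sketched in a few lines. This is an illustration only, not the module's implementation: the helper pairwise_match_rows, the stand-in toy_match, the toy data, and the row keys are all hypothetical, and the real function's return columns are not documented here.

```python
from itertools import combinations

def pairwise_match_rows(iids, match_fn):
    """Hypothetical helper: build one result row per unordered pair of IIDs."""
    rows = []
    for iid1, iid2 in combinations(iids, 2):
        # match_fn is assumed to return (fraction identical, fraction tested)
        frac_same, frac_tested = match_fn(iid1, iid2)
        rows.append({"iid1": iid1, "iid2": iid2,
                     "frac_same": frac_same, "frac_tested": frac_tested})
    return rows

# Toy stand-in for a per-pair comparison (made-up genotype strings)
toy_calls = {"A": "0120", "B": "0120", "C": "0111"}

def toy_match(iid1, iid2):
    a, b = toy_calls[iid1], toy_calls[iid2]
    same = sum(x == y for x, y in zip(a, b))
    return same / len(a), 1.0

rows = pairwise_match_rows(["A", "B", "C"], toy_match)
```

For n samples this yields n(n-1)/2 rows, so the cost grows quadratically with the length of iids. A list of dictionaries like rows can be passed directly to pandas.DataFrame if a table is wanted.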

ancIBD.IO.h5_duplicates.get_fraction_identical(f, sample1='SUC006', sample2='R26.SG', gp_cutoff=0.98, output=False)

Get the fraction of identical genotype configurations between two samples. Returns the fraction of SNPs with the same genotype in both IIDs, and the fraction of SNPs tested in both IIDs.
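
As a rough illustration of the two returned fractions, the sketch below works on small in-memory arrays rather than an HDF5 file. It is not the module's code. It assumes genotypes are stored as two 0/1 alleles per SNP, that each sample has three genotype probabilities per SNP, and that a SNP counts as tested when the maximum probability reaches gp_cutoff in both samples. The helper name fraction_identical_sketch and the array layout are hypothetical.

```python
import numpy as np

def fraction_identical_sketch(gt1, gt2, gp1, gp2, gp_cutoff=0.98):
    """Hypothetical helper: compare two samples on confidently called SNPs.

    gt1, gt2: integer arrays of shape (n_snps, 2), alleles coded 0/1.
    gp1, gp2: float arrays of shape (n_snps, 3), genotype probabilities.
    Returns (fraction of tested SNPs with equal genotype, fraction of SNPs tested).
    """
    # Assumption: a SNP is "tested" if both samples have max GP >= cutoff
    ok = (gp1.max(axis=1) >= gp_cutoff) & (gp2.max(axis=1) >= gp_cutoff)
    # Diploid genotype as alternate-allele count (0, 1 or 2), ignoring phase
    d1 = gt1.sum(axis=1)
    d2 = gt2.sum(axis=1)
    n_tested = int(ok.sum())
    if n_tested == 0:
        return float("nan"), 0.0
    frac_same = float((d1[ok] == d2[ok]).mean())
    frac_tested = n_tested / len(ok)
    return frac_same, frac_tested

# Made-up data: 4 SNPs, the last one poorly imputed in sample 1
gt1 = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
gt2 = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
gp1 = np.array([[0.99, 0.01, 0.0], [0.0, 0.99, 0.01],
                [0.0, 0.01, 0.99], [0.3, 0.6, 0.1]])
gp2 = np.array([[0.99, 0.01, 0.0], [0.0, 0.99, 0.01],
                [0.0, 0.99, 0.01], [0.0, 0.01, 0.99]])
frac_same, frac_tested = fraction_identical_sketch(gt1, gt2, gp1, gp2)
```

In this toy case three of the four SNPs pass the cutoff in both samples, and two of those three carry the same diploid genotype. Reporting both numbers matters because a high match fraction based on very few tested SNPs is weak evidence for a duplicate.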

ancIBD.IO.h5_duplicates.get_exact_idx_iid(f, sample, unique=True)

Return the index of sample in the HDF5 file f.
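
An exact-match lookup of this kind might look like the sketch below. It is an assumption-laden illustration, not the module's code: it operates on a plain array of sample IDs that has already been read as strings, the helper name exact_idx_sketch is hypothetical, and the meaning given to unique (require exactly one match) is a guess based on the parameter name.

```python
import numpy as np

def exact_idx_sketch(samples, sample, unique=True):
    """Hypothetical helper: positions where `samples` equals `sample` exactly."""
    idx = np.where(np.asarray(samples) == sample)[0]
    if unique:
        # Assumption: with unique=True exactly one match is required
        if len(idx) != 1:
            raise ValueError(
                f"Expected exactly one match for {sample!r}, found {len(idx)}")
        return int(idx[0])
    return idx

samples = np.array(["SUC006", "R26.SG", "SUC006.SG"])
i = exact_idx_sketch(samples, "R26.SG")
```

Exact comparison, as opposed to substring matching, keeps "SUC006" from also matching "SUC006.SG".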

ancIBD.IO.h5_duplicates.get_markers_good(f, j, output=True, cutoff=0.99)

Get markers