ancIBD.IO.h5_modify

Module Contents

Functions

get_af(f[, min_gp, af_def])

Get Allele Frequency from HDF file f (using GP)

get_af1000G(f)

Get Allele Frequency - ASSUME ALL GT are set!

merge_in_af(path_h5, af[, col_af])

Merge in AF into hdf5 file. Save modified h5 in place

lift_af(h5_target, h5_original[, field, match_col, ...])

Bring over field from one h5 to another. Assume field does not exist in target

lift_af_df(h5_target, path_df[, field, match_col, dt, ...])

Load allele frequencies from dataframe at path_df [string]

merge_in_ld_map(path_h5, path_snp1240k[, chs, write_mode])

Merge in MAP from eigenstrat .snp file into hdf5 file.

save_h5(gt, ad, ref, alt, pos, rec, samples, path[, ...])

Create a new HDF5 File with Input Data.

ancIBD.IO.h5_modify.get_af(f, min_gp=0.99, af_def=0.5)

Get Allele Frequency from HDF file f (using GP).

min_gp: Minimum Genotype Probability used for calculation.
af_def: Default Allele Frequency if no data.
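To illustrate how min_gp and af_def could interact, here is a minimal sketch on in-memory NumPy arrays. It is not the library's implementation: the helper name af_from_gp, the array layouts (gt as [l,k,2], gp as [l,k,3]) and the rule "a call counts if its largest genotype probability reaches min_gp" are all assumptions made for illustration.

```python
import numpy as np

def af_from_gp(gt, gp, min_gp=0.99, af_def=0.5):
    """Hypothetical helper: per-locus allele frequency from confident calls.

    gt: genotypes, assumed shape [l, k, 2] with entries 0/1
    gp: genotype probabilities, assumed shape [l, k, 3]
    """
    gt = np.asarray(gt, dtype=float)
    gp = np.asarray(gp, dtype=float)
    confident = gp.max(axis=2) >= min_gp      # [l, k] mask of usable calls
    n_conf = confident.sum(axis=1)            # confident samples per locus
    alt_frac = gt.sum(axis=2) / 2.0           # alt-allele fraction per sample
    alt_sum = (alt_frac * confident).sum(axis=1)
    af = np.full(gt.shape[0], af_def, dtype=float)
    has_data = n_conf > 0                     # loci without data keep af_def
    af[has_data] = alt_sum[has_data] / n_conf[has_data]
    return af

# Two loci, two samples: locus 0 is confidently called, locus 1 is not.
gt = np.array([[[0, 1], [1, 1]],
               [[0, 0], [0, 1]]])
gp = np.array([[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
               [[0.6, 0.4, 0.0], [0.5, 0.5, 0.0]]])
af = af_from_gp(gt, gp)
```

In this toy input, the second locus has no genotype probability at or above min_gp, so it falls back to af_def.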

ancIBD.IO.h5_modify.get_af1000G(f)

Get Allele Frequency - ASSUME ALL GT are set!
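When every genotype is set, the allele frequency reduces to a plain average over samples and haplotypes. The sketch below shows that idea on an in-memory array; the helper name af_all_called and the [l,k,2] layout with 0/1 entries are assumptions, not the library's code.

```python
import numpy as np

def af_all_called(gt):
    """Hypothetical helper: mean alt-allele count per locus.

    gt: genotypes, assumed shape [l, k, 2], entries 0/1, no missing calls.
    """
    gt = np.asarray(gt)
    return gt.mean(axis=(1, 2))  # average over samples and both haplotypes

gt_full = np.array([[[0, 1], [1, 1]],
                    [[0, 0], [0, 0]]])
af_full = af_all_called(gt_full)
```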

ancIBD.IO.h5_modify.merge_in_af(path_h5, af, col_af='AF_ALL')

Merge in AF into hdf5 file. Save modified h5 in place.

af: Array of allele frequencies to save

ancIBD.IO.h5_modify.lift_af(h5_target, h5_original, field='variants/AF_ALL', match_col='variants/POS', dt=np.float64, p_def=0.5)

Bring over field from one h5 to another. Assume field does not exist in target.

h5_original: The original hdf5 path
h5_target: The target hdf5 path
field: Which field to copy over
p_def: Default Value of allele frequency
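The matching step can be pictured as follows: values from the original are copied to the target wherever the match column agrees, and every other target entry receives p_def. This is a sketch on plain arrays rather than hdf5 files; the helper name lift_values is hypothetical, and it assumes positions are unique within each array.

```python
import numpy as np

def lift_values(pos_target, pos_original, values_original,
                p_def=0.5, dt=np.float64):
    """Hypothetical helper: carry values over by matching positions."""
    pos_target = np.asarray(pos_target)
    pos_original = np.asarray(pos_original)
    values_original = np.asarray(values_original, dtype=dt)
    # Start from the default, then overwrite the entries that match.
    lifted = np.full(len(pos_target), p_def, dtype=dt)
    # Indices of shared positions in each array (assumes unique positions).
    _, idx_t, idx_o = np.intersect1d(pos_target, pos_original,
                                     return_indices=True)
    lifted[idx_t] = values_original[idx_o]
    return lifted

# Position 200 is absent from the original, so it keeps p_def.
lifted = lift_values([100, 200, 300], [300, 100, 400], [0.1, 0.2, 0.3])
```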

ancIBD.IO.h5_modify.lift_af_df(h5_target, path_df, field='variants/AF_ALL', match_col='variants/POS', dt=np.float64, p_def=0.5)

Load allele frequencies from the dataframe at path_df [string] and merge them into the hdf5 file at h5_target [string], in field [string]. Positions are matched on match_col [string].

ancIBD.IO.h5_modify.merge_in_ld_map(path_h5, path_snp1240k, chs=range(1, 23), write_mode='a')

Merge in MAP from eigenstrat .snp file into hdf5 file. Save modified h5 in place.

path_h5: Path to hdf5 file to modify.
path_snp1240k: Path to Eigenstrat .snp file whose map to use
chs: Which Chromosomes to merge in HDF5 [list].
write_mode: Which mode to use on hdf5. a: New field. r+: Change Field
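For orientation, an Eigenstrat .snp file has one SNP per line with whitespace-separated columns: SNP ID, chromosome, genetic position, physical position, reference allele, alternate allele. The sketch below reads such lines into a per-chromosome lookup from physical to genetic position, restricted to chs. The helper name read_snp_map and the toy records are hypothetical, and the function does not touch any hdf5 file.

```python
import io

def read_snp_map(handle, chs=range(1, 23)):
    """Hypothetical helper: {chromosome: {physical pos: genetic pos}}."""
    wanted = {str(ch) for ch in chs}
    maps = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _snp_id, ch, gen_pos, phys_pos = fields[:4]
        if ch in wanted:
            maps.setdefault(ch, {})[int(phys_pos)] = float(gen_pos)
    return maps

# Made-up records; chromosome 23 lies outside the default chs range.
snp_text = """\
rs1 1 0.010 1000 A G
rs2 1 0.025 2500 C T
rs3 23 0.100 5000 G A
"""
maps = read_snp_map(io.StringIO(snp_text))
```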

ancIBD.IO.h5_modify.save_h5(gt, ad, ref, alt, pos, rec, samples, path, gp=[], compression='gzip', ad_group=True, gt_type='int8')

Create a new HDF5 File with Input Data. Genotype data is saved as int8 (the default gt_type), read count data as int16.

gt: Genotype data [l,k,2]
ad: Allele depth [l,k,2]
ref: Reference Allele [l]
alt: Alternate Allele [l]
pos: Position [l]
rec: Map position [l] (listed as m in the docstring)
samples: Sample IDs [k]
ad_group: Whether to save allele depth (listed as ad in the docstring)
gt_type: Which data type to use when saving genotype data
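The shape annotations above can be made concrete with a small set of toy inputs. The sketch below builds arrays matching the documented dimensions (l loci, k samples) and checks them before any file is written. All values are made up, and check_shapes is a hypothetical helper, not part of ancIBD.

```python
import numpy as np

n_loci, n_samples = 4, 3  # l and k in the docstring (illustrative sizes)
rng = np.random.default_rng(0)

gt = rng.integers(0, 2, size=(n_loci, n_samples, 2)).astype("int8")
ad = rng.integers(0, 10, size=(n_loci, n_samples, 2)).astype("int16")
ref = np.array(["A", "C", "G", "T"])
alt = np.array(["G", "T", "A", "C"])
pos = np.array([1000, 2500, 4000, 9000])
rec = np.array([0.010, 0.025, 0.040, 0.090])  # map positions (made up)
samples = np.array(["ind1", "ind2", "ind3"])

def check_shapes(gt, ad, ref, alt, pos, rec, samples):
    """Hypothetical helper: verify the [l,k,2] / [l] / [k] layout."""
    l, k = gt.shape[:2]
    return (gt.shape == (l, k, 2) and ad.shape == (l, k, 2)
            and all(len(x) == l for x in (ref, alt, pos, rec))
            and len(samples) == k)

shapes_ok = check_shapes(gt, ad, ref, alt, pos, rec, samples)

# With ancIBD installed, the arrays could then be passed on (not run here):
# save_h5(gt, ad, ref, alt, pos, rec, samples, path="example.h5")
```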