ancIBD.IO.ind_ibd
Functions for post-processing a pairwise IBD list into a single summary dataframe for each pair of individuals, across various IBD length classes.
Author: Harald Ringbauer, 2020
Module Contents
Functions
filter_ibd_df | Post-process ROH dataframe. Filter to rows that are okay.
roh_statistic_df | Gives out summary statistic of ROH df.
roh_statistics_df | Gives out IBD df row summary statistics.
create_ind_ibd_df | Create dataframe with summary statistics for each individual.
create_ind_ibd_df_IBD2 | Create dataframe with summary statistics for each individual (IBD2 mode only).
ind_all_ibd_df | Create dataframe with all IBD for each individual pair.
ibd_lengths | Returns list of IBD lengths in IBD dataframe df [in cM].
all_pairs_ibd | Return a new IBD dataframe with all possible IID pairs, set to 0 IBD if not in IBD dataframe.
combine_all_chroms | Combine all chromosomes.
- ancIBD.IO.ind_ibd.filter_ibd_df(df, min_cm=4, snp_cm=60, output=True)
Post-process the ROH dataframe, keeping only the rows that pass the filters below.
min_cm: Minimum length in centimorgans (cM).
snp_cm: Minimum number of SNPs per centimorgan.
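The filtering described above can be sketched in plain Python. This is an illustration only, not the ancIBD implementation: it assumes each segment stores its genetic length in Morgans under `lengthM` (the default `col_lengthM` used elsewhere in this module) and its SNP count under a hypothetical `length` key, and that both thresholds are strict. The helper name `filter_segments` is made up for this example.

```python
def filter_segments(segments, min_cm=4, snp_cm=60):
    """Keep segments longer than min_cm with more than snp_cm SNPs per cM.

    Assumptions (not confirmed by the docs): 'lengthM' is in Morgans,
    'length' is the number of SNPs, and comparisons are strict.
    """
    kept = []
    for seg in segments:
        length_cm = seg["lengthM"] * 100  # Morgans -> centimorgans
        if length_cm <= min_cm:
            continue  # too short
        density = seg["length"] / length_cm  # SNPs per cM
        if density > snp_cm:
            kept.append(seg)
    return kept


segments = [
    {"lengthM": 0.10, "length": 1000},  # 10 cM, dense enough
    {"lengthM": 0.03, "length": 900},   # 3 cM, too short
    {"lengthM": 0.10, "length": 300},   # 10 cM, too sparse
]
kept = filter_segments(segments)
```

With the defaults, only the first segment in the sample data survives both filters.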
- ancIBD.IO.ind_ibd.roh_statistic_df(df, min_cm=0, col_lengthM='lengthM')
Returns summary statistics of the ROH dataframe.
- ancIBD.IO.ind_ibd.roh_statistics_df(df, min_cms=[8, 12, 16, 20], col_lengthM='lengthM')
Returns summary statistics of an IBD dataframe as a list: sum_roh, n_roh, and max_roh for each entry of min_cms.
min_cms: List of minimum IBD lengths [in cM].
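The per-threshold statistics can be illustrated with a short standard-library sketch. It assumes the lengths are already in cM, that a segment counts only if it is strictly longer than the threshold, and that the maximum is reported as 0 when nothing passes. The name `length_stats` is hypothetical, and the real function may order or flatten its output differently.

```python
def length_stats(lengths_cm, min_cms=(8, 12, 16, 20)):
    """Return one (sum, count, max) tuple per threshold in min_cms.

    Assumption: a segment is counted when its length is strictly
    greater than the threshold.
    """
    stats = []
    for min_cm in min_cms:
        kept = [x for x in lengths_cm if x > min_cm]
        stats.append((sum(kept), len(kept), max(kept, default=0.0)))
    return stats


stats = length_stats([9.0, 13.0, 25.0])
```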
- ancIBD.IO.ind_ibd.create_ind_ibd_df(ibd_data='/n/groups/reich/hringbauer/git/yamnaya/output/ibd/v43/ch_all.tsv', min_cms=[8, 12, 16, 20], snp_cm=220, min_cm=6, sort_col=-1, savepath='', output=True)
Create dataframe with summary statistics for each individual. Returns this new dataframe in hapROH format [IBD in cM].
ibd_data: If a string, the IBD file to load; otherwise an IBD dataframe.
savepath: If given, save the post-processed IBD dataframe there.
min_cms: IBD lengths to use as cutoffs in the analysis [cM].
snp_cm: Minimum density of SNPs per cM of IBD block.
sort_col: Which min_cms column to sort by. If <0, no sorting is done.
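How `min_cms` and `sort_col` interact can be sketched without pandas. Everything below is an assumption made for illustration: the `iid1`/`iid2` keys, the `sum_ibd>N` and `n_ibd>N` column names, grouping by pair, and sorting in descending order. The helper `summarize_pairs` is not part of ancIBD.

```python
from collections import defaultdict


def summarize_pairs(segments, min_cms=(8, 12, 16, 20), sort_col=-1):
    """Build one summary row per (iid1, iid2) pair from IBD segments.

    Assumptions: 'lengthM' is in Morgans; column names are invented;
    sort_col indexes into min_cms and sorts by the summed IBD, descending.
    """
    by_pair = defaultdict(list)
    for seg in segments:
        by_pair[(seg["iid1"], seg["iid2"])].append(seg["lengthM"] * 100)

    rows = []
    for (iid1, iid2), lengths in by_pair.items():
        row = {"iid1": iid1, "iid2": iid2, "max_ibd": max(lengths)}
        for cutoff in min_cms:
            kept = [x for x in lengths if x > cutoff]
            row[f"sum_ibd>{cutoff}"] = sum(kept)
            row[f"n_ibd>{cutoff}"] = len(kept)
        rows.append(row)

    if sort_col >= 0:  # a negative sort_col leaves the rows unsorted
        key = f"sum_ibd>{min_cms[sort_col]}"
        rows.sort(key=lambda r: r[key], reverse=True)
    return rows


segments = [
    {"iid1": "A", "iid2": "B", "lengthM": 0.125},
    {"iid1": "A", "iid2": "B", "lengthM": 0.25},
    {"iid1": "C", "iid2": "D", "lengthM": 0.5},
    {"iid1": "C", "iid2": "D", "lengthM": 0.0625},
]
rows = summarize_pairs(segments, sort_col=0)
```

With `sort_col=0` the rows are ordered by the column for the first cutoff in `min_cms`; with the default `sort_col=-1` they stay in input order.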
- ancIBD.IO.ind_ibd.create_ind_ibd_df_IBD2(ibd_data='/n/groups/reich/hringbauer/git/yamnaya/output/ibd/v43/ch_all.tsv', min_cms=[8, 12, 16, 20], snp_cm=220, min_cm=6, sort_col=-1, savepath='', output=True)
Create dataframe with summary statistics for each individual. Returns this new dataframe in hapROH format [IBD in cM].
Note: This should only be used for ancIBD runs in IBD2 mode.
ibd_data: If a string, the IBD file to load; otherwise an IBD dataframe.
savepath: If given, save the post-processed IBD dataframe there.
min_cms: IBD lengths to use as cutoffs in the analysis [cM]. Note that this filter only applies to IBD1.
snp_cm: Minimum density of SNPs per cM of IBD block. Note that this filter only applies to IBD1.
sort_col: Which min_cms column to sort by. If <0, no sorting is done.
- ancIBD.IO.ind_ibd.ind_all_ibd_df(path_ibd='/n/groups/reich/hringbauer/git/yamnaya/output/ibd/v43/ch_all.tsv', col_lengthM='lengthM', snp_cm=220, min_cm=5, output=True, sort=True, decimals=2, col_new='ibd', savepath='')
Create dataframe with all IBD for each individual pair. Returns this new dataframe in hapROH format [IBD in cM].
path_ibd: The IBD file to load.
snp_cm: Minimum density of SNPs per cM of IBD block.
sort: If True, sort by longest IBD.
decimals: Number of decimals to round to.
- ancIBD.IO.ind_ibd.ibd_lengths(df, col_lengthM='lengthM', string=True, sort=True, decimals=2, mpl=100)
Returns list of IBD lengths in IBD dataframe df [in cM].
string: If True, return a comma-separated string.
sort: Whether to sort the IBD list.
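The conversion can be sketched as follows, under stated assumptions: `mpl` is treated as the multiplier from Morgans to cM (100 by default), sorting is longest first, and the string form joins values with a bare comma. The docs do not confirm any of these, and `lengths_as_cm` is a hypothetical stand-in rather than the library function.

```python
def lengths_as_cm(lengths_m, string=True, sort=True, decimals=2, mpl=100):
    """Convert lengths in Morgans to rounded cM values.

    Assumptions: mpl scales Morgans to cM, sorting is descending,
    and the string output is joined with ','.
    """
    cms = [round(x * mpl, decimals) for x in lengths_m]
    if sort:
        cms.sort(reverse=True)
    if string:
        return ",".join(str(x) for x in cms)
    return cms


as_string = lengths_as_cm([0.125, 0.25, 0.0625])
as_list = lengths_as_cm([0.125, 0.25, 0.0625], string=False, sort=False)
```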
- ancIBD.IO.ind_ibd.all_pairs_ibd(df_res, df_iid)
Return a new IBD dataframe with all possible IID pairs, with IBD set to 0 for pairs not in the IBD dataframe.
df_iid: Where to take IIDs from.
df_res: IBD dataframe (standard format).
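The idea of filling in missing pairs can be shown with `itertools.combinations`. This sketch uses a plain dictionary keyed by IID pairs instead of dataframes, and treats a pair as the same regardless of order. Both choices, and the name `fill_all_pairs`, are assumptions for illustration.

```python
from itertools import combinations


def fill_all_pairs(pair_ibd, iids):
    """Return IBD for every unordered pair of iids, using 0.0 when absent.

    pair_ibd maps (iid_a, iid_b) tuples to an IBD value; either
    orientation of a pair is accepted (an assumption of this sketch).
    """
    filled = {}
    for a, b in combinations(sorted(iids), 2):
        filled[(a, b)] = pair_ibd.get((a, b), pair_ibd.get((b, a), 0.0))
    return filled


filled = fill_all_pairs({("I2", "I1"): 42.0}, ["I1", "I2", "I3"])
```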
- ancIBD.IO.ind_ibd.combine_all_chroms(chs=[], folder_base='PATH/ch', path_save='PATH/ch_all.tsv')
Combine all chromosomes.
chs: Which chromosomes to run [list].
folder_base: Where to load from (path part up to and including ch).
path_save: Where to save the combined file to.
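A minimal sketch of combining per-chromosome files, using only the `csv` module. It assumes each chromosome file is a tab-separated file with a header row, found at `folder_base` followed by the chromosome number and `.tsv`. That naming scheme is a guess based on the parameter description, and `combine_tsvs` is a hypothetical helper, not the ancIBD function.

```python
import csv


def combine_tsvs(chs, folder_base, path_save):
    """Concatenate per-chromosome TSV files into one file.

    Assumption: files are named f"{folder_base}{ch}.tsv" and share
    the same header. Returns the number of data rows written.
    """
    header = None
    rows = []
    for ch in chs:
        path = f"{folder_base}{ch}.tsv"  # assumed naming scheme
        with open(path, newline="") as f:
            reader = csv.reader(f, delimiter="\t")
            file_header = next(reader)
            if header is None:
                header = file_header  # keep the first header only
            rows.extend(reader)

    with open(path_save, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

A call such as `combine_tsvs(range(1, 23), "PATH/ch", "PATH/ch_all.tsv")` would then mirror the default arguments shown in the signature above.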