ancIBD.ibd_stats.funcs

Functions for computing statistics on individual IBD dataframes, in particular functions to get population-level IBD rates, normalized per pair. @ Author: Harald Ringbauer, 2020, All rights reserved

Module Contents

Functions

new_columns(df, df_meta[, col, col_new, match_col])

Maps Entries from meta dataframe onto the IBD dataframe.

find_relatives(df[, iid, min_ibd, cm])

Identify all relatives of a given sample

rc_date(age)

Ascertain whether age is rc.

plot_age_diff(df[, figsize, title, xlim, ylim, cs, ...])

Plot the Age Difference between two samples

give_sub_df(df[, pop1, pop2, col, output, exact])

Return the sub-dataframe of pairs that span pop1 and pop2.

remove_iids(df[, iids])

Remove samples from the pairwise individual IID dataframe if a pair contains one of the flagged samples.

give_stats_cm_bin(df[, cms, binary, output])

Return counts of IBD in bins.

get_IBD_stats(df[, pop1, pop2, col, exact, cms, ...])

Get IBD fraction statistics.

get_IBD_stats_pops(df[, pops1, pops2, col, cms, ...])

Get IBD fraction statistics for list of pop pairs.

get_ci_counts(counts, n[, a, minc])

Get confidence intervals from counts and trials.

ibd_stats_pop_pairs(df, pop_pairs[, cms, col, a, ...])

For a list of population pairs, get IBD summary statistics for all pairs of individuals.

create_ibd_pop_pair_df(pop_pairs, ns, fracs, cis[, ...])

Save Population Pair dataframe if needed

ancIBD.ibd_stats.funcs.new_columns(df, df_meta, col='New Clade', col_new='', match_col='iid')

Maps Entries from meta dataframe onto the IBD dataframe. Return modified dataframe
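The mapping described above can be sketched with pandas as follows. This is a minimal illustration, not the module's implementation: the helper name `map_meta_column`, the pair columns `iid1`/`iid2`, and the convention of appending `1`/`2` to the new column name are all assumptions.

```python
import pandas as pd

def map_meta_column(df, df_meta, col="New Clade", col_new="", match_col="iid"):
    """Hypothetical helper: copy `col` from df_meta onto both samples of each pair."""
    col_new = col_new or col  # fall back to the source column name
    lookup = df_meta.set_index(match_col)[col]  # Series mapping iid -> value
    out = df.copy()  # leave the input dataframe untouched
    out[col_new + "1"] = out["iid1"].map(lookup)  # assumed column name: iid1
    out[col_new + "2"] = out["iid2"].map(lookup)  # assumed column name: iid2
    return out

# Toy data, invented for illustration only
df_ibd = pd.DataFrame({"iid1": ["A", "A"], "iid2": ["B", "C"]})
df_meta = pd.DataFrame({"iid": ["A", "B", "C"], "New Clade": ["X", "X", "Y"]})
df_mapped = map_meta_column(df_ibd, df_meta, col="New Clade", col_new="clade")
```

Samples missing from the meta dataframe would end up as NaN in the new columns under this approach.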

ancIBD.ibd_stats.funcs.find_relatives(df, iid='', min_ibd=20, cm=20)

Identify all relatives of a given sample

ancIBD.ibd_stats.funcs.rc_date(age)

Ascertain whether age is rc. age: Can be Array

ancIBD.ibd_stats.funcs.plot_age_diff(df, figsize=(8, 8), title='', xlim=[-2000, 2000], ylim=[10, 5000], cs=['red', 'green'], fs=14, yscale='log', rcdate=False)

Plot the Age Difference between two samples

ancIBD.ibd_stats.funcs.give_sub_df(df, pop1='La Caleta', pop2='La Caleta', col='clst', output=True, exact=False)

Return the sub-dataframe of pairs that span pop1 and pop2.
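One way such a selection could work is sketched below. It assumes that the population label of each sample sits in columns named `col + "1"` and `col + "2"`, and that `exact=False` means substring matching; neither is stated in this reference, and `pairs_between_pops` is a hypothetical name.

```python
import pandas as pd

def pairs_between_pops(df, pop1, pop2, col="clst", exact=False):
    """Hypothetical helper: keep pairs with one sample in pop1 and the other in pop2."""
    c1, c2 = df[col + "1"], df[col + "2"]  # assumed label columns
    if exact:
        fwd = (c1 == pop1) & (c2 == pop2)
        rev = (c1 == pop2) & (c2 == pop1)
    else:
        # Assumption: non-exact mode matches labels by substring
        fwd = c1.str.contains(pop1, regex=False) & c2.str.contains(pop2, regex=False)
        rev = c1.str.contains(pop2, regex=False) & c2.str.contains(pop1, regex=False)
    return df[fwd | rev]  # either orientation of the pair counts

# Toy data, invented for illustration only
df_pairs = pd.DataFrame({
    "iid1": ["A", "A", "B"],
    "iid2": ["B", "C", "C"],
    "clst1": ["La Caleta", "La Caleta", "La Caleta_o"],
    "clst2": ["La Caleta_o", "Andes", "Andes"],
})
loose = pairs_between_pops(df_pairs, "La Caleta", "Andes")
strict = pairs_between_pops(df_pairs, "La Caleta", "Andes", exact=True)
```

In this toy example the substring mode also picks up the `La Caleta_o` label, while the exact mode does not.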

ancIBD.ibd_stats.funcs.remove_iids(df, iids=[])

Remove samples from the pairwise individual IID dataframe if a pair contains one of the flagged samples. Return the updated dataframe.
df: Input dataframe
iids: List of IIDs to remove
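A minimal sketch of this kind of filter, assuming the two samples of a pair are stored in columns `iid1` and `iid2` (an assumption, as is the helper name `drop_pairs_with_iids`):

```python
import pandas as pd

def drop_pairs_with_iids(df, iids=()):
    """Hypothetical helper: drop every pair in which either sample is flagged."""
    flagged = df["iid1"].isin(iids) | df["iid2"].isin(iids)
    return df[~flagged].reset_index(drop=True)

# Toy data, invented for illustration only
df_pairs = pd.DataFrame({"iid1": ["A", "A", "B"], "iid2": ["B", "C", "C"]})
df_clean = drop_pairs_with_iids(df_pairs, iids=["C"])
```

The sketch uses an immutable default `()` rather than `[]`, which avoids the usual pitfall of a mutable default argument being shared between calls.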

ancIBD.ibd_stats.funcs.give_stats_cm_bin(df, cms=[8, 12, 16, 20], binary=True, output=True)

Return counts of IBD in bins.
df: IBD dataframe of pairwise individuals
cms: Bin edges (in cM) to look into. If an upper bound is 0, take an infinite bin.
binary: Only count existence (i.e. 1/0 values per pair)
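The binning logic can be illustrated on plain lists rather than a dataframe. In this sketch each pair is represented by a list of segment lengths, consecutive entries of `cms` form half-open bins, and an upper edge of 0 is read as "no upper limit". The input layout, the half-open bin boundaries, and the name `count_ibd_in_bins` are assumptions.

```python
def count_ibd_in_bins(pair_segments, cms=(8, 12, 16, 20), binary=True):
    """Hypothetical helper: count IBD per length bin, summed over all pairs.

    pair_segments: one list of IBD segment lengths per pair of individuals.
    cms: bin edges; consecutive entries form a bin, an upper edge of 0 is open-ended.
    binary: if True, each pair contributes at most 1 to each bin.
    """
    counts = []
    for lo, hi in zip(cms[:-1], cms[1:]):
        upper = float("inf") if hi == 0 else hi  # 0 marks the infinite bin
        total = 0
        for segments in pair_segments:
            n = sum(1 for s in segments if lo <= s < upper)
            total += min(n, 1) if binary else n
        counts.append(total)
    return counts

# Toy data, invented for illustration only: three pairs of individuals
pairs = [[9.0, 10.5, 25.0], [13.0], []]
binary_counts = count_ibd_in_bins(pairs, cms=(8, 12, 16, 20, 0), binary=True)
total_counts = count_ibd_in_bins(pairs, cms=(8, 12, 16, 20, 0), binary=False)
```

With `binary=True` the first pair counts once in the 8 to 12 bin even though it carries two segments there; with `binary=False` it counts twice.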

ancIBD.ibd_stats.funcs.get_IBD_stats(df, pop1='', pop2='', col='clade', exact=False, cms=[4, 6, 8, 10, 12], binary=True, output=False, a=0.05)

Get IBD fraction statistics. Return fractions, confidence intervals, and the number of pairwise comparisons.
a: Significance level
binary: Only count existence of IBD, not total count [0/1 per pair]

ancIBD.ibd_stats.funcs.get_IBD_stats_pops(df, pops1=[], pops2=[], col='clade', cms=[4, 6, 8, 10, 12], output=False, binary=True, exact=False, a=0.05)

Get IBD fraction statistics for a list of pop pairs. Return lists of fractions, confidence intervals, and numbers of pairwise comparisons.
a: Significance level
binary: Only count existence [0/1 possible per pair]

ancIBD.ibd_stats.funcs.get_ci_counts(counts, n, a=0.05, minc=0.0001)

Get confidence intervals from counts and trials. Return a list of CIs (length 2 each).
counts: Array of counts
n: Total number of trials
a: Significance level
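This reference does not say which interval method the function uses, nor what `minc` does. As a stand-in, the sketch below computes a Wilson score interval for each count using only the standard library; the helper name `wilson_ci_counts` is hypothetical and `minc` is not reproduced.

```python
from math import sqrt
from statistics import NormalDist

def wilson_ci_counts(counts, n, a=0.05):
    """Hypothetical helper: Wilson score interval for each count out of n trials."""
    z = NormalDist().inv_cdf(1 - a / 2)  # two-sided standard normal quantile
    cis = []
    for c in counts:
        p = c / n  # observed fraction
        denom = 1 + z**2 / n
        center = (p + z**2 / (2 * n)) / denom
        half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
        # Clip to [0, 1] to guard against floating-point overshoot
        cis.append([max(0.0, center - half), min(1.0, center + half)])
    return cis

# Toy data, invented for illustration only
cis = wilson_ci_counts([0, 5, 50], n=100, a=0.05)
```

Each entry of `cis` is a `[lower, upper]` pair for the corresponding count. Setting `a=0.32` would give roughly a 68% interval instead of a 95% one.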

ancIBD.ibd_stats.funcs.ibd_stats_pop_pairs(df, pop_pairs, cms=[8, 12, 16, 20, 0], col='label_region', a=0.32, binary=True, output=False, exact=True)

For a list of population pairs, get IBD summary statistics for all pairs of individuals. Return a list of fractions, a list of CIs, and a list of numbers of comparisons.

ancIBD.ibd_stats.funcs.create_ibd_pop_pair_df(pop_pairs, ns, fracs, cis, cms=[8, 12, 16, 20, 0], savepath='')

Save Population Pair dataframe if needed