
Tips

1.  Input

    READ THE "Input" DOCUMENT CAREFULLY.
 
    The program will stop if any errors are detected in the format of
    the pedigree file, the format of the marker files or the format of
    the genotype data files.  So read the "Input" document carefully
    and make sure the input files are in the correct format.

2.  Option 1 or Option 2

    Apply option 1 first, then use option 2.
 
    Option 1 performs the fast EIBD, AIBS and IBS tests only.
    Option 2 does everything that is done in option 1, then 
    proceeds to apply the MLLR test to a subset of the relative pairs. 
    Although the MLLR test has higher power than the EIBD, AIBS and
    IBS tests, it requires simulation for each pair to assess
    significance.  For each relative pair, the MLLR test using
    100,000 simulated realizations takes approximately 5 minutes on a
    Sun Ultra-2.

    If you have a large number of pedigrees, it is generally best to
    use option 1 first.  The output file "prest_out1" reports the
    number of pairs for which PREST would perform the MLLR test if
    option 2 were chosen.  Multiplying this number by 5 minutes gives
    the approximate running time for your data on a Sun Ultra-2
    under option 2.
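
    The arithmetic above can be sketched as follows.  This is only an
    illustration: the function name and the pair count of 120 are
    hypothetical, and the 5-minute figure applies to a Sun Ultra-2 at
    the default number of replicates, so it should be re-measured on
    your own machine.

```python
# Rough running-time estimate for option 2 (illustrative sketch only).
# Approximate per-pair cost quoted above for a Sun Ultra-2 with
# 100,000 simulated realizations; re-measure it on your own hardware.
MINUTES_PER_PAIR = 5.0


def estimate_option2_minutes(n_pairs, minutes_per_pair=MINUTES_PER_PAIR):
    """Return the approximate MLLR running time, in minutes, for n_pairs pairs."""
    if n_pairs < 0:
        raise ValueError("n_pairs must be non-negative")
    return n_pairs * minutes_per_pair


# Hypothetical example: suppose prest_out1 reports 120 pairs for the MLLR test.
minutes = estimate_option2_minutes(120)
print(f"about {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```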

    Note that the Number of REPlicates (NREP) in the simulation can be
    changed in the source file "prest.c".  Reducing the number of
    replicates speeds up the simulation, but reduces the accuracy of
    the empirical p-value calculation.  Conversely, you can increase
    the accuracy by increasing NREP.  The default value of NREP 
    is 100,000.
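
    To get a feel for the trade-off, one can look at the usual
    Monte Carlo standard error of an estimated p-value.  The sketch
    below assumes that the empirical p-value is a simple proportion
    of NREP independent simulated replicates, so that its standard
    error is sqrt(p(1-p)/NREP); this is a generic approximation and
    not something reported by PREST itself.  The function name is
    hypothetical.

```python
import math


def empirical_p_standard_error(p, nrep):
    """Binomial standard error of a Monte Carlo p-value based on nrep replicates.

    Assumes the p-value is estimated as a proportion of independent replicates.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if nrep <= 0:
        raise ValueError("nrep must be positive")
    return math.sqrt(p * (1.0 - p) / nrep)


# Compare a few replicate counts for a true p-value near the 0.001 level.
for nrep in (1_000, 10_000, 100_000):
    se = empirical_p_standard_error(0.001, nrep)
    print(f"NREP = {nrep:>7}: standard error about {se:.1e}")
```

    Under this approximation, a p-value near 0.001 is estimated with
    a standard error of roughly 0.0001 at the default NREP, and
    cutting NREP by a factor of 100 makes the standard error about
    10 times larger.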

3.  EIBD, AIBS, IBS or MLLR

    A general ordering of the tests by power, from highest to
    lowest, is: MLLR, EIBD, AIBS, IBS.

    In our simulations, the power of each of the MLLR, EIBD and AIBS
    tests (where applicable) is higher than that of the IBS test in
    all cases considered.  The power of the EIBD test is generally
    higher than that of the AIBS test, except when the null
    relationship has p2 = 0 (p2 being the probability that the pair
    shares two alleles IBD) and the true relationship has a moderate
    value of p2, e.g. full-sib.  The power of the MLLR test is
    usually higher than that of the EIBD test.  However, when the
    true relationship is not contained in the alternative set for
    the MLLR test, the powers of the EIBD and MLLR tests tend to be
    very close in all the cases we have simulated.  In practice,
    we often do not have a specific alternative relationship in mind.
    In that case, the EIBD and AIBS tests, which do not require
    specification of a set of alternative relationships, can be
    useful.

    The parent-offspring relationship is a special case.  If the null
    relationship is parent-offspring and the true relationship has the
    same or a similar kinship coefficient, e.g. full-sib, the EIBD
    test does not apply and both the AIBS and IBS tests have low
    power.  If the null and true relationships described above are
    reversed, the EIBD, AIBS and IBS tests all have low power.
    To overcome the unsatisfactory power of the EIBD, AIBS and IBS
    tests in both cases, while avoiding applying the MLLR test to
    every pair, the estimate of p = (p0, p1, p2) is used to identify
    possibly misspecified parent-offspring pairs, which are
    listed in the output file "prest_out3".  Note that if option 2 is
    chosen, such pairs will then be tested with the MLLR test.
    However, if one uses option 1 only, one should pay extra
    attention to the pairs listed in the file "prest_out3".
    
4.  Significance Level 

    For a single hypothesis test, it is conventional to use a
    significance level of 0.05 or 0.01.  In practice, we often want to
    check for pedigree errors in a moderate number of pedigrees with a
    large number of relative pairs.  This creates a problem of
    multiple comparisons: even if the null hypothesis is true in
    all cases, about 5% of the pairs would be expected to have
    p-values below 0.05.  To reduce the number of false positives
    expected when testing a large number of pairs, a lower
    significance level is recommended.  A level of 0.001 could
    be used to "flag" pairs that are problematic.  However, to
    conclude that particular pairs show significant misfit, one
    should not rely on the 0.001 threshold alone.  One may want to
    apply a Bonferroni correction, i.e. multiply the most significant
    p-value by the number of hypothesis tests performed.  One may
    also be able to combine the information across several relative
    pairs that all point to the same relationship misspecification
    (see Tip 6).
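
    A minimal sketch of the Bonferroni correction follows.  It
    adjusts every p-value rather than only the smallest one, which
    gives the same answer for the most significant pair.  The
    function name and the p-values are hypothetical and are not
    real PREST output.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical p-values for four relative pairs (not real PREST output).
raw = [0.00002, 0.0008, 0.03, 0.4]
adjusted = bonferroni_adjust(raw)
for p, p_adj in zip(raw, adjusted):
    verdict = "significant at 0.05" if p_adj < 0.05 else "not significant"
    print(f"raw p = {p:<8g} adjusted p = {p_adj:<8g} {verdict}")
```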

5.  Estimation of Relationships

    The estimation of p = (p0, p1, p2) can be used to suggest some
    likely relationships for a pair. This is especially useful if the
    relationship specified by the pedigree does not fit the data.

    Both "PREST" and "ALTERTEST" estimate p = (p0, p1, p2),
    the probabilities of sharing zero, one or two alleles IBD.
    To guard against non-convergence of the EM algorithm, the Maximum
    number of ITERations (MITER) is set to 5000.  For any pair, if
    the number of iterations exceeds MITER, the output is -1.  You can
    redefine MITER in the source file "prest.c" or "altertest.c".

    The following are the null values of p = (p0, p1, p2) for some
    relationships, with the corresponding kinship coefficients.
                              
                                    p0       p1      p2      kin.coef.

        MZ twins                    0        0       1       .5
        parent-offspring            0        1       0       .25
        full-sib                    0.25     0.5     0.25    .25
        half-sib-plus-first-cousin  0.375    0.5     0.125   .1875
        half-sib                    0.5      0.5     0       .125
        grandparent-child           0.5      0.5     0       .125
        avuncular                   0.5      0.5     0       .125
        first-cousin                0.75     0.25    0       .0625
        half-avuncular              0.75     0.25    0       .0625
        half-first-cousin           0.875    0.125   0       .03125
        unrelated                   1        0       0       0

    (A pair is called half-sib-plus-first-cousin if they have the
     same mother and different fathers who are full sibs, or the
     same father and different mothers who are sisters.)
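
    For non-inbred pairs, the kinship coefficient in the last column
    follows from p through the standard identity
    kin.coef. = p1/4 + p2/2.  The short sketch below recomputes the
    column from a few rows of the table; the function and variable
    names are hypothetical and are not part of PREST.

```python
def kinship_coefficient(p0, p1, p2):
    """Kinship coefficient implied by the IBD-sharing probabilities (p0, p1, p2)."""
    if abs(p0 + p1 + p2 - 1.0) > 1e-9:
        raise ValueError("p0 + p1 + p2 must equal 1")
    return p1 / 4.0 + p2 / 2.0


# Null values copied from some rows of the table above.
NULL_VALUES = {
    "MZ twins": (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full-sib": (0.25, 0.5, 0.25),
    "half-sib-plus-first-cousin": (0.375, 0.5, 0.125),
    "half-sib": (0.5, 0.5, 0.0),
    "first-cousin": (0.75, 0.25, 0.0),
    "half-first-cousin": (0.875, 0.125, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}

for name, p in NULL_VALUES.items():
    print(f"{name:<28}{kinship_coefficient(*p):g}")
```

    The same function can be applied to an estimated p to see which
    rows of the table have a similar kinship coefficient, keeping in
    mind that several relationships share the same values.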

6.  Drawing Conclusions about Pedigree Errors
   
    First, take into account the amount of data, or the number of
    markers used in the tests.  Simulation studies show that our
    method has satisfactory power for detecting pedigree errors, 
    if genome-screen data are available, or the number of typed
    markers is large.  However, the tests are not reliable if only a
    few markers are typed.

    Second, consider the pattern of results among multiple pairs 
    from the same pedigree.  

    When a particular pair is observed to have significant deviation
    from its null relationship, the pattern among other relatives can
    often confirm and point to a likely explanation for the finding. 
    For instance, consider a family with a sibship of size 3 
    (ID 1, ID 2 and ID 3) with parents untyped.  If ID 1 were half-sib
    to the other two, then you would expect to detect an error in two
    different pairs (ID 1, ID 2) and (ID 1, ID 3), and not expect to see
    a relationship misfit in pair (ID 2, ID 3).  When an apparent
    error is detected, considering the pattern of results among
    multiple pairs from the same pedigree may make it possible to
    distinguish a true relationship error from chance rejection of
    the null.
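
    The reasoning in the sibship example can be sketched in code.
    The helper below is a hypothetical illustration, not part of
    PREST or ALTERTEST: given the members of a sibship and the pairs
    flagged as misfits, it lists the individuals who are flagged
    with every other member while no pair that excludes them is
    flagged.

```python
from itertools import combinations


def individuals_explaining_all_flags(members, flagged_pairs):
    """Return members flagged with every other member, with no other pair flagged."""
    flagged = {frozenset(pair) for pair in flagged_pairs}
    candidates = []
    for person in members:
        others = [m for m in members if m != person]
        # Every pair that includes this person must be flagged ...
        flagged_with_all = all(frozenset((person, o)) in flagged for o in others)
        # ... and no pair that excludes this person may be flagged.
        flagged_elsewhere = any(
            frozenset(pair) in flagged for pair in combinations(others, 2)
        )
        if flagged_with_all and not flagged_elsewhere:
            candidates.append(person)
    return candidates


# The sibship from the example: pairs (1, 2) and (1, 3) are flagged, (2, 3) is not.
print(individuals_explaining_all_flags([1, 2, 3], [(1, 2), (1, 3)]))
```

    For the example sibship this singles out ID 1.  A single flagged
    pair with no supporting pattern, such as (2, 3) alone, yields no
    candidate and is more consistent with a chance rejection.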

    You can also use ALTERTEST to check whether some particular
    relationships are compatible with the data.  This will provide
    more evidence about the likely relationship for the pair.

7.  Alleles must have the same meaning across the families.

    Note that if allele numbers in the genotype data file refer to
    different alleles in different families, then the results of PREST
    and ALTERTEST will not be meaningful.  

8.  
