Tuesday 2 February 2016

A second study tracking DNA segments through time and space

I recently had my first confirmed match with a known genealogical cousin at AncestryDNA. This was also my first ever shaky leaf hint at AncestryDNA, which was very exciting, and it was good to see how the system worked in practice. In this case the hint was spot on and correctly predicted that we are related through our mutual great-great-grandparents Charles James Wiggins and Mary Ann Thorn. Charles Wiggins was born in 1828 in Clapham in South London. Mary Ann Thorn was born in Colchester, Essex. They married in 1848 in Lewisham, London, and went on to have 11 children, though their youngest daughter Catherine died in infancy.
My cousin has now transferred her data to Family Tree DNA which has allowed me to do some comparisons with my other family members who have tested there. One of the interesting aspects of autosomal DNA testing is that it allows us to study the process of inheritance, and to see how the segments are passed on from one generation to the next. (Unfortunately this is not possible at AncestryDNA because they do not provide us with a chromosome browser.)

I did a previous exercise in tracking segments through the generations when I got my first confirmed match with a fourth cousin at Family Tree DNA, and I thought it would be interesting to repeat the exercise now that I have a second confirmed match with a cousin.

The first chromosome browser view below is a comparison between my dad and our new genetic cousin. They are second cousins once removed. For the first three comparisons I've screened out the segments under 5 cMs in size. The segment on chromosome is only 6.54 cMs and the one on chromosome 16 is 7.94 cMs. Small segments are often false positives but I'll leave them in here for the purposes of this exercise.


The second chromosome browser view is a comparison between me and my genetic cousin. We are third cousins. As can be seen I've inherited all four of the large segments from my dad and one of the two smaller segments.


The third chromosome browser view is a comparison between my son and the cousin. They are third cousins once removed. Here you can see that in just one generation four of the five segments I inherited from Charles James Wiggins and Mary Ann Thorn have been lost completely.



I thought it would also be interesting to take a look at the smaller segments. The chromosome browser view below is from the perspective of my son, and shows the segments that he shares with his third cousin once removed (blue) and with his maternal grandfather (orange). Any segment that my son has received from Charles James Wiggins and Mary Ann Thorn must have been inherited from his maternal grandfather. However, we can see here that on chromosomes one and two there are floating segments that do not line up with the segments he's inherited from his maternal grandfather. The other segments do at least match in the right place but I suspect they would probably all disappear with phasing.
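The check described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how any testing company does it: the Segment type, the floating_segments function and all of the coordinates are made up for the example. A cousin segment counts as "supported" only if it sits entirely inside a segment that my son shares with his grandfather on the same chromosome.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A shared DNA segment (hypothetical structure for this sketch)."""
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def floating_segments(cousin_matches, grandparent_matches):
    """Return the cousin segments not contained in any grandparent segment."""
    floating = []
    for seg in cousin_matches:
        supported = any(
            gp.chromosome == seg.chromosome
            and gp.start <= seg.start
            and seg.end <= gp.end
            for gp in grandparent_matches
        )
        if not supported:
            floating.append(seg)
    return floating


# Made-up coordinates, for illustration only (not taken from the real kits).
grandfather = [
    Segment(1, 50_000_000, 120_000_000),
    Segment(7, 1_000_000, 60_000_000),
]
cousin = [
    Segment(1, 10_000_000, 12_000_000),  # lies outside the grandfather's segment
    Segment(7, 20_000_000, 35_000_000),  # lies inside the grandfather's segment
]

print(floating_segments(cousin, grandfather))
```

With these invented numbers only the chromosome 1 segment is flagged as floating. A segment that passes this check is not proven to be real; it simply hasn't been ruled out.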

My son's maternal grandmother has also been tested at Family Tree DNA so let's have a look now at what happens when I do a comparison between my son and his maternal grandparents. His maternal grandfather is shown in orange and his maternal grandmother in blue in the view below. You can see how the segments from his two grandparents fit together like large teeth. There's a slight oddity on chromosome 13 where a stray small false segment seems to have crept in. However, the basic principle remains that segments are generally passed on in large chunks and are not broken down into a myriad of tiny segments in the noisy pattern that we see in the chromosome browser view above.


This exercise also gives me the opportunity to compare the centiMorgan count at both AncestryDNA and Family Tree DNA. I can only do comparisons between myself and my third cousin as we are the only ones who are in both databases. Ancestry phase the data before calculating the matches and do not report segments under 5 cMs. Family Tree DNA do not phase the data and report segments right down to 1 cM in size.

According to AncestryDNA I share 110 cMs spread across 8 segments. Something in the phasing process appears to break large segments into smaller units, which would explain the higher-than-expected segment count at AncestryDNA. At Family Tree DNA I share 144.49 cMs over 14 segments with my third cousin. If I subtract all the segments under 5 cMs this brings the count down to 124.28 cMs. If I also remove the 6.54 cM segment from the calculations the count comes down to 117.74 cMs, which is pretty close to the figure from AncestryDNA. If the AncestryDNA phasing is correct then it seems likely that the 6.54 cM segment is a false positive.
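For anyone who wants to repeat the sums with their own matches, here is a minimal Python sketch of the arithmetic. It uses only the totals quoted above; the variable and function names are my own for the purposes of the example, and the individual segment sizes are not needed.

```python
# Totals quoted in the post, in centiMorgans (cM).
FTDNA_TOTAL = 144.49      # all 14 segments reported by Family Tree DNA
FTDNA_OVER_5 = 124.28     # total after dropping segments under 5 cM
DOUBTFUL_SEGMENT = 6.54   # small segment suspected of being a false positive
ANCESTRY_TOTAL = 110.0    # total reported by AncestryDNA after phasing


def compare_totals():
    """Reproduce the subtraction steps, rounded to two decimal places."""
    under_5 = round(FTDNA_TOTAL - FTDNA_OVER_5, 2)
    without_doubtful = round(FTDNA_OVER_5 - DOUBTFUL_SEGMENT, 2)
    remaining_gap = round(without_doubtful - ANCESTRY_TOTAL, 2)
    return under_5, without_doubtful, remaining_gap


under_5, without_doubtful, remaining_gap = compare_totals()
print(f"cM in segments under 5 cM: {under_5}")
print(f"Total without the 6.54 cM segment: {without_doubtful}")
print(f"Remaining difference from AncestryDNA: {remaining_gap}")
```

On these figures the segments under 5 cMs account for 20.21 cMs of the Family Tree DNA total, and a difference of 7.74 cMs from the AncestryDNA figure remains after the 6.54 cM segment is removed.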

Exercises like this help us to understand the inheritance process and I would be very interested to see similar studies from other genetic genealogists.

Update
The blog post "Behind the new AncestryDNA feature: amount of shared DNA" by Anna Swayne explains how the company's phasing and Timber algorithms work.

© 2016 Debbie Kennett

4 comments:

Linda Reid said...

If you share mutual great-great-great grandparents with a new genetic cousin, wouldn't that make you 4th cousins?

Debbie Kennett said...

Linda

You're quite right. Thanks for spotting the error. It is a third cousin relationship so the common ancestors are our great-great-grandparents. I was looking at the relationships in Family Historian where I have a column that tells you how you're related. Charles James Wiggins was the son of Charles James Wiggins and I looked at the wrong row.

Andrew Millard said...

Does the 'oddity' on Chr 13 show up if you compare your parents to one another, or if you compare them to your husband? These comparisons will show if it is a real match due to a common segment or a false match.

The small segments that match between your father and the new cousin could be false positives, either from lack of phasing or common segments within the population that are not useful for identifying relatives, but they could be due to them being what I think should be called 'cryptic cousins' - cousins within a genealogical timeframe but not identified as such due to incomplete or erroneous trees.

Debbie Kennett said...

Andrew, You can only do comparisons in the chromosome browser between people who are already matches so I have no way of comparing my parents to each other or to my husband. I could do this if I put all the kits on GedMatch but I haven't yet done that.

With the currently available tests most of the small segments under 5 cMs are false positives. Even with phasing there's a very high false positive rate:

http://www.ncbi.nlm.nih.gov/pubmed/24784137

We need whole genome sequencing to detect small segments but in general segments under 5 cMs will fall well beyond the reach of genealogical records.

23andMe used the term "cryptic relatives" to define hidden genealogical relationships in this paper:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0034267