CDCB changes to evaluation system (December 2018)
New Reference Genome Assembly in Use at CDCB
By Dan Null, Derek Bickhart, Paul VanRaden, Lillian Bacheller, John Cole, Jeff O’Connell and Ben Rosen
In 2018, AGIL researchers – in cooperation with other ARS locations and the University of California-Davis – released a new version of the cattle DNA reference genome (Rosen et al., 2018) to replace the University of Maryland version used since 2009 by researchers around the world. Both maps are based on DNA sequenced from the inbred Hereford cow Dominette, but the new map used longer reads to improve accuracy over repetitive sections of DNA. The new map showed improved imputation of genotypes, alignment of sequence from other animals and annotation of gene structure. International researchers in the 1000 Bull Genomes Project have agreed to use the new ARS-UCD1 reference genome instead of UMD3 as the common language for tracking variation in the next release (run7), and Bob Schnabel (University of Missouri) provided new locations for previous markers.
Further testing at AGIL compared properties of new and previous maps for imputation. The current list of 60K SNPs used routinely by CDCB excludes several sections of UMD3 that were mismapped. Null et al. (2018; abstract and powerpoint) initially compared our edited UMD3 map to a pre-release version of the new map. More recent testing used the public version of ARS-UCD1 plus a further-edited version obtained by removing some apparently mismapped regions that cause haplotype non-inheritance. Lower non-inheritance and fewer haplotypes per segment in Tables 1, 2 and 3 indicate that the new map better matches true DNA sequences.
Table 1. Average non-inheritance of haplotypes (%).
|UMD3 edited||ARS-UCD1||ARS-UCD1 edited|
Table 2. Maximum non-inheritance of haplotypes (%).
|Breed||UMD3 edited||ARS-UCD1||ARS-UCD1 edited|
Table 3. Maximum number of haplotypes per segment.
|Breed||UMD3 edited||ARS-UCD1||ARS-UCD1 edited|
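The non-inheritance statistic reported in these tables can be illustrated with a small sketch. The definition used here is an assumption for illustration, not the AGIL implementation: a progeny haplotype for a segment counts as non-inherited when it matches neither of the two haplotypes carried by the corresponding parent. The function name and the toy haplotype strings are hypothetical.

```python
def non_inheritance_pct(pairs):
    """Percentage of progeny haplotypes that match neither parental haplotype.

    Each item is (progeny_haplotype, (parent_hap_a, parent_hap_b)) for one
    chromosome segment. This definition is an assumption for illustration.
    """
    if not pairs:
        raise ValueError("no haplotypes to compare")
    missed = sum(1 for child, parent in pairs if child not in parent)
    return 100.0 * missed / len(pairs)


# Hypothetical toy data: four progeny haplotypes for one segment.
toy = [
    ("AAB", ("AAB", "ABB")),
    ("ABB", ("AAB", "ABB")),
    ("BBB", ("AAB", "ABB")),  # matches neither parental haplotype
    ("AAA", ("AAA", "AAB")),
]
pct = non_inheritance_pct(toy)
print(pct)  # 25.0
```

Under this assumed definition, a mismapped region would tend to raise the percentage, because markers placed in the wrong order break up haplotypes that were in fact inherited intact.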
Genomic Evaluations Using an 80k SNP Set
By George Wiggans, Daniel Null, Lillian Bacheller and Paul VanRaden
The number of markers used in genomic predictions increased to 79,276 (or 80k) from the previous 60,671 used since 2014. The revised list includes more exact gene tests added recently to chips, removes poorer performing markers, adds new variants with larger effects on traits and changes the marker order based on the new map. Recent chips from Zoetis, Neogen and Genetic Visions each include new variants selected by AGIL from sequence or high-density chips. The original 50K list is still used by nearly all other countries.
Reliability gains from the current 60,671 SNP set versus 77,321 SNPs in a preliminary study were estimated to average 1.4 percentage points across traits for Holsteins when the added SNPs were selected from high-density (HD) chips including gene tests (Wiggans et al., 2016). Reliability gains were estimated to average 2.7 percentage points when the added SNPs were selected from both sequence and HD data (VanRaden et al., 2017). The final SNP set implemented included a total of 79,294 SNPs and was a combination of these two projects. The final set included about 3,000 instead of 16,000 of the SNPs selected from the sequence data, because only those 3,000 had been added to chips.
The research was conducted in two phases for Holstein animals – first by imputing all bulls and their ancestors, and then using those haplotypes as priors to impute the remaining two million females. The first phase took about two days of computing, and the second phase took one week and required 25 processors and 270 GB of memory (22% of available). The new list increased run times for some key programs by about 30%. Priors computed at AGIL were then transferred to CDCB for use in a November test run and the upcoming December official evaluation for all traits and breeds. Estimates of breed base representation (BBR) have now also been upgraded to the new list at CDCB.
One important mutation controlling about 30% of fat yield is now directly included (DGAT1; Gautier et al., 2007). Genomic predictions improved the most for the Jersey and Holstein breeds that have larger reference populations and larger effects of DGAT1. More gene tests, QTLs, selected sequence SNPs and high-density SNPs with larger effects are now included in the SNP list. DGAT1 now has larger effects on yield and Net Merit than any other marker in Jersey and Holstein, and effects about the same size as those of the markers near DGAT1 in Guernsey. As a result, predictions for those breeds changed more than the predictions for Ayrshire and Brown Swiss breeds where the DGAT1 effects were smaller or where the minor allele frequency was lower.
Large reference populations benefit more from additional SNPs because more phenotypes are available to estimate each SNP effect. For Holsteins and Jerseys, correlations with previous Predicted Transmitting Abilities (PTAs) are about 0.99 and reliability increases are only about 1% for yield traits.
Table 4. Correlations of previous with new genomic PTAs for yield traits.
|Breed||Animals with phenotypes||Young animals without phenotypes|
Correlations of PTAs for many other traits were lower than for yield traits. For young Holsteins, the six new health traits averaged a little less than 0.99, whereas type traits and calving traits both averaged slightly less than 0.98. The largest individual PTA changes in each breed were observed on foreign animals that are less connected to the US population, animals with less complete pedigrees, and animals genotyped with the lowest density chips. The correlations in Table 4 excluded older cows genotyped with the 3K chip and imputed dams; correlations are lower when those animals are included.
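For readers who want to reproduce this kind of comparison on their own data, the sketch below computes a Pearson correlation between previous and new PTAs. It assumes the reported correlations are ordinary Pearson correlations, which the text does not state explicitly; the function name and the PTA values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical PTAs for five animals, before and after the SNP list change.
previous_pta = [120.0, -35.0, 410.0, 65.0, 230.0]
new_pta = [131.0, -28.0, 395.0, 80.0, 221.0]
r = pearson(previous_pta, new_pta)
print(round(r, 3))
```

A correlation near 1 means animals rank almost identically under both SNP lists, although individual PTAs can still shift, as noted above for foreign animals and animals genotyped with low-density chips.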
Changes in Haplotype Distribution in Holstein, Brown Swiss and Jersey
By Paul VanRaden and Daniel Null
A new recessive haplotype in Holsteins (HH6) was discovered in France on chromosome 16, along with its mutation in the initiator codon of gene SDE2 at location 29,773,628 on the UMD3 map. Further details are published by Fritz et al. (2018) and were presented by Escouflaire et al. (2018) at the ICAR meeting in New Zealand. The current frequency of HH6 is 1% in France. The current frequency in U.S. data is 0.5%, but it was about 1% in previous decades. French researchers traced HH6 back to “2070579 Mountain”, and AGIL researchers traced HH6 back further to his maternal grandsire (MGS) “1723741 Chairman” and even 4 generations further back to “1244845 Skyliner”, born in 1954. “Chairman” had contributed almost 7% of the genes to the U.S. bull population by 1998, but his influence then declined to 5%, and the expected carrier frequency would be 5% if the haplotype were neutral. The fertility effects of HH6 were confirmed using CDCB data from 371 carrier sire x carrier MGS matings, with a larger than expected 9% ± 2% drop in conception rate.
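The carrier sire x carrier MGS design can be made concrete with a short Mendelian calculation. This is a sketch under stated assumptions, not part of the CDCB analysis: each carrier transmits the haplotype with probability 1/2, and the maternal granddam is assumed to be a non-carrier. The function name is hypothetical.

```python
from fractions import Fraction


def homozygous_embryo_fraction():
    """Expected fraction of homozygous embryos from carrier sire x carrier MGS matings.

    Assumes simple Mendelian transmission and a non-carrier maternal granddam.
    """
    half = Fraction(1, 2)
    p_sire_transmits = half  # carrier sire passes the haplotype
    p_dam_is_carrier = half  # dam inherited it from her carrier sire (the MGS)
    p_dam_transmits = half   # a carrier dam passes the haplotype
    return p_sire_transmits * p_dam_is_carrier * p_dam_transmits


expected = homozygous_embryo_fraction()
print(expected)         # 1/8
print(float(expected))  # 0.125
```

The reported effect is expressed as a drop in conception rate, so it cannot be compared directly with this fraction without a baseline conception rate, which the text does not give.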
Previous haplotypes BH1 and JH2 for Brown Swiss and Jersey, respectively, will be discontinued – effective with the December 2018 run. Several European bulls are homozygous for BH1, and its fertility effect is no longer significant. Thus, European Brown Swiss breeders have also decided to discontinue reporting BH1. JH2 is very difficult to trace with the new ARS-UCD reference map, and the previous absence of homozygous haplotypes was possibly an artifact of map issues.
Haplotypes BH1 and JH2 met the initial statistical tests for publication in 2011 and 2013, with no homozygous animals found and initial estimates of -3.4% ± 1.5% for the conception rate effect of BH1 based on 936 carrier sire by carrier maternal grandsire matings and -4.0% ± 1.5% for JH2 based on 1,098 carrier matings. The fertility losses from BH1 and from JH2 carrier matings were retested using the most current four years of data, and neither was significant. The JH2 carrier frequency in decades before 1990 was 14-28% but had decreased steadily to only 2%, and the reason for the JH2 frequency decline was never clear. Causative mutations were not found for either BH1 or JH2. Thus, neither haplotype will be reported any longer, effective with the December 2018 run.
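One simple way to compare the strength of these estimates is the ratio of each effect to its reported uncertainty. The sketch below assumes that the ± values are standard errors, which the text does not state, and it does not reproduce the statistical tests actually used for publication. The HH6 value is the conception rate drop reported earlier, entered here as a negative effect.

```python
# Estimates quoted in the text, as (effect in %, assumed standard error in %).
estimates = {
    "BH1": (-3.4, 1.5),
    "JH2": (-4.0, 1.5),
    "HH6": (-9.0, 2.0),
}


def effect_to_se_ratio(effect, se):
    """Absolute effect divided by its (assumed) standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return abs(effect) / se


ratios = {
    name: round(effect_to_se_ratio(effect, se), 2)
    for name, (effect, se) in estimates.items()
}
print(ratios)  # {'BH1': 2.27, 'JH2': 2.67, 'HH6': 4.5}
```

On this rough scale, the initial BH1 and JH2 estimates were much closer to zero relative to their uncertainty than the HH6 estimate.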
Large Number of Gene Tests Added to Haplotype Determination
By Daniel Null and Paul VanRaden
Numbers of SNPs, inclusion of gene tests and presence or absence of nearby SNPs with poorer quality can affect carrier status for fertility haplotypes. The new 80k SNP set now contains many more gene tests that were added to recent chips and provided to CDCB, primarily from Neogen. Those tests help impute carrier status for all other animals, but the quality of the gene tests must also be monitored. The large change reported for JH1 was already implemented and announced on September 13 by CDCB and is also reflected in the statistics below because these AGIL tests compared results with the August status. We intended to include the HH5 gene test, but it reported many homozygous animals whereas the haplotype had none. Following these findings, an investigation of the HH5 gene test results is underway at Neogen.
Comparisons of carrier status from the new versus old list in Table 5 reveal that most haplotypes are very stable, but a few more animals switched to carrier status than to non-carrier status. That may result from the gene tests revealing additional families not previously known to be carriers or from better haplotype inheritance with the new map and more rigorous SNP edits. The statistics for Holstein are from bulls and their ancestors, whereas the status changes more for females with incomplete pedigrees or fewer genotyped ancestors. The statistics for other breeds include all animals. Further refinements of the methods are possible before December.
Table 5. Changes in haplotype status from the old to new map and SNP list.
|Haplotype||Same status (%)||Changed to carrier (%)||Changed to non-carrier (%)||Haplotype frequency (%)||Comment|
|HH0||99.8||0.17||0.05||3.2||Gene test added|
|HH1||99.7||0.24||0.05||2.6||Gene test already included|
|HH3||99.5||0.46||0.03||4.6||Gene test added|
|HH4||99.9||0.02||0.01||0.5||Gene test added|
|HH5||98.2||1.69||0.15||6.2||Gene test not added|
|HH6||n/a||0.54||0.00||0.5||New, discovered in FRA|
|HHR||97.1||1.33||1.53||9.4||Gene tests added|
|HHBR||99.7||0.21||0.08||1.2||Gene test added|
|HHDR||99.9||0.03||0.00||0.2||Gene test added|
|JH1||98.4||1.27||0.29||18.4||Corrected GMD gene test|
|BH2||99.4||0.42||0.20||13.3||Gene test added|
|AH1||98.2||1.55||0.23||22.2||Gene test added|
|AH2||99.0||0.69||0.35||21.0||Gene test added|
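The three percentage columns in Table 5 can be derived from a pair of carrier calls per animal. The following is a minimal sketch of that tally, assuming carrier status is stored as a boolean per animal ID; the function name, the data layout and the animal IDs are hypothetical and do not reflect CDCB file formats.

```python
from collections import Counter


def status_changes(old, new):
    """Percentages of animals whose carrier status stayed the same or changed.

    old and new map animal ID -> True if carrier. Only animals present in
    both dictionaries are compared.
    """
    common = old.keys() & new.keys()
    if not common:
        raise ValueError("no animals in common")
    counts = Counter()
    for animal in common:
        if old[animal] == new[animal]:
            counts["same"] += 1
        elif new[animal]:
            counts["to_carrier"] += 1
        else:
            counts["to_non_carrier"] += 1
    n = len(common)
    return {
        key: 100.0 * counts[key] / n
        for key in ("same", "to_carrier", "to_non_carrier")
    }


# Hypothetical carrier calls for four animals under the old and new lists.
old_status = {"A": False, "B": True, "C": False, "D": True}
new_status = {"A": False, "B": True, "C": True, "D": False}
changes = status_changes(old_status, new_status)
print(changes)
# {'same': 50.0, 'to_carrier': 25.0, 'to_non_carrier': 25.0}
```

The three percentages sum to 100 by construction, which matches the rows of Table 5 up to rounding.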
Criteria update for young heifers in female file (format 105)
By Gary Fok
As of August 2018, the criteria used to differentiate heifers from young lactating cows will be based on number of records and not on comparison of PTA vs PA of the animal. The old criteria, used for years, blanked the PTAs of animals without lactation records. Since April, some additional cows had their PTAs blanked if reliability was not sufficiently higher than their PA reliability. Effective December 2018, PTA is again reported for all animals having a lactation record. In a test run based on the previous August 2018 evaluation, the new criteria resulted in 21,604 additional animals receiving a PTA in the female (format 105) file.
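The record-based rule can be stated in a few lines. This is only a sketch of the criterion as described above, assuming that the count of lactation records is the sole input; the function name is hypothetical and the actual format 105 processing may apply further edits.

```python
def pta_is_reported(n_lactation_records):
    """Return True if the PTA is reported rather than blanked.

    Sketch of the rule described in the text: an animal with at least one
    lactation record is treated as a cow and keeps its PTA; an animal with
    none is treated as a heifer and has its PTA blanked.
    """
    if n_lactation_records < 0:
        raise ValueError("number of lactation records cannot be negative")
    return n_lactation_records >= 1


print(pta_is_reported(0))  # False
print(pta_is_reported(1))  # True
```

Unlike the rule applied since April, this check does not compare PTA reliability with PA reliability.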