TY - JOUR
T1 - Estimating breed composition for pigs: A case study focused on Mangalitsa pigs and two methods
AU - Chinchilla-Vargas, Josue
AU - Bertolini, Francesca
AU - Stalder, K. J.
AU - Steibel, J. P.
AU - Rothschild, M. F.
PY - 2021
Y1 - 2021
N2 - Breed associations and registries maintain breed purity by enforcing certain conformational characteristics defining the breed along with cataloging the pedigree of every animal in the registry. Furthermore, developing niche markets is often based on specialized products using heritage breeds that need to guarantee breed purity. Genomic technology and the progressively lower costs of genotyping can be helpful when assessing breed purity by estimating breed composition. In this research, genotypes from 648 pigs representing 11 breeds were used to develop marker panels to estimate breed composition with special emphasis on Mangalitsa pigs as a heritage breed. Two sets of panels were created. The first set was based on Fst scores that were calculated individually for ~31,000 available markers across the pig genome. Here, panels composed of the 10, 50, 100, 500 and 1000 markers with the highest Fst scores were generated. The second set was composed of randomly selected markers and had the same number of markers as the Fst-derived panels. Two statistical methods, linear regression and random forest, were then applied to the marker panels to estimate the breed composition of 107 pigs, including 47 individuals known to have Mangalitsa background. Fst-derived panels appeared to be better at identifying Mangalitsa individuals than random-marker panels, regardless of the method used to estimate breed composition. However, random markers were more accurate at estimating breed composition for non-Mangalitsa individuals. When the results were compared across methods for estimating breed composition, linear regression produced more accurate estimates of breed composition than random forest. However, both methods lacked accuracy when estimating breed composition for crossbred individuals.
It must also be noted that these methods focused on estimating breed composition of Mangalitsa pigs. Different markers should be selected if other breeds are the focus, and accuracy of prediction will depend on the breeds that are available to be used as references for the Fst calculations. The results presented in this study allow us to conclude that: 1) Random forest was effective at classifying individuals into breeds, but not at estimating breed composition when compared to the linear regression method. 2) Markers filtered using Fst scores are more effective at identifying Mangalitsa breed composition while not as effective at identifying other breeds. 3) If Fst-filtered markers that are effective at distinguishing Mangalitsa from other breeds are being used to estimate breed composition for individuals of other breeds, a greater number of markers is needed.
AB - Breed associations and registries maintain breed purity by enforcing certain conformational characteristics defining the breed along with cataloging the pedigree of every animal in the registry. Furthermore, developing niche markets is often based on specialized products using heritage breeds that need to guarantee breed purity. Genomic technology and the progressively lower costs of genotyping can be helpful when assessing breed purity by estimating breed composition. In this research, genotypes from 648 pigs representing 11 breeds were used to develop marker panels to estimate breed composition with special emphasis on Mangalitsa pigs as a heritage breed. Two sets of panels were created. The first set was based on Fst scores that were calculated individually for ~31,000 available markers across the pig genome. Here, panels composed of the 10, 50, 100, 500 and 1000 markers with the highest Fst scores were generated. The second set was composed of randomly selected markers and had the same number of markers as the Fst-derived panels. Two statistical methods, linear regression and random forest, were then applied to the marker panels to estimate the breed composition of 107 pigs, including 47 individuals known to have Mangalitsa background. Fst-derived panels appeared to be better at identifying Mangalitsa individuals than random-marker panels, regardless of the method used to estimate breed composition. However, random markers were more accurate at estimating breed composition for non-Mangalitsa individuals. When the results were compared across methods for estimating breed composition, linear regression produced more accurate estimates of breed composition than random forest. However, both methods lacked accuracy when estimating breed composition for crossbred individuals.
It must also be noted that these methods focused on estimating breed composition of Mangalitsa pigs. Different markers should be selected if other breeds are the focus, and accuracy of prediction will depend on the breeds that are available to be used as references for the Fst calculations. The results presented in this study allow us to conclude that: 1) Random forest was effective at classifying individuals into breeds, but not at estimating breed composition when compared to the linear regression method. 2) Markers filtered using Fst scores are more effective at identifying Mangalitsa breed composition while not as effective at identifying other breeds. 3) If Fst-filtered markers that are effective at distinguishing Mangalitsa from other breeds are being used to estimate breed composition for individuals of other breeds, a greater number of markers is needed.
KW - Mangalitsa
KW - Mangalica
KW - Swine
KW - Breed composition
KW - Random forest
KW - Linear regression
U2 - 10.1016/j.livsci.2021.104398
DO - 10.1016/j.livsci.2021.104398
M3 - Journal article
VL - 244
JO - Livestock Science
JF - Livestock Science
SN - 1871-1413
M1 - 104398
ER -