Identification and analysis of SNP markers associated with traits of interest in rice using Machine Learning methodology
Abstract
The use of molecular markers to select superior individuals for traits of interest is essential to accelerate the development of rice cultivars. Quantitative traits are challenging to work with in marker-assisted selection, and new methodologies must be continually evaluated. This study aimed to identify SNP markers associated with five quantitative traits using a Machine Learning (ML) methodology applied to genotyping (4,709 SNPs) and grain yield data from 541 accessions of Embrapa’s Rice Core Collection evaluated in nine locations. Fifteen TaqMan® hydrolysis probe-based assays were developed from SNPs associated with key traits, and 31 rice varieties were both genotyped and phenotyped for validation. Using simple linear regression analysis, four SNPs were significantly associated with panicle number and grain yield, while three were linked to the percentage of filled grains. Combining machine learning methods with the evaluation of selected SNPs and the development of TaqMan® assays provided an effective approach for identifying markers to support routine marker-assisted selection in rice breeding programs.
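As an illustration of the single-marker validation step, the sketch below fits a simple linear regression of a trait on the genotype at one SNP. It is a minimal example under stated assumptions, not the analysis pipeline used in the study: the additive 0/1/2 allele-dosage coding, the function name `snp_linear_regression`, and the toy data are all hypothetical.

```python
from math import sqrt


def snp_linear_regression(genotypes, phenotypes):
    """Ordinary least squares fit of phenotype ~ genotype for a single SNP.

    genotypes: allele dosages coded 0, 1, 2 (assumed additive coding).
    phenotypes: trait values for the same individuals, in the same order.
    Returns (slope, intercept, r_squared, t_statistic).
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    syy = sum((y - mean_y) ** 2 for y in phenotypes)

    slope = sxy / sxx                      # estimated effect per allele copy
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and proportion of variance explained.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(genotypes, phenotypes))
    r_squared = 1 - sse / syy if syy > 0 else 0.0

    # t statistic for the slope, with n - 2 degrees of freedom.
    se_slope = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else float("inf")
    return slope, intercept, r_squared, t_stat


# Hypothetical toy data: six varieties, one SNP, one trait.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_phenotypes = [10.0, 12.0, 13.0, 15.0, 17.0, 17.0]
slope, intercept, r_squared, t_stat = snp_linear_regression(
    toy_genotypes, toy_phenotypes
)
print(slope, intercept, r_squared, t_stat)
```

The t statistic would be compared against a t distribution with n - 2 degrees of freedom to obtain a p-value. In a real validation each SNP and trait pair would be tested separately, usually with a correction for multiple testing.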




