Identification and functional annotation of genic simple sequence repeats from leaves tissue transcriptome dataset of Stevia rebaudiana Bertoni

Stevia rebaudiana is an important agricultural crop that yields diterpenoid steviol glycosides (SGs) commonly used to substitute sugar in food products and nutraceuticals. Despite the increasing demand for Stevia leaf and Stevia-based products, the genetic background of this crop remains poorly e...

Bibliographic Details
Main Author: Azmi Murad, Azrul Afiq
Format: Thesis
Language: English
Published: 2021
Subjects: Stevia rebaudiana; Metabolites
Online Access: http://psasir.upm.edu.my/id/eprint/104312/1/AZRUL%20AFIQ%20BIN%20AZMI%20MURAD%20-%20IR.pdf
id my-upm-ir.104312
record_format uketd_dc
spelling my-upm-ir.104312 2023-08-08T02:06:05Z
institution Universiti Putra Malaysia
collection PSAS Institutional Repository
language English
advisor Yong, Christina Seok Yien
topic Stevia rebaudiana
Metabolites

description Stevia rebaudiana is an important agricultural crop that yields diterpenoid steviol glycosides (SGs), which are commonly used as sugar substitutes in food products and nutraceuticals. Despite the increasing demand for Stevia leaf and Stevia-based products, the genetic background of this crop remains poorly elucidated, and very few genetic markers are available for the species. The current study investigated an in-house leaf tissue transcriptome dataset of Stevia rebaudiana and developed genic-SSR markers for the species using in silico approaches. In total, 103,890 de novo assembled contig sequences were analysed. Of these, 8,065 contigs containing 8,789 genic-SSR loci were identified using the MIcroSAtellite identification tool (MISA). Of the 8,065 contigs containing genic-SSRs (CCGS), 7,400 contained a single genic-SSR per locus, while 665 contained multiple SSRs per locus (ML). Furthermore, among the 8,789 genic-SSRs, 8,302 were identified as pure genic-SSRs, 105 as complex genic-SSRs and the remaining 382 as compound genic-SSRs. Functional annotation of the 8,065 CCGS assigned functional genes to 6,447 CCGS, while the remaining 1,618 CCGS were unannotated. Of the 6,447 annotated CCGS, 5,494 matched significantly to protein sequences of various plant species at an E-value cut-off of 1.0E-15. Among the 5,494 CCGS, 3,069 were annotated with known functional genes and contained only a single pure genic-SSR per locus. Pure trinucleotide genic-SSRs (52.66%) were the predominant repeats, followed by pure di- (35.32%), hexa- (6.48%), penta- (3.84%) and tetranucleotides (1.69%). Di- and trinucleotide microsatellites therefore predominate in the S. rebaudiana leaf transcriptome. The repeat motif AT/TA (50.28%) was the most abundant among the dinucleotides, and the repeat motif GAT/ATC (12.87%) was predominant among the trinucleotides.
From the 3,069 annotated CCGS, 1,617 were mapped to proteins available in the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The pathways with the highest numbers of annotated CCGS mapped to them were the metabolic pathways, the secondary metabolite biosynthesis pathway and the antibiotics biosynthesis pathway. Most studies on S. rebaudiana have focused on the biosynthesis of secondary metabolites, with a particular interest in the SGs that contribute to the natural sweetness of Stevia. In this study, a total of 14 genic-SSR loci associated with genes involved in the SG biosynthesis pathway were identified. In addition, 20 pairs of genic-SSR primers were designed and validated. Of the 20 primer pairs, 17 (85.00%) were successfully cross-amplified in three different varieties of S. rebaudiana (SweetStevia, UKMB408 and AKHL1 var.). Three of the 17 loci screened were found to be polymorphic, as revealed by polyacrylamide gel electrophoresis and confirmed by bidirectional amplicon sequencing of the PCR products. In conclusion, the transcriptome dataset has served as an excellent resource for the discovery of genic-SSRs in Stevia rebaudiana, and it shows promising potential for developing polymorphic genic-SSR markers. As the DNA markers available for this species are still very limited, the genic-SSR loci identified in this study will contribute substantially to the development of more DNA markers for the species, which may be applied in population and functional studies in the future. They may also serve as baseline data for developing DNA markers for selective breeding.
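The pure (perfect) SSR detection summarised in the description can be illustrated with a minimal Python sketch. This is not MISA and not the thesis pipeline: the function name find_ssrs, the MIN_REPEATS thresholds and the example sequence are all hypothetical, chosen only to show how di- to hexanucleotide repeats might be located and counted in a contig. Compound and complex SSRs are not handled.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds,
# not the settings used in the thesis).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as "AA" or "ATAT", which belong to a
            # shorter repeat class.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Made-up contig: (AT)7 and (GAT)5 embedded in flanking sequence.
example = "GGC" + "AT" * 7 + "CCGTA" + "GAT" * 5 + "TTGCA"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

Tallying the motif lengths of the returned hits across all contigs would give class proportions of the kind reported above (tri-, di-, hexa-, penta- and tetranucleotide shares), although the exact figures depend on the thresholds chosen.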
format Thesis
qualification_level Master's degree
author Azmi Murad, Azrul Afiq
title Identification and functional annotation of genic simple sequence repeats from leaves tissue transcriptome dataset of Stevia rebaudiana Bertoni
granting_institution Universiti Putra Malaysia
publishDate 2021
url http://psasir.upm.edu.my/id/eprint/104312/1/AZRUL%20AFIQ%20BIN%20AZMI%20MURAD%20-%20IR.pdf
_version_ 1776100429585711104