
Comparative analysis of the complete chloroplast genomes and 45S nrDNAs of six Lonicera species


Academic year: 2021


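Attribution-NonCommercial-NoDerivs 2.0 Korea

You are free to copy, distribute, transmit, display, perform and broadcast this work, provided that you follow the conditions below:

Attribution. You must credit the original author.
NonCommercial. You may not use this work for commercial purposes.
NoDerivs. You may not alter, transform or build upon this work.

For any reuse or distribution, you must make clear the license terms applied to this work. These conditions do not apply if you obtain separate permission from the copyright holder. Your rights as a user under copyright law are not affected by the above. This is a human-readable summary of the Legal Code.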


Comparative analysis of the complete chloroplast genomes and 45S nrDNAs of six Lonicera species

SHIN-JAE KANG

DEPARTMENT OF PLANT SCIENCE

THE GRADUATE SCHOOL OF SEOUL NATIONAL UNIVERSITY

ABSTRACT

The genus Lonicera belongs to the family Caprifoliaceae and comprises approximately 210 species distributed in East Asia. Many Lonicera species, such as L. japonica and L. maackii, are ornamental shrubs and are used as herbal medicines. Despite their usefulness, their genetics, genomics and molecular phylogenetics have rarely been reported. Here, we collected six Lonicera species from the Medicinal Plant Garden, College of Pharmacy, Seoul National University, and produced 2.7-4.1 Gbp of whole genome sequencing (WGS) data. We obtained complete chloroplast genome and 45S nuclear ribosomal DNA (45S nrDNA) sequences using de novo assembly of Low-Coverage Whole genome sequence (dnaLCW). The chloroplast genomes of the six Lonicera species ranged from 154,892 to 155,318 bp and showed high similarity (97.4%) to each other. There were 114 genes in L. insularis, L. sachalinensis, L. praeflorens and L. maackii, 113 in L. vesicaria, and 112 in L. japonica. Comparative analysis of chloroplast genome and 45S nrDNA sequences revealed 17 to 2,261 single nucleotide polymorphisms (SNPs) and 5 to 278 insertions and deletions (InDels) between species in the chloroplast genome, and a total of 45 SNPs and 4 InDels in 45S nrDNA. Furthermore, 266 large repetitive sequences and 288 simple sequence repeats (SSRs) were detected among the six chloroplast genomes. In addition, we found several chloroplast protein-coding genes that either showed high Ka/Ks values or were highly conserved among the six Lonicera species. Four genes with high Ka/Ks values, psaJ, rbcL, rps18 and ycf2, might be under positive selection in the genus Lonicera. On the other hand, four genes in the large single copy (LSC) region, psbI, psbZ, psbL and petZ, were highly conserved in the six Lonicera species. Estimation of divergence time and phylogenetic relationships revealed that L. japonica diverged first from the common ancestor of the six Lonicera species (9.19-10.74 MYA), and that L. insularis and L. sachalinensis diverged recently (0-0.03 MYA). Phylogenetic trees based on chloroplast genomes and 45S nrDNA sequences showed a similar topology, in which L. insularis and L. sachalinensis were the closest and formed a clade with L. maackii.
Moreover, phylogenetic analysis with related species in Dipsacales revealed that the Lonicera species clustered with the genera Patrinia and Kolkwitzia of the same family, Caprifoliaceae, as expected. A total of seven molecular markers were developed from polymorphic sites such as SNPs, InDels and copy number variation (CNV) of tandem repeats (TRs) in the chloroplast genome. We successfully discriminated the six Lonicera species using these markers. The chloroplast genome and 45S nrDNA sequences of the six Lonicera species, along with the DNA markers produced in this study, will provide valuable information for further genetic diversity studies and authentication of these species.

Keywords: Lonicera, Lonicera insularis, Lonicera sachalinensis, Lonicera praeflorens, Lonicera maackii, Lonicera vesicaria, Lonicera japonica, chloroplast genome, 45S nrDNA, dnaLCW


CONTENTS

ABSTRACT
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

INTRODUCTION

MATERIALS AND METHODS
1. Plant materials
2. DNA extraction and Whole-Genome shotgun sequencing
3. Chloroplast genome and 45S nrDNA assembly
4. Gene and structure annotation
5. Comparative analysis and Characterization of simple sequence repeats and large sequence repeats
6. Development and validation of molecular markers
7. Estimation of divergence time and phylogenetic analysis

RESULTS
1. Complete chloroplast genome and 45S nrDNA sequences of six Lonicera species
2. Comparative analysis of chloroplast genomes of six Lonicera species
3. Characterization of SSRs and repetitive sequences among six Lonicera species
4. Sequence variations of 45S nrDNA sequences of six Lonicera species
5. Divergence time estimation and phylogenetic relationship among six Lonicera species
6. Phylogenetic relationship within Dipsacales

DISCUSSION
1. Complete chloroplast genome and 45S nrDNA sequences of six Lonicera species derived from low-coverage whole-genome NGS data
2. Comparative analysis of chloroplast genome and 45S nrDNA sequences among six Lonicera species
3. Repetitive sequences in Lonicera chloroplast genomes
4. Divergence and phylogenetic analysis based on chloroplast genome and 45S nrDNA sequences of the Lonicera species
5. Development of molecular markers for authentication of six Lonicera species

REFERENCES
ABSTRACT IN KOREAN


LIST OF TABLES

Table 1. Statistics of WGS and assembly summary for six Lonicera species
Table 2. Summary of SNPs and InDels found in chloroplast genomes among the six Lonicera species
Table 3. SSRs comparison in chloroplast genomes of six Lonicera species
Table 4. CNVs of TR units in chloroplast genomes among six Lonicera species
Table 5. Summary of SNPs and InDels found in 45S nrDNA sequences among the six Lonicera species
Table 6. Summary of SNPs and InDels found between three hetero types of L. insularis 45S nrDNA sequences
Table 7. Summary of SNPs and InDels found between three hetero types of L. sachalinensis 45S nrDNA sequences
Table 8. Summary of SNPs and InDels found between three hetero types of L. maackii 45S nrDNA sequences
Table 9. Mean Ks values and estimated divergence time of six Lonicera species
Table 10. Median Ks values and estimated divergence time of six Lonicera species
Table 11. Information of developed molecular markers in this study for six Lonicera discrimination
Table 12. Marker combinations for each Lonicera species


LIST OF FIGURES

Figure 1. Complete chloroplast genome map of six Lonicera species
Figure 2. Complete 45S nrDNA sequence assembly of six Lonicera species and polymorphic sites for each species
Figure 3. Comparison of chloroplast genome sequences of six Lonicera species using mVISTA program with L. insularis as a reference
Figure 4. Comparison of the border positions of LSC, SSC and IR regions across six Lonicera plastid genomes
Figure 5. Repeat structure analysis in six Lonicera chloroplast genomes
Figure 6. Summary of Ka and Ks values among the 77 conserved protein-coding genes in six Lonicera species
Figure 7. Ka and Ks values of candidate genes involved in positive selection between six Lonicera species
Figure 8. Phylogenetic tree and divergence time of six Lonicera species
Figure 9. Phylogenetic analysis of Lonicera species with related species in Dipsacales
Figure 10. Validation of seven developed molecular markers from InDel and


LIST OF ABBREVIATIONS

45S nrDNA   45S nuclear ribosomal DNA
CNV         Copy number variation
CTAB        Cetyltrimethylammonium bromide
IGS         Intergenic spacer
InDel       Insertions or Deletions
IR          Inverted repeat
ITS         Internal transcribed spacer
LSC         Large single copy
NGS         Next generation sequencing
PE          Paired-end
rRNA        ribosomal RNA
SSC         Small single copy
SNP         Single nucleotide polymorphism
TR          Tandem repeat
WGS         Whole genome sequence
dnaLCW      de novo assembly of Low-Coverage Whole genome sequence
SSR         Simple sequence repeat
TRF         Tandem Repeats Finder


INTRODUCTION

Lonicera is the largest genus in Caprifoliaceae and is divided into two subgenera, Caprifolium and Lonicera. It comprises approximately 210 species that are mainly distributed in East Asia: about 100 species occur in China, 25 in Japan and 30 in Korea. Many Lonicera species have been widely used as herbal medicines and ornamental shrubs, and they contain loniceroside, a triterpenoid saponin known for its anti-inflammatory effects (Son et al. 1994; Liu et al. 2012).

For example, L. japonica, called golden-and-silver honeysuckle or Japanese honeysuckle, has been widely used in traditional herbal medicine (Peng et al. 2000), and its flower bud also has been prescribed to treat some infectious diseases due to its anti-inflammatory and antiviral effects (Chang and Hsu 1992). These effects come from many active compounds identified from the stems and leaves of L. japonica (Shang et al. 2011).

L. insularis and L. vesicaria are Korean endemic ornamental shrubs, and their flower color changes from white to yellow. Previous studies have reported that a new compound, argininosecologanin, was identified from the roots of L. insularis (Kang et al. 2018), and L. vesicaria contains many antioxidant compounds such as anthocyanin and flavonoids (Lee et al. 2016).

L. maackii is a woody perennial shrub that grows up to 5 m in height and sprouts early in spring. It is native to Northeastern Asia and has been widely used for ornamental purposes. In the late 1800s, L. maackii was introduced to the Eastern United States (Forman 2011). However, it has since been treated as an invasive plant because its allelochemicals can suppress seed germination of other plants (Bauer et al. 2012).

Cytogenetic studies have reported diverse ploidy levels in Lonicera species. Most Lonicera species have a conserved basic chromosome number of x = 9, with 2n = 18 to 54 (Ammal and Saunders 1952; Chen et al. 2017).


The chloroplast is a plant-specific organelle that conducts photosynthesis and carbon fixation. In most higher plants, the chloroplast genome is a double-stranded circular DNA molecule of 120 to 217 kb that exists in high copy number. Chloroplast genomes are generally highly conserved and have a quadripartite structure with one large single copy (LSC) region, one small single copy (SSC) region and two inverted repeat (IR) regions (Palmer 1985).

45S nuclear ribosomal DNA (45S nrDNA) units are located in the nucleolus organizer region (Goffinet et al. 2005) and are present in many copies as tandem repeats (Huang et al. 2017). A 45S nrDNA unit is composed of the 18S, 5.8S and 26S transcribed subunits, which are separated by two internal transcribed spacer (ITS) regions, ITS1 and ITS2.

Chloroplast genome and 45S nrDNA sequences are useful targets for genetic diversity studies and phylogenetic analysis because they show many polymorphisms at the inter-species level but few at the intra-species level (Kim et al. 2015). In addition, molecular markers derived from the chloroplast genome can be efficient tools for identifying plant species because of its highly conserved genes and maternal inheritance (Kim et al. 2015). Many molecular marker and phylogenetic studies have been based on intergenic spacer (IGS) regions in the chloroplast genome and ITS regions in the 45S nrDNA sequence (Kim Y-D and Kim 1999; Theis et al. 2008; Jeong et al. 2014). Furthermore, chloroplast genomes have been used to elucidate the history of plant evolution owing to their low rate of nucleotide substitution (Wolfe et al. 1989; Wilson et al. 1990).

Although some studies of divergence and phylogenetic relationships in Lonicera using molecular markers derived from a few chloroplast and nuclear DNA sequences have been reported (Theis et al. 2008; Smith 2009; Nakaji et al. 2015), studies of the genetic diversity and taxonomic classification of Lonicera species are still limited.

With the rapid development of next-generation sequencing technologies, more than 1,500 complete chloroplast genomes have become available in GenBank (https://www.ncbi.nlm.nih.gov/genbank/). Of these, only one chloroplast genome sequence from the genus Lonicera has been reported so far (He et al. 2017). This study was therefore conducted to generate the complete chloroplast genomes and 45S nrDNA sequences of six Lonicera species: L. insularis, L. sachalinensis, L. praeflorens, L. maackii, L. vesicaria and L. japonica. Through comparative analyses, I present the genetic diversity of the six Lonicera species based on chloroplast genome and 45S nrDNA sequences, and develop molecular markers based on InDels and SNPs in the chloroplast genomes for the identification of each species.


MATERIALS AND METHODS

1. Plant materials

The six Lonicera species were provided by the Medicinal Plant Garden, College of Pharmacy, Seoul National University, Koyang, Korea.

2. DNA extraction and Whole-Genome shotgun sequencing

Leaves and roots of each species were stored at -70℃ until use. Leaves or roots were ground with a mortar and pestle in liquid nitrogen, and total genomic DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) protocol (Allen et al. 2006). The quality and concentration of the extracted DNA were measured by agarose gel electrophoresis and a UV spectrophotometer (Thermo Scientific NanoDrop ND-1000), respectively. Paired-end (PE) libraries were sequenced on an Illumina MiSeq genome analyzer by Lab Genomics Inc., Seongnam, Korea, according to the manufacturer's standard protocol. Whole genome shotgun (WGS) sequencing data of 2.7-4.1 Gbp were generated for the six Lonicera species.


3. Chloroplast genome and 45S nrDNA assembly

Chloroplast genomes and 45S nrDNA units were assembled by the de novo assembly of Low-Coverage Whole genome sequence (dnaLCW) method using the CLC genome assembler program (ver. 4.06 beta, CLC Inc, Aarhus, Denmark) (Kim K et al. 2015), followed by manual curation. In summary, raw PE reads were trimmed with an offset value of 33, and the trimmed reads were assembled with an overlap distance ranging from 150 to 500 bp and a window size of 32 for the chloroplast genome and 64 for 45S nrDNA. The initial contigs were extracted from the assembly using MUMmer by mapping to the reference chloroplast sequence, KJ170923. The extracted contigs were then ordered and merged into a single draft sequence by comparison with the same reference sequence.

4. Gene and structure annotation

The genes in chloroplast genome were annotated using DOGMA program (http://dogma.ccbb.utexas.edu/) (Wyman et al. 2004) and GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq.html) (Tillich et al. 2017), and then manually curated based on BLAST searches. The chloroplast circular maps were generated using OGDRAW (http://ogdraw.mpimp-golm.mpg.de/) program (Lohse et al. 2007). The structure of 45S nrDNA unit was predicted by RNAmmer (http://www.cbs.dtu.dk/services/RNAmmer/) (Lagesen et al. 2007).


5. Comparative analysis and Characterization of simple sequence repeats and large sequence repeats

Comparative analysis of the complete chloroplast genomes and 45S nrDNA sequences of the six Lonicera species was conducted using an in-house script, MAFFT (Katoh and Standley 2013) and the mVISTA program to identify inter-species polymorphisms.
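The in-house script is not described in the text, so the following is only a minimal sketch of how SNPs and InDels could be counted from two rows of an alignment such as MAFFT output. The function name `count_variants` and the rule that a run of consecutive gap columns counts as one InDel event are assumptions made for illustration, not the author's method.

```python
def count_variants(aligned_a: str, aligned_b: str) -> tuple[int, int]:
    """Count SNPs and InDel events between two aligned sequences.

    Both strings must come from the same alignment (equal length, '-' for gaps).
    Assumption: a run of consecutive gap columns on one side is one InDel event.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    snps = 0
    indels = 0
    previous_gap_side = None  # 'a', 'b', or None when the last column had no gap

    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" and base_b == "-":
            # Gapped in both rows (possible in a multiple alignment): ignore.
            continue
        if base_a == "-" or base_b == "-":
            gap_side = "a" if base_a == "-" else "b"
            if gap_side != previous_gap_side:
                indels += 1  # a new gap run starts here
            previous_gap_side = gap_side
            continue
        previous_gap_side = None
        if base_a != base_b:
            snps += 1  # mismatch between two aligned bases

    return snps, indels


if __name__ == "__main__":
    # Toy alignment, not data from this study.
    print(count_variants("ACGT--ACGT", "ACCTTTAC-T"))
```

Counting gap runs instead of gap columns keeps a single multi-base insertion from being reported as many separate InDels.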

Simple sequence repeats (SSRs) were identified using microsatellite search module, MISA (http://pgrc.ipk-gatersleben.de/misa/) (Thiel et al. 2003) with thresholds of ten repeat units for mononucleotides SSRs, five repeat units for di-, four repeat units for tri- and three repeat units for tetra-, penta- and hexanucleotides SSRs.
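The thresholds above can be illustrated with a short regular-expression search. This is a simplified sketch and not MISA itself: the function name `find_ssrs` is hypothetical, only uppercase A/C/G/T are considered, and compound or interrupted repeats are ignored.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(sequence: str) -> list[tuple[int, str, int]]:
    """Return (0-based start, motif, repeat count) for each perfect SSR."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures a motif; \1{n-1,} requires tandem copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence, not data from this study.
    toy = "GC" + "A" * 12 + "GC" + "AT" * 6 + "GGA" + "TTC" * 4 + "G"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Filtering out motifs that are themselves built from a shorter unit prevents a long mononucleotide run from also being reported as a di-, tri- or tetranucleotide SSR.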

Repeats, including tandem, dispersed, complementary and palindromic repeats, were investigated using REPuter (Kurtz et al. 2001) and Tandem Repeats Finder (TRF) (Benson 1999). For REPuter, the minimum repeat size was 10 bp and the repeat identity was ≥80%; for TRF, the match, mismatch and InDel parameters were 2, 7 and 7, respectively. All identified repeats were manually curated.

6. Development and validation of molecular markers

To validate the inter-species polymorphisms derived from the chloroplast genomes and to authenticate the six Lonicera species, co-dominant and dominant markers were designed from polymorphic regions such as InDels and SNPs using the Primer-BLAST program (Ye et al. 2012). PCR amplification was performed as follows: 7 min at 94℃; 35 cycles of 94℃ for 20-30 sec, 58-64℃ for 20-30 sec and 72℃ for 20-30 sec; and a final extension at 72℃ for 7 min. The PCR products were then separated by agarose gel electrophoresis to identify polymorphisms.


7. Estimation of divergence time and phylogenetic analysis

To elucidate the phylogenetic relationship among six Lonicera species, we analyzed not only the complete chloroplast genomes, but also the complete 45S nrDNA sequences of six Lonicera species.

In addition, the phylogenetic relationship with related species in Dipsacales was investigated using complete chloroplast genomes. Complete chloroplast genome sequences of eight additional species (L. japonica Chinese, Patrinia saniculifolia, Kolkwitzia amabilis, Viburnum utile, Sambucus williamsii, Sinadoxa corydalifolia, Adoxa moschatellina, Tetradoxa omeiensis) were retrieved from GenBank and used for the analysis. All phylogenetic trees were generated by the neighbor-joining method with 1,000 bootstrap replicates in the MEGA6.0 program (Tamura et al. 2013).

Divergence time was calculated from Ks values. Ka and Ks are the rates of non-synonymous and synonymous substitutions per site, respectively. The 77 protein-coding genes conserved in the chloroplast genomes of the six Lonicera species were extracted and concatenated. Mean and median values of Ka and Ks were calculated with the PAML program by pairwise comparison of the shared protein-coding genes of the six species. Divergence time was estimated as T = Ks/2λ, where λ = 1.0 × 10⁻⁹ is the substitution rate per site per year.
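As a worked illustration of the formula T = Ks/2λ, the sketch below converts a Ks value into a divergence time in million years ago (MYA). The function name and the example Ks value of 0.02 are hypothetical and are not taken from the results of this study.

```python
SUBSTITUTION_RATE = 1.0e-9  # lambda from the text: substitutions per site per year


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Return the divergence time T = Ks / (2 * lambda), expressed in MYA."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate)  # divergence time in years
    return years / 1.0e6       # convert years to million years


if __name__ == "__main__":
    # Hypothetical Ks value chosen only to show the arithmetic.
    print(divergence_time_mya(0.02))
```

The factor of 2 reflects that substitutions accumulate independently along both lineages after they split.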


RESULTS

1. Complete chloroplast genome and 45S nrDNA sequences of six Lonicera species

Whole genome sequencing data of the six Lonicera species were generated on the Illumina MiSeq platform and ranged from 2.7 to 4.1 Gbp per species. Complete chloroplast genome and 45S nrDNA sequences of the six Lonicera species were obtained by the dnaLCW method (Table 1).

The complete chloroplast genomes were assembled by combining the primary chloroplast genome sequence contigs from WGS data for each of the species. The complete chloroplast genomes of six Lonicera species in single molecule were successfully obtained by combining three to four initial contigs, and manually curated. The complete chloroplast genome sequences of the six Lonicera species ranged from 154,892 to 155,318 bp in length, and they showed typical quadripartite structure with the large single copy (LSC), small single copy (SSC) and a pair of inverted repeat (IRa and IRb) regions (Figure 1). The length of the LSC regions ranged from 88,229 to 89,202 bp, and the SSC regions ranged from 18,612 to 18,929 bp, and two inverted repeat regions ranged from 23,718 to 24,060 bp. The average coverage of raw reads for chloroplast genome assembly ranged from 134.0× to 784.0×.
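Because the quadripartite structure consists of one LSC, one SSC and two IR copies, the reported genome size should equal LSC + SSC + 2 × IR. The short check below applies this to the region lengths listed in Table 1; the dictionary layout and function names are illustrative assumptions.

```python
# (LSC, SSC, IR, reported genome size) in bp, copied from Table 1.
REGIONS = {
    "L. insularis": (88230, 18774, 24060, 155124),
    "L. sachalinensis": (88229, 18774, 24060, 155123),
    "L. praeflorens": (88353, 18929, 23805, 154892),
    "L. maackii": (89202, 18680, 23718, 155318),
    "L. vesicaria": (89096, 18612, 23737, 155182),
    "L. japonica": (88853, 18653, 23777, 155060),
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def check_all(regions: dict) -> dict:
    """Map each species to True if the region lengths add up to the genome size."""
    return {
        species: quadripartite_length(lsc, ssc, ir) == total
        for species, (lsc, ssc, ir, total) in regions.items()
    }


if __name__ == "__main__":
    for species, consistent in check_all(REGIONS).items():
        print(species, "consistent" if consistent else "MISMATCH")
```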

The 45S nrDNA sequence comprised the 18S, 5.8S and 26S genes with internal transcribed spacer sequences (ITS1 and ITS2) between them, and an intergenic spacer (IGS) region. The IGS region could not be assembled in this study because of large gaps at GC-rich regions, as previously reported (Kim et al. 2015). The complete 45S nrDNA of each species consisted of one or two contigs. The length of the 45S nrDNA sequences of the six Lonicera species ranged from 5,832 to 5,836 bp. L. vesicaria and L. japonica had only one type of 45S nrDNA sequence, whereas heterotypes of 45S nrDNA sequences were confirmed in L. insularis, L. sachalinensis, L. praeflorens and L. maackii, with 3, 7, 2 and 2 types, respectively (Figure 2). We categorized major and minor types by considering the mapping depth per position and the polymorphic sites within reads. The average coverage of raw reads for 45S nrDNA sequence assembly ranged from 155.2× to 536.5×.


Table 1. Statistics of WGS and assembly summary for six Lonicera species

Feature                     L. insularis      L. sachalinensis  L. praeflorens    L. maackii        L. vesicaria      L. japonica

Sequencing information
No. of raw reads            4,941,334         4,764,738         4,342,742         4,920,926         5,596,064         6,308,194
No. of trimmed reads        4,662,540         4,339,126         4,024,338         4,640,648         4,712,150         5,029,201
No. of trimmed bases        1,211,552,506     1,098,408,065     1,040,146,882     1,188,483,775     1,164,886,321     1,178,414,508

Chloroplast genome
Average read depth          634.83            214.83            165.39            784.00            134.00            668.84
Genome size (bp)            155,124           155,123           154,892           155,318           155,182           155,060
Large single copy (bp)      88,230            88,229            88,353            89,202            89,096            88,853
Small single copy (bp)      18,774            18,774            18,929            18,680            18,612            18,653
Inverted repeat (bp)        24,060            24,060            23,805            23,718            23,737            23,777
Number of genes             114               114               114               114               113               109
Protein-coding genes        80                80                80                80                79                77
Structure RNAs              34                34                34                34                34                32
GC contents (%)             38.35             38.34             38.31             38.47             38.39             38.59

45S nrDNA
Average read depth          371.52            204.90            446.42            536.46            155.21            113.84
Coding region length (bp)   5,834             5,832             5,836             5,833             5,832             5,835
18S (bp)                    1,809             1,809             1,809             1,809             1,809             1,809
ITS1 (bp)                   230               228               229               229               228               228
5.8S (bp)                   164               164               164               164               164               164
ITS2 (bp)                   232               232               235               232               232               232


Figure 1. Complete chloroplast genome maps of six Lonicera species. Chloroplast genome maps were generated by OGDRAW. Genes shown on the outside of the map are transcribed clockwise, whereas genes on the inside are transcribed counter-clockwise. The four regions of the chloroplast genome and the GC content are indicated on the inner circle. Blue and red bars in the inner circle represent SNP and InDel variations among the six Lonicera species, respectively. The gene marked with a black star is not present in L. vesicaria and L. japonica, and the gene marked with a blue star is not present in L. japonica.


Figure 2. Complete 45S nrDNA sequence assembly of six Lonicera species and polymorphic sites for each species. (A) L. insularis, (B) L. sachalinensis, (C) L. praeflorens, (D) L. maackii, (E) L. vesicaria, (F) L. japonica. (a) Mapping depth of raw PE reads on the assembled 45S nrDNA and GC content; the window size is 100 bp. (b) Polymorphic regions between heterotype sequences.


2. Comparative analysis of chloroplast genomes of six Lonicera species

Gene annotation revealed that L. insularis, L. sachalinensis, L. praeflorens and L. maackii each contain a total of 114 genes: 80 protein-coding, 30 transfer RNA (tRNA) and 4 ribosomal RNA (rRNA) genes. L. vesicaria and L. japonica contain 113 and 112 genes, respectively: 79 and 78 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Some genes were pseudogenized: ycf15 in all six Lonicera species, accD in L. vesicaria and L. japonica, and rpoA in L. japonica.

To investigate genetic diversity of chloroplast genome of six Lonicera species, multiple alignment was performed. We identified 17~2,261 SNPs and 5~278 InDels between species (Table 2). The lowest numbers of SNPs and InDels (17 and 5) were identified between L. insularis and L. sachalinensis; meanwhile, the highest numbers of SNPs (2,261) were identified between L. vesicaria and L. japonica and the highest numbers of InDels (278) were identified between L. insularis and L. japonica (Table 2). The sequence identity plot of six chloroplast genomes was constructed by mVISTA program, using the L. insularis annotation as a reference (Figure 3). The six Lonicera species showed high similarity with each other (97.4%), and genic regions were more conserved than intergenic regions, as we expected.

We also compared the borders of the LSC, SSC and two IR regions among the six chloroplast genomes (Figure 4). The rpl23 gene spanned the LSC and IRB regions, with approximately 120 bp in the IR region, in all six Lonicera species. The junction of IRB and SSC was located between the trnN and ndhF genes, except in L. japonica. The distance between the trnN gene and the IRB/SSC junction ranged from 833 to 1,220 bp. The junction of SSC and IRA was located between the ycf1 and trnN genes, at distances of 208 to 523 bp and 832 to 1,219 bp, respectively. The trnH gene was located exactly at the border of IRA and LSC in five Lonicera species, whereas L. praeflorens had a 20 bp gap between trnH and the IRA/LSC junction.


Table 2. Summary of SNPs and InDels found in chloroplast genomes among the six Lonicera species

Values above the diagonal are numbers of InDels; values below the diagonal are numbers of SNPs.

Species  Li     Ls     Lp     Lm     Lv     Lj
Li       /      5      246    153    247    278
Ls       17     /      246    156    245    277
Lp       1450   1439   /      227    235    271
Lm       754    743    1426   /      223    266
Lv       1550   1539   1446   1490   /      268
Lj       1964   1953   2072   1958   2261   /


Figure 3. Comparison of chloroplast genome sequences of six Lonicera species using mVISTA program with L. insularis as a reference. Genic regions were annotated by GeSeq and tRNAscan. Red and black arrowheads indicate the polymorphic regions used for molecular marker development: red arrowheads mark regions for InDel markers and the black arrowhead marks the region for the SNP marker. Li, L. insularis; Ls, L. sachalinensis; Lp, L. praeflorens; Lm, L. maackii; Lv, L. vesicaria; Lj, L. japonica.


Figure 4. Comparison of the border positions of LSC, SSC and IR regions across six Lonicera plastid genomes. The arrow boxes


3. Characterization of SSRs and repetitive sequences among six Lonicera species

Comparative analyses of repeats were carried out with only one IR region to avoid redundancy. Copy number variations of SSRs were identified among the chloroplast genomes of the six Lonicera species (Table 3). The longest SSRs were hexanucleotide repeats of 24 bp. The most abundant SSRs were A/T mononucleotide repeats. The lowest number of SSRs (47) was found in L. sachalinensis and L. japonica, and the highest (50) in L. praeflorens. L. japonica contained the highest number of mononucleotide SSRs (32) but no pentanucleotide SSRs. L. insularis, L. sachalinensis and L. praeflorens each had 5 dinucleotide SSRs, fewer than L. maackii (7) and more than L. vesicaria and L. japonica (4 each). L. maackii had one trinucleotide SSR, fewer than all other Lonicera species (2 each). L. insularis, L. sachalinensis and L. praeflorens each had 8 tetranucleotide SSRs, fewer than L. vesicaria (9) and L. maackii (10) and more than L. japonica (7). Pentanucleotide SSRs were found only in L. praeflorens (4) and L. vesicaria (1). L. maackii and L. vesicaria each had one hexanucleotide SSR, fewer than L. insularis, L. sachalinensis and L. japonica (2 each).

Repeat sequences were grouped into four types: tandem, dispersed, palindromic and complement. A total of 32 to 55 repetitive sequences were identified in the chloroplast genome of each Lonicera species (Figure 5A), comprising tandem (68.8%), dispersed (18.0%), palindromic (11.7%) and complement (1.5%) repeats (Figure 5C). Repeat length ranged from 10 to 77 bp across the chloroplast genomes of the six Lonicera species (Figure 5B). The longest repeat was a tandem repeat found in L. vesicaria. Dispersed and palindromic repeats ranged from 18 to 74 bp and from 21 to 30 bp, respectively. Most repeats were located in intergenic spacer (IGS) regions (55.64%) and coding sequence (CDS) regions (35.34%) (Figure 5D). Some repeats were found in intron regions (9.02%).


We further characterized the copy number variation (CNV) of tandem repeat (TR) units, which are important genomic resources for genetic diversity analysis (Table 4). A total of 52 TRs were identified, ranging from 10 to 77 bp in length. Of the 52 TR sequences, 23 were located in genic regions and 29 in intergenic regions. Three of those in genic regions were found in introns. Among the 52 TRs, 14 did not show any copy number variation among the six Lonicera species, whereas 38 showed polymorphisms: 6 were unique to L. insularis and L. sachalinensis (TR4, TR7, TR25, TR39, TR43, TR35); 3 were unique to L. praeflorens (TR9, TR47, TR48); 3 were unique to L. maackii (TR23, TR24, TR49); 8 were unique to L. vesicaria (TR1, TR8, TR10, TR12, TR19, TR21, TR27, TR38); 6 were unique to L. japonica (TR11, TR28, TR42, TR44, TR50, TR52); and 12 showed diverse patterns among the six Lonicera species (TR3, TR13, TR14, TR15, TR16, TR17, TR20, TR32, TR33, TR36, TR46, TR51).
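The grouping of TRs by copy-number pattern can be sketched as below, using copy numbers in the species order Li, Ls, Lp, Lm, Lv, Lj as in Table 4. The rule used here (a minority of species sharing one deviating value is "specific"; anything else is "diverse") is an assumption for illustration and may not match the manual classification above in every case.

```python
from collections import Counter

SPECIES = ("Li", "Ls", "Lp", "Lm", "Lv", "Lj")


def classify_tr(copies: tuple) -> str:
    """Label a TR as 'conserved', 'specific to ...' or 'diverse'.

    copies: copy numbers in SPECIES order.
    """
    if len(copies) != len(SPECIES):
        raise ValueError("expected one copy number per species")

    ranked = Counter(copies).most_common()  # distinct values, most frequent first
    if len(ranked) == 1:
        return "conserved"
    if len(ranked) == 2 and ranked[0][1] != ranked[1][1]:
        # Exactly two values and a clear minority: report the minority species.
        minority_value = ranked[1][0]
        minority = [sp for sp, n in zip(SPECIES, copies) if n == minority_value]
        return "specific to " + ", ".join(minority)
    return "diverse"


if __name__ == "__main__":
    # Copy numbers taken from Table 4.
    examples = {
        "TR1": (1, 1, 1, 1, 2, 1),
        "TR2": (2, 2, 2, 2, 2, 2),
        "TR4": (2, 2, 1, 1, 1, 1),
        "TR14": (5, 5, 7, 1, 1, 2),
    }
    for name, copies in examples.items():
        print(name, classify_tr(copies))
```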


Table 3. SSRs comparison in chloroplast genomes of six Lonicera species

Motif            SSR units       Li  Ls  Lp  Lm  Lv  Lj
Mononucleotide   A/T             31  30  30  29  30  29
                 C/G             -   -   1   -   1   3
Dinucleotide     AT/AT           2   2   3   2   2   2
                 TA/TA           2   2   2   3   2   2
                 GA/TC           1   1   -   1   -   -
                 TC/GA           -   -   -   1   -   -
Trinucleotide    TTC/GAA         1   1   1   1   1   1
                 AAT/CTT         1   1   -   -   -   -
                 TTG/CAA         -   -   1   -   1   -
                 ATA/TAT         -   -   -   -   -   1
Tetranucleotide  AGAT/ATCT       1   1   1   2   1   1
                 ATAA/TTAT       2   2   2   2   2   2
                 CAAT/GTTC       1   1   1   1   1   1
                 TATC/GATA       1   1   1   1   1   1
                 TCTT/AAGAC      1   1   1   1   1   1
                 TTAA/TTAA       1   1   1   1   1   1
                 ATTT/AAAT       1   1   -   -   -   -
                 TTTA/TAAA       -   -   1   1   -   -
                 TCTA/TAGA       -   -   -   1   -   -
                 AAAT/ATTT       -   -   -   -   2   -
Pentanucleotide  TATTA/TAATA     -   -   3   -   -   -
                 TATAT/ATATA     -   -   1   -   -   -
                 TATTC/GAATA     -   -   -   -   1   -
Hexanucleotide   CTTACC/GGTAAG   1   1   -   -   1   -
                 TGTTTA/TAAACA   1   1   -   -   -   -
                 TATGGA/TCCATA   -   -   -   1   -   -
                 ATTCCA/TGGAAT   -   -   -   -   -   1
                 GGATAG/CTATGG   -   -   -   -   -   1
Total SSRs                       48  47  50  48  48  47


Figure 5. Repeat structure analysis in six Lonicera chloroplast genomes. (A) Number of four repeat types in each Lonicera species chloroplast genome. (B) Frequency of repeat sequences by length. (C) Frequency of all repeat types. (D) Location distribution of all the repeats.


Table 4. CNVs of TR units in chloroplast genomes among six Lonicera species

Marker

name No. TR unit sequence

Length (bp) Copy number Position Li Ls Lp Lm Lv Lj TR1 AAAGTTTCCTATTTCTAC 18 1 1 1 1 2 1 rps16-trnQ(UUG) TR2 CTTTCTACTACTAAT 15 2 2 2 2 2 2 trnC(GCA)-petN TR3 AATAAAAAATATAG 14 1 1 0 2 1 3 trnE(UUC)-trnT(GGU) TR4 AATACTACATTATCATCTCCATTGTATTTAAATCGACAAA 40 2 2 1 1 1 1 trnT(GGU)-psbD

TR5 ATGTAATAACTAGATAAATC 20 2 2 2 2 2 2 rps4-trnT(UGU) TR6 TTAGCTACTCATAA 14 3 3 3 3 3 3 trnT(UGU)-trnL(UAA) TR7 CTCCCTAATTATTTATCCT 19 2 2 1 1 1 1 trnL(UAA)-trnF(GAA) TR8 TAATTGAATTTCAATTAAA 19 1 1 1 1 2 1 rbcL-accD

TR9 TCCCCCTCTAATTCAAATGAGTGGTTTTGTGGGAAAAGGGGATTCAAAGAAAGAA 55 1 1 3 1 1 1 rbcL-accD TR10 TATTCTATTTTCTTCTTTAATATTCGATCAA ATTACATATAAAAAGAATATCTTTGTAATTT GATTAAAAAAAAAAG 77 1 1 1 1 2 1 rbcL-accD TR11 GACTCTGAAAGCGATCCTGAGGAGGGTAAC GATAACCCGTTCCAT 45 1 1 0 1 1 2 accD TR12 CGCCTTGAAGCAGATAGACGTTATCGGGAG GGTTACTCTGGTGCTCCTGACGATGAAGTT ACTGAGGAGATT 72 1 1 1 1 2 1 accD TR13 GGGATAGGGATAAGGATA 18 0 0 0 9 1 2 accD TR14 CTGACTATGGAAGTGATACGGATGGCC 27 5 5 7 1 1 2 accD TR15 GGATTACCTTCAAAAAAAAAGAAATCCTGGGGG 33 1 1 3 2 2 2 accD

Lo_i_03 TR16 CTATGGAAGTAATCCTCAGAAGGATGACCCTAG 33 2 2 5 4 3 3 accD

TR17 CTTAATTAAGAATATTAA 18 2 2 1 2 1 1 accD-psaI

(36)

27

Table 4. Continued

TR19  ATTTAATTAAAA  12  1 1 1 1 2 1  trnP(UGG)-psaJ  (marker Lo_i_04)
TR20  ATAAAAGTAATATATAAAAAAAG  23  2 2 3 2 2 1  trnP(UGG)-psaJ  (marker Lo_i_04)
TR21  AAATCCAAGCGACCCTTTCTG  21  5 5 5 5 4 5  rps18
TR22  AACGGCCTTTCCAGCCGAGAT  21  2 2 2 2 2 2  rps18-rpl20
TR23  CCGAACTCAA  10  1 1 0 2 1 1  rps18-rpl20
TR24  TATTCTATTAAACTTGG  17  1 1 1 2 1 1  rps12-clpP
TR25  AATAAAGAAAACAAATAAGA  20  2 2 1 1 1 1  petB-petB
TR26  AAAAGAAAATCCAGTCAA  18  2 2 2 2 2 2  petD-petD
TR27  TCTGAATCTATT  12  2 2 2 2 1 2  rps8-rpl14
TR28  TTTCCTTTCAGTCTATT  17  1 1 1 1 1 2  rpl14-rpl16
TR29  TTAAGATATATCTTGAAT  18  2 2 2 2 2 2  rpl16-rpl16
TR30  AATTCTTCTTGTAAATTCCTCTTT  24  2 2 2 2 2 2  rps3
TR31  TAATCTATTTTTAT  14  2 2 2 2 2 2  rpl22-rps19
TR32  ATATCATAAAGAA  13  2 2 17 2 2 2  trnI(CAU)-ycf2
TR33  TTTGTCTAAGCCACTTCGTTTCTT  24  6 6 5 6 6 4  ycf2
TR34  TGATCCTCCATTTGAACCAGATGA  24  2 2 2 2 2 2  ycf2
TR35  GATAAGAAAAGTGAA  15  2 2 2 2 2 2  ycf2
TR36  GGGGATGGGGTTGTGGAC  18  5 5 4 5 4 4  ycf2
TR37  TGATATTGATGATAGTAAGATTGATGAGAG  30  2 2 2 2 2 2  ycf2
TR38  AGGATGATGGAG  12  1 1 1 1 2 1  ycf2  (marker Lo_i_05)
TR39  AAGAGTATGAGCTTC  15  2 2 1 1 1 1  ycf2
TR40  AAGAGGATGATGAAGAGAATGAGG  24  2 2 2 2 2 2  ycf2
TR41  TCCATGGAGCTATGTGTCTAT  21  2 2 2 2 2 2  ndhB-rps7
TR42  ATGGATAAGAGGCTCGTGGGAT  22  1 1 1 1 1 2  trnV(GAC)-rrn16  (marker Lo_i_06)
TR43  ATAGAATAACATAATATCATATATAGAATAACATAATATTCT  42  6 6 0 0 0 1  trnN(GUU)-ndhF

Table 4. Continued

TR45  ATTATTATAAAGTAATATTATAATTAGATATTACTTTATAATAATCTATTTTT  53  2 2 1 1 0 0  trnN(GUU)-ndhF
TR46  TTTTTTTTACTTACCTTATT  20  2 2 3 0 2 1  ccsA-ndhD
TR47  TGATTAATACTAC  13  1 1 2 1 1 1  psaC-ndhE  (marker Lo_i_07)
TR48  GTTCGCTATTATTTATCTT  19  1 1 2 1 1 1  ndhA-ndhA
TR49  TCAGTTAGATTCTTCATTTCTTGT  24  2 2 2 1 2 2  ycf1  (marker Lo_i_08)
TR50  TTCCACTTCCTT  12  1 1 1 1 1 2  ycf1
TR51  TTTTTACAGATTTCTTTGATTCCAACC  27  1 1 2 1 2 1  ycf1
TR52  TTAGAAAATGGATCCACTTTCTGGTCAA

4. Sequence variations of 45S nrDNA sequences of six Lonicera species

The 45S nrDNA sequences of the six Lonicera species were compared by multiple sequence alignment. Comparison of the 45S nrDNA unit sequences among the six species revealed a total of 45 SNPs and 4 InDels when all 45S nrDNA types, including major and minor types, were considered (Table 5). Polymorphisms were abundant in the ITS1, ITS2 and 26S rDNA regions, and a few were found in the 18S rDNA region. In L. insularis, L. sachalinensis, L. praeflorens and L. maackii, the 45S nrDNA existed as heterotypes, with 3, 7, 2 and 2 types, respectively (Table 6, Table 7, Table 8). Polymorphisms between the major and minor types were also abundant in the ITS1 and ITS2 regions.

Table 5. Summary of SNPs and InDels found in 45S nrDNA sequences among the six Lonicera species

Feature      Total      18S   ITS1     5.8S  ITS2     26S
Length (bp)  5832-5836  1809  228-230  164   232-235  3399-3402
SNP          58         4     20       0     18       16

Table 6. Summary of SNPs and InDels found between three hetero types of L. insularis 45S nrDNA sequences

Columns give the nucleotide position (a); regions covered, in column order: ITS1, ITS2, 26S.

Type    Read depth (b)  1886  1890  1891  1895  1896  1922  1934  1941  2266  2281  2346  2436  2454  3076  4564
Type 1  371.52          G     A     C     C     C     C     A     G     C     G     T     T     G     C     T
Type 2  367.05          G     G     C     -(c)  C     C     G     G     A     G     C     C     T     G     T
Type 3  341.22          -(c)  G     T     -(c)  -(c)  G     G     C     C     T     C     C     G     C     C

(a) Nucleotide position based on the 45S nrDNA sequence of L. insularis type 1 as the reference sequence. (b) Read depth by type.

Table 7. Summary of SNPs and InDels found between seven hetero types of L. sachalinensis 45S nrDNA sequences

Columns give the nucleotide position (a); regions covered, in column order: ITS1, ITS2, 26S.

Type    Read depth (b)  1886  1890  1891  1894  1920  1939  2005  2014  2032  2255  2264  2281  2287  2344  2354  2431  2432  3434  2445  2452  2454  2455  2781  3074
Type 1  200.70          G     G     T     -(c)  C     G     C     C     T     -(c)  C     C     G     C     G     G     A     T     C     T     C     C     G     G
Type 2  187.24          G     G     T     -(c)  T     G     T     T     C     -(c)  C     C     G     C     G     G     A     T     C     T     C     C     G     G
Type 3  187.84          -(c)  G     T     -(c)  G     C     C     C     T     C     C     T     G     C     A     G     A     T     C     T     T     T     G     G
Type 4  179.53          G     A     C     C     C     G     C     C     T     -(c)  C     T     G     C     G     T     T     T     C     T     C     C     G     G
Type 5  187.30          G     A     C     C     C     G     C     C     T     -(c)  A     C     G     C     G     T     T     C     C     T     C     C     A     C
Type 6  190.87          G     A     C     C     C     G     C     C     T     -(c)  C     C     G     T     G     T     T     T     C     G     C     C     G     G
Type 7  197.69          G     G     C     C     C     G     C     C     T     -(c)  C     C     T     C     G     G     A     T     A     T     C     C     G     G

(a) Nucleotide position based on the 45S nrDNA sequence of L. sachalinensis type 1 as the reference sequence. (b) Read depth by type.

Table 8. Summary of SNPs and InDels found between two hetero types of L. maackii 45S nrDNA sequences

Columns give the nucleotide position (a); regions covered, in column order: ITS1, ITS2, 26S.

Type    Read depth (b)  1861  1940  2395  2460  2473  5729
Type 1  536.46          T     C     G     G     C     G
Type 2  483.24          C     T     A     T     T     A

(a) Nucleotide position based on the 45S nrDNA sequence of L. maackii type 1 as the reference sequence. (b) Read depth by type.

5. Divergence time estimation and phylogenetic relationship among six Lonicera species

Based on the conserved protein-coding sequences of the six chloroplast genomes, the mean and median Ks values were 0.0001 ~ 0.0223 and 0.00 ~ 0.0192, respectively (Table 9, Table 10). The lowest Ks value was found between L. insularis and L. sachalinensis, whereas the highest Ks value was found between L. vesicaria and L. japonica. At the gene level, the lowest and highest average Ks values were detected in the ndhB and petL genes, at 0.0007 and 0.0596, respectively. A Ka/Ks ratio greater than 1 was detected in the psaJ gene, and the rbcL, rps18 and ycf2 genes showed Ka/Ks ratios over 0.8 for both mean and median values (Figure 6, Figure 7). The psbI, psbZ, psbL and petG genes showed Ka and Ks values of 0, indicating that they are highly conserved among the six Lonicera chloroplast genomes.
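
The gene-level screening described above can be illustrated with a short Python sketch. It assumes that per-gene Ka and Ks values are already available (for example, parsed from PAML output). The numbers and gene labels below are hypothetical placeholders, not values from this study, and the function name `classify_gene` is illustrative. Only the cut-offs (Ka/Ks over 1, Ka/Ks over 0.8, and Ka = Ks = 0) are taken from the text.

```python
def classify_gene(ka, ks, high=1.0, moderate=0.8):
    """Classify a gene by its Ka/Ks ratio using the cut-offs named in the text."""
    if ka == 0 and ks == 0:
        return "conserved"        # no substitution of either kind
    if ks == 0:
        return "undefined"        # the ratio cannot be computed when Ks is 0
    ratio = ka / ks
    if ratio > high:
        return "Ka/Ks > 1"
    if ratio > moderate:
        return "Ka/Ks > 0.8"
    return "below threshold"

# Hypothetical (Ka, Ks) pairs, for illustration only
example = {
    "geneA": (0.012, 0.010),
    "geneB": (0.009, 0.010),
    "geneC": (0.001, 0.010),
    "geneD": (0.0, 0.0),
}
labels = {gene: classify_gene(ka, ks) for gene, (ka, ks) in example.items()}
print(labels)
```

Treating Ks = 0 separately avoids a division by zero and keeps fully conserved genes distinct from genes whose ratio is simply undefined.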

The phylogenetic relationship of the six Lonicera species was examined by comparative analysis of the complete chloroplast genomes and the 45S nrDNA sequences (Figure 8). Complete chloroplast genome and all 45S nrDNA sequences including major and minor types were used for phylogenetic analysis.

The phylogenetic tree based on chloroplast genomes revealed that L. japonica was the most divergent species and formed an independent group. L. insularis and L. sachalinensis were the closest, and they belonged to the same subgroup as L. maackii. L. praeflorens and L. vesicaria were classified into another subgroup.

Based on the protein-coding sequences, the divergence time between L. insularis and L. sachalinensis was estimated at approximately 0 ~ 0.03 MYA (Figure 8A), that between L. praeflorens and L. vesicaria at 6.50 ~ 7.61 MYA, and that between L. japonica and the other species at 9.19 ~ 10.74 MYA; each lineage is considered to have speciated independently thereafter.

The phylogenetic tree based on 45S nrDNA sequences showed a pattern similar to that obtained from the chloroplast genomes, but it was more complicated (Figure 8B). L. insularis and L. sachalinensis were the closest and belonged to the same subgroup as L. maackii, consistent with the phylogenetic analysis based on chloroplast genomes. L. vesicaria and L. japonica were classified into another subgroup.

Table 9. Mean Ks values and estimated divergence time of six Lonicera species

Upper triangle: divergence time (MYA) (b); lower triangle: mean Ks values (a)

Species  Li      Ls      Lp      Lm      Lv      Lj
Li       -       0.03    7.95    4.14    7.88    10.49
Ls       0.0001  -       7.89    4.08    7.81    10.41
Lp       0.0159  0.0158  -       7.33    7.61    11.44
Lm       0.0083  0.0082  0.0147  -       7.45    10.20
Lv       0.0158  0.0156  0.0152  0.0149  -       11.14
Lj       0.0210  0.0208  0.0229  0.0204  0.0223  -

Abbreviations: Li, L. insularis; Ls, L. sachalinensis; Lp, L. praeflorens; Lm, L. maackii; Lv, L. vesicaria; Lj, L. japonica

(a) Mean Ks values between common protein-coding genes of each species, calculated using the PAML program. (b) Divergence time was estimated by Ks/2λ (λ = 1.0 × 10^-9).

Table 10. Median Ks values and estimated divergence time of six Lonicera species

Upper triangle: divergence time (MYA) (b); lower triangle: median Ks values (a)

Species  Li      Ls      Lp      Lm      Lv      Lj
Li       -       0.00    7.55    3.50    6.00    9.05
Ls       0.0000  -       7.50    3.50    6.00    9.05
Lp       0.0151  0.0150  -       6.80    6.50    9.20
Lm       0.0070  0.0070  0.0136  -       5.30    9.05
Lv       0.0120  0.0120  0.0130  0.0106  -       9.60
Lj       0.0181  0.0181  0.0184  0.0181  0.0192  -

Abbreviations: Li, L. insularis; Ls, L. sachalinensis; Lp, L. praeflorens; Lm, L. maackii; Lv, L. vesicaria; Lj, L. japonica

(a) Median Ks values between common protein-coding genes of each species, calculated using the PAML program. (b) Divergence time was estimated by Ks/2λ (λ = 1.0 × 10^-9).
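
The conversion in footnote b can be checked with a few lines of arithmetic. As a minimal sketch (the function name and the choice of example pairs are illustrative, not part of the original analysis), the Python snippet below applies T = Ks/2λ to three mean Ks values from Table 9. It then averages the L. japonica divergence times from Table 10 and Table 9, which appears to reproduce the 9.19 and 10.74 MYA bounds reported for the L. japonica split.

```python
LAMBDA = 1.0e-9  # substitution rate given in the table footnotes

def divergence_time_mya(ks, rate=LAMBDA):
    """Return T = Ks / (2 * rate), converted from years to million years."""
    return ks / (2.0 * rate) / 1e6

# Mean Ks values copied from Table 9
mean_ks = {("Li", "Lp"): 0.0159, ("Lm", "Lv"): 0.0149, ("Lm", "Lj"): 0.0204}
times = {pair: round(divergence_time_mya(ks), 2) for pair, ks in mean_ks.items()}
print(times)

# Divergence times (MYA) between L. japonica and the other five species
lj_from_median = [9.05, 9.05, 9.20, 9.05, 9.60]    # Table 10
lj_from_mean = [10.49, 10.41, 11.44, 10.20, 11.14]  # Table 9
lj_range = (
    round(sum(lj_from_median) / len(lj_from_median), 2),
    round(sum(lj_from_mean) / len(lj_from_mean), 2),
)
print(lj_range)
```

Small differences between recomputed and tabulated times are expected, because the tabulated Ks values are rounded to four decimal places.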

Figure 6. Summary of Ka and Ks values among the 77 conserved protein-coding genes in six Lonicera species. The mean Ka and Ks values are indicated by grey and light grey bars, respectively. Blue and light blue stars mark genes with Ka/Ks values over 1 and over 0.8, respectively, which might be under positive selection. Red stars indicate conserved genes without any substitution.

Figure 8. Phylogenetic tree and divergence time of six Lonicera species. Phylogenetic trees were generated based on complete chloroplast genomes (A) and 45S nrDNA sequences (B) using MEGA 6.0. The numbers on the nodes indicate bootstrap support values. The numbers under the nodes in (A) represent median and mean divergence times (*, MYA) based on Ks values calculated using PAML 4.9. Bold numbers represent major types of 45S nrDNA sequence, and non-bold numbers represent minor types. Li, L. insularis; Ls, L. sachalinensis; Lp, L. praeflorens; Lm, L. maackii; Lv, L. vesicaria; Lj, L. japonica.

6. Phylogenetic relationship within Dipsacales

The phylogenetic relationships inferred from the complete chloroplast genome sequences of 14 species in Dipsacales indicated that the eight genera were divided into two monophyletic groups corresponding to the families Caprifoliaceae and Adoxaceae (Figure 9). Within Caprifoliaceae, Patrinia saniculifolia and Kolkwitzia amabilis were classified into a separate subgroup, and the six Lonicera species assembled in this study were grouped with the previously reported L. japonica (China collection).

Figure 9. Phylogenetic analysis of Lonicera species with related species in Dipsacales. The tree was generated based on complete chloroplast genome sequences of 14 species using the neighbour-joining method with 1000 bootstrap replicates in MEGA 6.0. The numbers on the nodes represent bootstrap support values.

7. Development and validation of chloroplast genome-based markers

Based on the chloroplast genome sequence alignment, seven markers were developed from polymorphic sites to discriminate the six Lonicera species and to authenticate each species (Table 11, Figure 10). Of these, six markers were derived from InDel regions caused by copy number variation (CNV) of tandem repeats, and one marker was derived from an SNP region. All seven markers were successfully amplified by PCR, and each amplicon showed the expected product size.

The marker Lo_i_03 was derived from a 33 bp tandem repeat in the accD gene and produced amplicons of clearly different sizes among the Lonicera species (Table 11, Figure 10A). The marker Lo_i_04 was derived from 12 bp and 23 bp tandem repeats in the trnP-psaJ region and also produced distinct amplicon sizes among the Lonicera species (Table 11, Figure 10B). The marker Lo_i_05 was derived from a 12 bp tandem repeat in the ycf2 gene and was specific to L. vesicaria (Table 11, Figure 10C). The marker Lo_i_06 was derived from a 22 bp tandem repeat in the trnV-rrn16 region and was specific to L. japonica (Table 11, Figure 10D). The marker Lo_i_07 was derived from a 13 bp tandem repeat in the psaC-ndhE region and was specific to L. praeflorens (Table 11, Figure 10E). The marker Lo_i_08 was derived from a 24 bp tandem repeat in the ycf1 gene and was specific to L. maackii (Table 11, Figure 10F). Finally, the dominant marker Lo_do_04 was derived from an SNP in the rps18-rpl20 region and was specific to L. insularis (Table 11, Figure 10G). Validation showed that each of the markers Lo_i_03 and Lo_i_04 alone could distinguish more than three species, and that all six Lonicera species were successfully discriminated by combining the seven markers (Table 12).
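
The link between tandem-repeat copy number and amplicon size can be checked with simple arithmetic. The sketch below assumes that each product differs from the L. insularis product only by whole copies of the repeat unit. Repeat lengths and copy numbers are taken from Table 4 (TR16 and TR49), and the reference product sizes are taken from Table 11. The function name `predicted_sizes` is illustrative.

```python
def predicted_sizes(ref_size, unit_len, copies, ref="Li"):
    """Predict product sizes from copy-number differences relative to `ref`."""
    return {sp: ref_size + unit_len * (n - copies[ref]) for sp, n in copies.items()}

species = ["Li", "Ls", "Lp", "Lm", "Lv", "Lj"]

# Lo_i_03: 33 bp repeat (TR16) in accD; the Li product is 480 bp
tr16_copies = dict(zip(species, [2, 2, 5, 4, 3, 3]))
lo_i_03 = predicted_sizes(480, 33, tr16_copies)

# Lo_i_08: 24 bp repeat (TR49) in ycf1; the Li product is 220 bp
tr49_copies = dict(zip(species, [2, 2, 2, 1, 2, 2]))
lo_i_08 = predicted_sizes(220, 24, tr49_copies)

print(lo_i_03)
print(lo_i_08)
```

For these two markers the predicted sizes agree with the product sizes listed in Table 11, for example 579 bp for L. praeflorens with Lo_i_03 and 196 bp for L. maackii with Lo_i_08.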

Table 11. Molecular markers developed in this study for discriminating the six Lonicera species

                                                       Product (bp)
Type   Marker name  Primer  Sequence (5’-3’)           Li     Ls     Lp     Lm     Lv     Lj     Location
InDel  Lo_i_03      F       AGAGCCTTACCTTGACTATGGA     480    480    579    546    513    513    accD
                    R       ACGGATCCCATACTACCCCC
       Lo_i_04      F       AAACAAACGCGCTACCAAGC       314    314    338    314    326    295    trnP(UGG)-psaJ
                    R       CCCGAGCATTCCCGAAAAAG
       Lo_i_05      F       TTTGAAGACGGGGAAGGAGC       200    200    200    200    212    200    ycf2
                    R       TCCTCTTCATCCGCGAAAGG
       Lo_i_06      F       GAGTGTCACCTTGACGTGGT       186    186    186    186    186    208    trnV(GAC)-rrn16
                    R       TCATATTCGCCCGGAGTTCG
       Lo_i_07      F       TCAATCGACTTCTGGATTGGGT     236    236    249    236    236    236    psaC-ndhE
                    R       GCCGCTGAAGCAGCTATTGG
       Lo_i_08      F       AATCGAGCGTTTCTTCGTTTT      220    220    220    196    220    220    ycf1
                    R       GGGCAAATTCTTTACAGACAGAAC
SNP    Lo_do_04     F       AAACGGAATCGCGTTAGTGTGG     266    Na(a)  Na     Na     Na     Na     rps18-rpl20
                    R       TCGGTTGAGTTCGGATTGGA

Figure 10. Validation of seven developed molecular markers from InDel and SNP regions of six Lonicera chloroplast genomes. Schematic diagrams indicate CNVs of TR units and SNP polymorphisms. Tandem repeats are designated by triangles. The direction of each gene is represented by a pentagon.

Table 12. Marker combinations for each Lonicera species

No.  Species           Lo_i_03  Lo_i_04  Lo_i_05  Lo_i_06  Lo_i_07  Lo_i_08  Lo_do_04  Combination
1    L. insularis      A        A        A        A        A        A        A         AAAAAAA
2    L. sachalinensis  A        A        A        A        A        A        B         AAAAAAB
3    L. praeflorens    B        B        A        A        B        A        B         BBAABAB
4    L. maackii        C        A        A        A        A        B        B         CAAAABB
5    L. vesicaria      D        C        B        A        A        A        B         DCBAAAB
6    L. japonica       D        D        A        B        A        A        B         DDABAAB
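
The genotype codes in Table 12 follow directly from the product sizes in Table 11. As a minimal sketch, the Python snippet below assigns a letter to each distinct product size per marker, in order of first appearance from Li to Lj, and concatenates the letters into a seven-character profile per species. The letter-assignment rule and all function and variable names are assumptions made for illustration, and `None` stands for the "Na" entries of the dominant marker.

```python
from string import ascii_uppercase

species = ["Li", "Ls", "Lp", "Lm", "Lv", "Lj"]

# Product sizes (bp) from Table 11; None marks "Na" for the dominant SNP marker
products = {
    "Lo_i_03": [480, 480, 579, 546, 513, 513],
    "Lo_i_04": [314, 314, 338, 314, 326, 295],
    "Lo_i_05": [200, 200, 200, 200, 212, 200],
    "Lo_i_06": [186, 186, 186, 186, 186, 208],
    "Lo_i_07": [236, 236, 249, 236, 236, 236],
    "Lo_i_08": [220, 220, 220, 196, 220, 220],
    "Lo_do_04": [266, None, None, None, None, None],
}

def allele_codes(sizes):
    """Label distinct sizes A, B, C, ... in order of first appearance."""
    seen = {}
    return [seen.setdefault(size, ascii_uppercase[len(seen)]) for size in sizes]

codes = {marker: allele_codes(sizes) for marker, sizes in products.items()}
profiles = {
    sp: "".join(codes[marker][i] for marker in products)
    for i, sp in enumerate(species)
}
print(profiles)

# Every species should carry a unique seven-marker profile
all_unique = len(set(profiles.values())) == len(species)
print(all_unique)
```

The resulting profiles match the last column of Table 12, and L. insularis and L. sachalinensis differ only at the final marker, Lo_do_04.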

DISCUSSION

1. Complete chloroplast genome and 45S nrDNA sequences of six Lonicera species derived from low-coverage whole-genome NGS data

Multiple copies of the chloroplast genome and of the 45S nrDNA exist in the plant cytoplasm and nucleus, respectively, which explains why high read coverage was obtained from a small amount of WGS data (Table 1). Moreover, the chloroplast genome has been extensively used to study genetic diversity, authentication, and evolution in plants (Kim et al. 2015; Joh et al. 2017; Kim C-K et al. 2018). However, such studies are lacking for the genus Lonicera. Here, complete chloroplast genome and 45S nrDNA sequences were successfully obtained from the six Lonicera species (Figure 1, Figure 2). The three or four initial contigs that showed high homology to the reference sequence were extracted, and complete chloroplast genomes were generated by joining the overlapping contigs, followed by manual curation. For the 45S nrDNA sequences, the longest contig that was similar to the reference sequence was extracted. These results demonstrate that the assembly method used in this study is reliable and efficient for obtaining complete chloroplast genome and 45S nrDNA sequences, as previously described (Kim et al. 2015). Through this study, the chloroplast genomes of six Lonicera species were completed; five of them, all except L. japonica, were completed for the first time.

2. Comparative analysis of chloroplast genome and 45S nrDNA sequences among six Lonicera species

Most genes in the chloroplast genomes were shared among the six Lonicera species, except for the accD and rpoA genes. The accD gene, which encodes one of the acetyl-CoA carboxylase subunits, was pseudogenized in both L. vesicaria and L. japonica. Moreover, the rpoA gene, which encodes a subunit of RNA polymerase, was pseudogenized in L. japonica. Similar loss of the accD and rpoA genes has been found in some other plants (Sugiura et al. 2003; Goffinet et al. 2005; Harris et al. 2013; Li J et al. 2016).

Several sequence variations were revealed by comparing the chloroplast genomes and 45S nrDNA sequences of the six Lonicera species. In the chloroplast genome, nucleotide substitutions have been used to study plant evolution and genome differentiation between species (Wolfe et al. 1987). InDels are also known to play a major role in genome size increase (Britten et al. 2003). Although the six chloroplast genomes showed high similarity (97.4%), abundant polymorphisms such as SNPs and InDels were confirmed.

In addition, we found some chloroplast protein-coding genes that showed high Ka/Ks values (over 1 or over 0.8) and others that were highly conserved among the six Lonicera species. Four genes with high Ka/Ks values, psaJ, rbcL, rps18 and ycf2, might have evolved under positive selection in the genus Lonicera. On the other hand, four genes in the LSC region, psbI, psbZ, psbL and petG, were highly conserved in the genus Lonicera.

Different 45S nrDNA heterotypes were identified in this study, similar to those reported in Brassica genomes (Kim C-K et al. 2018). Such heterotypes often arise from hybridization or allopolyploidization, and could provide information about genome history or relationships (Reeder 1985).

The abundant variations in the chloroplast genomes and 45S nrDNA sequences identified in this study will be valuable for barcoding the six Lonicera species as well as for studying genetic diversity in the family Caprifoliaceae.

3. Repetitive sequences in Lonicera chloroplast genomes

Repeat structures, which originate from DNA strand repair mechanisms, are known to be related to genome recombination and divergence (Haberle et al. 2008). We found abundant tandem repeats in some genes such as accD, ycf2 and ycf1 (Table 4). These repetitive sequences could cause divergence of the chloroplast genome between species.

SSRs consist of motifs of one to six or more nucleotides repeated sequentially in a head-to-tail arrangement and have been used to analyze genetic diversity (Kelkar et al. 2010). Furthermore, molecular markers derived from SSR polymorphisms are useful for phylogenetic studies, genome mapping and gene tagging because they are highly polymorphic (Reddy et al. 2002). In this study, we found a total of 288 SSRs, which varied in number and type among the six Lonicera species (Table 3).
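
To make the head-to-tail definition of an SSR concrete, the following Python sketch scans a sequence for perfect repeats of 1 to 6 bp motifs using a regular expression with a backreference. It is only an illustration and not the SSR detection procedure used in this study. The minimum copy numbers, the function name `find_ssrs` and the demo sequence are all hypothetical.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs, motif length 1-6."""
    if min_repeats is None:
        # Hypothetical thresholds (motif length -> minimum copies); not from this study
        min_repeats = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}
    hits = []
    for size, n in min_repeats.items():
        # A motif of `size` bases followed by at least n - 1 further copies of itself
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (size, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT")
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // size))
    return sorted(hits)

# Hypothetical demo sequence: a 12 bp A run and six copies of AT
demo = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GGC"
hits = find_ssrs(demo)
print(hits)
```

Start positions are 0-based. Because matches are greedy and non-overlapping within each motif length, this sketch can miss SSRs that overlap a longer run, which a dedicated SSR tool would handle more carefully.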

The abundant and variable repeats identified in six chloroplast genomes could be used to develop molecular markers for identifying the six Lonicera species as well as characterizing genetic diversity of Lonicera.

4. Divergence and phylogenetic analysis based on chloroplast genome and 45S nrDNA sequences of the Lonicera species

Some previous studies carried out phylogenetic analyses using intergenic or coding regions of the chloroplast genome and nrDNA, such as the trnL-trnF, rpoB-trnC, petN-psbM, matK and ITS regions (Theis et al. 2008; Jeong et al. 2014). However, phylogenetic studies using the complete chloroplast genome and 45S nrDNA sequences have not been reported. Here, we therefore used the whole sequences of the chloroplast genome and 45S nrDNA to study the phylogenetic relationships among six Lonicera species, and carried out a phylogenetic study with related species belonging to Dipsacales using complete chloroplast genomes.

The topologies based on the chloroplast genome and the 45S nrDNA sequences of the six Lonicera species showed a similar pattern. Phylogenetic analysis indicated that L. japonica diverged first from the common ancestor of the six Lonicera species (9.19-10.74 MYA), and the ycf2 and rbcL genes might be related to this divergence event considering their Ka/Ks values (Figure 7A, Figure 7B and Figure 8). Moreover, the two major subgroups, one containing L. insularis and the other containing L. praeflorens, diverged 6.53-7.72 MYA, and the psaJ and rps18 genes might be related to the divergence between the two subgroups (Figure 7C, Figure 7D and Figure 8). In addition, the divergence times that we calculated were consistent with a previous study based on fossil calibrations (Smith 2009).

Based on the phylogenetic analysis with related species, the Lonicera species were grouped together and were close to the genera Patrinia and Kolkwitzia, which belong to the family Caprifoliaceae, as expected. Moreover, L. japonica (China collection) had a longer branch than L. japonica (Korea collection), suggesting that it has diverged more from their common ancestor. Furthermore, chloroplast genome comparison between the Korea and China L. japonica collections revealed rich intraspecific diversity.

5. Development of molecular markers for authentication of six Lonicera species

The complete chloroplast genome has a conserved sequence and is known to be a reliable and precise target for plant molecular markers because of its multiple copies and high inter-species variation compared with the nuclear genome (Li X et al. 2015). In this study, we developed chloroplast-derived markers based on polymorphic sites and successfully distinguished each species (Figure 10, Table 12). Among the seven molecular markers, two markers, Lo_i_03 and Lo_i_04, were derived from CNV of TRs among the six Lonicera species, so that the four species other than L. insularis and L. sachalinensis could be identified with these two markers (Figure 10A, Figure 10B and Table 12). Four markers, Lo_i_05, Lo_i_06, Lo_i_07 and Lo_i_08, were also based on CNV of TRs and were specific to L. vesicaria, L. japonica, L. praeflorens and L. maackii, respectively (Figure 10C, Figure 10D, Figure 10E, Figure 10F and Table 12). Furthermore, the dominant marker Lo_do_04 was derived from an SNP region and was amplified only in L. insularis (Figure 10G, Table 12). These markers will be valuable for authenticating the six Lonicera species and will provide useful information for genetic diversity studies.

REFERENCES

Allen G, Flores-Vergara M, Krasynanski S, Kumar S, Thompson W. 2006. A modified protocol for rapid DNA isolation from plant tissues using cetyltrimethylammonium bromide. Nature protocols. 1:2320.

Ammal EJ, Saunders B. 1952. Chromosome numbers in species of Lonicera. Kew Bulletin. 539-541.

Bauer JT, Shannon SM, Stoops RE, Reynolds HL. 2012. Context dependency of the allelopathic effects of Lonicera maackii on seed germination. Plant Ecology. 213:1907-1916.

Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic acids research. 27:573.

Britten RJ, Rowen L, Williams J, Cameron RA. 2003. Majority of divergence between closely related DNA samples is due to indels. Proceedings of the National Academy of Sciences. 100:4661-4665.

Chang W-C, Hsu F-L. 1992. Inhibition of platelet activation and endothelial cell injury by polyphenolic compounds isolated from Lonicera japonica Thunb. Prostaglandins, Leukotrienes and Essential Fatty Acids. 45:307-312.

Chen J, Xia N, Wang X, Beeson RC, Chen J. 2017. Ploidy Level, Karyotype, and DNA Content in the Genus Lonicera. HortScience. 52:1680-1686.

Forman T. 2011. Applying landscape ecology in biological conservation. Springer Science & Business Media.

Goffinet B, Wickett NJ, Shaw AJ, Cox CJ. 2005. Phylogenetic significance of the rpoA loss in the chloroplast genome of mosses. Taxon. 54:353-360.

Haberle RC, Fourcade HM, Boore JL, Jansen RK. 2008. Extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes. Journal of Molecular Evolution. 66:350-361.

Harris ME, Meyer G, Vandergon T, Vandergon VO. 2013. Loss of the acetyl-CoA carboxylase (accD) gene in Poales. Plant Molecular Biology Reporter. 31:21-31.

He L, Qian J, Li X, Sun Z, Xu X, Chen S. 2017. Complete chloroplast genome of medicinal plant Lonicera japonica: genome rearrangement, intron gain and loss, and implications for phylogenetic studies. Molecules. 22:249.

Huang Y, Yu F, Li X, Luo L, Wu J, Yang Y, Deng Z, Chen R, Zhang M. 2017. Comparative genetic analysis of the 45S rDNA intergenic spacers from three Saccharum species. PloS one. 12:e0183447.

Jeong KS, Kim MS, Lee W, Pak J-H. 2014. Intraspecific variation and geographic study of Lonicera insularis (Caprifoliaceae) based on chloroplast DNA sequences. Korean Journal of Plant Taxonomy. 44:202-207.

Joh HJ, Kim N-H, Jayakodi M, Jang W, Park JY, Kim YC, In J-G, Yang T-J. 2017. Authentication of Golden-Berry P. ginseng Cultivar ‘Gumpoong’ from a Landrace ‘Hwangsook’ Based on Pooling Method Using Chloroplast-Derived Markers. Plant Breeding and Biotechnology. 5:16-24.

Kang KB, Lee DY, Kim MS, Kim TB, Yang T-J, Sung SH. 2018. Argininosecologanin, a secoiridoid-derived guanidine alkaloid from the roots of Lonicera insularis. Natural product research. 32:788-794.

Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution. 30:772-780.

Kelkar YD, Strubczewski N, Hile SE, Chiaromonte F, Eckert KA, Makova KD. 2010. What is a microsatellite: a computational and experimental definition based upon repeat mutational behavior at A/T and GT/AC repeats. Genome biology and evolution. 2:620-635.

Kim C-K, Seol Y-J, Perumal S, Lee J, Waminal NE, Jayakodi M, Lee S-C, Jin S, Choi B-S, Yu Y. 2018. Re-exploration of U’s Triangle Brassica Species Based on Chloroplast Genomes and 45S nrDNA Sequences. Scientific reports. 8:7353.

Kim K, Lee S-C, Lee J, Lee HO, Joh HJ, Kim N-H, Park H-S, Yang T-J. 2015. Comprehensive survey of genetic diversity in chloroplast genomes and 45S nrDNAs within Panax ginseng species. PloS one. 10:e0117159.

Kim K, Lee S, Lee J, Yu Y, Yang Y, Choi B. 2015. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Scientific reports. 5:15655.

Kim Y-D, Kim S-H. 1999. Phylogeny of Weigela and Diervilla (Caprifoliaceae) based on nuclear rDNA ITS sequences: biogeographic and taxonomic implications. Journal of Plant Research. 112:331-341.

Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. 2001. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic acids research. 29:4633-4642.

Lagesen K, Hallin P, Rødland EA, Stærfeldt H-H, Rognes T, Ussery DW. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic acids research. 35:3100-3108.

Lee I, Lee S, Lee S, Yang S, Lee S, Eun J. 2016. 1568 Supplementation of Korean honeysuckle (Lonicera vesicaria) extract in timothy hay on in vitro ruminal fermentation. Journal of Animal Science. 94(supplement5):762-762.

Li J, Gao L, Chen S, Tao K, Su Y, Wang T. 2016. Evolution of short inverted repeat in cupressophytes, transfer of accD to nucleus in Sciadopitys verticillata and phylogenetic position of Sciadopityaceae. Scientific reports. 6:20934.

Li X, Yang Y, Henry RJ, Rossetto M, Wang Y, Chen S. 2015. Plant DNA barcoding: from gene to genome. Biological Reviews. 90:157-166.

Liu J, Zhang J, Wang F, Chen X. 2012. New secoiridoid glycosides from the buds of Lonicera macranthoides. Natural product communications. 7:1561-1562.

Lohse M, Drechsel O, Bock R. 2007. OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Current genetics. 52:267-274.

Nakaji M, Tanaka N, Sugawara T. 2015. A Molecular Phylogenetic Study of Lonicera L. (Caprifoliaceae) in Japan Based on Chloroplast DNA Sequences. Acta Phytotaxonomica et Geobotanica. 66:137-151.

Palmer JD. 1985. Comparative organization of chloroplast genomes. Annual review of genetics. 19(1):325-354.

Peng L-Y, Mei S-X, Jiang B, Zhou H, Sun H-D. 2000. Constituents from Lonicera japonica. Fitoterapia. 71:713-715.

Reddy MP, Sarla N, Siddiq E. 2002. Inter simple sequence repeat (ISSR) polymorphism and its application in plant breeding. Euphytica. 128:9-17.

Reeder RH. 1985. Mechanisms of nucleolar dominance in animals and plants. The Journal of cell biology. 101:2013-2016.

Shang X, Pan H, Li M, Miao X, Ding H. 2011. Lonicera japonica Thunb.: ethnopharmacology, phytochemistry and pharmacology of an important traditional Chinese medicine. Journal of ethnopharmacology. 138:1-21.

Smith SA. 2009. Taking into account phylogenetic and divergence-time uncertainty in a parametric biogeographical analysis of the Northern Hemisphere plant clade Caprifolieae. Journal of Biogeography. 36:2324-2337.

Son KH, Jung KY, Chang HW, Kim HP, Kang SS. 1994. Triterpenoid saponins from the aerial parts of Lonicera japonica. Phytochemistry. 35:1005-1008.

Sugiura C, Kobayashi Y, Aoki S, Sugita C, Sugita M. 2003. Complete chloroplast DNA sequence of the moss Physcomitrella patens: evidence for the loss and relocation of rpoA from the chloroplast to the nucleus. Nucleic acids research. 31:5324-5331.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Molecular biology and evolution. 30:2725-2729.
