Here we report the complete genome sequence of the Gram-positive, rod-shaped, chemoheterotrophic Lactobacillus paraplantarum strain CK401, which has anti-avian flu activity. The genome consists of a circular chromosome of 3,164,408 bp with 43.99% G + C content and five circular plasmids: pCK401A (15,102 bp with 37.33% G + C content), pCK401B (15,825 bp with 40.58% G + C content), pCK401C (12,380 bp with 36.03% G + C content), pCK401D (40,656 bp with 42.00% G + C content), and pCK401E (41,380 bp with 40.12% G + C content). A total of 3,179 genes were predicted, of which 3,088 are protein-coding genes; 2,061 of the protein-coding genes could be functionally assigned. The chromosome harbors 16 rRNA genes in 5 operons and 71 tRNA genes. No CRISPR locus was found in this genome. Canonical glycolytic and pentose phosphate pathways, together with lactate/alcohol/butanediol dehydrogenases and acetate kinase, suggest that the strain may have a heterofermentative metabolism. Diverse glycosyl hydrolases (GH) were encoded by 47 genes in 17 GH families: alpha-amylases, alpha-/beta-galactosidases, beta-glucosidases, a beta-glucuronidase, a galactanase, and an invertase. Several bacteriocin genes and a hyaluronic acid gene whose activity was confirmed, together with the absence of antibiotic resistance genes, suggest that strain CK401 is a promising probiotic candidate for animals.
Keywords: Lactobacillus paraplantarum CK401, anti-influenza, complete genome
The Gram-positive, rod-shaped, chemoheterotrophic Lactobacillus sp. strain CK401 was isolated from kimchi, a Korean fermented food. The strain was assayed for anti-avian flu activity using culture supernatants whose oligosaccharides were confirmed (Kim et al., 2019). Here we report the complete genome analysis of Lactobacillus sp. strain CK401 using single-molecule real-time (SMRT) sequencing technology.
The strain was grown in Lactobacillus MRS medium at 30°C for 2 days. Genomic DNA was extracted using the i-genomic BYF mini kit (iNtRON Biotechnology) following the manufacturer's protocols (Oh et al., 2019). The genome of strain CK401 was sequenced using PacBio RS II single-molecule real-time (SMRT) sequencing technology (Pacific Biosciences). A standard PacBio library with an average insert size of 20 kb was prepared and sequenced, yielding > 98× average genome coverage. De novo assembly of the 53,789 reads, averaging 7,229 nucleotides in length (388,889,412 bp in total), was conducted using the hierarchical genome-assembly process (HGAP) pipeline of SMRT Analysis version 4 with default parameters (Chin et al., 2013).
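The read statistics above can be related to each other with simple arithmetic. The following is a minimal sketch, not the authors' procedure: it assumes that sequencing depth is estimated as total read bases divided by the combined length of the chromosome and the five plasmids reported for strain CK401. The function names are hypothetical.

```python
# Read statistics and replicon sizes as quoted in the text.
TOTAL_READ_BASES = 388_889_412
READ_COUNT = 53_789
REPLICON_SIZES_BP = {
    "chromosome": 3_164_408,
    "pCK401A": 15_102,
    "pCK401B": 15_825,
    "pCK401C": 12_380,
    "pCK401D": 40_656,
    "pCK401E": 41_380,
}


def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average read length in nucleotides."""
    return total_bases / read_count


def raw_depth(total_bases: int, genome_size_bp: int) -> float:
    """Raw sequencing depth (assumption: every read base counts toward depth)."""
    return total_bases / genome_size_bp


genome_size_bp = sum(REPLICON_SIZES_BP.values())
read_length = mean_read_length(TOTAL_READ_BASES, READ_COUNT)
depth = raw_depth(TOTAL_READ_BASES, genome_size_bp)

print(f"Assembled genome size: {genome_size_bp:,} bp")
print(f"Mean read length: {read_length:,.0f} nt")
print(f"Raw depth: {depth:.1f}x")
```

Under this assumption the raw depth comes out at roughly 118×, which is consistent with the reported > 98× average coverage; an assembler typically reports a lower figure because not every read base ends up aligned to the final assembly.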
For taxonomic identification of the strain, average nucleotide identity (ANI) values against closely related Lactobacillus spp. were calculated using the JSpecies program (Richter and Rosselló-Móra, 2009) with default settings for BLAST-based ANI. Protein-coding genes, signal peptides, and
Korean Journal of Microbiology (2020) Vol. 56, No. 3, pp. 347-350 pISSN 0440-2413
DOI https://doi.org/10.7845/kjm.2020.0049 eISSN 2383-9902
Copyright ⓒ 2020, The Microbiological Society of Korea
Complete genome sequence of Lactobacillus paraplantarum CK401
Dongil Jang1 and Hyun-Myung Oh2*
1Cotde, Inc., Cheonan 31252, Republic of Korea
2Institute of Liberal Arts Education, Pukyong National University, Busan 48547, Republic of Korea
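Korean title: Genome sequencing of Lactobacillus paraplantarum CK401
Dongil Jang1 and Hyun-Myung Oh2*
1Cotde, Inc., 2Institute of Liberal Arts Education, Pukyong National University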
(Received June 1, 2020; Revised September 18, 2020; Accepted September 21, 2020)
*For correspondence. E-mail: [email protected];
Tel.: +82-51-629-6869; Fax: +82-51-629-6949
transmembrane regions were predicted by Prodigal v2.6.3, SignalP v4.1, and TMHMM v2.0, respectively, as described previously (Oh et al., 2019). BLAST searches were performed against the UniProt, Pfam, and COG databases for functional annotation of the predicted coding sequences, as in our previous publication (Oh et al., 2019). Ribosomal RNA, transfer RNA, and miscellaneous features were predicted using Rfam v14.1 (Griffiths-Jones et al., 2005). Virulence factors were searched by BLASTN (coverage > 70%, identity > 70%) against the VFDB database (Chen et al., 2005). Glycosyl hydrolases were annotated using dbCAN (Yin et al., 2012). Genes encoding bacteriocins were annotated using the BAGEL3 database (van Heel et al., 2013).
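The BLASTN thresholds used for the virulence factor search (coverage > 70%, identity > 70%) can be illustrated with a short filter. This is a sketch under stated assumptions rather than the authors' script: it assumes tab-separated hits with the columns qseqid, sseqid, pident, length, and qlen, and it assumes that coverage means alignment length relative to query length, which the text does not specify. The class, the function names, and the example hit identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    query_length: int

    @property
    def query_coverage(self) -> float:
        """Aligned fraction of the query in percent (assumed definition), capped at 100."""
        return min(100.0, 100.0 * self.alignment_length / self.query_length)


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated rows: qseqid, sseqid, pident, length, qlen."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid, pident, length, qlen = line.split("\t")[:5]
        hits.append(Hit(qseqid, sseqid, float(pident), int(length), int(qlen)))
    return hits


def passes_thresholds(hit: Hit, min_identity: float = 70.0,
                      min_coverage: float = 70.0) -> bool:
    """Keep hits strictly above both thresholds, as stated in the text."""
    return hit.percent_identity > min_identity and hit.query_coverage > min_coverage


# Made-up example rows, for illustration only.
example_rows = [
    "gene_A\tVF_1\t85.0\t900\t1000",  # identity 85%, coverage 90%: kept
    "gene_B\tVF_2\t95.0\t300\t1000",  # coverage 30%: dropped
    "gene_C\tVF_3\t65.0\t990\t1000",  # identity 65%: dropped
    "gene_D\tVF_4\t70.0\t800\t1000",  # identity exactly 70%: dropped (strict >)
]
kept = [h.query_id for h in parse_hits(example_rows) if passes_thresholds(h)]
print(kept)
```

If coverage were instead defined relative to the database (subject) sequence, the same filter would apply with the subject length in place of the query length.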
The genome of strain CK401 (Table 1) consists of one circular chromosome of 3,164,408 bp with 43.99% G + C content (Fig. 1) and five circular plasmids: pCK401A (15,102 bp with 37.33% G + C content), pCK401B (15,825 bp with 40.58% G + C content), pCK401C (12,380 bp with 36.03% G + C content), pCK401D (40,656 bp with 42.00% G + C content), and pCK401E (41,380 bp with 40.12% G + C content). Results of in silico genome-to-genome hybridization indicated that strain CK401 was most similar to Lactobacillus paraplantarum DSM 10667T, with an ANI value of 99.6%. A total of 3,179 genes were predicted in the genome of this strain, 3,088 of which are protein-coding genes. Of these, 2,061 were functionally assigned, while the rest were annotated as hypothetical proteins by the COG database. The chromosome harbors 16 rRNA genes (5 operons made up of 5S, 16S, and 23S genes, plus one additional 5S gene) and 71 tRNA genes. The plasmids do not carry any rRNA or tRNA genes (Table 1). No CRISPR locus was found in this genome using the CRISPR Recognition Tool (Bland et al., 2007).
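The genome-wide totals quoted above can be cross-checked against the per-replicon figures in Table 1. The sketch below sums gene counts across replicons and, as an illustrative extra that the paper does not report, computes a length-weighted G + C content for the whole genome. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Replicon:
    name: str
    size_bp: int
    gc_percent: float
    total_genes: int
    protein_coding: int
    rrna: int = 0
    trna: int = 0


# Values taken from Table 1.
REPLICONS = [
    Replicon("chromosome", 3_164_408, 43.99, 3_036, 2_945, rrna=16, trna=71),
    Replicon("pCK401A", 15_102, 37.33, 21, 21),
    Replicon("pCK401B", 15_825, 40.58, 20, 20),
    Replicon("pCK401C", 12_380, 36.03, 12, 12),
    Replicon("pCK401D", 40_656, 42.00, 51, 51),
    Replicon("pCK401E", 41_380, 40.12, 39, 39),
]


def weighted_gc(replicons: Sequence[Replicon]) -> float:
    """Length-weighted G + C content (%) across all replicons."""
    total_bp = sum(r.size_bp for r in replicons)
    return sum(r.size_bp * r.gc_percent for r in replicons) / total_bp


total_genes = sum(r.total_genes for r in REPLICONS)
total_protein_coding = sum(r.protein_coding for r in REPLICONS)
genome_gc = weighted_gc(REPLICONS)

print(f"Total genes: {total_genes:,}")
print(f"Protein-coding genes: {total_protein_coding:,}")
print(f"Genome-wide G + C content: {genome_gc:.2f}%")
```

The two sums reproduce the 3,179 total genes and 3,088 protein-coding genes given in the text.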
Chemoheterotrophic Lactobacillus paraplantarum strain CK401 harbors full gene sets for glycolysis via the EMP pathway, producing pyruvate and acetyl-CoA, as well as for the pentose phosphate pathway, producing ribose 5-phosphate, the precursor for
Fig. 1. Graphical circular map of the chromosome of the strain CK401. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs orange, rRNAs red, other RNAs green), GC content (black), and GC skew (light green/orange).
biosynthesis of purine, pyrimidine, and histidine. However, the strain did not have genes encoding proteins involved in the TCA cycle and the oxidative phosphorylation system. Instead, the strain contains genes encoding fermentation enzymes such as L-lactate dehydrogenases (CK401_00731, CK401_00814, etc.), alcohol dehydrogenases (CK401_00358, CK401_01602, CK401_02113), acetate kinases (CK401_01407, CK401_01497, CK401_02848), and meso-butanediol dehydrogenase (CK401_03165). The presence of these genes suggests that the strain may be a heterofermentative bacterium.
The genome harbors genes encoding diverse glycosyl hydrolases (47 CDSs in 17 GH families), including alpha-amylases, alpha-/beta-galactosidases, beta-glucosidases, a beta-glucuronidase, a galactanase, and an invertase, as well as proteins involved in phosphotransferase systems for monosaccharides and disaccharides. In addition, the genome contains L-iditol 2-dehydrogenase (CK401_01743) and fructokinase (CK401_01521) for sorbitol degradation, and mannitol-1-phosphate 5-dehydrogenase (CK401_01473) for mannitol degradation. These results imply that the strain is able to utilize polysaccharides, disaccharides, and sugar alcohols, including galactan, starch, sucrose, maltose, cellobiose, sorbitol, mannitol, and melibitol, as carbon sources, as well as monosaccharides such as mannose, rhamnose, galactose, and glucose.
It was reported that many Lactobacillus strains have genes involved in the mevalonate pathway for the biosynthesis of carotenoids (Thorne and Kodicek, 1966). The mevalonate pathway genes include mevalonate kinase (CK401_00300; mvK1), diphosphomevalonate decarboxylase (CK401_00301; mvD), phosphomevalonate kinase (CK401_00302; mvK2), hydroxymethyl-CoA reductase (CK401_01307; mvaA), and hydroxymethylglutaryl-CoA reductase (CK401_00042). Strain CK401 also harbors genes for the biosynthesis of a carotenoid compound such as lycopene or 4,4'-diaponeurosporene: isopentenyl-diphosphate delta-isomerase (CK401_00302; idi), geranylgeranyl diphosphate synthase (CK401_00412), phytoene synthase (CK401_01996; crtB), and phytoene dehydrogenase (CK401_01997; crtI). These carotenoid compounds in Lactobacillus are known to act as antioxidants that help overcome oxygen stress (Breithaupt et al., 2001). The strain has a gene encoding glutamate decarboxylase (CK401_01887), suggesting that it may produce gamma-aminobutyric acid. Unlike the genome of L. plantarum WCFS1, this genome contains the full gene set for riboflavin biosynthesis. In addition, the genome has a locus for the biosynthesis and transport of bacteriocins, similar to the plantaricin locus of L. plantarum. The locus consists of 18 genes (CK401_01322 to CK401_01339; 15.1 kb), including homologs of plnWUVSTHGEFIDCBAQLR. The locus did not contain the bacteriocin genes plnJK, but had ppnC7, which encodes leucocin K, and a bacteriocin gene found only in the genomes of L. paraplantarum. UTP-glucose-1-phosphate uridylyltransferase (CK401_01098) was the only virulence factor identified. This gene is known to be involved in the biosynthesis of the hyaluronic acid capsule, which was confirmed using biochemical methods. However,
Table 1. General features of the complete genome of L. paraplantarum CK401

| Attribute | Chromosome | pCK401A | pCK401B | pCK401C | pCK401D | pCK401E |
|---|---|---|---|---|---|---|
| Assembly size (bp) | 3,164,408 | 15,102 | 15,825 | 12,380 | 40,656 | 41,380 |
| Contigs | 1 | 1 | 1 | 1 | 1 | 1 |
| G + C content (%) | 43.99 | 37.33 | 40.58 | 36.03 | 42.00 | 40.12 |
| DNA coding region (%) | 83.97 | 62.75 | 63.81 | 59.54 | 76.96 | 73.52 |
| Total genes | 3,036 | 21 | 20 | 12 | 51 | 39 |
| rRNA genes | 16 | - | - | - | - | - |
| tRNA genes | 71 | - | - | - | - | - |
| Protein coding genes | 2,945 | 21 | 20 | 12 | 51 | 39 |
| Genes assigned to COGs | 2,240 | 8 | 8 | 7 | 32 | 30 |
| Genes with Pfam domains | 2,355 | 9 | 13 | 6 | 33 | 31 |
| Genes with signal peptides | 153 | 0 | 0 | 0 | 0 | 2 |
| Genes with transmembrane helices | 794 | 2 | 2 | 2 | 10 | 5 |
the gene is frequently found in the genomes of diverse Lactobacillus groups. Neither antibiotic resistance genes nor antiSMASH-predicted secondary metabolite gene clusters were found. The absence of antibiotic resistance genes in this genome suggests that the strain can be used as a potential probiotic for animals.
Nucleotide sequence and strain accession numbers
The complete genome sequence of L. paraplantarum CK401 has been deposited at DDBJ/EMBL/GenBank under the accession numbers CP053337 (chromosome), CP053338 (pCK401A), CP053339 (pCK401B), CP053340 (pCK401C), CP053341 (pCK401D), and CP053342 (pCK401E). The strain CK401 (= KCTC 13287BP) is available from Cotde Inc. (daniel@cotde.co.kr) or from the Korean Collection for Type Cultures.
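For readers who want to retrieve the deposited sequences programmatically, the sketch below builds download URLs for the six accessions. It assumes the NCBI E-utilities efetch endpoint with the nuccore database; the helper name efetch_url is hypothetical, and the code only constructs the URLs without making any network request.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed retrieval route; not from the paper).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers as listed in the text.
ACCESSIONS = {
    "chromosome": "CP053337",
    "pCK401A": "CP053338",
    "pCK401B": "CP053339",
    "pCK401C": "CP053340",
    "pCK401D": "CP053341",
    "pCK401E": "CP053342",
}


def efetch_url(accession: str, rettype: str = "fasta") -> str:
    """Build an efetch URL for one nucleotide accession (no request is sent)."""
    query = urlencode({
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    })
    return f"{EFETCH_URL}?{query}"


for replicon, accession in ACCESSIONS.items():
    print(replicon, efetch_url(accession))
```

Passing rettype="gb" instead of the default would request the annotated GenBank flat file rather than the FASTA sequence.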
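Abstract (translated from Korean)
We analyzed the genome sequence of Lactobacillus paraplantarum strain CK401, a Gram-positive, heterotrophic, rod-shaped lactic acid bacterium with anti-influenza virus activity. The circular chromosome is 3,164,408 bp long with 43.99% G + C content, and five circular plasmids are also present: pCK401A (15,102 bp, 37.33% G + C), pCK401B (15,825 bp, 40.58% G + C), pCK401C (12,380 bp, 36.03% G + C), pCK401D (40,656 bp, 42.00% G + C), and pCK401E (41,380 bp, 40.12% G + C). Of the 3,179 genes in total, 3,088 encode proteins, and 2,061 of these were assigned a function. The genome carries 16 rRNA genes in 5 operons and 71 tRNA genes. No CRISPR locus was present. Strain CK401 has the glycolysis and pentose phosphate pathways and is expected to carry out heterofermentative metabolism by means of lactate, ethanol, and butanediol dehydrogenases and acetate kinase. The diverse glycosyl hydrolases comprise 47 genes belonging to 17 GH families, including alpha-amylases, alpha-/beta-galactosidases, beta-glucosidases, a beta-glucuronidase, a galactanase, and an invertase. The genome analysis shows that strain CK401, which carries several known bacteriocin genes and a hyaluronic acid gene whose activity has already been confirmed, and which lacks antibiotic resistance genes, could be a promising probiotic for livestock.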
Acknowledgments
This research was supported by the National Research Foundation (NRF-2017R1D1A1B03034706).
References
Bland C, Ramsey TL, Sabree F, Lowe M, Brown K, Kyrpides NC, and Hugenholtz P. 2007. CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8, 209.
Breithaupt DE, Schwack W, Wolf G, and Hammes WP. 2001. Characterization of the triterpenoid 4,4'-diaponeurosporene and its isomers in food-associated bacteria. Eur. Food Res. Technol. 213, 231–233.
Chen L, Yang J, Yu J, Yao Z, Sun L, Shen Y, and Jin Q. 2005. VFDB: a reference database for bacterial virulence factors. Nucleic Acids Res. 33, D325–D328.
Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, et al. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563–569.
Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR, and Bateman A. 2005. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124.
Kim S, Oh DB, and Kang JY. 2019. Composition for anti-influenza virus comprising as active ingredient polysaccharide derived from Lactobacillus plantarum and method for producing the polysaccharide. Vol. 1020074210000, Republic of Korea.
Oh HM, Kim DH, Han SJ, Song JH, Kim K, and Jang D. 2019. Complete genome sequence of Marinobacter salarius HL2708#2 isolated from a lava sea water environment on Jeju Island. Korean J. Microbiol. 55, 69–73.
Richter M and Rosselló-Móra R. 2009. Shifting the genomic gold standard for the prokaryotic species definition. Proc. Natl. Acad. Sci. USA 106, 19126–19131.
Thorne KJI and Kodicek E. 1966. The structure of bactoprenol, a lipid formed by Lactobacilli from mevalonic acid. Biochem. J. 99, 123–127.
van Heel AJ, de Jong A, Montalbán-López M, Kok J, and Kuipers OP. 2013. BAGEL3: Automated identification of genes encoding bacteriocins and (non-)bactericidal posttranslationally modified peptides. Nucleic Acids Res. 41, W448–W453.
Yin Y, Mao X, Yang J, Chen X, Mao F, and Xu Y. 2012. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 40, W445–W451.