Evaluation of Lipid Transfer Protein (Ltp) 1 Gene in Sesame and Other Plants Using Bioinformatics Approach

Research Article

  • Edu N.E, Udensi O.U
  • Agada F.N *
  • Ubi G.M
  • Agbor R

Department of Genetics and Biotechnology, University of Calabar, PMB 1115, Calabar, Nigeria.

*Corresponding Author: Department of Genetics and Biotechnology, University of Calabar, PMB 1115, Calabar, Nigeria.

Citation: Edu N.E, Udensi O.U, Agada F.N, Ubi G.M, Agbor R. (2023). Evaluation of Lipid Transfer Protein (Ltp) 1 Gene in Sesame and Other Plants Using Bioinformatics Approach. Journal of BioMed Research and Reports, BioRes Scientia Publishers. 2(6):1-20. DOI: 10.59657/2837-4681.brs.23.037

Copyright: © 2023 Agada F.N, this is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Received: September 05, 2023 | Accepted: September 20, 2023 | Published: September 27, 2023

Abstract

This study used bioinformatics tools to characterize the Lipid Transfer Protein 1 gene in selected accessions, with the Benny seed (Sesamum indicum) Lipid Transfer Protein 1 sequence as the query sequence. Nucleotide and amino acid sequences of 30 accessions were retrieved from the NCBI database and analyzed for homology, physicochemical properties, motifs, GC content and phylogenetic relationships. The nucleotide and amino acid sequence lengths of this gene differed among the selected accessions. Nucleotide length varied between 599 and 8461 bp, while amino acid sequence length varied between 96 and 355 residues. Molecular weight ranged from 10008.77 to 35532.61 daltons, with Sesamum indicum having the lowest and Physcomitrium patens the highest molecular weight. The theoretical pI was 4.61 or above for all the amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions. The total number of negatively charged residues ranged from 1–20. The instability index and aliphatic index ranged from 20.23–69.39 and 73.48–102.24 respectively. Some of the proteins were stable, while twelve were considered unstable based on the instability index. Extinction coefficient was highest for Sesamum indicum (14480). Daucus carota subsp. sativus (-0.213) was the only accession with a negative GRAVY. The motifs N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site were the most common across the selected accessions. GC content ranged from 29.73–54.55%. Analysis of the secondary structure of the amino acid sequences of the Lipid Transfer Protein 1 gene showed that the region covered by random coil was the highest in the sequences compared to alpha helix and extended strand. Alpha helix ranged from 33.11–54.31%, extended strand from 9.17–15.13% and random coil from 32.77–51.16% across the accessions. From the results of the present study, it can be concluded that the Lipid Transfer Protein 1 gene sequence of Sesamum indicum is closely related to the Lipid Transfer Protein 1 gene in Brachypodium distachyon and distant from that in Glycine max, Vigna unguiculata and Capsicum annuum.


Keywords: lipid transfer protein 1; Sesamum indicum; bioinformatics

Introduction

Sesamum indicum is an annual herb of the family Pedaliaceae. The plant is commonly known in English as sesame or benniseed, and it is found in tropical and subtropical areas of Asia, Africa and South America. Compared to similar crops, such as peanuts, soybean and rapeseed, the seeds of sesame are believed to have the most oil. Sesame is one of the oldest oilseed crops known, domesticated well over 3000 years ago. The genus Sesamum has many other species, most being wild and native to sub-Saharan Africa. S. indicum, the cultivated type, originated in India (Ogasawara et al., 1988) and is tolerant to drought-like conditions, growing where other crops fail (Raghav et al., 1990). Sesame has been widely known for its oil seed production, and the oil finds wide application in the food and pharmaceutical industries. The gene associated with this oil production trait is the lipid transfer protein 1 gene, whose properties, functions and structures this research analyzes using an in silico approach.

The expression in silico was first used in public in 1989 at the workshop “Cellular Automata: Theory and Applications” in Los Alamos, New Mexico, by Pedro Miramontes, a mathematician from the National Autonomous University of Mexico (UNAM), who presented the report “DNA and RNA Physicochemical Constraints, Cellular Automata and Molecular Evolution”. Plant genomes show a wide array of architectures (genetic make-up), varying immensely in size, structure and content. Some organelle DNAs have even developed elaborate peculiarities such as scrambled coding regions, non-standard genetic codes and convoluted modes of post-transcriptional modification and editing, all of which have been deciphered using bioinformatics tools.

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. It combines computer science, biology, mathematics and engineering to analyze and interpret biological data, and it has been used for in silico analyses of biological data and genes using mathematical and statistical techniques. Bioinformatics has become an important part of many areas of biology. In experimental molecular biology, bioinformatics techniques such as image and signal processing allow extraction of useful results from large amounts of raw data. In genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations. It plays a role in the text mining of biological literature and the development of biological and gene ontologies to organize and query biological data. It also plays a role in the analysis of gene and protein expression and regulation. Bioinformatics tools aid in the comparison of genetic and genomic data and, more generally, in the understanding of evolutionary aspects of molecular biology. At a more integrative level, it helps to analyze and catalogue the biological pathways and networks that are an important part of systems biology. In structural biology, it aids in the simulation and modeling of DNA, RNA and proteins as well as biomolecular interactions.

In genetics and biochemistry, in silico studies can be used for molecular modeling of genes, analysis of gene expression and gene sequences, prediction of the 3D structure of proteins, identification of diseases and prediction of lipid metabolic pathways. In silico studies and drug design software also play an important role in the design of innovative proteins.

With increasing concern about the side effects caused by modern synthetic or chemical drugs, oil seeds and medicinal plants remain the main source of a large range of basic healthcare and pharmaceutical products. Successful attempts to produce some of these valuable products in relatively large quantities by cell culture have been reported.

Oil-rich plants such as Sesamum indicum have been used as a source of food and medicine since historic times. In the era of high-volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role in food, drug design, drug discovery and metabolism. Availability of the functional and active components of lipid transfer protein genes in plant species is an indication of increased yield and high output for medicinal relevance and therapeutic security. The yield and potency of active ingredients in oil-rich plants will depend largely on the functional and structural expression of the LTP gene that controls photosynthesis, nutrition and lipid metabolism in general. There is, therefore, a need to study the lipid transfer protein genes underlying this potential, to unveil the physicochemical characteristics and other parameters that make these genes uniquely useful to plant breeders and biotechnologists. The study is aimed at using an in silico approach to evaluate lipid transfer protein (LTP) gene variations in sesame and other plants. The objectives of the study are to identify the percentage identity and similarity of the LTP gene in sesame and other plants, to determine the variations in the physicochemical properties of the LTP gene in sesame and other plants, and to determine the stability of the LTP gene in terms of its guanine–cytosine content.

Materials & Methods

Experimental site

The in silico study was carried out in the bioinformatics laboratory of the Department of Genetics and Biotechnology, University of Calabar, Calabar.

Retrieval of Nucleotides and Amino Acid Sequences

The nucleotide and amino acid sequences of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were retrieved in FASTA format from the National Center for Biotechnology Information (NCBI) database. The accession numbers, sequence lengths and E-values were recorded.
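As an illustration of how sequence lengths can be recorded from FASTA-formatted records, the following minimal sketch parses FASTA text with the Python standard library only. The record shown is a hypothetical toy sequence, not one of the sequences retrieved in this study, and the function name `parse_fasta` is an assumption made for the example.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    # Join the wrapped sequence lines of each record
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical toy record for illustration only
example = """>toy_record illustrative only
ATGGCTAGC
TTGACC
"""

lengths = {name: len(seq) for name, seq in parse_fasta(example).items()}
print(lengths)
```

The same function works for protein FASTA files, since it only separates header lines from sequence lines.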

Determination of Percent Identity and Similarity (Homology)

Percentage identity and similarity among the retrieved nucleotide and amino acid sequences of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were determined using the option for comparing more than two sequences in the Basic Local Alignment Search Tool (BLAST).
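The idea behind the two percentages can be sketched for a pair of already aligned protein sequences. This is not the BLAST implementation: the residue groups used to call two amino acids "similar" are an illustrative assumption (BLAST derives positives from a substitution matrix), gap columns are simply skipped, and the two sequences are hypothetical.

```python
# Residue groups treated as "similar" in this sketch.
# Assumption for illustration; BLAST uses a substitution matrix instead.
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "ST", "NQ"]


def identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not scored in this sketch
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Hypothetical aligned pair
identity, similarity = identity_similarity("MKAL-CV", "MRALGCI")
print(round(identity, 2), round(similarity, 2))
```

Because every identical position is also counted as similar, percentage similarity can never be lower than percentage identity, which matches the pattern in Table 2.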

Determination of Physicochemical Properties of the lipid transfer protein genes in low and high oil seeds producing varieties of sesame plant species

The physicochemical properties of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were determined using the Expert Protein Analysis System (ExPASy), the proteomics server of the Swiss Institute of Bioinformatics (SIB). The ProtParam tool on expasy.org was used to compute the physicochemical properties of the query amino acid sequences.
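Several of the ProtParam quantities reported in this study follow from amino acid composition alone. The sketch below is not the ProtParam code. It uses the published definitions: GRAVY is the mean Kyte-Doolittle hydropathy, the aliphatic index is 100 × (x_Ala + 2.9·x_Val + 3.9·(x_Ile + x_Leu)) with x as mole fractions, and the extinction coefficient at 280 nm is 1490·n_Tyr + 5500·n_Trp + 125·n_cystine. The function name, the assumption that all cysteines are paired, and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values used for GRAVY
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_properties(seq, all_cys_paired=True):
    """Composition-based properties of a protein sequence (standard residues)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    count = {aa: seq.count(aa) for aa in KD}
    gravy = sum(KD[aa] * c for aa, c in count.items()) / n
    aliphatic = 100 * (count["A"] + 2.9 * count["V"]
                       + 3.9 * (count["I"] + count["L"])) / n
    # Assumption: every pair of cysteines forms one cystine
    cystines = count["C"] // 2 if all_cys_paired else 0
    extinction = 1490 * count["Y"] + 5500 * count["W"] + 125 * cystines
    return {
        "gravy": gravy,
        "aliphatic_index": aliphatic,
        "extinction_coefficient": extinction,
        "negative_residues": count["D"] + count["E"],   # Asp + Glu
        "positive_residues": count["R"] + count["K"],   # Arg + Lys
    }


# Hypothetical sequence containing each standard residue once
props = protein_properties("ACDEFGHIKLMNPQRSTVWY")
print(props)
```

Under these definitions, a protein with no tryptophan or tyrosine has an extinction coefficient that depends only on its cystines, which is why some values in Table 3 are multiples of 125.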

Determination of secondary and tertiary protein structures of lipid transfer protein genes in low and high oil seeds producing varieties of sesame

Secondary structure prediction was done using the SOPMA software. Tertiary (3D) protein structure for the lipid transfer protein genes was predicted by pasting the protein sequences into the interactive online platform of Phyre2 (Protein Homology/analogY Recognition Engine), based on the translated amino acid sequences retrieved from the NCBI database, as described by Kelley and Sternberg (2009). The RasMol (RasWin) software was used to render the 3D protein structures as ribbons or cartoons with the desired colours and magnification.

Phylogenetics analysis of lipid transfer protein genes in low and high oil seeds producing varieties of sesame 

The MEGA X software was used to align the retrieved DNA and protein sequences, which were then subjected to phylogenetic analysis: phylogenetic tree construction, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference.
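Evolutionary distance estimation can be illustrated with two standard measures for aligned nucleotide sequences: the p-distance (proportion of differing sites) and the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). This is a minimal sketch only. The text does not state which substitution model was selected in MEGA X, and the two sequences below are hypothetical.

```python
import math


def p_distance(aligned_a, aligned_b):
    """Proportion of differing sites among ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a nucleotide p-distance."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned nucleotide pair differing at one of ten sites
p = p_distance("ATGCATGCAT", "ATGCATGAAT")
d = jukes_cantor(p)
print(p, round(d, 4))
```

The corrected distance is always at least as large as the p-distance because it accounts for multiple substitutions at the same site.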

Determination of Guanine–Cytosine contents and N-terminal amino acids side chains

The Genscan online interactive programme of Expasy.org and the Swiss Institute of Bioinformatics (SIB) suite were used to determine the guanine–cytosine content of the nucleotide sequences of the lipid transfer protein genes for the low and high oil seed yielding varieties of sesame, as well as the N-terminal amino acid side chains.
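GC content is the percentage of guanine and cytosine bases among the bases of a nucleotide sequence. The following minimal sketch shows the calculation. It is not the tool used in the study, ambiguous bases such as N are ignored by assumption, and the input sequence is hypothetical.

```python
def gc_content(seq):
    """Return GC content (%) of a nucleotide sequence, ignoring non-ACGT symbols."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100 * sum(b in "GC" for b in bases) / len(bases)


# Hypothetical sequence with two ambiguous bases
gc = gc_content("ATGCGCNN")
print(round(gc, 2))
```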

Results

Determination of nucleotide and amino acid sequences length in selected accessions

The nucleotide sequence lengths of Lipid Transfer Protein 1 gene ranged from 599–8461 bp, while the amino acid sequence lengths ranged from 96–355 residues. The nucleotide sequence of Lipid Transfer Protein 1 gene in Sorghum bicolor (8461 bp) was the longest, while that in Sesamum indicum was the shortest (599 bp), as shown in Table 1.

Table 1: Nucleotide and amino acid sequences of Lipid Transfer Protein 1 gene in selected plants.

| Name of Plants | Accession number | AA (residues) | DNA sequence length (bp) |
|---|---|---|---|
| Sesamum indicum | NC_026157 | 96 | 599 |
| Ricinus communis | NW_002994932 | 115 | 848 |
| Daucus carota subsp. sativus | NC_030386 | 123 | 1349 |
| Medicago truncatula | NC_053042 | 151 | 967 |
| Helianthus annuus | NC_035433 | 119 | 1026 |
| Nicotiana tabacum | NW_015904831 | 117 | 1217 |
| Gossypium hirsutum | NC_053433 | 120 | 761 |
| Vigna unguiculata | NC_040279 | 183 | 1856 |
| Nicotiana attenuata | NC_031995 | 117 | 1372 |
| Dendrobium catenatum | NW_021319309 | 118 | 1321 |
| Musa acuminata | NC_025202 | 117 | 1264 |
| Brassica rapa | NC_024803 | 147 | 794 |
| Oryza sativa Japonica Group | NC_029256 | 120 | 924 |
| Physcomitrium patens | NC_037253 | 355 | 1856 |
| Vigna radiata | NW_014543115 | 117 | 1670 |
| Prunus persica | NC_034014 | 117 | 821 |
| Arachis hypogaea | NC_037619 | 116 | 1528 |
| Solanum tuberosum | NW_006239682 | 114 | 968 |
| Sorghum bicolor | NC_012877 | 119 | 8461 |
| Glycine max (2) | NC_016088 | 119 | 4094 |
| Brachypodium distachyon | NC_016131 | 172 | 2871 |
| Capsicum annuum | NC_029977 | 142 | 769 |
| Arabidopsis thaliana | NC_003071 | 118 | 951 |
| Solanum pennellii | NC_028646 | 114 | 930 |
| Vitis vinifera | NC_012007 | 201 | 2620 |
| Juglans regia | NC_049903 | 119 | 1151 |
| Pistacia vera | NW_022196275 | 117 | 613 |
| Brassica napus | NC_027765 | 147 | 773 |
| Olea europaea var. sylvestris | NC_036249 | 118 | 736 |
| Glycine max | NC_016088 | 119 | 4094 |

Determination of Percentage Identity and Similarity of the Sequences of Lipid Transfer Protein Gene in the Selected Accessions

Results for the percentage identity of the Lipid Transfer Protein 1 gene sequences in the selected accessions showed that Vigna unguiculata (98%) had the highest identity to the LTP1 sequence in the query, while Oryza sativa Japonica Group (40%) had the lowest, as shown in Table 2.

Physiochemical Properties of Lipid Transfer Protein 1 gene in the selected sequences

Analysis of the physicochemical properties of the amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions showed that the number of amino acid residues ranged from 96–355. Molecular weight ranged from 10008.77–35532.61 daltons, with Sesamum indicum having the lowest molecular weight and Physcomitrium patens the highest, as shown in Table 3. None of the other accessions had a number of amino acid residues or a molecular weight equal to that of Lipid Transfer Protein 1 gene in the query. The theoretical pI was 4.61 or above for all the amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions. The total number of negatively charged residues ranged from 1–20, and the total number of positively charged residues ranged from 6–19. The instability index and the aliphatic index ranged from 20.23–69.39 and 73.48–102.24 respectively. Some of the proteins were stable, while twelve were considered unstable based on the instability index. Extinction coefficient was highest for Sesamum indicum (14480). Daucus carota subsp. sativus (-0.213) was the only accession with a negative GRAVY (Table 3).
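The stable/unstable classification can be reproduced from the instability index values in Table 3. ProtParam predicts a protein to be stable when its instability index is below 40 and unstable when it is above 40. The sketch below applies that threshold to the tabulated values. The dictionary and variable names are assumptions made for the example.

```python
# Instability index values as listed in Table 3
INSTABILITY_INDEX = {
    "Sesamum indicum": 33.79, "Ricinus communis": 23.04,
    "Daucus carota subsp. sativus": 20.23, "Medicago truncatula": 52.24,
    "Helianthus annuus": 31.86, "Nicotiana tabacum": 30.91,
    "Gossypium hirsutum": 27.55, "Vigna unguiculata": 50.51,
    "Nicotiana attenuata": 31.64, "Dendrobium catenatum": 69.39,
    "Musa acuminata": 40.34, "Brassica rapa": 47.37,
    "Oryza sativa Japonica Group": 38.34, "Physcomitrium patens": 51.69,
    "Vigna radiata": 30.54, "Prunus persica": 32.61,
    "Arachis hypogaea": 20.31, "Solanum tuberosum": 39.04,
    "Sorghum bicolor": 36.69, "Glycine max (2)": 48.54,
    "Brachypodium distachyon": 60.91, "Capsicum annuum": 28.53,
    "Arabidopsis thaliana": 46.19, "Solanum pennellii": 38.30,
    "Vitis vinifera": 40.33, "Juglans regia": 35.83,
    "Pistacia vera": 22.39, "Brassica napus": 47.37,
    "Olea europaea var. sylvestris": 30.17, "Glycine max": 48.54,
}

THRESHOLD = 40.0  # ProtParam cut-off between stable and unstable

unstable = sorted(name for name, ii in INSTABILITY_INDEX.items() if ii > THRESHOLD)
stable = sorted(name for name, ii in INSTABILITY_INDEX.items() if ii <= THRESHOLD)

print(len(unstable), "unstable;", len(stable), "stable")
```

Applied to the tabulated values, the threshold yields twelve unstable proteins, in agreement with the count reported above.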

Table 2: Percentage identity, similarity of Lipid Transfer Protein 1 gene in selected accessions.

| Name of Plants | Accession number | Percentage identity (%) | Percentage similarity (%) |
|---|---|---|---|
| Sesamum indicum | NC_026157 | 96 | 97 |
| Ricinus communis | NW_002994932 | 93 | 95 |
| Daucus carota subsp. sativus | NC_030386 | 88 | 92 |
| Medicago truncatula | NC_053042 | 97 | 100 |
| Helianthus annuus | NC_035433 | 91 | 94 |
| Nicotiana tabacum | NW_015904831 | 94 | 97 |
| Gossypium hirsutum | NC_053433 | 92 | 96 |
| Vigna unguiculata | NC_040279 | 98 | 100 |
| Nicotiana attenuata | NC_031995 | 88 | 93 |
| Dendrobium catenatum | NW_021319309 | 96 | 99 |
| Musa acuminata | NC_025202 | 87 | 95 |
| Brassica rapa | NC_024803 | 97 | 100 |
| Oryza sativa Japonica Group | NC_029256 | 40 | 70 |
| Physcomitrium patens | NC_037253 | 87 | 93 |
| Vigna radiata | NW_014543115 | 85 | 92 |
| Prunus persica | NC_034014 | 83 | 91 |
| Arachis hypogaea | NC_037619 | 88 | 92 |
| Solanum tuberosum | NW_006239682 | 96 | 98 |
| Sorghum bicolor | NC_012877 | 93 | 97 |
| Glycine max (2) | NC_016088 | 97 | 100 |
| Brachypodium distachyon | NC_016131 | 94 | 98 |
| Capsicum annuum | NC_029977 | 92 | 100 |
| Arabidopsis thaliana | NC_003071 | 91 | 96 |
| Solanum pennellii | NC_028646 | 87 | 92 |
| Vitis vinifera | NC_012007 | 84 | 92 |
| Juglans regia | NC_049903 | 86 | 91 |
| Pistacia vera | NW_022196275 | 92 | 95 |
| Brassica napus | NC_027765 | 95 | 100 |
| Olea europaea var. sylvestris | NC_036249 | 95 | 96 |
| Glycine max | NC_016088 | 97 | 100 |

Table 3: Physico-chemical properties of amino acid sequence of lipid transfer protein 1 gene in selected accessions.

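| S/N | Accession No. | Name of Plants | AA | Mol. Wt (Da) | Theoretical pI | Ext. Coeff. | Instability Index | Aliphatic Index | GRAVY | -ve (Asp+Glu) | +ve (Arg+Lys) | Seq. Length (bp) | No. of Atoms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | NC_026157 | Sesamum indicum | 96 | 10008.77 | 7.50 | 14480 | 33.79 | 85.31 | 0.408 | 7 | 8 | 599 | 1405 |
| 2 | NW_002994932 | Ricinus communis | 115 | 11766.75 | 8.05 | 3605 | 23.04 | 83.91 | 0.449 | 5 | 7 | 848 | 1633 |
| 3 | NC_030386 | Daucus carota subsp. sativus | 123 | 13429.78 | 9.55 | 1490 | 20.23 | 75.45 | -0.213 | 13 | 19 | 1349 | 1919 |
| 4 | NC_053042 | Medicago truncatula | 151 | 15344.13 | 8.81 | 2115 | 52.24 | 94.97 | 0.321 | 5 | 11 | 967 | 2189 |
| 5 | NC_035433 | Helianthus annuus | 119 | 12418.56 | 9.37 | 4970 | 31.86 | 89.24 | 0.161 | 4 | 13 | 1026 | 1748 |
| 6 | NW_015904831 | Nicotiana tabacum | 117 | 11757.69 | 8.10 | 3480 | 30.91 | 97.61 | 0.534 | 4 | 6 | 1217 | 1645 |
| 7 | NC_053433 | Gossypium hirsutum | 120 | 11834.88 | 9.04 | 3605 | 27.55 | 89.50 | 0.463 | 2 | 9 | 761 | 1660 |
| 8 | NC_040279 | Vigna unguiculata | 183 | 19377.62 | 8.06 | 8980 | 50.51 | 88.42 | 0.190 | 13 | 15 | 1856 | 2732 |
| 9 | NC_031995 | Nicotiana attenuata | 117 | 11793.81 | 8.63 | 4970 | 31.64 | 99.32 | 0.539 | 3 | 7 | 1372 | 1656 |
| 10 | NW_021319309 | Dendrobium catenatum | 118 | 12095.22 | 8.89 | 8980 | 69.39 | 86.95 | 0.417 | 4 | 9 | 1321 | 1696 |
| 11 | NC_025202 | Musa acuminata | 117 | 11758.80 | 9.32 | 3480 | 40.34 | 97.52 | 0.446 | 4 | 12 | 1264 | 1675 |
| 12 | NC_024803 | Brassica rapa | 147 | 15420.64 | 9.59 | 2115 | 47.37 | 102.24 | 0.266 | 3 | 17 | 794 | 2229 |
| 13 | NC_029256 | Oryza sativa Japonica Group | 120 | 12092.24 | 9.28 | 6460 | 38.34 | 100.08 | 0.558 | 1 | 9 | 924 | 1709 |
| 14 | NC_037253 | Physcomitrium patens | 355 | 35532.61 | 4.61 | 4105 | 51.69 | 78.45 | 0.315 | 20 | 11 | 1856 | 4914 |
| 15 | NW_014543115 | Vigna radiata | 117 | 11704.85 | 9.08 | 2115 | 30.54 | 94.36 | 0.598 | 2 | 10 | 1670 | 1655 |
| 16 | NC_034014 | Prunus persica | 117 | 11806.90 | 9.25 | 4970 | 32.61 | 94.27 | 0.434 | 1 | 9 | 821 | 1664 |
| 17 | NC_037619 | Arachis hypogaea | 116 | 11539.70 | 9.28 | 1990 | 20.31 | 86.81 | 0.585 | 1 | 9 | 1528 | 1619 |
| 18 | NW_006239682 | Solanum tuberosum | 114 | 11471.64 | 8.73 | 3605 | 39.04 | 95.88 | 0.510 | 5 | 10 | 968 | 1622 |
| 19 | NC_012877 | Sorghum bicolor | 119 | 11564.19 | 9.30 | 3480 | 36.69 | 88.82 | 0.446 | 3 | 10 | 8461 | 1614 |
| 20 | NC_016088 | Glycine max (2) | 119 | 12583.66 | 9.95 | 3605 | 48.54 | 86.13 | 0.150 | 4 | 16 | 4094 | 1760 |
| 21 | NC_016131 | Brachypodium distachyon | 172 | 16678.42 | 7.50 | 4970 | 60.91 | 93.31 | 0.592 | 7 | 8 | 2871 | 2355 |
| 22 | NC_029977 | Capsicum annuum | 142 | 14654.83 | 9.55 | 625 | 28.53 | 109.86 | 0.387 | 3 | 16 | 769 | 2136 |
| 23 | NC_003071 | Arabidopsis thaliana | 118 | 11754.89 | 9.30 | 3605 | 46.19 | 97.71 | 0.475 | 1 | 10 | 951 | 1658 |
| 24 | NC_028646 | Solanum pennellii | 114 | 11542.76 | 8.87 | 3605 | 38.30 | 95.88 | 0.473 | 5 | 11 | 930 | 1636 |
| 25 | NC_012007 | Vitis vinifera | 201 | 21011.92 | 5.84 | 10470 | 40.33 | 73.48 | 0.042 | 17 | 16 | 2620 | 2922 |
| 26 | NC_049903 | Juglans regia | 119 | 11741.95 | 9.20 | 3480 | 35.83 | 95.88 | 0.524 | 3 | 11 | 1151 | 1671 |
| 27 | NW_022196275 | Pistacia vera | 117 | 12090.92 | 4.86 | 5095 | 22.39 | 78.55 | 0.241 | 5 | 4 | 613 | 1658 |
| 28 | NC_027765 | Brassica napus | 147 | 15420.64 | 9.59 | 2115 | 47.37 | 102.24 | 0.266 | 3 | 17 | 773 | 2229 |
| 29 | NC_036249 | Olea europaea var. sylvestris | 118 | 11980.28 | 9.20 | 9440 | 30.17 | 90.00 | 0.398 | 4 | 13 | 736 | 1704 |
| 30 | NC_016088 | Glycine max | 119 | 12583.66 | 9.95 | 3605 | 48.54 | 86.13 | 0.150 | 4 | 16 | 4094 | 1760 |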

Table 4: Motifs in Amino acid sequence of Lipid Transfer Protein 1 gene in selected accessions.

| Plants | Accessions | Motif |
|---|---|---|
| Sesamum indicum | NC_026157 | N-myristoylation site, Protease inhibitor/seed storage/LTP family, Extensin_1 Extensin-like protein repeat |
| Ricinus communis | NW_002994932 | N-glycosylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family, Big-1 (bacterial Ig-like domain 1) domain profile |
| Daucus carota subsp. sativus | NC_030386 | Casein kinase II phosphorylation site, N-myristoylation site, Protein kinase C phosphorylation site, SCP-2 sterol transfer family, Protein kinase C phosphorylation site, Microbodies C-terminal targeting signal |
| Medicago truncatula | NC_053042 | Protein kinase C phosphorylation site, N-myristoylation site, Casein kinase II phosphorylation site, Lysosome-associated membrane glycoprotein family, Proline-rich region profile, Tryp_alpha_amyl Protease inhibitor family, MYRISTYL N-myristoylation site, DEC-1 protein, N-terminal region |
| Helianthus annuus | NC_035433 | Casein kinase II phosphorylation site, Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Nicotiana tabacum | NW_015904831 | N-myristoylation site, Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-glycosylation site, Agouti protein, Protease inhibitor/seed storage/LTP family |
| Gossypium hirsutum | NC_053433 | Protein kinase C phosphorylation site, Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-myristoylation site, Amastin surface glycoprotein, Protease inhibitor/seed storage/LTP family |
| Vigna unguiculata | NC_040279 | cAMP- and cGMP-dependent protein kinase phosphorylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-glycosylation site, N-myristoylation site, Big-1 (bacterial Ig-like domain 1) domain profile, Protease inhibitor/seed storage/LTP family, NPR nonapeptide repeat |
| Nicotiana attenuata | NC_031995 | N-glycosylation site, Casein kinase II phosphorylation site, N-myristoylation site, Plant lipid transfer proteins signature, Agouti protein, Protease inhibitor/seed storage/LTP family |
| Dendrobium catenatum | NW_021319309 | N-myristoylation site, N-glycosylation site, Plant lipid transfer proteins signature, Ankyrin repeat, Sialic-acid binding micronemal adhesive repeat, Protease inhibitor/seed storage/LTP family |
| Musa acuminata | NC_025202 | Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family, N-myristoylation site, Plant lipid transfer proteins signature |
| Brassica rapa | NC_024803 | Proline-rich region profile, N-glycosylation site, Protease inhibitor/seed storage/LTP family, Protein kinase C phosphorylation site, N-myristoylation site, Procyclic acidic repetitive protein (PARP) |
| Oryza sativa Japonica Group | NC_029256 | N-glycosylation site, Protease inhibitor/seed storage/LTP family, Plant lipid transfer proteins signature, N-myristoylation site |
| Physcomitrium patens | NC_037253 | N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-glycosylation site, Protease inhibitor/seed storage/LTP family |
| Vigna radiata | NW_014543115 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, N-myristoylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Prunus persica | NC_034014 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Arachis hypogaea | NC_037619 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Solanum tuberosum | NW_006239682 | Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Sorghum bicolor | NC_012877 | Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-myristoylation site, Alanine-rich region profile, Protease inhibitor/seed storage/LTP family |
| Glycine max (2) | NC_016088 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family |
| Brachypodium distachyon | NC_016131 | N-myristoylation site, Protein kinase C phosphorylation site, N-glycosylation site, Alanine-rich region profile, Protease inhibitor/seed storage/LTP family |
| Capsicum annuum | NC_029977 | Prokaryotic membrane lipoprotein lipid attachment site profile, Proline-rich region profile, Protein kinase C phosphorylation site, cAMP- and cGMP-dependent protein kinase phosphorylation site, N-myristoylation site, Amidation site, Protease inhibitor/seed storage/LTP family, Penaeidin, Protease inhibitor/seed storage/LTP family |
| Arabidopsis thaliana | NC_003071 | Prokaryotic membrane lipoprotein lipid attachment site profile, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Solanum pennellii | NC_028646 | Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
| Vitis vinifera | NC_012007 | N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, N-glycosylation site, CHCH domain, Protease inhibitor/seed storage/LTP family |
| Juglans regia | NC_049903 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family, cAMP- and cGMP-dependent protein kinase phosphorylation site |
| Pistacia vera | NW_022196275 | Casein kinase II phosphorylation site, Plant lipid transfer proteins signature, N-glycosylation site, Big-1 (bacterial Ig-like domain 1) domain profile, Protein kinase C phosphorylation site, N-myristoylation site |
| Brassica napus | NC_027765 | Proline-rich region profile, N-glycosylation site, Protein kinase C phosphorylation site, N-myristoylation site |
| Olea europaea var. sylvestris | NC_036249 | N-myristoylation site, Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family, Casein kinase II phosphorylation site |
| Glycine max | NC_016088 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family |

Table 5: G-C contents and other parameters of nucleotide sequence of Lipid Transfer Protein 1 gene in selected accessions.

| Accessions | Accession numbers | GC Contents (%) | Sequence Length (bp) | Start Codon | End Codon |
|---|---|---|---|---|---|
| Sesamum indicum | NC_026157 | 48.75 | 599 | 127 | 417 |
| Ricinus communis | NW_002994932 | 39.86 | 848 | 82 | 419 |
| Daucus carota subsp. sativus | NC_030386 | 34.91 | 1349 | 175 | 300 |
| Medicago truncatula | NC_053042 | 34.75 | 967 | 127 | 582 |
| Helianthus annuus | NC_035433 | 35.87 | 1026 | 58 | 407 |
| Nicotiana tabacum | NW_015904831 | 36.57 | 1217 | 96 | 439 |
| Gossypium hirsutum | NC_053433 | 45.07 | 761 | 330 | 519 |
| Vigna unguiculata | NC_040279 | 39.82 | 1856 | 86 | 425 |
| Nicotiana attenuata | NC_031995 | 36.44 | 1372 | 216 | 559 |
| Dendrobium catenatum | NW_021319309 | 35.43 | 1321 | 72 | 422 |
| Musa acuminata | NC_025202 | 48.18 | 1264 | 76 | 419 |
| Brassica rapa | NC_024803 | 40.68 | 794 | 102 | 545 |
| Oryza sativa Japonica Group | NC_029256 | 54.55 | 924 | 121 | 489 |
| Physcomitrium patens | NC_037253 | 52.86 | 1856 | 474 | 1445 |
| Vigna radiata | NW_014543115 | 31.80 | 1670 | 96 | 436 |
| Prunus persica | NC_034014 | 43.48 | 821 | 97 | 440 |
| Arachis hypogaea | NC_037619 | 31.48 | 1528 | 267 | 475 |
| Solanum tuberosum | NW_006239682 | 35.85 | 968 | 96 | 430 |
| Sorghum bicolor | NC_012877 | 41.73 | 8461 | 247 | 342 |
| Glycine max (2) | NC_016088 | 29.73 | 4094 | 216 | 565 |
| Brachypodium distachyon | NC_016131 | 48.69 | 2871 | 396 | 735 |
| Capsicum annuum | NC_029977 | 36.15 | 769 | 150 | 271 |
| Arabidopsis thaliana | NC_003071 | 40.06 | 951 | 129 | 475 |
| Solanum pennellii | NC_028646 | 34.30 | 930 | 104 | 438 |
| Vitis vinifera | NC_012007 | 35.84 | 2620 | 102 | 465 |
| Juglans regia | NC_049903 | 39.88 | 1151 | 83 | 436 |
| Pistacia vera | NW_022196275 | 43.07 | 613 | 102 | 234 |
| Brassica napus | NC_027765 | 40.49 | 773 | 81 | 524 |
| Olea europaea var. sylvestris | NC_036249 | 43.07 | 736 | 113 | 436 |
| Glycine max | NC_016088 | 29.73 | 4094 | 216 | 565 |

Motifs in Amino Acid Sequences of Lipid Transfer Protein 1 gene in the Selected Accessions

Analysis of motifs in the amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions showed that the number of motifs ranged from 3–9 across accessions (Table 4). The motifs N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site were the most common motifs found in the amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions.
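Four of the common motifs are short PROSITE patterns that can be expressed as regular expressions: N-glycosylation site (N-{P}-[ST]-{P}), Protein kinase C phosphorylation site ([ST]-x-[RK]), Casein kinase II phosphorylation site ([ST]-x(2)-[DE]) and N-myristoylation site (G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}). The sketch below scans a sequence for these patterns. It is not the motif search tool used in the study, the function name is an assumption, and the short input sequences are hypothetical.

```python
import re

# PROSITE patterns translated to regular expressions
PROSITE_REGEX = {
    "N-glycosylation site": r"N[^P][ST][^P]",
    "Protein kinase C phosphorylation site": r"[ST].[RK]",
    "Casein kinase II phosphorylation site": r"[ST].{2}[DE]",
    "N-myristoylation site": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan_motifs(seq):
    """Return {motif name: [1-based start positions]} for motifs found in seq."""
    seq = seq.upper()
    hits = {}
    for name, pattern in PROSITE_REGEX.items():
        # A lookahead group reports overlapping occurrences as well
        positions = [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq)]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical short peptides, one per pattern
print(scan_motifs("ANGSA"))
print(scan_motifs("ASAKA"))
print(scan_motifs("ATAAEA"))
print(scan_motifs("GAAASA"))
```

Such short patterns match frequently by chance, which is consistent with their appearing in most accessions in Table 4.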

GC content and start and end codons of Lipid Transfer Protein 1 gene in the Selected Accessions

GC analysis revealed that the GC content of Lipid Transfer Protein 1 gene in the selected accessions ranged from 29.73–54.55%, with Glycine max having the lowest GC content (29.73%) and Oryza sativa Japonica Group having the highest (54.55%). The start and end codons for the Lipid Transfer Protein 1 gene generally varied among the selected accessions, although some accessions had the same start codon but not the same end codon (Table 5).

Secondary Structure of Amino Acid Sequences of Lipid Transfer Protein 1 gene in the Selected Accessions

Analysis of the secondary structures of the amino acid sequences of Lipid Transfer Protein 1 gene showed that the region covered by random coil was the highest in the sequences compared to alpha helix and extended strand. Alpha helix ranged from 33.11–54.31%, extended strand from 9.17–15.13% and random coil from 32.77–51.16% across the accessions.
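The percentages above describe how much of each sequence is assigned to each secondary structure state. Assuming a per-residue prediction string in which `h` marks alpha helix, `e` extended strand and `c` random coil (the one-letter codes used in SOPMA output), the composition can be computed as in this minimal sketch. The state string is hypothetical, not a prediction from this study.

```python
from collections import Counter

STATE_NAMES = {"h": "alpha helix", "e": "extended strand", "c": "random coil"}


def structure_percentages(states):
    """Return the percentage of residues assigned to each secondary structure state."""
    states = states.lower()
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    return {name: 100 * counts[code] / len(states)
            for code, name in STATE_NAMES.items()}


# Hypothetical 20-residue prediction string
composition = structure_percentages("hhhhhhhhcccceeeccccc")
print(composition)
```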

Phylogenetic relationship of Lipid Transfer Protein 1 gene sequence in the selected accessions

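Phylogenetic analysis of the Lipid Transfer Protein 1 gene sequences from the selected accessions showed that the sequence of Sesamum indicum is closely related to that of Brachypodium distachyon and distant from those of Glycine max, Vigna unguiculata and Capsicum annuum (Figure 1).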

Figure 1: Phylogenetic tree of Lipid Transfer Protein 1 gene in the selected accessions.

Tertiary structure of amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions

Results of the analysis of the tertiary protein structure of Lipid Transfer Protein 1 gene in the selected accessions are shown in Figures 2–31. In each figure, the pink portion of the ribbon structure represents the alpha helix, the blue portion represents the random coil and the green portion represents the extended strand.

Figure 2: The Tertiary structure of Lipid Transfer Protein gene of Sesamum indicum.

Figure 3: The Tertiary structure of Lipid Transfer Protein gene of Ricinus communis.

Figure 4: The Tertiary structure of Lipid Transfer Protein gene of Daucus carota subsp. sativus.

Figure 5: The Tertiary structure of Lipid Transfer Protein gene of Medicago truncatula.

Figure 6: The Tertiary structure of Lipid Transfer Protein gene of Helianthus annuus.

Figure 7: The Tertiary structure of Lipid Transfer Protein gene of Nicotiana tabacum.

Figure 8: The Tertiary structure of Lipid Transfer Protein gene of Gossypium hirsutum.

Figure 9: The Tertiary structure of Lipid Transfer Protein gene of Vigna unguiculata.

Figure 10: The Tertiary structure of Lipid Transfer Protein gene of Nicotiana attenuata.

Figure 11: The Tertiary structure of Lipid Transfer Protein gene of Dendrobium catenatum.

Figure 12: The Tertiary structure of Lipid Transfer Protein gene of Musa acuminata.

Figure 13: The Tertiary structure of Lipid Transfer Protein gene of Brassica rapa.

Figure 14: The Tertiary structure of Lipid Transfer Protein gene of Oryza sativa Japonica Group.

Figure 15: The Tertiary structure of Lipid Transfer Protein gene of Physcomitrium patens.

Figure 16: The Tertiary structure of Lipid Transfer Protein gene of Vigna radiata.

Figure 17: The Tertiary structure of Lipid Transfer Protein gene of Prunus persica.

Figure 18: The Tertiary structure of Lipid Transfer Protein gene of Arachis hypogaea.

Figure 19: The Tertiary structure of Lipid Transfer Protein gene of Solanum tuberosum.

Figure 20: The Tertiary structure of Lipid Transfer Protein gene of Sorghum bicolor.

Figure 21: The Tertiary structure of Lipid Transfer Protein gene of Glycine max (2).

Figure 22: The Tertiary structure of Lipid Transfer Protein gene of Brachypodium distachyon.

Figure 23: The Tertiary structure of Lipid Transfer Protein gene of Capsicum annuum.

Figure 24: The Tertiary structure of Lipid Transfer Protein gene of Arabidopsis thaliana.

Figure 25: The Tertiary structure of Lipid Transfer Protein gene of Solanum pennellii.

Figure 26: The Tertiary structure of Lipid Transfer Protein gene of Vitis vinifera.

Figure 27: The Tertiary structure of Lipid Transfer Protein gene of Juglans regia.

Figure 28: The Tertiary structure of Lipid Transfer Protein gene of Pistacia vera.

Figure 29: The Tertiary structure of Lipid Transfer Protein gene of Brassica napus.

Figure 30: The Tertiary structure of Lipid Transfer Protein gene of Olea europaea var. sylvestris.

Figure 31: The Tertiary structure of Lipid Transfer Protein gene of Glycine max.

Discussion

The expectation of the study is that lipid transfer protein genes in low and high oil-yielding varieties of sesame will vary in their percent identity and similarity, and will show variable physicochemical properties and genetic stability indices, which would account for differences in the functional and structural capacity of the LTP gene for oil production in the crop.

Results on the physicochemical properties of the Lipid Transfer Protein 1 gene in the respective accessions showed that the higher the number of amino acids, the higher the values of the other properties: molecular weight, theoretical pI, number of negatively and positively charged residues, and extinction coefficient. Some accessions had instability indices that classify their proteins as unstable, while the others were classified as stable.

GC analysis revealed that the GC content of the Lipid Transfer Protein 1 gene in the selected accessions ranged from 29.73% to 54.55%, with Glycine max having the lowest GC content and Oryza sativa Japonica Group the highest (54.55%). The start and end codons of the Lipid Transfer Protein 1 gene generally varied among the selected accessions, although some accessions shared the same start codon but not the same end codon. Solanum tuberosum, Capsicum annuum and Solanum pennellii showed no start or end codon at a suboptimal exon cutoff of 1.00, so a suboptimal exon cutoff of 0.50 was used to find their start and end codons. In Juglans regia, Brassica napus and Olea europaea var. sylvestris, Sngl [Single-exon gene (ATG to stop)] was used to determine the start and end codons instead of Init [Initial exon (ATG to 5' splice site)].
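As an illustration of how a GC percentage of this kind can be obtained from a nucleotide sequence, the following minimal Python sketch counts G and C bases over the unambiguous bases. It is not the tool used in this study; the function name `gc_content` and the toy sequence are assumptions made only for the example.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy sequence for illustration only (not one of the 30 accessions).
toy = "ATGGCCAAGTTAGCTTGA"
print(f"GC content: {gc_content(toy):.2f}%")
```

Ambiguous symbols such as N are left out of the denominator in this sketch; a tool that counts them would report slightly lower percentages for the same sequence.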

The extinction coefficient indicates how much light a protein absorbs at a given wavelength (Gasteiger et al., 2005). Results of the present study showed that the extinction coefficients of the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions were higher than that of the query sequence; the implication, therefore, is that the amount of light these proteins absorb will vary.
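For orientation, the extinction coefficient at 280 nm reported by ProtParam (Gasteiger et al., 2005) is computed from the numbers of tyrosine, tryptophan and cystine residues, using molar coefficients of 1490, 5500 and 125 M⁻¹ cm⁻¹ respectively. The sketch below reproduces that arithmetic; the function name, the `cystines` parameter and the toy peptide are assumptions for illustration, and the number of cystines (disulfide-bonded cysteine pairs) has to be supplied by the user.

```python
def extinction_coefficient(protein: str, cystines: int = 0) -> int:
    """Molar extinction coefficient at 280 nm in water (M^-1 cm^-1).

    Uses 1490 per Tyr, 5500 per Trp and 125 per cystine (disulfide bond).
    `cystines` is the assumed number of disulfide bonds, not of Cys residues.
    """
    seq = protein.upper()
    return 1490 * seq.count("Y") + 5500 * seq.count("W") + 125 * cystines


# Toy peptide for illustration only (not one of the study sequences).
toy = "MAYWCCKY"
print(extinction_coefficient(toy))              # all Cys reduced
print(extinction_coefficient(toy, cystines=1))  # one disulfide bond assumed
```

Because the value depends only on Tyr, Trp and cystine counts, two proteins of equal length can differ widely in extinction coefficient if their aromatic content differs.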

According to Guruprasad et al. (1990), an instability index greater than 40 indicates that the protein will be unstable in a test tube. Since the instability index of the Lipid Transfer Protein 1 gene product was above 40 in twelve of the accessions, these proteins are expected to be unstable in a test tube, while the other eighteen are considered stable. The aliphatic index (AI) is the relative volume of a protein occupied by its aliphatic side chains, and it plays an important role in a protein's thermal stability.
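The two quantities discussed above can be sketched in a few lines of Python. The stability call simply applies the threshold of 40 to an instability index that has already been computed elsewhere, and the aliphatic index follows the ProtParam definition (Gasteiger et al., 2005), AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. Function names and the toy sequence are assumptions for illustration.

```python
def stability_class(instability_index: float, threshold: float = 40.0) -> str:
    """Classify a protein from a precomputed instability index."""
    return "unstable" if instability_index > threshold else "stable"


def aliphatic_index(protein: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# The two extremes of the instability index range reported in this study.
for ii in (20.23, 69.39):
    print(ii, stability_class(ii))

# Toy sequence for illustration only.
print(round(aliphatic_index("MKAVILAGCS"), 2))
```

The instability index itself requires the dipeptide weight table of Guruprasad et al. (1990) and is not recomputed in this sketch.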

Sequence motifs are short, recurring patterns in DNA that are thought to have a biological function and are usually sequence-specific binding sites for proteins. In the present study, the selected accessions share almost the same motifs, although at different positions, indicating that the sequence-specific binding sites for proteins differ, which might lead to differences in function. The biological significance of GC content diversity in plants remains unclear due to the lack of sufficiently robust genomic data (Smarda et al., 2014).
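To make the idea of a shared motif at different positions concrete, the sketch below scans an amino acid sequence for the PROSITE N-glycosylation pattern N-{P}-[ST]-{P}, one of the motifs reported for these accessions. It is a simplified stand-in for a motif search service, not the procedure used in this study; the function name and the toy sequence are assumptions.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} written as a regular expression.
# The lookahead allows overlapping sites to be reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glycosylation_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched tetrapeptide) for each site."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in N_GLYCOSYLATION.finditer(seq)]


# Toy sequence for illustration only: N-A-T-A matches, N-P-S-A does not.
print(find_n_glycosylation_sites("MANATAPNPSA"))
```

Running the same scan over each accession and comparing the reported positions is one way to tabulate where a common motif falls in each sequence.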

Summary

The study was conducted to analyze the sequence of the Lipid Transfer Protein 1 gene in 30 selected accessions. The LTP 1 gene sequences were retrieved from the NCBI database. MEGA 10.0 software was used to align the sequences and to produce a phylogenetic tree of the accessions. The physicochemical properties, GC content, secondary protein structure, motifs, and percentage identity and similarity of the amino acid and nucleotide sequences were determined.

The physicochemical properties of the Lipid Transfer Protein 1 gene products revealed that twelve of the proteins are unstable, while the remaining eighteen were considered stable according to their instability indices. The accessions contained similar motifs, the most common of which were the N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site.

GC content analysis revealed a range of 29.73–54.55%, with Glycine max having the lowest GC content and Oryza sativa Japonica Group the highest (54.55%). Secondary structure analysis of the LTP 1 amino acid sequences in the selected accessions revealed three main structures: alpha helix, random coil and extended strand. Alpha helix ranged from 33.11–54.31%, extended strand from 9.17–15.13%, and random coil from 32.77–51.16%.

Conclusion

In conclusion, the results of the present study revealed differences in the Lipid Transfer Protein 1 gene sequences of the selected accessions across the parameters analyzed. However, the results also revealed a high degree of sequence identity and similarity among the selected accessions, implying similar functions in these accessions.

Recommendation

Following the results of the present study, we recommend that Lipid Transfer Protein 1 gene sequences of other economically important crops and underutilized plant species be analyzed and documented to enable easier assessment of variability and diversity.

We also recommend that a larger population size be explored, since this would allow more accurate judgement.

References