Research Article
Evaluation of Lipid Transfer Protein (Ltp) 1 Gene in Sesame and Other Plants Using Bioinformatics Approach
- Edu N.E, Udensi O.U
- Agada F.N *
- Ubi G.M
- Agbor R
Department of Genetics and Biotechnology, University of Calabar, PMB 1115, Calabar, Nigeria.
*Corresponding Author: Department of Genetics and Biotechnology, University of Calabar, PMB 1115, Calabar, Nigeria.
Citation: Edu N.E, Udensi O.U, Agada F.N, Ubi G.M, Agbor R. (2023). Evaluation of Lipid Transfer Protein (Ltp) 1 Gene in Sesame and Other Plants Using Bioinformatics Approach. Journal of BioMed Research and Reports, BioRes Scientia Publishers. 2(6):1-20. DOI: 10.59657/2837-4681.brs.23.037
Copyright: © 2023 Agada F.N, this is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Received: September 05, 2023 | Accepted: September 20, 2023 | Published: September 27, 2023
Abstract
This study used bioinformatics tools to characterize the Lipid Transfer Protein 1 gene in selected accessions, with the Benny seed (Sesamum indicum) Lipid Transfer Protein 1 sequence as the query sequence. Nucleotide and amino acid sequences of 30 accessions were retrieved from the NCBI database and analyzed for homology, physicochemical properties, motifs, GC content and phylogenetic relationships. Results showed that the nucleotide and amino acid sequence lengths of this gene differ among the selected accessions. Nucleotide length varied between 599 and 8461 bp, while amino acid sequence length varied between 96 and 355 residues. Molecular weight ranged from 10008.77 to 35532.61 daltons, with Sesamum indicum having the lowest molecular weight and Physcomitrium patens the highest. The theoretical pI was 4.61 or above for all the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions. The total number of negatively charged residues ranged from 1 to 20. The instability index and aliphatic index ranged from 20.23–69.39 and 73.48–102.24, respectively. Twelve of the proteins were considered unstable on the basis of the instability index, while the rest were stable. The extinction coefficient was highest for Sesamum indicum (14480). Daucus carota subsp. sativus (-0.213) was the only accession with a negative GRAVY. The motifs N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site were the most common across the selected accessions. GC content ranged from 29.73–54.55%. Analysis of the secondary structure of the amino acid sequences of the Lipid Transfer Protein 1 gene showed that the region covered by random coil was the highest in the sequences compared to alpha helix and extended strand. Alpha helix ranged from 33.11–54.31%, extended strand from 9.17–15.13% and random coil from 32.77–51.16% across the accessions. Following the results of the present study, it can be concluded that the Lipid Transfer Protein 1 gene sequence of Sesamum indicum is closely related to the Lipid Transfer Protein 1 gene in Brachypodium distachyon and distant from that in Glycine max, Vigna unguiculata and Capsicum annum.
Keywords: lipid transfer protein 1; Sesamum indicum; bioinformatics
Introduction
Sesamum indicum is an annual herb of the family Pedaliaceae. The plant is commonly known as benniseed in English and is found in tropical and subtropical areas of Asia, Africa and South America. Compared to similar crops, such as peanuts, soybean and rapeseed, the seeds of sesame are believed to have the most oil. Sesame is one of the oldest oilseed crops known, domesticated well over 3000 years ago. The genus Sesamum has many other species, most being wild and native to sub-Saharan Africa. S. indicum, the cultivated type, originated in India (Ogasawara et al., 1988) and is tolerant to drought-like conditions, growing where other crops fail (Raghav et al., 1990). Sesame is widely known for its oil seed production, which finds wide application in the food and pharmaceutical industries. The gene responsible for this oil production trait is the lipid transfer protein 1 gene, whose properties, functions and structures this research analyzes using an in silico approach. The expression in silico was first used in public in 1989 at the workshop "Cellular Automata: Theory and Application" in Los Alamos, New Mexico, by Pedro Miramontes, a mathematician from the National Autonomous University of Mexico (UNAM), who presented the report "DNA and RNA physiochemical constraints, Cellular Automata and Molecular Evolution". Plant genomes show a wide array of architectures (i.e. genetic make-up), varying immensely in size, structure and content. Some organelle DNAs have even developed elaborate peculiarities such as scrambled coding regions, non-standard genetic codes and convoluted modes of post-transcriptional modification and editing, all of which have been deciphered using bioinformatics tools.
Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. It combines computer science, biology, mathematics and engineering to analyze and interpret biological data, and has been used for in silico analyses of biological data and genes using mathematical and statistical techniques. Bioinformatics has become an important part of many areas of biology. In experimental molecular biology, bioinformatics techniques such as image and signal processing allow extraction of useful results from large amounts of raw data. In the field of genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations. It plays a role in the text mining of biological literature and the development of biological and gene ontologies to organize and query biological data. It also plays a role in the analysis of gene and protein expression and regulation. Bioinformatics tools aid in the comparison of genetic and genomic data and, more generally, in the understanding of evolutionary aspects of molecular biology. At a more integrative level, it helps to analyze and catalogue the biological pathways and networks that are an important part of systems biology. In structural biology, it aids in the simulation and modeling of DNA, RNA and proteins as well as biomolecular interactions.
In genetics and biochemistry, in silico studies can be used for the molecular modeling of genes, gene expression and gene sequence analysis, the 3D structure of proteins, the identification of diseases and the prediction of lipid metabolic pathways. In silico drug design software also plays an important role in designing innovative proteins.
With increasing concern about the side effects caused by modern synthetic or chemical drugs, oil seeds and medicinal plants remain the main source of a large range of basic healthcare and pharmaceutical products. Successful attempts to produce some of these valuable products in relatively large quantities by cell cultures have been reported.
Oil-rich plants such as Sesamum indicum have been used as a source of food and medicine since historic times. In the era of high-volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role in food, drug design, drug discovery and metabolism. The availability of functional and active components of lipid transfer protein genes in plant species is an indication of increased yield and high output for medicinal relevance and therapeutic security. The yield and potency of active ingredients in oil-rich plants will depend largely on the functional and structural expression of the LTP gene, which controls photosynthesis, nutrition and lipid metabolism in general. There is therefore a need to study the lipid transfer protein genes underlying this potential, to unveil the physicochemical characteristics and other parameters that make these genes uniquely useful to plant breeders and biotechnologists. The study aims to use an in silico approach to evaluate lipid transfer protein (LTP) gene variations in sesame and other plants. The objectives of the study are to identify the percentage identity and similarity of the LTP gene in sesame and other plants, to determine the variations in the physicochemical properties of the LTP gene in sesame and other plants, and to determine LTP gene stability in terms of guanine–cytosine content.
Materials & Methods
Experimental site
The in-silico study was carried out in the bioinformatics laboratory of the Department of Genetics and Biotechnology, University of Calabar, Calabar.
Retrieval of Nucleotides and Amino Acid Sequences
The nucleotide and amino acid sequences of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were retrieved in FASTA format from the National Center for Biotechnology Information (NCBI) database. The accession numbers, sequence lengths and E-values were recorded.
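Once sequences have been downloaded in FASTA format, they can be read into memory with a few lines of code. The following is a minimal sketch, not the procedure used in the study; the function name `parse_fasta` and the two example records are hypothetical and do not correspond to any accession in this article.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record identifier: sequence}."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the identifier.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap; concatenate them in upper case.
            records[current_id] += line.upper()
    return records


# Hypothetical two-record input; identifiers and bases are made up for illustration.
example = """>seq_A illustrative record
ATGGCTAGC
TTGA
>seq_B another illustrative record
atgccc
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

Sequence lengths such as those recorded for each accession can then be read off with `len()` on each parsed sequence.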
Determination of Percent Identity and Similarity (Homology)
Percentage identity and similarity among the retrieved nucleotide and amino acid sequences of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were determined using the option for comparing more than two sequences in the Basic Local Alignment Search Tool (BLAST).
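To make the identity measure concrete, the sketch below computes percent identity for two sequences that have already been aligned. It is an illustration only: the function name and the short example alignment are hypothetical, and alignment tools differ in the denominator they use (for example, whether gapped columns are counted), so values from this sketch need not match a given tool's report. Percent similarity additionally requires a substitution scoring scheme and is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptides (not taken from the study).
identity = percent_identity("MKT-AVL", "MKSGAVL")
print(round(identity, 2))
```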
Determination of Physico Chemical Properties of the lipid transfer protein genes in low and high oil seeds producing varieties of sesame plant species
The physicochemical properties of the lipid transfer protein genes in low and high oil seed producing varieties of sesame were determined using the Expert Protein Analysis System (ExPASy), the proteomics server of the Swiss Institute of Bioinformatics (SIB). The ProtParam tool, accessed through expasy.org, was used to compute the physicochemical properties of the query amino acid sequences.
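Several of the properties reported by ProtParam follow simple published formulas. As a hedged sketch, the code below computes GRAVY (the mean Kyte-Doolittle hydropathy), the aliphatic index (Ala + 2.9 × Val + 3.9 × (Ile + Leu), in mole percent) and the counts of negatively and positively charged residues. The function names and the example peptide are hypothetical, and only the 20 standard amino acids are handled.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: summed hydropathy divided by sequence length."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = seq.upper()

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_counts(seq: str) -> tuple:
    """Return (Asp + Glu, Arg + Lys) residue counts."""
    seq = seq.upper()
    return seq.count("D") + seq.count("E"), seq.count("R") + seq.count("K")


# Hypothetical peptide, used only to exercise the functions.
peptide = "MKALVIDE"
print(round(gravy(peptide), 4), round(aliphatic_index(peptide), 2), charged_counts(peptide))
```

A positive GRAVY indicates an overall hydrophobic sequence and a negative value a hydrophilic one, which is how the GRAVY column of the results can be read.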
Determination of secondary and tertiary protein structures of lipid transfer protein genes in low and high oil seeds producing varieties of sesame
Secondary structure was predicted using the SOPMA software. Tertiary (3D) protein structures for the lipid transfer protein genes were predicted by pasting the protein sequences into the interactive online platform Phyre2 (Protein Homology/analogY Recognition Engine), based on the translated amino acid sequences retrieved from the NCBI database, following Kelley and Sternberg (2009). The RasMol (RasWin) software was used to render the 3D protein structures as ribbons or cartoons with the desired colours and magnitudes.
Phylogenetics analysis of lipid transfer protein genes in low and high oil seeds producing varieties of sesame
The MEGA X software was used to align the retrieved DNA and protein sequences and to subject them to phylogenetic analysis: phylogenetic tree construction, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference.
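Evolutionary distance estimation can be illustrated with the simplest case. The sketch below computes the p-distance (proportion of differing sites) between two aligned nucleotide sequences and applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). This is a stand-in chosen for brevity, not the substitution model selected in MEGA X for this study; the function names and example sequences are hypothetical.

```python
import math


def p_distance(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Proportion of differing sites among columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for nucleotide sequences."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned nucleotide sequences differing at two of ten sites.
p = p_distance("ACGTACGTAC", "ACGTACGTTT")
d = jukes_cantor(p)
print(p, round(d, 4))
```

A matrix of such pairwise distances is the usual input to distance-based tree-building methods.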
Determination of Guanine–Cytosine contents and N-terminal amino acids side chains
The Genscan online interactive programme of Expasy.org and the Swiss Institute of Bioinformatics (SIB) suite were used to determine the guanine–cytosine contents of the nucleotide sequences of the lipid transfer protein genes for the low and high oil seed yielding varieties of sesame, as well as the N-terminal amino acid side chains.
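GC content itself is a simple proportion: the number of G and C bases divided by the number of unambiguous bases, expressed as a percentage. A minimal sketch follows, assuming that ambiguous symbols such as N are excluded from the denominator; the function name and example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # assumption: skip N and other symbols
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical sequence with 4 G/C bases out of 10.
example_gc = gc_content("ATGCGCTTAA")
print(example_gc)
```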
Results
Determination of nucleotide and amino acid sequences length in selected accessions
Nucleotide sequence lengths of the Lipid Transfer Protein 1 gene ranged from 599 to 8461 bp, while amino acid sequence lengths ranged from 96 to 355 residues. The nucleotide sequence of the Lipid Transfer Protein 1 gene in Sorghum bicolor (8461 bp) was the longest, while that in Sesamum indicum (599 bp) was the shortest, as shown in Table 1.
Table 1: Nucleotide and amino acid sequences of Lipid Transfer Protein 1 gene in selected plants.
Name of Plants | Accession number | Amino acid length (residues) | Nucleotide length (bp) |
Sesamum indicum | NC_026157 | 96 | 599 |
Ricinus communis | NW_002994932 | 115 | 848 |
Daucus carota subsp. sativus | NC_030386 | 123 | 1349 |
Medicago truncatula | NC_053042 | 151 | 967 |
Helianthus annus | NC_035433 | 119 | 1026 |
Nicotiana tabacum | NW_015904831 | 117 | 1217 |
Gossypium hirsutum | NC_053433 | 120 | 761 |
Vigna unguiculata | NC_040279 | 183 | 1856 |
Nicotiana attenuata | NC_031995 | 117 | 1372 |
Dendrobium catenatum | NW_021319309 | 118 | 1321 |
Musa acuminata | NC_025202 | 117 | 1264 |
Brassica rapa | NC_024803 | 147 | 794 |
Oryza sativa Japonica Group | NC_029256 | 120 | 924 |
Physcomitrium patens | NC_037253 | 355 | 1856 |
Vigna radiata | NW_014543115 | 117 | 1670 |
Prunus persica | NC_034014 | 117 | 821 |
Arachis hypogaea | NC_037619 | 116 | 1528 |
Solanum tuberosum | NW_006239682 | 114 | 968 |
Sorghum bicolor | NC_012877 | 119 | 8461 |
Glycine max (2) | NC_016088 | 119 | 4094 |
Brachypodium distachyon | NC_016131 | 172 | 2871 |
Capsicum annum | NC_029977 | 142 | 769 |
Arabidopsis thaliana | NC_003071 | 118 | 951 |
Solanum pennellii | NC_028646 | 114 | 930 |
Vitis vinifera | NC_012007 | 201 | 2620 |
Juglans regia | NC_049903 | 119 | 1151 |
Pistacia vera | NW_022196275 | 117 | 613 |
Brassica napus | NC_027765 | 147 | 773 |
Olea europaea var. sylvestris | NC_036249 | 118 | 736 |
Glycine max | NC_016088 | 119 | 4094 |
Determination of Percentage Identity and Similarity of the Nucleotide Sequence of Lipid Transfer Protein Gene in the Selected Accessions
The percentage identity of the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions to the LTP1 query sequence ranged from 40% (Oryza sativa Japonica Group) to 98% (Vigna unguiculata), while percentage similarity ranged from 70% to 100% (Table 2).
Physiochemical Properties of Lipid Transfer Protein 1 gene in the selected sequences
Analysis of the physicochemical properties of the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions showed that the number of amino acid residues ranged from 96 to 355. Molecular weight ranged from 10008.77 to 35532.61 daltons, with Sesamum indicum having the lowest molecular weight and Physcomitrium patens the highest, as shown in Table 3. None of the other accessions had a number of amino acid residues or a molecular weight equal to that of the Lipid Transfer Protein 1 gene in the query. The theoretical pI was 4.61 or above for all the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions. The total number of negatively charged residues ranged from 1–20, and the total number of positively charged residues ranged from 6–19. The instability index and the aliphatic index ranged from 20.23–69.39 and 73.48–102.24, respectively. Twelve of the proteins were considered unstable on the basis of the instability index (values above 40), while the rest were stable. The extinction coefficient was highest for Sesamum indicum (14480). Daucus carota subsp. sativus (-0.213) was the only accession with a negative GRAVY (Table 3).
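The stable/unstable split can be reproduced from the instability index column of Table 3. The sketch below assumes the conventional ProtParam cut-off, under which a protein with an instability index above 40 is predicted to be unstable; the variable and function names are hypothetical, and the values are copied from Table 3 in row order. Applied to these values, the cut-off flags twelve entries.

```python
# Instability index values copied from Table 3, in row order.
INSTABILITY_INDEX = [
    ("Sesamum indicum", 33.79), ("Ricinus communis", 23.04),
    ("Daucus carota subsp. sativus", 20.23), ("Medicago truncatula", 52.24),
    ("Helianthus annus", 31.86), ("Nicotiana tabacum", 30.91),
    ("Gossypium hirsutum", 27.55), ("Vigna unguiculata", 50.51),
    ("Nicotiana attenuata", 31.64), ("Dendrobium catenatum", 69.39),
    ("Musa acuminata", 40.34), ("Brassica rapa", 47.37),
    ("Oryza sativa Japonica Group", 38.34), ("Physcomitrium patens", 51.69),
    ("Vigna radiata", 30.54), ("Prunus persica", 32.61),
    ("Arachis hypogaea", 20.31), ("Solanum tuberosum", 39.04),
    ("Sorghum bicolor", 36.69), ("Glycine max(2)", 48.54),
    ("Brachypodium distachyon", 60.91), ("Capsicum annum", 28.53),
    ("Arabidopsis thaliana", 46.19), ("Solanum pennellii", 38.30),
    ("Vitis vinifera", 40.33), ("Juglans regia", 35.83),
    ("Pistacia vera", 22.39), ("Brassica napus", 47.37),
    ("Olea europaea var. sylvestris", 30.17), ("Glycine max", 48.54),
]

UNSTABLE_THRESHOLD = 40.0  # assumption: conventional ProtParam cut-off


def classify(instability_index: float) -> str:
    """Label a protein as 'unstable' if its index exceeds the cut-off, else 'stable'."""
    return "unstable" if instability_index > UNSTABLE_THRESHOLD else "stable"


unstable = [name for name, value in INSTABILITY_INDEX if classify(value) == "unstable"]
print(len(unstable), "of", len(INSTABILITY_INDEX), "entries are above the cut-off")
for name in unstable:
    print(" ", name)
```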
Table 2: Percentage identity, similarity of Lipid Transfer Protein 1 gene in selected accessions.
Name of Plants | Accession number | Percentage identity (%) | Percentage Similarity (%) |
Sesamum indicum | NC_026157 | 96 | 97 |
Ricinus communis | NW_002994932 | 93 | 95 |
Daucus carota subsp. sativus | NC_030386 | 88 | 92 |
Medicago truncatula | NC_053042 | 97 | 100 |
Helianthus annus | NC_035433 | 91 | 94 |
Nicotiana tabacum | NW_015904831 | 94 | 97 |
Gossypium hirsutum | NC_053433 | 92 | 96 |
Vigna unguiculata | NC_040279 | 98 | 100 |
Nicotiana attenuata | NC_031995 | 88 | 93 |
Dendrobium catenatum | NW_021319309 | 96 | 99 |
Musa acuminata | NC_025202 | 87 | 95 |
Brassica rapa | NC_024803 | 97 | 100 |
Oryza sativa Japonica Group | NC_029256 | 40 | 70 |
Physcomitrium patens | NC_037253 | 87 | 93 |
Vigna radiata | NW_014543115 | 85 | 92 |
Prunus persica | NC_034014 | 83 | 91 |
Arachis hypogaea | NC_037619 | 88 | 92 |
Solanum tuberosum | NW_006239682 | 96 | 98 |
Sorghum bicolor | NC_012877 | 93 | 97 |
Glycine max(2) | NC_016088 | 97 | 100 |
Brachypodium distachyon | NC_016131 | 94 | 98 |
Capsicum annum | NC_029977 | 92 | 100 |
Arabidopsis thaliana | NC_003071 | 91 | 96 |
Solanum pennellii | NC_028646 | 87 | 92 |
Vitis vinifera | NC_012007 | 84 | 92 |
Juglans regia | NC_049903 | 86 | 91 |
Pistacia vera | NW_022196275 | 92 | 95 |
Brassica napus | NC_027765 | 95 | 100 |
Olea europaea var. sylvestris | NC_036249 | 95 | 96 |
Glycine max | NC_016088 | 97 | 100 |
Table 3: Physico-chemical properties of amino acid sequence of lipid transfer protein 1 gene in selected accessions.
S/N | Accession No. | Name of Plants | AA (residues) | Mol. Wt (Da) | Theoretical pI | Ext. Coeff. | Instability Index | Aliphatic Index | GRAVY | -ve (Asp+Glu) | +ve (Arg+Lys) | Seq. Length (bp) | No. of Atoms |
1. | NC_026157 | Sesamum indicum | 96 | 10008.77 | 7.50 | 14480 | 33.79 | 85.31 | 0.408 | 7 | 8 | 599 | 1405 |
2. | NW_002994932 | Ricinus communis | 115 | 11766.75 | 8.05 | 3605 | 23.04 | 83.91 | 0.449 | 5 | 7 | 848 | 1633 |
3. | NC_030386 | Daucus carota subsp. sativus | 123 | 13429.78 | 9.55 | 1490 | 20.23 | 75.45 | -0.213 | 13 | 19 | 1349 | 1919 |
4. | NC_053042 | Medicago truncatula | 151 | 15344.13 | 8.81 | 2115 | 52.24 | 94.97 | 0.321 | 5 | 11 | 967 | 2189 |
5. | NC_035433 | Helianthus annus | 119 | 12418.56 | 9.37 | 4970 | 31.86 | 89.24 | 0.161 | 4 | 13 | 1026 | 1748 |
6. | NW_015904831 | Nicotiana tabacum | 117 | 11757.69 | 8.10 | 3480 | 30.91 | 97.61 | 0.534 | 4 | 6 | 1217 | 1645 |
7. | NC_053433 | Gossypium hirsutum | 120 | 11834.88 | 9.04 | 3605 | 27.55 | 89.50 | 0.463 | 2 | 9 | 761 | 1660 |
8. | NC_040279 | Vigna unguiculata | 183 | 19377.62 | 8.06 | 8980 | 50.51 | 88.42 | 0.190 | 13 | 15 | 1856 | 2732 |
9. | NC_031995 | Nicotiana attenuata | 117 | 11793.81 | 8.63 | 4970 | 31.64 | 99.32 | 0.539 | 3 | 7 | 1372 | 1656 |
10. | NW_021319309 | Dendrobium catenatum | 118 | 12095.22 | 8.89 | 8980 | 69.39 | 86.95 | 0.417 | 4 | 9 | 1321 | 1696 |
11. | NC_025202 | Musa acuminata | 117 | 11758.80 | 9.32 | 3480 | 40.34 | 97.52 | 0.446 | 4 | 12 | 1264 | 1675 |
12. | NC_024803 | Brassica rapa | 147 | 15420.64 | 9.59 | 2115 | 47.37 | 102.24 | 0.266 | 3 | 17 | 794 | 2229 |
13. | NC_029256 | Oryza sativa Japonica Group | 120 | 12092.24 | 9.28 | 6460 | 38.34 | 100.08 | 0.558 | 1 | 9 | 924 | 1709 |
14. | NC_037253 | Physcomitrium patens | 355 | 35532.61 | 4.61 | 4105 | 51.69 | 78.45 | 0.315 | 20 | 11 | 1856 | 4914 |
15. | NW_014543115 | Vigna radiata | 117 | 11704.85 | 9.08 | 2115 | 30.54 | 94.36 | 0.598 | 2 | 10 | 1670 | 1655 |
16. | NC_034014 | Prunus persica | 117 | 11806.90 | 9.25 | 4970 | 32.61 | 94.27 | 0.434 | 1 | 9 | 821 | 1664 |
17. | NC_037619 | Arachis hypogaea | 116 | 11539.70 | 9.28 | 1990 | 20.31 | 86.81 | 0.585 | 1 | 9 | 1528 | 1619 |
18. | NW_006239682 | Solanum tuberosum | 114 | 11471.64 | 8.73 | 3605 | 39.04 | 95.88 | 0.510 | 5 | 10 | 968 | 1622 |
19. | NC_012877 | Sorghum bicolor | 119 | 11564.19 | 9.30 | 3480 | 36.69 | 88.82 | 0.446 | 3 | 10 | 8461 | 1614 |
20. | NC_016088 | Glycine max(2) | 119 | 12583.66 | 9.95 | 3605 | 48.54 | 86.13 | 0.150 | 4 | 16 | 4094 | 1760 |
21. | NC_016131 | Brachypodium distachyon | 172 | 16678.42 | 7.50 | 4970 | 60.91 | 93.31 | 0.592 | 7 | 8 | 2871 | 2355 |
22. | NC_029977 | Capsicum annum | 142 | 14654.83 | 9.55 | 625 | 28.53 | 109.86 | 0.387 | 3 | 16 | 769 | 2136 |
23. | NC_003071 | Arabidopsis thaliana | 118 | 11754.89 | 9.30 | 3605 | 46.19 | 97.71 | 0.475 | 1 | 10 | 951 | 1658 |
24. | NC_028646 | Solanum pennellii | 114 | 11542.76 | 8.87 | 3605 | 38.30 | 95.88 | 0.473 | 5 | 11 | 930 | 1636 |
25. | NC_012007 | Vitis vinifera | 201 | 21011.92 | 5.84 | 10470 | 40.33 | 73.48 | 0.042 | 17 | 16 | 2620 | 2922 |
26. | NC_049903 | Juglans regia | 119 | 11741.95 | 9.20 | 3480 | 35.83 | 95.88 | 0.524 | 3 | 11 | 1151 | 1671 |
27. | NW_022196275 | Pistacia vera | 117 | 12090.92 | 4.86 | 5095 | 22.39 | 78.55 | 0.241 | 5 | 4 | 613 | 1658 |
28. | NC_027765 | Brassica napus | 147 | 15420.64 | 9.59 | 2115 | 47.37 | 102.24 | 0.266 | 3 | 17 | 773 | 2229 |
29. | NC_036249 | Olea europaea var. sylvestris | 118 | 11980.28 | 9.20 | 9440 | 30.17 | 90.00 | 0.398 | 4 | 13 | 736 | 1704 |
30. | NC_016088 | Glycine max | 119 | 12583.66 | 9.95 | 3605 | 48.54 | 86.13 | 0.150 | 4 | 16 | 4094 | 1760 |
Table 4: Motifs in Amino acid sequence of Lipid Transfer Protein 1 gene in selected accessions.
Plants | Accessions | Motif |
Sesamum indicum | NC_026157 | N-myristoylation site, Protease inhibitor/seed storage/LTP family, Extensin_1 Extensin-like protein repeat |
Ricinus communis | NW_002994932 | N-glycosylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family, Big-1 (bacterial Ig-like domain 1) domain profile |
Daucus carota subsp. sativus | NC_030386 | Casein kinase II phosphorylation site, N-myristoylation site, Protein kinase C phosphorylation site, SCP-2 sterol transfer family, Protein kinase C phosphorylation site, Microbodies C-terminal targeting signal |
Medicago truncatula | NC_053042 | Protein kinase C phosphorylation site, N-myristoylation site, Casein kinase II phosphorylation site, Lysosome-associated membrane glycoprotein family, Proline-rich region profile, Tryp_alpha_amyl Protease inhibitor family, MYRISTYL N-myristoylation site, DEC-1 protein, N-terminal region |
Helianthus annus | NC_035433 | Casein kinase II phosphorylation site, Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
Nicotiana tabacum | NW_015904831 | N-myristoylation site, Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-glycosylation site, Agouti protein, Protease inhibitor/seed storage/LTP family |
Gossypium hirsutum | NC_053433 | Protein kinase C phosphorylation site, Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-myristoylation site, Amastin surface glycoprotein, Protease inhibitor/seed storage/LTP family |
Vigna unguiculata | NC_040279 | cAMP- and cGMP-dependent protein kinase phosphorylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-glycosylation site N-myristoylation site, Big-1 (bacterial Ig-like domain 1) domain profile, Protease inhibitor/seed storage/LTP family, NPR nonapeptide repeat |
Nicotiana attenuata | NC_031995 | N-glycosylation site, Casein kinase II phosphorylation site, N-myristoylation site, Plant lipid transfer proteins signature, Agouti protein, Protease inhibitor/seed storage/LTP family |
Dendrobium catenatum | NW_021319309 | N-myristoylation site, N-glycosylation site, Plant lipid transfer proteins signature, Ankyrin repeat, Sialic-acid binding micronemal adhesive repeat, Protease inhibitor/seed storage/LTP family |
Musa acuminata | NC_025202 | Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family, N-myristoylation site, Plant lipid transfer proteins signature |
Brassica rapa | NC_024803 | Proline-rich region profile, N-glycosylation site, Protease inhibitor/seed storage/LTP family, Protein kinase C phosphorylation site, N-myristoylation site, Procyclic acidic repetitive protein (PARP) |
Oryza sativa Japonica Group | NC_029256 | N-glycosylation site, Protease inhibitor/seed storage/LTP family, Plant lipid transfer proteins signature, N-myristoylation site |
Physcomitrium patens | NC_037253 | N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-glycosylation site, Protease inhibitor/seed storage/LTP family |
Vigna radiata | NW_014543115 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, N-myristoylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
Prunus persica | NC_034014 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
Arachis hypogaea | NC_037619 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
Solanum tuberosum | NW_006239682 | Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
Sorghum bicolor | NC_012877 | Plant lipid transfer proteins signature, Casein kinase II phosphorylation site, N-myristoylation site, Alanine-rich region profile, Protease inhibitor/seed storage/LTP family |
Glycine max(2) | NC_016088 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family |
Brachypodium distachyon | NC_016131 | N-myristoylation site, Protein kinase C phosphorylation site, N-glycosylation site, Alanine-rich region profile, Protease inhibitor/seed storage/LTP family |
Capsicum annum | NC_029977 | Prokaryotic membrane lipoprotein lipid attachment site profile, Proline-rich region profile, Protein kinase C phosphorylation site, cAMP- and cGMP-dependent protein kinase phosphorylation site, N-myristoylation site, Amidation site, Protease inhibitor/seed storage/LTP family, Penaeidin, Protease inhibitor/seed storage/LTP family |
Arabidopsis thaliana | NC_003071 | Prokaryotic membrane lipoprotein lipid attachment site profile, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family |
Solanum pennellii | NC_028646 | Plant lipid transfer proteins signature, N-myristoylation site, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family |
Vitis vinifera | NC_012007 | N-myristoylation site, Casein kinase II phosphorylation site, Protein kinase C phosphorylation site, N-glycosylation site, CHCH domain, Protease inhibitor/seed storage/LTP family |
Juglans regia | NC_049903 | Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site, Protease inhibitor/seed storage/LTP family, cAMP- and cGMP-dependent protein kinase phosphorylation site |
Pistacia vera | NW_022196275 | Casein kinase II phosphorylation site, Plant lipid transfer proteins signature, N-glycosylation site, Big-1 (bacterial Ig-like domain 1) domain profile, Protein kinase C phosphorylation site, N-myristoylation site |
Brassica napus | NC_027765 | Proline-rich region profile, N-glycosylation site, Protein kinase C phosphorylation site, N-myristoylation site |
Olea europaea var. sylvestris | NC_036249 | N-myristoylation site, Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Protease inhibitor/seed storage/LTP family, Casein kinase II phosphorylation site |
Glycine max | NC_016088 | Plant lipid transfer proteins signature, Protein kinase C phosphorylation site, Casein kinase II phosphorylation site, N-myristoylation site, Protease inhibitor/seed storage/LTP family |
Table 5: G-C contents and other parameters of nucleotide sequence of Lipid Transfer Protein 1 gene in selected accessions.
Accessions | Accession numbers | GC Contents (%) | Sequence Length (bp) | Start Codon | End Codon |
Sesamum indicum | NC_026157 | 48.75 | 599 | 127 | 417 |
Ricinus communis | NW_002994932 | 39.86 | 848 | 82 | 419 |
Daucus carota subsp. sativus | NC_030386 | 34.91 | 1349 | 175 | 300 |
Medicago truncatula | NC_053042 | 34.75 | 967 | 127 | 582 |
Helianthus annus | NC_035433 | 35.87 | 1026 | 58 | 407 |
Nicotiana tabacum | NW_015904831 | 36.57 | 1217 | 96 | 439 |
Gossypium hirsutum | NC_053433 | 45.07 | 761 | 330 | 519 |
Vigna unguiculata | NC_040279 | 39.82 | 1856 | 86 | 425 |
Nicotiana attenuata | NC_031995 | 36.44 | 1372 | 216 | 559 |
Dendrobium catenatum | NW_021319309 | 35.43 | 1321 | 72 | 422 |
Musa acuminata | NC_025202 | 48.18 | 1264 | 76 | 419 |
Brassica rapa | NC_024803 | 40.68 | 794 | 102 | 545 |
Oryza sativa Japonica Group | NC_029256 | 54.55 | 924 | 121 | 489 |
Physcomitrium patens | NC_037253 | 52.86 | 1856 | 474 | 1445 |
Vigna radiata | NW_014543115 | 31.80 | 1670 | 96 | 436 |
Prunus persica | NC_034014 | 43.48 | 821 | 97 | 440 |
Arachis hypogaea | NC_037619 | 31.48 | 1528 | 267 | 475 |
Solanum tuberosum | NW_006239682 | 35.85 | 968 | 96 | 430 |
Sorghum bicolor | NC_012877 | 41.73 | 8461 | 247 | 342 |
Glycine max(2) | NC_016088 | 29.73 | 4094 | 216 | 565 |
Brachypodium distachyon | NC_016131 | 48.69 | 2871 | 396 | 735 |
Capsicum annum | NC_029977 | 36.15 | 769 | 150 | 271 |
Arabidopsis thaliana | NC_003071 | 40.06 | 951 | 129 | 475 |
Solanum pennellii | NC_028646 | 34.30 | 930 | 104 | 438 |
Vitis vinifera | NC_012007 | 35.84 | 2620 | 102 | 465 |
Juglans regia | NC_049903 | 39.88 | 1151 | 83 | 436 |
Pistacia vera | NW_022196275 | 43.07 | 613 | 102 | 234 |
Brassica napus | NC_027765 | 40.49 | 773 | 81 | 524 |
Olea europaea var. sylvestris | NC_036249 | 43.07 | 736 | 113 | 436 |
Glycine max | NC_016088 | 29.73 | 4094 | 216 | 565 |
Motifs in Amino Acid Sequences of Lipid Transfer Protein 1 gene in the Selected Accessions
Analysis of motifs in the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions showed that the number of motifs ranged from 3 to 9 per accession (Table 4). The motifs N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site were the most common motifs found in the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions.
GC content and start and end codons of Lipid Transfer Protein 1 gene in the Selected Accessions
GC analysis revealed that the GC content of the Lipid Transfer Protein 1 gene in the selected accessions ranged from 29.73–54.55%, with Glycine max having the lowest GC content (29.73%) and Oryza sativa Japonica Group the highest (54.55%). The start and end codon positions of the Lipid Transfer Protein 1 gene generally varied among the selected accessions, although some accessions had the same start codon position but different end codon positions (Table 5).
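The start and end codon positions can be used for a rough consistency check against the reported protein lengths. The sketch below assumes that the positions are 1-based and inclusive, that the end position is the last base of the stop codon, and that the span contains no introns; none of these conventions is stated in the source, and the function name is hypothetical. Under those assumptions a span of n codons encodes n - 1 residues. A span that is not a multiple of three does not fit this simple model and may simply reflect introns or a different coordinate convention.

```python
from typing import Optional


def implied_residues(start: int, end: int) -> Optional[int]:
    """Residues implied by an uninterrupted reading frame from start to end (inclusive).

    Returns None when the span is not a whole number of codons.
    """
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        return None
    return span // 3 - 1  # subtract the stop codon


# (name, start, end, reported amino acid length) copied from Tables 1 and 5.
rows = [
    ("Sesamum indicum", 127, 417, 96),
    ("Medicago truncatula", 127, 582, 151),
    ("Ricinus communis", 82, 419, 115),
]

for name, start, end, reported in rows:
    implied = implied_residues(start, end)
    print(name, "implied:", implied, "reported:", reported)
```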
Secondary Structure of Amino Acid Sequences of Lipid Transfer Protein 1 gene in the Selected Accessions
Analysis of the secondary structures of the amino acid sequences of the Lipid Transfer Protein 1 gene showed that the region covered by random coil was the highest in the sequences compared to alpha helix and extended strand. Alpha helix ranged from 33.11–54.31%, extended strand from 9.17–15.13% and random coil from 32.77–51.16% across the accessions.
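Percentages of this kind are obtained by counting the per-residue state assigned by the predictor. The sketch below assumes a four-state string in which h, e, t and c stand for alpha helix, extended strand, beta turn and random coil; the state letters, the function name and the example string are assumptions made for illustration, not output from this study.

```python
from collections import Counter
from typing import Dict

# Assumed single-letter state codes for a four-state prediction.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def state_composition(states: str) -> Dict[str, float]:
    """Percentage of residues assigned to each recognised secondary-structure state."""
    counts = Counter(states.lower())
    total = sum(counts[code] for code in STATE_NAMES)
    if total == 0:
        raise ValueError("no recognised state codes in input")
    return {name: 100.0 * counts[code] / total for code, name in STATE_NAMES.items()}


# Hypothetical 20-residue prediction string.
composition = state_composition("cchhhhhhhhcceeeecctt")
for name, percent in composition.items():
    print(f"{name}: {percent:.2f}%")
```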
Phylogenetic relationship of Lipid Transfer Protein 1 gene sequence in the selected accessions
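Phylogenetic analysis of the Lipid Transfer Protein 1 gene sequences from the selected accessions showed that the Sesamum indicum sequence is closely related to that of Brachypodium distachyon and distant from those of Glycine max, Vigna unguiculata and Capsicum annum (Figure 1).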
Figure 1: Phylogenetic tree of Lipid Transfer Protein 1 gene in the selected accessions.
Tertiary structure of amino acid sequences of Lipid Transfer Protein 1 gene in the selected accessions
The predicted tertiary protein structures of the Lipid Transfer Protein 1 gene in the selected accessions are shown in Figures 2–31. In each figure, the pink portion of the ribbon structure represents the alpha helix, the blue portion represents the random coil and the green portion represents the extended strand.
Figure 2: The Tertiary structure of Lipid Transfer Protein gene of Sesamum indicum.
Figure 3: The Tertiary structure of Lipid Transfer Protein gene of Ricinus communis.
Figure 4: The Tertiary structure of Lipid Transfer Protein gene of Daucus carota subsp. sativus.
Figure 5: The Tertiary structure of Lipid Transfer Protein gene of Medicago truncatula.
Figure 6: The Tertiary structure of Lipid Transfer Protein gene of Helianthus annuus.
Figure 7: The Tertiary structure of Lipid Transfer Protein gene of Nicotiana tabacum.
Figure 8: The Tertiary structure of Lipid Transfer Protein gene of Gossypium hirsutum.
Figure 9: The Tertiary structure of Lipid Transfer Protein gene of Vigna unguiculata.
Figure 10: The Tertiary structure of Lipid Transfer Protein gene of Nicotiana attenuata.
Figure 11: The Tertiary structure of Lipid Transfer Protein gene of Dendrobium catenatum.
Figure 12: The Tertiary structure of Lipid Transfer Protein gene of Musa acuminata.
Figure 13: The Tertiary structure of Lipid Transfer Protein gene of Brassica rapa.
Figure 14: The Tertiary structure of Lipid Transfer Protein gene of Oryza sativa Japonica Group.
Figure 15: The Tertiary structure of Lipid Transfer Protein gene of Physcomitrium patens.
Figure 16: The Tertiary structure of Lipid Transfer Protein gene of Vigna radiata.
Figure 17: The Tertiary structure of Lipid Transfer Protein gene of Prunus persica.
Figure 18: The Tertiary structure of Lipid Transfer Protein gene of Arachis hypogaea.
Figure 19: The Tertiary structure of Lipid Transfer Protein gene of Solanum tuberosum.
Figure 20: The Tertiary structure of Lipid Transfer Protein gene of Sorghum bicolor.
Figure 21: The Tertiary structure of Lipid Transfer Protein gene of Glycine max (2).
Figure 22: The Tertiary structure of Lipid Transfer Protein gene of Brachypodium distachyon.
Figure 23: The Tertiary structure of Lipid Transfer Protein gene of Capsicum annuum.
Figure 24: The Tertiary structure of Lipid Transfer Protein gene of Arabidopsis thaliana.
Figure 25: The Tertiary structure of Lipid Transfer Protein gene of Solanum pennellii.
Figure 26: The Tertiary structure of Lipid Transfer Protein gene of Vitis vinifera.
Figure 27: The Tertiary structure of Lipid Transfer Protein gene of Juglans regia.
Figure 28: The Tertiary structure of Lipid Transfer Protein gene of Pistacia vera.
Figure 29: The Tertiary structure of Lipid Transfer Protein gene of Brassica napus.
Figure 30: The Tertiary structure of Lipid Transfer Protein gene of Olea europaea var. sylvestris.
Figure 31: The Tertiary structure of Lipid Transfer Protein gene of Glycine max.
Discussion
The expectation underlying the study was that lipid transfer protein genes in low and high oil-yielding varieties of sesame would vary in percent identity and similarity, and would show variable physicochemical properties and genetic stability indices, which would account for differences in the functional and structural capacities of the LTP gene for oil production in the crop.
Results on the physicochemical properties of the Lipid Transfer Protein 1 gene in the respective accessions showed that the higher the number of amino acids, the higher the values of the other properties: molecular weight, theoretical pI, number of negatively and positively charged residues, and extinction coefficient. Some of the accessions had instability indices that classified their proteins as unstable, while others were classified as stable.
GC analysis revealed that the GC content of the Lipid Transfer Protein 1 gene in the selected accessions ranged from 29.73 to 54.55%, with Glycine max having the lowest and Oryza sativa Japonica Group the highest GC content. The start and end codons of the gene varied among the accessions, although some accessions shared the same start codon but not the same end codon. Solanum tuberosum, Capsicum annuum and Solanum pennellii did not show any start or end codon at a suboptimal exon cutoff of 1.00, so a suboptimal exon cutoff of 0.50 was used to find their start and end codons. In Juglans regia, Brassica napus and Olea europaea var. sylvestris, Sngl [single-exon gene (ATG to stop)] was used to determine the start and end codons instead of Init [initial exon (ATG to 5' splice site)].
The extinction coefficient of a protein indicates how much light it absorbs at a certain wavelength (Gasteiger et al., 2005). Results of the present study showed that the extinction coefficient values of the amino acid sequences of the Lipid Transfer Protein 1 gene in the selected accessions were higher than that of the query sequence, implying that the amount of light these proteins absorb will vary.
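The extinction coefficient at 280 nm is commonly estimated from the numbers of tyrosine, tryptophan and cystine residues, using molar values of 1490, 5500 and 125 M^-1 cm^-1 respectively, as described by Gasteiger et al. (2005). The sketch below assumes that formulation. The function name and the cystines_formed flag are illustrative, not the interface of any particular tool.

```python
# Molar extinction values at 280 nm in water (M^-1 cm^-1).
EXT_TYR = 1490
EXT_TRP = 5500
EXT_CYSTINE = 125


def extinction_coefficient(sequence, cystines_formed=True):
    """Estimate the 280 nm extinction coefficient from residue counts.

    If cystines_formed is True, every pair of Cys residues is assumed to
    form one cystine; otherwise all Cys residues are treated as reduced.
    """
    seq = sequence.upper()
    n_tyr = seq.count("Y")
    n_trp = seq.count("W")
    n_cystine = seq.count("C") // 2 if cystines_formed else 0
    return EXT_TYR * n_tyr + EXT_TRP * n_trp + EXT_CYSTINE * n_cystine


if __name__ == "__main__":
    # Toy peptide, not one of the study sequences.
    print(extinction_coefficient("YWCC"))
    print(extinction_coefficient("YWCC", cystines_formed=False))
```

Because tryptophan and tyrosine dominate the estimate, two proteins of similar length can differ widely in extinction coefficient if their aromatic residue counts differ.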
According to Guruprasad et al. (1990), an instability index greater than 40 indicates that the protein will be unstable in a test tube. Since the instability index of the Lipid Transfer Protein 1 gene product in twelve of the accessions was above 40, these proteins are predicted to be unstable in a test tube, while the other eighteen are considered stable. The aliphatic index (AI), defined as the relative volume of a protein occupied by its aliphatic side chains, plays an important role in a protein's thermal stability.
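The two indices discussed here can be illustrated briefly. The aliphatic index is usually computed from the mole percentages of alanine, valine, isoleucine and leucine as AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)]. The stability call is a threshold on the instability index at 40. The sketch below assumes that formulation, and the function names are illustrative. Calculating the instability index itself requires a dipeptide weight table and is not reproduced here.

```python
def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def stability_class(instability_index, threshold=40.0):
    """Classify a protein from a precomputed instability index."""
    return "unstable" if instability_index > threshold else "stable"


if __name__ == "__main__":
    # The two index values are the extremes of the range reported in this study.
    print(stability_class(20.23), stability_class(69.39))
    # Toy peptide, not one of the study sequences.
    print(aliphatic_index("AVIL"))
```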
Sequence motifs are short recurring patterns in DNA that are thought to have a biological function and are usually specific binding sites for proteins. In the present study, the selected accessions share almost the same motifs, their positional differences notwithstanding, indicating that the sequence-specific binding sites for proteins differ and might lead to functional differences. The biological significance of GC content diversity in plants remains unclear due to a lack of sufficiently robust genomic data (Smarda et al., 2014).
Summary
The study was conducted to analyze the sequence of the Lipid Transfer Protein 1 gene in 30 selected accessions. The LTP 1 gene sequences were retrieved from the NCBI database. MEGA 10.0 software was used to align the sequences and to produce a phylogenetic tree of the accessions. The physicochemical properties, GC content, secondary protein structure, motifs, and percentage identity and similarity of the amino acid and nucleotide sequences were determined.
The physicochemical properties of the Lipid Transfer Protein 1 proteins revealed that eighteen of the proteins are stable, while twelve were considered unstable following the results for instability index. The accessions contained similar motifs, the most common of which were the N-glycosylation site, Plant lipid transfer proteins signature, N-myristoylation site, Casein kinase II phosphorylation site and Protein kinase C phosphorylation site.
GC content analysis revealed that it ranged from 29.73 to 54.55%, with Glycine max having the lowest and Oryza sativa Japonica Group the highest GC content. The secondary structure of the LTP 1 amino acid sequences in the selected accessions revealed three main structures: alpha helix, random coil and extended strand. Alpha helix ranged from 33.11 to 54.31%, extended strand from 9.17 to 15.13%, and random coil from 32.77 to 51.16%.
Conclusion
In conclusion, results of the present study revealed differences in the Lipid Transfer Protein 1 gene sequences in the selected accessions following the parameters analyzed. However, results also revealed a high degree of sequence identity and similarity among the sequences of the selected accessions, implying similar functions in these accessions.
Recommendation
Following the results of the present study, we recommend that Lipid Transfer Protein 1 gene sequences of other economically important crops and underutilized plant species be analyzed and documented to enable easy assessment of variability and diversity.
We also recommend that a larger population size be explored, since this makes more accurate judgement possible.
References
- Agarwal KN, Gupta A, Pushkarna R, Bhargava SK, Faridi MM, Prabhu MK (2000) Effects of massage & use of oil on growth, blood flow & sleep pattern in infants. Indian Journal of Medical Research 112:212-217.
- Arnold Konstantin, Lorenza Bordoli, Jurgen Kopp, Torsten Schwede. (2006). The SWISS-MODEL workspace: A web-based environment for protein structure homology modelling. Bioinformatics, 22(2):195-201.
- Arthur M. Lesk. (2013). Introduction to Bioinformatics. Biotechnology Journal, 3(11):1452-1453.
- C Combet, C Blanchet, C Geourjon, G Deleage. (2000). NPS@: Network Protein Sequence Analysis. Trends in Biochemical Sciences, 25(3):147-150.
- Chaofu Lu, Jonathan A Napier, Thomas E Clemente, Edgar B Cahoon. (2011). New frontiers in oilseed biotechnology: meeting the global demand for vegetable oils for food, feed, biofuel, and industrial applications. Current Opinion in Biotechnology, 22(2):252-259.
- Chork H and Larsen R. A. (2000). Amino acid sequence and glycan structures of cysteine proteases with proline specificity from Ginger rhizome Zingiber officinale. European Journal of Biochemistry, 267(5):1516-1526.
- Daniell H, Chan HT, Pasoreck EK. (2016). Vaccination through Chloroplast Genetics: Affordable protein drugs for the prevention and treatment of inherited or infectious human diseases. Annual Review of Genetics, 50:595-618.
- Elisabeth Gasteiger, Christine Hoogland, Alexandre Gattiker, Séverine Duvaud, Marc R. Wilkins, Ron D. Appel, and Amos Bairoch. (2005). The Proteomics Protocols Handbook, Humana Press.
- Enrique Lopez-Juez and Kevin A. Pyke. (2005). Plastids unleashed: their development and their integration in plant development. The International Journal of Developmental Biology, 49(5-6):557-577.
- E.S. Oplinger, D.H. Putnam, A.R. Kaminski, C.V Hanson, E.A. Oelke, E.E. Schulte and J.D. Doll. (1990). Sesame. Purdue University.
- Farhat Batool, Madiha Sattar, Asad Hussain Shah and Aisha Kamal. (2011). Evaluation of Antidepressant-like effects of aqueous extract of sea buckthorn fruits in experimental models of depression. Pakistan Journal of Botany, 43(3):1595-1599.
- Friesen Nikolai, Reinhard M. Fritsch, Frank R. Blattner. (2005). Phylogeny and new Intergeneric classification of Allium. Aliso: A Journal of Systematic and Evolutionary Botany, 22:372-395.
- Gouripur G. C, Kaliwal R. B & Kaliwal B. B. (2016). In silico characterization of beta-galactosidase using computational tools. Journal of Bioinformatics and Sequence Analysis, 8(1):1-11.
- G.S. Mkamilo, Van der Vossen, H.A.M. (2007). PROTA Plant Resources of Tropical Africa / Ressources végétales de l’Afrique tropicale, Wageningen, Netherlands. Feedipedia.
- Heuzé V, Tran G, Bastianelli D, Lebas F. (2017). Sesame (Sesamum indicum) seeds and oil meal. Feedipedia, a programme by INRA, CIRAD, AFZ, and FAO, 16:08.
- Hongbo Gao, Deena Kadirjan-Kalbach, John E. Froehlich and Katherine W. Osteryoung. (2003). ARC5, a cytosolic dynamin-like protein from plants, is part of the chloroplast division machinery. Proceedings of the National Academy of Sciences of the United States of America, 100(7):4328-4333.
- Ingersent K. A. (2003). World agriculture: towards 2015/2030 - an FAO perspective. Agricultural Development Economics, 54(3):513-515.
- Karin Hjort, Alina V. Goldberg, Anastasios D. Tsaousis, Robert P. Hirt and Martin Embley. (2010). Diversity and Reductive evolution of mitochondria among microbial eukaryotes. Philosophical Transactions of the Royal Society B, 365(1541):713-727.
- Krzysztof Bobik and Tessa M. Burch-Smith. (2015). Chloroplast signaling within, between and beyond cells. Plant Science, 6:781-784.
- Larkin M. A, G Blackshields, N P Brown, R Chenna, P A McGettigan, H McWilliam, F Valentin, I M Wallace, A Wilm, R Lopez, J D Thompson, T J Gibson and D G Higgins. (2007). Clustal W and Clustal X version 2.0. Bioinformatics, 23:2947-2948.
- Lawrence A Kelley, Stefans Mezulis, Christopher M Yates, Mark N Wass and Michael J E Sternberg. (2015). The Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols, 10:845-858.
- LA Johnson, TM Suleiman, Lusas EW. (1979). Sesame protein: a review and prospectus. Journal of the American Oil Chemists’ Society, 56(3):463-468.
- Linhai Wang, Sheng Yu, Chaobo Tong, Yingzhong Zhao, Yan Liu. et al. (2014). Genome sequencing of the high oil crop sesame provides insight into oil biosynthesis. Genome Biology, 15-39.
- Linhai Wang, Yanxin Zhang, Peiwu Li, Xuefang Wang, Wen Zhang, et al. (2012). HPLC analysis of seed sesamin and sesamolin variation in a sesame germplasm collection in China. Journal of the American Oil Chemists’ Society, 89(6):10.
- Mark N Wass, Lawrence A Kelley, Michael J E Sternberg. (2010). 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Research, 38:469-473.
- Neetu Jabalia, Hina Bansal, P.C. Mishra and Nidhee Chaudhary. (2015). In silico investigation of cysteine proteases from Zingiber officinale. Journal of Protein and Proteomics, 6(3):245-253.
- Oplinger E. S, Oelke E. A, Doll J. D, Bundy L. G and R.T. Schuler. (1990). In: Alternative Field Crops Manual, University of Wisconsin-Extension, Cooperative Extension.
- Phillip John Kanu. (2011). Biochemical analysis of black and white sesame seeds from China. American Journal of Biochemistry and Molecular Biology, 1:145-157.
- Raghav Ram, David Catlin, Juan Romero and Craig Cowley. (1990).
- Ray Hansen. (2011). Sesame profile. Agricultural Marketing Resource Center.
- Robert L. Myers. (2002). Sesame: high value oilseed. Thomas Jefferson Agriculture Institute, 573-449-3518.
- Sun Hwang, Fereidoon Shahidi. (2005). Bailey’s Industrial Oil and Fat Products, Sixth Edition, Six Volume Set, John Wiley & Sons. Feedipedia.
- Tunde-Akintunde T. Y, Akintunde B. O. (2004). Some Physical Properties of Sesame Seed. Biosystems Engineering, 88(1):127-129.
- T. Ogasawara, K. Chiba, M. Tada, in Y. P. S. Bajaj (ed.). (1988). Medicinal and Aromatic Plants, Volume 10. Springer, 978(3):540-62727.
- Xin Wei, Kunyan Liu, Yanxin Zhang, Qi Feng, Linhai Wang, Yan Zhao. et al. (2015). Genetic discovery for oil production and quality in sesame. Nature Communications, 6:8609.
- Yang Zhang. (2008). I-TASSER server for protein 3D structure prediction. BMC Bioinformatics, 9:40-42.
- Yonghua Li-Beisson, Basil Shorrosh, Fred Beisson, Mats X. Andersson, Vincent Arondel. et al. (2013). Acyl-Lipid metabolism. Arabidopsis Book, 11:0161.