Journal of Cell Science & Therapy

Open Access

ISSN: 2157-7013


Research Article - (2022) Volume 13, Issue 2

Molecular Evidence for Segmental Duplication across Chromosomes of Soybean Using Transcription Factor Gene Family

Manoj Kumar Srivastava* and Gyanesh Kumar Satpute
 
*Correspondence: Manoj Kumar Srivastava, Department of Crop Improvement, ICAR-Indian Institute of Soybean Research, Khandwa Road, Indore, Madhya Pradesh, India, Email:


Abstract

Genome duplication is an important genetic innovation. A large genome (1.1 Gb), together with ancient and recent duplication events, makes the soybean genome complex. By analysing the distribution and duplication of soybean transcription factor (TF) family genes, segmental duplication across chromosomes was revealed. Our study provides strong evidence for the role of large segmental duplication events in the architecture and evolution of the soybean genome, using a simple method of sequence and order analysis of TF genes. Finally, a scheme for the interrelationship of different chromosomes is proposed.

Keywords

Glycine max; Soybean; Chromosome; Segmental duplication; Transcription factor

Introduction

Genome duplication is an important event in the evolution of organisms. Gene and genome duplication may contribute to the evolution and domestication of crops. Genome duplication has shaped the present architecture and topology of the soybean genome and has a vital influence on agronomic traits, yield potential and adaptation of crop plants. The redundant copy of a gene arising from duplication can accumulate beneficial mutations, resulting in a new function for the duplicated gene product [1]. Thus, gene duplication provides a source of plasticity that allows the genome to adapt to a changing environment [2]. A large genome (1.1 Gb), together with ancient and recent duplication events, makes the soybean genome complex. This complex genome structure makes it difficult to design and develop effective breeding strategies for desired traits in soybean. Several QTLs for various traits and linkage maps have been developed for the 20 chromosomes of the soybean genome. Translocation, inversion, deletion and duplication play important roles in creating small duplication events.

In flowering plants, two ancestral whole genome duplications (WGDs) have been reported. In soybean, two additional sequential WGDs are established: one occurred 59 MYA in the common ancestor of legumes and the other about 8-13 MYA in the Glycine lineage [3]. Owing to these multiple genome duplication events, the number of predicted coding genes in soybean is much higher than in Arabidopsis and grape [4,5]. Several small blocks of homeologous retention and chromosomal rearrangement have been shown to exist in the 20 chromosomes of soybean [3,6]. Segmental duplication in soybean has been reported to result in the evolution of several phenotypic traits, such as disease resistance [7,8]. Several QTLs associated with seed-related traits, disease resistance and high content of carbohydrates, proteins and oil were reported to be conserved in the duplicated segments of the soybean genome. The major seed protein QTL is mapped on chromosome 20 [9]. This QTL has been studied using several different approaches [10,11].

Transcription factors (TFs) are DNA-binding regulatory proteins that interact with other proteins and regulate the process of transcription. Most TF genes belong to families with several members. In soybean, 57 families of TF genes have been reported, distributed across 3747 loci on 20 chromosomes. In the present study, the duplication and order of these TF genes were analysed.

Methodology

All soybean transcription factor genes were downloaded from the Plant Transcription Factor Database (http://planttfdb.gaolab.org/index.php?sp=Gma). Duplication of TF genes across all chromosomes was studied by BLAST analysis of each gene against the total transcript database (Wm82.a2.v1 Transcript Sequences). A top BLAST hit with an E value of 0.0 was considered a transcript arising from a duplicated gene. A top hit with an E value greater than 0.0 was considered to indicate a non-duplicated gene product. The duplicated gene pairs were recorded chromosome-wise. Each putative chromosome pair sharing a duplicated segment was then examined for similarity in the order of TF genes.
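The filtering step described above can be illustrated with a minimal Python sketch. It is not the script used in this study. It assumes that BLAST results are available in the standard 12-column tabular format, that transcript identifiers follow the pattern Glyma.04G000100.1 (chromosome number after "Glyma."), and that hits to the same locus (self-hits and alternative transcripts) are skipped before the top hit is chosen. The function names and the example identifiers are hypothetical.

```python
import re
from collections import Counter

# Assumed ID pattern: "Glyma.04G000100.1" (chromosome 4) or "Glyma.U031400.1" (scaffold).
ID_PATTERN = re.compile(r"^(Glyma\.(?:(\d{2})G|U)\d+)(?:\.\d+)?$")


def split_id(transcript_id):
    """Return (locus, chromosome) for a transcript ID; scaffolds are 'Unassigned'."""
    match = ID_PATTERN.match(transcript_id)
    if match is None:
        raise ValueError(f"unrecognised transcript ID: {transcript_id}")
    locus, chrom = match.group(1), match.group(2)
    return locus, (int(chrom) if chrom else "Unassigned")


def duplicated_pairs(blast_lines, max_evalue=0.0):
    """Keep, per query locus, the best non-self hit if its E value is <= max_evalue."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        q_locus, q_chrom = split_id(query)
        s_locus, s_chrom = split_id(subject)
        if q_locus == s_locus:
            continue  # assumption: ignore self-hits and isoforms of the same locus
        if q_locus not in best or bitscore > best[q_locus][3]:
            best[q_locus] = (q_chrom, s_chrom, s_locus, bitscore, evalue)
    return {
        locus: hit[:3] for locus, hit in best.items() if hit[4] <= max_evalue
    }


# Hypothetical example rows (tab-separated, 12 columns).
example = [
    "Glyma.04G000100.1\tGlyma.04G000100.1\t100.0\t1500\t0\t0\t1\t1500\t1\t1500\t0.0\t2771",
    "Glyma.04G000100.1\tGlyma.06G000200.1\t95.0\t1500\t75\t0\t1\t1500\t1\t1500\t0.0\t2300",
    "Glyma.05G000300.1\tGlyma.17G000400.1\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-50\t200",
]
pairs = duplicated_pairs(example)
pair_counts = Counter((q, s) for q, s, _ in pairs.values())
print(pairs)
print(pair_counts)
```

In this toy input only the Chr 4 locus is retained, because the best hit of the Chr 5 locus has an E value greater than 0.0. Counting the retained hits per chromosome pair gives the kind of chromosome-wise record described in the Methodology.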

Results and Discussion

Most of the high-copy-number TF genes are distributed among all the chromosomes; however, the relative number of TF genes varies between chromosomes (Table 1). Five of the 20 chromosomes (Chr 2, Chr 6, Chr 8, Chr 10 and Chr 13) contain more than 200 TF gene loci. Chromosome 16 has the lowest number of TF genes, while Chr 13 has the highest. No specific pattern of chromosomal location was observed for the TF gene families, i.e. different TF gene families have different distribution patterns.

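S. No. | TF gene family | Chr 1 | Chr 2 | Chr 3 | Chr 4 | Chr 5 | Chr 6 | Chr 7 | Chr 8 | Chr 9 | Chr 10 | Chr 11 | Chr 12 | Chr 13 | Chr 14 | Chr 15 | Chr 16 | Chr 17
1. | AP2 | 8 | 6 | 5 | 2 | 4 | 2 | 4 | 7 | 4 | 3 | 7 | 6 | 5 | 1 | 4 | 2 | 8
2. | ARF | 4 | 8 | 4 | 3 | 10 | 2 | 10 | 6 | 1 | 2 | 3 | 14 | 12 | 5 | 7 | 4 | 2
3. | ARR-B | 2 | 2 | 0 | 1 | 7 | 1 | 5 | 4 | 4 | 1 | 7 | 0 | 2 | 2 | 6 | 1 | 6
4. | B3 | 4 | 8 | 9 | 6 | 0 | 5 | 11 | 5 | 6 | 4 | 22 | 8 | 2 | 14 | 1 | 7 | 6
5. | BBR-BPC | 0 | 0 | 0 | 2 | 7 | 8 | 3 | 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
6. | BES1 | 2 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 2 | 0 | 1 | 3 | 4 | 2 | 0 | 0 | 1
7. | bHLH | 33 | 46 | 42 | 20 | 21 | 29 | 36 | 40 | 16 | 40 | 14 | 23 | 45 | 15 | 33 | 14 | 26
8. | bZIP | 17 | 18 | 23 | 21 | 23 | 15 | 6 | 37 | 5 | 14 | 33 | 20 | 23 | 10 | 9 | 12 | 7
9. | C2H2 | 12 | 29 | 17 | 11 | 13 | 11 | 18 | 14 | 3 | 31 | 14 | 20 | 30 | 12 | 12 | 8 | 13
10. | C3H | 2 | 11 | 11 | 10 | 8 | 7 | 4 | 11 | 7 | 13 | 6 | 8 | 7 | 7 | 10 | 5 | 9
11. | CAMTA | 0 | 0 | 0 | 0 | 7 | 0 | 2 | 6 | 2 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 2
12. | CO-like | 0 | 4 | 1 | 1 | 1 | 1 | 3 | 2 | 3 | 2 | 0 | 0 | 6 | 5 | 0 | 1 | 1
13. | CPP | 3 | 0 | 0 | 1 | 1 | 3 | 6 | 1 | 3 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 4
14. | DBB | 6 | 0 | 1 | 3 | 0 | 3 | 0 | 0 | 1 | 0 | 8 | 16 | 4 | 1 | 1 | 2 | 1
15. | Dof | 3 | 4 | 4 | 8 | 4 | 7 | 10 | 7 | 2 | 2 | 2 | 2 | 11 | 0 | 10 | 2 | 5
16. | E2F/DP | 1 | 1 | 0 | 6 | 2 | 4 | 0 | 0 | 0 | 2 | 11 | 7 | 0 | 0 | 0 | 0 | 4
17. | EIL | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 3 | 1 | 1 | 0 | 0
18. | ERF | 15 | 16 | 22 | 17 | 15 | 19 | 17 | 25 | 10 | 24 | 11 | 11 | 27 | 16 | 14 | 17 | 19
19. | FAR1 | 3 | 2 | 6 | 15 | 1 | 12 | 5 | 14 | 7 | 7 | 5 | 4 | 6 | 2 | 26 | 1 | 1
20. | G2-like | 9 | 15 | 21 | 3 | 7 | 3 | 12 | 4 | 19 | 13 | 6 | 10 | 14 | 5 | 23 | 2 | 9
21. | GATA | 4 | 10 | 2 | 6 | 3 | 4 | 5 | 15 | 2 | 6 | 7 | 5 | 2 | 4 | 2 | 6 | 6
22. | GeBP | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 1 | 0 | 0
23. | GRAS | 5 | 8 | 6 | 5 | 7 | 6 | 7 | 4 | 6 | 5 | 19 | 17 | 13 | 3 | 10 | 5 | 7
24. | GRF | 2 | 0 | 3 | 7 | 0 | 2 | 1 | 0 | 4 | 1 | 4 | 2 | 3 | 0 | 2 | 2 | 4
25. | HB-other | 0 | 2 | 1 | 3 | 0 | 5 | 4 | 8 | 0 | 0 | 0 | 1 | 2 | 1 | 0 | 0 | 0
26. | HB-PHD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 4 | 0 | 3 | 0 | 5
27. | HD-ZIP | 15 | 6 | 10 | 3 | 8 | 5 | 12 | 15 | 18 | 12 | 10 | 14 | 10 | 1 | 7 | 6 | 4
28. | HRT-like | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
29. | HSF | 8 | 1 | 4 | 7 | 8 | 1 | 2 | 7 | 4 | 9 | 5 | 0 | 4 | 5 | 2 | 2 | 5
30. | LBD | 8 | 8 | 4 | 8 | 9 | 6 | 3 | 8 | 2 | 6 | 6 | 2 | 8 | 13 | 4 | 5 | 4
31. | LFY | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
32. | LSD | 3 | 0 | 0 | 0 | 2 | 0 | 3 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 3
33. | M-type_MADS | 1 | 7 | 3 | 2 | 5 | 0 | 6 | 7 | 0 | 16 | 8 | 0 | 2 | 2 | 0 | 1 | 3
34. | MIKC_MADS | 12 | 19 | 5 | 12 | 23 | 14 | 7 | 30 | 13 | 4 | 5 | 2 | 22 | 4 | 9 | 5 | 5
35. | MYB | 22 | 24 | 18 | 18 | 14 | 23 | 29 | 24 | 19 | 25 | 19 | 19 | 26 | 18 | 14 | 10 | 17
36. | MYB_related | 14 | 10 | 15 | 19 | 15 | 17 | 16 | 10 | 18 | 32 | 25 | 11 | 34 | 9 | 12 | 25 | 12
37. | NAC | 13 | 11 | 3 | 18 | 20 | 21 | 15 | 17 | 8 | 10 | 8 | 18 | 17 | 12 | 12 | 13 | 9
38. | NF-X1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
39. | NF-YA | 0 | 18 | 3 | 0 | 2 | 0 | 12 | 5 | 16 | 5 | 0 | 5 | 3 | 3 | 23 | 5 | 4
40. | NF-YB | 0 | 3 | 5 | 2 | 5 | 5 | 3 | 5 | 3 | 8 | 1 | 1 | 1 | 0 | 2 | 0 | 4
41. | NF-YC | 0 | 3 | 2 | 2 | 0 | 3 | 0 | 3 | 0 | 2 | 2 | 2 | 8 | 3 | 3 | 0 | 0
42. | Nin-like | 1 | 1 | 0 | 15 | 1 | 10 | 0 | 0 | 2 | 1 | 4 | 4 | 9 | 1 | 6 | 2 | 1
43. | RAV | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
44. | S1Fa-like | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
45. | SAP | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
46. | SBP | 12 | 6 | 6 | 3 | 9 | 6 | 10 | 4 | 3 | 1 | 21 | 1 | 5 | 0 | 4 | 4 | 3
47. | SRS | 2 | 7 | 0 | 6 | 0 | 4 | 3 | 0 | 0 | 0 | 4 | 1 | 2 | 2 | 2 | 4 | 3
48. | STAT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0
49. | TALE | 9 | 5 | 6 | 15 | 4 | 17 | 1 | 2 | 3 | 5 | 6 | 17 | 6 | 7 | 4 | 2 | 13
50. | TCP | 1 | 2 | 1 | 6 | 12 | 8 | 2 | 5 | 2 | 4 | 1 | 6 | 9 | 0 | 1 | 2 | 7
51. | Trihelix | 2 | 3 | 11 | 5 | 1 | 6 | 4 | 3 | 5 | 13 | 4 | 1 | 8 | 0 | 5 | 6 | 8
52. | VOZ | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 0 | 0 | 12 | 3 | 4 | 1 | 0 | 0 | 0 | 0
53. | Whirly | 2 | 6 | 1 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
54. | WOX | 1 | 3 | 1 | 2 | 2 | 2 | 4 | 1 | 1 | 3 | 6 | 1 | 2 | 2 | 1 | 0 | 1
55. | WRKY | 16 | 24 | 19 | 16 | 17 | 20 | 11 | 21 | 29 | 18 | 4 | 4 | 10 | 16 | 12 | 8 | 19
56. | YABBY | 4 | 1 | 1 | 1 | 2 | 8 | 0 | 1 | 0 | 0 | 0 | 16 | 8 | 0 | 0 | 0 | 4
57. | ZF-HD | 3 | 8 | 0 | 2 | 2 | 2 | 3 | 6 | 5 | 0 | 2 | 3 | 2 | 1 | 0 | 4 | 2
 | Total | 289 | 368 | 298 | 315 | 305 | 333 | 322 | 398 | 265 | 362 | 328 | 309 | 424 | 206 | 306 | 195 | 277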

Table 1: Distribution of TF gene products in soybean chromosomes.

A locus frequency of less than 40% identifies the TF families that produce several products from the same gene (Tables 2 and S1). These include BBR-BPC (31.03%; 10/29), E2F/DP (36.84%; 14/38), VOZ (23.08%; 6/28), Whirly (38.89%; 7/18) and YABBY (34.04%; 17/47). Five of the 57 families have the same number of gene products and gene loci (HRT-like, LFY, RAV, S1Fa-like and SAP); these are also low-copy-number transcription factor genes. Families with a locus to gene product ratio of more than 0.8 produce fewer gene product variants. These are BES1 (0.84), Dof (0.82), EIL (0.92), ERF (0.88), GeBP (0.9), M-type_MADS (0.95), WOX (0.80) and ZF-HD (0.89).
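As a worked illustration of the locus frequency, the short Python sketch below assumes it is the number of gene loci divided by the number of gene products, expressed as a percentage. This reading is our assumption; it reproduces the values given above for E2F/DP (14/38) and Whirly (7/18). The function name is hypothetical.

```python
def locus_frequency(n_loci, n_products):
    """Loci as a percentage of gene products (assumed definition)."""
    if n_products <= 0:
        raise ValueError("number of gene products must be positive")
    return 100.0 * n_loci / n_products


# (loci, gene products) taken from the text above.
families = {"E2F/DP": (14, 38), "Whirly": (7, 18)}

for name, (loci, products) in families.items():
    freq = locus_frequency(loci, products)
    # Below 40%: the family yields several products per locus on average.
    label = "multiple products per locus" if freq < 40 else "other"
    print(f"{name}: {freq:.2f}% ({label})")
```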

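Chromosome number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16
1 | 0 | 23.1 | 11.1 | 0 | 0 | 0.6 | 2.7 | 2.4 | 11.4 | 0 | 35.1 | 0 | 1 | 1 | 0 | 0
2 | 30.9 | 0 | 0 | 0 | 0 | 0 | 3.4 | 0 | 2.4 | 19.6 | 0 | 0 | 0 | 48 | 0 | 23.3
3 | 8.1 | 0 | 0 | 0 | 0.8 | 0 | 8.9 | 0 | 0 | 0 | 0 | 0.7 | 0 | 1 | 0.7 | 3.5
4 | 0 | 0 | 0 | 0 | 0 | 76.6 | 0 | 0 | 0.8 | 0 | 1.5 | 0 | 0 | 1 | 0 | 0
5 | 0 | 0 | 0.9 | 0 | 0 | 0 | 0 | 37.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
6 | 0 | 0 | 0 | 93.7 | 0 | 0 | 0 | 0 | 0.8 | 0 | 0 | 18.5 | 0 | 1 | 1.5 | 0
7 | 4.1 | 3.8 | 10.3 | 0 | 0 | 0 | 0 | 15.4 | 9.8 | 0 | 0 | 0.7 | 3.9 | 0 | 0 | 31.4
8 | 2.4 | 0 | 0.9 | 0 | 48.4 | 0 | 17.1 | 0 | 0 | 0.7 | 0.7 | 0.7 | 0.5 | 0 | 17.8 | 2.3
9 | 11.4 | 1.9 | 0 | 0.8 | 0 | 0 | 9.6 | 0 | 0 | 0 | 0 | 2.7 | 1 | 1 | 27.4 | 18.6
10 | 0 | 18.8 | 0.9 | 0 | 0 | 0.6 | 0 | 0 | 0 | 0 | 2.2 | 0 | 10.2 | 0 | 0 | 0
11 | 37.4 | 0 | 0 | 1.6 | 0.8 | 0 | 0 | 0.6 | 0 | 2.7 | 0 | 37 | 0 | 0 | 0 | 0
12 | 0 | 0.6 | 0 | 0 | 0 | 17.7 | 0 | 1.8 | 3.3 | 0 | 37.3 | 0 | 22.9 | 0 | 0 | 2.3
13 | 0.8 | 0 | 0 | 0.8 | 0 | 0 | 6.2 | 0.6 | 0.8 | 14.9 | 0 | 33.6 | 0 | 8.8 | 51.9 | 0
14 | 0.8 | 31.9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.9 | 0 | 0 | 0
15 | 0 | 0 | 0 | 0 | 0 | 1.3 | 0.7 | 14.8 | 30.1 | 0 | 0 | 0 | 35.1 | 0 | 0 | 0
16 | 0 | 11.3 | 2.6 | 0 | 0 | 0 | 17.8 | 1.2 | 16.3 | 0 | 0 | 2.1 | 0 | 0 | 0 | 0
17 | 0 | 1.3 | 0 | 1.6 | 46.1 | 0.6 | 10.3 | 0.6 | 0.8 | 0.7 | 0 | 0 | 10.2 | 31.4 | 0 | 0
18 | 0.8 | 1.9 | 0.9 | 0 | 0 | 0 | 8.9 | 25.4 | 22 | 0 | 21.6 | 0 | 0.5 | 0 | 0.7 | 0
19 | 0 | 0.6 | 72.6 | 0 | 3.9 | 0 | 0 | 0 | 0.8 | 0 | 0 | 0 | 8.3 | 2 | 0 | 18.6
20 | 0 | 4.4 | 0 | 0 | 0 | 1.9 | 14.4 | 0 | 0.8 | 60.8 | 0 | 0 | 2.4 | 0 | 0 | 0
Unassigned | 3.3 | 0.6 | 0 | 1.6 | 0 | 0.6 | 0 | 0 | 0 | 0.7 | 1.5 | 4.1 | 0 | 4.9 | 0 | 0
Total | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100

Table 2: Distribution of duplicated genes (%) in soybean chromosomes.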

BLAST analysis of the cDNA of all 3747 TF loci revealed that at least 72.63% of the TF genes belong to duplicated paralog pairs (Tables 3 and S2). Duplication also showed a specific preference among chromosomes: chromosome 1 shares most of its duplicated segments with Chr 2, Chr 9 and Chr 11; chromosome 2 with Chr 1, Chr 10, Chr 14 and Chr 16; chromosome 3 has a major segment from Chr 19 and minor contributions from Chr 1 and Chr 7; and chromosome 4 has almost all (93.7%; Table 2) of its duplicated TF genes from Chr 6. The detailed distribution is given in Figure 1, Table 3 and supplementary Table S2.

Figure 1: Distribution of duplicated transcription factor genes in different chromosomes of soybean.

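Chromosome number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18
1 | 0 | 37 | 13 | 0 | 0 | 1 | 4 | 4 | 14 | 0 | 47 | 0 | 2 | 1 | 0 | 0 | 0 | 1
2 | 38 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 3 | 29 | 0 | 0 | 0 | 49 | 0 | 20 | 2 | 3
3 | 10 | 0 | 0 | 0 | 1 | 0 | 13 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 3 | 0 | 1
4 | 0 | 0 | 0 | 0 | 0 | 121 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1
5 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 63 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 62 | 0
6 | 0 | 0 | 0 | 119 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 27 | 0 | 1 | 2 | 0 | 0 | 0
7 | 5 | 6 | 12 | 0 | 0 | 0 | 0 | 26 | 12 | 0 | 0 | 1 | 8 | 0 | 0 | 27 | 14 | 13
8 | 3 | 0 | 1 | 0 | 62 | 0 | 25 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 24 | 2 | 0 | 39
9 | 14 | 3 | 0 | 1 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 4 | 2 | 1 | 37 | 16 | 1 | 26
10 | 0 | 30 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 21 | 0 | 0 | 0 | 1 | 0
11 | 46 | 0 | 0 | 2 | 1 | 0 | 0 | 1 | 0 | 4 | 0 | 54 | 0 | 0 | 0 | 0 | 0 | 32
12 | 0 | 1 | 0 | 0 | 0 | 28 | 0 | 3 | 4 | 0 | 50 | 0 | 47 | 0 | 0 | 2 | 0 | 0
13 | 1 | 0 | 0 | 1 | 0 | 0 | 9 | 1 | 1 | 22 | 0 | 49 | 0 | 9 | 70 | 0 | 19 | 1
14 | 1 | 51 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 31 | 0
15 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 25 | 37 | 0 | 0 | 0 | 72 | 0 | 0 | 0 | 0 | 0
16 | 0 | 18 | 3 | 0 | 0 | 0 | 26 | 2 | 20 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0
17 | 0 | 2 | 0 | 2 | 59 | 1 | 15 | 1 | 1 | 1 | 0 | 0 | 21 | 32 | 0 | 0 | 0 | 0
18 | 1 | 3 | 1 | 0 | 0 | 0 | 13 | 43 | 27 | 0 | 29 | 0 | 1 | 0 | 1 | 0 | 1 | 0
19 | 0 | 1 | 85 | 0 | 5 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 17 | 2 | 0 | 16 | 0 | 0
20 | 0 | 7 | 0 | 0 | 0 | 3 | 21 | 0 | 1 | 90 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0
Unassigned | 4 | 1 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 6 | 0 | 5 | 0 | 0 | 2 | 0
Total | 123 | 160 | 117 | 127 | 128 | 158 | 146 | 169 | 123 | 148 | 134 | 146 | 205 | 102 | 135 | 86 | 133 | 117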

Table 3: Distribution of duplicated genes in soybean chromosome.
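The percentages in Table 2 appear to be the counts of Table 3 divided by the column total. The Python sketch below illustrates this for the chromosome 4 column, using the non-zero counts from Table 3. The column-wise normalisation is our reading of the two tables, and the function name is hypothetical.

```python
def column_percentages(counts):
    """Convert duplicate counts per partner chromosome to percentages (1 decimal)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column has no duplicated genes")
    return {partner: round(100.0 * n / total, 1) for partner, n in counts.items()}


# Non-zero entries of the chromosome 4 column in Table 3 (column total: 127).
chr4_counts = {
    "Chr 6": 119,
    "Chr 9": 1,
    "Chr 11": 2,
    "Chr 13": 1,
    "Chr 17": 2,
    "Unassigned": 2,
}

chr4_percent = column_percentages(chr4_counts)
main_partner = max(chr4_counts, key=chr4_counts.get)

print(chr4_percent)
print("Main duplication partner of Chr 4:", main_partner)
```

The same calculation applied to every column gives the preferred duplication partner of each chromosome.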

The gene order of different TFs was then compared, and similarity was investigated among putative chromosome pairs (Table S3). The data supported the finding of preferential chromosome duplication. Furthermore, within segments of similar TF genes, the order of the TF genes was found to be collinear in either the direct or the reverse direction. This indicates that long segmental duplication is a major event in the evolution of soybean chromosomes. The preferential segmental duplication is presented in Figure S1 and is in accordance with the distribution data of duplicated TF genes. However, careful gene-by-gene sequence analysis is still required to identify the mutations accumulated in duplicated TF genes. This will help in assigning new functions to the duplicated TF genes.
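The test for direct or reverse collinearity can be sketched in Python as follows. This is an illustration under our own assumptions, not the procedure used to produce Table S3: each duplicated gene pair is represented by its position on the two chromosomes, and a segment is called collinear when at least a chosen fraction of consecutive steps run in the same direction. The threshold min_fraction and the example positions are hypothetical.

```python
def orientation(pairs, min_fraction=0.8):
    """Classify gene order of duplicated pairs as 'Direct', 'Reverse' or None.

    pairs: list of (position_on_chromosome_A, position_on_chromosome_B).
    """
    # Order the pairs along chromosome A and read off the positions on B.
    positions_b = [pos_b for _, pos_b in sorted(pairs)]
    steps = [later - earlier for earlier, later in zip(positions_b, positions_b[1:])]
    if not steps:
        return None  # fewer than two pairs: order is undefined
    increasing = sum(step > 0 for step in steps) / len(steps)
    decreasing = sum(step < 0 for step in steps) / len(steps)
    if increasing >= min_fraction:
        return "Direct"
    if decreasing >= min_fraction:
        return "Reverse"
    return None


# Hypothetical gene positions (gene index along each chromosome).
direct_example = [(10, 205), (25, 230), (40, 261), (77, 290)]
reverse_example = [(10, 905), (25, 880), (40, 861), (77, 850)]
mixed_example = [(1, 50), (2, 10), (3, 40), (4, 20)]

print(orientation(direct_example))
print(orientation(reverse_example))
print(orientation(mixed_example))
```

Allowing a fraction below 1.0 tolerates small local rearrangements inside an otherwise collinear segment.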

It has been established that the recent genome duplication occurred on many soybean chromosomes [5]. QTLs across the duplicated regions of Chr 4/Chr 6, Chr 3/Chr 19 and Chr 10/Chr 20 were shown to be correlated [6]. Small rearrangements were found in duplicated homeologous regions of QTLs due to the recent duplication event. Soybean gene duplication may also lead to changes in gene regulation [12]. A large inversion with synteny in the corresponding regions of Chr 10 and Chr 20 has also been reported [13].

Similarities between the soybean and Arabidopsis genomes have been studied using dot-plot analysis, which showed that whole genome duplication occurred more than once during the evolution of the soybean genome [14]. About 70% of the total genes in the soybean genome have duplicated paralog pairs. A block of 2140 genes on chromosomes 3 and 19 was found to be the largest pair of duplicated paralogous segments in soybean [14]. The data presented also point to ongoing duplication events in the soybean genome. DNA rearrangement and codon mutations resulted in the emergence of new gene sequences, which may have new functions leading to better adaptation to new and challenging environments.

Chromosome 1 contains 6 duplicated TF gene segments from Chr 9, Chr 2, Chr 3 and Chr 11. The largest segment, from Chr 11, has 519 genes.

Chromosome 2 contains 8 distinct segments from Chr 1, Chr 10, Chr 14 and Chr 16. Four different fragments were duplicated from Chr 14, two in the forward direction and two in the reverse direction.

Chromosome 3 has a block of 1355 genes that is directly duplicated on chromosome 19 (1325 genes). All the TF genes are collinear. Blocks of 160, 29 and 88 genes are duplicated in smaller segments, and the order of TFs is collinear in either the reverse or the forward direction.

Chromosome 4 has 5 different segments from Chr 6. The first two segments are in the forward direction, while the last three are in the reverse direction. Chromosome 6 has an additional segment from chromosome 12 in the reverse direction.

Chromosome 5 has seven smaller segments, four from chromosome 17 and three from chromosome 8. Three segments from Chr 17 were duplicated in the reverse direction. All three segments duplicated from chromosome 8 were in the forward direction. Most of the TF gene order was conserved in the duplicated segments.

The duplicated segments have nearly the same number of transcription factor genes (Table 4). However, the number of embedded genes is quite variable between the respective duplicated segments (Figure S1). In total, 48 segmental pairs were detected, having 4-107 conserved TF genes and 29-1431 embedded genes. Only segment number 36 has identical numbers of TF genes (20) and embedded genes (423) on Chr 9 and Chr 16. The other segment pairs have variable numbers of genes. This may be due to ongoing translocation, mutation and deletion of genes within the segments. Based on these segment pairs, an interrelationship of the various chromosomes of soybean has been proposed and is shown in Figure 2.


Figure 2: Interrelationship of different soybean chromosomes.

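Chromosome number | Segment number | Number of TF genes in the segment | Total number of genes in the segment | Chromosome number of duplicated segment | Number of TF genes in the duplicated segment | Total number of genes in the duplicated segment | Orientation
Chr 1 | 1 | 11 | 130 | Chr 9 | 11 | 117 | Reverse
Chr 1 | 2 | 16 | 214 | Chr 2 | 16 | 227 | Reverse
Chr 1 | 3 | 22 | 239 | Chr 2 | 22 | 203 | Direct
Chr 1 | 4 | 4 | 29 | Chr 3 | 4 | 40 | Direct
Chr 1 | 5 | 8 | 110 | Chr 3 | 8 | 123 | Reverse
Chr 1 | 6 | 59 | 630 | Chr 11 | 54 | 619 | Reverse
Chr 2 | 7 | 24 | 291 | Chr 16 | 24 | 345 | Direct
Chr 2 | 8 | 9 | 114 | Chr 10 | 9 | 75 | Reverse
Chr 2 | 9 | 17 | 224 | Chr 14 | 20 | 253 | Direct
Chr 2 | 10 | 5 | 33 | Chr 14 | 5 | 20 | Direct
Chr 2 | 11 | 12 | 143 | Chr 14 | 13 | 168 | Reverse
Chr 2 | 12 | 10 | 170 | Chr 14 | 10 | 186 | Reverse
Chr 3 | 13 | 19 | 166 | Chr 7 | 14 | 88 | Reverse
Chr 3 | 14 | 105 | 1356 | Chr 19 | 105 | 1326 | Direct
Chr 4 | 15 | 37 | 407 | Chr 6 | 35 | 418 | Direct
Chr 4 | 16 | 41 | 586 | Chr 6 | 36 | 574 | Direct
Chr 4 | 17 | 25 | 255 | Chr 6 | 25 | 247 | Reverse
Chr 4 | 18 | 28 | 350 | Chr 6 | 27 | 354 | Reverse
Chr 4 | 19 | 18 | 293 | Chr 6 | 19 | 319 | Reverse
Chr 5 | 20 | 7 | 50 | Chr 17 | 7 | 53 | Reverse
Chr 5 | 21 | 15 | 149 | Chr 17 | 15 | 151 | Reverse
Chr 5 | 22 | 20 | 260 | Chr 17 | 22 | 261 | Direct
Chr 5 | 23 | 14 | 99 | Chr 17 | 13 | 81 | Reverse
Chr 5 | 24 | 6 | 107 | Chr 8 | 6 | 104 | Direct
Chr 5 | 25 | 35 | 476 | Chr 8 | 33 | 502 | Direct
Chr 5 | 26 | 37 | 485 | Chr 8 | 38 | 497 | Direct
Chr 6 | 27 | 25 | 257 | Chr 12 | 25 | 269 | Reverse
Chr 7 | 28 | 11 | 103 | Chr 8 | 11 | 93 | Reverse
Chr 7 | 29 | 28 | 243 | Chr 16 | 27 | 236 | Direct
Chr 7 | 30 | 19 | 223 | Chr 9 | 17 | 215 | Reverse
Chr 7 | 31 | 22 | 262 | Chr 20 | 26 | 323 | Direct
Chr 7 | 32 | 17 | 361 | Chr 17 | 19 | 354 | Reverse
Chr 8 | 33 | 7 | 58 | Chr 15 | 6 | 44 | Direct
Chr 8 | 34 | 11 | 168 | Chr 18 | 12 | 185 | Direct
Chr 9 | 35 | 46 | 942 | Chr 15 | 49 | 972 | Direct
Chr 9 | 36 | 20 | 423 | Chr 16 | 20 | 423 | Direct
Chr 9 | 37 | 38 | 465 | Chr 18 | 40 | 541 | Reverse
Chr 10 | 38 | 35 | 342 | Chr 13 | 34 | 314 | Direct
Chr 10 | 39 | 107 | 1431 | Chr 20 | 110 | 1361 | Reverse
Chr 11 | 40 | 43 | 486 | Chr 12 | 37 | 460 | Direct
Chr 11 | 41 | 46 | 676 | Chr 18 | 40 | 503 | Reverse
Chr 12 | 42 | 40 | 441 | Chr 13 | 40 | 460 | Reverse
Chr 13 | 43 | 19 | 186 | Chr 19 | 22 | 265 | Reverse
Chr 13 | 44 | 17 | 239 | Chr 17 | 18 | 244 | Reverse
Chr 13 | 45 | 40 | 488 | Chr 15 | 39 | 468 | Reverse
Chr 13 | 46 | 38 | 417 | Chr 15 | 42 | 420 | Reverse
Chr 14 | 47 | 37 | 425 | Chr 17 | 36 | 392 | Reverse
Chr 16 | 48 | 15 | 146 | Chr 19 | 15 | 192 | Reverse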

Table 4: Distribution and orientation of genes in duplicated segment pairs of soybean chromosomes.
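The chromosome relationships implied by Table 4 can be summarised with a small Python sketch. It treats each of the 48 segment pairs as an undirected link between two chromosomes, counts the segments per chromosome pair and lists the partners of each chromosome. This is only one possible way to tabulate the data, not necessarily how Figure 2 was drawn, and the variable names are hypothetical.

```python
from collections import Counter, defaultdict

# (chromosome, chromosome of duplicated segment) for segments 1-48 of Table 4.
SEGMENT_PAIRS = [
    (1, 9), (1, 2), (1, 2), (1, 3), (1, 3), (1, 11),               # segments 1-6
    (2, 16), (2, 10), (2, 14), (2, 14), (2, 14), (2, 14),          # segments 7-12
    (3, 7), (3, 19),                                               # segments 13-14
    (4, 6), (4, 6), (4, 6), (4, 6), (4, 6),                        # segments 15-19
    (5, 17), (5, 17), (5, 17), (5, 17), (5, 8), (5, 8), (5, 8),    # segments 20-26
    (6, 12),                                                       # segment 27
    (7, 8), (7, 16), (7, 9), (7, 20), (7, 17),                     # segments 28-32
    (8, 15), (8, 18),                                              # segments 33-34
    (9, 15), (9, 16), (9, 18),                                     # segments 35-37
    (10, 13), (10, 20),                                            # segments 38-39
    (11, 12), (11, 18),                                            # segments 40-41
    (12, 13),                                                      # segment 42
    (13, 19), (13, 17), (13, 15), (13, 15),                        # segments 43-46
    (14, 17),                                                      # segment 47
    (16, 19),                                                      # segment 48
]

# Number of duplicated segments shared by each chromosome pair.
pair_counts = Counter(tuple(sorted(pair)) for pair in SEGMENT_PAIRS)

# Partner chromosomes of every chromosome (undirected relationship).
partners = defaultdict(set)
for chrom_a, chrom_b in SEGMENT_PAIRS:
    partners[chrom_a].add(chrom_b)
    partners[chrom_b].add(chrom_a)

for chrom in sorted(partners):
    linked = ", ".join(f"Chr {p}" for p in sorted(partners[chrom]))
    print(f"Chr {chrom}: {linked}")
```

In this tabulation chromosome 4 is linked only to chromosome 6, through five segments, which agrees with the distribution of duplicated TF genes in Tables 2 and 3.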

Duplication creates genetic redundancy, leading to evolutionary innovation. Over time, the duplicated copy may acquire a beneficial mutation, resulting in the retention of both copies. Alternatively, a mutation in the duplicated segment may make it non-functional. The recognition that a single protein can have multiple catalytic or structural functions supports the contribution of gene duplication. Recent studies have reported genome-wide analyses of the duplication of individual transcription factor families [15-20]. Our results are consistent with these studies; however, there are some minor differences in the position of duplicated segments. Our study provides strong evidence for the role of large segmental duplication events in the architecture and evolution of the soybean genome, using a simple method of sequence and order analysis of TF genes.

Conclusion

A detailed analysis of these genes using bioinformatics tools may help in establishing the process of gene duplication in other species and genera. By analysing the distribution and order of transcription factor genes, the early mode of genome duplication was established. This method provides an easy and effective tool to study genome duplication in different species and genera. Functional analysis of the duplicated genes is required to fully elucidate the process of genome duplication.

References


Citation: Srivastava MK, Satpute GK (2022) Molecular Evidence for Segmental Duplication across Chromosomes of Soybean Using Transcription Factor Gene Family. J Cell Sci Therapy. 13:345.

Received: 04-Feb-2022, Manuscript No. JCEST-22-15751; Editor assigned: 07-Feb-2022, Pre QC No. JCEST-22-15751 (PQ); Reviewed: 18-Feb-2022, QC No. JCEST-22-15751; Revised: 25-Feb-2022, Manuscript No. JCEST-22-15751 (R); Published: 04-Mar-2022, DOI: 10.35248/2157-7013-22.13.345

Copyright: © 2022 Srivastava MK, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
