Chloroplast Genome-Based Baraminology Study of Liliales (Open Access)


ABSTRACTS


The mitochondrial DNA has been used widely in molecular baraminology studies. Besides the mitochondrion, the chloroplasts of plants harbor an entire organelle genome, which can also be used to identify putative groups, which can be compared to those identified in studies based on mitochondrial DNA, nuclear DNA, and morphological characters. The ten-fold size of the chloroplast genome allows a much larger sequence space to be analyzed, but it also brings its own set of analytical challenges to the table.

In this study, the chloroplast genomes of 163 species from the plant order Liliales were examined, with the genus Stemona (order Pandanales) as an outgroup. A multiple alignment was created, and a sequence similarity matrix was derived from it, which was then clustered into several putative holobaramins. Of these, six groups had more than five members. These putative baramins are: Tulipa+Amana, Disporum, Fritillaria+Lilium+Nomocharis+Notholirion, Daiswa+Paris+Trillium, Smilax, and Veratrum .

Compared to the results of a previous morphological analysis, there are some discordances. The previous analysis separates Trillium from Daiswa and Paris, and unites Alstroemeria with Disporum. Ultimately, genetics should decide baraminic classification, albeit the chloroplast genome still makes up only a small fragment of the entire plant genome. Following this study, it is hoped that chloroplast genome analysis may enrich the toolkit of molecular baraminology.


Chloroplast Genome-Based Baraminology Study of Liliales

Matthew Cserhati

Key Words: Amana, chloroplast DNA, Disporum, Fritillaria, molecular baraminology, Liliales, organelle DNA, Paris, Tulipa, Veratrum,

Introduction

Organelle genomes are useful in analyzing species relationships via sequence similarities. Their short size makes them easy to sequence, and occur abundantly in cells. Mitochondrial DNA also do not undergo the kind of genetic recombination that nuclear genes do. However, their short length limits the kind of conclusions that we can draw from using them in organelle DNA studies.

Mitochondrial DNA (mtDNA) have been widely used in molecular baraminology studies for decades. Some mtDNA studies have been performed in turtles (Robinson et al., 1997), cats, dogs, and horses (Wood, 2013). Cserhati analyzed mtDNA in bats (Cserhati, 2021), cephalopods (O'Micks, 2018), birds (Cserhati and Alquist, 2019), primates (Cserhati, 2022), and pinnipeds (Cserhati and Moynaugh, 2022).

Chloroplast studies, on the other hand, are almost non-existent in the molecular baraminology literature. Wood and Cavanaugh (2001) studied the relationships between eleven species of Flaveria (yellowtops), and found that C3, C4, and C3-C4 species segregate into separate monobaramins, based on the sequence similarity of the H subunit of the glycine carboxylase system gene (gcsH). Furthermore, the sequence similarity of the NADH dehydrogenase gene (ndhF) calculated by Kim and Jansen (1995) for the family Asteraceae was used by Wood and Cavanaugh to assess baraminic relationships within the subtribe Flaveriinae and other groups within Asteraceae.

However, no such genome analysis has been carried out on chloroplast genomes for a large number of plant species. Compared to the mitochondrial genome, the chloroplast genome is much longer, around 150 Kbp. It contains 100-150 coding sequences, genes, rRNAs and tRNAs, compared to the 37 of the mtDNA. It is made up of four genomic regions, two inverted repeats (IR), a large and a small single copy (LSC, SSC), and a small copy region (SCR). This makes it harder to align these longer sequences, so alternative alignment software, such as MAFFT may be used as opposed to ClustalW or ClustalO. Furthermore, chloroplast DNA also undergoes recombination. On the other hand, since plant cells also contain mitochondria, it may be possible to supplement chloroplast studies with mitochondrial and also morphological studies.

In this study, my goal is to study the sequence similarity between plants in the order Liliales (lilies). Lilies are monocotyledonous flowering plants and are angiosperms. It is made up of 10 families, 67 genera and 1,768 species. They are characterized by a bulbous rhizome and six alternating tepals. Similar to mtDNA studies, I aim to determine baraminic relationships based on clustering these 163 species based on global chloroplast genome sequence similarity.

Materials and Methods

Sequence data

The chloroplast genomes of 165 species from Liliales were downloaded from the NCBI Organelle Genome Browser at https://www.ncbi.nlm.nih.gov/genome/browse#!/organelles. Four chloroplast genomes from the genus Stemona (order Pandanales) were also downloaded as outlier species. Outgroups may be used in baraminology studies to compare the organisms under study to a group that can reasonably be identified as a separate holobaramin. If clusters are found in the data set under analysis that are comparable to the outgroup holobaramin, then that supports the conclusions drawn from the study. A list of species that were studied can be found in Supplementary File 1. Disporum bodinieri and Lilium souliei were removed because their sequence origin was irregular compared to all other sequences (which began with the gene psbA). Since generally the level of the holobaramin is the family, the order Liliales was chosen, making it likely to find multiple holobaramins within this larger taxonomic group.

Software

The sequence aligner software MAFFT (Katoh and Standley, 2014) was used because it can align longer sequences compared to mtDNA. Its high speed is also desirable. MAFFT was run on the chloroplast genomes with default parameters.

The Calinski-Harabasz index (Ujjwal and Sanghamitra, 2002) was calculated to help determine the optimal number of baramins in the sequence similarity matrix using the cascadeKM command from the 'vegan' package.

Baraminic heatmaps were created using R, version 4.1.0 using the heatmap command, and clusters were determined with the 'single' method. The results of clustering and group statistics are also available in Supplementary File 1.

Baraminic trees were drawn for three putative clusters (assumed to be holobaramins) in Figures 2-4 using the WPGMA (weighted pair group method with arithmetic mean) method. For this the 'wpgma' command was used in R with default parameters. The linkage clustering method used by the wpgma command is 'average.' In average linkage clustering, the distance between two clusters is equal to the average of all of the distances between all pairs of entities in the two clusters. WPGMA was preferred over UPGMA, since UPGMA assumes a uniform base pair substitution rate ( Göker, 2011 ).

The elbow, silhouette, and Calinski-Harabasz values for 1-20 clusters can be found in Supplementary File 2. The Organelle DNA Lineages (ODL) software was used to generate chloroplast genome maps for Veratrum species as well as Xerophyllum tanax. All supplementary files are available on Zenodo at https://zenodo.org/record/7686930#.Y_7fzh_MLrc. See the end of the paper for a description of each of the supplementary files.

Determination of species clusters as possible holobaramins

The fundamental tenet of baraminology (FTB) states that there exists significant discontinuity between baramins, and significant continuity within baramins. As such, Student's t-test p-values were used to determine the statistical significance of the tentative species clusters resulting from the cluster analysis. For each cluster, two sets of sequence similarity values were analyzed using Student's t-test. The first set of values comes from sequence comparisons between the species in a given cluster. The second set of values comes from comparing sequences in the cluster versus all other species used in the analysis. Therefore, the p-value unites the statistical significance of both outward dissimilarity (discontinuity) and inward similarity (continuity), as required by the FTB.

Biblical Analysis

Lilies are mentioned 17 times in the Bible (1Kings 7:19, 22, 26; 2 Chronicles 4:5; Psalm 45:1, 60:1, 69:1, 80:1; Song 2:1-2, 16, 4:5, 5:13, 6:2-3, 7:2; Hosea 14:5; Matthew 6:28; Luke 12:27). Together with all land plants, they were created on Day 3 of Creation Week (Genesis 1:11-12).

Most of these verses involve poetic descriptions of lilies, describing how their shape and vivid coloration exceeds the glory of even the greatest kings (Luke 12:27). However, the Bible doesn't let us draw any conclusions about how lilies are grouped or subdivided.

Song 2:1 appears to compare lilies to roses: "I am the rose of Sharon, and the lily of the valleys." However, this is only a poetic comparison, and not much can be concluded from it. Song 2:2 says this: "Like a lily among thorns, So is my love among the daughters." This seems to suggest a discontinuity between lilies and thorny plants, such as roses. Thus, we can say that lilies and thorny plants (such as roses) belong to separate baramins.

Morphology-Based Baraminology Analyses of Liliales

Wood (2008) performed morphology-based baraminology analysis using the BDIST software on three data sets from three families of Liliales: Alstroemeriaceae, Pontederiaceae, and Trilliaceae. Of these, the first and the third analysis is of interest to this study.

Wood found continuity between the genera Luzuriaga, Uvularia , Disporum, and Dioscorea. These four genera showed discontinuity with species from two other genera, namely Alstroemeria and Bomarea, which showed continuity between themselves. This corresponds to two holobaramins within Alstroemeriaceae.

Wood found evidence for two more holobaramins within the family Trilliaceae. These two holobaramins show continuity within themselves and discontinuity between each other. The first holobaramin is made up of almost exclusively of species from the genus Trillidum, plus the species Pseudotrillium rivale. The second holobaramin is made up of species from Paris, Kinugasa, and Daiswa.

Results and Discussion

The heatmap showing the baraminic relationships between the 163 species can be seen in Figure 1. The Hopkins clustering statistic is 0.965, which reflects extremely good clustering. The Elbow plot in Supplementary Figure 1 is a bit difficult to interpret. The total within sum of squares (twss) value drops very shallowly from three to four clusters: from 30.64 to 30.18, which is a difference of only 1.5%. However, from four to five clusters, the twss value drops to 11.03, but then rises again to 16.49 for six clusters. Afterward, the twss value rises and falls, making it hard to discern the optimal number of clusters.

The Silhouette plot in Supplementary Figure 2 also shows a similar difficulty in determining the optimal number of clusters in the data, as the maximum Silhouette value varies widely.

On the other hand, the Calinski-Harabasz index (depicted in Supplementary Figure 3) presented a clearer picture, showing an optimal 16 clusters with a Calinski-Harabasz index of 969.2. All 16 of these putative baramins have a statistically significant p-value.

These 16 clusters were examined more closely. The first cluster is made up of only two species, an Alstroemeria hybrid and Luzuriaga radicans . This contradicts the morphological evidence of Wood (2008) who separates Alstroemeria and Luzuriaga. In this study, these two species do not provide solid enough evidence to join these two genera together into a single holobaramin.

The second cluster is made up of 15 species from the two genera Amana and Tulipa, two genera from the family Liliaceae. Both of these groups are known as tulips, Amana species being known as East Asian tulips (Li et al., 2017). Several authors even place Amana within Tulipa (Liang, 1995). Other studies place Amana and Tulipa , as well as the genus Erythronium into a monophyletic group (Kim et al., 2013). Compared to Tulipa, Amana has lost the infA gene (translation initiation factor 1), and its rps19 gene lies entirely within the LSC region in Amana edulis, as opposed to Tulipa species where rps19 lies on the boundary between the LSC and the IRb. Furthermore, the ndhF gene lies entirely within the SSC region in A. edulis , whereas it transverses the IRb/SSC boundary (see Figure 2 in Li et al., 2021).

A baraminic tree showing species from Tulipa and Amana can be seen in Figure 2. T. patens, T. thianschanica, T. schrenkii, T. iliensis, T. gesneriana, and T. sinkiangensis form a visible monobaramin compared to the seven Amana species. T. altaica is an outlier species compared to the other 14 species. Its clp genome length (146,877 bp) is shorter than the average of the other Tulipa species (151,986 bp) (Zhou et al., 2019). Its GC% is also at 37.1%, compared to the mean value of 36.6±0.03% (p = 1.08 x 10-62) of all the other species. It also has only 81 coding sequences compared to 87 for all the other Tulipa species (with T. gesneriana having 85 coding sequences). A list of deletions greater than 100 bp from the T. altaica clp genome compared to the other genomes of Tulipa can be seen in Table I. Together, these deletions make up 5,566 bp, which makes up 3.8% of the 146,877 bp genome of T. altaica. The multiple alignment file is available online as Supplementary File 3. The following 15 genes and rRNA molecules are missing from the clp genome of T. altaica, which are present in the clp genomes of the other Tulipa species are listed in Table II. The gene order of the Tulipa species can be found in Supplementary File 4.

The third cluster consists of five species: Calochortus uniflorus, Prosartes lanuginosa, Scoliopus bigelovii, Tricyrtis formosana , and Tricyrtis macropoda. These five species come from three separate families within Liliales, Liliaceae, Colchicaceae, and Melanthiaceae. The placement of these species has been in flux. Various authors have placed Tricyrtis as a sister group to Calochortus , whereas others have allied it with Scoliopus and Prosartes . The position of Tricyrtis has varied in evolutionary phylogenetic trees based on the rbcL and ndhF genes, thus highlighting the usefulness of whole genome sequence analysis as opposed to single genes (Patterson and Givnish, 2002).

Cluster #4 consists of a single species, Campynema lineare (the green mountain lily), found in Tasmania. This species is the only one that belongs to the genus Campynema, which belongs to the family Campynemataceae. Vinnersten and Bremer (2001) report that this family is monophyletic, suggesting that it is thus a holobaramin. Interestingly, this species is the least similar compared to all the rest of the species, with an average clp genome sequence similarity of 0.745. Some evolutionists claim that C. lineare is the most basal, and thus the most ancient member of the order Liliales (Janssen, 2004). However, if this were true, then it should also be the most diverse group-quite the opposite is true.

The next cluster (#5) consists of only two species, Clintonia udensis and Lloydia tibetica. These two species belong to separate genera in the family Liliaceae.

Cluster #6 has four species in it, Colchicum autumnale, Disporum sessile , Gloriosa superba, and Iphigenia indica. All these species belong to different genera in the family Colchicaceae. However, D. sessile appears to be misplaced, as the next cluster (#7) contains four Disporum species: D. cantoniense, D. megalanthum , D. uniflorum, and D. viridescens. It might be possible to consolidate clusters 6 and 7 into one putative baramin. The clp genomes of all five Disporum species were aligned with MAFFT, and it was found that D. sessile contains several large indels compared to the other four Disporum species. The multiple alignment file is available online as Supplementary File 5. These indels are listed in Table III. The total length of all those indels longer than 50 bp comes up to 5,469 bp, which makes up 3.4% of the 159,102 bp clp genome of D. sessile . Whereas D. sessile has an average sequence similarity of only 93.7% with the other four Disporum species, these other four species have an average sequence similarity of 99.4% among themselves. A large part of the 5.7% difference could be due to these indel sequences. The DNA insertions within the D. sessile genome may possibly lead to the formation of a monobaramin within Disporum.

The next cluster (#8) is made up of 67 species from the genera Fritillaria , Lilium, Nomocharis, and Notholirion. These are all members of the family Liliaceae. These four genera and Cardiocrinum comprise the tribe Lilieae (Li et al., 2022). Some taxonomies accommodate Nomocharis into Lilium (Gao and Gao, 2016). The GC% of the species in this study also fits into a narrow range (36.9-37.1%). Figure 3 shows a baraminic tree with species from Lilium, Nomacharis , Notholirion, and Fritillaria. The species in this group are characterized karyotypically of two long metacentric chromosomes and ten medium-length telocentric chromosomes (Patterson and Givnish, 2002).

Cluster #9 is made up of three species, Heloniopsis tubiflora, Ypsilandra thibetica, and Ypsilandra yunnanensis. These are two genera from the family Melanthiaceae.

Cluster #10 is made up of a single species, Medeola virginiana. It is the only species in its genus, Medeola. It belongs to the family Liliaceae.

Cluster #11 is made up of 35 species, two from the genus Trillium, four from Daiswa, and 29 from Paris. Whereas Wood (2008) assigned Paris and Daiswa into the same holobaramin, he found that Trillium forms its own holobaramin.

Chloroplast genome analysis performed by Huang et al. (2016) shows that the genus Paris can be split up into two segregate genera, namely Paris and Daiswa. Jiang et al. (2022) also found that several species from Trillium and Paris intermingle with one another. Furthermore, several species from Paris and Trillium have a genome size 1C greater than 100 Gb, with P. japonica having the largest known eukaryotic genome at around 150 Gbp (Yang et al., 2019).

The baraminic tree in Figure 4 shows four Daiswa species intermingling with species from the genus Paris. Besides this, Daiswa and Trillium are sister taxa, implying that Trillium could be included in the Paris+Daiswa+Trillium = Parideae tribe/holobaramin. In the clp genome sequence similarity matrix T. camschatcense and T. govanianum have a sequence similarity of 92.4%. A bit surprisingly, they have an average sequence similarity of 92.8% with species from Paris and Daiswa. This heavily implies that Trillium is part of the Parideae holobaramin.

Do et al. (2014) suggest that the chloroplast envelope membrane protein A (cemA) gene may be used as a molecular taxonomic marker to identify species from Paridae. They base this on the divergent protein sequence after the first four amino acids. This gene has apparently undergone pseudogenization due to stop codons in its amino acid sequence.

Cluster #12 is made up of six species from the genus Smilax, in the family Smilacaceae. These are Smilax glabra, S. goeringii , S. microphylla, S. moranensis, S. riparia, and S. weniae.

The next two clusters, #13 and #14 are a special case. Zhang et al. (2021) also found that the genus Veratrum is monophyletic based on a study of the genes and intergenic regions of 10 Veratrum species. Here the 13 species from the genus Veratrum are split into two groups, which show up separate clusters on the baraminic heatmap in Figure 1. The first group, cluster #13 contains ten species: Veratrum dahuricum , V. grandiflorum, V. japonicum, V. maackii, V. mengtzeanum, V. nanchuanense, V. nigrum, V. oblongum , V. schindleri, and V. stenophyllum. The three species from cluster #14 are V. oxysepalum, V. patulum, and V. taliense . The mean sequence similarity between these two groups is 90±1.1%. Within cluster #14 this is 97.8±1.7%. Within cluster #14 this value is 97.9±1.2%.

What is the cause of this disparity in sequence similarity? Does it mean that there are two Veratrum holobaramins? Or could there be an underlying genetic reason for this difference? It could be possible that since the chloroplast genome is approximately ten times larger than the mitochondrial genome, in a geometric sense this allows for more mutations, such as insertions, deletions, and inversions to occur. This we have already seen in the case of the five Disporum species. Indeed, a 17,608 bp inversion was found in the genomes of cluster #13. This inversion was also discovered by Chen et al. (2021) in the chloroplast genome of V. oxysepalum, which they found clustered the closest to V. patulum . This inversion encompasses 13 genes: NdhF, RPL32, tRNA-Leu, CcsA, NdhD, PsaC, NdhE, NdhG, NdhI, NdhA, NdhH, RPS15, and Ycf1. See Supplementary File 6 to view the gene orders of these 13 Veratrum species as well as Xerophyllum tenax, an outlier. Figure 5 displays the clp genomes of the 14 Veratrum species as well as the genome of X. tenax for comparison.

When did this large-scale inversion happen? Did this inversion affect the smaller group, meaning that they are a younger group? Or vice versa? The order of the 13 genes is the same between X. tenax and the species from cluster #14. Thus, the inversion predates the formation of the ten species in cluster #13. Interestingly, the loss of rps16 exon 2 is characteristic of Veratrum species compared to all other members of Liliales (Do et al., 2013).

This large-scale inversion of this segment of the DNA could have led to a speciation event, leading to the formation of two monobaramins. Since the inversion event affected 13 genes (9.6% of the 135 genes in the genomes of species from the Veratrum genus), these genes may have become dysregulated.

Cluster #15 is made up of a single species, X. tenax. It belongs to the genus Xerophyllum, which belongs to the family Melanthiaceae.

Lastly, cluster #16 is made up of the four outlier species from the genus Stemona. This genus belongs to the order Pandanales.

Summary and Conclusion

This is the first molecular baraminology study that involves the analysis of the entire chloroplast genome. While there was no mitochondrial data available for these Liliales species, the present results were comparable with a previously made morphological analysis. Whereas Wood unified Alstromeria and Disporum, the molecular evidence splits them into two separate groups, while putting together Luzuriaga and Alstromeria. Conversely, molecular evidence unifies Daiswa, Paris, and Trillium, whereas morphological evidence separates Trillium from the other two. Although the groups found here are tentative, if they coincide with the results from other studies, that may strengthen their status as a holobaramin.

Table IV shows the similarities and differences in baraminic classification in 42 species from these six genera between the molecular and the morphological study. It may be that further study is needed, especially since the chloroplast genome makes up only a small fraction of the genome, similar to the mtDNA. However, since the genetic evidence is primary compared to the morphological data, the results drawn from the molecular analysis should be given prominence.

Since the chloroplast genome is longer than the mtDNA, geometrically it offers a larger surface for mutations to occur in. Even though a longer sequence makes more robust analysis possible, mutations, such as insertions and deletions make the picture more complex. These kinds of mutations may possibly lead to speciation events and the formation of new monobaramins. As in the case of Disporum, Tulipa, and Veratrum, insertions, deletions, and large-scale inversions decreased the sequence similarity between species. This means that further detailed analysis might be needed to clarify baraminic relationships between species.

In summary, six groups can be proposed to form holobaramins, each with at least five members. These are Tulipa+Amana, Disporum , Fritillaria+Lilium+Nomocharis+Notholirion, Daiswa+Paris+Trillium, Smilax, and Veratrum . Whereas in 2014 there were only 340 angiosperm species with available chloroplast genomes (Shaw et al., 2014), as of today (February 22, 2023), there are 11,834 chloroplast/plastid genome sequences in the Organelle Genome Browser.

This is definitely a significant increase, making future chloroplast genome-based molecular baraminology studies possible. Jeansen (2013) analyzed approximately 2,700 mitochondrial genomes, and found that differences between kinds were due not to random changes since creation, but rather, were due to functional changes in the mtDNA. A similar study could be performed for a large number of chloroplast genomes to discover similar functional changes that may possibly be the basis for differences within plant kinds.

Tables

Table I. A list of deletions from the T. altaica clp genome compared to the other species of Tulipa.

Start position

End position

Deletion length

6167

6476

310

13947

14060

114

29737

29897

161

30209

31056

848

43552

43749

198

44137

44364

228

45956

46349

394

59234

59583

350

64825

65359

535

66671

67559

889

110019

110440

422

112854

113970

1117

Table II. List of genes and rRNA molecules missing from the T. altaica clp genome compared to the other Tulipa species.

Gene name/function

ATP-dependent Clp protease proteolytic subunit

acetyl-CoA carboxylase carboxyltransferase beta subunit

chloroplast envelope membrane protein

cytochrome c heme attachment protein

hypothetical chloroplast RF19

hypothetical protein RF1

hypothetical protein RF2

hypothetical protein RF68

NADH-plastoquinone oxidoreductase subunit 7

PetG

photosystem II CP43 chlorophyll apoprotein

photosystem I subunit IX

ribosomal protein L32

RNA polymerase beta'' subunit

Ycf3

Table III. A list of indels ≥ 50 bp between Disporum sessile and other Disporum species in a multiple alignment of their clp genomes.

Start pos. in alignment

Length of indel

Insertion (ins) or deletion (del)

1

164

Ins

6096

54

Ins

6844

366

Del

8866

149

Ins

30297

119

Del

31528

125

Del

32528

434

Del

33365

73

Del

47605

56

Ins

48670

143

Ins

50104

79

Del

53089

89

Del

59164

106

Ins

59617

51

Ins

66571

54

Del

67716

157

Ins

96781

616

Del

114518

1727

Ins

118877

50

Ins

119887

61

Ins

133685

180

Ins

152308

616

Del

Table IV. Baraminic placement of 42 Liliales species with both an available chloroplast genome sequence, and which were analyzed in the Wood (2008) morphology study.

Species

Sequence similarity classification

Morphological classification

Alstroemeria hybrid

1

A

Luzuriaga radicans

1

B

Disporum sessile

2

A

Disporum cantoniense

2

A

Disporum megalanthum

2

A

Disporum uniflorum

2

A

Disporum viridescens

2

A

Daiswa chinensis

3

C

Daiswa dunniana

3

C

Daiswa forrestii

3

C

Daiswa yunnanensis

3

C

Paris axialis

3

C

Paris bashanensis

3

C

Paris birmanica

3

C

Paris caobangensis

3

C

Paris cronquistii

3

C

Paris daliensis

3

C

Paris delavayi

3

C

Paris dulongensis

3

C

Paris forrestii

3

C

Paris incomplete

3

C

Paris japonica

3

C

Paris liana

3

C

Paris luquanensis

3

C

Paris mairei

3

C

Paris marmorata

3

C

Paris polyphylla

3

C

Paris polyphylla var. emeiensis

3

C

Paris qiliangiana

3

C

Paris quadrifolia

3

C

Paris rugosa

3

C

Paris sp. CG-2021b

3

C

Paris tengchongensis

3

C

Paris tetraphylla

3

C

Paris thibetica

3

C

Paris undulata

3

C

Paris vaniotii

3

C

Paris vietnamensis

3

C

Paris xichouensis

3

C

Paris yanchii

3

C

Trillium camschatcense

3

D

Trillium govanianum

3

D

Figure captions

Figure 1. Heatmap showing tentative baraminic relationships between 163 Lilailes species. Dark green areas represent higher sequence similarity between the clp genomes of species within the same baramin. Yellow colors denote lower sequence similarity between species from two different baramins.

Figure 2. Baraminic tree of the tentative Tulipa holobaramin.

Figure 3. Baraminic tree of the tentative Lilieae holobaramin.

Figure 4. Baraminic tree of the tentative Parideae holobaramin.

Figure 5. Gene order map of the chloroplast genomes of 13 Veratrum species and Xerophyllum tenax (outlier). V. oxysepalum, V. patulum, and V. taliense form a small group separate from the other ten species. The red line denotes the large-scale inversion of 17,608 bp.

Supplementary Files

Supplementary File 1. Sequence metadata, sequence similarity matrix, cluster partitioning and cluster statistics of the 165 Liliales species used in this study.

Supplementary File 2. Values from the elbow, silhouette and Calinski plots, seen in Supplementary Figures 1, 2, and 3, respectively.

Supplementary File 3. Multiple alignment of species from the genus Tulipa .

Supplementary File 4. Gene order list of species from the genus Tulipa .

Supplementary File 5. Multiple alignment of species from the genus Disporum .

Supplementary File 6. Gene order list of species from the genus Veratrum .

References

Chen, Z.J., J.J. Liu, Y.M. Zhang, Z.G. Qian and G.D. Li. 2021. The complete chloroplast genome sequence of Veratrum oxysepalum and phylogenetic analysis. Mitochondrial DNA B: Resources 6(7): 2015-2016.

Cserhati, M., and J. Alquist. 2019. Baraminology suggests cryptic relationships among Caprimulgiformes, Journal of Creation 33(3): 66-76.

Cserhati, M. 2021. Molecular and morphological analysis predicts four bat baramins. Journal of Creation 35(1): 117-127.

Cserhati, M. 2021. Prediction of monophyletic groups based on gene order and sequence similarity in organelle DNA. American Journal of Molecular Biology 11(4): 83-99.

Cserhati, M. 2022. Molecular baraminology of primates. Creation Research Society Quarterly 59(2): 72-80.

Cserhati, M., and E. Moynaugh. 2022. Pinniped molecular baraminology. Creation Research Society Quarterly 58(3): 193-203.

Do, H.D., J.S. Kim, and J.H. Kim. 2013. Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae). Gene 530(2): 229-235. doi: 10.1016/j.gene.2013.07.100.

Do, H.D., J.S. Kim, and J.H. Kim. 2014. A trnI_CAU triplication event in the complete chloroplast genome of Paris verticillata M.Bieb. (Melanthiaceae, Liliales). Genome Biology and Evolution 6(7): 1699-1706. doi: 10.1093/gbe/evu138.

Gao, Y.D., and X.F. Gao. 2016. Accommodating Nomocharis in Lilium (Liliaceae). Phytotaxa 277(2): 205-210. doi: 10.11646/phytotaxa.277.2.8.

Göker, M.W. 2011. Cluster analysis: UPGMA and WPGMA. In The Yeasts: A Taxonomic Study , Volume 1. C.P. Kurtzman, J.W. Fell, and T. Boekhout, (editors). Elsevier, New York, NY.

Huang, Y., X. Li, Z. Yang, C. Yang, J. Yang, and Y. Ji. 2016. Analysis of complete chloroplast genome sequences improves phylogenetic resolution in Paris (Melanthiaceae). Frontiers in Plant Science 7: 1797. doi: 10.3389/fpls.2016.01797.

Janssen, T., and K. Bremer. 2004. The age of major monocot groups inferred from 800+ rbcL sequences. Botanical Journal of the Linnean Society 146(4): 385-398.

Jeansen, N. 2013. Recent, functionally diverse origin for mitochondrial genes from ~2700 metazoan species. Answers Research Journal 6(2013): 467-501.

Jiang, Y., Y. Miao, J. Qian, Y. Zheng, C. Xia, Q. Yang, C. L iu, L. Huang, and B. Duan. 2022. Comparative analysis of complete chloroplast genome sequences of five endangered species and new insights into phylogenetic relationships of Paris. Gene 833: 146572. doi: 10.1016/j.gene.2022.146572.

Katoh, K., and D.M. Standley. 2014. MAFFT: Iterative refinement and additional methods. Methods in Molecular Biology 1079: 131-46. doi: 10.1007/978-1-62703-646-7_8.

Kim, J.S., and J.H. Kim. 2013. Comparative genome analysis and phylogenetic relationship of order Liliales insight from the complete plastid genome sequences of two Lilies ( Lilium longiflorum and Alstroemeria aurea ). PLoS ONE 8: e68180. doi: 10.1371/journal.pone.0068180.

Li, P., R.S. Lu, W.Q. Xu, T. Ohi-Toma, M.Q. Cai, Y.X. Qiu, K.M. Cameron, and C.X. Fu. 2017. Comparative genomics and phylogenomics of East Asian Tulips (Amana, Liliaceae). Frontiers in Plant Science 8: 451. doi: 10.3389/fpls.2017.00451.

Li, J., M. Price, D.M. Su, Z. Zhang, Y. Yu, D.F. Xie, S.D. Zhou, X.J. He, and X.F. Gao. 2021. Phylogeny and comparative analysis for the plastid genomes of five Tulipa(Liliaceae). BioMed Research International 2021: 6648429. doi: 10.1155/2021/6648429.

Li, J., J. Cai, H.H. Qin, M. Price, Z. Zhang, Y. Yu, D.F. Xie, X.J. He, S.D. Zhou, and X.F. Gao. 2022. Phylogeny, age, and evolution of Tribe Lilieae (Liliaceae) based on whole plastid genomes. Frontiers in Plant Science 12: 699226. doi: 10.3389/fpls.2021.699226.

Liang, S.Y. 1995. Chorology of Liliaceae (s. str.) and its bearing on the Chinese flora. Acta Phytotaxonomica Sinica 33(1): 27-51.

O'Micks, J. 2018. A preliminary cephalopod baraminology study based on the analysis of mitochondrial genomes and morphological characteristics. Answers Research Journal 11: 193-204.

Patterson, T.B., and T.J. Givnish. 2002. Phylogeny, concerted convergence, and phylogenetic niche conservatism in the core Liliales: Insights from rbcL and ndhF sequence data. Evolution 56(2): 233-252. doi: 10.1111/j.0014-3820.2002.tb01334.x.

Robinson, D.A. 1997. A mitochondrial DNA analysis of the testudine apobaramin. Creation Research Society Quarterly 33(4): 262-272.

Ujjwal, M., and B. Sanghamitra. 2002. Performance evaluation of some clustering algorithms and validity indices. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12): 1650-1654.

Vinnersten, A., and K. Bremer. 2001. Age and biogeography of major clades in Liliales. American Journal of Botany 88(9): 1695-703. PMID: 21669704.

Wood, T.C., and D.P. Cavanaugh. 2001. A baraminological analysis of subtribe Flaveriinae (Asteraceae: Helenieae) and the origin of biological complexity. Origins 2001(52).

Wood, T.C. 2008. Animal and plant baramins. CORE Issues in Creation 3: 195-204.

Wood, T.C. 2013. Mitochondrial DNA analysis of three terrestrial mammal baramins (Equidae, Felidae, and Canidae) implies an accelerated mutation rate near the time of the Flood. The Proceedings of the International Conference on Creationism 7, article 35.

Yang, L., Z. Yang, C. Liu, Z. He, Z. Zhang, J. Yang, H. Liu, J. Yang, and Y. Ji. 2019. Chloroplast phylogenomic analysis provides insights into the evolution of the largest eukaryotic genome holder, Paris japonica (Melanthiaceae). BMC [BioMed Central] Plant Biology 19(1): 293. doi: 10.1186/s12870-019-1879-7.

Zhang, Y.M., L.J., Han, C.W. Yang, Z.L. Yin, X. Tian, Z.G. Qian and G.D. Li. 2021. Comparative chloroplast genome analysis of medicinally important Veratrum(Melanthiaceae) in China: Insights into genomic characterization and phylogenetic relationships. Plant Diversity 44(1): 70-82. doi: 10.1016/j.pld.2021.05.004.

Zhou, J-T., P-P. Yin, Y. Chen, and Y-P. Zhao. 2019. The complete chloroplast genome of Tulipa altaica (Liliaceae), a wild relative of tulip. Mitochondrial DNA Part B 4(1): 2017-2018. DOI: 10.1080/23802359.2019.1618221.

 

Figures

Figure 1. Heatmap showing tentative baraminic relationships between 163 Lilailes species. Dark green areas represent higher sequence similarity between the clp genomes of species within the same baramin. Yellow colors denote lower sequence similarity between species from two different baramins. 

Figure 2. Baraminic tree of the tentative Tulipa holobaramin. 

Figure 3 (right). Baraminic tree of the tentative Lilieae holobaramin 

Figure 4. Baraminic tree of the tentative Parideae holobaramin. 

Figure 5. Gene order map of the chloroplast genomes of 13 Veratrum species and Xerophyllum tenax (outlier). V. oxysepalum, V. patulum, and V. taliense form a small group separate from the other ten species. The red line denotes the large-scale inversion of 17,608 bp.