Plant Biosystems Design Research Roadmap 1.0

Human life intimately depends on plants for food, biomaterials, health, energy, and a sustainable environment. Various plants have been genetically improved mostly through breeding, along with limited modification via genetic engineering, yet they are still not able to meet the ever-increasing needs, in terms of both quantity and quality, resulting from the rapid increase in world population and expected standards of living. A step change that may address these challenges would be to expand the potential of plants using biosystems design approaches. This represents a shift in plant science research from relatively simple trial-and-error approaches to innovative strategies based on predictive models of biological systems. Plant biosystems design seeks to accelerate plant genetic improvement using genome editing and genetic circuit engineering or create novel plant systems through de novo synthesis of plant genomes. From this perspective, we present a comprehensive roadmap of plant biosystems design covering theories, principles, and technical methods, along with potential applications in basic and applied plant biology research. We highlight current challenges, future opportunities, and research priorities, along with a framework for international collaboration, towards rapid advancement of this emerging interdisciplinary area of research. Finally, we discuss the importance of social responsibility in utilizing plant biosystems design and suggest strategies for improving public perception, trust, and acceptance.


Introduction
Humans depend on plants for a variety of important resources including sustenance, energy, clothing, bio-based products, and shelter [1][2][3]. On a global scale, plants play critical roles in biogeochemical cycling and environmental stability [4,5]. There are currently ~374,000 known plant species on Earth, of which approximately 82% are vascular plants [6], and only a small fraction of these have been domesticated. There still exists vast potential for incorporating useful traits from wild relatives and from natural plant populations to design plants for human and environmental use. It has become increasingly clear that the current trajectories of yield increase for staple crop varieties/cultivars will not be adequate to meet the future demands of the increasing global population [7,8]. Furthermore, many crop plants may not be sufficiently robust to cope with impending stresses of rapid climate change such as extreme weather, reduced water resources (e.g., reduction in both quantity due to drought and quality due to pollution), and deteriorated soil quality [9,10]. Therefore, there is an urgent need for new strategies to accelerate crop development and domestication and expand the possibilities for a thriving plant-based bioeconomy (e.g., production of novel biobased products) to address our projected economic, social, and environmental needs. To this end, a new frontier in plant research called "plant biosystems design" is emerging and quickly evolving. Plant biosystems design is an interdisciplinary field of research that seeks to genetically/epigenetically improve plants or create novel plant traits or organisms through editing, engineering and refactoring of native, heterologous, or synthetic biological parts based on predictive design (Figure 1). To promote this emerging field, we present a roadmap for plant biosystems design that aims to identify knowledge gaps, technical challenges, and opportunities.
We review theoretical and technical approaches and propose innovative applications of biosystems design for basic and applied plant science research, along with strategies to enhance social responsibility of scientists and companies in terms of biosafety and ethics (e.g., public beneficence, intellectual freedom and responsibility, and fairness).

Theoretical Approaches and Principles of Plant Biosystems Design
Plants are complex, multicellular organisms. The predictive design of plant biosystems requires a comprehensive understanding of the principles underlying biological processes across all scales, from molecular interactions to cellular metabolism, cell/tissue/organ growth and development, and environmental responses of plants. Plant biosystems design involves several theoretical approaches: (1) graph theory providing a graphic view of the structure of plant systems, (2) mechanistic models linking genes to phenotypic traits, and (3) evolutionary dynamics theory enabling prediction of the genetic stability and evolvability of genetically modified plants or de novo plant systems. These theoretical approaches enable the design of complex plant systems based on the principles of modular design, dynamic programming, natural and artificial selection (i.e., selective breeding), genetic stability, and upgradability.

Theoretical Approaches
2.1.1. The Graph Theory Approach for Plant Biosystems Design. A graph can be used to describe complex biological systems where the components and interactions of the system are represented by thousands of nodes (e.g., genes and metabolites) connected with thousands of edges (e.g., interactions) [11]. Inherent to the graph theoretic approach for describing biological systems is the use of network graphs to represent, for example, the extensive communication between metabolic and gene regulatory networks [12]. Metabolites can regulate protein activity via allosteric regulation and posttranslational modifications [13], and gene expression in plants is subject to epigenetic regulation mediated by metabolic fluxes and cellular redox states [14,15]. From the perspective of biosystems design, a plant biosystem can be defined as a dynamic network of genes and multiple intermediate molecular phenotypes, such as proteins and metabolites, distributed in a four-dimensional space: three spatial dimensions of structure (e.g., cell and tissue) and one temporal dimension (e.g., cell cycle, circadian time, season, developmental stage, and life cycle) (Figure 2(a)). Along the spatial dimensions, plant tissue/organ growth and development are precisely orchestrated in a distributed fashion through collective interactions of many connected cells [16,17], and therefore, the subnetworks spatially distributed in individual cells are interconnected as the nodes of tissue/organ-scale networks. Furthermore, tissue/organ-scale subnetworks are interconnected as nodes of whole-plant-scale networks. Along the temporal dimension, genes are turned on and off at various time scales, and their expression profiles vary with changes in cell cycle, circadian clock, growing season, development stage, and life cycle. Also, the products of gene expression (RNAs and proteins) are degraded at various time scales, resulting in variation in their turnover times. 
A plant gene-metabolite network contains nodes and edges, where the nodes are genes/RNAs/proteins/metabolites, and the edges represent either promotional or inhibitory relationships in protein-protein, protein-RNA, protein-DNA, protein-metabolite, and RNA-RNA interactions. Information remains limited about the concentration of metabolites in different compartments and different cell types, as well as the transport between different compartments and cell types, which poses a challenge for modeling. New computational tools like MAGI [23] will be needed to facilitate the integration of metabolic and genetic networks.
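The node-and-edge description above can be sketched in a few lines of Python. This is only an illustration: the node names (`TF1`, `geneA`, `enzymeA`, `metaboliteX`), their compartments, and the helper functions are hypothetical and are not taken from any cited dataset or tool. Edges carry a sign (+1 for promotional, -1 for inhibitory), so the sign of a regulatory loop is the product of its edge signs.

```python
from collections import defaultdict

# Hypothetical toy network: every node has a type and a compartment.
NODES = {
    "TF1": {"kind": "protein", "compartment": "nucleus"},
    "geneA": {"kind": "gene", "compartment": "nucleus"},
    "enzymeA": {"kind": "protein", "compartment": "cytosol"},
    "metaboliteX": {"kind": "metabolite", "compartment": "cytosol"},
}

# Signed, directed edges: +1 = promotional, -1 = inhibitory.
EDGES = [
    ("TF1", "geneA", +1),            # protein-DNA interaction (activation)
    ("geneA", "enzymeA", +1),        # gene expression
    ("enzymeA", "metaboliteX", +1),  # enzyme produces the metabolite
    ("metaboliteX", "TF1", -1),      # protein-metabolite feedback inhibition
]


def build_adjacency(edges):
    """Map each source node to a list of (target, sign) pairs."""
    adjacency = defaultdict(list)
    for source, target, sign in edges:
        adjacency[source].append((target, sign))
    return adjacency


def path_sign(adjacency, path):
    """Multiply the edge signs along a path given as a list of nodes."""
    sign = 1
    for source, target in zip(path, path[1:]):
        signs = dict(adjacency[source])
        sign *= signs[target]  # KeyError if the edge does not exist
    return sign


ADJ = build_adjacency(EDGES)
LOOP = ["TF1", "geneA", "enzymeA", "metaboliteX", "TF1"]
LOOP_SIGN = path_sign(ADJ, LOOP)  # a negative product marks a negative feedback loop
```

Keeping the compartment as a node attribute is one simple way to extend the same structure across cells and tissues; a time-resolved network would additionally need per-time-point edge sets, which this sketch does not attempt.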

Figure 2: [...] [370]. (c) The structure of regulatory/signaling network motifs; arrows indicate positive regulation; T-shaped arrows indicate negative regulation; adapted from Gupta and Singh [20].

The Mechanistic Modeling Theory of Plant Biosystems Design.
Mechanistic modeling of cellular metabolism, based on the law of mass conservation, is used to interrogate and characterize complex plant biosystems with capabilities of linking genes, enzymes, pathways, cells, tissues, and whole-plant organisms. Starting from the plant genome sequence and omics datasets, a metabolic network can be constructed based on metabolites and reactions representing nodes and edges, respectively [24] (Figure 3(a)). By defining the plant cell as a control system, the mass conservation for each metabolite can be written to decipher the fluxes of chemical elements (e.g., carbon, electron, nitrogen, and phosphate) within the plant system. These fluxes can be used as a basis to quantitatively describe cellular phenotypic characteristics [25][26][27]. Mathematically, mass conservation can be expressed as a system of ordinary differential equations (ODEs) to delineate the rate of change for each metabolite in the network (Figure 3(b)). Because this system is typically highly underdetermined, the exact phenotypes of the cell can be evaluated by performing extensive flux measurements via stable isotope-labeling (e.g., 13C-labeled CO2) experiments so that the system becomes determined. Alternatively, constraint-based metabolic analyses can be employed using either flux balance analysis (FBA) or elementary mode analysis (EMA) [30,34]. FBA can predict a cellular phenotype based on an objective function (e.g., maximization of cell growth or a product synthesis), whereas EMA unbiasedly identifies all possible phenotypes for a given network. Through decades of development, tools for constructing and analyzing metabolic networks are quite mature and useful for plant biosystems design.
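For reference, the mass balance and the constraint-based formulation discussed above can be written compactly as follows. This is the standard textbook form rather than a transcription of Figure 3. The symbols $C$, $S$, $r$, and $\mu$ follow the definitions given with Figure 3, whereas the objective weights $c$ and the flux bounds $r^{\min}$ and $r^{\max}$ are notation assumed here for illustration.

```latex
\begin{align}
  \frac{dC}{dt} &= S\,r - \mu\,C
    && \text{(dynamic mass balance)} \\
  S\,r &= 0
    && \text{(steady state, dilution by growth neglected)} \\
  \max_{r}\; c^{T} r \quad \text{s.t.} \quad S\,r &= 0, \;\; r^{\min} \le r \le r^{\max}
    && \text{(flux balance analysis)}
\end{align}
```

Because a typical network has more reactions than metabolites, the steady-state condition alone leaves many flux vectors admissible; measured fluxes or an objective function are what single out one physiological state.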
However, several key challenges still remain: (1) the lack of knowledge of gene functions and their regulation required for accurate and comprehensive network curation and analysis [23]; (2) the lack of experimental data to decipher metabolites, reactions, and pathways that exist in compartments within a cell and among different cell types of a plant; and (3) the hidden underground metabolism due to enzyme promiscuity [22,35]. Advances in single-cell/single-cell-type omics (see Section 3.6.2) are critically required to address these challenges.

Figure 3: (a) Genome-scale metabolic network reconstruction. Gene regulation imposed on the network can be used as additional constraints. (b) With the reaction flux vector $r = [r_1\ r_2\ r_3\ r_4\ r_5\ r_{6r}\ r_7\ r_{8r}\ r_9]^T$, equation [1] is a system of ordinary differential equations describing dynamic chemical transformation of metabolites in a metabolic network, where $C$ is a metabolite concentration vector, $S$ is a stoichiometric matrix, $r$ is a reaction flux vector, and $\mu$ is the cell growth rate. (c) Calculation of steady-state flux distributions. Three common methods can be employed to determine metabolic flux distributions, including metabolic flux analysis (MFA), flux balance analysis (FBA), and elementary mode analysis (EMA). For a typical metabolic network, the system of homogeneous equations [2] is highly underdetermined, resulting in an infinite solution space. MFA determines a physiological state of a cell under a defined condition by calculating $r_u$ based on experimentally measured fluxes $r_m$ that make [2] determined. Here, $r = [r_u\ r_m]$ and $S = [S_u\ S_m]$. FBA also determines a physiological state of a cell by implementing a cellular objective function subject to (s.t.) mass balance and flux bounds. Different from MFA and FBA, EMA unbiasedly seeks to identify all finite admissible fluxes in the solution space by imposing the thermodynamic constraints of reaction direction and pathway nondecomposability. Adapted from [30,34,371].
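To make the MFA idea concrete, the sketch below solves a deliberately tiny, hypothetical network in plain Python. The four reactions, the measured values, and the helper names are illustrative assumptions, not data from the cited studies. The unmeasured fluxes are obtained by moving the measured fluxes to the right-hand side of the steady-state balance, that is, by solving S_u r_u = -S_m r_m.

```python
from fractions import Fraction

# Hypothetical toy network: r1: -> A, r2: A -> B, r3: B ->, r4: A ->
# Rows are metabolites (A, B); columns are reactions (r1..r4).
S = [
    [1, -1, 0, -1],  # A
    [0, 1, -1, 0],   # B
]


def solve_square(a, b):
    """Solve a x = b by Gauss-Jordan elimination using exact fractions.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


def mfa(stoichiometry, measured):
    """Return unmeasured fluxes, keyed by column index, from S_u r_u = -S_m r_m."""
    n_reactions = len(stoichiometry[0])
    unknown = [j for j in range(n_reactions) if j not in measured]
    s_u = [[row[j] for j in unknown] for row in stoichiometry]
    rhs = [-sum(row[j] * v for j, v in measured.items()) for row in stoichiometry]
    return dict(zip(unknown, solve_square(s_u, rhs)))


# Measured fluxes in arbitrary units: r1 (index 0) = 10 and r4 (index 3) = 3.
FLUXES = mfa(S, {0: 10, 3: 3})
```

With two balances and four reactions the system has two degrees of freedom, so exactly two independent measurements make it determined. Real plant networks need many more measurements, which is why isotope labeling or an FBA objective is used instead.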

The Evolutionary Dynamics Theory of Plant Biosystems Design.
Extant plants are the products of evolution driven by selection and random genetic drift acting on heritable phenotypic variations caused by mutations (e.g., point mutations, insertions, deletions, and gene birth/death), recombination, gene/genome duplication followed by diversification, and transgenerational epigenetic changes [36,37]. Within an evolutionary context, several important theoretical questions remain to be answered: To what extent are the existing plant biosystems optimized for adaptation vs. production? Which plant genes and metabolites are essential, and which are spandrels: nonadaptive byproducts of evolution [38,39]? In other words, can we simplify and perhaps improve the gene-metabolite network by removing some optional edges or nodes? Can we rewire/modify natural networks and/or introduce new components into existing networks for genetic improvement of certain traits without negative impact on other traits? The implementation of novel, orthogonal features poses a special challenge, as the interaction with the native network(s) cannot be predicted and requires strong evolutionary adaptation of the system.
Plant biosystems design generates either genetically modified or de novo plant genomes, which will likely face evolutionary pressures caused by spontaneous mutations and natural/artificial selection. In Arabidopsis thaliana, the estimated haploid single nucleotide mutation rate and insertion/deletion mutation rate are $6.95 \times 10^{-9}$ per site per generation and $1.30 \times 10^{-9}$ per site per generation, respectively [40]. Somatic mutations have been reported in various tissues of multiple plant species, with a high proportion of mutations in shoots in perennials being transmissible [41]. In a long-lived woody perennial species Populus trichocarpa, the somatic mutation rate is estimated to be approximately $1.99 \times 10^{-9}$ base substitutions per site per generation [42], slightly lower than that in A. thaliana ($7 \times 10^{-9}$) [43]. Also, the plant phenotype generated from a genome could potentially exert a feedback loop by either maintaining genome stability or guiding genome variations [44]. Therefore, a key question emerges: What new mechanisms can be implemented to maintain the stability of genetically modified genomes or de novo genomes?
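A back-of-the-envelope calculation shows what these rates imply for an engineered locus. The sketch below uses the two A. thaliana rates quoted above [40]; the construct length (10 kb) and the number of generations (100) are hypothetical values chosen only for illustration, and the calculation ignores selection, repair differences between loci, and somatic mutation.

```python
import math

SNV_RATE = 6.95e-9    # single nucleotide mutations per site per generation [40]
INDEL_RATE = 1.30e-9  # insertion/deletion mutations per site per generation [40]


def expected_mutations(rate, sites, generations=1):
    """Expected number of mutations in `sites` bases over `generations`."""
    return rate * sites * generations


def prob_at_least_one(rate, sites, generations=1):
    """Poisson approximation: P(at least one mutation) = 1 - exp(-expected)."""
    return 1.0 - math.exp(-expected_mutations(rate, sites, generations))


CONSTRUCT_BP = 10_000  # hypothetical 10 kb engineered locus
GENERATIONS = 100      # hypothetical number of generations

TOTAL_RATE = SNV_RATE + INDEL_RATE
EXPECTED = expected_mutations(TOTAL_RATE, CONSTRUCT_BP, GENERATIONS)
P_HIT = prob_at_least_one(TOTAL_RATE, CONSTRUCT_BP, GENERATIONS)
```

Under these assumptions a single lineage carries just under a 1% chance of any mutation in the construct, so spontaneous mutation matters mainly at the scale of large populations, long-lived perennials, or whole synthetic genomes.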

Modularity of Plant Biosystems Design.
Modularity is the fundamental principle of efficient and reproducible construction and maintenance of complex systems [45]. From the perspective of engineering, this has been the driving force for modern industrialization. A module can be defined as an essential and self-contained functional unit relative to the product of which it is part, with standardized interfaces and interactions that allow composition of products by combination [46]. A modular system can be classified into sectional and chassis-based architectures. The sectional architecture has all components assimilated to modules and shares a common interface, e.g., a piping system, in which pipes are connected with a common interface for fluid transport. The chassis-based architecture can be further subclassified into the bus and slot architectures. The bus architecture (e.g., a USB port) uses the same interface whereas the slot architecture uses different interfaces, e.g., an automobile, which comprises a chassis with many interfaces for various modules (e.g., wheels and headlights). Plant biological systems exhibit a similar principle of modularity, which has persisted for millions of years under natural selection. The principle of modular design in biological systems has been revealed at the molecular level using network theory in combination with advances in sequencing, omics, and imaging technologies over the past few decades [46][47][48][49][50][51]. Even though the principles of modular design in both biological and engineered systems are very similar, the former is much more complex, exhibiting all modular design architectures across scales (from genes to enzymes, pathways, cells, and whole organisms), and more importantly, having a unique capability to evolve (e.g., plasticity with a rewiring in response to perturbations).
It is critical for plant biosystems design to fundamentally understand the principles of modular design so as to harness them for innovative applications that address challenges related to human health, food, energy, and the environment. For instance, the collective effect of genes within a module or subnetwork should be considered to achieve desirable phenotypic traits. Characterization of input/output properties of the subsystems (or modules) in isolation, and understanding how these are connected to each other, would allow inferring the behavior of complex systems by composing the behaviors of their subsystems (Figure 4(a)). For optimizing existing gene modules in plants, several alternative strategies can be explored, including (1) modifying the protein sequence or changing the gene expression of rate-limiting steps of the signaling or metabolic pathways, (2) manipulating the gene expression of master regulators that control the expression of multiple genes in the target module, (3) engineering enzymes to regulate metabolites that mediate epigenetic control of multiple genes in the target module, and (4) optimizing the kinetics of metabolic reactions. These represent "homologous" approaches, whereas "heterologous" approaches, which use cell-free systems or a simplified host for reconstruction of modules to evolve and/or identify essential components in the absence of endogenous interference [52,53], would be useful for optimizing modular design. For installing new modules in plants, all the network components should be configured in gene circuits with the appropriate spatial and temporal expression patterns, with no or minimal negative side effects on the target plants.
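The idea of inferring system behavior by composing modules characterized in isolation can be sketched as simple function composition, following the arrangement in Figure 4(a) in which the output of module M2 is the input of module M1 (O2 = S1). The Hill-type response curves and all parameter values below are assumptions made for illustration; they are not measured properties of any plant module.

```python
def hill_activation(x, k, n, vmax=1.0):
    """Hill-type input/output curve, a common (assumed) model of a module."""
    return vmax * x**n / (k**n + x**n)


def f2(s2):
    """Module M2 as characterized in isolation (hypothetical parameters)."""
    return hill_activation(s2, k=1.0, n=2)


def f1(s1):
    """Module M1 as characterized in isolation (hypothetical parameters)."""
    return hill_activation(s1, k=0.5, n=1)


def compose(upstream, downstream):
    """Connect two modules so that the downstream input is the upstream output."""
    return lambda s: downstream(upstream(s))


SYSTEM = compose(f2, f1)  # O2 = S1
O1 = SYSTEM(1.0)          # overall output for input S2 = 1.0
```

The composition is only valid if connecting the modules does not change how each behaves on its own. In living cells this assumption can fail, for example when a downstream module sequesters the signal produced upstream, so composed predictions need experimental confirmation.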
Improved plant biosystems design requires a better understanding of the following questions: What makes up a modular (chassis) cell and exchangeable production modules and their interfaces? How can a modular cell be created as a chassis to effectively couple with exchangeable production modules to achieve desirable phenotypes? How can modular design be implemented to minimize potential tradeoff among robustness, compatibility, and evolution [51,54]? Critical to addressing these challenges is to reconstruct an accurate plant metabolic network (see Section 2.1.2). Furthermore, recent advances in metabolic network modeling and analyses in combination with Pareto optimization theory, computational algorithms, and high performance computing will help shed light on understanding and harnessing modularity of plants at various levels such as single-cell level (e.g., stem cells and stomata) and tissue level [51,54,55]. Application of modular design to plant biosystems needs to consider orthogonal interactions (i.e., if a nonnative/nonnatural metabolite is introduced into the system, how can we test and/or predict if other enzymes and/or transcription factors would react with this unnatural metabolite). This would in principle require intensive testing of cross-reactivity of metabolites with enzymes and proteins to assess side activities.

Figure 4: (a) M1 and M2 represent two different modules; F1 and F2 represent the processes converting inputs S1 and S2 to outputs O1 and O2, respectively; M1 and M2 are connected, with O2 = S1; redrawn from Grunberg and Del Vecchio [372]. (b) Dynamic programming as exemplified by the expression of regulators (suppressors or activators) programmed in the sequential developmental stages from a vegetative meristem to a floral meristem; redrawn from Kaufmann et al. [56]. (c) Tradeoff between natural selection and artificial selection. (d) Genetic stability, as exemplified by the pathogen-associated molecular pattern-(PAMP-) triggered immunity signaling network, with the inhibitory loops within the network to provide buffering interference (i.e., loss-of-function of some network components releases associated inhibitory loops allowing other components of the network to compensate for the loss); redrawn from Tyler [373]. (e) Upgradability, as exemplified by marker-free systems, in which the selectable marker gene can be excised from the plant genome after transformation, to allow for unlimited rounds of genetic transformation.

Dynamic Programming of Plant Biosystems Design.
For biosystems design, it is critical to consider the dynamic genetic programming for plant growth, development, and response to environmental perturbations. Plant biosystems design involves an ability to turn gene networks on or off in the designated tissue, time, and life cycle, while interacting with environmental input. For example, the expression patterns of regulators (suppressors or activators) are genetically programed in the sequential developmental stages from a vegetative meristem to a floral meristem [56], as illustrated in Figure 4(b). The change in gene expression during the transition from apical to floral meristem is governed by various regulators (e.g., ncRNAs, transcription factors, chromatin remodelers, and hormones) in response to environmental signals (e.g., temperature, photoperiod, and nutrient status) and endogenous cues (e.g., plant age) [57,58]. For dynamic genetic programming in plants, it is necessary to consider not only the abundance of transcripts and proteins but also epigenetic or posttranslational modifications. For example, in Arabidopsis, a novel regulatory mechanism, which depends on cofactor switching mediated by phosphorylation of the photorespiratory enzyme hydroxypyruvate reductase 1, is responsible for the regulation of photorespiratory fluxes in response to the changing environmental conditions [59]. The major challenges in dynamic programming of plant biosystems are: What are sensors and regulators to enable dynamic programming? How can these dynamic regulatory systems be created and controlled, e.g., how can the turnover of mRNAs and proteins be controlled?
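The role of mRNA and protein turnover in dynamic programming can be illustrated with a minimal two-variable model. This is a generic textbook-style sketch, not a model from the cited work: the rate constants are hypothetical, the units are arbitrary, and the inducer function stands in for any environmental or developmental signal that switches a promoter on or off.

```python
import math


def simulate(k_m, d_m, k_p, d_p, inducer, t_end, dt=0.01):
    """Forward-Euler integration of
        dm/dt = k_m * u(t) - d_m * m   (mRNA)
        dp/dt = k_p * m    - d_p * p   (protein)
    starting from m = p = 0, where u(t) is the inducer signal.
    """
    m = p = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        u = inducer(i * dt)
        # Both updates use the values from the previous time step.
        m, p = m + dt * (k_m * u - d_m * m), p + dt * (k_p * m - d_p * p)
    return m, p


def half_life(degradation_rate):
    """Half-life of a species that decays with first-order kinetics."""
    return math.log(2) / degradation_rate


# Hypothetical parameters: promoter switched on at t = 0 and kept on.
M_END, P_END = simulate(k_m=2.0, d_m=1.0, k_p=5.0, d_p=0.5,
                        inducer=lambda t: 1.0, t_end=50.0)
```

In this model the steady-state levels are set by the ratio of synthesis to degradation (m = k_m/d_m and p = k_p m/d_p), while the speed of switching is set by the degradation rates alone. That is one reason controlling turnover, and not only transcription, matters for programming fast responses.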

Tradeoff between Natural Selection and Artificial Selection.
Biosystems design for the industrial purpose of yield maximization or minimal resource utilization may not be aligned with natural evolution, in which some natural biochemical pathways are presumably optimized for environmental fitness [60]. As most crop plants are grown in open environments, they are still at least partially under natural selection pressure while artificial selection plays an important role in plant domestication. Plant biosystems design might encounter a compromise between the natural selection for fitness and the artificial selection for agricultural and/or industrial purposes (Figure 4(c)). For example, genetic improvement in yield or quality of crop plants needs to be balanced with stress tolerance enhancement. Alternatively, some biosystems design modifications of crops may be selected for under both natural and artificial selection; engineered photorespiratory bypasses that increase growth rate may be such an example [61,62]. Can such beneficial traits be coupled to ameliorate the tradeoff as part of the biosystems design to increase crop yield and quality under natural environmental inputs and fluctuations?

2.2.4. Genetic Stability of Plant Biosystems Design. As plant genomes are prone to spontaneous mutations (see Section 2.1.3), the capability to maintain the genetic stability of plant biosystems design over a long period of time (e.g., many generations of annual plants and life span of perennial plants) is critical. Also, epigenetic changes may have an impact on the stability of plant biosystems design. Robust traits in multiagent complex systems can be generated through a networked buffering mechanism, which features a concurrent, distributed response involving chains of agents with versatility (i.e., agents perform more than one single functional role) and degeneracy (i.e., there exists partial overlap in the functional capabilities of agents) [63].
For example, the pathogen-associated molecular pattern-(PAMP-) triggered immunity signaling network is highly buffered against interference, with the inhibitory loops within the network providing buffering (i.e., loss-of-function of some network components releases associated inhibitory loops to allow other components of the network to compensate for the loss) (Figure 4(d)). Plant biosystems design using long-term buffering strategies such as network buffering or using proteins with multiple functions may produce more robust traits (e.g., disease resistance) that would last for a long period of time (e.g., many generations, tens or hundreds of years).

Upgradability of Plant Biosystems Design.
In general, it is important to design a product that can adapt to future required performance and functions via upgrading the components of a biosystem [64]. Since plant biosystems design may require multiple iterations of Design-Build-Test-Learn (DBTL) cycles (for details see Section 3), it is essential that the genetically modified plants or de novo plant systems can be easily upgraded for improving performance or adding new functions. In general, upgrading the plant genome requires consecutive stable plant transformation processes, which are constrained by the limited number of selectable marker genes available for plant transformation, including widely used selectable marker genes conferring antibiotic (e.g., kanamycin and hygromycin) or herbicide (e.g., BASTA) resistance [65], along with some nonantibiotic and nonherbicide markers such as plant phosphomannose isomerase [66], broad-specificity amino acid racemase [67], and fluorescent proteins [68,69]. For enabling upgradability of plant biosystems design, it would be desirable to consider marker-free plant transformation systems, in which the selectable marker gene can be excised from the plant genome after transformation (Figure 4(e)). Selectable marker genes can be self-excised in plants using various approaches mediated by site-specific recombinase [70][71][72][73], zinc finger nuclease [74], and CRISPR [75,76].

Technical Approaches for Plant Biosystems Design
In general, a plant biosystems design approach goes through iterative DBTL cycles. This approach has been widely practiced in the biosystems design of microbial systems [77,78], but its application to plant biosystems design is still limited, mainly due to the much longer time needed to complete DBTL cycles in plants. It would be important to see how recent attempts of accelerating DBTL using cell-free protein synthesis (CFPS) systems [53,79] would impact cellular approaches in plants. On the other hand, at the organismal level, there is a need for establishing state-of-the-art capabilities for plant biosystems design, including modular cell design, validated biological parts, automated design and build of genetic constructs, generation and testing of genetically modified or de novo plant systems, learning from the test data, and integrating "design (D)," "build (B)," "test (T)," and "learn (L)" as well as executing mini-DBTLs (Figure 5). For effective execution of the DBTL cycles, a laboratory information management system (LIMS) could be used to facilitate local data acquisition and sharing. Also, a Plant Biodesign Hub (PBH) needs to be established as an open access online platform for biological parts registration, genetic circuit design, and predictive modeling based on test data. Although other repositories are already in use and have proven effective, they cannot meet the increasing needs of the growing biosystems design community in terms of data comparability as well as the integration of data curation, submission, biological knowledge, and circuit design.
3.1. Mini-DBTL and Integration. Each component of the Shewhart cycle (D, design; B, build; T, test; and L, learn) has its own steps for control and continuous improvement, forming a DBTL cycle within each of D, B, T, and L, named here mini-DBTL. For example, a mini-DBTL within the "D" step could involve designed DNA sequences that have failed synthesis as synthetic DNA fragments and thus have their nucleotide sequences studied, redesigned, resubmitted for synthesis, and reanalyzed to jointly inform the core adjustments (e.g., maximum local GC content, repeats, homopolymers, and hairpins) necessary for future attempts to succeed [80]. Similarly, researchers may need to (1) attempt different approaches before completing the assemblies (e.g., in the "B" step), (2) evaluate process variation impacting data acquisition (e.g., in the "T" step) [81], (3) develop tools that improve predictive power while making dissimilar suggestions (e.g., in the "L" step) [82], and (4) iterate automation [83]. Such improvements, along with the use of LIMS, robotics, and physical/electronic repositories (e.g., ICE, EDD) [84,85], will need to be accounted for to achieve accelerated and reproducible biosystems design.
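A mini-DBTL within the "D" step could, for instance, screen candidate sequences for two of the synthesis risk factors named above, maximum local GC content and homopolymers, before they are submitted. The sketch below is a minimal illustration; the window size and both thresholds are hypothetical and would in practice come from the synthesis provider's guidelines. Repeats and hairpins are not checked here.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty, upper-case sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def max_local_gc(seq, window):
    """Highest GC fraction over all windows of the given length."""
    seq = seq.upper()
    if len(seq) <= window:
        return gc_fraction(seq)
    return max(gc_fraction(seq[i:i + window])
               for i in range(len(seq) - window + 1))


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (sequence must be non-empty)."""
    seq = seq.upper()
    best = run = 1
    for previous, current in zip(seq, seq[1:]):
        run = run + 1 if current == previous else 1
        best = max(best, run)
    return best


def synthesis_flags(seq, window=20, gc_limit=0.8, homopolymer_limit=8):
    """Return the names of the (hypothetical) design rules that a sequence violates."""
    flags = []
    if max_local_gc(seq, window) > gc_limit:
        flags.append("local_gc")
    if longest_homopolymer(seq) > homopolymer_limit:
        flags.append("homopolymer")
    return flags


# Hypothetical fragment containing a 10-base G homopolymer.
FRAGMENT = "ATAT" * 5 + "G" * 10 + "ATAT" * 5
FLAGS = synthesis_flags(FRAGMENT)
```

Logging which rule each failed fragment violated, together with the eventual synthesis outcome, is what turns such a check into a learning loop: the thresholds can then be tightened or relaxed based on accumulated results.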

Modular Cell Design.
Network modeling has mainly been employed to elucidate complex phenotypes of existing plant biosystems. In principle, the approach can be applied to design plant systems de novo. A recent computational advancement in network modeling using Pareto optimization theory has been described in the ModCell algorithm, which has enabled rational design of a modular (chassis) cell that can be coupled with many exchangeable production modules to achieve various desirable production phenotypes in bacteria [45,51,54,55,86,87]. This modular cell engineering approach is aimed at generating production strains rapidly with efficient performance while minimizing the number and cycle time of DBTL cycles. While the ModCell tool may prove very useful to guide the parts, modules, and chassis selection for prokaryotic cells, the compartmentalized endomembrane architecture of eukaryotic cells and the interconnected multicellular nature of complex plant systems featuring specialized cell types remain a considerable challenge for plant biosystems design.
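The notion of Pareto optimality that underlies this kind of multiobjective design can be shown with a small dominance filter. The sketch is not the ModCell algorithm and does not use its formulation; the chassis names and their scores on two objectives (product yield and growth rate, both to be maximized) are hypothetical.

```python
def dominates(a, b):
    """True if design a is at least as good as b in every objective
    and strictly better in at least one (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(designs):
    """Keep only the designs that no other design dominates."""
    return {
        name: objectives
        for name, objectives in designs.items()
        if not any(dominates(other, objectives)
                   for other_name, other in designs.items()
                   if other_name != name)
    }


# Hypothetical candidate chassis scored as (product yield, growth rate).
DESIGNS = {
    "chassis_A": (0.9, 0.2),
    "chassis_B": (0.6, 0.6),
    "chassis_C": (0.5, 0.5),
    "chassis_D": (0.2, 0.9),
}
FRONT = pareto_front(DESIGNS)
```

Here `chassis_C` is discarded because `chassis_B` is better on both objectives, whereas the remaining three represent different compromises between yield and growth. Choosing among them is a design decision that the optimization itself does not make.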

Curation of Validated Biological Parts.
Libraries of validated parts (e.g., protein coding sequences, regulatory elements for gene expression, signaling, and other functional genetic elements) are critical for the engineering of multicomponent biological systems quickly and reliably [88,89]. Several repositories have been established, such as the iGEM Registry of Standard Biological Parts (http://parts.igem.org) [90,91], SynBioHub (https://synbiohub.org) [92], and the Addgene repository (https://www.addgene.org) [93,94]. While these repositories are overwhelmingly dominated by biological parts of prokaryotic origin, they host some DNA parts useful for plant biosystems design, such as the MoClo Toolkit [95,96] deposited in Addgene. Recently, a library of chloroplast-specific parts was established for plant biosystems design using the plant chloroplast as a chassis [97].
Given the increased complexity of plant genes over those of prokaryotes due to the presence of introns, distal regulatory elements, and posttranscriptional processing signals, a common "Phytobrick" syntax has been developed to enable universal Type IIS assembly with standardized parts (see Section 3.5.1) [89]. Despite these advancements, the conversion of natural DNA sequences into Phytobricks can be laborious, and the removal of "illegal" restriction sites in these sequences to enable Type IIS assembly can introduce unintended alterations to the part's function. In the recent production of a standardized parts library of 221 Eucalyptus transcription factors and 65 promoters [98], the risk of altering promoter function in their conversion to Phytobricks was minimized by using known single nucleotide polymorphism data in Eucalyptus populations to mutate undesirable restriction sites.
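Before a natural sequence is converted into a standardized part, it is useful to locate the internal Type IIS recognition sites that would have to be removed. The sketch below scans both strands for the recognition sequences of BsaI (GGTCTC) and BpiI (GAAGAC). Which enzymes must be screened depends on the assembly standard in use, so this enzyme list is an assumption, and the example promoter sequence is invented.

```python
# Recognition sequences of two Type IIS enzymes; the choice of enzymes to
# screen is an assumption and should follow the assembly standard in use.
TYPE_IIS_SITES = {"BsaI": "GGTCTC", "BpiI": "GAAGAC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, sites=TYPE_IIS_SITES):
    """Return sorted (position, enzyme, strand) tuples for every recognition
    site found on either strand; positions are 0-based on the given strand.
    Assumes non-palindromic sites, which holds for the enzymes listed above.
    """
    seq = seq.upper()
    hits = []
    for enzyme, site in sites.items():
        for strand, pattern in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, enzyme, strand))
                start = seq.find(pattern, start + 1)
    return sorted(hits)


# Invented example sequence with three internal sites.
PROMOTER = "ATGGTCTCAAAGTCTTCTTGAGACCAA"
SITES = find_sites(PROMOTER)
```

A scan like this only reports where the sites are. Deciding which base to change, for example by choosing a substitution that already occurs as a natural polymorphism as in the Eucalyptus library described above, remains a separate step.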
The biological parts for plant biosystems design have been obtained mainly from natural sources (i.e., plant genomes). Recently, a library of synthetic transcriptional regulator systems, which include synthetic activators, synthetic repressors, and synthetic promoters, was established to control plant gene expression in a tissue-specific and environmentally responsive manner [99]. Genome recoding (i.e., rewriting codon meaning through chemical synthesis for new features) has been practiced in microbial systems [100] and could be used to generate novel biological parts for plant biosystems design.
To facilitate international access and reproducibility of data, a centralized knowledgebase of validated biological parts needs to be established, which includes specific experimental context, standardization, and cross-references among different databases (e.g., iGEM, SynBioHub, and Addgene) [85]. The nomenclature of such a knowledgebase would build upon structures already defined by the parts repositories listed above and expand the concept from genomic constructs such as promoters and coding sequences into plant-specific structures. Such an initiative could be established through collaboration with KBase [101], with the capability of submission, query, and functional mapping onto the gene-metabolite network, as illustrated in Figure 6. There are several challenges to realizing the standardization of biological parts across the international plant biodesign research community. For example, (1) how can data be standardized and made comparable between different laboratories? And (2) how can part characterization be rewarded? These challenges may be addressed through international workshops sponsored by a professional society, with participants from academia and industry in the future. For registration of biological parts, it would be important to include negative results, which typically do not get published or reported, to avoid wasteful repetitions in different labs. The negative results would also be very helpful for computational design of synthetic biological parts using machine learning approaches.
3.4. Genetic Construct Design. Genetic constructs are designed for nucleic acid sequence modification, gene expression regulation, and metabolic engineering. The genetic design for genomic sequence modification in vivo can be achieved through different methods, including CRISPR/Cas-mediated genome editing, with an emphasis on maximizing on-target efficiency and minimizing off-target effects [102]. Recently, rapid progress has been made in the development of new genome editing technologies, such as high-precision prime editing [103], cytosine base editors [104], and adenine base editors [105]. These new technologies have been tested and adapted for genome editing in plants [106][107][108][109][110]. The design for modulating gene expression can be achieved through CRISPR interference (CRISPRi) and activation (CRISPRa) systems, in which nuclease-deactivated Cas9 (dCas9) is tethered to inhibitory and activating domains, respectively, to regulate gene expression [111,112]. CRISPRi may also be achieved with dCas9 alone acting on promoter or exonic sequences. In addition, RNA editing allows altering splicing or introducing nonheritable changes to protein sequences [113]. It is also possible to multiplex activation, repression, sensing, and emulation of gene expression using homologous CRISPR-sgRNA pairs or different CRISPR-associated RNA scaffolds, which can potentially be used to build complex synthetic programs [102,114]. For designing predictable gene circuits, genetic design automation (GDA) tools compatible with Synthetic Biology Open Language (SBOL), such as Cello [115] and SBOLDesigner 2 [116], could be adapted to plants and used in an integrated fashion with the knowledgebase of biological parts to streamline the design process, as illustrated in Figure 7.
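The first step in designing a CRISPR/Cas construct, enumerating candidate target sites, can be sketched in a few lines. The example below assumes the commonly used Streptococcus pyogenes Cas9, which recognizes a 20 nt protospacer followed by an NGG PAM; the function name `find_spcas9_sites` is hypothetical, and the sketch does not score on-target efficiency or off-target risk, which dedicated design tools handle.

```python
import re


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    For the "-" strand, `start` is given in reverse-complement coordinates.
    Assumes SpCas9 (protospacer immediately 5' of an NGG PAM).
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # A lookahead is used so that overlapping candidate sites are all reported.
        pattern = rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))"
        for m in re.finditer(pattern, s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


if __name__ == "__main__":
    demo = "A" * 20 + "TGG"   # toy sequence with exactly one site
    for site in find_spcas9_sites(demo):
        print(site)
```

In practice, the candidates returned here would then be filtered against the whole genome to minimize off-target matches, which is the part of the design emphasized in the text.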

Building Genetic Constructs and Synthetic Plant Genomes
3.5.1. Assembly of DNA Parts into Genetic Constructs. Genetic constructs are built using various DNA assembly methods, which can be based solely on PCR reactions (e.g., T-type), sequence-dependent recombinases (e.g., Gateway), enzymes causing specific DNA double-strand breaks (e.g., types II and IIS), or an enzyme mix that coordinates nucleotide polymerization (e.g., Gibson assembly) [117][118][119][120][121][122] (Figure 8(a)). Each of these methods features a unique set of characteristics, and their combinatorial (i.e., shuffling) capacity, hierarchical (i.e., stacking) support, and strengths in dealing with secondary structure and repetitive sequences will guide their application to different tasks for plant biosystems design. A major challenge in the field has been the editing of specific parts within a large preassembled DNA fragment, which can be addressed by the CCTL method (Cpf1-assisted cutting and Taq DNA ligase-assisted ligation) [123][124][125] (Figure 8(b)).
Type IIS restriction endonuclease-based DNA assembly systems, such as GoldenBraid [120], TNT-cloning [122], and universal Loop assembly (uLoop) [126], which are based on Golden Gate [118], are widely used for hierarchical assembly of DNA fragments into genetic constructs. These approaches have two advantages: (1) the DNA parts can be individually cloned into the entry vectors to establish biological parts libraries, which can be shared in the scientific community, and (2) multiple rounds of binary to hexanary assemblies can be performed to join various numbers of DNA fragments in flexible configurations. However, all type IIS-based approaches suffer from prohibitive internal restriction sites. These sites can be partially masked [122] but not eliminated, and innovative approaches mutating deoxyadenine within such prohibitive sites to unnatural nucleotides (e.g., deoxy-NaM) during PCR could advance these methods due to E. coli's natural ability to restore the original adenine-rich restriction site in vivo [127].
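The internal-site problem described above is easy to screen for before a part is entered into a library. The following sketch scans a part sequence on both strands for the recognition sites of two type IIS enzymes commonly used in Golden Gate-style assembly (BsaI, GGTCTC, and BsmBI, CGTCTC). The choice of these two enzymes and the function name `internal_sites` are assumptions for illustration; a given assembly standard may use other enzymes.

```python
# Recognition sequences of two widely used type IIS enzymes.
# Which enzymes matter depends on the assembly standard in use (assumption).
TYPE_IIS_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def internal_sites(part: str, enzymes=TYPE_IIS_SITES):
    """Return (enzyme, strand, position) for each recognition site in `part`.

    Sites on the reverse strand are found by searching the forward strand
    for the reverse complement of the recognition sequence.
    """
    part = part.upper()
    hits = []
    for name, site in enzymes.items():
        for strand, motif in (("+", site), ("-", revcomp(site))):
            start = part.find(motif)
            while start != -1:
                hits.append((name, strand, start))
                start = part.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    print(internal_sites("AAAGGTCTCAAA"))  # one forward-strand BsaI site
    print(internal_sites("ACGTACGT"))      # no sites
```

A part that returns an empty list can be used directly; any hit marks a site that would need to be masked or mutated, as discussed above.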
Figure 6: A biological parts registration and curation module for plant biosystems design. The registration form includes accession number, name, type, description, function, sequence (i.e., DNA and protein sequence), and references (e.g., publications associated with the biological parts). For illustration purposes, the functional items are listed for protein-encoding genes only. CDS: protein-encoding sequence; ncRNA: noncoding RNA sequence, including natural noncoding RNAs and guide RNAs for genome editing.
Multiple DNA fragments with unique short (e.g., 15-20 bp) overlaps between neighboring parts can be assembled using isothermal assembly such as Gibson Assembly, in which a 5′ exonuclease removes nucleotides from the 5′ ends of double-stranded DNA molecules, complementary single-stranded DNA overhangs are annealed, a DNA polymerase fills the gaps, and a DNA ligase seals the nicks [119]. Recently, another similar approach, called "SureVector," was developed to assemble multiple DNA fragments with 30 bp overlapping ends, in which DNA parts are denatured and adjacent parts are annealed through the overlaps, followed by DNA polymerase-mediated partial extension of exposed 3′-OH ends, resulting in flaps that are digested by an endonuclease and covalently joined by a ligase [128]. These approaches have two key advantages: (1) they allow multiple blunt-end DNA fragments to be assembled in a single-tube reaction and (2) they do not rely on restriction enzyme digestion of DNA fragments and consequently do not require removing or mutating type IIS restriction sites. On the other hand, these methods have a reduced capacity to assemble multiple parts at once (compared to Golden Gate), and their efficiency is strongly impacted by repetitive sequences as well as sequences prone to secondary structure when single stranded. Under such circumstances, hybrid methods, which combine the advantages of two or more approaches at different levels while offsetting their drawbacks, will be highly beneficial [129].
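Because overlap-based assembly depends on each fragment sharing a short terminal overlap with its neighbor, a simple pre-assembly check is to measure those overlaps in the intended order. The sketch below is a minimal illustration under stated assumptions: fragments are given 5′ to 3′ on the same strand, a 15 bp minimum is taken from the lower end of the range quoted above, and the function names are hypothetical. It does not check for repeats or secondary structure.

```python
def shared_overlap(left: str, right: str, max_len: int = 40) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right), max_len), 0, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def check_assembly(fragments, min_overlap: int = 15):
    """Report (i, j, overlap_length, ok) for each pair of adjacent fragments."""
    report = []
    for i in range(len(fragments) - 1):
        n = shared_overlap(fragments[i].upper(), fragments[i + 1].upper())
        report.append((i, i + 1, n, n >= min_overlap))
    return report


if __name__ == "__main__":
    overlap = "GATTACAGATTACAGGC"          # 17 nt toy overlap
    frags = ["TTTTT" + overlap, overlap + "CCCCC", "AAAA"]
    for row in check_assembly(frags):
        print(row)
```

In this toy input, the first junction passes and the second is flagged, which is the kind of design error worth catching before ordering primers.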

Plant Synthetic Genomes.
Recently, a 785 kb Caulobacter ethensis-2.0 (C. eth-2.0) genome was constructed in yeast using multiple rounds of a homologous gap repair approach [130]. A 4 Mb synthetic Escherichia coli genome was constructed through a high-fidelity convergent total synthesis [131]. However, the construction of plant synthetic chromosomes through DNA synthesis has not been reported yet. A synthetic plant chromosome vector requires a minimum of centromeric and telomeric sequences, origins of replication, and a selectable marker gene [132]. A notable step towards the generation of synthetic plant genomes was the full cloning and yeast-mediated modification of the 204 kb plastid genome of the alga Chlamydomonas reinhardtii [133]. Plastids are among the defining features of plants, and their relatively small and well-conserved genomes, along with the potential for high-level expression of desired genes, are currently more tractable candidates for total synthesis than plant nuclear genomes [134].

Testing Genetic Constructs in Plants.
Testing of plant biosystems designs is mainly achieved through stable transformation and transient expression (e.g., agroinfiltration and protoplast transformation) of genetic constructs, sometimes followed by omics (epigenomics, transcriptomics, metabolomics, proteomics, and phenomics) analysis of genetically modified plants or de novo plant systems. Stable transformation is a bottleneck of the DBTL cycle because it remains difficult to generate many transgenic plants transformed with multigene constructs. Protoplasts can improve throughput, but at the expense of understanding the construct design's effect on whole-plant performance and fitness; likewise, protoplast-based systems will not be suitable for all traits being engineered.
[Footnotes to the comparison of DNA assembly methods: … [123]; overhang is underlined with the last 4 nt being programmable [124]. ¥ Produce scarless assemblies (NEBuilder allows ssDNA oligos in substitution of homologous overlap). ¤ May leave scars. * Undesired type IIS restriction sites can be partially masked by oligos [122]. For BRAID systems, see [120]. § Hybrid system using Golden Gate followed by in vivo homology-based assembly [129]. CX: complex design and execution; PL20: parts size/length beyond 20 kb; RS: repetitive sequences; SS: secondary structures; IRS: internal restriction sites. Recombination-based approaches (e.g., Gateway) were omitted due to limited use for biosystems design.]
3.6.1. Stable Transformation and Transient Expression of Genetic Constructs. Agrobacterium-mediated transformation has been the major approach for plant genetic engineering. However, there is substantial variation in the amenability to Agrobacterium-mediated transformation among plant species and even among cultivars of the same species, with high-efficiency transformation protocols available for only a limited number of plant species/cultivars. One limitation of using Agrobacterium to transform plants is that not all plant species are Agrobacterium-infectable. Another major bottleneck in Agrobacterium-mediated transformation is in vitro regeneration of shoots or embryos from transformed cells. Ectopic expression of morphogenic or developmental regulator genes (e.g., Baby boom and Wuschel2) can promote somatic cells to form embryos, which develop into whole plants, in monocot species and consequently improve Agrobacterium-mediated transformation efficiencies dramatically [135]. Plant transformation often requires tissue culture involving exposure of cells to various hormones, which is inefficient and time-consuming. Recently, a de novo meristem induction approach based on the use of developmental regulators was developed in Nicotiana benthamiana, tomato, potato, and grape, avoiding the use of traditional tissue culture [136]. The generation of transgenic roots (hairy roots) through Agrobacterium rhizogenes-mediated transformation, leading to the production of composite plants (i.e., transgenic roots on wild-type shoots), has proven to be a fast and versatile system particularly suited for certain woody plants recalcitrant to transformation such as Eucalyptus [137,138]. There is an urgent need to extend these methods to other plant species or to develop new capabilities for enabling or improving transformation in a wider range of dicot and monocot species, particularly those that are currently recalcitrant to genetic transformation. In planta transformation methods can be particularly useful because no in vitro regeneration of shoots or embryos is required.
It is preferred that morphogenic regulator genes are not used or that they can be excised from the genome by an inducible recombinase excision system, because expression of these transgenes can affect plant growth and development [139][140][141].
Besides Agrobacterium-mediated transformation, DNA, RNA, or protein molecules can be directly delivered to target sites via particle bombardment, nanoparticles, or direct injection. For example, Cas9-gRNA ribonucleoproteins (RNPs) were directly injected into plant zygotes for DNA- and selectable-marker-free genome editing in rice [142]. Furthermore, carbon nanotubes were recently used for efficient plasmid DNA delivery into multiple plant species (e.g., arugula, wheat, and cotton) to enable high protein expression levels without transgene integration [143], which has great potential for application to transgene-free genome modification [144] and may also be useful for functional testing of plant biosystems design.
Multigene transformation is important for plant biosystems design. It can be achieved using binary vectors based on transformation-competent artificial chromosomes, such as pHUGE-Red which is suited for cloning large DNA fragments [145]. Alternatively, multiple binary vectors with compatible replication origins can be hosted in a single Agrobacterium cell for simultaneous delivery of multiple gene constructs into plants [146]. Furthermore, gene stacking based on site-specific recombination and nuclease activity has a potential for in planta stacking of a large number of genes at a single genomic site of the same plant [147]. Also, multiple genes could be stacked using plastid transformation with operon-like and polyprotein expression systems [148][149][150]. CRISPR/Cas9-mediated targeted T-DNA integration and precise knock-in [102,151,152] can potentially be used for in planta stacking.
As an alternative to stable transformation approaches, which are often time-consuming, transient expression techniques via virus-induced gene silencing (VIGS) enable rapid knockdown of a targeted gene in a high-throughput manner, even in plant species that are difficult to transform [153]. However, RNA-directed transcriptional gene silencing cannot induce heritable changes that target the coding region, although targeting of the promoter sequence can cause heritable changes in gene expression mediated by methylation [154]. In combination with CRISPR/Cas systems, viruses can be used to quickly induce heritable changes in plant genomes. Recently, viruses have been used to deliver the guide RNAs or the entire CRISPR-Cas9 cassette for Cas9-mediated gene editing in plants, providing another high-throughput method for testing the function of gene constructs for plant biosystems design [155][156][157]. Also, multiplexed heritable gene editing was recently achieved through virus-mediated in planta delivery of mobile single guide RNAs (sgRNAs) into Nicotiana benthamiana transgenic plants expressing Cas9 [158], which provides another approach to produce heritable gene editing without tissue culture. Alternatively, a recently developed nanotube-based platform for RNA delivery, which enables stable siRNA delivery and efficient silencing of target genes in intact plant cells [159], might be useful for a broad range of applications including direct delivery of sgRNAs and Cas9 mRNA for DNA-free genome editing.
3.6.2. Omics Analysis of Genetically Modified Plants. Integrative multiomics (e.g., transcriptomics, metabolomics, proteomics, epigenomics, and phenomics) analysis of genetically modified plants could provide rich experimental data for linking genetic design to plant phenotype. So far, plant omics data have been collected at the organ or tissue level, resulting in molecular, metabolic, and biochemical information averaged over a population of heterogeneous cells [160,161]. Because plant cellular processes vary spatially, single-cell multiomics is necessary for simultaneous analysis of different biomolecules to achieve accurate assessment of nodes and edges in the gene/metabolite networks operating in an individual cell [162]. Single-cell technologies have evolved intensively in the last decade [163], and transcriptomics and proteomics are viable at the single-cell level [164,165]. However, most plant species still face major technical hurdles that make it challenging to achieve single-cell resolution, in large part because it is difficult to dissociate cells from the plant tissues [166] and/or collect adequate amounts of the desired biomass when no amplification strategies are available [167,168]. Progress has been made to solve these challenges. For example, the protoplasts of Arabidopsis root cells have been successfully used for single-cell RNA sequencing using a droplet-based microfluidics platform [164,169-171], and laser capture microdissection has been used to isolate individual cell layers of tomato roots for single-cell-type proteomics [172]. Also, live single-cell mass spectrometry has been used for direct analysis of metabolites in a single live plant cell [173].

Learning from the Testing of Designed Plant Systems.
Integration of multiomics data can provide a multiperspective view of dynamic molecular behavior and interacting networks of genes occurring in plants [174], as demonstrated in an integrative analysis of transcriptomic, proteomic, fluxomic, and phenomic data in relation to lignin biosynthesis in Populus [175]. The omics datasets can be integrated using statistical or advanced machine learning approaches [176] to generate results related to simple pathways or complex networks at cell, tissue, organ, or organism levels, providing insights into the effect of biosystems design on plant phenotype. For example, a machine learning method called Multiview Factorization AutoEncoder, which uses a deep representation learning approach to simultaneously learn feature and patient embeddings, was recently developed for seamless integration of multiomics data and biological domain knowledge such as molecular interaction networks in humans [177]. This method can be extended to omics data analysis of genetically modified plants or de novo plant systems for improving the design of plant systems in an iterative application to the DBTL cycle, as illustrated in Figure 5. Complementary tools that use probabilistic modeling techniques to converge to the desired specification (e.g., increased expression of a gene cluster) accurately, without requiring full mechanistic understanding of the biological system [82], will need to be recruited to plant biosystems design. An exemplary module for integrative analysis of multiomics data in the "Plant Biodesign Hub" is presented in Figure 9(a). The results from the integrative analysis of multiomics data could be used for metabolic modeling based on the modeling module to be built in the "Plant Biodesign Hub" (Figure 9(b)). Besides learning from engineered plants, it would be important to learn from model species or from cross-species models incorporating prior knowledge through explainable artificial intelligence [178].

Applications of Plant Biosystems Design
Biosystems design has great potential for applications in (1) basic plant biology research to gain a deeper understanding of molecular functions and biological processes in plant biosystems (Figure 10) and (2) various aspects of applied plant science research to accelerate the improvement of plant traits or to create new germplasms with improved traits to benefit ecosystem health and human society (Figure 11).

Application of Biosystems Design to Basic Plant Science Research. Biosystems design can be used to further our understanding of molecular mechanisms driving biological processes in plant systems by dissecting the function of individual genes or multigene modules. […] [179], with an increasing number of new plant genome sequences being released every year. However, even in Arabidopsis thaliana, which is one of the best studied model plant species, approximately 60% of predicted enzyme- and transporter-encoding genes do not have credible functional annotations, and this number is even higher in nonmodel plant species [180]. Until recently, only ~5% of genes in the Arabidopsis genome had experimental evidence for their functions (e.g., biochemical activity, subcellular location, and biological role) [181]. Traditionally, experimental characterization of plant gene function depends mainly on (1) knockout mediated by T-DNA insertion or chemical/radiation-induced mutagenesis, (2) knockdown mediated by RNA interference (RNAi) or VIGS, and (3) overexpression of one or a few genes in individual genetically modified plant lines. These traditional approaches suffer from several limitations: (1) the knockout mutations created by T-DNA insertion or chemical/radiation induction occur as random insertions or deletions and are often accompanied by additional unrelated mutations in the genome, (2) it is challenging to obtain homozygous multigene knockout mutants in diploids or single-gene mutants in polyploid species, as this requires multiple generations of self-pollination and is almost impossible in vegetatively propagated plants and perennials with long life cycles, and (3) RNAi works on protein-coding genes only and suffers from incomplete loss of function and extensive off-target activity [182]. These limitations can be overcome by genome-engineering tools such as CRISPR/Cas systems, which can be used to generate targeted homozygous knockout mutations without the need for self-fertilization [102,182].
CRISPR/Cas-mediated gene knockout has one disadvantage for functional genomics research: it is not suitable for studying the function of essential genes due to the lethality of their knockout mutants, although it is possible to identify essential genes using CRISPR/Cas targeted mutagenesis in some cases where homozygous knockout mutant seeds can be obtained from heterozygous mutant parents [183]. Still, CRISPRi and CRISPRa offer opportunities to repress and activate gene expression, respectively, for both coding and noncoding RNAs [112]. However, as with traditional genetic transformation systems, CRISPR/Cas systems have not been established in many plant species. In some cases, low editing efficiency and off-target issues still cannot be fully addressed. Beyond examining the functional roles of a particular gene, more elaborate biosystems design strategies offer a powerful approach for studying the collective function of multigene modules in metabolic pathways, signal transduction cascades, and regulatory networks [184]. It is very challenging to map the protein-DNA interactions in gene regulatory networks using experimental approaches. Biosystems design could enable scalable epitope tagging for high-throughput chromatin immunoprecipitation followed by sequencing (ChIP-Seq) [185], combined with CRISPR/Cas-mediated knockout experiments, to accurately identify the target genes of plant transcription factors.
In general, plant functional genomics research is still substantially hindered by labor-intensive and time-consuming work and therefore could greatly benefit from biosystems design approaches that provide high-throughput capabilities for determining gene function. For example, multiplex genome editing can generate more than 100 targeting events [186], enabling gain-of-function or loss-of-function screening of a large number of genes. Also, the automated design and high-throughput assembly of gene constructs described in Section 3 would greatly speed up the elucidation of plant gene functions.
[…] minimized genomes can be obtained by deletion of cryptic genes and mobile DNAs in microbes [191]. It is anticipated that creating minimized genomes for plants would be much more challenging than for microbes because plant genomes are much larger and more complex. From a biosystems design perspective, a minimized plant genome could potentially be obtained through reduction via genome-wide gene knockout using CRISPR/Cas systems. The minimized plant genomes would provide a unique opportunity for dissecting the minimal gene network of a functional plant system and validating modular cell design. Furthermore, it would allow for adding genes or gene modules to study their function.

Application of Biosystems Design to Applied Plant Science Research. With guidance from the principles of biosystems design, cutting-edge genome editing and genome-writing/rewriting technologies can be used for modifying or redesigning crop plants for various applications, including genetic improvement of photosynthetic efficiency, plant stress tolerance, crop quality, climate change mitigation, production of biomaterials, bioenergy and medicines, phytoremediation, biosentinel, tissue-engineering, and space exploration.

Plant Biosystems Design for Increasing Photosynthetic Efficiency. The average yields of staple crops currently increase at a rate of about 1% each year, but this rate will need to double to feed the estimated world population of 9 billion people in 2050 [7]. Improvements in the yield potential of crops could be accelerated through genetic engineering approaches to enhance photosynthetic efficiency. A number of strategies have been employed to achieve this goal (for a detailed discussion see Long et al. [192]), with several recent successes that highlight our capacity to modify both the light-dependent and light-independent reactions of photosynthesis, both of which are important in determining crop yield potential. Here, we discuss some of these strategies and successes that could be further complemented by a plant biosystems design approach.
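To give a sense of scale for the difference between the current rate of yield increase (about 1% per year) and twice that rate, the short calculation below compounds both rates over a fixed horizon. The 30-year horizon and the assumption of constant compound growth are illustrative choices for this sketch, not figures taken from the cited work.

```python
def relative_yield(rate: float, years: int) -> float:
    """Yield relative to the starting year under constant compound growth."""
    return (1 + rate) ** years


if __name__ == "__main__":
    years = 30  # illustrative horizon (assumption)
    for rate in (0.01, 0.02):
        factor = relative_yield(rate, years)
        print(f"{rate:.0%} per year over {years} years -> x{factor:.2f}")
```

Under these assumptions, the gap between the two trajectories widens every year, which is why modest differences in the annual rate of genetic gain matter over multi-decade horizons.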
The photosynthetic reaction center complexes that capture light energy in plants (i.e., photosystem I (PSI) and photosystem II (PSII)) utilize only half of the incident solar energy (i.e., 400 to 700 nm) and work in series, connected by an electron transport chain. Redesign of the photosystems to expand the region of photosynthetic absorption from the visible region of the spectrum to include far-red and infrared regions could improve the efficiency of light capture [193]. For example, introducing novel light-harvesting pigments (e.g., chlorophylls d and f from the cyanobacteria Acaryochloris marina and Halomicronema hongdechloris, respectively) could allow for light capture up to 750 nm [194]. In a more ambitious design, it would conceptually be feasible to redesign several components of the photosynthetic electron transport chain to harvest light <1000 nm [195]. Thus far, experimental successes to improve light-use efficiency and plant yields have come from overexpressing components of the cytochrome b6f complex, which facilitates the transfer of electrons from PSII to PSI [196,197]. Furthermore, accelerating the repair of photodamage to PSII, through nuclear overexpression of the core PSII subunit protein D1, can improve photosynthesis and plant productivity and enhance survival under heat stress [198]. Lastly, engineering the photoprotective mechanisms of the photosystems can also facilitate growth enhancements. Photosystems dissipate excess absorbed light energy as heat in full sunlight but do not adapt to fluctuating light conditions rapidly, resulting in suboptimal photosynthetic efficiency and consequently losses of up to 20% of potential yield in field crops. This issue has been addressed by accelerating the induction and recovery from photoprotection in tobacco via bioengineering of an accelerated response to natural shading events, which increased dry biomass yield by ~15% in fluctuating light [199].
In contrast, the same approach in Arabidopsis has led to growth impairments, suggesting that the success of this strategy requires careful balance so as not to interfere with other regulatory processes [200].
For the light-independent reactions of photosynthesis (i.e., CO2 capture and conversion to sugars and starch), ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) is the key enzyme for CO2 fixation in all plants. Considering its importance, Rubisco is surprisingly inefficient and considered a bottleneck to photosynthetic productivity. Rubisco is a relatively slow enzyme, so to compensate, most plants invest ca. 30% of the soluble protein in leaves in the Rubisco pool. Furthermore, the dual specificity of Rubisco for CO2 and O2 results in two separate reactions: carboxylation and oxygenation. The latter results in the production of the toxic intermediate phosphoglycolate that is removed by the photorespiratory salvage pathway, causing the loss of carbon and energy. Photorespiration is widely regarded as a necessary but wasteful biochemical pathway [201]. Rubisco is one of the most well-studied enzymes and has been a prime engineering target for decades [202]. However, Rubisco consists of subunits expressed in the nuclear and chloroplast genomes and requires several chaperone proteins for functional assembly. Thus, engineering the catalytic properties of Rubisco has been challenging and progress has been slow. However, recent success in the assembly of Arabidopsis and tobacco Rubisco in E. coli could pave the way for more rapid screens to identify mutants with substantial improvements in function [52]. As an alternative strategy to combat Rubisco oxygenation, several photorespiratory bypasses in the chloroplast have been developed [203,204], including a synthetic glycolate metabolism pathway that has been shown to increase tobacco biomass yield by ~40% in a field study [205]. Recent work has expanded this approach to rice and highlighted the need for readdressing source-sink flow in plants engineered to have enhanced photosynthetic potential [206].
Several alternative synthetic bypass routes have also been suggested, including a synthetic pathway for converting glyoxylate to hydroxypyruvate in the peroxisomes, which circumvents the mitochondrial reactions, avoiding decarboxylation and deamination [207], and enzyme engineering approaches to perform new-to-nature reactions such as the reduction of glycolate to glycolaldehyde [61]. More ambitious synthetic strategies include the development of alternative carboxylases and synthetic cycling pathways (i.e., not integrated into the Calvin-Benson-Bassham cycle) for CO2 assimilation that completely bypass the shortcomings of Rubisco [208][209][210]. Computational modeling could be used to design further novel synthetic pathways to assimilate CO2 and bypass photorespiration. Natural evolution in plants has only explored a fraction of the potential metabolic design space to drive photosynthesis [211,212]; thus, there are likely many opportunities to further redesign and enhance crop performance. Several photosynthetic organisms have evolved CO2-concentrating mechanisms (CCMs) to overcome the limitations of Rubisco and reduce photorespiration [213]. Plants have evolved two CCM pathways: C4 photosynthesis and crassulacean acid metabolism (CAM) photosynthesis. Although the potential for exploiting CAM photosynthesis for future agricultural production has been highlighted [214], most engineering work has focused on the benefits of C4 photosynthesis. Plants that perform C4 photosynthesis (i.e., C4 plants like maize, sorghum, sugarcane, and switchgrass) typically separate CO2 fixation across two cell types: initial capture of CO2 in mesophyll cells as the C4 acid oxaloacetate and conversion to malate, and then transport to Rubisco-laden bundle sheath cells, where malate is decarboxylated to release CO2.
C4 photosynthesis facilitates above-atmospheric local concentrations of CO2 around Rubisco, thereby favoring the carboxylation reaction over the oxygenation reaction and reducing photorespiration. As a result, C4 plants generally have higher photosynthetic efficiencies than C3 plants. Nevertheless, several important staple crops (e.g., rice, wheat, and soybean) are C3 plants. International efforts have been undertaken to engineer C4 photosynthesis into rice to increase photosynthetic efficiency and productivity [215]. However, due to the two-cell complexity of C4 photosynthesis, converting C3 into C4 photosynthesis requires considerable reengineering of metabolism and dramatic changes in leaf anatomy, both of which impose a significant challenge. Engineering a single-celled C4 system using biosystems design could be a promising alternative strategy [216]. In addition, introducing the single-celled physical CCMs found in algae and cyanobacteria into plants is predicted to lead to some of the largest improvements in yield potential (>60%) [217,218]. Promisingly, several of the components required to build such systems have now been successfully introduced into plants [219][220][221]. Recently, the draft genome sequence of a single-cell C4 (SCC4) plant species, Suaeda aralocaspica, became available [222], providing an excellent genomics resource for engineering SCC4 in C3 plants. C2 photosynthesis, which utilizes glycine decarboxylase activity in the bundle sheath to decarboxylate the photorespiratory glycine produced in the mesophyll and deliver CO2 around Rubisco, is another CCM that operates by capturing, concentrating, and reassimilating CO2 released by photorespiration, and therefore, engineering of C2 photosynthesis has the potential to improve photosynthetic performance under high temperature, bright light, and low CO2 conditions [223,224].
Roughly 50% of the carbon captured by photosynthesis (net of photorespiration) is subsequently lost, and strategies for cutting this large carbon loss include (1) reducing unnecessary turnover of proteins (e.g., THI4, which is a suicide enzyme with a very high turnover rate) and membranes; (2) replacing, relocating, or rescheduling metabolic activities (e.g., replacing the Phe route to lignin, relocating nitrate assimilation from root to shoot, and rescheduling biosynthetic processes from night to day); (3) suppressing futile cycles (e.g., futile cycles between sucrose synthesis and degradation or between fructose-6P and fructose-1,6BP); and (4) reducing ion transport costs (e.g., reducing efflux of nitrate to the rhizosphere) [225]. These strategies can be implemented through a synthetic metabolic engineering approach [226].

Plant Biosystems Design for Increasing Plant Stress Tolerance.
Abiotic stresses (e.g., drought, heat, and salt stress) account for more than 60% of the yield loss in some major crops such as maize, wheat, rice, and soybean [227,228]. Plant resistance to abiotic stresses can be divided into escape, avoidance, and tolerance [229,230]. Various types of genes have been proposed as candidates for engineering to increase tolerance to abiotic stresses, including genes encoding (1) enzymes for production of protective metabolites (e.g., proline and sugars), (2) enzymes for membrane lipid biosynthesis, (3) enzymes for biosynthesis of antioxidants (e.g., ROS scavenging), (4) protective proteins (e.g., LEAs and molecular chaperones), (5) transporters (e.g., water and ion transport), (6) regulatory proteins, (7) kinases, and (8) proteins regulating transcription (e.g., transcription factors), along with genes involved in posttranscriptional (e.g., microRNAs) and posttranslational (e.g., ubiquitination) regulation of abiotic stress responses [230]. Previous efforts have been focused on engineering of individual genes to enhance tolerance to a specific abiotic stress. Plant biosystems design has the potential of integrating multiple genes to confer resistance to a broad range of abiotic stresses. Also, tissue-specific and stress-inducible expression of genes relevant to abiotic stress tolerance could be implemented via biosystems design to reduce energy costs and avoid pleiotropic effects.
One important example of a stress avoidance mechanism is CAM photosynthesis, which is a natural solution to the challenge of drought stress. CAM plants close their stomata (the pores on the leaf surface) during the heat of the day and open them at night, resulting in lower water loss and higher water use efficiency than C3 or C4 plants, which close their stomata during the nighttime and open them during the daytime [231]. The maximum yield of CAM crops is much higher than that of C3 or C4 crops under water-limited conditions [232]. Engineering of CAM machinery into C3 or C4 plants has great potential for increasing crop yield under drought conditions. CAM engineering requires design of multiple gene modules involved in carboxylation, decarboxylation, and stomatal movement, as well as genes involved in leaf succulence and vacuole size [233][234][235]. Biosystems design approaches could be used to integrate these CAM-related gene modules into plants, preferably using gene circuits to establish drought-inducible CAM (or CAM-on-demand) systems [236].
Biotic stresses imposed by pathogens and pests can also cause massive losses in crop yield. Engineering synthetic plant immunity would be a promising strategy for increasing or broadening plant resistance to diseases [237]. Immune receptors, such as nucleotide-binding leucine-rich repeat (NLR) receptors, are promising targets for increasing disease resistance using biosystems design approaches [238,239]. Creating genetically modified crops resistant to insects is a useful approach to reduce the yield loss caused by pests [240], such as transgenic crops overexpressing Bacillus thuringiensis (Bt) insecticidal proteins [241]. It is critical to design gene constructs that specifically target pests without toxic effects on humans or negative impacts on beneficial organisms. Host-induced gene silencing (HIGS), in which double-stranded RNAs (dsRNAs) directed against suitable insect target genes are expressed in transgenic plants, has been used to confer protection against pests [242]. The HIGS approach has two major advantages: (1) dsRNAs can be designed to be highly specific to target insects without negative impact on other organisms, and (2) multiple dsRNAs can be engineered into each individual plant for protection against multiple pests. However, the design of HIGS requires rich genomics resources of target insects and related species.
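The specificity requirement for HIGS constructs can be illustrated with a simple sequence screen. Small interfering RNAs processed from a dsRNA are roughly 21 nucleotides long, so one plausible first-pass check is whether a candidate dsRNA shares any 21-nt window with transcripts from non-target organisms. The Python sketch below implements that check under stated assumptions: the function names, the default window of 21 nt, and the exact-match criterion are all hypothetical choices made for illustration, and a real design workflow would also need to consider mismatched pairing and the available genomic resources.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq):
    """Upper-case a sequence and treat RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Return the reverse complement of a DNA/RNA sequence."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def kmers(seq, k=21):
    """Return the set of all k-nt windows in a sequence."""
    seq = normalize(seq)
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(candidate, off_target, k=21):
    """Return k-mers of the candidate dsRNA that also occur in an
    off-target transcript, checking both strands of the off-target."""
    off = kmers(off_target, k) | kmers(reverse_complement(off_target), k)
    return kmers(candidate, k) & off


def is_specific(candidate, off_targets, k=21):
    """True if the candidate shares no exact k-mer with any off-target."""
    return all(not shared_kmers(candidate, t, k) for t in off_targets)
```

A usage sketch: `is_specific(candidate_dsrna, beneficial_insect_transcripts)` would return `False` as soon as any 21-nt window of the candidate matches a transcript in the (hypothetical) list of non-target sequences, flagging the design for revision.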
Beneficial microbes (e.g., bacteria and fungi) can enhance plant resistance to abiotic and biotic sources of stress [243]. Plants can generate molecular and metabolic effectors for promoting beneficial plant-microbe interactions. Synthetic genetic circuits can be engineered into plants to reshape the rhizosphere microbiome to enhance stress tolerance and acquisition of nutrients (e.g., nitrogen). For example, opine biosynthesis pathways have been engineered into plants to reshape rhizosphere populations to increase the population densities of opine-catabolizing bacteria [244]. Similarly, a synthetic pathway has been engineered in Medicago truncatula and barley for the production of the rhizopine scyllo-inosamine to regulate bacterial gene expression in the rhizosphere [245]. There is considerable potential for engineering host plants to promote the beneficial interactions between plants and microbes.

Plant Biosystems Design for Improving Food Crop Quality.
The ever-increasing living standard worldwide, combined with limited arable land availability, calls for genetic improvement of food crop quality. Deficiencies in vitamins collectively affect billions of people worldwide and are a cause of substantial morbidity and mortality. Vitamin A deficiency is the leading global cause of preventable blindness [246], iron deficiency delays cognitive development [247], and folate deficiency is especially common among pregnant women and is associated with defects in fetal neural crest development [248]. Eliminating vitamin deficiency is a global public health priority and one of the World Health Organization Millennium Development Goals [249]. Biofortification (i.e., improvement of nutritional quality of food crops during plant growth and development) is a cost-effective strategy to mitigate vitamin deficiencies, particularly in the developing world where other vitamin supplementation programs suffer from logistical problems with transportation. Plant biosystems design is a useful approach to achieve biofortification through the engineering of superior nutritive properties in crop plants. Examples of biofortified crops include those enhanced with beta-carotene (provitamin A) [250,251]; arachidonic acid [250]; carotenoids associated with eye and cardiovascular health, immunity, and cognitive function [250,252]; iron [253]; and folate [254], along with efforts underway for α-tocopherol (vitamin E) [255] and zinc [256].
Crop plants generally contain various types of antinutritional factors such as cyanogenic glycosides (e.g., phaseolunatin and dhurrin), enzyme inhibitors (e.g., alkaloids and protease inhibitors), physiological disorganizers (e.g., lectins and saponins), hormone biosynthesis inhibitors (e.g., goitrogens), and antivitamins (e.g., antivitamin E) [257,258]. For example, cowpea (Vigna unguiculata L. Walp) plants, which are able to grow in semiarid regions with low input requirements, provide a sustainable source of essential nutrients (e.g., high protein and low fat content), but dietary utilization of cowpea has been seriously constrained by its antinutrients (e.g., phytic acid, protease inhibitors, and cyanogenic glucosides) and low protein digestibility [259]. Altogether, biosystems design could simultaneously leverage biofortification, remove antinutritional compounds, and increase protein digestibility, but care needs to be exercised to not consequently increase pest and pathogen susceptibility.

Plant Biosystems Design for Mitigation of Climate Change.
Carbon dioxide released into the atmosphere is the primary cause of anthropogenic global warming [260,261]. Terrestrial plants play a major role in atmospheric CO2 capture and storage [262,263]. Carbon sustainability and carbon neutrality would be major benefits of faster growing plants. Biosystems design has great potential to advance CO2 capture and storage in several ways, including (1) improving photosynthetic efficiency aboveground and allocation of photosynthates to belowground structures (e.g., source-sink modulation) [264], (2) converting annual crops to perennial plants that have much larger root systems [265], (3) generating more recalcitrant carbon-containing compounds (e.g., lignin and suberin) in the roots [266], (4) establishing deeper root systems [267], (5) enabling shifts from animal-sourced to plant-sourced protein [268], (6) restoring forests [269], (7) enabling carbon mineralization [270,271], and (8) increasing biomass accumulation through genetic improvement of photosynthetic efficiency [272] (see also Section 4.2.1). Armed with these biological targets, there are many opportunities to mitigate rising atmospheric CO2 concentrations, with promising avenues to pursue through biological carbon capture and storage in soils. This undertaking should be viewed with increasing optimism, especially as the strategies and technologies, aided by plant biosystems design, for storing carbon in soil pools with long residence times have emerged and will continue to evolve over time. The postgenomics era provides an unprecedented opportunity to identify genes, enzymes, biochemical pathways, and regulatory networks that underlie rate-limiting steps in carbon acquisition, transport, and fate, and thereby yield new approaches to enhance terrestrial carbon sequestration. An investment in plant biosystems design could harness these new approaches to increase biomass production in agricultural crops and fast-growing trees in managed plantations.

Plant Biosystems Design for Bio-Based Materials.
Living organisms produce a range of proteins and compounds, i.e., bioproducts, that are used as building blocks for the manufacture of biomaterials. Such manufacture can take place ex vivo (e.g., through chemical manipulation of extracted bioproducts) or in vivo, with the material synthesized naturally by the living organism (i.e., biomanufactured). Bio-derived or bio-based products or materials are simply referred to here as "biomaterials". Importantly, these concepts are distinct from bioinspired materials, which are characterized by the application of biological design rules and principles by material scientists during synthetic (nano)material synthesis (Figure 12). Bioproducts come from organic, inorganic, and/or living cell sources, which can be "mixed and matched" to produce composites with a diverse set of properties and a modularity often unavailable to chemists and material scientists [273]. Plants are a prime source for various bioproducts, including (1) biopolymers such as cellulose, lignin, and derivatives; (2) extractives such as latex, starch, and fatty acids (e.g., polyhydroxyalkanoates); (3) small molecules such as phenylacetic acid and muconic acid, which often undergo additional downstream chemical, enzymatic, and/or thermal processing for commercial applications; (4) inorganic biominerals such as phytoliths; and (5) organic-inorganic composites such as calcium oxalate crystals and calcium carbonate. Plant-derived bioproducts can be used for fiber, bioplastics, liquid crystals, energy storage, and insulants [274][275][276][277][278][279][280][281][282][283]. Their applications span various fields, including medicine, engineering, and material sciences, where they represent complementary alternatives to, or replacements for, bioproducts and biomaterials of animal origin, which can raise environmental and ethical concerns.
One beneficial template example for biomanufacturing is the composite of calcium carbonate (CaCO3), which is a plant crystal equivalent to nacre (mother-of-pearl) in composition. However, their multifactorial architecture is different, with plant crystals being good insulators and nacre having unique mechanical and optical properties. Exploring plant biosystems design with multidisciplinary tools could enhance our predictive power to modify such plant crystals, programming a set of characteristics to solve urgent energy, engineering, and environmental problems [284][285][286][287][288][289].
Another example in biomaterials fabrication is the bioengineering and use of cellulose nanocrystals (CNC) and nanofibrils (CNF). Structural and functional properties of CNFs can be influenced by plant cellulose properties, such as crystallinity and interfacial binding with matrix polysaccharides [290]. The relevance of the biologically synthesized 18-(glucan) chain cellulose microfibril structure to CNF properties [290] opens up the prospect of leveraging the biosynthetic processes in plants. Potential biosystems design strategies for varying the chain length and crystallinity of cellulose include altering the distribution and composition of cellulose synthase (CesA) complexes (CSCs), which have been proposed to be composed of 18 CesA proteins [291], to potentially optimize cellulose-derived nanocellulose and nanocellulosic composites. Conventional approaches to enhancing cellulose content via overexpression of single secondary wall-associated CesA types have generally not increased cellulose content in woody plants, owing to challenges with effective transgene expression in the presence of endogenous copies [292] or to the varying CesA stoichiometry needed for functional CSCs [293]. Use of biosystems design approaches will aid in understanding the extent to which domain swapping, modified CSC composition, heterologous CesA expression, and use of optimized promoters can impact content, crystallinity, degree of polymerization, and crosslinking properties of cellulose.
In clinical uses of biomaterials, scaffolds of animal origin can lead to variability and raise environmental and ethical concerns, which can potentially be addressed using animal-free scaffolds such as bioproducts of plant origin [278]. Due to the natural strength of plant and marine algae-derived bioproducts (e.g., nanocellulose and alginate) and their functional roles as scaffolds for growth, plant-derived products or materials are highly promising as bioinks for printing of novel biomaterials with applications in drug delivery, wound healing, and implantable medical devices [279].
Overall, biosystems design can be a powerful aid in accelerating the research and development of plant-derived bioproducts. Although plants have been the source of biomaterials for a long time, the recent need for petroleum-independent products, together with the tremendous opportunities and potential for developing renewable and better performing products from biomaterials, has accelerated studies in biomanufacturing [286]. Also, in vivo biomanufacturing provides us with materials having characteristics that cannot be reproduced by chemistry alone [273]. However, we still lack understanding of the "material loci" in plants controlling the synthesis, transport, modification, assembly, and storage of biomaterials. In addition, there has been limited exploration of chemical composition, ultrastructure, and bonding within/across interfaces in hybrid biomaterials, which often have appealing physical, optical, and electromagnetic properties. Biosystems design approaches will be required to address such needs as well as to recruit and integrate emerging theoretical, computational, and in situ characterization tools to establish a knowledge toolbox and bridge gaps between disciplines, accelerating the overall biomaterial cycle "design-discover, synthesize, characterize, learn and apply" (DiSCLA). For example, a biosystems design approach has been proposed to reconfigure plant metabolism for cost-effective production of biodegradable plastic [294]. One challenge of this approach is how to minimize the negative impact of biodesign for bio-derived products and materials on plant growth performance both above and below ground [264,294].

Plant Biosystems Design for Bioenergy Production.
The interest in biofuels has largely shifted from bioethanol to drop-in advanced fuels due to the increasing popularity of electric automobiles, which has shifted the major potential for biofuels into aviation, where the use of heavy batteries remains unlikely to become economically feasible. Because energy density is a key consideration for aviation fuels, drop-in fuels are more promising than bioethanol [295]. Current bioethanol production suffers from a number of problems such as relatively small lifecycle reductions in greenhouse gas emissions [296] and competition with food production that raises the price of staple foodstuffs [297,298]. In order to overcome these drawbacks, bioenergy crops should be engineered to grow with fewer inputs, on marginal land, or with valuable coproducts such as food, medicine, or industrial chemicals [299]. Biomass feedstocks (e.g., lignocellulose, starches, and lipids) can be converted into jet fuels via several different routes, such as oil to jet fuels (OTJ), syngas to jet fuels (STJ), and alcohol to jet fuels (ATJ) [300]. Currently, the cost of biomass-based jet fuels is relatively high (e.g., 4.4 to 5.1 $/gal from the OTJ route), which can be offset by generating high-value coproducts from the biomass feedstock [300]. Therefore, it is important to design the metabolic pathways in plants to optimize biomass feedstock for production of both jet fuels and value-added coproducts. Previous efforts have discovered many genes relevant to the yield and quality of biomass in multiple bioenergy crops. Biosystems design provides an excellent opportunity for combining the improved traits conferred by individual genes to optimize the performance of bioenergy crops, with simultaneous improvement of biomass quality (e.g., high cellulose content and low recalcitrance to deconstruction) and biomass accumulation under both normal and stress conditions.
Typically, plants have a low photosynthetic efficiency, converting less than 1% of the available sunlight to stored chemical energy [301], which limits the economic feasibility of plant biomass as feedstock for biofuels production [302]. Therefore, it is critical to increase the photosynthetic efficiency of bioenergy crops using biosystems design approaches, as described in Section 4.2.1. In general, the stem of woody bioenergy crops (e.g., poplar) is used for biofuels production while the leaves are discarded as waste. To increase the economic value of woody bioenergy crops, their leaves can be used as bioreactors to produce high-value biobased products (e.g., biodegradable plastics and specialty or commodity chemicals) using the strategies described in Section 4.2.5 as well as medicine using synthetic biology approaches described in Section 4.2.8. Furthermore, the below-ground tissue of bioenergy crops can be optimized for long-term carbon storage using biosystems design to mitigate climate change (see Section 4.2.4).

Plant Biosystems Design for Phytoremediation and Phytomining.
Pollution by heavy metals, which cannot be chemically degraded, poses a serious long-term threat to the environment and human health [303]. Some plants, called hyperaccumulators, can accumulate metal and metalloid trace elements (e.g., nickel, zinc, cadmium, manganese, arsenic, and selenium) to extraordinarily high concentrations in their above-ground living biomass [304]. There are 721 plant species identified as hyperaccumulators [305], and some of the hyperaccumulator plants can take metals up to 1-2% of total dry weight, which may be hundreds or thousands of times greater than commonly grown plants [305,306]. In comparison with different physical and chemical methods of extracting heavy metals, use of hyperaccumulator plants is perceived as a green, low-cost, and efficient approach [307]. Hyperaccumulator plants can be utilized for phytoremediation to clean up the soils contaminated by heavy metals and/or for phytomining to recover an economic amount of metals (e.g., nickel) from the plants [303,308].
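To give a sense of scale for phytomining, the metal recoverable per harvest is simply the harvested dry biomass multiplied by the metal fraction of that biomass. The short Python sketch below works through this arithmetic. The 1-2% metal fractions come from the range quoted above, whereas the function name and the biomass yield of 10 t of dry matter per hectare are hypothetical values chosen only for illustration.

```python
def metal_yield_kg_per_ha(dry_biomass_t_per_ha, metal_fraction):
    """Metal recovered per hectare per harvest, in kilograms.

    dry_biomass_t_per_ha: harvested above-ground dry biomass (tonnes/ha).
    metal_fraction:       metal content as a fraction of dry weight
                          (0.01 corresponds to 1%).
    """
    if dry_biomass_t_per_ha < 0 or not 0 <= metal_fraction <= 1:
        raise ValueError("biomass must be >= 0 and fraction within [0, 1]")
    return dry_biomass_t_per_ha * 1000.0 * metal_fraction


# Hypothetical biomass yield; not a figure from the cited studies.
ASSUMED_BIOMASS_T_PER_HA = 10.0

for fraction in (0.01, 0.02):  # the 1-2% dry-weight range quoted above
    kg = metal_yield_kg_per_ha(ASSUMED_BIOMASS_T_PER_HA, fraction)
    print(f"{fraction:.0%} metal in dry biomass -> {kg:.0f} kg metal per ha")
```

The same relation makes clear why both biomass yield and metal concentration matter: doubling either one doubles the recoverable metal under this simple model.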
Comparative analyses of hyperaccumulators and closely related nonhyperaccumulators have improved the understanding of molecular mechanisms of heavy metal uptake, transport, sequestration, and tolerance in hyperaccumulators. Particularly, functional characterization of heavy metal transporters, such as the Zinc-regulated transporter Iron-regulated transporter Proteins (ZIP), Heavy Metal transporting ATPases (HMA), Multidrug And Toxin Efflux (MATE), and Metal Transporter Proteins (MTP) gene families, has yielded valuable gene resources for designing and engineering more effective phytoremediation systems [306,309]. Overexpression of such transporters has been widely successful in enhancing the uptake of heavy metals in model plants or nonhyperaccumulators [306]. However, due to the complexity of plant metal transporting and trafficking systems, a much more sophisticated design of plant biosystems will be required to enhance the capability of phytoremediation. A nascent area of interest is the role of plants in accumulation of rare earth elements (REEs), which are critical materials with unique optical, catalytic, and magnetic properties but do not have reliable supply chains. Plants generally have low concentrations of REEs, which in part reflects the diffuse distribution and low concentration of these elements in soil. However, certain types of plants, such as ferns and citrus trees, have been reported to have a higher capacity for accumulating REEs [310][311][312]. Future research on phytomining can focus on understanding the molecular mechanism underlying REE accumulation in hyperaccumulating plants and then transferring the REE-accumulating mechanism into existing crop plants using biosystems design, which may involve engineering of transporters and metabolic pathways.
Biosystems design research for phytoremediation and phytomining can be focused on the following aspects: (1) due to the narrow distribution and low biomass yield of hyperaccumulator plants [305], it is necessary to extend the geographic distribution and biomass yield of existing hyperaccumulators through targeted mutation or gene circuit design; (2) it is important to enhance the uptake and tolerance of heavy metals in nonhyperaccumulators using a biosystems design approach; (3) because hyperaccumulation of lead, copper, cobalt, chromium, and thallium has not been well established in natural plant systems yet [304], it is urgent to either find natural or synthetic genes and pathways for accumulating these heavy metals through systems biology research, in silico modeling, and metal transporter protein engineering; and (4) comparative cross-species studies can unravel the fundamental pathways unique to hyperaccumulators of rare earth elements and present avenues for potential biosystems design approaches in deployable plant species.

Plant Biosystems Design for Medicine Production and Medical Research.
Plants have been the primary production chassis for medicine for millennia and continue to play an important role in modern supply chains. Plants are the source of approximately 25% of modern drugs and in many cases remain the most cost effective method for their production [313]. Examples include the antimalarial drug artemisinin [314] and the anesthetic morphine [315]. Many of these high-value medicinal compounds are produced from nonmodel plants, but plant engineering efforts have nonetheless been successful in improving their yield [316], which remains economically competitive with chemical and microbial synthesis [317].
Plants have been used as bioreactors to produce vaccines, such as anticancer or viral vaccines [318]. Genetically engineered plants can produce recombinant proteins at larger scale than conventional platforms and are on track to become cost competitive with conventional production platforms. In fact, plants are being used as a platform to produce a wide range of antibodies in different organs (leaves, roots, seeds, tubers, fruits) of various plant species, such as tobacco, potato, rice, tomato, and pea [319][320][321][322]. This has been possible due to rapid improvement in plant genetic engineering and transformation technologies. There are several benefits for the use of plants to produce therapeutic antibodies. First, it reduces the production cost. Plants can be used to produce recombinant proteins at 0.1% and 2-10% of the cost of mammalian cell cultures and microbial fermentation systems, respectively [323]. Second, plants are usually regarded as safe systems for antibody production because they do not harbor mammalian pathogens or produce endotoxin. Finally, production of antibodies in edible tissue will allow convenient, needle-free oral immunization at the gastric mucosal surface [320].
There are several examples of successful use of plant-based antibodies to treat human diseases. In one such case, the secretory antibody "CaroRx™" derived from tobacco leaves was used to treat Ebola patients during outbreaks of this virus in Africa in 2014 [324]. Other plant-made antibody products that are currently being used to treat human diseases are "DoxoRx™" for treating drug-induced alopecia, a common side effect of cancer therapy, and "RhinoRx™" for treating the common cold [320,325]. Another outstanding example of plant-based molecular pharming to produce biopharmaceuticals is the production of a vaccine against the SARS-CoV-2 virus. A biotech company called Medicago Inc. (Quebec City, Canada) is using plants to produce virus-like particles (VLPs), which is the first step in developing a vaccine against COVID-19 before preclinical testing for safety and efficacy. VLPs are noninfectious viral protein particles that lack the key genetic material for infection. However, they are still recognized by the immune system and therefore can be used to produce antibodies against SARS-CoV-2. A similar effort is also underway by a US-based company, iBio, Inc.
Plants can also be used to produce reagents for detecting human pathogens. For example, plants can be used to generate diagnostic reagents for COVID-19 in multiple ways: (1) generating positive control reagents for RT-PCR detection of the SARS-CoV-2 virus by producing artificial RNA containing the target virus genomic regions, packaged inside VLPs derived from Cowpea mosaic virus (CPMV), (2) generating antibodies for detecting the spike (S) protein of SARS-CoV-2, and (3) generating recombinant proteins for detecting antibodies against SARS-CoV-2 to identify people who are currently infected or recovering from infection [326].
In addition to serving as chassis for producing medicinal compounds and biologics, plants are emerging as a robust discovery and functional validation platform to elucidate the genetic basis of heritable human diseases including cancers and developmental abnormalities. Plants and humans share a common eukaryotic ancestry represented by evolution of the first complex and multicellular organism. Evolution of this complex life form necessitated the emergence of genetic mechanisms to coordinate DNA replication-repair, cell division, signaling between neighboring cells, and their adhesion to facilitate hierarchical assembly into tissues and organs with specialized functions [327]. Given this shared ancestry, proteins underlying basal processes such as DNA replication-repair, cell division, and cell adhesion remain highly conserved across disparate eukaryotes. Additionally, such conservation at the protein level has been shown to result in orthologous phenotypes (or phenologs) across divergent eukaryotes including Arabidopsis, humans, and mice [328], suggesting that comparative studies across eukaryotes have the potential to identify critical amino acid motifs mediating function of these conserved proteins.
Although significant progress has been made in linking mutations to disease outcomes in human populations using genome-wide association studies (GWAS), pinpointing causal variants that can be used as reliable biomarkers in disease diagnosis remains a major challenge. This is largely due to the fact that heritable diseases are often extremely rare in human populations, which limits the ability to acquire sufficient sample sizes for robust statistical associations [270]. In contrast, long-lived perennial plants like poplar (Populus spp.) exhibit wide ranges of phenotypic variation that are underpinned by their ability to maintain diverse sequence variants, including high impact loss-of-function mutations, at surprisingly high levels [329]. As such, high-resolution GWAS in plants often require fewer samples and can precisely identify causal mutations underlying phenotypic expression. For example, using ~300 individual poplar plants, Tuskan et al. [330] demonstrated that shared homology at the protein level was manifested as an orthologous cell-proliferation phenotype between poplar and humans. Specifically, they found significant similarities between genes implicated in callus formation in poplar and tumorigenesis in humans. Callus formation, which is the rapid growth of undifferentiated cell masses in plants, is orthologous to the uncontrolled cell proliferation during tumorigenesis in humans. The rate of callus formation was significantly associated with loss-of-function mutations occurring within a poplar SOK1 kinase related to the Mammalian Sterile-20 kinase, which has been shown to function in tumor suppression in humans [331]. In a separate study, Bdeir et al. [332] identified sequence variants in a desmosome protein, which were associated with adhesion of bark tissue in poplar. A functional variant of the protein prevented bark abscission, resulting in annual accumulation of bark layers. In humans and mice, orthologous function of the same protein has been implicated in onset of the extremely rare skin cancer, keratoderma, which is manifested as an abnormal accumulation of skin layers [333].
Given these unique advantages in precise genetic mapping of causal variants in addition to ease of experimental manipulation as well as fewer ethical issues, plants offer an attractive discovery and validation platform for understanding how mutations modulate the function of proteins that are fundamentally conserved across eukaryotes to provide highly resolved therapeutic targets for treatment of heritable human diseases in the rapidly expanding precision medicine field [334].
4.2.9. Plant Biosystems Design for Biosentinel. Biosensors are defined as molecules, organisms, or devices in a biological context that emit quantifiable signals in response to specific molecules or biological processes [335]. Plants have rapid (within the seconds to minutes time-scale) responses to various biotic and abiotic stimuli [336]. Therefore, inducible expression of reporter genes, such as the genes encoding fluorescent protein (e.g., GFP) and pigments (e.g., anthocyanin and chlorophyll), could be engineered to detect environmental stimuli. For example, a resettable synthetic degreening plant-based biosensor (also called phytosensor) system was successfully created, in which plants lost their chlorophyll from induction of the degreening circuit by a synthetic steroid (4-hydroxytamoxifen, 4-OHT), and regreened after the inducer was removed [337]. Also, a synthetic signal transduction pathway was constructed for detecting the nitroaromatic explosive 2,4,6-trinitrotoluene (TNT) [338].

Plant Biosystems Design for Tissue Engineering.
Synthetic morphogenesis enabled by synthetic biology can be achieved for the de novo generation of programmable tissues and organs [339]. Three-dimensional microstructures have recently been created from single plant cells in vitro by mimicking the plant tissue environment and using biocompatible scaffolds similar to those used in mammalian tissue engineering, with the scaffolds providing both developmental cues and structural stability to isolated callus-derived cells grown in liquid culture [340,341]. Furthermore, one interesting science-fiction-like question is: Can plants be genetically reprogrammed to form new-to-nature structures useful for human life? For example, it has been envisioned that a tree could potentially be reprogrammed to grow into a fully functional house based on the genetic instructions designed by scientists [342].

Plant Biosystems Design for Space Exploration.
There is growing interest in, and there are ongoing plans for, human travel to Mars in the near future, and plants on Earth need to be redesigned to meet the needs of humans living in the Martian environment [343]. The targets include increasing drought resistance to allow for plant growth under water scarcity in an extraterrestrial environment [344], improving cold tolerance and nutrient-utilization efficiency, and tuning photosynthesis for the lower Martian light intensity [343]. For practical reasons, it would be better to engineer plants for survival and yield in a space station or in protected facilities on the surface of other celestial bodies (e.g., controlled environments like growth chambers), rather than in the open on the surfaces of those bodies. Martian surface temperatures are generally between -60 and 0°C and are not suitable for plant growth without radical engineering that is out of reach of current technology.

Social Responsibility of Plant Biosystems Design
Biosystems design is a powerful approach to plant science research that holds enormous promise but also calls for caution. While plant biosystems design has great potential for genetic improvement of crop plants or creation of synthetic plant genomes for the benefit of our society, this new research discipline has a huge social responsibility to ensure biosafety and address the potential ethical issues. Genetically_Modified_Trees.aspx). Nonetheless, plant biotechnology still presents some potential risks that must be considered during plant biosystems design research. These risks can be grouped into six broad categories: nutritional nonequivalence, potential allergenicity, escape of noxious transgenes, creation of resistant weeds and pathogens, unintended changes, and disruption of ecosystem function. Strategies are available to reduce each of these risks, but they must be applied throughout the design, testing, and implementation phases of plant biosystems design in order to manage the risks effectively. While there is great concern about the potential safety issues caused by transgenes, the risks of not using transgenes have been ignored. It is critical to evaluate the risks of using vs. not using transgenes. For transgenes that are not noxious, the risks can be considered very low if the transgenes cause no unintended changes (e.g., side effects and off-target effects). Risk analyses should focus on the engineered plant products, not the process through which they are created (e.g., CRISPR/Cas-based gene editing vs. transgenesis). Accordingly, a growing number of countries do not regulate genome-edited crops as GMOs, since genetic variants created through genome editing are indistinguishable from naturally evolved variants [345]; such crops are still categorized as GMOs in Europe, however, owing to concerns about unintended effects (e.g., off-target effects, unintended on-target effects, and other unintended consequences) [346] (see Section 5.1.5 for more details).

Nutritional Nonequivalence.
Nutritional nonequivalence is frequently a design feature for transgenic crops, as in the case of biofortified cereal grains [347,348]. However, biosystems engineering always carries the possibility of unintended changes to the crop product metabolome, which may result in lower nutritional quality. Compounds metabolically linked to any plant biodesign changes should be carefully quantified under a variety of field conditions to test for changes in nutritional content or for production of harmful levels of unintended off-target metabolites. Flux analysis is a biosystems design tool that can guide the search for relevant metabolites to be analyzed [349].

Potential Allergenicity.
Introduced allergenicity is another risk that arises naturally from a design feature. If genes encoding allergenic proteins are introduced into a plant, the resulting engineered plant is likely to be allergenic. The generation of allergenic plants occurred early in the history of transgenic plant production with soybeans expressing the 2S albumin protein from Brazil nut [350]. Allergies to Brazil nut 2S albumin are relatively common and often severe, so it is likely that introduction of the allergenic soybean to the human food supply would have harmed some consumers, leading to a public relations disaster. Fortunately, the product was never brought to market, but this work acts as a reminder of the seriousness of this issue [351]. All transgenic proteins introduced to crops should be rigorously tested with a library of immune cells and serum samples to ensure no known or unknown allergens are being added to crops.
5.1.3. Potential Escape of Noxious Transgenes. Engineered plants containing transgenes are broadly useful for agriculture, with transgenes intended to remain within the specific cultivars for which they are designed. However, multiple instances have been documented of "transgene escape," which occurs when transgenes are transferred to other crop cultivars or wild relatives [352,353]. Transgene escape is widely discussed by opponents of plant biotechnology and could potentially interfere with wild gene pools, generating agronomic problems such as herbicide-resistant weeds. Steps should therefore be taken to reduce transgene escape rates. Such methods include transgenesis in the plastid genome rather than the plant nuclear genome, resulting in maternal rather than biparental inheritance [354]; self-limiting genes that result in eventual sterilization of the plants [355]; male sterility [356]; and bisexual sterility [357].

5.1.4. Emergence of Herbicide and Pesticide Resistance. Agricultural fields are usually designed with extremely high densities of genetically similar plants and high levels of nutrients, which are ideal conditions for the spread of disease, parasitism, or opportunistic growth by undesired plants. In the ongoing battle against pathogens, pests, and weeds, new tools tend to be effective for a length of time before evolution of the undesired organisms renders the tool useless [241,358]. The situation is analogous to the problem of antibiotic resistance in medicine, and the balance will likely continue until the elimination of agriculture or the elimination of the undesired pests, pathogens, and weeds. In some cases, transgene escape could lead to evolution of resistance to existing tools, but even if that case is prevented, evolution will take its course, and organisms will develop resistance to the current agricultural defenses. Agronomic and biotechnological steps can be taken to delay the evolution of resistance, including the use of multiple independent forms of defense simultaneously (e.g., several distinct herbicides and herbivory deterrents) and the planting of undefended "refuges" to reduce the selective pressure for undesired organisms to evolve resistance to the crop defense chemical or engineered trait [359].
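The reasoning behind refuges can be illustrated with a deliberately simplified population-genetic sketch. The model below is not taken from the cited work; it assumes a single resistance locus with a fully recessive resistance allele, random mating across the whole landscape, and a defended crop that kills every pest lacking two copies of the resistance allele. The function names, the starting allele frequency, and the 50% threshold are hypothetical choices made for illustration.

```python
def next_allele_frequency(q, refuge_fraction):
    """Advance the resistance allele frequency q by one generation.

    Toy assumptions: resistant homozygotes (RR) survive everywhere, whereas
    heterozygotes (RS) and susceptible homozygotes (SS) survive only in the
    undefended refuge, which covers `refuge_fraction` of the planted area.
    """
    p = 1.0 - q
    w_rr, w_rs, w_ss = 1.0, refuge_fraction, refuge_fraction
    mean_fitness = q * q * w_rr + 2 * p * q * w_rs + p * p * w_ss
    if mean_fitness == 0.0:
        return q  # no survivors at all; frequency left unchanged
    # Standard single-locus selection update under Hardy-Weinberg proportions.
    return (q * q * w_rr + p * q * w_rs) / mean_fitness


def generations_to_resistance(refuge_fraction, q0=0.001, threshold=0.5,
                              max_generations=10_000):
    """Count generations until the resistance allele reaches `threshold`.

    Returns None if the threshold is not reached within `max_generations`.
    """
    q = q0
    for generation in range(1, max_generations + 1):
        q = next_allele_frequency(q, refuge_fraction)
        if q >= threshold:
            return generation
    return None


for fraction in (0.0, 0.05, 0.2):
    print(f"refuge {fraction:.0%}: {generations_to_resistance(fraction)} generations")
```

Under these assumptions, enlarging the refuge lets more susceptible alleles survive each generation, which slows the rise of the resistance allele. Real pest populations violate several of the assumptions (e.g., incomplete recessiveness, limited dispersal, fitness costs), so the numbers should be read qualitatively.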

5.1.5. Unintended Changes. CRISPR/Cas-mediated genome editing has been widely used for genetic improvement of plants. However, there are concerns about potential unintended genetic modifications caused by gene editing due to off-target effects. Although off-target mutations in CRISPR/Cas9-edited plants can be negligible and at a level lower than inherent natural variation when highly specific gRNAs are used [360], it would be prudent to carry out risk analysis for practical application or commercialization of genome-edited plants [361,362]. For example, a recent study assessed, in addition to off-target mutations, potential epigenetic changes attributable to CRISPR-mediated genome editing, reporting no detectable changes in DNA methylation status in edited plants [363].
5.1.6. Disruption of Ecosystem Function. While uncommon, there is a risk that transgenic plants have negative consequences for natural ecosystems. A classic example is the case of the monarch butterfly and plants transformed with genetic material from the bacterium Bacillus thuringiensis (Bt): a substantial research effort led by the US Department of Agriculture followed reports that monarch larvae showed decreased survival after feeding on milkweed dusted with Bt pollen [364]. Although it was determined that only a very small portion of adjacent milkweed plants accumulate pollen in sufficient concentration to have a negative effect, it is important to recognize that there may be ecological risks associated with the release of transgenic plants.

Ethics of Plant Biosystems Design.
As an emerging cutting-edge discipline, plant biosystems design should meet the following ethical principles: public beneficence, responsible stewardship, intellectual freedom and responsibility, democratic deliberation, and justice and fairness [365].

Ethical Issues of Plant Biosystems Design.
Plant biosystems design and plant biotechnology show great promise in improving environmental and human health. Continued funding for applied plant science is critical to ensure this promise is fulfilled, especially for cases without obvious commercial interests such as design of biofortified food for poor consumers in the developing world. Since national research organizations provide a substantial fraction of the funding for plant biosystems design research, there is a moral obligation to ensure that the benefits of the products are shared by as many people as possible.

Solutions to Address the Ethical Issues.
A well-known genetically engineered crop is Golden Rice, a widely celebrated biofortified crop that resulted from a major public-private collaboration [366]. The product was designed to benefit the global poor, and the licensing agreements regarding the underlying intellectual property are exemplary: while large-scale farmers in Western countries must pay for use of the product, Golden Rice is free for those who need it, specifically breeding programs and smallholder farmers (http://www.goldenrice.org/Content1-Who/who4_IP.php). Such licensing arrangements can simultaneously give companies the required economic incentives to produce products and democratize access to those products for the global poor who stand to benefit most.

Conclusion
As an emerging interdisciplinary research field, plant biosystems design shows massive potential for not only increasing our understanding of the mechanisms underlying the biological complexity of plant organisms but also accelerating the domestication of crop plants or creation of novel plant organisms to address challenges related to food and energy security, environmental sustainability, and human health. In this roadmap, we discuss the principles, methods, applications, and social responsibility of plant biosystems design. Currently, the theories and principles of plant biosystems design are only partially understood, and the knowledge gaps are expected to be filled through systems biology research and the DBTL process(es) in the future. Significant progress has already been made in method development for plant biosystems design, such as efficient assembly of DNA parts, high-precision gene editing, enhancement of plant transformation using morphogenic regulators, and virus/nanotube-mediated in planta transformation. Still, new technologies will need to be developed for enabling large-scale genome refactoring or construction of functional synthetic plant genomes. Exciting achievements have been made in the application of biosystems design to pathway engineering for improving agricultural crops, yet the potential of biosystems design needs to be exploited in many other aspects, such as bio-based materials, climate change mitigation, bioenergy production, phytosensors, tissue engineering, and space exploration. The potential of new machine learning capabilities (e.g., explainable artificial intelligence) could be exploited for predictive learning from big genomics and phenomics data. Also, plant biosystems design will need to be deployed for a new type of high-precision agriculture and bioeconomy as a contribution to the fourth industrial revolution, in which genetic engineering plays an important role [367][368][369].
While it will be very important to achieve scientific and technological advances in plant biosystems design, we should pay special attention to its social responsibility (i.e., biosafety and ethics) to improve the public perception and acceptance of this new research discipline as benefiting humanity and the environment. In particular, there is a need to engage private and academic stakeholders to bring all these technologies to the poorest populations. Finally, plant biosystems design is much more challenging than traditional genetic engineering and therefore needs extensive interdisciplinary collaborations among many researchers around the world. We hope that the three computational modules in the "Plant Biodesign Hub" (Figures 6, 7, and 9) outlined in this roadmap will serve as a major public platform for national and international collaborations to realize the great promise of plant biosystems design for the future of our global society.

Disclosure
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this article.

Authors' Contributions
XY planned and drafted the manuscript. All authors read and contributed to the content, edited, or reviewed it.