Level: M2
Contact: Philippe Dague (pdague@lmf.cnrs.fr)
Location: Laboratoire Méthodes Formelles, ENS Paris-Saclay
Keywords: Metabolic pathway analysis, models of cellular growth, thermodynamics, kinetics, regulation, convex polyhedral geometry, combinatorial optimization, ASP
Context
In the context of systems biology, the analysis of metabolic pathways is an essential step in studying the behavior of metabolic networks, which has important societal implications for human health (metabolic therapies) and biotechnology (designing cell factories to produce metabolites of interest to the agri-food industry, animal husbandry, energy production, and environmental protection).
In a steady state (i.e., constant concentrations of all molecular species), and when only the stoichiometry of the reactions and the known irreversibility of certain of them are taken into account, metabolic pathways constitute a convex polyhedral cone or, more generally, a convex polyhedron if inhomogeneous linear constraints on fluxes are added [19, 20, 18, 10, 21, 14, 8, 5]. They are thus characterized by those pathways called elementary vectors (EVs), in the sense that they are not decomposable without cancellations: any pathway can actually be broken down into a weighted sum of non-negative coefficients of elementary vectors. In the simplest case of null growth and no inhomogeneous linear constraints, metabolic pathways constitute a subspace cone or flux cone, and EVs are identical to minimal (for the set inclusion of the sets of reactions that compose them) pathways, called elementary flux modes (EFMs). Enumerating EVs or EFMs boils down to enumerating the vertices of a convex polyhedron. The most powerful tools for this are based on the Fourier-Motzkin double description method [3]. However, despite optimized implementations that allow the enumeration of millions of elementary pathways [17, 9], this method cannot handle metabolic models at the genome scale.
Furthermore, most of the pathways produced in this way are not biologically relevant because other biological constraints than only stoichiometry and irreversibility of certain reactions must be taken into account, such as thermodynamics, kinetics, and regulation. Doing so avoids producing pathways that have no biological reality and limits their number with the aim of scaling up to the genome level. Recent international work has begun to incorporate thermodynamic constraints on the one hand [4, 16] and regulatory constraints on the other [7] into the iterative double description algorithm. Other work uses Answer Set Programming (ASP), possibly coupled with Linear Programming, to express and solve the constraints, including thermodynamics constraints [11, 1]. A theoretical study of the structure of the solution space of metabolic pathways in the presence of thermodynamic and kinetic constraints (and, more generally, of sign-monotone constraints) and of Boolean regulatory constraints (and, more generally, of sign-invariant constraints), and of the characterization of the elementary vectors has been conducted in the particular case of null growth [2].
The same author is currently generalizing this work to the case of models of cellular growth, by drawing on the recent articles [6, 13, 15]. In the process of cellular growth, steady-state self-fabrication requires a cell to synthesize all of its components, including metabolites, enzymes, and ribosomes, in proportions that match its own composition. Simultaneously, the concentrations of these components affect the rates of metabolism and biosynthesis, and hence the growth rate. The theory describes all phenotypes that solve this circular problem. Metabolic pathways generalize in growth vectors (GVs) and EVs generalize in elementary growth vectors (EGVs), but the situation is quite different from that in the case of null growth: in the latter case, the concentrations of the molecular species are independent and can be imposed; in case of growth, the concentrations of all species (metabolites, enzymes, ribosome) are fixed by each solution flux (as well as the growth rate). In addition, the kinetic constraints are necessarily taken into account to elaborate the theory, either in the form of complete kinetic models if available, or, most of the time, of (semi-)kinetic models that are constraint-based models [12] where only capacity constraints (bounds on the kinetics) are expressed. Results are obtained concerning growth rate maximization: one can eliminate the fluxes and the concentrations of the enzymes and the ribosome, keeping as variables of the problem the ribosome fractions allocated to the synthesis of each enzyme and the ribosome itself, as well as the metabolite concentrations and growth rate. The solutions in the space of ribosome fractions, for each given metabolite concentration and growth rate, are called growth states (GSs) and constitute a polyhedron whose vertices are called elementary growth states (EGSs) and are also the support- minimal vectors of the polyhedron. One finds that the maximum growth rate is attained at an EGS.
Objectives
The objectives are to design and implement algorithms for enumerating EGVs and EGSs in comprehensive metabolic models of cellular growth, taking thus necessarily kinetic constraints into account (and possibly also regulatory constraints) for large-scale metabolic networks (with the genome level as the ultimate target).
The work will build from one side on the theoretical work [2] and on the software developed during a Master 1 internship in 2015 and during a thesis defended in 2023 for the case of null growth (with thermodynamic or regulatory constraints), and from the other side on the theoretical results obtained in [6, 13, 15] for the case of non-null growth, for which the integration of kinetic constraints will be studied. Methods relying on the double description method by calling on a convex programming solver, or on a constraint solver such as SMT (Satisfiability Modulo Theory) solver or ASP will be implemented, and the effectiveness of the two approaches will be compared. Validation of methods, algorithms, and software will be conducted first on small-scale networks and then on genome-scale networks derived from Bacillus subtilis.
Requirements
Knowledge to acquire about metabolic networks and convex polyhedral geometry, by reading [2], which is self-contained, and [6, 13, 15] for the non-null growth case. Use of a logic-based programming language, such as ASP, and/or of convex programming or combinatorial optimization languages, for experimentation.
Supervision
Philippe Dague, emeritus professor at Université Paris-Saclay (philippe.dague@universite-paris- saclay.fr).
References
[1] Emma Crisci, Maxime Mahout, and Sabine Peres. “Computing Thermodynamically Consistent Elementary Flux Modes with Answer Set Programming”. In: CMSB 2024 - 22nd International Conference on Computational Methods in Systems Biology. Pise, Italy, Sept. 2024, pp. 80–88.
[2] Philippe Dague. “Metabolic Pathway Analysis in the Presence of Biological Constraints”. In: Compu- tation 9(10):111 (2021).
[3] Komei Fukuda and Alain Prodon. “Double Description Method Revisited”. In: Combinatorics and Computer Science. Ed. by M. Deza, R. Euler, and I. Manoussakis. Vol. 1120. Lecture Notes in Computer Science. Springer, 1996, pp. 91–111.
[4] Matthias P. Gerstl, Christian Jungreuthmayer, and Jürgen Zanghellini. “tEFMA: computing thermo- dynamically feasible elementary flux modes in metabolic networks”. In: Bioinformatics 31(13) (2015), pp. 2232–2234.
[5] Matthias P. Gerstl, Stefan Müller, Georg Regensburger, and Jürgen Zanghellini. “Flux tope analysis: studying the coordination of reaction directions in metabolic networks”. In: Bioinformatics 35(2) (2019), pp. 266–273.
[6] Dean H. de Groot, Josephus Hulshof, Bas Teusink, Frank J. Bruggeman, and Robert Planqué. “Elementary Growth Modes provide a molecular description of cellular self-fabrication”. In: PLoS Comput Biol 16(1):e1007559 (2020).
[7] Christian Jungreuthmayer, David E. Ruckerbauer, Matthias P. Gerstl, Michael Hanscho, and Jürgen Zanghellini. “Avoiding the enumeration of infeasible elementary flux modes by including transcriptional regulatory rules in the enumeration process saves computational costs”. In: PLoS ONE 10(6):e0129840 (2015).
[8] Steffen Klamt et al. “From elementary flux modes to elementary flux vectors: metabolic pathway analysis with arbitrary linear flux constraints”. In: PLoS Comput Biol 13:e1005409 (2017).
[9] Jan Bert van Klinken and Ko Willems van Dijk. “FluxModeCalculator: an efficient tool for large-scale flux mode computation”. In: Bioinformatics 32(8) (2016), pp. 1265–1266.
[10] Francisco Llaneras and Jesús Picó. “Which Metabolic Pathways Generate and Characterize the Flux Space? A Comparison among Elementary Modes, Extreme Pathways and Minimal Generators”. In: Journal of Biomedicine and Biotechnology 2010 (2010), pp. 1–13.
[11] Maxime Mahout, Ross P. Carlson, and Sabine Peres. “Answer Set Programming for Computing Constraints-Based Elementary Flux Modes: Application to Escherichia coli Core Metabolism”. In: Processes 8(12) (2020).
[12] Cécile Moulin, Laurent Tournier, and Sabine Peres. “Combining Kinetic and Constraint-Based Modelling to Better Understand Metabolism Dynamics”. In: Processes 9(10) (2021).
[13] Stefan Müller. “Elementary growth modes/vectors and minimal autocatalytic sets for kinetic/constraint-
based models of cellular growth”. In: bioRxiv (2021). url: org/content/
early/2021/02/25/2021.02.24.432769.
[14] Stefan Müller and Georg Regensburger. “Elementary Vectors and Conformal Sums in Polyhedral Geometry and their Relevance for Metabolic Pathway Analysis”. In: Frontiers in Genetics 7(90) (2016).
[15] Stefan Müller, Diana Széliová, and Jürgen Zanghellini. “Elementary vectors and autocatalytic sets for resource allocation in next-generation models of cellular growth”. In: PLoS Comput Biol 18(2):e1009843 (2022).
[16] Sabine Peres, Stefan Schuster, and Philippe Dague. “Thermodynamic constraints for identifying the elementary flux modes”. In: Biochemical Society Transactions 46(3) (2018), pp. 641–647.
[17] Marco Terzer and Jörg Stelling. “Large-scale computation of elementary flux modes with bit pattern trees”. In: Bioinformatics 24(19) (2008), pp. 2229–2235.
[18] Cong T. Trinh, Aaron Wlaschin, and Friedrich Srienc. “Elementary mode analysis: a useful metabolic pathway analysis tool for characterizing cellular metabolism”. In: Applied Microbiology and Biotech- nology 81(5) (2009), pp. 813–826.
[19] Robert Urbanczik and Clemens Wagner. “An improved algorithm for stoichiometric network analysis: theory and applications”. In: Bioinformatics 21(7) (2005), pp. 1203–1210.
[20] Clemens Wagner and Robert Urbanczik. “The Geometry of the Flux Cone of a Metabolic Network”. In: Biophysical Journal 89(6) (2005), pp. 3837–3845.
[21] Jürgen Zanghellini, David E. Ruckerbauer, Michael Hanscho, and Christian Jungreuthmayer. “Elementary flux modes in a nutshell: properties, calculation and applications”. In: Biotechnology Journal 8(9) (2013), pp. 1009–1016.