Software and Online Tools
Janak, Taylor, Floudas, Burka, Mountziaris
The panel assignment problem can be defined as follows. Given:
In addition, there are several requirements that must be met as follows:
Note that the panel assignment problem can be formulated as a linear integer programming problem. For more information, refer to Janak et al. (2006).
First, Gounaris, Wei, Floudas
Zeolites are crystalline aluminosilicate materials with microporous structure commonly used for catalysis, adsorption, and ion exchange. Though zeolites have found extensive use in industry due to their molecular shape selectivity, fundamental understanding of zeolites is limited by difficulty with rigorous experimental study. ZEOMICS is a computational approach for the automated characterization of zeolites and other microporous materials. In addition to offering access to an online database of structure characterizations, ZEOMICS provides users with the ability to upload a structure of interest to be processed by our methods, which include:
For more details, please refer to the ZEOMICS characterization steps and associated publications.
Metal-organic frameworks (MOFs) are highly porous materials suitable for applications including gas separation, gas storage, and catalysis. MOFomics is a novel computational framework for three-dimensional MOF pore characterization. It is the first computational method that automatically identifies portals, channels, cages, and their geometry and connectivity. We also calculate quantities of interest including pore size distribution, accessible volume, accessible surface area, largest cavity diameter, and pore limiting diameter. Our web tool MOFomics features colorful visual displays of MOF pores for an extensive database of structures. Furthermore, users can submit additional structures of interest to be automatically characterized by computational framework.
For more information, please refer to the MOFomics characterization steps.
Khoury, Tamamis, Pinnaduwage, Smadbeck, Kieslich, Floudas
Protein structure refinement aims to perform a set of operations given a predicted structure to improve model quality and accuracy with respect to a native structure in a blind fashion. Despite the numerous computational approaches to the refinement problem reported in CASPs 8,9, and 10, the overwhelming majority of methods degrade models rather than improve them. To address this challenge, we have created Princeton_TIGRESS, which when benchmarked on all CASP 7,8,9, and 10 refinement targets, simultaneously increased GDT_TS 76% of the time with an average improvement of 0.83 GDT_TS points per structure. The precursor to Princeton_TIGRESS was ranked in 5th place in the refinement category of CASP10. The method was additionally benchmarked on models produced by top performing three-dimensional structure prediction servers during CASP10. Using this tool, a user can submit a model of a protein structure generated through template-based methods with confidence that the output structure will be more like the native.
Deterministic global optimization of mixed-integer nonlinear programs (MINLP) is broadly applicable in diverse domains ranging from molecular biology to refinery operations to computational chemistry to synthesizing sustainable processes.
ANTIGONE (Algorithms for coNTinuous / Integer Global Optimization of Nonlinear Equations) is a computational framework for globally solving mixed-integer nonlinear optimization problems.
Major applications of mixed-integer quadratically-constrained quadratic programs (MIQCQP) include quality blending in process networks, separating objects in computational geometry, and portfolio optimization in finance. Specific instantiations of MIQCQP in process networks optimization problems include: pooling problems, distillation sequences, wastewater treatment and total water systems, hybrid energy systems, heat exchanger networks, reactor-separator-recycle systems, separation systems, data reconciliation, batch processes, crude oil scheduling, and natural gas production. Computational geometry problems formulated as MIQCQP include: point packing, cutting convex shapes from rectangles, maximizing the area of a convex polygon, and chip layout and compaction. Portfolio optimization in financial engineering can also be formulated as MIQCQP.
GloMIQO (Global Mixed-Integer Quadratic Optimizer) is a numerical solver addressing MIQCQP to ε-global optimality. GloMIQO is available through Princeton University and the General Algebraic Modeling System (GAMS; now available).
Misener, Thompson, Floudas
The pooling problem, an optimization challenge of maximizing profit subject product availability, storage capacity, demand, and product specification constraints, has applications to petroleum refining, wastewater treatment, supply-chain operations, and communications. The problem involves optimally selecting flowrates on a network topology of input, intermediate, and output nodes that represent feedstocks, storage tanks (or pools), and products, respectively (Misener and Floudas, 2009). The topology may be fixed (standard and extended problems) or treated as a decision variable (generalized problem).
APOGEE (Algorithms for Pooling-problem Optimization in GEneralized and Extended classes) is a generic computational tool for globally optimizing all three classes of pooling problems:
For more information and to download the client software, please visit the APOGEE webpage.
Khoury, Thompson, Smadbeck, Kieslich, Floudas
Post-translational modifications are ubiquitous in biology, but our knowledge of their interactions at an atomic level is severely limited by the lack of suitable parameters to enable their simulation for both protein folding and protein design applications. Forcefield_PTM is an AMBER forcefield compatible with ff03. It is the first forcefield for frequently occurring post-translational modifications that is based on quantum chemical calculations. The associated homepage contains an interactive online utility to derivatize PDB-format structures with one or more post-translational modifications contained in the forcefield. It outputs the modified structure and the appropriate input files to perform additional calculations locally on a user's local machine using AMBER. The forcefield parameters can be downloaded from the homepage and can be directly imported into AMBER, with instructions for use.
Khoury, Smadbeck, Tamamis, Vandris, Kieslich, Floudas
The goal of drug discovery is to design new compounds as disease therapeutics. The discovery of such molecules is challenging and costly due to the large combinatorial space that must be tested experimentally. Here, we present Forcefield_NCAA, a library of AMBER charge parameters derived from high-level quantum chemical calculations for 147 non-canonical amino acids (NCAAs) to aid in this endeavor. The charge parameters were derived using B3LYP/cc-pVTZ//HF/6-31G** calculations in an diethylether-like implicit solvent environment and have been developed to work with AMBER ff03. The Forcefield_NCAA webtool allows for the interactive modification of PDB structures with any combination of modifications contained in the AMBER or Forcefield_NCAA libraries. These NCAAs expand the searchable chemical space in computational protein design. Forcefield_NCAA with AMBER can be used to perform molecular dynamics simulations and subsequent screening through binding studies to eliminate the need to perform brute force experimental testing.
Khoury, Baliban, Floudas
We have developed an automated computational method for quantifying the number of post-translational modifications reported experimentally and non-experimentally in the Swiss-Prot Knowledgebase. Non-experimental qualifiers (Potential, Probable, By Similarity) are consistent with those defined in Swiss-Prot. This method will automatically load the new PTM IDs, curate them, and quantify this information every month as the Swiss-Prot Knowledgebase is updated. It was created with the intention of being used at-large as a continuously updated resource for use by the academic community. The raw and summarized statistics are of special interest to those pursuing research in the fields of Systems Biology, Proteomics, Protein Engineering, Molecular Biology.
For more details, please refer to the algorithmic steps and associated publications.
First, Gounaris, Floudas
Reaction mappings are one-to-one mappings between the reactant and product atoms in a chemical reaction. Such mappings correspond to potential chemical reaction mechanisms. DREAM (Determination of REAction Mechanisms) is an automated computational method to identify optimal reaction mappings that minimize either the number of bond changes or number of bond order changes in the corresponding reaction mechanisms. In contrast to other reaction mapping methods, DREAM is consistent with respect to stereochemistry and also takes hydrogen atoms into account. Additionally, DREAM can be used to locate multiple distinct mechanisms for a chemical reaction.
Note that the reaction mapping problem can be formulated as a linear integer optimization problem. For more information, please refer to First et al. (2012).
Subramani, DiMaggio, Floudas
ICON is an iterative, traveling salesman problem (TSP)-based clustering method for identifying near-native protein structures from an ensemble of conformers. Clustering of structures is carried out in the dihedral angle space and follows the TSP implementation of the rigorous global rearrangement approach OREO (DiMaggio et al., 2008). The method consists of an iterative procedure, which aims at eliminating clusters of structures at each iteration that are unlikely to be of similar fold to the native, based on a statistical analysis of cluster density and average spherical radius.
At the end of the iterative clustering procedure, the structures most likely to be close to the putative native structure are selected. To do this, the medoids of the densest clusters at the end of the final stage are collected. For each of these cluster mediods, novel Cα-Cα and centroid-centroid distance-dependent, high-resolution force fields (Rajgaria et al., 2006; Rajgaria et al., 2008) are implemented. These force fields aim to isolate native and near-native folds of a protein as lower energy structures, compared to structures further away from the native structure. The five cluster medoids with the lowest energies from each force field are selected as the top structures.
For more information, refer to Subramani et al. (2009).
DiMaggio, McAllister, Subramani, Floudas
OREO is an iterative framework for biclustering dense and sparse data matrices via Optimal RE-Ordering of rows and columns. Given an input data matrix and a chosen distance metric, the algorithm provides the globally optimal re-ordering of the rows and columns of the matrix
BeST is a mixed-integer linear optimization based approach towards beta-sheet prediction in beta and mixed alpha/beta proteins. The objective is to maximize the total strand-to-strand contact potential of the protein. A large number of physical constraints are applied to provide biologically meaningful topology results.
Paules, Ciric, Aggarwal, Kokossis, Floudas
APROS (Algorithms for PROcess Synthesis) is an algorithmic development methodology for classes of mathematical programming problems that require application of some form of decomposition technique and involve extensive communication of data between a set of subproblems whose sizes and structures may vary during an iterative solution procedure. Typical examples of such algorithms include most classes of mixed-integer nonlinear programming (MINLP) problems, variable partitioning algorithms that result in sequences of subproblems, large-scale mixed-integer linear programming (MILP) problems, as well as a wide variety of algorithms for large-scale MINLP, NLP, and LP systems exhibiting special structure.
The application of APROS to MINLP problems has a number of key features that are common to most decomposition algorithms. It should be noted that it is the generality of these features and their application to other classes of problems and algorithms that make APROS a very promising algorithmic development tool. These features include:
MINOPT is modeling language and algorithmic framework for the solution of various types of optimization problems. The types of problems MINOPT is capable of solving are described by the types of variables and constraints used in the model. MINOPT handles continuous time invariant, continuous dynamic, and integer variables and it recognizes linear, nonlinear, dynamic, dynamic point, and dynamic path constraints. The types of models MINOPT is capable of addressing are
The MINOPT code is written in ANSI C and has been developed with an expandable platform. Additional solvers and algorithms can be added to the program as they become available. The modeling language can be expanded to recognize additional variable types, constraint types, and commands should they be needed.
For more information about MINOPT, including how to obtain the program, visit the MINOPT Home Page.
GLOPEQ (GLobal Optimization for the Phase and chemical EQuilibrium problem) is a package written entirely in C for the computation of equilibrium solutions corresponding to a global minimum of the Gibbs free energy function. It can be used for systems of liquid phases whose behavior can be adequately modeled by any of the following equations:
GLOPEQ can be used to solve either of the following two problems:
cGOP is a package for rigorously solving nonconvex optimization problems to global optimality. The package implements the GOP algorithm using a set of C subroutines to solve these problems using decomposition and branch-and-bound techniques. It also incorporates several improvements made to the original GOP algorithm to reduce the computational complexity, as well as new formulations that permit implicit solutions of some of the subproblems encountered during the algorithmic steps. The algorithms use local optimization solvers (currently MINOS and CPLEX) to solve linear, mixed-integer linear and convex subproblems. Currently, the package can be used to solve problems with linear constraints. The original algorithm and its derivatives can be accessed by calls to high-level subroutines as well as through a standalone mode for quadratic problems. Furthermore, it can also be accessed using a high-level interface that permits easy description of the problems.
cGOP is designed as a library of subroutines that can be called from any high level programming language. It has been written in portable ANSI C, and is currently available for the HP 9000, IBM RS6000, SUN4 and Silicon Graphics architectures. There are no restrictions on the sizes of problems that can be solved; the algorithms are only limited by the available memory and CPU resources on the machine on which the software is run.
cGOP can currently be used to solve problems of the following form: where x and y are n1 and n2 vectors of variables, c, d, xL, xU, yL, and yU are constant vectors, Q is a constant matrix and can be indefinite, and the functions F1(x) and F2(y) are smooth convex scalar functions. Note that some (or all) of the y variables can be binary variables; in this case, the function F2(y) must be linear.
The cGOP algorithm is summarized as follows: at each node of the branch-and-bound tree, a primal problem is solved in the x variables to provide an upper bound on the global solution. The solution of this problem is used to construct a Lagrange function that underestimates the optimal solution of the original problem in the current domain of interest. The gradients of the Lagrange function are then used as a basis for partitioning this domain into subdomains. Subsequently, in each subdomain, a linear or convex lower bounding problem in y is solved. The solutions to these problems provide a set of "children" nodes for the incumbent node, and also provide a lower bound for the global solution. One of these children nodes is then selected as a candidate node for further exploration, and the procedure is repeated. Due to accumulation of cuts from previous iterations, the upper and lower bounds are successively tightened until the algorithm converges to the global solution.
You may download a beta-test version of cGOP as a 1.4 MB compressed tar file, or a 1.0 MB GNU-zipped tar file. The online version of the user's manual for cGOP (113 K compressed PostScript file) and the manual for cparse (50 K PostScript file) are also available.
Adjiman, Androulakis, Maranas, Floudas
αBB is a global optimization algorithm for general constrained nonconvex optimization problems, with twice-differentiable functions. It combines a novel convex lower bounding procedure within a branch and bound framework. General nonconvex terms are underestimated through quadratic functions derived from second-order information. In addition, a number of special underestimators have been incorporated into the package in order to take full advantage of any known mathematical properties of the functions.
The aim was to develop a user-friendly environment that allows the efficient optimization of twice differentiable NLP's while maintaining its simplicity and flexibility. Thus, when preparing the input file, the user can decide to use the very general quadratic underestimators, to use built-in special underestimators or to specify his/her own lower bounding functions.
The structure of the algorithm allows for the following features: