Software and Online ToolsPanel Assignment Problem Online SolverJanak, Taylor, Floudas, Burka, Mountziaris The panel assignment problem can be defined as follows. Given:
In addition, there are several requirements that must be met as follows:
Note that the panel assignment problem can be formulated as a linear integer programming problem. For more information, refer to Janak et al. (2006). ZEOMICSFirst, Gounaris, Wei, Floudas Zeolites are crystalline aluminosilicate materials with microporous structure commonly used for catalysis, adsorption, and ion exchange. Though zeolites have found extensive use in industry due to their molecular shape selectivity, fundamental understanding of zeolites is limited by difficulty with rigorous experimental study. ZEOMICS is a computational approach for the automated characterization of zeolites and other microporous materials. In addition to offering access to an online database of structure characterizations, ZEOMICS provides users with the ability to upload a structure of interest to be processed by our methods, which include:
For more details, please refer to the ZEOMICS characterization steps and associated publications. MOFomicsFirst, Floudas Metalorganic frameworks (MOFs) are highly porous materials suitable for applications including gas separation, gas storage, and catalysis. MOFomics is a novel computational framework for threedimensional MOF pore characterization. It is the first computational method that automatically identifies portals, channels, cages, and their geometry and connectivity. We also calculate quantities of interest including pore size distribution, accessible volume, accessible surface area, largest cavity diameter, and pore limiting diameter. Our web tool MOFomics features colorful visual displays of MOF pores for an extensive database of structures. Furthermore, users can submit additional structures of interest to be automatically characterized by computational framework. For more information, please refer to the MOFomics characterization steps. Princeton_TIGRESS: Protein geometry refinement using simulations and support vector machinesKhoury, Tamamis, Pinnaduwage, Smadbeck, Kieslich, Floudas Protein structure refinement aims to perform a set of operations given a predicted structure to improve model quality and accuracy with respect to a native structure in a blind fashion. Despite the numerous computational approaches to the refinement problem reported in CASPs 8,9, and 10, the overwhelming majority of methods degrade models rather than improve them. To address this challenge, we have created Princeton_TIGRESS, which when benchmarked on all CASP 7,8,9, and 10 refinement targets, simultaneously increased GDT_TS 76% of the time with an average improvement of 0.83 GDT_TS points per structure. The precursor to Princeton_TIGRESS was ranked in 5th place in the refinement category of CASP10. The method was additionally benchmarked on models produced by top performing threedimensional structure prediction servers during CASP10. Using this tool, a user can submit a model of a protein structure generated through templatebased methods with confidence that the output structure will be more like the native. ANTIGONEMisener, Floudas Deterministic global optimization of mixedinteger nonlinear programs (MINLP) is broadly applicable in diverse domains ranging from molecular biology to refinery operations to computational chemistry to synthesizing sustainable processes. ANTIGONE (Algorithms for coNTinuous / Integer Global Optimization of Nonlinear Equations) is a computational framework for globally solving mixedinteger nonlinear optimization problems. ANTIGONE is available through Princeton University and the General Algebraic Modeling System (GAMS (beginning distribution 24.1). GloMIQOMisener, Floudas Major applications of mixedinteger quadraticallyconstrained quadratic programs (MIQCQP) include quality blending in process networks, separating objects in computational geometry, and portfolio optimization in finance. Specific instantiations of MIQCQP in process networks optimization problems include: pooling problems, distillation sequences, wastewater treatment and total water systems, hybrid energy systems, heat exchanger networks, reactorseparatorrecycle systems, separation systems, data reconciliation, batch processes, crude oil scheduling, and natural gas production. Computational geometry problems formulated as MIQCQP include: point packing, cutting convex shapes from rectangles, maximizing the area of a convex polygon, and chip layout and compaction. Portfolio optimization in financial engineering can also be formulated as MIQCQP. GloMIQO (Global MixedInteger Quadratic Optimizer) is a numerical solver addressing MIQCQP to εglobal optimality. GloMIQO is available through Princeton University and the General Algebraic Modeling System (GAMS; now available). APOGEEMisener, Thompson, Floudas The pooling problem, an optimization challenge of maximizing profit subject product availability, storage capacity, demand, and product specification constraints, has applications to petroleum refining, wastewater treatment, supplychain operations, and communications. The problem involves optimally selecting flowrates on a network topology of input, intermediate, and output nodes that represent feedstocks, storage tanks (or pools), and products, respectively (Misener and Floudas, 2009). The topology may be fixed (standard and extended problems) or treated as a decision variable (generalized problem). APOGEE (Algorithms for Poolingproblem Optimization in GEneralized and Extended classes) is a generic computational tool for globally optimizing all three classes of pooling problems:
For more information and to download the client software, please visit the APOGEE webpage. Forcefield_PTMKhoury, Thompson, Smadbeck, Kieslich, Floudas Posttranslational modifications are ubiquitous in biology, but our knowledge of their interactions at an atomic level is severely limited by the lack of suitable parameters to enable their simulation for both protein folding and protein design applications. Forcefield_PTM is an AMBER forcefield compatible with ff03. It is the first forcefield for frequently occurring posttranslational modifications that is based on quantum chemical calculations. The associated homepage contains an interactive online utility to derivatize PDBformat structures with one or more posttranslational modifications contained in the forcefield. It outputs the modified structure and the appropriate input files to perform additional calculations locally on a user's local machine using AMBER. The forcefield parameters can be downloaded from the homepage and can be directly imported into AMBER, with instructions for use. Forcefield_NCAAKhoury, Smadbeck, Tamamis, Vandris, Kieslich, Floudas The goal of drug discovery is to design new compounds as disease therapeutics. The discovery of such molecules is challenging and costly due to the large combinatorial space that must be tested experimentally. Here, we present Forcefield_NCAA, a library of AMBER charge parameters derived from highlevel quantum chemical calculations for 147 noncanonical amino acids (NCAAs) to aid in this endeavor. The charge parameters were derived using B3LYP/ccpVTZ//HF/631G** calculations in an diethyletherlike implicit solvent environment and have been developed to work with AMBER ff03. The Forcefield_NCAA webtool allows for the interactive modification of PDB structures with any combination of modifications contained in the AMBER or Forcefield_NCAA libraries. These NCAAs expand the searchable chemical space in computational protein design. Forcefield_NCAA with AMBER can be used to perform molecular dynamics simulations and subsequent screening through binding studies to eliminate the need to perform brute force experimental testing. ProteomeWide PostTranslational Modification StatisticsKhoury, Baliban, Floudas We have developed an automated computational method for quantifying the number of posttranslational modifications reported experimentally and nonexperimentally in the SwissProt Knowledgebase. Nonexperimental qualifiers (Potential, Probable, By Similarity) are consistent with those defined in SwissProt. This method will automatically load the new PTM IDs, curate them, and quantify this information every month as the SwissProt Knowledgebase is updated. It was created with the intention of being used atlarge as a continuously updated resource for use by the academic community. The raw and summarized statistics are of special interest to those pursuing research in the fields of Systems Biology, Proteomics, Protein Engineering, Molecular Biology. For more details, please refer to the algorithmic steps and associated publications. DREAMFirst, Gounaris, Floudas Reaction mappings are onetoone mappings between the reactant and product atoms in a chemical reaction. Such mappings correspond to potential chemical reaction mechanisms. DREAM (Determination of REAction Mechanisms) is an automated computational method to identify optimal reaction mappings that minimize either the number of bond changes or number of bond order changes in the corresponding reaction mechanisms. In contrast to other reaction mapping methods, DREAM is consistent with respect to stereochemistry and also takes hydrogen atoms into account. Additionally, DREAM can be used to locate multiple distinct mechanisms for a chemical reaction. Note that the reaction mapping problem can be formulated as a linear integer optimization problem. For more information, please refer to First et al. (2012). ICONSubramani, DiMaggio, Floudas ICON is an iterative, traveling salesman problem (TSP)based clustering method for identifying nearnative protein structures from an ensemble of conformers. Clustering of structures is carried out in the dihedral angle space and follows the TSP implementation of the rigorous global rearrangement approach OREO (DiMaggio et al., 2008). The method consists of an iterative procedure, which aims at eliminating clusters of structures at each iteration that are unlikely to be of similar fold to the native, based on a statistical analysis of cluster density and average spherical radius. At the end of the iterative clustering procedure, the structures most likely to be close to the putative native structure are selected. To do this, the medoids of the densest clusters at the end of the final stage are collected. For each of these cluster mediods, novel C_{α}C_{α} and centroidcentroid distancedependent, highresolution force fields (Rajgaria et al., 2006; Rajgaria et al., 2008) are implemented. These force fields aim to isolate native and nearnative folds of a protein as lower energy structures, compared to structures further away from the native structure. The five cluster medoids with the lowest energies from each force field are selected as the top structures. For more information, refer to Subramani et al. (2009). OREODiMaggio, McAllister, Subramani, Floudas OREO is an iterative framework for biclustering dense and sparse data matrices via Optimal REOrdering of rows and columns. Given an input data matrix and a chosen distance metric, the algorithm provides the globally optimal reordering of the rows and columns of the matrix For more information, refer to DiMaggio et al. (2008), DiMaggio et al. (2008), and McAllister et al. (2009). BeSTSubramani, Floudas BeST is a mixedinteger linear optimization based approach towards betasheet prediction in beta and mixed alpha/beta proteins. The objective is to maximize the total strandtostrand contact potential of the protein. A large number of physical constraints are applied to provide biologically meaningful topology results. APROSPaules, Ciric, Aggarwal, Kokossis, Floudas APROS (Algorithms for PROcess Synthesis) is an algorithmic development methodology for classes of mathematical programming problems that require application of some form of decomposition technique and involve extensive communication of data between a set of subproblems whose sizes and structures may vary during an iterative solution procedure. Typical examples of such algorithms include most classes of mixedinteger nonlinear programming (MINLP) problems, variable partitioning algorithms that result in sequences of subproblems, largescale mixedinteger linear programming (MILP) problems, as well as a wide variety of algorithms for largescale MINLP, NLP, and LP systems exhibiting special structure. The application of APROS to MINLP problems has a number of key features that are common to most decomposition algorithms. It should be noted that it is the generality of these features and their application to other classes of problems and algorithms that make APROS a very promising algorithmic development tool. These features include:
MINOPTSchweiger, Floudas MINOPT is modeling language and algorithmic framework for the solution of various types of optimization problems. The types of problems MINOPT is capable of solving are described by the types of variables and constraints used in the model. MINOPT handles continuous time invariant, continuous dynamic, and integer variables and it recognizes linear, nonlinear, dynamic, dynamic point, and dynamic path constraints. The types of models MINOPT is capable of addressing are
The MINOPT code is written in ANSI C and has been developed with an expandable platform. Additional solvers and algorithms can be added to the program as they become available. The modeling language can be expanded to recognize additional variable types, constraint types, and commands should they be needed. For more information about MINOPT, including how to obtain the program, visit the MINOPT Home Page. GLOPEQMcDonald, Floudas GLOPEQ (GLobal Optimization for the Phase and chemical EQuilibrium problem) is a package written entirely in C for the computation of equilibrium solutions corresponding to a global minimum of the Gibbs free energy function. It can be used for systems of liquid phases whose behavior can be adequately modeled by any of the following equations:
GLOPEQ can be used to solve either of the following two problems:
cGOPVisweswaran, Floudas cGOP is a package for rigorously solving nonconvex optimization problems to global optimality. The package implements the GOP algorithm using a set of C subroutines to solve these problems using decomposition and branchandbound techniques. It also incorporates several improvements made to the original GOP algorithm to reduce the computational complexity, as well as new formulations that permit implicit solutions of some of the subproblems encountered during the algorithmic steps. The algorithms use local optimization solvers (currently MINOS and CPLEX) to solve linear, mixedinteger linear and convex subproblems. Currently, the package can be used to solve problems with linear constraints. The original algorithm and its derivatives can be accessed by calls to highlevel subroutines as well as through a standalone mode for quadratic problems. Furthermore, it can also be accessed using a highlevel interface that permits easy description of the problems. cGOP is designed as a library of subroutines that can be called from any high level programming language. It has been written in portable ANSI C, and is currently available for the HP 9000, IBM RS6000, SUN4 and Silicon Graphics architectures. There are no restrictions on the sizes of problems that can be solved; the algorithms are only limited by the available memory and CPU resources on the machine on which the software is run. cGOP can currently be used to solve problems of the following form: where x and y are n_{1} and n_{2} vectors of variables, c, d, x^{L}, x^{U}, y^{L}, and y^{U} are constant vectors, Q is a constant matrix and can be indefinite, and the functions F_{1}(x) and F_{2}(y) are smooth convex scalar functions. Note that some (or all) of the y variables can be binary variables; in this case, the function F_{2}(y) must be linear. The cGOP algorithm is summarized as follows: at each node of the branchandbound tree, a primal problem is solved in the x variables to provide an upper bound on the global solution. The solution of this problem is used to construct a Lagrange function that underestimates the optimal solution of the original problem in the current domain of interest. The gradients of the Lagrange function are then used as a basis for partitioning this domain into subdomains. Subsequently, in each subdomain, a linear or convex lower bounding problem in y is solved. The solutions to these problems provide a set of "children" nodes for the incumbent node, and also provide a lower bound for the global solution. One of these children nodes is then selected as a candidate node for further exploration, and the procedure is repeated. Due to accumulation of cuts from previous iterations, the upper and lower bounds are successively tightened until the algorithm converges to the global solution. You may download a betatest version of cGOP as a 1.4 MB compressed tar file, or a 1.0 MB GNUzipped tar file. The online version of the user's manual for cGOP (113 K compressed PostScript file) and the manual for cparse (50 K PostScript file) are also available. αBBAdjiman, Androulakis, Maranas, Floudas αBB is a global optimization algorithm for general constrained nonconvex optimization problems, with twicedifferentiable functions. It combines a novel convex lower bounding procedure within a branch and bound framework. General nonconvex terms are underestimated through quadratic functions derived from secondorder information. In addition, a number of special underestimators have been incorporated into the package in order to take full advantage of any known mathematical properties of the functions. The aim was to develop a userfriendly environment that allows the efficient optimization of twice differentiable NLP's while maintaining its simplicity and flexibility. Thus, when preparing the input file, the user can decide to use the very general quadratic underestimators, to use builtin special underestimators or to specify his/her own lower bounding functions. The structure of the algorithm allows for the following features:
