The COSMIC Simulator, genetic evolution in a box


This is the front page for the bacterial simulator known as COSMIC (COmputing Systems of Microbial InteraCtions). The aim of COSMIC is to simulate bacterial evolution from the genetic scale upwards while maintaining a population of individual cells. As a result of these broad aims, COSMIC has grown into a large piece of software that is unlike any other bacteria-inspired simulator. Some look at the scale of the population (BacSim) and some look inside the cell (E-Cell); COSMIC does both by being careful about what is actually simulated and by using the added power of parallel processing.

Evolution has frequently been seen as the result of the continuous or discontinuous accumulation of small mutations. In the many years since Darwin, it has been found that mutations are not the only mechanism driving genomic change. Plasmids, transposons, bacteriophages, insertion sequences, deletion and duplication, and stress-sensitive mutation all have a part to play in directing the genetic composition of organisms towards the moving target that is the environmental ideal at any one time. Given the low probability of single point mutations arising, and the repair mechanisms that may act to counteract their accumulation, it is unlikely that simple mutation alone can create rapid diversity. Evolutionary change appears to depend more on larger-scale changes in genomic sequences caused by sexual and other forms of horizontal gene transfer. These generate the variation necessary to allow a rapid evolutionary response to changing environmental conditions.

Predictive models of E. coli cellular processes already exist. The E-Cell project aims to use gene data directly in a mathematical model of transcription. The Virtual Cell project makes use of user-defined protein reactions to simulate compartments at the nucleus and cell level. Gepasi3 also models protein reactions, but within an enclosed box environment. The BacSim project simulates individual cell growth at the population scale. Eos also works at the population scale, but is intended as a framework for testing idealised ecologies, represented by evolutionary algorithms. These tools, and those that they rely on, are excellent models of behaviour. However, they share the same drawbacks: all rely on actual experimental data as input and, more importantly, once input that data is static. The aim of this study is to answer some of the questions regarding bacterial evolution and the role played by genetic events other than simple point mutation, using an evolving multicellular and multispecies model that builds up from the scale of the genome. In effect, it is not bacterial evolution alone that is being questioned, but the co-evolution of bacteria and any microorganism that has a direct effect on the genetics of bacteria.

To address these questions, it is necessary to build a model that tries to encompass what are considered the important qualities of bacterial evolution and bacterial life, but is not so tightly specified that it constrains the results. The model is therefore a careful balance of biological and computational realities, with an emphasis on open-endedness. The biological literature has many examples of the possible forms of mechanism within the relatively 'simple' example of E. coli, but even this must be carefully constrained, because computational models lack the computational power of real-world processes.

In focusing attention on aspects of the E. coli system, it is clear that there are two new insights provided by the merging disciplines of genomics and proteomics. Proteomics is the study of enzyme and protein interactions. Traditionally this meant differential equation models of interaction, but nowadays there also seems to be an implicit link with the application of protein descriptors derived from sequence information in identified genes, an application that has only recently become tractable with the arrival of accurate genome data. Genomics is the study of genome structure, interaction and encoding, and has been stimulated by the Human Genome project as well as whole-genome sequencing projects for many other organisms, notably those for numerous bacteria. It now appears that the genome is not a book that is continually read from, but a program that is continually executed over the lifetime of individual cells, tissues or entire organisms. From this it appears that interactions within cells involve the combined effects of enzymes and of structural and regulatory proteins acting on genes, which in turn act on those enzymes and other proteins, creating a huge number of both positive and negative feedback loops necessary for controlled execution. The ideal model is therefore one that takes both of these stages into account and allows for the evolution of the genome in the presence of other genomes, each genome being an implementation of what many conceive as the computational cell.
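To make the idea of a protein acting back on its own gene concrete, the following is a minimal sketch of a single negative feedback loop, written in the traditional differential-equation style mentioned above. It is not taken from COSMIC: the function name, the rate law and all parameter values (`k_max`, `K`, `decay`) are assumptions chosen purely for illustration.

```python
def simulate_autorepression(steps=200, dt=0.1, k_max=1.0, K=0.5, decay=0.2):
    """Euler integration of dp/dt = k_max / (1 + p / K) - decay * p.

    p is the (hypothetical) concentration of a protein that represses
    the gene encoding it. All parameters are illustrative assumptions.
    """
    p = 0.0
    trace = [p]
    for _ in range(steps):
        # Production falls as the protein accumulates: the feedback step.
        production = k_max / (1.0 + p / K)
        p += dt * (production - decay * p)
        trace.append(p)
    return trace


if __name__ == "__main__":
    trace = simulate_autorepression()
    print(f"protein level after {len(trace) - 1} steps: {trace[-1]:.3f}")
```

Without the feedback term the protein would settle at `k_max / decay`; with it, the level stabilises well below that value, which is the kind of controlled execution the paragraph describes.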

Looking at this another way, the main aim of COSMIC is evolutionary modelling based on biologically realistic organisms. Evolution requires something akin to a genome, and the argument above shows that a genome by itself is inert and is only part of the overall mechanism. There are therefore three themes to the COSMIC model: the environment, the genome, and functional proteins, including enzymes.
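One way to picture how the three themes fit together is the toy sketch below. It is only an illustration under stated assumptions, not COSMIC's actual data model: the class names, the single abstract nutrient, the `uptake_enzyme` protein and the `uptake_per_enzyme` rate are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    nutrient: float  # abstract amount of one shared nutrient (assumption)


@dataclass
class Gene:
    product: str         # name of the protein this gene encodes
    active: bool = True  # whether the gene is currently expressed


@dataclass
class Cell:
    genome: list                                  # list of Gene objects
    proteins: dict = field(default_factory=dict)  # protein name -> copy count
    energy: float = 0.0

    def express(self):
        """Turn each active gene into one more copy of its protein."""
        for gene in self.genome:
            if gene.active:
                self.proteins[gene.product] = self.proteins.get(gene.product, 0) + 1

    def feed(self, env, uptake_per_enzyme=0.5):
        """Take up nutrient in proportion to the hypothetical uptake enzyme."""
        wanted = uptake_per_enzyme * self.proteins.get("uptake_enzyme", 0)
        taken = min(wanted, env.nutrient)
        env.nutrient -= taken
        self.energy += taken
        return taken
```

In this sketch a cell whose genome has not yet been expressed takes nothing from the environment, which mirrors the point that a genome on its own is inert; only once `express()` has produced proteins does `feed()` change the environment.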