Development of MICROBASE for Linking Large Microbial and Environmental Datasets

Author(s): Trivedi BA, Patel PN, Jani HJ and Ratna Trivedi

New analytical techniques in microbiology have created the potential to investigate microbial communities, their interactions, and their role in ecosystem functions in novel ways. Combinations of such techniques allow the rapid generation of large datasets describing microbial community composition and variation across time and space. In order to address ecologically relevant questions, microbial community datasets must be linked with related environmental datasets. This challenging task is made feasible in a rich data mining environment through a complex data model and interactive querying tools against a relational database. This paper discusses the motivation and design for one such microbial ecology database and the research questions made tractable by this informatics project. The design principles and data model used during database development are presented. The architecture that supports the progressive evolution of the informatics system is also discussed. Interaction with the user community was essential to data model development. This application is custom-designed for the needs and objectives of linking microbial and environmental questions, highlighting contributions from informatics to ecology. The bio-data model reflects the data mining regimes of the microbial disciplines and supports research questions that could not have been asked without such informatics tools. The project also serves as an illustrative case study in the design of data models and information systems, not only for microbial-environmental datasets but, in a broader perspective, for other biological databases that could adapt the techniques used here for data integration and mining.

