RIT Digital Archive

Connected in a small world: Rapid integration of heterogenous biology resources

dc.contributor.author Park, Sang P.
dc.contributor.author Song, Carol X.
dc.contributor.author Topkara, Umut
dc.contributor.author Woo, Jungha
dc.date.accessioned 2008-11-05T17:10:17Z
dc.date.available 2008-11-05T17:10:17Z
dc.date.issued 2006
dc.identifier.uri http://hdl.handle.net/1850/7338
dc.description.abstract Timely access to the most up-to-date versions of resources, such as data and software, is of paramount importance for researchers in an active field like biology. We introduce SALSA (a Scalable Simple Architecture), a grid-enabled portal architecture for collections of biological data and software, tailored to the fast integration of new computational resources produced by rapidly advancing and diversifying research in this area. Two models guide the design of SALSA: the heterogeneous database model and the network growth model with preferential attachment. SALSA recognizes the challenges, noted by previous research on the heterogeneous database model, that are inherent in biological database resources: these resources are autonomously managed and lack a common database schema. SALSA is also guided by a model for the growth of the portal’s collection (data and the associated software that processes it), drawn from previous research on related collections (e.g., citation networks and software package dependencies). This model suggests that, in the presence of components that have a higher likelihood of gaining new connections (e.g., popular resources such as BLAST or FASTA sequences), the relationships between components tend to organize into a small-world, scale-free network. The growth model helps portal developers identify important hub components, which emerge by taking part in an increasing number of tasks as the portal grows. To improve the overall user experience effectively, developers can direct expensive development efforts (e.g., query optimization, user interface, documentation) to hub components rather than to specialized components that are less likely to become hubs. In this paper we discuss a grid-enabled web portal implementation built to hold a growing collection of biological data and the software to process it.
The implementation that we present is a realization of the Scalable Simple Architecture (SALSA) and strives to rapidly integrate newly published components into the existing collection in a sustainable fashion. Notably, this implementation uses the flexibility of XML for component management, XSL for the web user interface, and SRB and MCAT for large data storage.
dc.description.sponsorship ACM, IEEE en_US
dc.language.iso en
dc.relation RIT Scholars content from RIT Digital Media Library has moved from http://ritdml.rit.edu/handle/1850/7338 to RIT Scholar Works http://scholarworks.rit.edu/article/976, please update your feeds & links!
dc.subject Heterogeneous database model en_US
dc.subject Network growth model en_US
dc.subject Preferential attachment en_US
dc.subject Scalable simple architecture en_US
dc.title Connected in a small world: Rapid integration of heterogenous biology resources
dc.type Preprint
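
The growth model named in the abstract (preferential attachment leading to hub components) can be illustrated with a minimal sketch. This is not the SALSA implementation; it is a generic Barabasi-Albert-style simulation, and the function names `grow_network` and `top_hubs`, the parameter `links_per_new`, and the use of integer ids for components are all assumptions made for illustration.

```python
import random
from collections import Counter


def grow_network(n_components, links_per_new=2, seed=0):
    """Grow an undirected component graph by preferential attachment.

    Hypothetical illustration: each new component links to `links_per_new`
    distinct existing components, chosen with probability proportional to
    their current number of links.
    """
    rng = random.Random(seed)
    # Start from a small, fully connected core of components.
    core = links_per_new + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # A node appears in `endpoints` once per edge it touches, so a uniform
    # draw from this list selects a node proportionally to its degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, n_components):
        targets = set()
        while len(targets) < links_per_new:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def top_hubs(edges, k=5):
    """Return the k components with the most links as (id, degree) pairs."""
    degree = Counter(node for edge in edges for node in edge)
    return degree.most_common(k)


if __name__ == "__main__":
    network = grow_network(200, links_per_new=2, seed=1)
    for component, degree in top_hubs(network, k=5):
        print(f"component {component}: {degree} links")
```

In a run like this, a few early components typically accumulate far more links than the rest, which is the kind of hub the abstract suggests prioritizing for development effort.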

This item appears in the following Collection(s)

  • International Workshop on Grid Computing Environments--Open Access (2006)
    The purpose of this Workshop is to bring together the community to discuss and foster the exchange of ideas for the development of tools, methodologies and frameworks to build Grid computing environments. Last year’s conference at SC05 focussed on tools for Grid Portals. This year’s conference will broaden the discussion to include other tools that make the development and use of Grids more easily accessible from the desktop.
