Assignment of orthologous genes by utilization of multiple databases: the orthology package in R
Abstract
The assignment of orthologous genes between species is a key issue in multi-species studies. It has become even more relevant in recent years with the development of high-throughput genome sequencing technologies, which provide rapid and cost-effective access to complete genomes. In this paper, we present a new software package that gives the user easy, fast, and flexible access to orthology relationships across multiple species. The tool collects data from three prominent, freely available databases and presents them to the user in a convenient, easily accessible form. Once the package is installed, the software runs on the local computer, thereby avoiding the runtime delays caused by network traffic, which are often a critical performance bottleneck when large datasets are studied or many organisms are investigated simultaneously. By consistently using unique identifiers internally, the software relieves the user of problems arising from synonyms and ambiguous gene names, which often hamper a clear-cut assignment of orthologs. The software can also visualize the complicated many-to-many orthology relationships that occur frequently. It is written in the R programming language and is freely available.