The BioQuery application is designed to make it as simple as possible to add databases to the list of accessible data sources. This is possible because the Query package is a framework that models a query but does not know the database-specific details of how to submit it. By creating one Java file and modifying one text file, any moderately skilled Java programmer can add a new database to this application. The existing code does not need to be modified or recompiled.
To add a new database, follow these steps:
1. Create a subclass of QuerySubmitter. Make sure the
abstract methods are implemented. A QuerySubmitter class should do 3 things
when given a search string:
2. Add an entry for your database to querys.xml. This will
include the database name, result formats, and searchable fields. Also include the
full class name of your new QuerySubmitter class. See the file
querys.xml.
3. Compile your QuerySubmitter class and put the class file in the bioquery
directory of your client application. When you run the client, the new database should be available.
If you want to receive automatic updates from your new database, you must also update
the querys.xml file and add your new class file to the server. If you're
using our server, send us your new Query and we'll put it up.
The Query framework is designed to construct boolean searches. For example:
(calcium[TITLE] AND calmodulin[TITLE]) OR (kinase[ALL] AND atpase[ALL]). Any new
QuerySubmitter class should be able to translate this into a format the database
can understand. This may involve standardizing the boolean operators, taking out spaces, making
joins to produce the same effect as the parentheses, or even mapping the logic into SQL. These
efforts should be transparent to the user.
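As a rough illustration of the "standardizing the boolean operators" step, here is a minimal sketch in Java. The class name QueryTranslator and its method are hypothetical and are not part of the BioQuery source; a real QuerySubmitter would apply whatever rules its target database requires.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the BioQuery source. */
final class QueryTranslator {

    // Matches the words and/or/not in any letter case, as whole words only.
    private static final Pattern OPERATOR =
            Pattern.compile("\\b(and|or|not)\\b", Pattern.CASE_INSENSITIVE);

    /** Upper-cases Boolean operators and collapses runs of whitespace. */
    static String normalize(String query) {
        Matcher m = OPERATOR.matcher(query.trim());
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // The replacement is plain upper-case letters, so no escaping is needed.
            m.appendReplacement(sb, m.group(1).toUpperCase(Locale.ROOT));
        }
        m.appendTail(sb);
        return sb.toString().replaceAll("\\s+", " ");
    }
}
```

Note that this sketch treats every standalone "and", "or", or "not" as an operator, so a database that allows those words as search terms would need a more careful parser.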
The QuerySubmitter can make whatever network connections are required to contact
the database. The NCBIQuerySubmitter installed with the BioQuery application makes
only HTTP connections (and will thus work behind a firewall) and does not require any
non-standard Java classes or packages, keeping the query package portable. This is
the ideal, but may not always be possible.
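To show what an HTTP-only connection using nothing but standard Java classes might look like, here is a minimal sketch. It is not the NCBIQuerySubmitter code; the class name HttpQueryClient, the base URL, and the parameter name are all placeholders chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the BioQuery source. */
final class HttpQueryClient {

    /** Builds a GET URL; baseUrl and paramName are placeholders for your database. */
    static String buildUrl(String baseUrl, String paramName, String query) {
        try {
            return baseUrl + "?" + paramName + "=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Sends a GET request and returns the response body as one string. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000); // milliseconds
        conn.setReadTimeout(30_000);    // milliseconds
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

A submitter built this way would call buildUrl with its own database address and then pass the result to fetch; the response would still need to be parsed and formatted before display.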
The databases installed with BioQuery use a number of different return formats. In the file
querys.xml these are coupled with a file extension that indicates how to
display these formats. BioQuery currently only supports displaying plain text and HTML,
so the 2 file extension types are: txt and html. When an
NCBIQuery returns results in GenBank format or in XML, they are usually embedded in
an HTML document. It is up to the QuerySubmitter to parse and correctly format
the text or HTML. The BioQuery user interface can correctly display HTML and open external links
given as absolute addresses (by calling the computer's browser). However, it cannot run JavaScript or
follow relative links (it doesn't know the parent address). Hence, parsing HTML pages and
expanding relative links are usually tasks for the QuerySubmitter. If you're
adventurous enough to want to expand the GUI as well as the query package, you can add custom
subclasses of DataView that will display any kind of data and use different file
extensions.
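One way a QuerySubmitter might expand relative links before handing HTML to the user interface is sketched below. The class name LinkExpander is hypothetical, and the regular expression only handles double-quoted href attributes; a real result page may call for a proper HTML parser.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the BioQuery source. */
final class LinkExpander {

    // Simplification: matches only href="..." written with double quotes.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Rewrites each href in html as an absolute address relative to pageUrl. */
    static String expandLinks(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = m.group(1);
            String resolved;
            try {
                // Absolute targets come back unchanged; relative ones are resolved.
                resolved = base.resolve(target).toString();
            } catch (IllegalArgumentException e) {
                // Leave links that are not valid URIs (e.g. containing spaces) as they were.
                resolved = target;
            }
            m.appendReplacement(out,
                    Matcher.quoteReplacement("href=\"" + resolved + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The submitter knows the address it fetched the page from, so it can pass that address as pageUrl; the user interface then only ever sees absolute links.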
The querys.xml file is located in the META-INF directory under the
BioQuery directory on the client, and in the META-INF directory under the
base directory of the virtual machine on the server (see
Server Installation Instructions for details). You can create and
test your Query on the client, but will have to modify the server to get the Auto-Update feature
working. If you're using our server, just send us the new Query and we'll host it for you.
Write us for more details.
Your new subclass of QuerySubmitter should be placed in the BioQuery
directory on the client, and in the bioquery->WEB-INF->classes directory on the
server.
When expanding the set of databases BioQuery can search, you only need to work with 2 files:
your new subclass of QuerySubmitter, and querys.xml. The existing
entries in querys.xml are self-explanatory, so use them as a guide. You can also
use the existing NCBIQuerySubmitter as a guide. However, this
QuerySubmitter can query 8 different databases, making it a bit more complicated.
You do not need to recompile the source code for the original BioQuery program to get your
extension to work, but you should download the source code from our website for examples and
for a better understanding of the program.
Adding a new database does not require writing much code; it just takes a conceptual
understanding of the query package. This page is just a primer.
For more details and support in your efforts, please write us at:
support@bioquery.org