For and against ogsadai

This project has been considering the benefits of "grid enabling" the NPD database. One of the tools under consideration is ogsa-dai. However, data can now also be accessed through a range of web service protocols, so it has become necessary to compare some of these options and find out which is the most beneficial.

A web service should enable automated requests for data, allowing external entities to create software that can call on the service, receive a response in a standard layout, and act on the data transmitted. Particularly for this project, Taverna is a common software tool used by biologists, and it should be capable of using the NPD via the created services. Additionally, it is quite likely that people will want to integrate the services into other tools, using various different programming languages and methods.

These goals can be achieved with a "traditional" web service. I have prepared a SOAP server on the NPD, which provides WSDL files describing the services that can be developed and made available. This interacts easily with Taverna, and most programming languages have built-in functionality to handle this type of service. Thus, it meets the requirements laid out so far.
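To illustrate what such an automated request looks like from the client side, here is a minimal sketch in Python using only the standard library. It builds a SOAP 1.1 request envelope and parses a response into a dictionary. The service namespace `urn:example:npd`, the operation name `getProtein` and the field names are assumptions made up for this example; the real names would come from the WSDL files that the NPD server publishes.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; the real one is defined in the WSDL.
SVC_NS = "urn:example:npd"


def build_request(operation, params):
    """Return a SOAP 1.1 envelope (bytes) calling `operation` with `params`."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_response(xml_bytes):
    """Return the fields of the first element in the SOAP Body as a dict."""
    root = ET.fromstring(xml_bytes)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("response has no SOAP Body content")
    result = body[0]
    if result.tag == f"{{{SOAP_NS}}}Fault":
        # In SOAP 1.1 the faultstring element is not namespace-qualified.
        raise RuntimeError(result.findtext("faultstring", default="SOAP fault"))
    # Strip the namespace from each child tag to get plain field names.
    return {child.tag.split("}")[-1]: child.text for child in result}


# A made-up response, standing in for what a server might send back.
sample_response = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:npd="{SVC_NS}">'
    "<soap:Body><npd:getProteinResponse>"
    "<npd:name>example protein</npd:name>"
    "<npd:compartment>nucleolus</npd:compartment>"
    "</npd:getProteinResponse></soap:Body></soap:Envelope>"
).encode("utf-8")

request = build_request("getProtein", {"id": "P1"})
fields = parse_response(sample_response)
```

In practice the request bytes would be sent to the service endpoint with an HTTP POST, and a SOAP client library would normally generate this plumbing from the WSDL, which is also what Taverna does.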

However, ogsadai is also capable of providing a WSDL access method. From an external point of view, it would look like any other web service. Once installed, "activities" can be set up and made available, again via WSDL, allowing various access methods onto the data.

Both types of service are therefore capable of doing the job in question. Both have presented their advantages and disadvantages during setup:

The traditional service was easy to set up, given that an apache web server already underlies the NPD. This makes it very compatible with current working methods at the HGU, and it does not present many extra admin requirements or security issues for the IT staff there. But it does require some manual fiddling with WSDL documents, which can be annoying to get correct initially (although this should diminish as the service becomes more stable).

Ogsadai, however, requires another layer to run: in this case, apache tomcat. The ogsadai installation process includes deployment of apache axis onto the tomcat container, and ogsadai then sits on top of that. In an environment where tomcat is already well utilised, this may not be an issue. But in this case, tomcat is not heavily used by the group in question, and so it presents extra admin requirements and security issues. Of course, ogsadai itself also needs to be administered and worked with to ensure it runs well, so this is another layer of new technology that may not be desirable to administrators. On the plus side, once an understanding of how ogsadai activities work is gained, it should be quite easy to develop new services onto the data and make them readily available. These services are also likely to be more complex than those on offer with traditional web services (at least for the amount of effort made), although as I am not expert with either method I cannot yet definitively claim which is better in this regard.

The result of all this, so far at least, is that ogsadai presents added complexity and extra work for the administrators in this case, in order to achieve much the same end result. Thus, unless there is a specific example where ogsadai is needed to solve a particular access problem that a user presents, there may not be much benefit to using ogsadai in this particular case.

The sort of benefit that ogsadai could offer becomes clear when considering operations over very large datasets, or operations over data in multiple databases. In either case, ogsadai would be ideal for querying such distributed data sets, simplifying the task both by hiding the underlying structural differences and by minimising the need to pass large amounts of data back and forth between operations. Also, ogsadai enables queries almost as complex as could be submitted directly via SQL, with the advantage of being able to apply them over multiple underlying data sets.
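The idea can be shown with a toy sketch in Python. This is not ogsadai's API; it only illustrates the principle, using two in-memory SQLite databases as stand-ins for remote sources. All table names, column names and data are invented for the example. Each source has a different schema, a per-source query hides that difference from the caller, and the filter runs at each source so that only matching rows are passed back.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory database standing in for one remote data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two hypothetical sources with different table names, columns and units.
source_a = make_source(
    "CREATE TABLE proteins (name TEXT, mass_kda REAL)",
    "INSERT INTO proteins VALUES (?, ?)",
    [("alpha", 12.5), ("beta", 80.0)],
)
source_b = make_source(
    "CREATE TABLE prot (label TEXT, mass_da REAL)",
    "INSERT INTO prot VALUES (?, ?)",
    [("gamma", 95000.0), ("delta", 20000.0)],
)

# Per-source query plus the factor that converts kDa into the source's unit.
# Each query returns rows in one common shape: (name, mass in kDa).
SOURCES = [
    (source_a, "SELECT name, mass_kda FROM proteins WHERE mass_kda >= ?", 1.0),
    (source_b, "SELECT label, mass_da / 1000.0 FROM prot WHERE mass_da >= ?", 1000.0),
]


def heavy_proteins(min_mass_kda):
    """Query every source with the filter applied locally, then merge."""
    results = []
    for conn, sql, unit_scale in SOURCES:
        # The WHERE clause runs inside each source, so non-matching rows
        # are never transferred to the caller.
        results.extend(conn.execute(sql, (min_mass_kda * unit_scale,)).fetchall())
    return sorted(results, key=lambda row: row[1])


matches = heavy_proteins(50.0)
```

The caller asks one question in one unit and never sees that the two sources are laid out differently, which is the simplification that distributed querying is meant to provide.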

At present, though, it does not seem that the benefits ogsadai can offer would be of a great deal of use with the NPD. The underlying data is relatively small, both currently and potentially. If others in the field were using ogsadai too, it may be that distributed query processing (DQP) would show some promise, and the use of "activities/workflows" could really solve some problems. But for now, access via traditional web services seems to meet the same requirements as ogsadai can for this project, and with less administrative overhead and complexity.

Some user examples of how this data might be accessed should be forthcoming, at which point the situation may change. This is most likely to occur if it is found that user requirements exceed the complexity that can be reasonably managed by a traditional web service. It is unlikely that DQP scenarios are going to be discovered in this project.