Discovery systems

Discovery services are search interfaces to pre-indexed metadata and/or full-text documents. They differ from federated search applications in that they don’t search live sources. Searching is done on pre-indexed data, so results are returned to the user very rapidly. Discovery services are an evolution beyond federated search. Some discovery services either integrate with federated search or provide an API so that others can do the integration.

The terms “unified index” and “unified search index” are associated with discovery services. As the terms imply, a discovery service searches the content of all the sources it has access to through a single index. To produce that unified index, the service must reconcile differences in the structure of metadata (e.g. names and contents of fields) across sources.
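As a rough illustration of how a unified index might be built, the sketch below maps records from two sources onto one schema and then indexes their titles. The source names, field names, and sample records are hypothetical, invented for this example; real discovery services use far richer schemas and indexing engines.

```python
from collections import defaultdict

# Hypothetical per-source mappings: source field name -> unified field name.
FIELD_MAPS = {
    "publisher_a": {"ti": "title", "au": "authors", "py": "year"},
    "aggregator_b": {"Title": "title", "Creator": "authors", "Date": "year"},
}


def normalize(source, record):
    """Translate one source record into the unified schema."""
    mapping = FIELD_MAPS[source]
    unified = {"source": source}
    for field, value in record.items():
        if field in mapping:
            unified[mapping[field]] = value
    # Reconcile field contents as well as names: authors become a list.
    authors = unified.get("authors", [])
    if isinstance(authors, str):
        authors = [a.strip() for a in authors.split(";") if a.strip()]
    unified["authors"] = authors
    # Years may arrive as "2011" or "2012-05-01"; keep the four-digit year.
    if "year" in unified:
        unified["year"] = int(str(unified["year"])[:4])
    return unified


def build_index(records):
    """Build a simple inverted index from title words to record positions."""
    index = defaultdict(set)
    for doc_id, rec in enumerate(records):
        for token in rec.get("title", "").lower().split():
            index[token].add(doc_id)
    return index


def search(index, records, query):
    """Return records whose titles contain every query word."""
    terms = query.lower().split()
    if not terms:
        return []
    ids = set.intersection(*(index.get(t, set()) for t in terms))
    return [records[i] for i in sorted(ids)]


records = [
    normalize("publisher_a", {"ti": "Metadata Harvesting in Libraries",
                              "au": "Lee, K.; Patel, R.", "py": "2011"}),
    normalize("aggregator_b", {"Title": "Library Metadata Quality",
                               "Creator": ["Gomez, A."], "Date": "2012-05-01"}),
]
index = build_index(records)
hits = search(index, records, "metadata")
```

Because the index is built ahead of time, a query is answered by set lookups against local data rather than by contacting each source, which is where the speed advantage over federated search comes from.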

Discovery services are popular with users largely because of speed: federated search systems cannot compete with discovery systems in terms of response time. A second factor driving the creation of discovery services is the willingness of publishers and content aggregators to form partnerships with developers of the services. Given the pressure to deliver search results in “Google time,” publishers have an incentive to cooperate with one another and with discovery service providers.

Another reason for the strong interest in discovery services is that the onerous task of building, monitoring, and repairing connectors disappears, since there are no connectors. Unified indexes also provide benefits through their “homogenization” of metadata. Duplicates can be removed much more easily by discovery services than by federated search engines. Discovery services also produce more “complete” results, i.e. results with titles, authors, publication dates and other fields of interest that federated search can’t reliably get. With better-fielded results, it is easier to cluster and otherwise organize search results.
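To show why homogenized metadata makes duplicate removal easier, here is a minimal sketch that assumes records already share a unified schema with "title", "year", and "authors" fields. The matching rule (normalized title plus year) and the sample records are assumptions made for illustration, not a description of how any particular discovery service matches records.

```python
import re


def dedup_key(record):
    """Build a matching key from the normalized title and the year."""
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return (title, record.get("year"))


def merge_duplicates(records):
    """Keep one record per key, filling empty fields from later duplicates."""
    merged = {}
    for rec in records:
        key = dedup_key(rec)
        if key not in merged:
            merged[key] = dict(rec)
            continue
        kept = merged[key]
        for field, value in rec.items():
            if not kept.get(field):
                kept[field] = value
    return list(merged.values())


results = [
    {"title": "Library Metadata Quality", "year": 2012, "authors": []},
    {"title": "Library metadata quality.", "year": 2012,
     "authors": ["Gomez, A."], "publication": "Journal of Examples"},
    {"title": "Metadata Harvesting in Libraries", "year": 2011,
     "authors": ["Lee, K."]},
]
unique = merge_duplicates(results)
```

Merging rather than simply discarding the duplicate is what yields the more “complete” records: the surviving entry picks up the author and publication fields that the first copy lacked.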
