Keyword Query Routing
ABSTRACT
Keyword search is an intuitive paradigm for searching linked data
sources on the web. We propose to route keywords only to relevant sources to
reduce the high cost of processing keyword search queries over all sources. We
propose a novel method for computing top-k routing plans based on their
potential to contain results for a given keyword query. We employ a
keyword-element relationship summary that compactly represents relationships
between keywords and the data elements mentioning them. A multilevel scoring
mechanism is proposed for computing the relevance of routing plans based on
scores at the level of keywords, data elements, element sets, and subgraphs
that connect these elements. Experiments carried out using 150 publicly
available sources on the web showed that valid plans (precision@1 of 0.92) that
are highly relevant (mean reciprocal rank of 0.89) can be computed in 1 second
on average on a single PC. Further, we show routing greatly helps to improve
the performance of keyword search, without compromising its result quality.
Index Terms—Keyword search, keyword query, keyword query routing,
graph-structured data, RDF
INTRODUCTION
THE web is no longer only a collection of textual documents but
also a web of interlinked data sources (e.g., Linked Data). One prominent
project that largely contributes to this development is Linking Open Data. Through
this project, a large amount of legacy data has been transformed to RDF,
linked with other sources, and published as Linked Data. Collectively,
Linked Data comprise hundreds of sources
containing billions of RDF triples, which are connected by millions of links
(see LOD Cloud illustration at http://linkeddata.org/). While different kinds
of links can be established, the ones frequently published are sameAs links,
which denote that two RDF resources represent the same real-world object. A
sample of Linked Data on the web is illustrated in Fig. 1.
It is difficult for typical web users to exploit this web data by means of
structured queries using languages like SQL or SPARQL. To this end, keyword
search has proven to be intuitive. As opposed to structured queries, no
knowledge of the query language, the schema, or the underlying data is needed.
Literature Survey
2. Analysis on Existing Networks:
It is difficult for typical web users to exploit this web data
by means of structured queries using languages like SQL or SPARQL. To this end,
keyword search has proven to be intuitive. As opposed to structured queries, no
knowledge of the query language, the schema, or the underlying data is needed.
3. Idea on Proposed System:
We propose to investigate the problem of keyword query routing for keyword
search over a
large number of structured and Linked Data sources. Routing keywords only to
relevant sources can reduce the high cost of searching for structured results
that span multiple sources. To the best of our knowledge, the work presented in
this paper represents the first attempt to address this problem.
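To make the routing idea concrete, here is a minimal sketch in Python. It is
not the method of the paper: the keyword-to-source index, the source names, and
the function candidate_routing_plans are all hypothetical, and the sketch only
checks that a plan covers every keyword. It ignores whether the matching
elements are actually connected across sources.

```python
from itertools import product

# Hypothetical toy index: which sources mention which keyword.
KEYWORD_TO_SOURCES = {
    "stanford": {"dbpedia", "freebase"},
    "john": {"dbpedia", "dblp"},
    "award": {"freebase"},
}


def candidate_routing_plans(query_keywords, keyword_to_sources):
    """Enumerate sets of sources that together cover every query keyword.

    Simplifying assumption: a plan is any combination that picks one
    mentioning source per keyword; connectivity is not checked.
    """
    per_keyword = []
    for keyword in query_keywords:
        sources = keyword_to_sources.get(keyword, set())
        if not sources:
            return []  # a keyword mentioned nowhere cannot be routed
        per_keyword.append(sorted(sources))

    # Pick one source per keyword; identical source sets collapse into one plan.
    plans = {frozenset(choice) for choice in product(*per_keyword)}
    # Smaller plans first, ties broken alphabetically for a stable order.
    return sorted(plans, key=lambda plan: (len(plan), sorted(plan)))


plans = candidate_routing_plans(["stanford", "john", "award"], KEYWORD_TO_SOURCES)
for plan in plans:
    print(sorted(plan))
```

Only the sources named in a plan would then be asked to evaluate the keyword
query, instead of sending it to every source.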
Existing work uses keyword relationships (KR) collected individually for single
databases [6], [7]. We represent relationships between keywords as well as
those between data elements. They are constructed for the entire collection of
linked sources, and then grouped as elements of a compact summary called the
set-level keyword-element relationship graph (KERG). Summarizing relationships
is essential for addressing the scalability requirement of the Linked Data web
scenario.
IR-style ranking has been proposed to incorporate relevance at the level of
keywords [7]. To cope with the increased keyword ambiguity in the web setting,
we employ a multilevel relevance model, where elements to be considered are
keywords, entities mentioning these keywords, corresponding sets of entities,
relationships between elements of the same level, and inter-relationships
between elements of different levels.
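The two ideas above, a set-level summary and scoring at several levels, can be
illustrated with a toy sketch in Python. Everything here is an assumption made
for illustration: the ElementSet type, the example data, the scoring formulas,
and the rule that every pair of chosen sets must be directly linked (a stricter
condition than being connected by a subgraph). The actual scores used in the
paper are not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSet:
    source: str  # data source the set belongs to
    name: str    # label of the set, e.g. a class of entities
    size: int    # number of entities in the set


# Hypothetical summary data.
person = ElementSet("dbpedia", "Person", 100)
university = ElementSet("freebase", "University", 50)
award = ElementSet("freebase", "Award", 20)

# keyword -> {element set: number of its entities mentioning the keyword}
mentions = {"john": {person: 10}, "stanford": {university: 5, person: 1}}
# Set-level relationships (undirected), the edges of the toy summary graph.
links = {frozenset((person, university))}
N_SETS = 3


def keyword_score(keyword):
    """Keyword level: keywords mentioned by fewer sets weigh more."""
    return math.log(1 + N_SETS / len(mentions[keyword]))


def element_score(keyword, element_set):
    """Element level: fraction of the set's entities mentioning the keyword."""
    return mentions[keyword].get(element_set, 0) / element_set.size


def plan_score(assignment):
    """Score one assignment of keywords to element sets.

    Returns 0.0 when two distinct chosen sets are not linked in the summary.
    """
    chosen = list(assignment.values())
    for i, first in enumerate(chosen):
        for second in chosen[i + 1:]:
            if first != second and frozenset((first, second)) not in links:
                return 0.0
    score = 1.0
    for keyword, element_set in assignment.items():
        score *= keyword_score(keyword) * element_score(keyword, element_set)
    return score


linked = plan_score({"john": person, "stanford": university})
same_set = plan_score({"john": person, "stanford": person})
unlinked = plan_score({"john": person, "stanford": award})
print(linked, same_set, unlinked)
```

Because the summary stores counts per element set rather than per entity, its
size grows with the number of sets, which is the point of summarizing
relationships for a large collection of sources.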