Data integration and the Semantic Web

 
 


My recently completed PhD research addressed semantic heterogeneity, information search and integration in the context of the Semantic Web and ontologies. The research included the development of query topic-relevant ontologies, specified using XML-based RDF, RDF Schema and OWL, together with other relevant Semantic Web technologies.

The main areas of consideration encompassed:

  • An examination of ontology mapping issues at both Upper (foundational) and Domain/Context levels.
  • Consideration of modularity within the context of small "geographical" OWL ontologies.
  • Formulation of theory regarding module reuse and the identification of primary and secondary contexts. The issues of effectiveness and efficiency in ontology design have been considered, with particular regard to minimising redundancy in ontology specification.
  • Consideration of how such contexts can be best applied in a motivating application ontology.
  • A Java applet based on the Jena Ontology API was developed to allow:
    • OWL ontology query.
    • Ontology search using a web-crawler process.
    • Selective dynamic importation of the results of ontology searches into an OWL ontology.
  • The Java applet (shown below) has been further developed as a semantic search tool, called SemSeT, for ontology-based query expansion (OQE) and information retrieval (IR) on Web document collections.
  • Searches have been analysed using a test data population of 100 Web pages. More extensive search experiments have been conducted using the TREC WT2g document corpus.
  • OQE experiments have now been completed on a quarter of a million Web documents for three of the TREC query topics (T401, T416 and T438). Document relevance rankings were calculated using a vector space model (VSM) with tf-idf document term weighting, and search effectiveness was compared using precision and recall measures.
  • The OQE-based search results have been compared against conventional keyword results, generated by the search tool, to determine the effectiveness of OQE search relative to keyword-driven search.
  • The results of the query experiments have been presented in a thesis and have shown that OQE, enabled by query topic-specific ontology contexts, can double the precision rate compared with traditional keyword-only search. An example of the results is shown in the graph below.
  • NLQ document.
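
The VSM tf-idf ranking mentioned in the list above can be illustrated with a minimal Python sketch. This is not the SemSeT implementation (which was a Java applet): the function names, the toy document collection and the particular tf-idf variant (raw term frequency multiplied by log(N/df), with cosine similarity) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return (idf, vectors) for a dict of {doc_id: text}.

    Assumed weighting: raw term frequency * log(N / document frequency).
    """
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n_docs = len(docs)
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = {
        doc_id: {term: tf * idf[term] for term, tf in term_counts.items()}
        for doc_id, term_counts in counts.items()
    }
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Rank documents against the query, highest similarity first."""
    idf, vectors = tfidf_vectors(docs)
    query_counts = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    query_vec = {t: tf * idf[t] for t, tf in query_counts.items() if t in idf}
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-document collection, for illustration only.
DOCS = {
    "d1": "The Three Gorges Dam is a hydroelectric dam on the Yangtze river",
    "d2": "Hydroelectric turbines generate power from a reservoir",
    "d3": "Football results and league tables",
}

if __name__ == "__main__":
    for doc_id, score in rank("gorges dam", DOCS):
        print(doc_id, round(score, 3))
```

With a keyword-only query such as "gorges dam", the second document scores zero even though it is on a related topic, which is the kind of gap that query expansion is intended to address.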

SemSeT: semantics-based search tool

An example of the prototype SemSeT query interface.

 

Below is an example precision and recall (P&R) graph of ontology-based query expansion (OQE) comparison outcomes. It is based on IR data generated using SemSeT with the TREC WT2g query topic 416 (Three Gorges Dam) and a query topic-specific hydro-electric ontology context.

The graph shows macro-evaluation based (MEA) average precision and recall outcomes for a set of 10 queries in which all query terms were optional, i.e. no must-have terms were specified. It therefore compares three cases: optional keywords (Ko); Ko plus ontology sub and super classes (Oo); and Oo plus ontology relation classes (Oro).
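
The three query variants described above (Ko, Oo and Oro) can be sketched in Python as follows. The toy ontology, its class names and the function names are hypothetical and are not taken from the hydro-electric ontology used in the research. The sketch also assumes that only direct sub and super classes are added, and that relation classes are taken from the original keywords; the actual expansion rules in SemSeT may differ.

```python
# Hypothetical toy ontology: child class -> parent class.
SUBCLASS_OF = {
    "dam": "hydraulic structure",
    "gravity dam": "dam",
    "arch dam": "dam",
    "reservoir": "water body",
}

# Hypothetical non-hierarchical relations: class -> related classes.
RELATED = {
    "dam": {"reservoir", "turbine"},
}


def sub_and_super_classes(term):
    """Return the direct super class and direct subclasses of a term."""
    expanded = set()
    if term in SUBCLASS_OF:
        expanded.add(SUBCLASS_OF[term])
    expanded.update(child for child, parent in SUBCLASS_OF.items() if parent == term)
    return expanded


def expand(keywords):
    """Build the Ko, Oo and Oro term sets for a list of keywords."""
    ko = set(keywords)

    # Oo: keywords plus their direct sub and super classes.
    oo = set(ko)
    for term in ko:
        oo |= sub_and_super_classes(term)

    # Oro: Oo plus classes related to the original keywords (an assumption).
    oro = set(oo)
    for term in ko:
        oro |= RELATED.get(term, set())

    return {"Ko": ko, "Oo": oo, "Oro": oro}


if __name__ == "__main__":
    for name, terms in expand(["dam"]).items():
        print(name, sorted(terms))
```

Each set contains the previous one, so Ko is a subset of Oo, which is a subset of Oro. Since all terms are optional, a larger set can only match more documents, which is why precision as well as recall has to be compared.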

P&R graph of OQE comparisons using SemSeT, TREC WT2g 416 and hydro-electric ontology context.

As noted above, ontology-based query expansion had the effect of doubling query precision.
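
For readers unfamiliar with macro-evaluation, the following Python sketch shows one common way to compute macro-averaged precision and recall: each query is scored separately and the per-query values are then averaged, so every query carries equal weight. This is a simplified, set-based illustration with invented document identifiers. It is not the evaluation code used for the thesis, and it does not reproduce the graph.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def macro_average(runs):
    """Average per-query precision and recall over all queries.

    `runs` is a list of (retrieved, relevant) pairs, one pair per query.
    """
    if not runs:
        return 0.0, 0.0
    pairs = [precision_recall(retrieved, relevant) for retrieved, relevant in runs]
    mean_precision = sum(p for p, _ in pairs) / len(pairs)
    mean_recall = sum(r for _, r in pairs) / len(pairs)
    return mean_precision, mean_recall


# Invented relevance judgements and result lists, for illustration only.
RELEVANT = {"d1", "d2", "d3", "d4"}
RUNS = [
    (["d1", "d2", "d5", "d6"], RELEVANT),  # precision 0.5, recall 0.5
    (["d1", "d7"], RELEVANT),              # precision 0.5, recall 0.25
]

if __name__ == "__main__":
    print(macro_average(RUNS))  # (0.5, 0.375)
```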

Presentations, Papers and Posters

  • "Information Dynamics, Perspectives, and Risks", February 2005, pdf. Citation: [1].
  • "Data Modelling for Data Integration", research topics presentation, March 2005, ppt.
  • "Understanding Structural and Semantic Heterogeneity in the Context of Database Schema Integration", In Proceedings of the SIXTH Conference in the Dept. of Computing, (Journal of the Dept. of Computing, UCLan, Issue Number 4, pp. 29-44, ISSN 1476-9069), May 2005, pdf. Citations: [1][2] also [3][4][5].
  • "Schema and Semantic Heterogeneity in Database Schema Integration", presentation, SIXTH Conference of Dept. of Computing, UCLan, May 2005, ppt.
  • "Developing 'Geo' Ontology Layers for Web Query", presentation to Graduate School Conference, UCLan, December 2005, ppt.
  • "Semantic Web, Ontology Integration, and Web Query", presentation at Dept. of Computing Seminar, UCLan, March 2006, pps.
  • "Developing Ontologies based on RDF-OWL Semantic Web languages", presentation to Graduate School How-to@2, UCLan, April 2006, pps.
  • "CO3709 Research Topics in Computing 2007", presentation, BSc Research Topics, UCLan, 6 March 2007, ppt.
  • "MSc Database Systems Research Topics", presentation, MSc Database Systems, UCLan, 23 March 2007, ppt.
  • "Ontology Modules by Layering - Facilitating Reuse in a Geographical Semantic Web Context", presentation, SEVENTH Conference of Dept. of Computing, UCLan, 20 June 2007, ppt.
  • "Geographical Ontology Modules for Efficient Semantic Web Reuse", presentation, Ordnance Survey Research Labs, Southampton, 28 June 2007, ppt.
  • "The Semantic Web and Efficient Reuse of Ontology Modules", presentation, MSc CO3701 Advanced Database Systems Research Topics, UCLan, 5 March 2008.
  • "An Ontology-based Semantic Web Search Engine to improve Precision and Recall", poster presented at Faculty of Science and Technology Annual Research Conference, UCLan, 18 June 2008.


 


© David George 2006-11