Languages, models, techniques and tools for the discovery, representation and management of semantic mappings between heterogeneous and distributed domain ontologies/schemas

WISDOM DIT-PRJ-05-014

Status NOT active project
DISI role Partner
Project type Research Project
Dimension National
Acquisition date 2004-11-17
Start date 2004-11-30
End date 2006-11-30
SAP code 40100782

Project details

Project abstract The huge amount of data and services available on the Web requires the development of systems that overcome the "information overload" problem of traditional search engines. In particular, there is a need for novel tools for the integration, localization and customizable use of information resources, allowing clients to "recharge" with interesting data. The goal of WISDOM is to develop intelligent techniques and tools, based on domain ontologies, to perform effective and efficient information search on the Web. In particular, we aim at developing systems for retrieving information from both data-intensive and unstructured sites/web pages, in an integrated and efficient way. The project is organized into three synergistic and complementary themes and will define a reference methodological and functional architecture to ensure compatibility among the solutions offered by the three themes. The goal of Theme 1 (Creation and extension of a domain ontology) is the study and development of solutions to represent the semantics of the contents of Web sources. The goal of Theme 2 (Emergent Semantics: Discovering semantic mappings among domain ontologies) is the study and development of techniques and tools to support the identification, discovery and storage of semantic mappings among domain ontologies. The investigated semantic mappings will support the rewriting of a query with respect to multiple ontologies and will be based on language semantics, lexical chains and logical inference. The goal of Theme 3 (Query processing) is the development of techniques for searching information, exploiting the semantic infrastructure developed within Themes 1 and 2. Efficient and effective query processing mechanisms will be developed, taking into account the heterogeneity of data and sites and the constraints imposed by the distributed environment. In particular, these techniques will rely on source characterization to identify useful sources, solve rewriting problems and integrate results from different sources.
The issues addressed in WISDOM have a high practical and industrial impact and can help to effectively exploit the potential of the Web.
Four units from different universities participate in the project, with 18 professors and researchers (for a total of 137 man-months), 7 PhD students (72 man-months) and external people under contract (86 man-months). The total cost of the project is 393500 Euro, with 135500 Euro set aside for external people. The units involved in the project have long experience of collaboration in both national and international projects. Project management will be ensured by a coordinator for each theme, who will cooperate with the project leader in monitoring the progress of that theme. A joint meeting is planned at the end of each of the three phases into which the project is divided. Project results will be both scientific and methodological, documented through a series of technical reports, and implementation-oriented, in the form of prototype tools. The methods and tools proposed in the project will be validated through experimental activity.
Keywords semantic web, web intelligence, semantic coordination
Funding 289700 €
Partners
  • DIT - UniTN
  • University of Modena and Reggio Emilia
  • University of Bologna
  • University of Roma 3

DISI Sub-project details

Project abstract THEME 1. Creation and extension of a domain ontology

The main objective of THEME 1 is the study of solutions for the semantic representation of the contents of information resources on the Web, with particular reference to data-intensive resources and resources with scarcely structured content. The representation and integration of such information resources requires devising a language for expressing domain ontologies, with the main goal of using them for query execution and answering (THEME 3). With respect to THEME 1, the unit at the University of Trento will be involved in the following activities:
Phase 1: Contribution to the report on the state of the art of languages and emerging standards for the representation of ontologies and classifications (deliverable D1.R1). The unit will focus especially on the representation of concept hierarchies (taxonomies), as they are very common on the Web (see e.g. the Web directories of Google or Yahoo, or the directory structure of web sites). The discovery of mappings across them is a special but very relevant case, which is particularly useful in document sharing and knowledge management applications.
Phase 2: Contribution to the definition of a language for representing domain ontologies and classifications (deliverable D1.R2). The unit will focus on the part of the specification that deals with concept hierarchies and semantically annotated taxonomies in general. This work will build on the definition of CTXML, an XML-based language proposed by Bouquet, Magnini, Serafini and Zanobini at the AAAI-02 workshop on Meaning Negotiation (Edmonton, Canada, July 2002). Such a language must be compatible with the W3C standards (XML, RDF, RDFS, XML Schema, OWL) and with the query languages defined in THEME 3.
Phase 3: Design and development of a tool for the automatic population of predefined classifications (deliverable D1.P5).
This tool will provide a simple way of associating documents with a predefined classification schema, and will be used to add non-structured web resources to the system by organizing them into homogeneous clusters of documents on the same topic. The techniques used will include natural language processing, text mining and case-based reasoning methods.

THEME 2. Emergent semantics: discovering semantic mappings between domain ontologies

The main aim of THEME 2 is the design and development of techniques for the (semi-)automatic generation of mappings holding between domain ontologies.
Phase 1: Report on the state of the art of languages and techniques for mapping domain ontologies (deliverable D2.R1). The objective of this activity is twofold:
  • the definition of a common framework for domain ontology mapping, including a common definition of what a mapping should include;
  • the analysis and comparison of state-of-the-art techniques for mapping ontologies, including an assessment of the contribution that each technique may provide to the discovery of the kind of mappings defined in the previous item.
Phase 2: In the second phase, new techniques for creating mappings across domain ontologies will be elaborated (deliverable D2.R2). In particular, this will lead to:
a. the definition of a language for representing complex mappings between heterogeneous ontologies and classifications. Such a language will be used to represent the kind of mapping defined in the common framework of Phase 1;
b. the specification of an algorithm for the (semi-)automatic generation of complex mappings between heterogeneous domain ontologies. The algorithm will be based on the CTXMATCH algorithm elaborated at the University of Trento.
Phase 3: Development of a platform for discovering and managing semantic mappings across domain ontologies (deliverable D2.P1).
The platform can be viewed as a service that can be invoked to generate mappings across domain ontologies, namely a highly modular and domain-independent system in which single components can be plugged in, unplugged or suitably customized. Depending on the architectural choices made in the project, the platform may be used as a shared service at a global level, or at a local level in a peer-to-peer fashion.

THEME 3. Query elaboration

The main objective of this theme is the development of techniques for query elaboration based on the use of domain ontologies (THEME 1) and of mappings across them (THEME 2).
Phase 1: Contribution to the analysis of query languages and of query rewriting techniques based on ontologies and classifications (deliverable D3.R1).
Phase 2: Contribution to the definition of a query language and of some rewriting techniques based on ontologies (deliverable D3.R3). In particular, our unit will contribute to the definition of a notion of 'semantic distance' between a concept used in a query based on an ontology T1 and other concepts (belonging to different ontologies) to which the original concept is linked through semantic mappings. This distance will provide one of the thresholds used to define the notion of a "good" answer to a query, especially when the execution of such a query requires the use of mappings across different domain ontologies.
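
As an illustration of how a semantic distance could act as a threshold during query rewriting, the following Python sketch keeps only those mapped concepts whose distance from the query concept is within a given limit. The project text does not define the distance itself, so everything here is a hypothetical assumption made for illustration: the relation names, the numeric values in RELATION_DISTANCE, the Mapping class, the rewrite_targets function and the example concepts.

```python
from dataclasses import dataclass

# Hypothetical distance assigned to each kind of mapping relation.
# These values are placeholders, not taken from the project.
RELATION_DISTANCE = {
    "equivalent": 0.0,
    "more_specific": 0.25,
    "more_general": 0.5,
    "related": 0.75,
}


@dataclass(frozen=True)
class Mapping:
    """A semantic mapping from a concept in one ontology to a concept in another."""
    source: str    # concept used in the query, e.g. "T1:Hotel"
    target: str    # concept in a different ontology, e.g. "T2:Accommodation"
    relation: str  # one of the keys of RELATION_DISTANCE


def rewrite_targets(concept, mappings, max_distance):
    """Return (target, distance) pairs mapped from `concept` whose distance
    does not exceed `max_distance`, ordered from closest to farthest."""
    candidates = [
        (m.target, RELATION_DISTANCE[m.relation])
        for m in mappings
        if m.source == concept
    ]
    within_threshold = [(t, d) for t, d in candidates if d <= max_distance]
    return sorted(within_threshold, key=lambda pair: pair[1])


# Example mappings from ontology T1 to three other (invented) ontologies.
mappings = [
    Mapping("T1:Hotel", "T2:Accommodation", "more_general"),
    Mapping("T1:Hotel", "T3:Hotel", "equivalent"),
    Mapping("T1:Hotel", "T4:Tourism", "related"),
]

# With a threshold of 0.5 the loosely related concept T4:Tourism is dropped.
print(rewrite_targets("T1:Hotel", mappings, max_distance=0.5))
# → [('T3:Hotel', 0.0), ('T2:Accommodation', 0.5)]
```

Under these assumptions, a lower threshold restricts the rewriting to closer concepts (only the equivalent one at 0.0), while a higher threshold admits more ontologies at the cost of less precise answers.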
Keywords Semantic Web, intelligent search, semantic mapping, query languages
Funding 71500 €
Manager Paolo Bouquet