Version 5 (modified by endres, 19 years ago)
AlignmentGenerator: Hotspot AlignmentGenerator
Developers: Björn Endres
Description
This experimental AlignmentGenerator tries to tackle large-scale ontology alignment, for which other approaches fail due to high memory and/or performance requirements. The general idea is a "divide and conquer" approach: a number of parts of the ontologies are aligned one after another, and the partial results are joined into a single, large alignment.
A basic concept used within this approach is the ContextOntology. It defines a subontology by taking a single class of another ontology, called the anchor, and adding all related classes within a certain distance of it. The relation currently used is the subclass relation, but any other class-to-class relation could be used just as well. The distance is called the context depth. The resulting set of classes is then interpreted as a new (sub)ontology. Every relation/property of the original classes can be used, but each has to be truncated to the classes that are actually contained in the new ontology.
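The construction of a ContextOntology can be sketched as a depth-limited breadth-first search. This is only an illustration, not the PhaseLibs implementation: the function names `context_classes` and `truncate_relations` and the toy ontology are hypothetical, and the sketch assumes that the subclass relation is followed in both directions (towards subclasses and superclasses), which the description above does not state.

```python
from collections import deque

def context_classes(subclass_edges, anchor, context_depth):
    """Collect all classes within `context_depth` subclass steps of `anchor`.

    Assumption: edges are (subclass, superclass) pairs and are followed
    in both directions.
    """
    neighbours = {}
    for sub, sup in subclass_edges:
        neighbours.setdefault(sub, set()).add(sup)
        neighbours.setdefault(sup, set()).add(sub)

    distance = {anchor: 0}
    queue = deque([anchor])
    while queue:
        current = queue.popleft()
        if distance[current] == context_depth:
            continue  # do not expand beyond the context depth
        for nxt in neighbours.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return set(distance)

def truncate_relations(relations, classes):
    """Keep only relations whose two ends both lie inside the subontology."""
    return [(a, b) for a, b in relations if a in classes and b in classes]

# Hypothetical toy ontology, purely for illustration.
SUBCLASS_EDGES = [
    ("Dog", "Mammal"), ("Cat", "Mammal"),
    ("Mammal", "Animal"), ("Bird", "Animal"), ("Animal", "Thing"),
]
context = context_classes(SUBCLASS_EDGES, "Mammal", 1)
kept = truncate_relations([("Dog", "Cat"), ("Dog", "Bird")], context)
```

With anchor `Mammal` and context depth 1, the context contains `Mammal`, `Dog`, `Cat` and `Animal`; the relation between `Dog` and `Bird` is dropped because `Bird` lies outside the subontology.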
In this context, a Hotspot consists of the following attributes:
- a source ContextOntology
- a target ContextOntology
- an alignment between those two
The general idea of the algorithm is to identify a Hotspot using single sure matches as anchors. This Hotspot is then optimised in size with respect to a hotspot quality measure. Finally, the resulting Hotspot is joined with every other Hotspot it overlaps in both source and target.
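The final joining step can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual implementation: each ContextOntology is reduced to a plain set of class names, the alignment to a set of (source class, target class) pairs, and "join" is taken to mean the union of all three parts. The names `overlaps`, `join` and `add_hotspot` are hypothetical, and the size optimisation against a HotspotMeasure is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hotspot:
    source: frozenset     # classes of the source ContextOntology
    target: frozenset     # classes of the target ContextOntology
    alignment: frozenset  # (source_class, target_class) pairs

def overlaps(a, b):
    """Two hotspots overlap only if they share classes on both sides."""
    return bool(a.source & b.source) and bool(a.target & b.target)

def join(a, b):
    """Assumed join: union of both ontologies and both alignments."""
    return Hotspot(a.source | b.source,
                   a.target | b.target,
                   a.alignment | b.alignment)

def add_hotspot(hotspots, new):
    """Join `new` with every overlapping hotspot, repeating until stable,
    because a join may create overlaps with hotspots checked earlier."""
    pending = list(hotspots)
    merged = True
    while merged:
        merged = False
        for existing in pending:
            if overlaps(existing, new):
                new = join(new, existing)
                pending.remove(existing)
                merged = True
                break
    return pending + [new]

h1 = Hotspot(frozenset("AB"), frozenset("XY"), frozenset({("A", "X")}))
h2 = Hotspot(frozenset("BC"), frozenset("YZ"), frozenset({("C", "Z")}))
h3 = Hotspot(frozenset("D"), frozenset("Y"), frozenset())

result = []
for hotspot in (h1, h3, h2):
    result = add_hotspot(result, hotspot)
```

Here `h2` overlaps `h1` on both sides and is joined with it, while `h3` shares only a target class with the others and therefore stays separate.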
To implement this approach, the algorithm requires two exchangeable modules, which are specified by the interfaces HotspotIdentifier and HotspotMeasure.
Characteristics
Evaluation/Performance
Specification
Parameters
Parameter name | ValueType | Default | Description |
PARAM_TAXONOMY_ONLY | Boolean | FALSE | Defines whether only the taxonomy (classes) should be aligned. By default, the classes are aligned along with the properties. |
PARAM_THRESHOLD | Double | 0.0 | This standard parameter forces the algorithm to drop all relations with a lower confidence. However, it must be chosen carefully, since the confidences produced by the BordaCount algorithm are not meaningful values in themselves. It would probably make more sense to apply a threshold to each SimilarityMeasure, but this would require a set of threshold parameters rather than a single one. |
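The effect of PARAM_THRESHOLD can be illustrated with a minimal sketch. It assumes that a relation can be represented as a (source, target, confidence) triple and that relations with a confidence exactly equal to the threshold are kept; the function name `apply_threshold` and the sample data are hypothetical and not part of PhaseLibs.

```python
def apply_threshold(relations, threshold=0.0):
    """Drop every relation whose confidence is below `threshold`.

    Assumption: relations are (source, target, confidence) triples.
    """
    return [rel for rel in relations if rel[2] >= threshold]

# Hypothetical sample relations, purely for illustration.
relations = [
    ("Dog", "Hund", 0.9),
    ("Cat", "Katze", 0.4),
    ("Bird", "Fisch", 0.1),
]
filtered = apply_threshold(relations, threshold=0.4)
unfiltered = apply_threshold(relations)
```

With the default of 0.0, no relation with a non-negative confidence is dropped, so filtering only takes effect once the parameter is raised.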
Dependencies
License Issues
This module is subject to the license the PhaseLibs project is published under.