Version 5 (modified by endres, 20 years ago)

AlignmentGenerator: Hotspot AlignmentGenerator

Developers: Björn Endres

Description

This experimental AlignmentGenerator tries to tackle large-scale ontology alignment, for which other approaches fail due to high memory and/or performance requirements. The general idea is a "divide and conquer" approach: it continuously aligns parts of the ontologies and joins the partial alignments into a single, large alignment.

A basic concept used within this approach is the ContextOntology. It defines a subontology by taking a single class of another ontology, called the anchor, and adding all related classes within a certain distance of it. The relation currently used is the subclass relation, but any other class-to-class relation could be used just as well. This distance is called the context depth. The resulting set of classes is then interpreted as a new (sub)ontology. Every relation/property of the original classes can be used, but each has to be truncated to the classes that are actually contained in the new ontology.
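
The construction described above can be sketched as a breadth-first search that stops at the context depth. This is only an illustration, not the generator's actual code: the class `ContextSketch`, its method names, the use of plain strings for classes, and the choice to follow the subclass relation in both directions (towards super- and subclasses) are all assumptions.

```java
import java.util.*;

/** Hypothetical sketch: classes are plain strings, relations are string pairs. */
final class ContextSketch {

    /**
     * Collects the anchor and every class reachable within contextDepth steps.
     * Assumption: `neighbours` maps each class to its direct super- and subclasses.
     */
    static Set<String> contextClasses(Map<String, Set<String>> neighbours,
                                      String anchor, int contextDepth) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(anchor);
        List<String> frontier = List.of(anchor);
        for (int depth = 0; depth < contextDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String cls : frontier) {
                for (String n : neighbours.getOrDefault(cls, Set.of())) {
                    if (visited.add(n)) {   // true only for classes not seen before
                        next.add(n);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    /** Truncates a relation to the pairs whose two ends both lie inside the context. */
    static List<String[]> truncate(List<String[]> relation, Set<String> context) {
        List<String[]> kept = new ArrayList<>();
        for (String[] pair : relation) {
            if (context.contains(pair[0]) && context.contains(pair[1])) {
                kept.add(pair);
            }
        }
        return kept;
    }
}
```

With a context depth of 0 the result contains only the anchor; each additional level adds one more ring of related classes, which is what makes the depth a natural knob for the size of the subontology.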

In this context, a Hotspot consists of the following attributes:

a source ContextOntology
a target ContextOntology
an alignment between those two

The general idea of the algorithm is to identify a Hotspot using single sure matches as anchors. This Hotspot is then optimised in size with respect to a hotspot quality measure. Finally, the resulting Hotspot is joined with every other Hotspot it overlaps in both source and target.
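
The joining step can be illustrated with a small sketch. It is not the actual implementation: `HotspotSketch` and its members are hypothetical, the two ContextOntologies are reduced to sets of class names, and the alignment is reduced to a source-to-target map. Repeating the join until nothing overlaps any more, and keeping the existing correspondence when two hotspots disagree, are likewise assumptions.

```java
import java.util.*;

/** Hypothetical model of a hotspot: two class sets plus an alignment between them. */
final class HotspotSketch {
    final Set<String> sourceClasses;
    final Set<String> targetClasses;
    final Map<String, String> alignment;   // source class -> target class

    HotspotSketch(Set<String> source, Set<String> target, Map<String, String> alignment) {
        this.sourceClasses = new LinkedHashSet<>(source);
        this.targetClasses = new LinkedHashSet<>(target);
        this.alignment = new LinkedHashMap<>(alignment);
    }

    /** Two hotspots overlap only if they share classes on BOTH sides. */
    boolean overlaps(HotspotSketch other) {
        return !Collections.disjoint(sourceClasses, other.sourceClasses)
            && !Collections.disjoint(targetClasses, other.targetClasses);
    }

    /** Union of both hotspots; on conflicting correspondences this hotspot wins (assumption). */
    HotspotSketch join(HotspotSketch other) {
        Set<String> source = new LinkedHashSet<>(sourceClasses);
        source.addAll(other.sourceClasses);
        Set<String> target = new LinkedHashSet<>(targetClasses);
        target.addAll(other.targetClasses);
        Map<String, String> merged = new LinkedHashMap<>(alignment);
        other.alignment.forEach(merged::putIfAbsent);
        return new HotspotSketch(source, target, merged);
    }

    /** Adds a fresh hotspot, absorbing every existing hotspot it (transitively) overlaps. */
    static List<HotspotSketch> addAndJoin(List<HotspotSketch> existing, HotspotSketch fresh) {
        List<HotspotSketch> rest = new ArrayList<>(existing);
        HotspotSketch merged = fresh;
        boolean changed = true;
        while (changed) {                  // a join may create new overlaps, so repeat
            changed = false;
            Iterator<HotspotSketch> it = rest.iterator();
            while (it.hasNext()) {
                HotspotSketch h = it.next();
                if (merged.overlaps(h)) {
                    merged = merged.join(h);
                    it.remove();
                    changed = true;
                }
            }
        }
        rest.add(merged);
        return rest;
    }
}
```

Requiring an overlap on both sides keeps two hotspots apart when they merely share, say, a common target class but cover unrelated parts of the source ontology.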

To implement this approach, the algorithm requires two exchangeable modules, which are specified by the interfaces HotspotIdentifier and HotspotMeasure.
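
To make the role of the two modules more concrete, the following sketch shows one way they might plug into the size optimisation. Only the interface names come from this page; the method signatures, the type parameters, the class `HotspotOptimiser`, and the strategy of simply trying every context depth up to a maximum are assumptions made for illustration.

```java
/** Hypothetical signature: rates a hotspot, higher meaning better. */
interface HotspotMeasure<H> {
    double quality(H hotspot);
}

/** Hypothetical signature: builds a hotspot around a pair of anchors at a given depth. */
interface HotspotIdentifier<A, H> {
    H identify(A sourceAnchor, A targetAnchor, int contextDepth);
}

final class HotspotOptimiser {

    /**
     * Tries every context depth from 1 to maxDepth and keeps the best-rated hotspot.
     * Exhaustive search is an assumption; returns null if maxDepth is below 1.
     */
    static <A, H> H optimise(HotspotIdentifier<A, H> identifier,
                             HotspotMeasure<H> measure,
                             A sourceAnchor, A targetAnchor, int maxDepth) {
        H best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int depth = 1; depth <= maxDepth; depth++) {
            H candidate = identifier.identify(sourceAnchor, targetAnchor, depth);
            double score = measure.quality(candidate);
            if (score > bestScore) {       // ties keep the smaller hotspot
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```

Because both modules are passed in as parameters, a different identification strategy or quality measure can be swapped in without touching the optimisation loop.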

Characteristics

Evaluation/Performance

Specification

Parameters

Parameter name	ValueType	Default	Description
PARAM_TAXONOMY_ONLY	Boolean	`FALSE`	Defines whether only the taxonomy (classes) should be aligned. By default, the classes are aligned along with the properties.
PARAM_THRESHOLD	Double	`0.0`	This standard parameter forces the algorithm to drop all relations with a confidence below the threshold. However, the value must be chosen carefully, since the confidences produced by the BordaCount algorithm have no explanatory value on their own. It would probably make more sense to apply a threshold to each SimilarityMeasure, but this would require a set of threshold parameters rather than a single one.
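
The effect of PARAM_THRESHOLD can be pictured with a short sketch. It assumes that relations are identified by a string key and carry a confidence value; `ThresholdSketch` and `applyThreshold` are hypothetical names and not part of the generator.

```java
import java.util.*;

/** Hypothetical sketch of threshold filtering over relation confidences. */
final class ThresholdSketch {

    /** Keeps only the relations whose confidence is at least the threshold. */
    static Map<String, Double> applyThreshold(Map<String, Double> confidences,
                                              double threshold) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : confidences.entrySet()) {
            if (e.getValue() >= threshold) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

With the default of `0.0`, no relation with a non-negative confidence is dropped.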