[[TracNav]] = AlignmentGenerator: [wiki:PhaseLibs PhaseTab] Algorithm = Developers: [mailto:endres(at)dfki.uni-kl.de Björn Endres] and [mailto:Malte.Kiesel(at)dfki.de Malte Kiesel] == Description == The [wiki:generator_PhaseTab PhaseTab] algorithm is a generic algorithm that employs an arbitrary number of [wiki:SimilarityMeasure SimilarityMeasures] to generate an alignment. The algorithm was originally developed by [mailto:Malte.Kiesel(at)dfki.de Malte Kiesel] and [mailto:Ludger.van_Elst(at)dfki.de Ludger van Elst] and has been presented in the paper referenced [wiki:PhaseLibs here]. It was then adapted for the PhaseLibs project. The basic principle of the algorithm is the iterative confirmation of the best ranked relation proposals. This approach makes use of a feature of some [wiki:SimilarityMeasure SimilarityMeasures], especially the [wiki:similarity_SimFlood similarity flooding], which increase their precision when given higher confidences on correct relations. In order to archieve a robust and fair ranking, a [wiki:generator_BordaCount BordaCount] alogrithm is used to identify the {{{n}}} best rated realtion of all [wiki:SimilarityMeasure SimilarityMeasures] involved. An iteration cycle now consists of three phases: 1. Generate the SimilarityMatrix for all [wiki:SimilarityMeasure SimilarityMeasures] involved, using the current alignment as an input; the current alignment being empty in the first iteration. If the SimilarityMeasure is independent of such input, e.g. with a frame name based comparison measure, the matrix should not be recalculated. This is, however, up to the measure to decide. 2. Copy all 1.0 rated realtions, i.e. "confirmed ones", into a new and empty alignment. 3. Rank the remaining relations and identify the {{{n}}} best. These are also confirmed by setting their confidence to 1.0. 4. Finally add all remaining relations with the confidences calculated by the [wiki:generator_BordaCount BordaCount] algorithm. It is up to the used Alignment implementation, to filter these masses of relations. Typically, a MatchingAlignment would be used. 5. Reiterate 1)-4) until a maximal number is reached or all relations are confirmed. == Characteristics == This algorithm estimates, that the best scoring relations can be confied in. According to our experiences, this is the case in many scenarios. It is, however, highly vulnerable to errors, since they become confirmed and add a high amount of noise to the algorithms confidences. Therefore, it is crucial to keep an eye on at least two parameters: the number {{{n}}} of relations confirmed per iteration, and the number of iterations. We have not yet been able to define a general criteria to determine when the algorithm trails into cascades of false confirmation. The alignment generated by this algorithm lacks, due to its radical approach, reasonable relation confidences. This should not be a problem for end users, but is not beneficial when it comes to using it as a subroutine. == Evaluation/Performance == == Specification == === Parameters === || '''Parameter name''' || '''ValueType''' || '''Default''' || '''Description''' || || PARAM_TAXONOMY_ONLY || Boolean || {{{FALSE}}} || Defines whether only the taxonomy (classses) should be aligned. This defaults to aligning the classes along with the properties.|| || PARAM_THRESHOLD || Double || {{{0.0}}} || This standard parameter forces the algorithm to drop all relations with lesser confidence. However, this parameter must be chosen carefully, since the confidences produced by the [wiki:generator_BordaCount BordaCount] algorithm are no explanatory values. It would probably make more sense, to apply a threshold to each SimilarityMeasure, but this requires for a set of threshold parameters and not just a single one.|| || PARAM_MAXIMAL_ITERATIONS || Integer || {{{64}}} || Defines the number of iteration cycles to be performed. Thus (PARAM_MAXIMAL_ITERATIONS * PARAM_CONFIRMATION_STEP_SIZE) relations eventually be confirmed.|| || PARAM_CONFIRMATION_STEP_SIZE || Integer || {{{5}}} || Defines the number of relations to confirm with each iteration.|| || PARAM_FIX_1_0_RELATIONS || Boolean || {{{TRUE}}} || EXPERIMENTAL: This enables the user to lift the 1.0 restriction, i.e. each iteration the relations with confidence 1.0 are set to 0.9999 to enable the system to review them. However, first test produced poor results.|| === Dependencies === The algorithm uses the [wiki:generator_BordaCount BordaCount] alignment generator and the MatchingAlignment alignment implementation. == License Issues == This module is subject to the license the PhaseLibs project is published under.