SimilarityMeasure: Graphmatching
Developer: Barbara Spillmann
Description
The Graph Matching module can be used to determine similarities of ontologies with respect to the structures of the ontologies under consideration. An ontology is mapped to a graph where each ontology element (concepts, relations, attributes) is represented as a vertex in the graph. Once such graph structures have been extracted, maximum common subgraphs (mcs) can be determined. An mcs refers to structurally equal parts of two graphs, or finally of two ontologies.
Originally, the definition of the maximum common subgraph of two graphs is strict in the sense that only exact subgraphs regarded. Applied to ontologies this is really not uesful as exact subparts of ontologies are rare. Therefore, a weaker definition of the mcs is used: subgraphs are defined to be common if they are similar enough. The similarity of subgraphs can be measured by various measures and possibly controlled by a threshold parameter.
Characteristics
The calculation of the maximum common subgraph is very time-consuming.
Evaluation/Performance
Some evaluation tests on the Benchmarks have been performed. You can inspect the results here.
Specification
Initialisation
The SimilarityMeasure main class is
de.dfki.km.phaselib.impl.similarities.graphmatching.mxcs.MaxComSubgraphMeasure
Initialisation is straight forward:
new MaxComSubgraphMeasure()
In order to use the graph matching module together with the INRIA Alignment package, an implementation of the AlignmentProcessAdapter is needed, which is the class
de.dfki.km.phaselib.impl.alignments.inria.MaxComSubgraphAlgorithm
Its getGeneratorProcess() method then returns a GraphBasedAlignment with a MaxComSubgraphMeasure as the similarity measure.
Parameters
Parameter | Implementation | Description |
'VertexComparator' 'EdgeComparator' | MeanComparatorList LogicalANDComparatorList LogicalORComarator ThresholdedComparatorList | |
'VertexComparatorI' 'EdgeComparatorI' | GraphComponentTypeEquality LabelEquality DummyTrue LabelEditDistance SimilarityMatrixBasedComparator RandomComparator | |
'VertexThresholdI' 'EdgeThresholdI' | -- | a real number in [0, 1] |
'OntoGraphType' | TaxRelGraph TaxonomyGraph | |
'AssocGraph' | ALOE TEE |
These parameters are passed by a mxcs_config file, which is an xml file. An example is the following:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> <properties> <comment>example config file</comment> <entry key="VertexComparator">de.dfki.km.phaselib.impl.similarities.graphmatching.impl.graph.component.comparator.MeanComparatorList</entry> <entry key="VertexThreshold">0.0</entry> <entry key="VertexComparator0">de.dfki.km.phaselib.impl.similarities.stringbased.StringBasedSimilarity</entry> <entry key="VertexThreshold0">0.7</entry> <entry key="EdgeComparator">de.dfki.km.phaselib.impl.similarities.graphmatching.impl.graph.component.comparator.MeanComparatorList</entry> <entry key="EdgeThreshold">0.0</entry> <entry key="EdgeComparator0">de.dfki.km.phaselib.impl.similarities.graphmatching.impl.graph.component.comparator.GraphComponentTypeEquality</entry> <entry key="EdgeThreshold0">0.0</entry> <entry key="OntoGraphType">de.dfki.km.phaselib.impl.similarities.graphmatching.impl.graph.TaxRelGraph</entry> <entry key="AssocGraph">ALOE</entry> </properties>
The final call would be: java fr.inrialpes.exmo.align.util.Procalign -i de.dfki.km.phaselib.impl.alignments.inria.MaxComSubgraphAlgorithm -o align.rdf -Dconfig=mxcs_config -Dinternal=true ontology1 ontology2
NB: The internal parameter is used to match only internal elements. It has a direct effect on the precision and recall values.
Dependencies
The Graph eXchange Language GXL.
License Issues
TODO