Phase-Libs Documentation
Overview
The Phase-Libs project lets users combine alignment algorithms into an ontology matching procedure tailored to each application scenario. Its core is therefore a set of interfaces that provide a framework for integrating various types of modules into a single procedure.
Each module belongs to one of the following levels:
- Ontology level Since all kinds of ontologies and similar structures should be usable as alignees, the Phase-Libs define an ontology interface. Ontology Adapters implement this interface to make a particular kind of ontological data accessible to the system.
- Similarity level The basis of most alignment approaches is some sort of similarity between the entities of the ontologies to be aligned. We have therefore created an interface for such Similarity Measures to make them interchangeable.
- Procedure level The highest level is a complete alignment procedure that is able to calculate an alignment for a given pair of ontologies. This is certainly the single most important level from a user's point of view.
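To illustrate how the three levels could fit together, here is a minimal Python sketch. It is not the Phase-Libs API: the names `Ontology`, `DictOntology`, `exact_label_similarity` and `align`, as well as the threshold parameter, are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, FrozenSet, Protocol


class Ontology(Protocol):
    """Ontology level (hypothetical): read-only access to entities and labels."""

    def entities(self) -> FrozenSet[str]: ...

    def label(self, entity: str) -> str: ...


# Similarity level (hypothetical): any function scoring an entity pair in [0, 1].
SimilarityMeasure = Callable[[Ontology, str, Ontology, str], float]


class DictOntology:
    """Toy adapter backed by a dict; it exposes no way to modify the data."""

    def __init__(self, labels: Dict[str, str]) -> None:
        self._labels = dict(labels)  # private copy, never handed out

    def entities(self) -> FrozenSet[str]:
        return frozenset(self._labels)

    def label(self, entity: str) -> str:
        return self._labels[entity]


def exact_label_similarity(left: Ontology, a: str, right: Ontology, b: str) -> float:
    """Return 1.0 if both labels are equal ignoring case, else 0.0."""
    return 1.0 if left.label(a).lower() == right.label(b).lower() else 0.0


def align(left: Ontology, right: Ontology, measure: SimilarityMeasure,
          threshold: float = 0.5) -> Dict[str, str]:
    """Procedure level: map each left entity to its best-scoring right entity."""
    alignment: Dict[str, str] = {}
    candidates = sorted(right.entities())  # sorted for deterministic results
    for a in sorted(left.entities()):
        best = max(candidates, key=lambda b: measure(left, a, right, b), default=None)
        if best is not None and measure(left, a, right, best) >= threshold:
            alignment[a] = best
    return alignment
```

Because `align` depends only on the two interfaces, a different adapter or measure can be swapped in without changing the procedure.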
Modules
This section lists the modules contained in the Phase-Libs project.
1. Ontology Adapters
All ontology adapters are considered strictly read-only, i.e. it must not be possible to modify an ontology through these adapters. All alignment algorithms rely on this assumption.
- OWL Ontologies This adapter enables the Phase-Libs to access OWL ontologies, represented as RDF files.
- RDFS Ontologies This adapter uses the HP Jena system to access RDFS-based ontologies.
- Protégé Ontologies Provides an interface to all Protégé projects.
- Document Classification Stores DCSs are integrated document management and classification systems. Their document topic taxonomy can be accessed as a simple kind of ontology.
- Composite ontology This is a second-level adapter that creates modified views of other ontologies by expanding or contracting subgraphs.
- Testing ontology This is a simple implementation of the ontology interface that can be used for testing purposes.
2. Similarity Measures
- String Based Similarity This measure uses n-gram matching on the entities' labels to calculate a similarity.
- SimilarityFlooding Uses a propagation mechanism to promote a given or confirmed similarity to the neighboring nodes.
- Instance Based Similarity Determines the similarity between concepts by calculating the similarity between sets of example documents.
- Graph Matching Uses some sort of classifier to determine an entity similarity (TODO: improve)
- Acronym Matcher Analyses the entities' labels to check whether one is an acronym of the other.
- DCS Keyword Matcher Determines how the label of one entity matches a DCS category representing another concept.
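Two of the label-based measures above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Phase-Libs implementation: the page does not say which n-gram size or scoring formula is used, so the sketch assumes character trigrams scored with the Dice coefficient, and an acronym test that compares one label with the word initials of the other. All function names are hypothetical.

```python
def ngrams(label: str, n: int = 3) -> set:
    """Return the set of character n-grams of a lower-cased label."""
    text = label.lower()
    if len(text) < n:
        # Labels shorter than n contribute themselves as a single gram.
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Dice coefficient over n-gram sets (assumed formula), in [0, 1]."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def is_acronym(short: str, long: str) -> bool:
    """True if `short` equals the word initials of `long`, ignoring case."""
    initials = "".join(word[0] for word in long.split())
    return len(short) > 1 and short.lower() == initials.lower()


def acronym_similarity(a: str, b: str) -> float:
    """Return 1.0 if either label is an acronym of the other, else 0.0."""
    return 1.0 if is_acronym(a, b) or is_acronym(b, a) else 0.0
```

Both functions take plain label strings, so they could sit behind a common similarity interface and be exchanged freely.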
3. Alignment Generators
The algorithms presented here are not yet suitable for production use. They are merely used to test the underlying modules.
- Phase-Tab Algorithm This resembles Malte Kiesel's original Phase-Tab algorithm.
- Simple Borda Count This generator uses a Borda count algorithm to combine an arbitrary set of similarity measures into a solid alignment.
- Simple Evidence Simply converts a single similarity measure into an alignment.
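The Borda count and single-measure generators can be sketched as follows. This is a generic illustration of the two ideas and not the project's code: it assumes each measure is a function over two labels, that every measure ranks all target candidates for a source entity, that ties are broken alphabetically, and that the single-measure generator applies a threshold. All names and the threshold value are hypothetical.

```python
from typing import Callable, Dict, Sequence

Measure = Callable[[str, str], float]  # hypothetical: scores two labels


def borda_scores(source: str, targets: Sequence[str],
                 measures: Sequence[Measure]) -> Dict[str, int]:
    """Sum Borda points: in each measure's ranking, the best of k targets
    earns k - 1 points, the next k - 2, and the last 0."""
    points = {t: 0 for t in targets}
    for measure in measures:
        # Best candidate first; ties are broken by name to stay deterministic.
        ranking = sorted(targets, key=lambda t: (-measure(source, t), t))
        for rank, target in enumerate(ranking):
            points[target] += len(targets) - 1 - rank
    return points


def borda_alignment(sources: Sequence[str], targets: Sequence[str],
                    measures: Sequence[Measure]) -> Dict[str, str]:
    """Align each source with the target holding the most Borda points."""
    alignment: Dict[str, str] = {}
    for source in sources:
        points = borda_scores(source, targets, measures)
        if points:
            alignment[source] = min(points, key=lambda t: (-points[t], t))
    return alignment


def simple_evidence(sources: Sequence[str], targets: Sequence[str],
                    measure: Measure, threshold: float = 0.5) -> Dict[str, str]:
    """Turn one measure into an alignment by keeping the best target per
    source, provided its score reaches the (assumed) threshold."""
    alignment: Dict[str, str] = {}
    for source in sources:
        best = max(targets, key=lambda t: measure(source, t), default=None)
        if best is not None and measure(source, best) >= threshold:
            alignment[source] = best
    return alignment
```

Borda counting uses only the rank order each measure produces, so measures with very different score scales can be combined without normalising them first.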