
Ontotext Experience

CTO Research

Our CTO group researched such tools in May-Aug 2012, investing considerable time (25 person-days):

  • conf page: [Ontotext:CTO Link Discovery] (you may have no access), includes presentation, relevant parts copied below
  • jira task: ONTOTECH-207 (you may have no access)
  • overview of SILK, LIMES, FEBRL.
  • In their opinion SILK is clearly the best
  • practical experiments with SILK, including:
    • DBpedia/Geonames: cities, museums
    • DBpedia/MusicBrainz: music bands
    • DrugBank/LinkedCT: drugs vs interventions
  • SILK test installation is still available

Ontotext IdRF

[IdRF:]: implemented by Milena Yankova as part of her thesis.
Used for a European Skills, Competences and Occupations classification (ESCO) mapping pilot.

Notable Tools

  • Research this list
  • Also see [DOM:CLAS#CLAS and ontology matching]

Note: Karma is not a terminology matching tool. It's a data mapping tool, see review: Data Migration and Ingestion Tools#Karma

SILK

Silk - A Link Discovery Framework for the Web of Data
"A tool for discovering relationships between data items within different Linked Data sources. Data publishers can use Silk to set RDF links from their data sources to other data sources on the Web."
Framework for mapping LOD entities, terminologies, etc. Includes a workbench, matching engine, server and MapReduce implementation. Scales very well and offers a distributed processing option.

  • Output - alignment of entities, connected by owl:sameAs or any other defined predicate
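To make the output format concrete, here is a minimal sketch of how a list of matched entity pairs could be written out as N-Triples. It is not Silk's own serializer; the function name `to_ntriples` and the example.org URIs are assumptions made for illustration only.

```python
# Default linking predicate; any other predicate URI can be passed instead.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def to_ntriples(links, predicate=OWL_SAME_AS):
    """Serialise (source_uri, target_uri) pairs as N-Triples lines."""
    return "\n".join(f"<{s}> <{predicate}> <{t}> ." for s, t in links)

# Hypothetical URIs, used only to show the shape of the output.
links = [
    ("http://example.org/sourceA/city/1", "http://example.org/sourceB/place/42"),
]
print(to_ntriples(links))
```

Passing a different `predicate` (for example a SKOS mapping property) covers the "any other defined predicate" case.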

Silk Links

Papers:

Silk Implementation

  • Developed by Chris Bizer with funding by Vulcan Inc and FP7 LOD2
  • Apache Software License (ASL), version 2
  • Implemented in Scala, a scalable JVM language (see Yammer discussions)
  • Multi-threading support
  • Source code available and recently updated
  • Integration: Java API for in-process integration
  • Server edition allows for implementing a reconciliation service: it maintains an internal DB of known entities and can process an input stream of new data to be reconciled

Silk Rules

Data sets are interlinked via mapping rules. The rule language is powerful:

  • mappings based on similarities of property values (labels, names, dates, etc.)
  • fuzzy match strategies, similarity metrics, data normalisation, aggregation on similarity metrics (min, max, average)
  • restrictions on (ontology) classes/types

E.g. a rule can match cities based on:

  • name (string comparison after lowercase transformation and with a certain Levenshtein distance tolerance),
  • population (numeric with a certain tolerance) and geographic location;
  • combined in a certain way
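The city rule described above can be sketched in plain Python. This is not Silk's rule language or API; the scoring formulas, tolerances, the min/average combination and the sample city records are all assumptions chosen for illustration.

```python
import math

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def name_score(a, b, max_edits=2):
    """Lowercase both names; score stays positive up to max_edits edits."""
    d = levenshtein(a.lower(), b.lower())
    return max(0.0, 1.0 - d / (max_edits + 1))

def population_score(p1, p2, rel_tol=0.1):
    """Numeric comparison with a relative tolerance (assumed 10%)."""
    if max(p1, p2) == 0:
        return 1.0
    diff = abs(p1 - p2) / max(p1, p2)
    return max(0.0, 1.0 - diff / rel_tol)

def geo_score(lat1, lon1, lat2, lon2, max_km=20.0):
    """Haversine distance, scored linearly down to 0 at max_km."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    dist = 2 * r * math.asin(math.sqrt(h))
    return max(0.0, 1.0 - dist / max_km)

def city_rule(a, b):
    """Combine: the name must match AND (population, location) agree on average."""
    supporting = (population_score(a["population"], b["population"])
                  + geo_score(a["lat"], a["lon"], b["lat"], b["lon"])) / 2
    return min(name_score(a["name"], b["name"]), supporting)

# Made-up records standing in for a dbpedia:City and two geonames:P entries.
source_city = {"name": "Sofia", "population": 1200000, "lat": 42.70, "lon": 23.32}
target_city = {"name": "sofia", "population": 1250000, "lat": 42.69, "lon": 23.33}
other_city = {"name": "Plovdiv", "population": 340000, "lat": 42.15, "lon": 24.75}

for candidate in (target_city, other_city):
    score = city_rule(source_city, candidate)
    print(candidate["name"], "link" if score >= 0.5 else "no link")
```

Using `min` at the top level makes the name comparison mandatory, while averaging population and location lets one compensate for the other; other aggregations (max, weighted average) would give a different trade-off.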

Example linkage (dbpedia:City vs. geonames:P)
[Figure: silkRule.png (image not available)]

Silk Rule Development

  • Manual mapping (via Linkage Rule Editor) - visual interface supports rule development (drag & drop, wizards, templates)
  • Linkage rule generation
    • Suggests pairing candidates and the user approves/rejects suggestions
    • Based on the user response the system generates linkage rules using genetic algorithms (interactive rule learning)
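The interactive learning loop can be illustrated with a toy genetic algorithm. This is only a sketch under strong assumptions: it evolves the weights and threshold of a fixed two-comparison rule from approved/rejected pairs, whereas Silk learns whole linkage rules. The data, function names and parameters below are all hypothetical.

```python
import random

# Each labelled pair: (similarity scores per comparison, user verdict).
# Scores and verdicts are made-up illustration data.
LABELLED = [
    ((0.95, 0.90), True),
    ((0.90, 0.40), True),
    ((0.85, 0.80), True),
    ((0.30, 0.95), False),
    ((0.20, 0.10), False),
    ((0.55, 0.20), False),
]

def predict(rule, scores):
    """A rule is (w_name, w_pop, threshold): weighted average vs threshold."""
    w_name, w_pop, threshold = rule
    total = w_name + w_pop
    if total == 0:
        return False
    combined = (w_name * scores[0] + w_pop * scores[1]) / total
    return combined >= threshold

def fitness(rule, labelled):
    """Fraction of user verdicts the rule reproduces."""
    hits = sum(predict(rule, s) == verdict for s, verdict in labelled)
    return hits / len(labelled)

def mutate(rule, rng, sigma=0.1):
    """Add Gaussian noise to each gene, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0, sigma))) for g in rule)

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return tuple(rng.choice(pair) for pair in zip(a, b))

def evolve(labelled, generations=30, size=20, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.random() for _ in range(3)) for _ in range(size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, labelled), reverse=True)
        history.append(fitness(population[0], labelled))
        elite = population[: size // 2]  # survivors are kept unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(size - len(elite))
        ]
        population = elite + children
    best = max(population, key=lambda r: fitness(r, labelled))
    return best, history

best_rule, history = evolve(LABELLED)
print("best rule:", best_rule, "fitness:", fitness(best_rule, LABELLED))
```

In an interactive setting, each round of user approvals and rejections would extend `LABELLED` before the next call to `evolve`. Because the elite survive unchanged, the best fitness never decreases between generations.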

STITCH@CATCH

http://stitch.cs.vu.nl/: Vocabulary & Alignment Repository ("Semantic Interoperability To access Cultural Heritage")
"Web-based Repository Service for Vocabularies and Alignments in Cultural Heritage": ESWC10 paper and slides by the current Europeana scientific advisor

A vocabulary server including:

  • conversion to SKOS
  • alignment (mapping)
  • access by Web services, SOAP, JSON...
  • annotation suggestions, especially for multi-lingual mapping. "subjected a corpus of 250.000 dually indexed books of the National Library of the Netherlands (KB) to an instance-based method to derive an alignment between two KOSs, the Brinkman thesaurus and the Biblion one"
    (KB is the host organization for Europeana)
  • numerous demos
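The instance-based method quoted above relies on books that are indexed with both vocabularies. A minimal sketch of that idea follows: two concepts are aligned when they annotate largely the same books. The use of the Jaccard coefficient, the threshold and the sample data are assumptions; the source does not say which measure STITCH used.

```python
from collections import defaultdict

def instance_based_alignment(books, min_jaccard=0.5):
    """books: iterable of (concepts_from_A, concepts_from_B), one entry per book."""
    books_a = defaultdict(set)  # concept in vocabulary A -> ids of books using it
    books_b = defaultdict(set)  # concept in vocabulary B -> ids of books using it
    for book_id, (concepts_a, concepts_b) in enumerate(books):
        for c in concepts_a:
            books_a[c].add(book_id)
        for c in concepts_b:
            books_b[c].add(book_id)
    alignment = []
    for ca, ids_a in books_a.items():
        for cb, ids_b in books_b.items():
            # Jaccard overlap of the two sets of annotated books
            jaccard = len(ids_a & ids_b) / len(ids_a | ids_b)
            if jaccard >= min_jaccard:
                alignment.append((ca, cb, jaccard))
    return sorted(alignment, key=lambda x: -x[2])

# Made-up dually indexed books with hypothetical concept labels.
books = [
    ({"cats"}, {"felines"}),
    ({"cats"}, {"felines"}),
    ({"dogs"}, {"canines"}),
    ({"cats", "dogs"}, {"pets"}),
]
for concept_a, concept_b, score in instance_based_alignment(books):
    print(concept_a, "~", concept_b, round(score, 2))
```

The pairwise loop is quadratic in the number of concepts, which is fine for a sketch but would need candidate pruning on a corpus of 250.000 books.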

Also see CATCH, STITCH for tools from related people

Amalgame

Amalgame is a tool for semi-automatic vocabulary mapping by VU Amsterdam.
Loaded with a number of vocabularies.
