NSWI144 - Data Integration and Quality

Basic information

Semester assignment

Briefly

  1. Identify source data
  2. Download/Extract/Scrape
  3. Triplify (convert to RDF according to Linked Data principles, aiming for 5-star data)
  4. Identify vocabularies to be reused, use them, maybe create your own
  5. Link (internally, externally)
  6. Store
  7. Query & Use – in a demo application
  8. Use RDFa/Microformats/Microdata
  9. Present – at the last lecture, 2019-01-09
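
As a rough illustration of steps 3 to 5, the sketch below turns CSV rows into N-Triples using only the Python standard library. It is not part of the assignment specification: the `http://example.org/resource/` namespace, the column names `id`, `name` and `same_as`, and the helper names are all hypothetical. FOAF `name` and `owl:sameAs` are real vocabulary terms, reused here in the spirit of step 4.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace for our own resources (an assumption, not from the course).
BASE = "http://example.org/resource/"
# Reused vocabulary terms (real IRIs).
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triplify(csv_text):
    """Convert CSV text with columns id, name, same_as into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Mint a subject IRI from the row identifier (percent-encoded for safety).
        subject = "<" + BASE + "person/" + quote(row["id"], safe="") + ">"
        name = escape_literal(row["name"])
        triples.append(subject + " <" + FOAF_NAME + '> "' + name + '" .')
        # External link (step 5): emitted only when the column is filled in.
        if row.get("same_as"):
            triples.append(
                subject + " <" + OWL_SAME_AS + "> <" + row["same_as"] + "> ."
            )
    return triples


if __name__ == "__main__":
    sample = (
        "id,name,same_as\n"
        "1,Ada,http://dbpedia.org/resource/Ada_Lovelace\n"
        "2,Bob,\n"
    )
    print("\n".join(triplify(sample)))
```

In practice a tool such as Tarql or an RDF library would handle IRI validation, datatypes and serialization. The hand-written version only shows what a triplification step produces.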

In detail

See the detailed assignment specification

Lab prerequisites

Make sure that you can run the following for the corresponding lab:

  1. Web browser, Internet access
  2. Web browser, Internet access, Java JDK, Git, Tarql (requires Apache Maven)
  3. Web browser, Internet access, Java JDK, Git, Silk Link Discovery Framework
  4. Web browser, Internet access
  5. Java JDK, Java IDE (Eclipse, NetBeans, or IntelliJ IDEA), Apache Jena or Eclipse RDF4J

Lecture slides

  1. Introduction to Linked Data
  2. RDF, RDFS, Serializations
  3. SPARQL, Tarql
  4. Dublin Core, SKOS, RDF Data Cube Vocabulary
  5. Linked Data Patterns
  6. Linking, Silk, Metadata (DCAT, DCAT-AP, VoID)
  7. RDFa, Microformats, Microdata, SPARQL Update, Quadstores
  8. RDF APIs - Eclipse RDF4J, Apache Jena
  9. Direct Mapping, R2RML
  10. (Linked) Data Quality