NPRG036 - Data Formats
Basic information - winter 2024
- The lectures and tutorials are on-site, in person. Slides and videos in English and Czech from previous years are provided on this webpage. This year, the lecture and one of the tutorials will be taught in English.
- Homework will be done in groups and will have 4 parts with corresponding deadlines.
- All 4 parts of the homework need to be turned in before the individual deadlines in order to proceed to the final exam.
Lectures - Thursdays 12:20 in S3
- 2024-10-03: Data formats introduction: Google Slides, YouTube (English), YouTube (Czech)
- 2024-10-10: Graph data formats - RDF, RDF Schema, Linked Data, Open World Assumption: Google Slides, YouTube (English), YouTube (Czech)
- 2024-10-17: Graph data formats - SPARQL: Google Slides, YouTube (English), YouTube (Czech)
- 2024-10-24: Graph data formats - Basic vocabularies, Wikidata: Google Slides, YouTube (English), YouTube (Czech)
- 2024-10-31: Graph data formats - Labeled property graph model, Cypher, RDF-star: Google Slides, YouTube (English), YouTube (Czech)
- 2024-11-07: Hierarchical data formats - XML, XML Schema: Google Slides, YouTube (English), YouTube (Czech)
- 2024-11-14: No lecture
- 2024-11-21: Hierarchical data formats - XPath, XSLT: Google Slides, YouTube (English), YouTube (Czech)
- 2024-11-28: Hierarchical data formats - JSON, JSON Schema, JSON-LD: Google Slides, YouTube (English), YouTube (Czech)
- 2024-12-05: Relational data formats - SQL dump, CSV, CSV on the Web: Google Slides, YouTube (English), YouTube (Czech)
- 2024-12-12: Formats for geodata by guest speaker Michal Med: PDF, YouTube
- 2024-12-19: Key-value, configuration formats - .properties, INI, TOML, YAML: Google Slides, YouTube (English), YouTube (Czech), Formats for text documents: Google Slides, YouTube (English), YouTube (Czech)
- 2025-01-09: Multimedia formats - images, video, audio, containers, print formats: Google Slides, YouTube (English), YouTube (Czech), Print formats on YouTube (Czech)
Tutorials
This section links to the tutorials with examples. There are three tutorial groups per week. The tutorials are split into (R) Recommended, where we go through what you need for the homework, and (O) Optional, which are shorter and can be practiced at home, so you need to come to the tutorial only if you want to consult something (the homework).
- T1: Thursdays 14:00, S4 - bring your own laptop!, Czech
- T2: Thursdays 15:40, SW2, English
- T3: Wednesdays 12:20, SU2, Czech
Schedule and slides
The slides contain assignments to be practiced during the tutorial.
In case of problems, consult them during the tutorial.
Tutorials are numbered from the first one after the first lecture, i.e. T1 and T2 have Tutorial 1 in the first week, and T3 has Tutorial 1 in the second week.
Exact dates for the groups are available in the tooltip when you hover over the tutorial number below.
- Tutorial 1 (R): Conceptual Modeling
- Tutorial 2 (R): RDF
- Tutorial 3 (R): SPARQL
- Tutorial 4 (O): Wikidata
- Tutorial 5 (R): LPG & Cypher
- Tutorial 6 (R): XML & XML Schema, No tutorial for T3 on 2024-11-13
- Tutorial 7 (R): XPath & XSLT, No tutorial for T1 and T2 on 2024-11-14
- Tutorial 8 (R): JSON, jq, JSON Schema, JSON-LD
- Tutorial 9 (R): CSV, CSV on the Web
- Tutorial 10 (O): Geodata - GeoJSON, WKT, CRS, QGIS
- Tutorial 11 (O): Key-value formats - TOML, YAML, Formats for text documents
- Tutorial 12 (O): Multimedia formats
Homework
Homework will be done in groups and will have 4 parts. All 4 parts of the homework need to be turned in using the SIS Study group roster module before the individual deadlines in order to proceed to the final exam. The tutor's comments on the homework solutions need to be addressed when the next part is turned in. Before turning in a homework part, double-check the assignment and common errors and make sure you satisfy all requirements.
Homework part 1: Conceptual model
- Submission deadline
- Before the 3rd tutorials: 2024-10-17T14:00:00 for T1, 2024-10-17T15:40:00 for T2, 2024-10-23T12:20:00 for T3.
- Assignment
- See the homework 1 assignment.
Homework part 2: Graph models
- Submission deadline
- Before the 6th tutorials for T1 and T2, before 2024-11-13T12:20:00 for T3: 2024-11-07T14:00:00 for T1, 2024-11-07T15:40:00 for T2, 2024-11-13T12:20:00 for T3.
- Assignment
- See the homework 2 assignment.
Homework part 3: Hierarchical models
- Submission deadline
- Before the 9th tutorials: 2024-12-05T14:00:00 for T1, 2024-12-05T15:40:00 for T2, 2024-12-11T12:20:00 for T3.
- Assignment
- See the homework 3 assignment.
Homework part 4: Relational model
- Submission deadline
- Before the 10th tutorials: 2024-12-12T14:00:00 for T1, 2024-12-12T15:40:00 for T2, 2024-12-18T12:20:00 for T3.
- Assignment
- See the homework 4 assignment.
Homework feedback
You will receive feedback on your homework from me via e-mail. The feedback may be one of the following kinds:
- Everything is OK and you get a ✅ in SIS.
- Minor issues
- You get a ✅ in SIS. You need to fix those along with the next HW.
- Regular issues
- You do not get ✅ in SIS until you fix them. You need to fix them along with the next HW to be able to continue. If you do not fix those with the next HW, you fail the course.
- Major issues
- You need to fix those ASAP and let me know when you do. These issues will prevent you from doing the next assignment correctly. If you do not fix those with the next HW at the latest, you fail the course.
- Fatal issues
- Typically resulting from not following instructions in the HW assignments, or completely missing parts. You need to fix those ASAP and let me know when you do. If this kind of issue appears for the second time, you fail the course.
- Missed deadline
- In case the deadline passes and there is no solution turned in by your group, you fail the course, unless the reason is serious, e.g. medical.
Homework groups
Be ready to work on the homework with your group throughout the semester, and communicate.
In case of problems with your team, such as a member or the leader not communicating, let me know as soon as possible to avoid problems with deadlines.
The final deadline for fixing all HW feedback is 2025-01-10T20:00:00.
There must be no errors in the HWs by then.
Avoid splitting homework topics among members in a way that some members do not participate in a certain topic at all. This means they do not practice it enough and it is also unfair as the individual HW parts are not the same in terms of difficulty.
I suggest splitting the team members for each topic as creators and verifiers, rotating throughout the semester. In addition, I suggest establishing communication channels, regular team meetings and internal deadlines for creation and verification at least a few days before the submission deadline.
Common troubles with group homework
Group member or leader not communicating or not doing their part
- Contact me, do not hesitate. I will contact the non-communicating member and demand an explanation.
  - This may be due to illness, which can happen
  - If necessary, I will remove the member from the group
  - If necessary, I will appoint a new group leader
- A reduction in group size is not a reason to reduce the homework scope
  - Assignments are doable even single-handedly, but teamwork is part of the experience
- A non-communicating group member is not a reason for a deadline extension
  - Do your homework early, not a day before the deadline
  - Set internal team deadlines, check your groupmates' solutions
  - It is unacceptable to say you missed a deadline because one teammate was responsible for a certain task and did not deliver.
- Communicate!
  - If you are ill or otherwise unable to work, let your group know ASAP
  - If you are removed from a team, you will fail this course
Exams
See a sample test.