NPRG036 - Data Formats
Basic information - winter 2023
- The lectures and tutorials are on-site, in person. Slides and videos in English and Czech from previous years are provided on this webpage. This year, the lecture and one of the tutorials will be taught in English.
- Homework will be done in groups and will have 4 parts with corresponding deadlines.
- All 4 parts of the homework need to be turned in before the individual deadlines in order to proceed to the final exam.
Lectures - Mondays 09:00 in S4
- 2023-10-02: Data formats introduction: Google Slides, YouTube (English), YouTube (Czech)
- 2023-10-09: Graph data formats - RDF, RDF Schema, Linked Data, Open World Assumption: Google Slides, YouTube (English), YouTube (Czech)
- 2023-10-16: Graph data formats - SPARQL: Google Slides, YouTube (English), YouTube (Czech)
- 2023-10-23: Graph data formats - Basic vocabularies, Wikidata: Google Slides, YouTube (English), YouTube (Czech)
- 2023-10-30: Graph data formats - Labeled property graph model, Cypher, RDF-star: Google Slides, YouTube (English), YouTube (Czech)
- 2023-11-06: Hierarchical data formats - XML, XML Schema: Google Slides, YouTube (English), YouTube (Czech)
- 2023-11-13: Hierarchical data formats - XPath, XSLT: Google Slides, YouTube (English), YouTube (Czech)
- 2023-11-20: Hierarchical data formats - JSON, JSON Schema, JSON-LD: Google Slides, YouTube (English), YouTube (Czech)
- 2023-11-27: Relational data formats - SQL dump, CSV, CSV on the Web: Google Slides, YouTube (English), YouTube (Czech)
- 2023-12-04: Formats for geodata by guest speaker Michal Med: PDF, YouTube
- 2023-12-11: Key-value, configuration formats - .properties, INI, TOML, YAML: Google Slides, YouTube (English), YouTube (Czech)
- 2023-12-18: Formats for text documents: Google Slides, YouTube (English), YouTube (Czech)
- 2024-01-08: Multimedia formats - images, video, audio, containers, print formats: Google Slides, YouTube (English), YouTube (Czech), Print formats on YouTube (Czech)
This section provides links to the tutorials with examples. There are three tutorial sessions per week. The tutorials are split into (R) Recommended, where we go through what you need for the homework, and (O) Optional, which are shorter and can be practiced at home, so you need to come to the tutorial only if you want to consult something (such as the homework).
T1: Mondays 10:40, SU2, English
T2: Mondays 15:40, SU2, Czech
T3: Wednesdays 15:40, SU2, Czech
Schedule and slides
The slides contain assignments to be practiced during the tutorial. If you run into problems, ask during the tutorial.
- Week 1 (R): Conceptual Modeling
- Week 2 (R): RDF
- Week 3 (R): SPARQL
- Week 4 (O): Wikidata
- Week 5 (R): LPG & Cypher
- Week 6 (R): XML & XML Schema
- Week 7 (R): XPath & XSLT
- Week 8 (R): JSON, jq, JSON Schema, JSON-LD
- Week 9 (O): HW part 3 (hierarchical formats) consultations
- Week 10 (R): CSV, CSV on the Web
- Week 11 (O): Geodata - GeoJSON, WKT, CRS, QGIS
- Week 12 (O): Key-value formats - TOML, YAML
- Week 13: Holidays
- Week 14 (O): Multimedia formats, Formats for text documents
Homework will be done in groups and will have 4 parts. All 4 parts of the homework need to be turned in using the SIS Study group roster module before the individual deadlines in order to proceed to the final exam. The tutor's comments on the homework solutions need to be addressed when the next part is turned in. Before turning in a homework part, double-check the assignment and common errors and make sure you satisfy all requirements.
Homework part 1: Conceptual model
- See the homework 1 assignment.
Homework part 2: Graph models
- See the homework 2 assignment.
Homework part 3: Hierarchical models
- Submission deadline: before tutorials in Week 10.
- See the homework 3 assignment.
Homework part 4: Relational model
- Submission deadline: before tutorials in Week 12.
- See the homework 4 assignment.
Each group of 4 students from the same tutorial (T3) needs to have a group leader responsible for turning the homework in before the deadline and informing me of any changes in the group, including students in their group not working on the homework.
First, I will let you form your own groups of 4.
The appointed group leaders can send me the list of names of students in the group by Friday 2023-10-10T20:00:00 via email.
After that, I will distribute the remaining students and appoint the remaining group leaders myself, so that you can start working on the homework.
When I do that, establish contact with your group ASAP.
Students enrolled after this point will be assigned to a new or existing group by me.
Be ready to work on the homework with your group during the semester, and communicate.
In case of problems with your team, such as a member or leader not communicating, let me know as soon as possible to avoid problems with deadlines.
Final deadline for fixing all HW feedback is
There must be no errors in the HWs by then.
Avoid splitting homework topics among members in a way that some members do not participate in a certain topic at all. This means they do not practice it enough and it is also unfair as the individual HW parts are not the same in terms of difficulty.
I suggest splitting the team members for each topic as creators and verifiers, rotating throughout the semester. In addition, I suggest establishing communication channels, regular team meetings and internal deadlines for creation and verification at least a few days before the submission deadline.
Common troubles with group homework
Group member or leader not communicating or not doing their part
Contact me, do not hesitate. I will contact the non-communicating member and demand an explanation.
- This may be due to illness, which can happen
- If necessary, I will remove the member from the group
- If necessary, I will appoint a new group leader
Group size reduction is not a reason for reduction of the homework scope
- Assignments are doable even single-handedly, but teamwork is part of the experience
A non-communicating group member is not a reason for a deadline extension
- Do your homework early, not a day before the deadline
- Set internal team deadlines, check your groupmates’ solutions
- It is unacceptable to say you missed a deadline because one teammate was responsible for a certain task and did not deliver.
- If you are ill or otherwise unable to work, let your group know ASAP
- If you are removed from a team, you will fail this course
You will receive feedback on your homework from me via e-mail. The feedback may be one of the following kinds:
- Everything is OK and you get a ✅ in SIS.
- Minor issues
- You get a ✅ in SIS. You need to fix those along with the next HW.
- Regular issues
- You do not get a ✅ in SIS until you fix them. You need to fix them along with the next HW to be able to continue. If you do not fix them with the next HW, you fail the course.
- Major issues
- You need to fix those ASAP and let me know when you do. These issues will prevent you from doing the next assignment correctly. If you do not fix those with the next HW at the latest, you fail the course.
- Fatal issues
- Typically resulting from not following instructions in the HW assignments, or completely missing parts. You need to fix those ASAP and let me know when you do. If this kind of issue appears for the second time, you fail the course.
- Missed deadline
- In case the deadline passes and there is no solution turned in by your group, you fail the course, unless the reason is serious, e.g. medical.
See a sample test.