Heterogeneity in representations of IISG film metadata

This story describes how film metadata is currently provided at the IISG. Most insights come from Frank de Jong, archivist at the IISG, who pointed out that metadata on movies has been stored at varying levels of granularity. What follows are a few of the examples he mentioned.

The first example shows a record describing a single movie that happens to be on a single tape. Notice that the movie appears to have no title. Moreover, in the original record the description reads 'U-matic 45 min', rather than just '45 min'.

Example of a single record describing a single film (on a single tape)

A brief exploration of the film titles shows that many items have no title. Update: since November 2023, we provide the 'type' as title for these works.

The next example also shows a single record describing a single film, but this record relates to two tapes (a 35mm and a 16mm). Notice that the type of information differs from the example above: there the description mentioned the length of the film, whereas here it mentions the physical format. We can also see that the movie has a proper title, including a language tag, nl, indicating that the film is in Dutch.

In the next query, a single record describes multiple movies. Notice how the 'content' field provides more in-depth information than the examples above, for which that information was not available.

Films are also represented as part of an archive, as illustrated by the film Toekomst 36. Worryingly, the Linked Data representation does not appear to contain direct links from the archive to the individual audio and visual materials, whereas the search view on the previous version of the IISG website did provide them.

A film might also be part of a collection of video and audio samples, as portrayed by this search view. The collection does not appear to have been modeled as such in Linked Data: I am unable to find either the descriptors, such as "18.630 foto's/negatieven, 2530 dia's, 356 geluidsbanden/cassettes, 630 films/filmfragmenten" (photos/negatives, slides, audio tapes/cassettes, films/film fragments), or the collection id in the Linked Data representation.

Finally, most films are covered only by their title in a list provided as a PDF on the old website. Neither these films nor the lists themselves appear to be represented in the Linked Open Data. Ideally, these lists would be converted to collection descriptions as above or, in the case of stand-alone film titles, described as records. In addition, the available audio and MovingImage materials ought to be 'playable' in Linked Open Data. This is currently impossible, because the links to the raw files are preceded by a reference to a default player, which refused to play any of the video files I encountered and played hardly any of the audio files.

Use case: The Future of '36

Above we have seen that films are archived in different forms, sometimes as records and sometimes as part of an archive, and that in the latter case there is no link between the film itself and the archive.

But with some ugly text matching we are still able to retrieve and connect a bit more information on a given film. To illustrate this, I will focus on the film "De Toekomst '36".

First, I will check which entries have the string "De Toekomst '36" as (part of) their description. If you execute the query yourself (click on 'Try this query yourself'), you will notice that it takes a long time: it is an expensive query to run. Here we have no alternative, because there are no links between the items we are interested in.

The query returns three results that are image objects and have "toekomst '36" in their title (name). Two of them have the same size. Are they duplicates?
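For readers who want to experiment, a text-matching query of the kind described above could be put together roughly as follows. This is a minimal sketch and not the query used in this story: it assumes the data uses schema.org terms (`schema:ImageObject`, `schema:name`) under the `https://schema.org/` namespace, and the helper name `build_text_match_query` is made up for illustration.

```python
def build_text_match_query(needle: str, limit: int = 50) -> str:
    """Return a SPARQL query for image objects whose name contains `needle`."""
    # Lowercase the needle and escape backslashes and double quotes so that
    # it forms a valid SPARQL string literal.
    escaped = needle.lower().replace("\\", "\\\\").replace('"', '\\"')
    # Assumption: the schema.org prefix and predicates below match the dataset.
    return f"""PREFIX schema: <https://schema.org/>
SELECT ?item ?name WHERE {{
  ?item a schema:ImageObject ;
        schema:name ?name .
  # Case-insensitive substring match: every name is scanned, hence the cost.
  FILTER(CONTAINS(LCASE(STR(?name)), "{escaped}"))
}}
LIMIT {int(limit)}"""


query = build_text_match_query("Toekomst '36")
print(query)
```

The `FILTER` with `CONTAINS` is what makes such a query slow: without a direct link between the items, the endpoint has to compare the search string against every name it holds.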

Now that we know what's available, let's explore it a little, starting with the image items that have the same size. These posters, linked via their title, come with sparse descriptions that include the year of publication and the poster size. Thumbnail images are provided as well. In the query below, this visual and textual information is concatenated.
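The idea of concatenating visual and textual information can be sketched outside the query as well. The example below is only an illustration under assumptions: the field names (`name`, `year`, `size`, `thumbnail`) and the function `poster_card` are hypothetical, and the sample row contains placeholder values rather than real IISG data.

```python
from html import escape


def poster_card(row: dict) -> str:
    """Combine a thumbnail and its sparse description into one HTML snippet."""
    # Missing fields fall back to a placeholder instead of raising an error.
    name = escape(str(row.get("name", "(no title)")))
    year = escape(str(row.get("year", "unknown year")))
    size = escape(str(row.get("size", "unknown size")))
    thumb = escape(str(row.get("thumbnail", "")), quote=True)
    return (
        f'<figure><img src="{thumb}" alt="{name}">'
        f"<figcaption>{name} ({year}), {size}</figcaption></figure>"
    )


# Placeholder row, not taken from the IISG data.
sample = {"name": "Example poster", "thumbnail": "https://example.org/thumb.jpg"}
card = poster_card(sample)
print(card)
```

Escaping each value before it is placed in the markup keeps titles with characters such as `&` or `"` from breaking the resulting snippet.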

The topic to which the posters relate is also part of the metadata. The topic is not provided in words (as if it were a title), but as a topic id. Here the id stands for the 'Spanish Civil War'. Because the IISG adds metadata in this way, it is possible to find other (visual) collection items related to the Spanish Civil War. The following query could be seen as a "what also might interest you" result. The result below is limited to 40 items. If you want to see more, click 'run the query yourself' and increase the number after "LIMIT".
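A "what also might interest you" lookup by topic id could be sketched along these lines. This is not the query behind the result in this story: the predicates `schema:about`, `schema:name` and `schema:thumbnailUrl` are assumptions about how the data is modeled, the function `build_related_query` is hypothetical, and the topic IRI is a placeholder that would have to be replaced by the actual topic id.

```python
def build_related_query(topic_iri: str, limit: int = 40) -> str:
    """Return a SPARQL query for items that share the given topic."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Reject characters that cannot appear inside an IRI reference (<...>).
    if any(ch in topic_iri for ch in '<>" \n\t'):
        raise ValueError("topic_iri is not a plain IRI")
    # Assumption: topics are attached to items with schema:about.
    return f"""PREFIX schema: <https://schema.org/>
SELECT ?item ?name ?thumbnail WHERE {{
  ?item schema:about <{topic_iri}> .
  OPTIONAL {{ ?item schema:name ?name }}
  OPTIONAL {{ ?item schema:thumbnailUrl ?thumbnail }}
}}
LIMIT {int(limit)}"""


# Placeholder IRI, standing in for the real topic id.
related = build_related_query("https://example.org/topic/placeholder")
print(related)
```

Because the match is on an identifier rather than on text, a lookup like this avoids the string scan of the earlier query; raising the `limit` argument corresponds to increasing the number after "LIMIT".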

Posters you may also be interested in