How To Write a Clariah FAIR Paper

1. Motivation

Currently, academic papers are mainly published as PDFs. PDFs are popular because they have an identical layout on different computers and can easily be generated by text and graphical editors, such as Microsoft Word, Photoshop, and InDesign. However, with the move to open science, their disadvantages have become clearer. PDF is a proprietary format that is optimized for human readability. The lack of interoperability makes PDF files challenging for computers to read, strips them of the interactive components that were present in the original file format, and leaves them devoid of metadata. As a result, findings presented in academic papers are often not reproducible without considerable effort.

The FAIR principles state that all data and research should be findable, accessible, interoperable, and reusable¹. Public data stories based on linked data are FAIR by default, as well as reproducible and standards-based. Queries are live visualizations of data: readers can inspect, configure, or adapt them, and can even inspect, download, or copy the datasets that the queries are based on. All this is possible without installing anything, writing a single line of code, or relying on proprietary software formats.

The main problem with data stories is that they are often not regarded as academic publications. Search engines like Google Scholar do not pick them up because they lack metadata, scholars do not cite them because of the missing metadata and non-academic styling, and live data is difficult to store for posterity. In this project, we show that it is possible to publish data stories as FAIR papers by adding metadata, applying academic styling, and making it possible to print them in a logical way.

2. How to publish a FAIR paper

2.1 Configure metadata

To add metadata, use the lncsmetadata tag in a paragraph story element. After opening the paragraph with this tag, fill in the author list, affiliations, ORCID iDs, abstract, and keywords. The closing lncsmetadata tag ends the metadata block in the paragraph. These changes result in the following view.

This is the metadata for this paper (inside <lncsmetadata></lncsmetadata>):

<lncsmetadata>
authors:
  - name: Kathrin Dentler
    orcid: 0000-0003-3325-7876
    affiliation: 1
  - name: Tatiana Ronzhina
    orcid: 0000-0002-2268-0460
    affiliation: 1
  - name: Wouter Beek
    orcid: 0000-0003-0250-9655
    affiliation: 1,2
  - name: Auke Rijpma
    orcid: 0000-0002-8950-8227
    affiliation: 3
  - name: Rick Mourits
    orcid: 0000-0002-2267-1679
    affiliation: 4
affiliations:
  - id: 1
    name: Triply
  - id: 2
    name: VU University Amsterdam
  - id: 3
    name: Universiteit Utrecht
  - id: 4
    name: International Institute for Social History
keywords:
  - Linked Data
  - FAIR Paper
abstract: This data story is an instruction manual that documents the required steps to publish a data story as a FAIR paper according to the FAIR principles (Findability, Accessibility, Interoperability, and Reuse). This data story is a FAIR paper itself ("dogfooding"). 
</lncsmetadata>
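
Because the metadata block is filled in by hand, it can help to check it before publishing. The following is a minimal sketch, not part of the data story platform: it assumes the block has already been parsed into a plain Python dictionary with the same keys as above, and the function names `orcid_is_valid` and `metadata_problems` are hypothetical. It checks that each ORCID iD has the right shape and a correct ISO 7064 MOD 11-2 check character, and that every affiliation number an author refers to is actually defined.

```python
import re

# Four groups of four characters; only the last character may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_is_valid(orcid: str) -> bool:
    """Check the format and the ISO 7064 MOD 11-2 check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for char in digits[:-1]:
        total = (total + int(char)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected


def metadata_problems(metadata: dict) -> list[str]:
    """Return readable problem descriptions; an empty list means none were found."""
    problems = []
    known_ids = {str(aff["id"]) for aff in metadata.get("affiliations", [])}
    for author in metadata.get("authors", []):
        name = author.get("name", "<unnamed>")
        if not orcid_is_valid(str(author.get("orcid", ""))):
            problems.append(f"{name}: invalid ORCID iD")
        # An author may list several affiliations, e.g. "1,2".
        for ref in str(author.get("affiliation", "")).split(","):
            if ref.strip() not in known_ids:
                problems.append(f"{name}: unknown affiliation {ref.strip()!r}")
    return problems


# Hypothetical, already-parsed excerpt of the metadata shown above.
example = {
    "authors": [
        {"name": "Kathrin Dentler", "orcid": "0000-0003-3325-7876", "affiliation": 1},
        {"name": "Wouter Beek", "orcid": "0000-0003-0250-9655", "affiliation": "1,2"},
    ],
    "affiliations": [
        {"id": 1, "name": "Triply"},
        {"id": 2, "name": "VU University Amsterdam"},
    ],
}
print(metadata_problems(example))
```

How the platform itself parses the block is not covered here; the sketch only illustrates the kind of consistency a reader or search engine will expect from the metadata.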

2.2 Apply academic styling

FAIR papers currently support the LNCS (Springer Lecture Notes in Computer Science) style². To apply the styling, add ?lncs to the URL of the data story, i.e. https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual?lncs for this data story.

The ?lncs parameter converts the current view of the data story into an LNCS-styled view. Following the template, the font is changed and specific rules are applied to the table view and headers.
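
When links to styled stories are generated by a script rather than typed by hand, the parameter can be appended programmatically. The sketch below uses only Python's standard library; the helper name `lncs_url` is hypothetical, and it assumes that `lncs` is passed as a bare query flag without a value, as in the URL above.

```python
from urllib.parse import urlsplit, urlunsplit


def lncs_url(story_url: str) -> str:
    """Return the story URL with the bare `lncs` query flag added once."""
    parts = urlsplit(story_url)
    flags = [flag for flag in parts.query.split("&") if flag]
    if "lncs" not in flags:
        flags.append("lncs")
    return urlunsplit(parts._replace(query="&".join(flags)))


story = "https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual"
print(lncs_url(story))
```

Calling the function on a URL that already ends in ?lncs returns it unchanged, so it is safe to apply repeatedly.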

Another interesting data story that could be published as a FAIR paper is this one: https://druid.datalegend.net/data-management-JochemWieke/-/stories/sparqling-diamonds?lncs.

2.3 Print a FAIR paper

The print dialog can be opened from the graphical user interface (GUI) or with the usual keyboard shortcut for printing (e.g. Ctrl+P on Windows and Linux).

Pages can be printed as PDF or hard copy. To allow readers to zoom in on PDFs, we advise using images with at least 300 DPI.
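
To see what 300 DPI means in pixels, the printed size of an image can be converted with a small calculation. The sketch below is only illustrative: the function name `min_pixels` is hypothetical, and the 122 mm width used in the example is an assumed column width, not a value taken from this data story.

```python
import math

MM_PER_INCH = 25.4


def min_pixels(print_size_mm: float, dpi: int = 300) -> int:
    """Smallest pixel count that reaches `dpi` at the given printed size."""
    return math.ceil(print_size_mm / MM_PER_INCH * dpi)


# An image printed 122 mm wide (assumed width) at 300 DPI.
print(min_pixels(122))  # 1441
```

An image narrower than this would fall below 300 DPI when stretched to the assumed width, and would look blurry when readers zoom in.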

The style guide specifically states that page numbers are not needed and will be added by the publisher². To exclude page numbers and data story information from the PDF view, uncheck “Headers and footers”. If authors do need page numbers, for example for educational purposes, they can easily add them by turning “Headers and footers” on. With “Headers and footers” turned on, data story information such as the name of the document, page numbers, and creation date is printed.

Figure 1. How to print a data story.

In the printed version, story elements (paragraphs and queries) are spread over multiple pages in a logical way.

2.4 Academic styling of query results

If you are the editor of a data story, or a member of the organization that the data story belongs to, you can add SPARQL queries to the story. Click ADD NEW ELEMENT, then either select an existing query or create a new one. Fill in the name, description, and dataset for the query.

Figure 2. How to add a SPARQL query to a data story.

Figure 3. Academic styling of the Table view.

It is also possible to add Geo visualisations (Figure 4) and various charts and widgets (Figure 5).

Figure 4. Academic styling of the Geo view.

The FAIR Paper concept is to minimize the manual steps needed to reproduce the research and the paper. Manually adjusting the map view would therefore introduce an unwanted element. In the online data story, readers can still interact with the visualization, e.g. by zooming in and out. However, if a more detailed view is needed for the print version, we highly recommend adjusting the SPARQL query so that it returns only the objects that are needed on the map.
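
One way to narrow a map view through the query rather than by hand is to restrict the results to a bounding box. The sketch below builds such a query as a string. It is an illustration under several assumptions: the helper name `bounded_map_query` is hypothetical, the data is assumed to store numeric coordinates with the W3C Basic Geo properties geo:lat and geo:long, and the variable names are placeholders. A dataset that stores geometries differently needs a different filter.

```python
def bounded_map_query(south: float, west: float, north: float, east: float) -> str:
    """Build a SPARQL query that only selects points inside a bounding box."""
    if south > north or west > east:
        raise ValueError("expected south <= north and west <= east")
    # Assumption: coordinates are numeric literals on geo:lat / geo:long.
    return f"""\
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?place ?lat ?long WHERE {{
  ?place geo:lat ?lat ;
         geo:long ?long .
  FILTER(?lat >= {south} && ?lat <= {north} &&
         ?long >= {west} && ?long <= {east})
}}"""


# Example box; the coordinates are arbitrary illustration values.
print(bounded_map_query(south=50.7, west=3.3, north=53.6, east=7.2))
```

Because the restriction lives in the query, anyone who reruns the story gets the same map extent in print without any manual zooming.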

HTML widgets allow SPARQL results to be displayed in an HTML gallery:

Figure 5. Academic styling of the Widget view.

2.5 Footnotes and references to literature

Footnotes are currently not supported in data stories. Endnotes can be created with the sup tag. For example, to learn more about SPARQL queries and their use, you can view the video tutorials³ or consult the SPARQL 1.1 Query Language specification⁴.

Literature references can be added manually in APA or Chicago Style at the end of the paper.

2.6 Markdown

Both basic and extended Markdown are supported in data stories. One can add images, tables, headings at different levels, and paragraphs.

2.7 Links to the FAIR paper

This is a FAIR Paper manual whose URL is https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual

This is a FAIR Paper manual whose academically styled URL is https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual?lncs

This is a FAIR Paper written by IISG: https://druid.datalegend.net/IISG/-/stories/unifying-and-augmenting-metadata-as-LOD?lncs

3. Lessons learned

The most important lesson learned is that the concept works, as proven by this FAIR paper.

3.1 SPARQL results visualization

  • We need to define which parts of a visualization should be moved to a new page and where to put page breaks.
  • We need to determine what size of visualization fits well on a page; this varies with the type of visualization (e.g. it differs for charts and maps).

3.2 Specification format for metadata

We chose YAML to define our metadata. YAML is a human-readable data serialisation language that is commonly used for configuration files. Originally, YAML was said to mean Yet Another Markup Language. YAML is a good start, but it has some disadvantages:

  • it does not support Markdown, so it is for example not possible to include a link in the abstract,
  • it is not supported in the preview of a data story,
  • it is easy to break

3.3 Use of the different academic styles

  • The FAIR paper project focused on applying a single academic style (LNCS).
  • We learned that different academic fields expect different academic styles. Thus, the set of supported academic styles needs to be expanded.
  • The development team spent about 80 hours on applying the academic style to the data story. The work included choosing an implementation approach and aligning the text, headers, and paragraphs with the text format prescribed by the style guide.
  • The next steps included aligning images, tables, and the results of SPARQL queries, and adding metadata. These took 42 hours of work in total.

3.4 Known limitations

  • FAIR papers rely on public datasets, which need to be maintained so that the data story stays up and running.
  • Stylesheets that use two columns do not work well.
  • It is not possible to automatically generate a URL to the data story when creating a print version. Currently, it is done manually.
  • The use of Garlic (Garlic's A Ruby Lisp Implementation Compiler) is not possible yet. Supporting other programming languages in data stories would increase their accessibility and interoperability, and would make them more usable for groups of people who work with different programming languages.
  • The legend in the chart view is partially missing from the PDF preview. We cannot control how Google Charts renders information inside the chart, and because charts are so space-constrained in PDF, we run into these limitations. A possible solution is to make the legend text smaller (7-9 instead of 10-12).
  • It is not possible to add a hyperlink in the abstract.