How To Write a Clariah FAIR Paper

By auke

Motivation

Currently, academic papers are mainly published as PDFs. PDFs are popular because they have an identical layout on different computers and can easily be generated by text and graphical editors, such as Microsoft Word, Photoshop and InDesign. However, with the move to open science, the disadvantages have become clearer. PDF originated as a proprietary format and, although it is now an ISO standard, it remains optimised for human readability. This lack of interoperability makes PDF files hard for computers to read. They are also stripped of the interactive components that were present in the original file format, and they typically carry little or no metadata. As a result, findings presented in academic papers are often not reproducible without considerable effort.

The FAIR principles state that all data and research should be findable, accessible, interoperable, and reusable. Public data stories based on linked data are FAIR by default, and they are additionally reproducible and standards-based. Queries are live visualisations of data: readers can inspect, configure, or adapt them, and can even inspect, download, or copy the datasets that the queries are based on. All this is possible without installing anything, writing a single line of code, or relying on proprietary software formats.

The main problem with data stories is that they are often not seen as academic publications. Search engines like Google Scholar do not pick them up because they lack metadata, scholars do not cite them because of the missing metadata and non-academic styling, and live data is hard to store for posterity. In this project, we show that it is possible to publish data stories as FAIR papers by adding metadata, applying academic styling, and making it possible to print them in a logical way.

How to publish a FAIR paper

Configure metadata

To add metadata, use the lncsmetadata tag in a paragraph story element. After opening the block with <lncsmetadata>, fill in the author list, affiliations, ORCID iDs, abstract, and keywords. The closing tag </lncsmetadata> ends the metadata block in the paragraph. These changes result in the following view.

This is the metadata for this paper (inside <lncsmetadata></lncsmetadata>):

<lncsmetadata>
authors:
  - name: Kathrin Dentler
    orcid: 0000-0003-3325-7876
    affiliation: 1
  - name: Tatiana Ronzhina
    orcid: 0000-0002-2268-0460
    affiliation: 1
  - name: Wouter Beek
    orcid: 0000-0003-0250-9655
    affiliation: 1,2
  - name: Auke Rijpma
    orcid: 0000-0002-8950-8227
    affiliation: 3
  - name: Rick Mourits
    orcid: 0000-0002-2267-1679
    affiliation: 4
affiliations:
  - id: 1
    name: Triply
  - id: 2
    name: VU University Amsterdam
  - id: 3
    name: Universiteit Utrecht
  - id: 4
    name: International Institute for Social History
keywords:
  - Linked Data
  - FAIR Paper
abstract: This data story is an instruction manual that documents the required steps to publish a data story as a FAIR paper according to the FAIR principles (Findability, Accessibility, Interoperability, and Reuse). This data story is a FAIR paper itself ("dogfooding"). 
</lncsmetadata>
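Because a mistyped ORCID iD silently points to the wrong (or no) researcher, it can help to check the identifiers before publishing. The following is a minimal sketch, not part of the data story platform: it assumes you have the raw text of the paragraph element as a string, and the function names (`extract_metadata_block`, `orcid_checksum_ok`, `find_bad_orcids`) are hypothetical. It uses the ISO 7064 MOD 11-2 check digit that ORCID iDs carry as their last character.

```python
import re


def extract_metadata_block(story_text: str) -> str:
    """Return the text between <lncsmetadata> and </lncsmetadata>."""
    match = re.search(r"<lncsmetadata>(.*?)</lncsmetadata>", story_text, re.DOTALL)
    if match is None:
        raise ValueError("no complete <lncsmetadata> block found")
    return match.group(1)


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    if not re.fullmatch(r"\d{15}[\dX]", digits):
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[15] == expected


def find_bad_orcids(story_text: str) -> list:
    """List every 'orcid:' value in the metadata block that fails the check."""
    block = extract_metadata_block(story_text)
    orcids = re.findall(r"^\s*orcid:\s*(\S+)\s*$", block, re.MULTILINE)
    return [o for o in orcids if not orcid_checksum_ok(o)]
```

A passing checksum only shows that the identifier is well-formed; it does not confirm that the iD belongs to the named author.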

Apply academic styling

FAIR papers currently support the LNCS (Springer Lecture Notes in Computer Science) style. To apply the styling, append ?lncs to the URL of the data story. For this data story, that gives https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual?lncs.

Another interesting data story that could be published as a FAIR paper is this one: https://druid.datalegend.net/data-management-JochemWieke/-/stories/sparqling-diamonds?lncs.
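When links to many data stories are generated by a script, the ?lncs flag described above can be appended programmatically. The sketch below uses only Python's standard library. The function name `with_lncs_style` is hypothetical, and the handling of URLs that already carry other query parameters (joining with `&`) is an assumption, since the examples above only show the bare `?lncs` form.

```python
from urllib.parse import urlsplit, urlunsplit


def with_lncs_style(story_url: str) -> str:
    """Return story_url with the 'lncs' flag in its query string."""
    parts = urlsplit(story_url)
    # Split an existing query into its components, dropping empty pieces.
    flags = [piece for piece in parts.query.split("&") if piece]
    if "lncs" not in flags:
        flags.append("lncs")
    return urlunsplit(parts._replace(query="&".join(flags)))


print(with_lncs_style(
    "https://druid.datalegend.net/fair-paper-project/-/stories/fair-paper-manual"
))
```

The function leaves a URL unchanged if the flag is already present, so it is safe to apply more than once.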

Print a FAIR paper

The print dialog can be opened from the GUI or with the usual keyboard shortcut for printing (e.g. Ctrl+P on Windows and Linux). We recommend unchecking the "Headers and footers" option under settings in the print dialog.

In the printed version, story elements (paragraphs and queries) are spread over multiple pages in a logical way.

Academic styling of query results

If you are editing your data story, or you are a member of the organization that the data story belongs to, you can add SPARQL queries to your story. Click ADD NEW ELEMENT, then either select an existing query or create a new one. Fill in the name, description, and dataset for the query.

It is also possible to add geo visualisations (Figure 2) and various charts and widgets (Figure 3).

HTML widgets allow SPARQL results to be displayed in an HTML gallery:

Footnotes and references to literature

Footnotes are currently not supported in data stories. It is possible to use endnotes by using the sup tag. For example, to learn more about SPARQL queries and their use, you can view the video tutorials¹ or consult the SPARQL 1.1 Query Language specification².

Literature references can be added manually in APA or Chicago Style at the end of the paper.

Lessons learned

The most important lesson learned is that the concept works, as proven by this FAIR paper.

Specification format for metadata

We chose YAML to define our metadata. YAML is a human-readable data serialisation language that is commonly used for configuration files. YAML is a good start, but has some disadvantages:

  • it does not support Markdown, so it is for example not possible to include a link in the abstract,
  • it is not supported in the preview of a data story,
  • it is easy to break.
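To illustrate the last point, here is a rough sketch of a pre-publication check for two well-known YAML pitfalls: tab characters used for indentation (which YAML does not allow) and unquoted values that contain a colon followed by a space (which YAML does not allow inside a plain value). It also checks that the metadata tags are balanced. This is not how the data story platform validates metadata; `lint_metadata` is a hypothetical helper, and it deliberately ignores multi-line values.

```python
import re
from typing import List

OPEN_TAG = "<lncsmetadata>"
CLOSE_TAG = "</lncsmetadata>"


def lint_metadata(story_text: str) -> List[str]:
    """Return human-readable warnings for a story's metadata block."""
    if story_text.count(OPEN_TAG) != 1 or story_text.count(CLOSE_TAG) != 1:
        return ["expected exactly one opening and one closing lncsmetadata tag"]
    start = story_text.index(OPEN_TAG) + len(OPEN_TAG)
    end = story_text.index(CLOSE_TAG)
    if end < start:
        return ["closing tag appears before the opening tag"]

    problems = []
    # Line numbers count from the line that holds the opening tag.
    for number, line in enumerate(story_text[start:end].splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append(f"line {number}: tab used for indentation")
        pair = re.match(r"\s*(?:-\s+)?\w+:\s+(.*)", line)
        if pair:
            value = pair.group(1)
            if value[:1] not in ('"', "'") and ": " in value:
                problems.append(f"line {number}: unquoted value contains ': '")
    return problems
```

An empty list means that none of these specific problems were found, not that the block is guaranteed to parse.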

Known limitations

  • FAIR papers rely on public datasets, which need to be maintained so that the data story stays up and running.
  • Stylesheets that use two columns do not work well.

Future work

  • Including ORCIDs in the preview