The Data and Service Center for the Humanities (DaSCH) is a Swiss national research infrastructure. As a competence center for digital methods and long-term use of digital data, it supports the hermeneutically oriented Humanities in the use of state-of-the-art digital research methods. It provides services to individual scholars as well as to projects of research consortia from all areas of the Humanities dealing with qualitative research data.
Basic services as described hereafter are provided free of charge to scholars in Switzerland. For more detailed technical descriptions of the services, please consult the appendix to this document.
Providing tools and long-term access to Humanities' research data for the research community in accordance with FAIR principles and international standards for interoperability.
The DaSCH establishes a data repository. The repository allows users to deposit data with project- or user-specific data models, and to upload, query and retrieve qualitative data and associated digital objects such as images, digital facsimiles, sound files, videos etc.
The repository guarantees access to data in accordance with FAIR principles for an indefinite period. To achieve this goal, the data deposited will be continuously curated and adapted to technical developments, best practices and standards.
The DaSCH provides basic tools for the management of data (upload, query, retrieval) and manages access permissions (embargo periods, copyright issues etc.). In addition, it provides a well-documented user interface based on industry standards for easy access to the data.
The DaSCH provides and maintains an architecture as well as tools and libraries to enable easy re-use of the data. Special attention is paid to the possibility of re-use of data beyond the original project boundaries in order to facilitate new methods of data mining and data aggregation.
Providing support to scholars in generating new and re-using existing digital research data for cutting-edge research, supporting local research-IT groups and other research infrastructures in using our digital resources and tools.
The DaSCH provides consulting on all questions regarding data management of qualitative data and digital objects, including digitization, formats, data handling, workflows etc. (cost sharing may be required).
It advises on data modelling or carries it out on behalf of its clients (cost sharing may be required).
It develops project-specific tools for data cleaning, data import as well as data analysis and data dissemination (cost sharing is required).
It collaborates with research projects that have very high demands on data management of qualitative data (cost sharing is required).
Training for scholars, research groups and institutions
The DaSCH offers courses and training in the use of the DaSCH infrastructure.
It provides courses and training about best practices in management and (re-)use of qualitative data in Humanities’ research.
It provides teaching in new digital research methods related to qualitative data.
The basic services of the DaSCH are free of charge. However, for individual or project specific consulting or development, cost sharing is required. As a general principle, the first 8 hours of consultation are given free of charge.
Core functions and services of the DaSCH
The DaSCH operates a long-term data storage framework (Knora) to preserve and guarantee access to qualitative data and related bitstream objects of research projects. Projects typically entail:
Complex information objects such as databases of any kind containing linked and structured information (text, geographical information, date and time, etc.) that can either be stand-alone or connected with bitstream objects (see below).
Bitstream objects that are related to qualitative data. These bitstream objects include images, sound files, motion pictures, 3D-models, documents etc.
Usually, qualitative data is created and managed in some kind of database system (e.g. FileMaker, MySQL, NoSQL databases, MS Access, Excel, folder hierarchies etc.). The bitstream objects are usually stored on the file system as binary data files (e.g. TIFF, MP3, …). For further research and re-use of these data it is of vital importance that they remain retrievable in their complex form, not simply listed under a title. The complexity of relational databases must be preserved and kept retrievable so that further research can add value to the original research work. At the same time, each individual resource can be cited and exported precisely.
The DaSCH operates the Knora framework in an environment that is strongly protected against data loss. The servers are provided by SWITCH. The data is kept redundantly at two different geographical locations, with each location providing threefold redundancy based on a secure filesystem. The deployment framework (Docker) is constantly adapted to the latest technological changes.
The DaSCH maintains the Knora framework, which implements the repository for qualitative data, and constantly adapts it to changing technology and to evolving standards for data representation and interoperability. The Knora framework implements the FAIR access principles as far as they are applicable to qualitative data, and guarantees semantic interoperability:
- It implements ARK persistent identifiers (PIDs) for projects, resources, properties and bitstream objects. (FAIR F1 & F4)
- The project description gives enough information about the topic and goal of the data. Digital objects must be supplied with sufficient metadata. Where necessary, a simple project-specific access app may be provided to facilitate findability of relevant data. (FAIR F2)
- The framework is searchable based on a RESTful API. A generic application is provided to search and view resources of all projects that are in the repository. (FAIR F3)
- All data is accessible using a RESTful API. The response is JSON-LD, a widely adopted standard for data serialization for Linked Open Data. Bitstream objects are available through the IIIF API. (FAIR A1 & A1.1)
- Knora implements a fine-grained authentication and permission system. For access to digital objects, the IIIF authentication API is implemented. (FAIR A1.2)
- Knora does not delete data – all data is always accessible. (FAIR A2)
- The API is based on widely adopted, open standards. (RESTful API, JSON-LD, SPARQL) (FAIR I1)
- The data models are described in machine readable and human readable form using Linked Open Data standards (RDF/RDFS/OWL) and can be exported in standard formats (RDF/XML, Turtle, N3, JSON-LD, …). The use of standard vocabularies is encouraged and implemented through mapping. (FAIR I2)
- The platform allows links and connections to project-internal data objects, to other projects within the DaSCH repository and to external data objects of other repositories (data-data object, data-bitstream object). (FAIR I3)
- The ontologies structuring the data are self-describing as far as possible and supplemented by comments. They are provided in both human and machine-readable forms. (FAIR R1)
- The DaSCH encourages the use of Creative Commons (CC) licenses, but custom licenses are supported as well. (FAIR R1.1)
- All data are associated with a project and a creator. (FAIR R1.2)
- Knora is based on Linked Open Data technology (RDF / RDFS / OWL / SPARQL) and supports standard as well as project/domain specific ontologies. (FAIR R1.3)
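The IIIF access to bitstream objects mentioned in the list above can be illustrated with the URI pattern defined by the IIIF Image API (`{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`). The following is a minimal sketch only: the server address and the image identifier are hypothetical placeholders, not actual DaSCH endpoints, and the helper function `iiif_image_url` is introduced purely for illustration.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an image request URL following the IIIF Image API URI pattern.

    base_url and identifier are placeholders chosen by the caller;
    nothing here is specific to the DaSCH servers.
    """
    # Percent-encode the identifier so that slashes inside it are not
    # mistaken for path separators of the URI pattern.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical server and identifier, for illustration only.
print(iiif_image_url("https://iiif.example.org/images", "project/page-001.jp2"))
```

Because the pattern is standardized, any IIIF-compliant viewer can request regions or scaled versions of an image by changing only the `region` and `size` path segments.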
The Knora framework offers a complete, timestamp-based historization (version history) of all data objects and their properties. This feature allows an ARK identifier to show an object as it was when the identifier was created, even if the data object has changed in the meantime.
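The idea of timestamp-based historization can be sketched in a few lines. This is not Knora's implementation; it is a simplified illustration, with a hypothetical `VersionedValue` class, of how a value can be looked up "as of" the moment a citation was created while later changes remain stored alongside it.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class VersionedValue:
    """Keeps every version of a value together with its timestamp.

    Hypothetical illustration class, not part of any DaSCH library.
    """

    def __init__(self):
        self._timestamps = []  # sorted in ascending order
        self._values = []      # value that became valid at the same index

    def update(self, timestamp, value):
        # Old versions are never overwritten; new ones are appended.
        if self._timestamps and timestamp <= self._timestamps[-1]:
            raise ValueError("versions must be added in chronological order")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def as_of(self, timestamp):
        # Find the last version created at or before the given timestamp.
        index = bisect_right(self._timestamps, timestamp)
        if index == 0:
            return None  # the value did not exist yet
        return self._values[index - 1]


title = VersionedValue()
title.update(datetime(2019, 1, 10, tzinfo=timezone.utc), "Draft title")
title.update(datetime(2020, 6, 1, tzinfo=timezone.utc), "Corrected title")

# A citation created at the end of 2019 still resolves to the earlier version.
cited_at = datetime(2019, 12, 31, tzinfo=timezone.utc)
print(title.as_of(cited_at))  # prints "Draft title"
```

An identifier that carries a timestamp can thus always resolve to the same state of the object, regardless of later edits.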
The DaSCH provides libraries and tools for data import and data export. These libraries and tools facilitate the use of the DaSCH both for local research-IT staff and for scholars directly (who usually do not have a computer science background but have some basic programming skills). They allow a humanities scholar with basic IT knowledge to access the services provided by the DaSCH, especially for data import into the repository, whereas direct use of the API still requires considerable IT knowledge.
The DaSCH will provide, as soon as possible, a generic access tool to search, view and manipulate the data within the DaSCH repository. All manipulations can be traced through the timestamp-based version history (provenance, change history). ([Under development])
The DaSCH will provide a generic database management tool through a simple user interface (SUID). For projects of low to moderate data complexity, it serves as a self-service kiosk that allows them to design a project-specific data model and to use Knora as a research database during the course of a project. ([Under development])
The DaSCH will provide projects with a configurable viewer that allows them to present data in a project-specific, individually configured way. This can be considered a generalization of the IIIF manifest concept (IIIF Presentation API) for generic qualitative data (manifest++). It enables project-specific search and discovery that would not be possible with the generic access tool (SALSAH, provided and maintained by the DaSCH) or, in the future, with third-party tools. Note that the creation and maintenance of additional project-specific applications is not the task of the DaSCH (see 2.1.2 Additional services below). ([Under development])
The DaSCH creates a project-specific data model for research projects or individual scholars, following best practices and standards (interoperability, longevity etc.).
Tariff: CHF 1200.-/day (first day free of charge)
Development of project specific data import scripts, data cleaning etc.
Tariff: CHF 1200.-/day (first day free of charge)
Project-specific application development
Development of project-specific applications (GUI-based, data analysis tools etc.). Note that free long-term support for project-specific applications cannot be given.
Tariff: CHF 1200.-/day or according to agreement (possibly with cost ceiling)
If long-term support is desired, a yearly fee is charged.
Tariff: annually 25% of initial development cost
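The tariffs above amount to simple arithmetic, sketched here for orientation only. The function names are hypothetical, and the sketch assumes that the free first day applies where the tariff line states it (data modelling, import scripts), while application development is billed from the first day unless agreed otherwise; actual costs depend on the agreement with the DaSCH.

```python
from decimal import Decimal

DAILY_RATE_CHF = Decimal("1200")  # tariff per day, as stated above
SUPPORT_RATE = Decimal("0.25")    # annual share of initial development cost


def service_cost(days, free_days=1):
    """Cost in CHF for a number of working days, minus any free days."""
    billable_days = max(Decimal(days) - Decimal(free_days), Decimal(0))
    return billable_days * DAILY_RATE_CHF


def annual_support_fee(initial_development_cost):
    """Yearly long-term support fee in CHF for a project-specific application."""
    return Decimal(initial_development_cost) * SUPPORT_RATE


# Five days of data modelling, first day free: 4 x 1200 = 4800 CHF.
modelling = service_cost(5)

# Twenty days of application development (assumed: no free day): 24000 CHF,
# which leads to a yearly support fee of 6000 CHF.
development = service_cost(20, free_days=0)
support = annual_support_fee(development)

print(modelling, development, support)
```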
Open development paradigm
All DaSCH-related code is open source and developed in and for the public. The development process, the code and the documentation are freely accessible on GitHub. Anyone can review, add to or expand the code base using the mechanism of pull requests.
Usually free for research-IT groups and other research infrastructures. Consulting for research groups or individual scholars is free for up to 8 hours.
Including research-IT groups, faculties, other research infrastructures, memory institutions etc.
Repositories usually do not support persistent identifiers (PIDs) for properties (data fields). However, in order to allow the citation of a property of a data object, we consider PIDs necessary for properties as well.
In the future, SWISSUbase from FORS may be used for this purpose and integrated into the DaSCH access portal.
A RESTful API is an application programming interface (API) that uses HTTP requests to GET, PUT, POST and DELETE data. Also referred to as a RESTful web service, it is based on representational state transfer (REST), an architectural style and approach to communication between computers. REST can be thought of as the language of the internet and is used by all major platforms such as Amazon, Google, Twitter etc.
The International Image Interoperability Framework (IIIF) is driven by a community of research, national and state libraries, museums, companies and image repositories to provide a standardized, interoperable way to access images and other bitstream resources on the internet (see https://iiif.io).
However, data and digital bitstream objects may be marked as deleted. The only exception is when physical deletion is required for legal reasons.
The use of standard vocabularies cannot be enforced, as research often penetrates into new areas where standards have not yet been set.
In order to implement access permissions, timestamp-based versioning and provenance information, standard ontologies may not be used directly. However, the Knora framework provides a mapping mechanism to map data model definitions to standard entities (such as dcterms:creator) and thus enable semantic interoperability.
For images, the IIIF manifest concept allows standard viewers (e.g. Mirador, Universal Viewer etc.) to access image resources from arbitrary servers as long as they implement the IIIF Presentation API based on “manifests” describing the metadata of the image resources. The DaSCH will extend the concept of the manifest to arbitrary qualitative data (and propose these extensions to the IIIF consortium).
A tool to maintain consistent and target-oriented software development in open source projects. E.g. https://github.com/dhlab-basel/Knora, https://github.com/dhlab-basel/Sipi, https://github.com/dhlab-basel/knora-api-js-lib, https://github.com/dhlab-basel/knora-py