DaSCH provides services to individual scholars and to projects of research consortia from all areas of the Humanities which deal mainly with qualitative research data. Qualitative data is defined as structured data such as databases of any kind, possibly linked to bitstream objects such as large texts, images, video and audio files. Qualitative research is characterized by being mainly descriptive.
Our basic services include the following:
A repository and access service. The repository stores the data and associated descriptive and/or technical metadata and allows their long-term curation.
DaSCH provides software components or programs which can ingest data and enable researchers to find, access and re-use the data. There are access services for machines (machine readable API and data formats) as well as end-points in human readable form (programs for end-users with GUI).
All objects are provided with persistent identifiers based on ARK standards.
Support for local research-IT-groups at the Higher Education Institutions, research projects and individual scholars.
Training in the use of our repository and access services. Education in best practices in data management for qualitative data, applicable standards, methods and tools to find and analyze the data provided by DaSCH.
DaSCH operates and maintains a digital repository that allows long-term access to qualitative data and provides the necessary software tools to facilitate the depositing process as well as discovery and re-use of deposited data. Our tools are in accordance with FAIR principles and international standards for interoperability. The repository service provides the following functions:
- DaSCH operates a data repository. The repository accepts not only “simple” datasets in the form of flat files, but also project-specific “complex” datasets based on project- or user-specific data models. It allows users to upload, query and retrieve qualitative data such as text, images, digital facsimiles, sound files, videos etc.
- The repository guarantees access to data in accordance with FAIR principles for an indefinite period. To achieve this goal, the original data may have to be remodeled in order to be compliant with our platform, and files such as images or videos are stored in selected formats. After this process the deposited data will be continuously curated and adapted to newer technical developments, best practices and standards by DaSCH. In addition, the repository is compliant with the OAIS archiving standard, maintains the provenance of the data and guarantees its integrity.
- DaSCH provides basic tools for the management of data (upload, query, retrieval) and access permissions (embargo periods, copyright issues etc.). The implementation of embargo periods for restricted access allows researchers to publish their results before the data becomes generally available. An elaborate access control system protects items that are under copyright or should not be publicly available due to personal rights. In addition, DaSCH provides a user interface for easy access to the data.
- DaSCH provides and maintains an architecture as well as tools and libraries to enable easy re-use of the data. Special attention is paid to the possibility of re-use of data beyond the original project boundaries in order to facilitate new methods of data mining and data aggregation.
- Archival Resource Keys (ARKs) are provided as persistent identifiers for digital objects and data.
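To illustrate what such an identifier looks like, the following minimal Python sketch splits an ARK into its components, following the general ARK syntax (`ark:/NAAN/Name`, optionally preceded by a resolver URL). The example identifier, the resolver host `example.org` and the names `Ark` and `parse_ark` are hypothetical; the sketch does not describe the identifiers actually issued by DaSCH.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    resolver: Optional[str]  # resolver URL prefix, or None if the ARK is bare
    naan: str                # Name Assigning Authority Number
    name: str                # name assigned by that authority (may carry qualifiers)


# General shape: [resolver]ark:/NAAN/Name
_ARK_RE = re.compile(r"^(?P<resolver>https?://[^/]+/)?ark:/(?P<naan>[^/]+)/(?P<name>.+)$")


def parse_ark(identifier: str) -> Ark:
    """Split an ARK string into resolver, NAAN and name."""
    match = _ARK_RE.match(identifier)
    if match is None:
        raise ValueError(f"not an ARK: {identifier!r}")
    return Ark(match["resolver"], match["naan"], match["name"])


# Hypothetical identifier, used for illustration only.
ark = parse_ark("https://example.org/ark:/12345/x6np1wh8k")
print(ark.naan, ark.name)
```

Because the part starting with `ark:` is independent of the resolver host, the same object can still be cited if the resolving service moves to another address.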
The repository service of DaSCH is free of charge for national research projects or research projects with Swiss participation. In the case of very large data volumes, cost sharing by the project or its hosting institution must be negotiated. DaSCH is registered at re3data, the registry of research data repositories, and at forschungsdaten.info.
DaSCH distinguishes between first and second level support. We define first level support as direct support of scholars. Second level support is directed at institutional units that provide local first level support such as the local Research-ITs at the various research institutions. DaSCH focuses on second level support and collaborates with the resident Research-IT groups at the Higher Education Institutions. In cases where no local Research-IT is available or the complexity of a project's data is beyond the capabilities of the resident Research-IT, DaSCH may directly provide first level support to the scholars.
We provide support for the use of our digital resources and tools in generating new and re-using existing digital research data for cutting-edge research.
- DaSCH may provide consulting in all questions regarding data management of qualitative data and digital objects including digitization, formats, data handling, workflows etc. (cost sharing is required for research groups or individual scholars if 8 hours of work are exceeded).
- DaSCH may advise on data modelling or model the data on behalf of and in close cooperation with its clients. This service is free for research-IT group members and other research infrastructures. For research groups or individual scholars, cost sharing is required if 8 hours of work are exceeded.
- DaSCH may develop project-specific tools for data cleaning, data import and data dissemination (cost sharing is required).
- DaSCH may participate in research projects as a partner and also receive project-specific funding if a project has very high demands on the data management of qualitative data (cost sharing is required).
- DaSCH offers courses and training in the use of the DaSCH infrastructure.
- DaSCH provides courses and training on best practices in the management and (re-)use of qualitative data in Humanities research.
- DaSCH provides teaching in new digital research methods related to qualitative data.
DaSCH uses various channels for education and training that will be offered at regular intervals for various interest groups on a national level:
- Workshops and courses
- Online courses (planned)
- Video tutorials (planned)
The addressees are the researchers themselves as well as the research-IT staff of the institutions of higher education. Particular attention is paid to young researchers (PhD level), for whom special offers will be made available within the framework of their doctoral training. The courses are given by academically qualified staff and can, if desired, be integrated by the institutions of higher education into their respective curricula.
The following additional services are optional services which DaSCH may offer, resources permitting. The same rates apply once the first 8 hours, which are free of charge, are exceeded.
DaSCH creates a project specific data model for research projects or individual scholars in close cooperation with its clients using best practices concerning interoperability, standards and longevity.
CHF 1280,-/day for PI at the University of Basel
CHF 1280,-/day + 7.7% VAT for all other institutions
DaSCH develops project specific data cleaning and data import scripts. This service requires close cooperation with the client.
CHF 1280,-/day for PI at the University of Basel
CHF 1280,-/day + 7.7% VAT for all other institutions
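As a worked example of the rates above, the following Python sketch computes the cost for a given number of billable days. It assumes that the free first 8 hours have already been deducted and that VAT is simply added on top of the day rate; the function name `service_cost` and the figure of three days are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

DAY_RATE_CHF = Decimal("1280")  # day rate stated above
VAT_RATE = Decimal("0.077")     # 7.7% VAT, charged to all institutions except the University of Basel


def service_cost(billable_days: int, university_of_basel: bool) -> Decimal:
    """Return the cost in CHF for the given number of billable days."""
    net = DAY_RATE_CHF * billable_days
    gross = net if university_of_basel else net * (1 + VAT_RATE)
    return gross.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(service_cost(3, university_of_basel=True))   # 3840.00
print(service_cost(3, university_of_basel=False))  # 4135.68
```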
Technical description of the core functions and services of DaSCH
DaSCH operates the DaSCH Service Platform (DSP) to preserve and guarantee access to qualitative data and related bitstream objects of research projects. Projects typically entail
- complex information objects such as databases of any kind which contain linked and structured information such as text, geographical information or date and time. These objects can either be stand-alone or connected with bitstream objects.
- bitstream objects that are related to qualitative data. These bitstream objects include images, sound files, motion pictures, 3D-models and documents.
Scholars usually use a variety of database systems (such as FileMaker, MySQL, NoSQL-databases, MS Access) or Excel or even simple folder hierarchies to create and work with qualitative data. The bitstream objects are usually stored on the file system as binary data files (e.g., TIFF, MP3, OBJ). For further research and re-use of these data it is of vital importance that
- they remain easily retrievable in their complex form
- they are provided with permanent identifiers at the level of each individual database record. This allows for citation and data export in the most precise way.
If the criteria of preservation of the complex form, easy retrievability and precise citability are fulfilled, the original research work gains added value.
Since it is impossible to permanently convert a quickly growing amount of research data to the newest data formats for a whole range of different applications (e.g., MS Access, FileMaker, etc.), DaSCH operates one platform, the DaSCH Service Platform (DSP), within which constant adaptation to changing technology and to evolving standards for data representation and interoperability is guaranteed. A data model compliant with DSP has to be created for each project, and the research data has to be imported into the platform.
DSP is strongly protected against data loss. The servers are provided by SWITCH. The data is kept redundantly at two different geographical locations, and each location provides threefold redundancy based on a secure filesystem. The deployment framework used (Docker) is constantly adapted to the latest technological changes.
DSP guarantees semantic interoperability and implements FAIR access principles as far as they are applicable to qualitative data:
DSP implements a version history (timestamp-based historisation) of all data objects and their properties. This feature allows an Archival Resource Key (ARK) identifier to show an object as it was when the identifier was created, even if the data object has changed in the meantime.
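The idea of timestamp-based historisation can be sketched as follows. This is a simplified in-memory illustration, not the DSP implementation: the class `VersionedValue`, its methods and the example values are hypothetical. Every change is stored together with its timestamp, and a lookup for a given moment returns the value that was current at that time. An identifier that records its creation time could resolve to the matching state of the object in this way.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class VersionedValue:
    """Keeps every version of a value together with the time it became valid."""

    def __init__(self):
        self._timestamps = []  # sorted list of datetimes
        self._values = []      # value valid from the timestamp at the same index

    def set(self, value, at):
        # Sketch assumption: changes arrive in chronological order.
        if self._timestamps and at <= self._timestamps[-1]:
            raise ValueError("versions must be appended in chronological order")
        self._timestamps.append(at)
        self._values.append(value)

    def as_of(self, moment):
        # Index of the last version whose timestamp is <= moment.
        index = bisect_right(self._timestamps, moment) - 1
        if index < 0:
            raise LookupError("no version existed at that time")
        return self._values[index]


title = VersionedValue()
title.set("Title (first version)", datetime(2020, 1, 1, tzinfo=timezone.utc))
title.set("Title (corrected)", datetime(2021, 6, 1, tzinfo=timezone.utc))

# An identifier created between the two changes still sees the first version.
identifier_created = datetime(2020, 9, 15, tzinfo=timezone.utc)
print(title.as_of(identifier_created))  # Title (first version)
```

Nothing is overwritten in this scheme: a correction adds a new version, so earlier citations remain verifiable.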
DaSCH provides libraries and tools for data import and data export. These libraries and tools facilitate the use of the DSP both for local research-IT staff and for scholars directly. They allow a Humanities scholar with basic IT knowledge to access the services provided by DaSCH, especially for data import into the repository, while the use of the application programming interface (API) still requires considerable IT knowledge.
DaSCH provides a generic access tool to search, view and edit the data within the DaSCH repository. All manipulations can be traced due to a timestamp-based version history.
DaSCH provides a generic database management tool. For projects of low to moderate data complexity, it allows users to design a project-specific data model. Afterwards the generic access tool of DSP can be used as a research environment during the course of a project, and data can be entered directly into DSP.
DaSCH will provide research projects with a configurable viewer that presents data in a project-specific, individually configured way. This can be considered a generalisation of the IIIF manifest concept (IIIF Presentation API) for generic qualitative data (manifest++). It allows project-specific search and discovery that would not be possible with the generic access tool or, in the future, with third-party tools. Note that the creation and maintenance of additional project-specific applications is not the task of DaSCH (see Additional Services).
DaSCH pursues the paradigm of open development. All DaSCH-related code is open source and developed in and for the public. The development process, the code and the documentation are freely accessible on GitHub. Any person can review, add to and expand the code base using the mechanism of pull requests.
- DSP implements persistent Archival Resource Key (ARK) identifiers not only for the project itself, but also for all resources, properties and bitstream objects. This means that every single data field can be referenced if deemed useful. (FAIR F1 & F4 according to force11)
- The project description provides sufficient information about the research field as well as the goals and aspects under which the data were collected. Digital objects are supplied with adequate metadata. If necessary, a simple project specific web application may be provided to facilitate findability of relevant data. (FAIR F2)
- The DSP is searchable through a so-called RESTful API (Representational State Transfer), an application programming interface (API) that uses HTTP requests to GET, PUT, POST and DELETE data. Our generic web-based DSP application is provided to search and view resources of all projects that are in the repository. (FAIR F3)
- Project metadata can be searched separately so that users can quickly and easily locate datasets of potential interest.
- DSP possesses a fine-grained authentication and permission system. For access to digital objects, the IIIF authentication API is implemented. (FAIR A1.2)
- DSP usually does not delete data, so that all data remains accessible. Data and digital bitstream objects may, however, be marked as deleted. The only exception to this approach is when physical deletion is required for legal reasons. (FAIR A2)
- The application programming interface (API) is based on widely adopted open standards such as REST, JSON-LD and the query language SPARQL. (FAIR I1)
- The data models are described in machine-readable and human-readable form using Linked Open Data standards such as RDF, RDFS, OWL and SHACL. The models can be exported in standard formats such as RDF/XML, Turtle, N3 and JSON-LD. DaSCH encourages the use of standard vocabularies and implements it through mapping. Standard ontologies cannot be used directly because of the implementation of access permissions, timestamp-based versioning and provenance information. Semantic interoperability is therefore achieved via a mapping mechanism which maps data model definitions to standard entities. (FAIR I2)
- DSP allows links and connections to project-internal data objects, to other projects within the DaSCH repository and to external data objects of other repositories (data-data object, data-bitstream object). (FAIR I3)
- The data models are self-describing as far as possible and supplemented by comments. They are provided in both human and machine-readable forms. (FAIR R1)
- DaSCH encourages the use of Creative Commons (CC) licenses, but custom licenses are supported as well. (FAIR R1.1)
- All data are associated with a project and a creator. (FAIR R1.2)
- DSP is based on Linked Open Data technology (RDF, RDFS, OWL, SHACL, SPARQL). It supports standard as well as project and domain specific ontologies. (FAIR R1.3)
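The mapping mechanism mentioned under FAIR I2 can be pictured with a small sketch. It uses plain Python dictionaries rather than an RDF library, and it is a strong simplification: the project prefix `myproject`, its property names, the `example.org` URLs and the chosen Dublin Core targets are hypothetical examples and not part of any DaSCH data model. Project-specific properties are used internally, and a mapping table translates them to standard vocabulary terms on export.

```python
import json

# Hypothetical mapping from project-specific properties to standard terms.
PROPERTY_MAPPING = {
    "myproject:hasTitle": "dcterms:title",
    "myproject:hasAuthor": "dcterms:creator",
}

# Prefix definitions for the exported JSON-LD-style document.
JSONLD_CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "myproject": "http://example.org/ontology/myproject#",
}


def export_with_standard_terms(resource: dict) -> dict:
    """Return a JSON-LD-style document with mapped property names.

    Properties without a mapping keep their project-specific name.
    """
    exported = {"@context": JSONLD_CONTEXT}
    for key, value in resource.items():
        exported[PROPERTY_MAPPING.get(key, key)] = value
    return exported


resource = {
    "@id": "http://example.org/resource/1",
    "myproject:hasTitle": "An example title",
    "myproject:hasAuthor": "An example author",
    "myproject:hasShelfmark": "MS 1",
}
print(json.dumps(export_with_standard_terms(resource), indent=2))
```

Keeping the project-specific names internally leaves room for permissions, versioning and provenance to be attached to them, while consumers of the export only need to understand the standard terms.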