This community-supported project aims to provide a simple, reusable example of how to run Dataverse on a Kubernetes cluster.
NOTE: all Docker images in this project work for released versions of Dataverse only!
You'll need a running, fully configured Kubernetes cluster: OpenShift, Minikube, GKE, a full-fledged cluster, or similar.
If you want to register datasets and/or files from your deployment with DataCite, EZID or a similar provider, you will need a working account to configure. Otherwise, you might want to use the FAKE provider. See also: http://guides.dataverse.org/en/latest/installation/config.html and IQSS/dataverse#5448
Quick'n'dirty demo on naked cluster:
kubectl apply -k .
Notes:
- This will of course need a recent `kubectl` and a configured cluster context.
- This is usable for demo purposes.
- You really want to provide a secure admin password for anything serious.
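
For anything beyond a demo, one way to get a strong admin password is to generate it randomly and store it as a Kubernetes secret. The following is a minimal sketch, not this project's documented procedure: the helper names `gen_password` and `create_admin_secret`, the secret name and the `password` key are hypothetical, so check the secrets documentation linked below for the names the deployment actually expects.

```shell
#!/bin/sh
# Sketch only: helper names, secret name and key are hypothetical placeholders.

# Print 32 random alphanumeric characters.
# 4096 random bytes leave far more than 32 characters after filtering.
gen_password() {
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-32
}

# Store a password in a generic Kubernetes secret.
#   $1: secret name (hypothetical), $2: password
create_admin_secret() {
  kubectl create secret generic "$1" --from-literal=password="$2"
}

# Example call, left commented out so nothing touches the cluster by accident:
#   create_admin_secret dataverse-admin-demo "$(gen_password)"
```

Note that passing the password via `--from-literal` can expose it in the process list and in shell history, so treat this as a starting point rather than a hardened setup.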
You should make yourself familiar with a series of documentation articles, linked below:
- Container images
- Persistent storage
- Detailed insight into inner workings
- Using Kubernetes descriptors from this project
- Configuration of Dataverse
- Secrets usage
- (Custom) Metadata Blocks
- Maintenance Jobs and Little Helpers
Please be aware that this project currently only offers images and support for basic usage. Integrations are not yet part of this, but may be added as needed. See also relevant docs within Dataverse guides and upstream projects.
First, read up on all of the above about production usage and get familiar with it. More details on using this project to develop Dataverse are linked below:
- Development container images
- Prepare toolchain
- Using local cluster
- Using remote cluster (not yet supported)
If you think this is weird and/or cumbersome:
As long as K8s usage is not a first class citizen for IQSS, this project should not (or cannot) be included in Dataverse upstream.
+ We don't have to deal with the upstream merge process for PRs and can move quicker.
+ We can use tools like Skaffold, Kustomize, etc., which are only usable when living at the topmost level.
- We have to deal with `git submodules` and somewhat bloated image builds.
- We cannot use fancy Maven tools like JIB and others.
When switching to a new Dataverse version (you will need to change the image tag), please always read upstream release notes carefully.
Deployments and changed files are included in the images, but sometimes you will need to execute some actions manually.
These actions are intentionally left out of automation. For example, re-indexing can put heavy load on your deployment, so you might want to schedule it for off-hours.
We will try to point out any of those in release notes of our k8s images.
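
An upgrade along these lines could be scripted roughly as follows. This is a hedged sketch rather than a supported procedure: the image name `iqss/dataverse-k8s`, the deployment name `dataverse`, the tag `4.20`, the presence of `curl` in the container, and the use of a standalone `kustomize` binary are all assumptions to adapt. Commands are only printed unless `DRY_RUN=0` is set, and the re-index step is kept as a separate function so it can be run manually during off-hours.

```shell
#!/bin/sh
# Assumptions (not taken from this README): image, deployment name and tag.
IMAGE="${IMAGE:-iqss/dataverse-k8s}"
DEPLOYMENT="${DEPLOYMENT:-dataverse}"
NEW_TAG="${NEW_TAG:-4.20}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Pin the new image tag, apply the descriptors and wait for the rollout.
upgrade() {
  run kustomize edit set image "${IMAGE}=${IMAGE}:${NEW_TAG}"
  run kubectl apply -k .
  run kubectl rollout status "deployment/${DEPLOYMENT}"
}

# Manual step: start a full re-index through the Dataverse admin API.
# Assumes the API is reachable on localhost:8080 inside the pod.
reindex() {
  run kubectl exec "deployment/${DEPLOYMENT}" -- \
    curl -sS http://localhost:8080/api/admin/index
}

upgrade
```

Review the printed commands first, then set `DRY_RUN=0` to execute them. `reindex` is deliberately never called automatically; the maintenance jobs documented by this project may be a better fit for that step.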
This project is supported by the Dataverse community rather than IQSS. If you need help, please open an issue.
At a later point in time, an Operator might be added for even easier usage.
The Docker images should at some point be moved into the upstream code, so they can be built and used for development purposes, too. See also issue 5292 on this.
This should support testing S3 remote file storage with Minio out of the box.