Commit 5c141b99 authored by Bastien Abadie

Architectures

parent 516e4701
Showing
with 161 additions and 16 deletions
......@@ -22,6 +22,12 @@ Arkindex needs a few hard-requirements to run on your own hardware:
- [Docker](https://docs.docker.com/get-docker/) is needed, as we only deploy through Docker images,
- Linux is the only supported operating system. We strongly recommend using [Ubuntu LTS](https://ubuntu.com/download/desktop)
- All your images must be hosted on a [IIIF](https://iiif.io/) server, or you'll need to expose them through a local IIIF server
- A domain name for the platform server:
  - ideally a public domain name if your server is reachable from the Internet (like `arkindex.company.com`),
  - or an internal domain name, provided by your company's system administrator.
- An SSL certificate for that domain name:
  - it can be issued by [Let's Encrypt](https://letsencrypt.org/) freely and automatically if your server is reachable from the Internet,
  - otherwise an internal certificate, provided by your company's system administrator.
To run the Enterprise Edition, your servers must be able to make regular API calls (a few times a day) to a remote server to validate their licence. The servers do **not** need to be exposed to the Internet; they only need to be able to make outbound requests to a domain.
......
......@@ -5,11 +5,18 @@ weight = 10
+++
## Architecture
The main part of the architecture uses a set of open-source software along with our own proprietary software.
We'll use different terms for the components of our product:
- **Platform server** is the server that will run the **Backend** code responsible for the **REST API**,
- Arkindex needs to run some specific asynchronous tasks that require direct access to the database: the **local worker** will execute these tasks,
- In the Enterprise Edition, some intensive Machine Learning tasks will be executed by **Remote workers**, using proprietary software called **Ponos**. One instance of Ponos is called an **Agent**.
{{ figure(image="deployment/architecture.png", height=400, caption="Arkindex platform architecture") }}
# Overview
The main part of the architecture uses a set of open-source software along with our own software.
{{ figure(image="deployment/architecture/overview.png", height=400, caption="Arkindex platform architecture") }}
The open source components here are:
- Traefik as load balancer,
......@@ -19,9 +26,74 @@ The open source components here are:
- PostgreSQL as database,
- Solr as search engine.
You'll also need to run a set of workers on dedicated servers: this is where the Machine Learning processes will run.
## Machine Learning
In the Enterprise Edition, you'll also need to run a set of workers on dedicated servers: this is where the Machine Learning processes will run.
{{ figure(image="deployment/workers.png", height=400, caption="Arkindex workers for Machine Learning") }}
{{ figure(image="deployment/architecture/workers.png", height=400, caption="Arkindex workers for Machine Learning") }}
Each worker in the diagram represents a dedicated server, running our in-house job scheduling agents and dedicated Machine Learning tasks.
# Common cases
We only cover the most common cases here; if you have questions about your own architecture, please [contact us](https://teklia.com/company/contact/).
## Single Server
This is the simplest option: a standalone server that hosts all the services in **Docker containers**.
A single `docker-compose.yml` can efficiently deploy the whole stack.
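As a rough illustration of what such a file could look like, here is a minimal sketch mirroring the services shown in the single-server diagram. It is not the official Arkindex configuration: the images under `registry.example.com`, all version tags, and the password value are hypothetical placeholders, and the Arkindex-specific settings (environment variables, worker command, Traefik routing labels) are deliberately left out.

```yaml
# Illustrative sketch only, NOT the official Arkindex docker-compose.yml.
# Images under registry.example.com and all version tags are assumptions.
services:
  lb:
    image: traefik:v2.10
    command:
      - "--providers.docker=true"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # Lets Traefik discover the other containers
      - /var/run/docker.sock:/var/run/docker.sock:ro

  backend:
    image: registry.example.com/arkindex/backend:latest  # hypothetical image
    depends_on: [db, redis, solr, minio]

  worker:
    # Hypothetical: the internal worker is assumed to reuse the backend image
    image: registry.example.com/arkindex/backend:latest
    depends_on: [db, redis]

  cantaloupe:
    image: registry.example.com/cantaloupe:latest  # hypothetical IIIF server image
    depends_on: [minio]

  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: change-me  # placeholder, use a real secret
    volumes:
      - db_data:/var/lib/postgresql/data

  redis:
    image: redis:7

  solr:
    image: solr:8

  minio:
    image: minio/minio
    command: server /data
    volumes:
      - minio_data:/data

volumes:
  db_data:
  minio_data:
```

Named volumes keep the database and object storage data across container restarts, which matters on a single server where everything shares the same disk.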
{{ figure(image="deployment/architecture/single.png", height=400, caption="Arkindex stack on a single server") }}
**Pros:**
- Simple to deploy and maintain
- Cheap
**Cons:**
- Limited disk space
- Limited performance
- Single point of failure
## Cluster
With more budget, you can deploy Arkindex across several servers, still using docker-compose along with placement constraints on [Docker Swarm](https://docs.docker.com/engine/swarm/).
A **Docker Swarm cluster** lets you run Docker services instead of individual containers, with multiple containers per service, so you benefit from higher throughput and eliminate single points of failure.
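To make the idea of placement constraints concrete, the fragment below sketches how two services could be pinned to different groups of nodes using the `deploy` section of a Compose file. The node label `role`, its values, the replica counts, and the `registry.example.com` image are assumptions chosen for illustration, not values taken from the Arkindex documentation.

```yaml
# Illustrative sketch only: label names, replica counts and the backend
# image are hypothetical.
services:
  backend:
    image: registry.example.com/arkindex/backend:latest  # hypothetical image
    deploy:
      replicas: 2            # one container per web node in this sketch
      placement:
        constraints:
          - node.labels.role == web

  db:
    image: postgres:14
    deploy:
      replicas: 1            # a single database instance
      placement:
        constraints:
          - node.labels.role == database
```

Under these assumptions, each node would first be labelled with `docker node update --label-add role=web <node-name>`, and the stack deployed with `docker stack deploy -c docker-compose.yml arkindex` (the stack name is arbitrary). The `deploy` section is what Swarm reads to decide how many containers to run and where to schedule them.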
{{ figure(image="deployment/architecture/cluster.png", height=400, caption="Arkindex stack on a Docker Swarm cluster") }}
**Pros:**
- High performance
- Service replicas for high availability
- Network segregation for better security
**Cons:**
- Limited disk space
- Harder to maintain and monitor
## Cloud provider
You can also deploy Arkindex on a cloud provider (like Amazon AWS, Google GCP, Microsoft Azure), using their managed services instead of self-hosting databases and shared S3-compatible storage.
Most cloud providers have managed offers for the services required by Arkindex (load balancer, PostgreSQL, S3-compatible storage, search engine and Redis cache). You'll then just need to run the Arkindex containers, either:
- through managed Docker containers,
- or by building your own Docker Swarm cluster on their VPS offering.
{{ figure(image="deployment/architecture/cloud.png", height=400, caption="Arkindex stack on a cloud provider") }}
**Pros:**
- High performance
- Low maintenance for the services you no longer host yourself
- Unlimited disk space
**Cons:**
- Expensive
- Vendor lock-in
graph TD
lb[Load balancer] --> docker
subgraph docker[Containers]
backend[Arkindex backend]
cantaloupe[IIIF server]
worker[Arkindex internal worker]
end
subgraph services[Managed service]
minio[S3-compatible storage]
redis[Cache]
db[Database]
solr[Search engine]
end
subgraph gpu[GPU-enabled services]
ponos -->ml_task[Machine Learning Task]
end
docker --> services
gpu --> docker
content/deployment/architecture/cloud.png

32.1 KiB

graph TD
subgraph server_web1[WebServices n°1]
lb[Load balancer]
lb --> backend[Arkindex backend]
lb --> cantaloupe[IIIF server]
end
subgraph server_web2[WebServices n°2]
lb --> backend2[Arkindex backend]
lb --> cantaloupe2[IIIF server]
end
subgraph storage[File storage]
lb --> minio
end
subgraph server_db1[Databases]
minio[S3-compatible storage]
redis[Cache]
db[Database]
solr[Search engine]
end
subgraph server_worker1[Worker n°1]
worker[Arkindex internal worker]
worker2[Arkindex internal worker n°2]
end
subgraph server_worker2[Worker n°2]
ponos -->ml_task[Machine Learning Task]
end
server_web1 -..-> server_db1
server_web2 -..-> server_db1
server_worker1 -..-> server_db1
server_worker2 -..-> server_web1
server_worker2 -..-> server_web2
content/deployment/architecture/cluster.png

68 KiB

graph TD
subgraph server
lb[Load balancer] --> frontend[Arkindex frontend]
frontend --> backend
lb --> backend[Arkindex backend]
lb --> cantaloupe[IIIF server]
lb --> minio[S3-compatible storage]
cantaloupe --> minio
backend --> worker[Arkindex internal worker]
worker --> backend
backend --> redis[Cache]
backend --> db[Database]
backend --> solr[Search engine]
end
content/deployment/architecture/single.png

38.3 KiB

......@@ -7,23 +7,12 @@ weight = 30
This documentation is written for **system administrators**.
We'll use different terms for the components of our product:
- **Platform server** is the server that will run the **Backend** code responsible for the **REST API**,
- Arkindex needs to run some specific asynchronous tasks that require direct access to the database: the **local worker** will execute these tasks,
- Some intensive Machine Learning tasks will be executed by **Remote workers**, using proprietary software called **Ponos**. One instance of Ponos is called an **Agent**.
## Requirements
- A bare metal server running Linux Ubuntu LTS (20.04 or 22.04) for the platform
- If you plan to run Machine Learning processes, you'll need another server
- [Docker installed on that server](https://docs.docker.com/desktop/install/linux-install/)
- [docker-compose](https://docs.docker.com/desktop/install/linux-install/)
- A domain name for the platform server:
  - ideally a public domain name if your server is reachable from the Internet (like `arkindex.company.com`),
  - or an internal domain name, provided by your company's system administrator.
- An SSL certificate for that domain name:
  - it can be issued by [Let's Encrypt](https://letsencrypt.org/) freely and automatically if your server is reachable from the Internet,
  - otherwise an internal certificate, provided by your company's system administrator.
## Third-party services
......