Multiple Video Cards Usage
If you have several video cards installed on a physical server, you can create additional findface-extraction-api or findface-video-worker instances on your GPU-based system and distribute them across the video cards, one instance per card. If you have followed the instructions to prepare the server on Ubuntu (CentOS, Debian), you should already be all set to proceed.
However, before you take further action, make sure that you have a properly generated /etc/docker/daemon.json configuration file. The example below is provided for the Ubuntu operating system; it shows how to configure the Docker network and enable the NVIDIA Container Runtime, and assumes that the NVIDIA Container Runtime is already installed.
sudo su
BIP=10.$((RANDOM % 256)).$((RANDOM % 256))
cat > /etc/docker/daemon.json <<EOF
{
  "default-address-pools": [
    {"base": "$BIP.0/16", "size": 24}
  ],
  "bip": "$BIP.1/24",
  "fixed-cidr": "$BIP.0/24",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
EOF
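After writing daemon.json, it may be worth confirming that Docker has picked up the NVIDIA runtime and listing the GPU IDs that the instances will later reference via CUDA_VISIBLE_DEVICES. A quick check, assuming nvidia-smi and the NVIDIA Container Toolkit are installed:

```shell
# Restart Docker so that the new daemon.json takes effect
sudo systemctl restart docker

# The default runtime should now be reported as "nvidia"
docker info --format '{{.DefaultRuntime}}'

# List the available GPUs and their IDs
# (these IDs are used later as CUDA_VISIBLE_DEVICES values)
nvidia-smi -L
```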
Refer to the corresponding sections of the documentation to validate the configuration of the /etc/docker/daemon.json file for CentOS and Debian.
Distribute findface-extraction-api Instances Across Several Video Cards
To distribute the findface-extraction-api instances across several video cards, create multiple instances of the findface-extraction-api service in the docker-compose.yaml configuration file. Then, for the findface-extraction-api GPU instances to work within one system, bind them together via a load balancer, e.g., Nginx.
Do the following:
Configure the docker-compose.yaml file. Open the /opt/findface-cibr/docker-compose.yaml file and create multiple records of the findface-extraction-api configuration in the findface-extraction-api section. Each new instance of findface-extraction-api should be configured identically to the others so that queries are processed consistently. Do the following:

- Rename the default findface-extraction-api service section.
- Create configuration copies for the new instances.
- Configure the instances.

Below is an example of two configured services:
sudo vi /opt/findface-cibr/docker-compose.yaml

findface-extraction-api-0:
  command: [--config=/etc/findface-extraction-api.ini]
  depends_on: [findface-ntls]
  environment:
    - CUDA_VISIBLE_DEVICES=0
    - CFG_LISTEN=127.0.0.1:18660
  image: docker.int.ntl/ntech/universe/extraction-api-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-extraction-api/findface-extraction-api.yaml:/etc/findface-extraction-api.ini:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-extraction-api/models:/var/cache/findface/models_cache']
findface-extraction-api-1:
  command: [--config=/etc/findface-extraction-api.ini]
  depends_on: [findface-ntls]
  environment:
    - CUDA_VISIBLE_DEVICES=1
    - CFG_LISTEN=127.0.0.1:18661
  image: docker.int.ntl/ntech/universe/extraction-api-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-extraction-api/findface-extraction-api.yaml:/etc/findface-extraction-api.ini:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-extraction-api/models:/var/cache/findface/models_cache']
- findface-extraction-api-0 — the readable name of an instance; must be unique for each instance;
- CUDA_VISIBLE_DEVICES=0 — the ID of the GPU on which the service instance runs; must be unique for each instance;
- CFG_LISTEN=127.0.0.1:18660 — the IP:port on which the instance listens for requests; must be unique for each instance;
- ./cache/findface-extraction-api/models:/var/cache/findface/models_cache — the volume for storing the model cache. It may be the same volume if your GPUs are of the same model; otherwise, the caches are stored separately.
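For instance, if the server had a third video card, a hypothetical findface-extraction-api-2 service would follow the same pattern, with only the GPU ID and the listening port changed (its address must also be added to the load balancer configuration):

```yaml
findface-extraction-api-2:
  command: [--config=/etc/findface-extraction-api.ini]
  depends_on: [findface-ntls]
  environment:
    - CUDA_VISIBLE_DEVICES=2        # third GPU on the host
    - CFG_LISTEN=127.0.0.1:18662    # next free listening port
  image: docker.int.ntl/ntech/universe/extraction-api-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-extraction-api/findface-extraction-api.yaml:/etc/findface-extraction-api.ini:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-extraction-api/models:/var/cache/findface/models_cache']
```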
The remaining configuration parameters can be the same for every findface-extraction-api service instance.

Configure load balancing in the /opt/findface-cibr/docker-compose.yaml file. To do so, add a new service (findface-extraction-api-lb in our example) to the docker-compose.yaml file, e.g., at the end of the file.

findface-extraction-api-lb:
  depends_on: [findface-ntls]
  image: docker.int.ntl/ntech/multi/multi/ui-cibr:ffcibr-2.1.2
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  volumes: ['./configs/findface-extraction-api/loadbalancer.conf:/etc/nginx/conf.d/default.conf:ro']
We have specified the loadbalancer.conf configuration file in the volumes of the findface-extraction-api-lb section. Now we need to create it.
Create the load balancer configuration file with the following content:

sudo vi /opt/findface-cibr/configs/findface-extraction-api/loadbalancer.conf

upstream findface-extraction-api-lb {
    least_conn;
    server 127.0.0.1:18660 max_fails=3 fail_timeout=60s;
    server 127.0.0.1:18661 max_fails=3 fail_timeout=60s;
}

server {
    listen 18666 default_server;
    server_name _;

    location / {
        proxy_pass http://findface-extraction-api-lb;
    }
}
In the upstream section, configure the balancing policy and the list of the findface-extraction-api instances between which the load will be distributed. You can choose among the balancing policy options below; we recommend the least_conn policy.

- round-robin — requests are distributed evenly across the backend servers in a circular manner;
- least_conn — requests are sent to the server with the fewest active connections. This algorithm is useful when backend servers have different capacities;
- weighted — backend servers are assigned different weights, and requests are distributed based on those weights. This allows you to prioritize certain servers over others.
The server 127.0.0.1:18660 max_fails=3 fail_timeout=60s line consists of the following meaningful parts:

- server — defines an upstream server in Nginx;
- 127.0.0.1:18660 — the IP address (127.0.0.1) and port (18660) of the findface-extraction-api instance;
- max_fails=3 — the maximum number of failed connection attempts Nginx allows before it considers the server unavailable;
- fail_timeout=60s — how long Nginx considers the server unavailable (in this case, 60 seconds) once it exceeds the number of failed attempts specified by max_fails;
- weight=2 (not used in the provided example) — assigns a relative weight, or priority, to a server in an upstream group. A server with weight=2 receives twice as many requests as a server with the default weight of 1, which is useful in scenarios where GPUs of different performance are used.
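As an illustrative sketch, an upstream section for two GPUs of unequal performance might use weights like this (the weight values here are assumptions, not recommendations; when weights are given and no policy directive is set, Nginx applies weighted round-robin by default):

```nginx
upstream findface-extraction-api-lb {
    server 127.0.0.1:18660 weight=2 max_fails=3 fail_timeout=60s;  # faster GPU
    server 127.0.0.1:18661 weight=1 max_fails=3 fail_timeout=60s;  # slower GPU
}
```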
Open the /opt/findface-cibr/configs/findface-sf-api/findface-sf-api.yaml configuration file and make sure that the url parameter in the extraction-api section points to the load balancer address (127.0.0.1:18666 in our example). Adjust it if it differs; otherwise, don't change anything.

sudo vi /opt/findface-cibr/configs/findface-sf-api/findface-sf-api.yaml

extraction-api:
  timeouts:
    connect: 5s
    response_header: 30s
    overall: 35s
    idle_connection: 10s
  max-idle-conns-per-host: 20
  keepalive: 24h0m0s
  trace: false
  url: http://127.0.0.1:18666
Rebuild all FindFace CIBR containers, removing at the same time orphan containers for services that are no longer defined in the docker-compose.yaml file.

cd /opt/findface-cibr/
docker-compose down
docker-compose up -d --remove-orphans
To make sure that everything works as expected, check the logs of the services.
docker compose logs --tail 10 -f findface-extraction-api-0
docker compose logs --tail 10 -f findface-extraction-api-1
docker compose logs --tail 10 -f findface-extraction-api-lb
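You can also send a test request to the load balancer port to confirm that it proxies traffic to the instances. The exact response body depends on the findface-extraction-api API; for this check, any HTTP response (rather than a connection error) indicates that the balancer is up:

```shell
curl -i http://127.0.0.1:18666/
```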
Allocate findface-video-worker to an Additional Video Card
To create an additional findface-video-worker instance on your GPU-based system and allocate it to a different video card, do the following:
In the /opt/findface-cibr/docker-compose.yaml file, specify the findface-video-worker configuration for each running findface-video-worker instance. Copy your current findface-video-worker configuration.

sudo vi /opt/findface-cibr/docker-compose.yaml

findface-video-worker:
  command: [--config=/etc/findface-video-worker.yaml]
  depends_on: [findface-video-manager, findface-ntls, mongodb]
  environment: [CUDA_VISIBLE_DEVICES=0]
  image: docker.int.ntl/ntech/universe/video-worker-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-video-worker/findface-video-worker.yaml:/etc/findface-video-worker.yaml:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-video-worker/models:/var/cache/findface/models_cache',
    './cache/findface-video-worker/recorder:/var/cache/findface/video-worker-recorder']
Then, adjust it accordingly.
findface-video-worker-0:
  command: [--config=/etc/findface-video-worker.yaml]
  depends_on: [findface-video-manager, findface-ntls, mongodb]
  environment:
    - CUDA_VISIBLE_DEVICES=0
    - CFG_STREAMER_PORT=18990
    - CFG_STREAMER_URL=127.0.0.1:18990
  image: docker.int.ntl/ntech/universe/video-worker-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-video-worker/findface-video-worker.yaml:/etc/findface-video-worker.yaml:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-video-worker/models:/var/cache/findface/models_cache',
    './cache/findface-video-worker/recorder:/var/cache/findface/video-worker-recorder']
findface-video-worker-1:
  command: [--config=/etc/findface-video-worker.yaml]
  depends_on: [findface-video-manager, findface-ntls, mongodb]
  environment:
    - CUDA_VISIBLE_DEVICES=1
    - CFG_STREAMER_PORT=18991
    - CFG_STREAMER_URL=127.0.0.1:18991
  image: docker.int.ntl/ntech/universe/video-worker-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-video-worker/findface-video-worker.yaml:/etc/findface-video-worker.yaml:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-video-worker/models:/var/cache/findface/models_cache',
    './cache/findface-video-worker/recorder:/var/cache/findface/video-worker-recorder']
The main parameters here are as follows:

- findface-video-worker-0 — the new name of the findface-video-worker instance (in our example, each instance is assigned a number equal to the GPU ID on the device);
- CUDA_VISIBLE_DEVICES=0 — the ID of the CUDA device on which the new instance runs;
- CFG_STREAMER_PORT=18991 — the streamer port; must be unique for each instance;
- CFG_STREAMER_URL=127.0.0.1:18991 — the URL used to connect to and receive the stream from the findface-video-worker instance; must be unique for each instance.
All other parameters remain unchanged.
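Following the same pattern, a hypothetical third instance on a third video card would take the next GPU ID and another unique streamer port:

```yaml
findface-video-worker-2:
  command: [--config=/etc/findface-video-worker.yaml]
  depends_on: [findface-video-manager, findface-ntls, mongodb]
  environment:
    - CUDA_VISIBLE_DEVICES=2          # third GPU on the host
    - CFG_STREAMER_PORT=18992         # next free streamer port
    - CFG_STREAMER_URL=127.0.0.1:18992
  image: docker.int.ntl/ntech/universe/video-worker-gpu:ffserver-9.230407.1
  logging: {driver: journald}
  network_mode: service:pause
  restart: always
  runtime: nvidia
  volumes: ['./configs/findface-video-worker/findface-video-worker.yaml:/etc/findface-video-worker.yaml:ro',
    './models:/usr/share/findface-data/models:ro',
    './cache/findface-video-worker/models:/var/cache/findface/models_cache',
    './cache/findface-video-worker/recorder:/var/cache/findface/video-worker-recorder']
```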
Rebuild all FindFace CIBR containers, removing at the same time orphan containers for services that are no longer defined in the docker-compose.yaml file.

cd /opt/findface-cibr/
docker-compose down
docker-compose up -d --remove-orphans
To make sure that everything works fine, check the logs of the findface-video-worker services.
docker compose logs --tail 10 -f findface-video-worker-0
docker compose logs --tail 10 -f findface-video-worker-1