Video Object Detection: video-manager and video-worker

In this section:

  • Functions of video-manager

  • Functions of video-worker

  • Configure Video Object Detection

  • Jobs

Functions of video-manager

The video-manager service is the part of the video object detection module responsible for managing video object detection.

The video-manager service interfaces with video-worker as follows:

  • It supplies video-worker with settings and the list of to-be-processed video streams. To do so, it issues a so-called job, a video processing task that contains configuration settings and stream data.

  • In a distributed system, it distributes video streams (jobs) across vacant video-worker instances.

Note

The configuration settings passed via jobs have priority over the video-manager.yaml configuration file.

The video-manager service requires etcd, third-party software that implements a distributed key-value store. In the FindFace Server, etcd is used as a coordination service, providing the video object detector with fault tolerance.

Functionality:

  • allows for configuring video object detection parameters,

  • allows for managing the list of to-be-processed video streams.

Functions of video-worker

The video-worker service (on CPU/GPU) is the part of the video object detection module that recognizes objects in video. It can work with both live streams and video files, and supports most video formats and codecs that FFmpeg can decode.

The video-worker service interfaces with the video-manager and router services (e.g. facerouter) as follows:

  • Upon request, video-worker gets from video-manager a job with settings and the list of to-be-processed video streams.

  • video-worker posts extracted normalized object images, along with the full frames and metadata (such as bbox, camera ID, and detection time), to the router service (facerouter) for further processing.

Functionality:

  • detects objects in the video,

  • normalizes images of objects,

  • tracks objects in real time and posts the best object snapshot.

When processing a video, video-worker sequentially applies the following algorithms:

  • Motion detection. Used to reduce resource consumption: the object tracker is triggered only when the motion detector recognizes motion of a certain intensity.

  • Object tracking. The object tracker traces, detects, and captures objects in the video. It can work with several objects simultaneously. It also searches for the best object snapshot using the embedded neural network. After the best object snapshot is found, it is posted to facerouter.

The best object snapshot can be found in one of the following modes:

  • Real-time

  • Offline

Real-Time Mode

In the real-time mode, video-worker posts an object on the fly after it appears in the camera field. The following posting options are available (a configuration sketch follows the list):

  • If realtime_post_every_interval: true, the object tracker searches for the best object snapshot within each time period equal to realtime_post_interval and posts it to facerouter.

  • If realtime_post_every_interval: false, the object tracker searches for the best object snapshot dynamically:

    1. First, the object tracker estimates whether the quality of an object snapshot exceeds a pre-defined internal threshold. If so, the snapshot is posted to facerouter.

    2. The threshold value increases after each post. Each time the object tracker gets a higher quality snapshot of the same object, it is posted.

    3. When the object disappears from the camera field, the threshold value resets to default.

  • If realtime_post_first_immediately: true, the object tracker doesn’t wait for the first realtime_post_interval to complete and posts the first object from a track immediately after it passes through the quality, size, and ROI filters. The way the subsequent postings are sent depends on the realtime_post_every_interval value. If realtime_post_first_immediately: false, the object tracker posts the first object after the first realtime_post_interval completes.
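
The sketch below illustrates how these options combine; it is not taken from a real deployment. The parameter names are the realtime_* and overall_only options documented in the configuration section below, while the surrounding nesting (a per-detector block inside stream_settings) is an assumption to be checked against your actual configuration or job:

  stream_settings:
    detectors:
      face:
        overall_only: false                      # keep the real-time mode active
        realtime_post_first_immediately: true    # post the first suitable snapshot at once
        realtime_post_every_interval: true       # then post the best snapshot of every interval
        realtime_post_interval: 1                # interval length, in seconds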

Offline Mode

The offline mode is less storage-intensive than the real-time one: video-worker posts only one snapshot per track, but of the highest quality. In this mode, the object tracker buffers a video stream with an object until the object disappears from the camera field. Then the object tracker picks the best object snapshot from the buffered video and posts it to facerouter.
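
By contrast, a minimal sketch for the offline mode (same assumptions about the nesting as in the previous sketch; overall_only is documented in the configuration section below):

  stream_settings:
    detectors:
      face:
        overall_only: true    # buffer the track and post a single best snapshot per track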

Configure Video Object Detection

The video object detector is configured through the following configuration files:

  1. The video-manager configuration file, which can be generated with the -config-template command-line flag. You can find its default content here.

    When configuring video-manager, refer to the following command-line flags (the type is given in parentheses where applicable):

    -config (string): Path to config file.

    -config-template: Output config template and exit.

    -etcd-dial-timeout (duration): Timeout for failing to establish a connection (default 3s).

    -etcd-endpoints (string): List of URLs, separated by commas (default 127.0.0.1:2379).

    -etcd-key-prefix (string): Prefix for several sharded managers.

    -exp-backoff-enabled: Enable exponential backoff.

    -exp-backoff-factor (float): Factor for increasing delay (default 2).

    -exp-backoff-flush-interval (duration): Flush delay to min_delay if now()-last_NOT_STARTED > x (default 2m0s).

    -exp-backoff-max-delay (duration): Maximum delay (default 1m0s).

    -exp-backoff-min-delay (duration): Initial delay (default 1s).

    -generate-openapi: Generate OpenAPI docs and exit.

    -help: Print help information.

    -job-scheduler-script (string): Lua script to schedule jobs.

    -job-status-change-script (string): Lua script run on job status change.

    -kafka-enabled: Enable Kafka.

    -kafka-endpoints (string): List of URLs, separated by commas (default 127.0.0.1:9092).

    -listen (string): IP:port to listen on (HTTP server) (default :18810).

    -master-lease-ttl (int): Lease time-to-live, in seconds (default 10).

    -master-self-url (string): Self URL (default 127.0.0.1:18811).

    -master-self-url-http (string): Self HTTP URL (default 127.0.0.1:18810).

    -ntls-enabled: Check limits on ntls. If true, video-manager will send a job to video-worker only if the total number of processed cameras does not exceed the number of cameras allowed by the license.

    -ntls-update-interval (duration): ntls update interval (default 1m0s).

    -ntls-url (string): URL of ntls, UI port (default http://127.0.0.1:3185/).

    -router-events-url (string): Facerouter events URL.

    -router-url (string): Facerouter URL to receive detected objects from video-worker (default http://127.0.0.1:18820/v0/frame).

    -rpc-heart-beat-timeout (duration): RPC heartbeat timeout (default 4s).

    -rpc-listen (string): IP:port to listen on (RPC server) (default 127.0.0.1:18811).

    The environment variable corresponding to a command-line flag -my-flag has the format CFG_MY_FLAG; for example, the -listen flag maps to CFG_LISTEN.

    Priority:

    1. Defaults from source code (lowest priority).

    2. Configuration file.

    3. Environment variables.

    4. Command line.
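
    For illustration, a fragment of a video-manager configuration assembled from the flags above. The YAML key names are assumed to mirror the flag names and the values are arbitrary examples; verify both against the template printed by -config-template:

      # sketch only; key names and nesting must be checked against -config-template output
      listen: ":18810"
      etcd-endpoints: "127.0.0.1:2379"
      ntls-enabled: true
      ntls-url: "http://127.0.0.1:3185/"
      router-url: "http://127.0.0.1:18820/v0/frame"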

    The following parameters are available in the stream_settings section:

    play_speed: If less than zero, the speed is not limited. Otherwise, the stream is read at the given play_speed. Not applicable to live streams.

    disable_drops: Enables posting all appropriate objects without drops. By default, if video-worker does not have enough resources to process all frames with objects, it drops some of them. If this option is active, video-worker puts odd frames on the waiting list to process them later. Default value: false.

    imotion_threshold: Minimum motion intensity to be detected by the motion detector. The threshold value is to be fitted empirically. Empirical units: zero and positive rational numbers. Milestones: 0 = detector disabled, 0.002 = default value, 0.05 = minimum intensity is too high to detect motion.

    router_timeout_ms: Timeout for a facerouter response to a video-worker API request, in milliseconds. If the timeout has expired, the system will log an error. Default value: 15000.

    router_verify_ssl: Enables HTTPS certificate verification when video-worker and facerouter interact over HTTPS. Default value: true. If false, a self-signed certificate can be accepted.

    router_headers: Additional header fields in a request when posting an object: ["key = value"]. Default value: headers not specified.

    router_body: Additional body fields in a request body when posting an object: ["key = value"]. Default value: body fields not specified.

    ffmpeg_params: List of FFmpeg options for a video stream with their values, as a key=value array: ["rtsp_transport=tcp", .., "ss=00:20:00"]. Check out the FFmpeg web site for the full list of options. Default value: options not specified.

    ffmpeg_format: Pass FFmpeg format (mxg, flv, etc.) if it cannot be detected automatically.

    use_stream_timestamp: If true, retrieve and post timestamps from a video stream. If false, post the actual date and time.

    start_stream_timestamp: Add the specified number of seconds to timestamps from a stream.

    rot: Enables detecting and tracking objects only inside a clipping rectangle WxH+X+Y. You can use this option to reduce video-worker load. Default value: rectangle not specified.

    video_transform: Change a video frame orientation right after decoding. Values (case-insensitive, JPEG Exif Orientation Tag in brackets): None (1), FlipHorizontal (2), Rotate180 (3), FlipVertical (4), Transpose (5), Rotate90 (6), Transverse (7), Rotate270 (8). Default value: not specified.

    enable_recorder: Enables video recording for Video Recorder (must be installed).

    enable_liveness: Enables liveness detection (must be installed). Default value: false.

    record_audio: Enables audio recording. Default value: false.
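
    A hypothetical stream_settings fragment built only from the parameters above (values are arbitrary examples; the exact placement of the section within the configuration file or job should be checked against the default template):

      stream_settings:
        play_speed: -1                        # do not limit the reading speed
        imotion_threshold: 0.002              # default motion intensity threshold
        router_timeout_ms: 15000
        router_verify_ssl: true
        ffmpeg_params: ["rtsp_transport=tcp"]
        use_stream_timestamp: false
        rot: "1280x720+0+0"                   # clipping rectangle WxH+X+Y
        record_audio: false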

    The following parameters are available for each detector type (face, body, car):

    filter_min_quality: Minimum threshold value for an object image quality. Default value: subject to the object type. Do not change the default value without consulting our technical experts (support@ntechlab.com).

    filter_min_size: Minimum size of an object in pixels. Calculated as the square root of the relevant bbox area. Undersized objects are not posted. Default value: 1.

    filter_max_size: Maximum size of an object in pixels. Calculated as the square root of the relevant bbox area. Oversized objects are not posted. Default value: 8192.

    roi: Enables posting only objects detected inside a region of interest WxH+X+Y. Default value: region not specified.

    fullframe_crop_rot: Crop posted full frames by ROT. Default value: false.

    fullframe_use_png: Send full frames in PNG instead of JPEG, which is set by default. Do not enable this parameter without supervision from our team, as it can affect the functioning of the entire system. Default value: false (send in JPEG).

    jpeg_quality: Quality of an original frame JPEG compression, in percent. Default value: 95%.

    overall_only: Enables the offline mode for the best object search. Default value: true (CPU), false (GPU).

    realtime_post_first_immediately: Enables posting an object image right after it appears in a camera field of view (real-time mode). Default value: false.

    realtime_post_interval: Only for the real-time mode. Defines the time period in seconds within which the object tracker picks up the best snapshot and posts it to facerouter. Default value: 1.

    realtime_post_every_interval: Only for the real-time mode. Post the best snapshots obtained within each realtime_post_interval time period. If false, search for the best snapshot dynamically and send snapshots in order of increasing quality. Default value: false.

    track_interpolate_bboxes: Interpolate missed bboxes of objects in a track. For example, if frames #1 and #4 have bboxes and #2 and #3 do not, the system reconstructs the absent bboxes #2 and #3 based on the #1 and #4 data. Enabling this option allows you to increase the detection quality at the expense of performance. Default value: true.

    track_miss_interval: The system closes a track if there has been no new object in the track within the specified time (seconds). Default value: 1.

    track_overlap_threshold: Tracker IoU overlap threshold. Default value: 0.25.

    track_max_duration_frames: The maximum approximate number of frames in a track, after which the track is forcefully completed. Enable it to forcefully complete "eternal tracks," for example, tracks with objects from advertisement media. Default value: 0 (option disabled).

    track_send_history: Send track history. Default value: false.

    post_best_track_frame: Send full frames of detected objects. Default value: true.

    post_best_track_normalize: Send normalized images of detected objects. Default value: true.

    post_first_track_frame: Post the first frame of a track. Default value: false.

    post_last_track_frame: Post the last frame of a track. Default value: false.

    tracker_type: Tracker type (simple_iou or deep_sort). Default value: simple_iou.

    track_deep_sort_matching_threshold: Track features matching threshold (confidence) for the deep_sort tracker. Default value: 0.65.

    track_deep_sort_filter_unconfirmed_tracks: Filter unconfirmed (too short) tracks in the deep_sort tracker. Default value: true.

    track_object_is_principal: Track by this object in the N-in-1 detector tracker. Default value: false.

    track_history_active_track_miss_interval: Do not count a track as active if N seconds have passed; applies only if track_send_history=true. Default value: 0.

    filter_track_min_duration_frames: Post an object only if its track length is at least N frames. Default value: 1.

    extractors_track_triggers: Tracker events that trigger the extractor.
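
    A hypothetical fragment with per-detector settings, shown for the face detector (body and car accept the same parameters; the nesting is an assumption, and the values are arbitrary examples):

      detectors:
        face:
          filter_min_size: 60
          filter_max_size: 8192
          roi: "1920x1080+0+0"          # post only objects inside this region of interest
          jpeg_quality: 95
          overall_only: false           # real-time mode
          realtime_post_interval: 1
          tracker_type: simple_iou
          track_miss_interval: 1
          post_best_track_frame: true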

  2. The video-worker configuration file, video-worker-cpu.yaml or video-worker-gpu.yaml, depending on the acceleration type in use.

    When configuring video-worker (on CPU/GPU), refer to the following parameters:

    -h, --help {bool}: Display help and exit (env: CFG_HELP).

    -c, --config {string}: Path to config file (env: CFG_CONFIG).

    -C, --config-template {bool}: Echo default config to stdout (env: CFG_CONFIG_TEMPLATE).

    --config-template-format {string}: config-template output format (yaml/env/ini) (env: CFG_CONFIG_TEMPLATE_FORMAT).

    -i, --input {string}: Read streams from a file, do not use video-manager (env: CFG_INPUT).

    --exit-on-first-finished {bool}: Exit on the first finished job; only when --input is specified (env: CFG_EXIT_ON_FIRST_FINISHED).

    --batch-size {number}: Batch size (env: CFG_BATCH_SIZE).

    --metrics-port {number}: HTTP server port for metrics, 0 = do not start the server (env: CFG_METRICS_PORT).

    --resize-scale {double}: Rescale video frames with the given coefficient; if 1, do not resize (env: CFG_RESIZE_SCALE).

    --capacity {number}: Maximum number of video streams to be processed by video-worker (env: CFG_CAPACITY).

    --ntls-addr {string}: ntls server ip:port (env: CFG_NTLS_ADDR).

    --save-dir {string}: Debug: save objects to a directory (env: CFG_SAVE_DIR).

    --resolutions {ResolutionsVector}: Preinit detector for specified resolutions: "640x480;1920x1080" (env: CFG_RESOLUTIONS).

    --strict-resolutions {bool}: Use resolutions as the only possible values; others will be rescaled (env: CFG_STRICT_RESOLUTIONS).

    --labels {workerLabels}: video-worker labels: labels = k=v;group=enter (env: CFG_LABELS).

    --use-time-from-sei {bool}: Use timestamps from the SEI packet (env: CFG_USE_TIME_FROM_SEI).

    --frame-buffer-size {number}: Reader frame buffer size (env: CFG_FRAME_BUFFER_SIZE).

    --events-queue-size {number}: Internal events queue size (env: CFG_EVENTS_QUEUE_SIZE).

    --skip-count {number}: Skip count (env: CFG_SKIP_COUNT).

    --mgr-cmd {string}: Command to obtain the video-manager gRPC ip:port (env: CFG_MGR_CMD).

    --mgr-static {string}: video-manager gRPC ip:port (env: CFG_MGR_STATIC).

    --mgr-id-prefix {string}: video-worker ID prefix (env: CFG_MGR_ID_PREFIX).

    --streamer-port {number}: Streamer/shots webserver port, 0 = disabled (env: CFG_STREAMER_PORT).

    --streamer-url {string}: Streamer URL to access this video-worker on streamer_port (env: CFG_STREAMER_URL).

    --streamer-tracks {bool}: Use tracks instead of detects for the streamer (env: CFG_STREAMER_TRACKS).

    --streamer-tracks-last {bool}: Use tracks with lastFrameId=currentFrameId (.tracks must be true) (env: CFG_STREAMER_TRACKS_LAST).

    --streamer-max-backpressure {number}: Maximum backpressure for a client connection, in bytes (env: CFG_STREAMER_MAX_BACKPRESSURE).

    --liveness-fnk {string}: Path to the liveness .fnk file (env: CFG_LIVENESS_FNK).

    --liveness-norm {string}: Path to the normalization for liveness (env: CFG_LIVENESS_NORM).

    --liveness-batch-size {number}: Liveness batch size (env: CFG_LIVENESS_BATCH_SIZE).

    --liveness-interval {double}: Liveness internal algorithm parameter (env: CFG_LIVENESS_INTERVAL).

    --liveness-stdev-cnt {number}: Liveness internal algorithm parameter (env: CFG_LIVENESS_STDEV_CNT).

    --imotion-shared-decoder {bool}: Use a shared decoder for imotion (experimental) (env: CFG_IMOTION_SHARED_DECODER).

    --send-threads {number}: Posting threads (env: CFG_SEND_THREADS).

    --send-queue-limit {number}: Posting maximum queue size (env: CFG_SEND_QUEUE_LIMIT).

    --send-disable-drops {bool}: Disable send queue drops (env: CFG_SEND_DISABLE_DROPS).

    --recorder-enabled {bool}: Video recording enabled (env: CFG_RECORDER_ENABLED).

    --recorder-chunk-size {number}: Maximum size of video recording chunks (env: CFG_RECORDER_CHUNK_SIZE).

    --recorder-storage-dir {string}: Absolute path to the temporary storage folder (env: CFG_RECORDER_STORAGE_DIR).

    --recorder-video-storage-url {string}: Video storage API URL (env: CFG_RECORDER_VIDEO_STORAGE_URL).

    --recorder-video-storage-threads {number}: Persistent threads for uploads and requests (env: CFG_RECORDER_VIDEO_STORAGE_THREADS).

    --recorder-video-storage-timeout {double}: Video storage API requests timeout, in seconds (env: CFG_RECORDER_VIDEO_STORAGE_TIMEOUT).

    --recorder-video-storage-max-retries {number}: Number of retries on request failure (env: CFG_RECORDER_VIDEO_STORAGE_MAX_RETRIES).

    --models-cache-dir {string}: Path to the cache directory (env: CFG_MODELS_CACHE_DIR).

    --models-detectors {CfgDetectors}: Detectors (env: CFG_MODELS_DETECTORS).

    --models-normalizers {CfgNormalizer}: Normalizers (env: CFG_MODELS_NORMALIZERS).

    --models-extractors {CfgExtractor}: Extractors (env: CFG_MODELS_EXTRACTORS).

    --models-objects {CfgObject}: Objects (env: CFG_MODELS_OBJECTS).

    --face-min-size {number}: DEPRECATED [use models-detectors] detector parameter (env: CFG_FACE_MIN_SIZE).

    --face-detector {string}: DEPRECATED [use models-detectors] path to the face detector (env: CFG_FACE_DETECTOR).

    --face-norm {string}: DEPRECATED [use models-objects/models-normalizers] path to the normalizer (usually crop2x) (env: CFG_FACE_NORM).

    --face-quality {string}: DEPRECATED [use models-objects/models-extractors] path to the face quality extractor (env: CFG_FACE_QUALITY).

    --face-norm-quality {string}: DEPRECATED [use models-extractors] path to the face quality normalizer (env: CFG_FACE_NORM_QUALITY).

    --face-track-features {string}: DEPRECATED [use models-objects/models-extractors] path to the face track features extractor (env: CFG_FACE_TRACK_FEATURES).

    --face-track-features-norm {string}: DEPRECATED [use models-extractors] path to the face track features normalizer (env: CFG_FACE_TRACK_FEATURES_NORM).

    --body-min-size {number}: DEPRECATED [use models-detectors] detector parameter (env: CFG_BODY_MIN_SIZE).

    --body-detector {string}: DEPRECATED [use models-detectors] path to the body detector (env: CFG_BODY_DETECTOR).

    --body-norm {string}: DEPRECATED [use models-objects/models-normalizers] path to the normalizer (usually crop2x) (env: CFG_BODY_NORM).

    --body-quality {string}: DEPRECATED [use models-objects/models-extractors] path to the body quality extractor (env: CFG_BODY_QUALITY).

    --body-norm-quality {string}: DEPRECATED [use models-extractors] path to the body quality normalizer (env: CFG_BODY_NORM_QUALITY).

    --body-track-features {string}: DEPRECATED [use models-objects/models-extractors] path to the body track features extractor (env: CFG_BODY_TRACK_FEATURES).

    --body-track-features-norm {string}: DEPRECATED [use models-extractors] path to the body track features normalizer (env: CFG_BODY_TRACK_FEATURES_NORM).

    --car-min-size {number}: DEPRECATED [use models-detectors] detector parameter (env: CFG_CAR_MIN_SIZE).

    --car-detector {string}: DEPRECATED [use models-detectors] path to the car detector (env: CFG_CAR_DETECTOR).

    --car-norm {string}: DEPRECATED [use models-objects/models-normalizers] path to the normalizer (usually crop2x) (env: CFG_CAR_NORM).

    --car-quality {string}: DEPRECATED [use models-objects/models-extractors] path to the car quality extractor (env: CFG_CAR_QUALITY).

    --car-norm-quality {string}: DEPRECATED [use models-extractors] path to the car quality normalizer (env: CFG_CAR_NORM_QUALITY).

    --car-track-features {string}: DEPRECATED [use models-objects/models-extractors] path to the car track features extractor (env: CFG_CAR_TRACK_FEATURES).

    --car-track-features-norm {string}: DEPRECATED [use models-extractors] path to the car track features normalizer (env: CFG_CAR_TRACK_FEATURES_NORM).
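
    For illustration, a fragment of a video-worker configuration assembled from the flags above. As with video-manager, the key names are assumed to mirror the flag names and the values are arbitrary examples; verify both against the template printed by --config-template:

      # sketch only; check key names against --config-template output
      capacity: 10
      batch-size: 8
      mgr-static: "127.0.0.1:18811"
      ntls-addr: "127.0.0.1:3133"
      streamer-port: 18999
      models-cache-dir: "/var/cache/findface/models_cache"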

When configuring video-worker, note the format of the models-detectors, models-normalizers, models-extractors, and models-objects parameters:

  1. --models-detectors: describes detectors.

    • format: name1:/path/to/file.fnk,min_size=60;name2:file2.fnk,min_size=100;....

    • name1: a unique name.

    • /path/to/file.fnk: a path to the .fnk model file.

    • min_size=60: an optional parameter.

    Example:

    CFG_MODELS_DETECTORS="bag:/usr/share/findface-data/models/detector/bag.gustav_accurate.001.gpu.fnk;bear:/usr/share/findface-data/models/detector/bear.gustav_accurate.001.gpu.fnk,min_size=100"
    
    models:
      detectors:
        bag:
          fnk_path: /usr/share/findface-data/models/detector/bag.gustav_accurate.001.gpu.fnk
          min_size: 60
        bear:
          fnk_path: /usr/share/findface-data/models/detector/bear.gustav_accurate.001.gpu.fnk
          min_size: 100
    
  2. --models-normalizers: describes normalizers that can be used in --models-extractors and --models-objects.

    • format: name1:/path/to/file.fnk;name2:file2.fnk;....

    • name1: a unique name.

    • /path/to/file.fnk: a path to the .fnk model file.

    Example:

    CFG_MODELS_NORMALIZERS="face_norm:/usr/share/findface-data/models/facenorm/crop2x.v2_maxsize400.gpu.fnk;face_quality_norm:/usr/share/findface-data/models/facenorm/crop1x.v2_maxsize400.gpu.fnk"
    
    models:
      normalizers:
        face_norm:
          fnk_path: /usr/share/findface-data/models/facenorm/crop2x.v2_maxsize400.gpu.fnk
        face_quality_norm:
          fnk_path: /usr/share/findface-data/models/facenorm/crop1x.v2_maxsize400.gpu.fnk
    
  3. --models-extractors: describes extractors that can be used in --models-objects.

    • format: name1:/path/to/file.fnk,normalizer=normName;....

    • name1: a unique name.

    • /path/to/file.fnk: a path to the .fnk model file.

    • normalizer=normName: must be described in --models-normalizers.

    • batch_size=123: an optional parameter that sets the batch size for this extractor.

    Example:

    CFG_MODELS_EXTRACTORS="face_quality:/usr/share/findface-data/models/faceattr/quality_fast.v1.gpu.fnk,normalizer=face_quality_norm,batch_size=512"
    
    models:
      extractors:
        face_quality:
          fnk_path: /usr/share/findface-data/models/faceattr/quality_fast.v1.gpu.fnk
          normalizer: face_quality_norm
          batch_size: 512
    
  4. --models-objects: describes objects.

    • format: name1:normalizer=normName,quality=extractorName,track_features=extractorName;....

    • name1: an object name.

    • normalizer=normName: how to normalize an object when sending it to router_url. normName must be described in --models-normalizers. It may be empty.

    • quality=extractorName: how to extract the quality of an object. extractorName must be described in --models-extractors. It may be empty.

    • track_features=extractorName: how to extract features for the DeepSortTracker. extractorName must be described in --models-extractors. It may be empty.

    Example:

    CFG_MODELS_OBJECTS="face:normalizer=face_norm,quality=face_quality,track_features="
    
    models:
      objects:
        face:
          normalizer: face_norm
          quality: face_quality
          track_features: ""
    

If necessary, you can also enable neural network models and normalizers to detect bodies, vehicles, and liveness. You can find the detailed step-by-step instructions in the corresponding sections of the documentation.

Jobs

The video-manager service provides video-worker with a so-called job, a video processing task that contains configuration settings and stream data.

You can find a job example here and view the parameters of the job here.
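
Purely as an illustration of what a job bundles together (the authoritative format is the example referenced above), the sketch below combines a stream reference with the stream_settings documented earlier; the field names and nesting other than the documented stream_settings parameters are assumptions:

  # hypothetical job sketch; rely on the referenced job example for real field names
  stream_url: "rtsp://10.0.0.5:554/live"
  router_url: "http://127.0.0.1:18820/v0/frame"
  stream_settings:
    imotion_threshold: 0.002
    ffmpeg_params: ["rtsp_transport=tcp"]
    detectors:
      face:
        realtime_post_interval: 1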