Video Object Detection: findface-video-manager and findface-video-worker
Note
The findface-video-worker is delivered in a CPU-accelerated (findface-video-worker-cpu) and a GPU-accelerated (findface-video-worker-gpu) package.
Functions of findface-video-manager
The findface-video-manager service is the part of the video object detection module that manages the video object detection functionality.
The findface-video-manager service interfaces with findface-video-worker as follows:
It supplies findface-video-worker with settings and the list of to-be-processed video streams. To do so, it issues a so-called job, a video processing task that contains configuration settings and stream data.
In a distributed system, it distributes video streams (jobs) across vacant findface-video-worker instances.
Note
The configuration settings passed via jobs have priority over the findface-video-manager.yaml configuration file.
The findface-video-manager service requires etcd, third-party software that implements a distributed key-value store for findface-video-manager. In the FindFace core, etcd serves as a coordination service, providing the video object detector with fault tolerance.
findface-video-manager functionality:
allows for configuring video object detection parameters
allows for managing the list of to-be-processed video streams
Functions of findface-video-worker
The findface-video-worker service (on CPU/GPU) is the part of the video object detection module that recognizes objects in video. It works with both live streams and files and supports most video formats and codecs that FFmpeg can decode.
The findface-video-worker service interfaces with the findface-video-manager and findface-facerouter services as follows:
By request, findface-video-worker gets a job with settings and the list of to-be-processed video streams from findface-video-manager.
The findface-video-worker service posts extracted normalized object images, along with the full frames and metadata (such as bbox, camera ID, and detection time), to the findface-facerouter service for further processing.
Note
In FindFace Multi, the findface-facerouter functions are performed by findface-multi-legacy.
findface-video-worker functionality:
detects objects in the video,
normalizes object images,
tracks objects in real time and posts the best object snapshot.
When processing a video, findface-video-worker applies the following algorithms in sequence:
Motion detection. Used to reduce resource consumption. The object tracker is triggered only when the motion detector recognizes motion of a certain intensity.
Object tracking. The object tracker traces, detects, and captures objects in the video. It can work with several objects simultaneously. It also searches for the best object snapshot using the embedded neural network. After the best object snapshot is found, it is posted to findface-facerouter.
The best object snapshot can be found in one of the following modes:
Real-time
Offline
Real-Time Mode
In the real-time mode, findface-video-worker posts an object on-the-fly after it appears in the camera field. The following posting options are available:
If realtime_post_every_interval: true, the object tracker searches for the best object snapshot within each time period equal to realtime_post_interval and posts it to findface-facerouter.
If realtime_post_every_interval: false, the object tracker searches for the best object snapshot dynamically: first, it estimates whether the quality of an object snapshot exceeds a pre-defined internal threshold; if so, the snapshot is posted to findface-facerouter. The threshold value increases after each post, and each time the object tracker gets a higher-quality snapshot of the same object, that snapshot is posted. When the object disappears from the camera field, the threshold value resets to the default.
If realtime_post_first_immediately: true, the object tracker does not wait for the first realtime_post_interval to complete and posts the first object from a track immediately after it passes the quality, size, and ROI filters. How the subsequent postings are sent depends on the realtime_post_every_interval value. If realtime_post_first_immediately: false, the object tracker posts the first object after the first realtime_post_interval completes. See the configuration sketch below.
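For illustration, here is a minimal stream_settings sketch for the real-time mode (the nesting of these keys under detectors→face is an assumption based on the per-detector parameter list later in this section; values are illustrative):

    detectors:
      face:
        overall_only: false                    # real-time mode: post snapshots on the fly
        realtime_post_first_immediately: true  # post the first suitable snapshot at once
        realtime_post_every_interval: true     # then post the best snapshot of each interval
        realtime_post_interval: 2              # interval length, seconds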
Offline Mode
The offline mode is less storage-intensive than the real-time one, as in this mode findface-video-worker posts only one snapshot per track, but of the highest quality. The object tracker buffers a video stream with an object until the object disappears from the camera field, then picks the best object snapshot from the buffered video and posts it to findface-facerouter (see the sketch below).
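In configuration terms, the offline mode is controlled by the overall_only parameter described later in this section; a minimal sketch, under the same assumed nesting as above:

    detectors:
      face:
        overall_only: true    # offline mode: buffer the track, post one best snapshot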
Configure Video Object Detection
The video object detector is configured through the following configuration files:
The findface-video-manager configuration file findface-video-manager.yaml. You can find its default content here. When configuring findface-video-manager, refer to the following parameters:
etcd→endpoints: IP address and port of the etcd service. Default value: etcd:2379.
ntls→enabled: If true, findface-video-manager sends a job to findface-video-worker only if the total number of processed cameras does not exceed the number of cameras allowed by the license. Default value: false.
ntls→url: IP address and port of the findface-ntls host. Default value: http://findface-ntls:3185/.
router_url: IP address and port of the findface-facerouter host that receives detected faces from findface-video-worker. In FindFace Multi, the findface-facerouter functions are performed by findface-multi-legacy.
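For orientation, a fragment of findface-video-manager.yaml covering these keys might look as follows (a sketch; the router address and port are placeholders, not verified defaults):

    etcd:
      endpoints: etcd:2379              # distributed key-value store
    ntls:
      enabled: false                    # do not enforce the license camera limit here
      url: http://findface-ntls:3185/   # license server
    router_url: http://findface-facerouter:18820/   # placeholder router address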
The following parameters are available for stream_settings configuration:
play_speed: If less than zero, the speed is not limited. Otherwise, the stream is read at the given play_speed. Not applicable to live streams.
disable_drops: Enables posting all suitable objects without drops. By default, if findface-video-worker does not have enough resources to process all frames with objects, it drops some of them. If this option is active, findface-video-worker puts the excess frames on a waiting list to process them later. Default value: false.
imotion_threshold: Minimum motion intensity to be detected by the motion detector. The threshold value is to be fitted empirically. Empirical units: zero and positive rational numbers. Milestones: 0 = detector disabled, 0.002 = default value, 0.05 = minimum intensity is too high to detect motion.
router_timeout_ms: Timeout for a findface-facerouter (or findface-multi-legacy in the standard FindFace Multi configuration) response to a findface-video-worker API request, in milliseconds. If the timeout expires, the system logs an error. Default value: 15000.
router_verify_ssl: Enables HTTPS certificate verification when findface-video-worker and findface-facerouter (or findface-multi-legacy in the standard FindFace Multi configuration) interact over HTTPS. Default value: true. If false, a self-signed certificate can be accepted.
router_headers: Additional header fields in a request when posting an object: ["key = value"]. Default value: headers not specified.
router_body: Additional body fields in a request body when posting an object: ["key = value"]. Default value: body fields not specified.
ffmpeg_params: List of FFmpeg options for a video stream with their values, as a key=value array: ["rtsp_transport=tcp", .., "ss=00:20:00"]. Check out the FFmpeg web site for the full list of options. Default value: options not specified.
ffmpeg_format: Pass the FFmpeg format (mxg, flv, etc.) if it cannot be detected automatically.
use_stream_timestamp: If true, retrieve and post timestamps from the video stream. If false, post the actual date and time.
start_stream_timestamp: Add the specified number of seconds to timestamps from a stream.
rot: Enables detecting and tracking objects only inside a clipping rectangle WxH+X+Y. You can use this option to reduce the findface-video-worker load. Default value: rectangle not specified.
stream_data_filter: POSIX extended regex. If the content of the data stream matches the filter, it is sent to router_url. Default value: not specified.
video_transform: Changes the video frame orientation right after decoding. Values (case-insensitive, JPEG Exif Orientation Tag in brackets): None (1), FlipHorizontal (2), Rotate180 (3), FlipVertical (4), Transpose (5), Rotate90 (6), Transverse (7), Rotate270 (8). Default value: not specified.
enable_recorder: Enables video recording for Video Recorder (must be installed).
enable_liveness: Enables liveness detection (must be installed). Default value: false.
record_audio: Enables audio recording. Default value: false.
use_rtsp_time: If use_stream_timestamp: true, adds the start stream timestamp of the RTSP source. Default value: true.
draw_rot: Filling outside the ROT. Possible values: "none": the area outside the ROT mask is filled only before object detection; frames are posted unchanged, and the entire downstream pipeline works with the original frames. "area" (default): the fill outside the ROT area is applied to the entire pipeline, and the frames are posted with this fill.
detectors: Detectors: "face", "car", "body".
The following parameters are available for configuration for each detector type (face, body, car):
filter_min_quality: Minimum threshold value for an object image quality. Default value: subject to the object type. Do not change the default value without consulting our technical experts (support@ntechlab.com).
filter_min_size: Minimum size of an object in pixels. Calculated as the square root of the relevant bbox area. Undersized objects are not posted. Default value: 1.
filter_max_size: Maximum size of an object in pixels. Calculated as the square root of the relevant bbox area. Oversized objects are not posted. Default value: 8192.
roi: Enables posting objects detected only inside a region of interest WxH+X+Y. Default value: region not specified.
fullframe_crop_rot: Crop posted full frames by ROT. Default value: false.
fullframe_use_png: Send full frames in PNG instead of the default JPEG. Do not enable this parameter without supervision from our team, as it can affect the functioning of the entire system. Default value: false (send in JPEG).
jpeg_quality: Quality of the original frame JPEG compression, in percent. Default value: 95%.
overall_only: Enables the offline mode for the best object search. Default value: true (CPU), false (GPU).
realtime_post_first_immediately: Enables posting an object image right after it appears in a camera field of view (real-time mode). Default value: false.
realtime_post_interval: Only for the real-time mode. Defines the time period in seconds within which the object tracker picks up the best snapshot and posts it to findface-facerouter. Default value: 1.
realtime_post_every_interval: Only for the real-time mode. Post the best snapshots obtained within each realtime_post_interval time period. If false, search for the best snapshot dynamically and send snapshots in order of increasing quality. Default value: false.
track_interpolate_bboxes: Interpolate missed bboxes of objects in a track. For example, if frames #1 and #4 have bboxes and #2 and #3 do not, the system reconstructs the absent bboxes #2 and #3 based on the #1 and #4 data. Enabling this option increases the detection quality at the cost of performance. Default value: true.
track_miss_interval: The system closes a track if no new object has appeared in the track within the specified time (seconds). Default value: 1.
track_overlap_threshold: Tracker IoU overlap threshold. Default value: 0.25.
track_max_duration_frames: The approximate maximum number of frames in a track, after which the track is forcefully completed. Enable it to forcefully complete "eternal tracks," for example, tracks with objects from advertisement media. Default value: 0 (option disabled).
track_send_history: Send track history. Default value: false.
post_best_track_frame: Send full frames of detected objects. Default value: true.
post_best_track_normalize: Send normalized images of detected objects. Default value: true.
post_first_track_frame: Post the first frame of a track. Default value: false.
post_last_track_frame: Post the last frame of a track. Default value: false.
tracker_type: Tracker type (simple_iou or deep_sort). Default value: simple_iou.
track_deep_sort_matching_threshold: Track features matching threshold (confidence) for the deep_sort tracker. Default value: 0.65.
track_deep_sort_filter_unconfirmed_tracks: Filter unconfirmed (too short) tracks in the deep_sort tracker. Default value: true.
track_object_is_principal: Track by this object in an N-in-1 detector/tracker. Default value: false.
track_history_active_track_miss_interval: Don't count a track as active if N seconds have passed; effective only if track_send_history: true. Default value: 0.
filter_track_min_duration_frames: Post only if the object track length is at least N frames. Default value: 1.
tracker_settings: Tracker settings. Available only for the OC-SORT tracker.
extractors_track_triggers: Tracker events that trigger the extractor.
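Putting several of the options above together, a hedged stream_settings example (values are illustrative; verify the exact nesting against the default findface-video-manager.yaml of your version):

    stream_settings:
      play_speed: -1                         # do not limit the reading speed
      ffmpeg_params: ["rtsp_transport=tcp"]  # force TCP transport for RTSP
      rot: 1280x720+320+180                  # clipping rectangle WxH+X+Y
      detectors:
        face:
          filter_min_size: 60                # drop undersized detections
          overall_only: true                 # offline mode: one best snapshot per track
          track_miss_interval: 1             # close a track after 1 s without the object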
The findface-video-worker configuration file findface-video-worker-cpu.yaml or findface-video-worker-gpu.yaml, subject to the acceleration type in use. When configuring findface-video-worker (on CPU/GPU), refer to the following parameters:
input: Process streams from a file, ignoring stream data from findface-video-manager.
exit_on_first_finished: (Only if input is specified) Exit on the first finished job.
batch_size: Post faces in batches of the given size.
metrics_port: HTTP server port to send metrics. If 0, the metrics are not sent.
resize_scale: Rescale video frames with the given coefficient.
capacity: Maximum number of video streams to be processed by findface-video-worker.
ntls_addr: IP address and port of the findface-ntls host.
save_dir: (For debug) Save detected objects to the given directory.
resolutions: Preinitialize the detector to work with the specified resolutions. Example: "640x480;1920x1080".
strict_resolutions: Use only the values defined in resolutions; any other values will be rescaled.
labels: Labels used to allocate a video object detector instance to a certain group of cameras. See Allocate findface-video-worker to Camera Group.
use_time_from_sei: (For MPEG-2) Use SEI (supplemental enhancement information) timestamps.
frame_buffer_size: Reader frame buffer size.
events_queue_size: Internal events queue size.
skip_count: Skip count.
mgr→cmd: (Optional, instead of the mgr→static parameter) A command to obtain the IP address of the findface-video-manager host.
mgr→static: IP address of the findface-video-manager host that provides findface-video-worker with settings and the list of to-be-processed streams.
id_prefix: ID prefix of the findface-video-worker instance.
streamer→port: IP port to access the video wall.
streamer→url: URL address to access the video wall.
streamer→tracks: Use tracks instead of detects for the streamer.
streamer→tracks_last: Use tracks with lastFrameId=currentFrameId (tracks must be true).
streamer→max_backpressure: Maximum backpressure for a client connection (bytes).
liveness→fnk: Path to the liveness model file (.fnk).
liveness→norm: Path to the normalization model file for liveness.
liveness→batch_size: Liveness batch size.
liveness→interval: Internal liveness detection algorithm parameter.
liveness→stdev_cnt: Internal liveness detection algorithm parameter.
imotion→shared_decoder: Use a shared decoder for imotion.
send→threads: Posting threads.
send→queue_limit: Posting maximum queue size.
send→disable_drops: Disable send queue drops.
recorder→enabled: Video recording enabled.
recorder→chunk_size: Maximum size of video recording chunks.
recorder→storage_dir: Absolute path to the temporary storage folder.
recorder→video_storage: Video storage API URL, persistent threads for uploads and requests, API requests timeout, and number of retries on request failure.
video_decoder→cpu: (Only GPU) If necessary, decode video on CPU.
device_number: (Only GPU) GPU device number to use.
models→cache_dir: Path to the cache directory; the default path is /var/cache/findface/models_cache.
models→detectors: Detectors.
models→normalizers: Normalizers.
models→extractors: Extractors.
models→objects: Objects.
detectors→face/body/car→min_size: Minimum size of an object (face, body, or car) that can be detected.
If necessary, you can also enable neural network models and normalizers to detect bodies, cars, and liveness; detailed step-by-step instructions are available in the dedicated sections.
Tip
If the system contains cameras with different resolutions (for example, 1280x720, 1920x1080, 2560x1440, or 3840x2160), it is recommended to unify their resolutions for resource-efficient operation by scaling all streams to a single target resolution. Follow the steps below.
Open the /opt/findface-multi/configs/findface-video-worker/findface-video-worker.yaml configuration file.

    sudo vi /opt/findface-multi/configs/findface-video-worker/findface-video-worker.yaml

Specify the resolution to which all camera video streams will be scaled in resolutions, and set strict_resolutions to true.

    resolutions: [1920x1080]
    strict_resolutions: True

Restart the findface-multi-findface-video-worker-1 container.

    sudo docker restart findface-multi-findface-video-worker-1
Jobs
The findface-video-manager service provides findface-video-worker with a so-called job, a video processing task that contains configuration settings and stream data.
You can find a job example here.
Each job has the following parameters:
id: job ID.
enabled: active status.
stream_url: URL/address of the video stream/file to process.
labels: key-value labels that the router component (findface-multi-legacy in the standard FindFace Multi configuration) will use to find processing directives for objects detected in this stream.
router_url: URL/address of the router component (findface-facerouter, findface-multi-legacy) that receives detected objects from the findface-video-worker component for processing.
router_events_url: URL/address of the router component (findface-facerouter, findface-multi-legacy) that uses event extraction.
single_pass: if true, disables restarting video processing upon error. Default value: false.
stream_settings: video stream settings that duplicate those in the findface-video-manager.yaml configuration file (while having priority over them).
stream_settings_gpu: deprecated video stream settings. Not recommended for use. Kept only for compatibility.
weight: job weight, a positive floating-point number. Default value: 1.0. It can be set in the range [1e-3, 1e6] with a step of 1e-3; the entered value is rounded to three decimal places. The weight indicates how many capacity units of a findface-video-worker the job consumes. In other words, findface-video-manager assigns a job to a findface-video-worker only if the total weight of all jobs assigned to it does not exceed its capacity.
prio: job priority, a floating-point number that can be positive or negative. Default value: 0.0. The value may be set in the range [-1e6, 1e6] with a step of 1e-3; the entered value is rounded to three decimal places. If the capacity of suitable findface-video-worker instances is exhausted, jobs with a higher priority preempt jobs with a lower priority. In addition, any single_pass job is considered to have a higher priority than any non-single_pass job. When selecting a job to preempt, the scheduler sorts first by the single_pass flag and then by priority: it searches for a findface-video-worker that currently holds the lowest-priority non-single_pass jobs; if no such findface-video-worker is found, it searches for one that holds the lowest-priority single_pass jobs. The identified low-priority jobs are interrupted, freeing the required number of capacity units, after which the higher-priority job is assigned.
status: job status.
status_msg: additional job status info.
statistic: job progress statistics (processing duration, the number of posted and not-posted objects, processing FPS, the number of processed and dropped frames, job start time, etc.).
restream_url: websocket URL where the processed stream with detected objects is streamed live.
restream_direct_url: websocket URL where the original stream is streamed live at input quality.
shots_url: HTTP URL where an actual stream screenshot can be downloaded.
worker_id: unique ID of the findface-video-worker instance processing the job.
version: job version.
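To make the structure concrete, here is a hedged job example in YAML notation (all values are placeholders; refer to the job example linked above for the authoritative format):

    id: job-demo-1
    enabled: true
    stream_url: rtsp://10.0.0.5/cam1    # placeholder camera stream
    labels:
      camera_group: entrance            # placeholder routing label
    router_url: http://findface-facerouter:18820/   # placeholder router address
    single_pass: false
    weight: 1.0
    prio: 0.0
    stream_settings:
      ffmpeg_params: ["rtsp_transport=tcp"]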
Time Settings
When you create a job, you can specify time parameters. These parameters determine how event timestamps are generated on posting or upon recording to the VMS, which is important for calculating the final timestamp of an event. The default time parameter in video-worker.yaml is use_time_from_sei: false. The default time parameters in video-manager.yaml are use_stream_timestamp: false and use_rtsp_time: true.
Let’s consider various configurations:
use_stream_timestamp: false, use_time_from_sei: either true or false, use_rtsp_time: either true or false. The current server time (wall-clock time) will be used.
use_stream_timestamp: true, use_time_from_sei: true, use_rtsp_time: either true or false. SEI timestamps (if any) or stream timestamps (pts) will be used unchanged.
use_stream_timestamp: true, use_time_from_sei: false, use_rtsp_time: true. The final timestamp will be calculated by the formula: final_ts = pts - start_pts + start_stream_timestamp + rtsp_start_time, where:
pts: the stream pts timestamps.
start_pts: the minimum observed pts of the stream, subtracted to make the first frame time equal to 0.
start_stream_timestamp: a job setting within stream_settings.
rtsp_start_time: the start time of the stream in real-world time, as specified by certain RTSP servers.
use_stream_timestamp: true, use_time_from_sei: false, use_rtsp_time: false. The final timestamp will be calculated by the formula: final_ts = pts - start_pts + start_stream_timestamp.
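As a worked example (all numbers invented for illustration): suppose the first observed pts of a stream is 100 s, the current frame's pts is 160 s, start_stream_timestamp is 30, and the RTSP server reports a real-world start time of 1700000000 (Unix seconds). With the third configuration above, the final timestamp is:

    final_ts = 160 - 100 + 30 + 1700000000 = 1700000090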