Direct API requests to deduplicator

You can use the HTTP API to extract data directly from the deduplicator component. The deduplicator checks an EmbeN against a new or existing deduplication group.

General Information

The deduplicator component accepts requests to http://<deduplicator_ip>:18310/.

If no group with the given ID exists, or if the supplied parameters differ from those of the existing group, the group is created or re-created with the provided parameters (status code 201).

If a group with the same ID and the same parameters already exists, the EmbeN is probed against it (status code 200).

If a match is found in the group, this method returns unique=false and the meta object that was supplied with the matching EmbeN when it was first seen. The timestamp of the matching EmbeN in the deduplication group is updated, and the current EmbeN and its metadata are discarded.

If no match is found, this method returns unique=true and no meta. The current EmbeN and its metadata are stored in the deduplication group.

API v1

POST /v1/groups/{group_id}/probe

This is the main method of the deduplicator. It accepts an EmbeN, the deduplication group parameters, and the EmbeN metadata.

Parameters in path segments:

  • group_id: ID of the group to probe against, string.

Request body:

  • application/json: the request body contains only JSON.

{
  "emben": "string",
  "params": {
    "deduplication_period": 30,
    "deduplication_threshold": 0.75
  },
  "meta": 123
}

  • emben: EmbeN, string. Required parameter.

  • params:

    • deduplication_period: deduplication period in seconds, integer. Required parameter.

    • deduplication_threshold: deduplication similarity threshold, number (float). Required parameter.

  • meta: any JSON value describing this EmbeN.
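As an illustration of the request body described above, here is a minimal Python sketch that assembles, but does not send, a probe request using only the standard library. The function name build_probe_request, the host 127.0.0.1, and the choice to omit meta when it is None are assumptions made for this example and do not come from the deduplicator documentation.

```python
import json
import urllib.parse
import urllib.request


def build_probe_request(base_url, group_id, emben, period, threshold, meta=None):
    """Build (without sending) a POST /v1/groups/{group_id}/probe request.

    Hypothetical helper: the name and the handling of meta are assumptions.
    """
    body = {
        "emben": emben,
        "params": {
            "deduplication_period": period,        # seconds, integer
            "deduplication_threshold": threshold,  # similarity threshold, float
        },
    }
    # Assumption: meta is not marked as required, so it is left out when None.
    if meta is not None:
        body["meta"] = meta

    # Percent-encode the group ID so it is safe to use as a path segment.
    path = "/v1/groups/{}/probe".format(urllib.parse.quote(group_id, safe=""))
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example with placeholder values; replace the host with <deduplicator_ip>.
request = build_probe_request("http://127.0.0.1:18310/", "group-1", "string", 30, 0.75, meta=123)
```

The resulting object can be passed to urllib.request.urlopen to perform the call against a running deduplicator.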

Returns:

On success, the EmbeN is probed against the existing group and the result is returned:

{
  "unique": false,
  "matched_meta": 123
}
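A response body of this shape could be interpreted roughly as follows. This is a sketch only: the helper name parse_probe_response is hypothetical, and it assumes that matched_meta is simply absent when unique is true, which follows the description above but is not otherwise confirmed.

```python
import json


def parse_probe_response(raw_body):
    """Return (is_unique, matched_meta) from a probe response body.

    Hypothetical helper; assumes matched_meta is missing when unique is true.
    """
    data = json.loads(raw_body)
    is_unique = bool(data["unique"])
    # Metadata is only meaningful when an earlier matching EmbeN was found.
    matched_meta = None if is_unique else data.get("matched_meta")
    return is_unique, matched_meta


# A duplicate: the metadata stored with the earlier EmbeN is returned.
duplicate = parse_probe_response('{"unique": false, "matched_meta": 123}')

# A new EmbeN: no metadata is returned.
fresh = parse_probe_response('{"unique": true}')
```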

Common Codes

Code    Description

201     Created new group or reconfigured existing.

400     Invalid request or parameters. BAD_PARAM.
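A client may want to branch on these codes together with the 200 code mentioned in the general information. The sketch below is one possible way to do that with the Python standard library; the names send_probe and classify_status and the returned labels are assumptions for illustration. Note that urllib raises HTTPError for a 400 response, so the status is read from the exception in that case.

```python
import urllib.error
import urllib.request


def send_probe(request):
    """Send a prepared urllib Request and return (status, raw_body).

    Hypothetical helper; error responses such as 400 arrive as HTTPError.
    """
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        return error.code, error.read()


def classify_status(status):
    """Map a status code from this section to a short label (labels are assumptions)."""
    if status == 200:
        return "probed"          # EmbeN probed against an existing group
    if status == 201:
        return "group_created"   # group created or reconfigured
    if status == 400:
        return "bad_request"     # invalid request or parameters (BAD_PARAM)
    return "unexpected"          # any code not described in this section
```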