Skip to main content

deduplicate (Stream Processor)

Removes duplicate events based on the unique.key parameter that arrive within the time.interval gap from one another.

Syntax

unique:deduplicate(<INT|LONG|FLOAT|BOOL|DOUBLE|STRING> unique.key, <INT|LONG> time.interval)

Query Parameters

NameDescriptionDefault ValuePossible Data TypesOptionalDynamic
unique.keyParameter to uniquely identify events.INT LONG FLOAT BOOL DOUBLE STRINGNoYes
time.intervalThe sliding time period within which the duplicate events are dropped.INT LONGNoNo

Example 1

CREATE STREAM TemperatureStream (sensorId string, temperature double);
CREATE SINK STREAM UniqueTemperatureStream (sensorId string, temperature double);

@info(name = 'deduplicateTemperatureQuery')
INSERT INTO UniqueTemperatureStream
SELECT *
FROM TemperatureStream WINDOW unique:deduplicate(sensorId, 30 sec);

This query creates a TemperatureStream with the attributes sensorId and temperature. It then creates a UniqueTemperatureStream with the same attributes.

The deduplicateTemperatureQuery processes events from the TemperatureStream and uses a WINDOW unique:deduplicate(sensorId, 30 sec) clause to remove duplicate events based on the sensorId attribute when they arrive within 30 seconds of each other. The deduplicated events are inserted into the UniqueTemperatureStream.