What is the process we should be following to ensure we don’t get duplicate events?

Hi all, what is the process we should be following to ensure we don’t get duplicate events? I found one reference to “event.disambiguation_key”; is this the way forward?


Hi @Ion_Todd , There is no user-based mechanism for deduplication of data on the Chronicle SIEM side at this time. Identical batches of logs are automatically deduplicated.
The disambiguation key is used when a single log outputs multiple UDM events: e.g., if a single log outputs two UDM events, they will be tagged with disambiguation key 1 and disambiguation key 2, respectively.
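
As a rough sketch of what that looks like (the field names below are illustrative, not exact UDM paths; check the UDM reference for the real ones):

# Illustrative only: one raw log might parse into two UDM events, e.g. a
# process launch plus a file modification. Both come from the same raw log
# but carry different disambiguation keys so downstream processing can
# tell them apart.
events_from_one_log = [
    {"metadata": {"event_type": "PROCESS_LAUNCH"}, "disambiguation_key": 1},
    {"metadata": {"event_type": "FILE_MODIFICATION"}, "disambiguation_key": 2},
]

for event in events_from_one_log:
    print(event["disambiguation_key"], event["metadata"]["event_type"])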
Just to confirm: when you say “event”, you mean a log parsed into UDM, and not a rule detection alert, correct?

Thanks for the clarification @Gal_Polak1 . Sorry for the incorrect wording; when I said “event” above, I’m actually talking about a raw log hitting the ingest API (and being parsed), so I’m sort of conflating things.

Are you able to share how identical batches are automatically deduplicated? An easy example for me to find right now is a custom log source where we don’t have an event timestamp in the raw log. These are four copies of the same raw log being replayed (accidentally by us, I assume) into the ingest API. The search string I’ve used is the log’s ID.

Adding a timestamp to the original log is possible. Is the deduplication relying on ID + timestamp?
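
In the meantime, here is the kind of client-side guard we’re considering before our code calls the ingest API. A minimal sketch, assuming we key each log on its ID plus an added timestamp (should_send and post_to_ingest_api are illustrative names we’d write ourselves, not Chronicle APIs):

import hashlib

# Sketch of a client-side replay guard, not a Chronicle feature: key each
# raw log on its ID plus the timestamp we add, and skip anything we have
# already submitted in this process.
seen = set()

def dedup_key(raw_log_id, event_timestamp):
    return hashlib.sha256(f"{raw_log_id}|{event_timestamp}".encode()).hexdigest()

def should_send(raw_log_id, event_timestamp):
    key = dedup_key(raw_log_id, event_timestamp)
    if key in seen:
        return False  # identical ID+timestamp already sent; drop the replay
    seen.add(key)
    return True

# Usage:
# if should_send(log["id"], log["ts"]):
#     post_to_ingest_api(log)  # hypothetical wrapper around our ingest client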
