Application Insights - Alerts - Custom Signal

Alex Maiereanu 40 Reputation points

Hi,

When setting up Azure alerts based on KQL queries on the Traces table, we run into situations where alerts keep being thrown after the event that the custom query actually matched. The timeline is:

At 10:00 we have a hit in the Traces table.

An alert is fired -> OK.

The same alert type keeps being triggered even though there are no more hits in the table; it's as if the alerting system enters a loop.

The alerts stop being thrown after a couple of hours, or if we disable and re-enable the rule.

(Also, to make sure we do not get standard AI responses: yes, the alert has a specific time window. We also checked the query behind the alert, and it does not return any hits.)

  1. Bharath Y P 9,730 Reputation points Microsoft External Staff Moderator

    Hello Alex, just checking in. If the information shared was helpful, please accept the answer and upvote it. Please feel free to reach out if you have any additional questions or need further clarification. Thanks.

  2. Alex Maiereanu 40 Reputation points

    Hi Bharath,
    Sadly no, your answer was not helpful. We already configured the alerts correctly. This situation started happening by itself. The alerts keep firing for hours after the first event.

  3. Alex Maiereanu 40 Reputation points

    Is anybody from the Azure support team reviewing this case?

  4. Bharath Y P 9,730 Reputation points Microsoft External Staff Moderator

    To validate whether this is a backend/platform issue, collect the following:

    1. Compare TimeGenerated vs ingestion_time()
    // AppTraces is the workspace-based table; in the classic schema use traces with timestamp and message
    AppTraces
    | where TimeGenerated > ago(24h)
    | extend ingestion = ingestion_time()
    | project TimeGenerated, ingestion, Message
    | order by ingestion desc

    Check whether old traces are being ingested or replayed late.

    2. Verify whether the alert payload references old records. From the alert history:
    • inspect the fired alert timestamps
    • compare them with the actual trace timestamps

    If the same original event timestamp appears repeatedly, that strongly indicates that a stale evaluation result is being reused.

    3. Check whether dimensions are enabled

    If dimensions are configured, for example:

    • operation_Id
    • cloud_RoleName
    • customDimensions

    then Azure can maintain separate alert instances internally even when the base query later returns empty.

    We recommend validating the ingestion timestamps (ingestion_time()) and reviewing the alert payload for repeated historical events.
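    The alert-payload check described above can be sketched offline. This is a minimal Python illustration, assuming you have exported the alert history as (fired_at, event_timestamp) pairs; the function name, the data layout, and the sample values are all hypothetical and not part of any Azure API.

```python
from collections import defaultdict

def repeated_events(alert_history):
    """Group alert firings by the trace timestamp they reference.

    alert_history: iterable of (fired_at, event_timestamp) pairs (hypothetical export format).
    Returns {event_timestamp: [fired_at, ...]} for events referenced by more than one firing.
    """
    by_event = defaultdict(list)
    for fired_at, event_ts in alert_history:
        by_event[event_ts].append(fired_at)
    # Keep only the events that more than one alert firing points back to
    return {ts: fires for ts, fires in by_event.items() if len(fires) > 1}

# Made-up sample data: three firings reference the same 10:00 trace
history = [
    ("10:05", "10:00"),
    ("10:10", "10:00"),
    ("10:15", "10:00"),
    ("14:05", "14:02"),
]
stale = repeated_events(history)
print(stale)
```

    Any event timestamp that shows up in the result was referenced by several firings, which is the pattern worth reporting to support.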

    Thank you

  5. Bharath Y P 9,730 Reputation points Microsoft External Staff Moderator

    Just checking if the information shared was helpful. If so, please consider accepting the answer and giving it an upvote.

    If you have any further questions or need additional clarification, feel free to reach out. Thanks!



3 answers

  1. Bharath Y P 9,730 Reputation points Microsoft External Staff Moderator

    Hello Alex, it looks like you’re running into the classic “stale-event” scenario with log-query alerts. Azure Monitor log alerts work on a sliding time window: every time the rule runs (evaluation frequency), it looks back over the last N minutes (time window), and if it finds any rows, it fires. That means that if your window is larger than your frequency (for example, a 60 min window evaluated every 5 min), a single trace at 10:00 will still be “in scope” on every run until it ages out of that window. You’ll see alerts every 5 min until ~11:00, when the trace ages out, or until you reset (disable/re-enable) the rule.
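    The window-versus-frequency arithmetic above can be checked with a small simulation. This is a simplified Python model, not how Azure Monitor is implemented: it assumes an evaluation at time t fires when the event falls in the interval (t - window, t], and the function name and dates are made up for illustration.

```python
from datetime import datetime, timedelta

def firing_evaluations(event_time, window, frequency, first_eval, last_eval):
    """Return the evaluation times whose look-back window contains event_time.

    Simplified model: an evaluation at time t fires if t - window < event_time <= t.
    """
    fired = []
    t = first_eval
    while t <= last_eval:
        if t - window < event_time <= t:
            fired.append(t)
        t += frequency
    return fired

event = datetime(2024, 1, 1, 10, 0)  # hypothetical single trace at 10:00
evals = dict(first_eval=datetime(2024, 1, 1, 10, 0),
             last_eval=datetime(2024, 1, 1, 12, 0))

# 60-minute window evaluated every 5 minutes
wide = firing_evaluations(event, timedelta(minutes=60), timedelta(minutes=5), **evals)
# 5-minute window evaluated every 5 minutes
narrow = firing_evaluations(event, timedelta(minutes=5), timedelta(minutes=5), **evals)

print(len(wide), wide[0].time(), wide[-1].time())
print(len(narrow))
```

    Under this model the 60-minute window fires on 12 consecutive evaluations (10:00 through 10:55), while the 5-minute window fires once. Firings that continue beyond that span would not be explained by window overlap alone.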

    Here’s what you can try to break the loop:

    1. Tune your time window and frequency. Make the “Period (time window)” equal to your “Evaluation frequency”, e.g. period = 5 min, frequency = 5 min. That way an event only lives in a single evaluation cycle.
    2. Filter strictly for new events in your KQL. Instead of a broad AppTraces | where ... | count, bound the query with explicit time filters:

       // StartTimeGenerated and EndTimeGenerated are not built in; define them yourself
       let StartTimeGenerated = ago(5m);
       let EndTimeGenerated = now();
       AppTraces
       | where TimeGenerated >= StartTimeGenerated and TimeGenerated < EndTimeGenerated
       | where <your-conditions>
       | count

      Or more compactly:

       AppTraces
       | where TimeGenerated > ago(5m)
       | where <your-conditions>
       | count

      This ensures the query only picks up traces whose TimeGenerated falls inside the most recent 5-minute window.
    3. Mute or suppress repeated actions (optional). If you absolutely need a longer window but want to avoid repeated notifications, use the alert rule’s mute-actions (suppression) setting so that actions are not triggered again for a set period after the alert fires (e.g. 30 min).
    4. Verify “Auto-resolve” is enabled. In the rule’s advanced settings, ensure “auto-resolve” is on so alerts return to “Resolved” when the query returns zero hits.

    Reference docs you’ll find helpful:

    • Troubleshoot log alerts (why they fire or don’t):

    https://docs.microsoft.com/azure/azure-monitor/alerts/alerts-troubleshoot-log#log-alert-didnt-fire

    • Manage log alerts in the portal (window vs. frequency):

    https://docs.microsoft.com/azure/azure-monitor/alerts/alerts-log#managing-log-alerts-from-the-azure-portal

    Hope that helps! If you still see spamming after these tweaks, can you share:

    • Your alert rule’s Time window and Evaluation frequency settings
    • The exact KQL you’re using
    • Whether “auto-resolve” is turned on in the rule’s settings
  2. Sina Salam 30,166 Reputation points Volunteer Moderator

    Hello Alex Maiereanu,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are having an issue with Application Insights alerts on a custom signal.

    Firstly, the AI reply is partially correct. This is a common scenario. The alert is not looping. The same trace record is still being returned by the alert query during repeated evaluations, or the alert is configured as stateless / with a larger effective query window than expected. Update the rule so that it evaluates only the intended recent trace window, remove any query time-range override, and enable automatic resolution so the same condition does not repeatedly trigger actions.

    For classic Application Insights traces, use:

    traces
    | where timestamp >= ago(10m)
    | where message has "YOUR_EXACT_TRACE_TEXT_OR_FILTER"
    | project timestamp, message, severityLevel, operation_Id, customDimensions
    

    For workspace-based Application Insights, use:

    AppTraces
    | where TimeGenerated >= ago(10m)
    | where Message has "YOUR_EXACT_TRACE_TEXT_OR_FILTER"
    | project TimeGenerated, Message, SeverityLevel, OperationId, Properties, _ResourceId
    

    Then configure the alert rule with:

    • Measurement: Table rows
    • Aggregation: Count
    • Operator: Greater than
    • Threshold: 0
    • Window size / aggregation granularity: 10 minutes
    • Frequency of evaluation: 5 minutes
    • Number of violations: 1
    • Evaluation periods: 1
    • Override query time range: empty
    • Automatically resolve alerts: enabled
    • Mute actions: 10 minutes, if duplicate notifications still occur

    Finally, inspect the alert rule JSON and confirm that overrideQueryTimeRange is not set to hours, that autoMitigate (auto-resolution) is enabled, and that there are no duplicate alert rules or split-by-dimensions settings creating multiple alert instances. Log search alerts are evaluated at a configured frequency and can be stateful or stateless; stateful alerts do not trigger actions again until the condition resolves. This should resolve the repeated-alert problem and gives a reliable way to confirm the root cause. References: https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-create-log-alert-rule, https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview
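    The stateful versus stateless distinction can be illustrated with a toy model. This Python sketch is an assumption-laden simplification, not Azure Monitor's actual logic: it treats a stateful alert as resolved after a single evaluation where the condition is not met, and the function name and sample sequence are hypothetical.

```python
def actions_fired(condition_met, stateful):
    """Return the indices of evaluations that trigger an action.

    Simplified model:
    - stateless: every evaluation where the condition is met triggers an action
    - stateful: only an evaluation where the condition changes from not met to met
    """
    fired = []
    previously_met = False
    for i, met in enumerate(condition_met):
        if met and (not stateful or not previously_met):
            fired.append(i)
        previously_met = met
    return fired

# Hypothetical evaluation results: met three times, clears, then met again
results = [True, True, True, False, True]
stateless_actions = actions_fired(results, stateful=False)
stateful_actions = actions_fired(results, stateful=True)
print(stateless_actions)
print(stateful_actions)
```

    In this model the stateless rule acts on evaluations 0, 1, 2 and 4, while the stateful rule acts only on 0 and 4, which is why a rule that is unexpectedly stateless can look like a loop.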

    I hope this is helpful! Do not hesitate to let me know if you have any other questions, steps or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.

    1. Alex Maiereanu 40 Reputation points

      Hi, we already configured the alerts properly. This situation started appearing without us changing the alert rules. The alerts are configured as in your proposal, and the time window is in minutes. The alerts keep firing even after 15 minutes.

    2. Sina Salam 30,166 Reputation points Volunteer Moderator

      There is a mistake somewhere in your configuration that has not yet been disclosed. It cannot be pinpointed unless you share your code, or delete everything and follow the best practice as stated.

      Success


  3. kagiyama yutaka 3,685 Reputation points

    I think Scheduled Query Alerts fire again when the query still returns results inside the configured time window. Safe actions are shortening the window and using “suppress for X minutes” in the rule settings.

    1. Alex Maiereanu 40 Reputation points

      Hi, the alert fires after the time window, and there are no more results, as I mentioned.

