CosmosDB Serverless Change Feed not triggering Python Azure Function

Craig Webb 100 Reputation points

I have a very basic Python Azure Function deployed to a Flex Consumption plan (code below).

When I first deploy this function it generates the lease documents in the lease container.

I then add a new document to my Cosmos Container and I do get to see the log line printed to Application Insights.

After that, no further document I add to the Cosmos container triggers the function.

I have confirmed that the lease continuation token and timestamp move forward when a new document is added, but the Python code does not run and the log line is not printed.

I don't see any exceptions or issues appear in the application insights and I have the log level set to Debug to capture all logging.

import logging
import azure.functions as func

app = func.FunctionApp()

@app.cosmos_db_trigger(
    arg_name="documents",
    database_name="%COSMOS_DATABASE_NAME%",
    container_name="site-assets-history",
    connection="CosmosDBConnection"
)
def process_assets_history(documents: func.DocumentList) -> None:
    """CosmosDB change feed trigger for site-assets-history."""
    logging.info("Trigger invoked: container=%s docs=%d. test", "site-assets-history", len(documents))

4 answers

  1. Vinodh247-1375 43,181 Reputation points • Volunteer Moderator

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    This is a known behaviour pattern with the Cosmos DB change feed and Azure Functions (Python, especially on Flex Consumption): the trigger checkpoints correctly (the lease moves forward) but stops invoking the function after the first batch, because of host scaling or listener lifecycle issues, or because the function does not stay warm. In serverless, low-throughput scenarios the listener can go idle and not re-poll aggressively, even though the continuation token advances. The most common causes are:

    • lease_container_name not set explicitly (leading to a default mismatch)
    • create_lease_container_if_not_exists=True not set
    • low RU/s causing delayed polling
    • Flex Consumption cold-start behaviour, where the change feed processor is not continuously active

    Also, the Python worker with the Cosmos trigger has intermittent reliability issues compared to .NET.

    What typically fixes it? Explicitly configure the lease container and prefix, make sure max_items_per_invocation and feed_poll_delay are tuned, temporarily increase RU/s to validate, and, most importantly, switch to an Elastic Premium plan or a .NET function if this is production-critical. For Python, adding logging.info inside a loop over documents also helps validate batch processing (sometimes empty batches get swallowed silently).
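
The explicit lease and polling settings mentioned above could be collected as keyword arguments for the trigger decorator. This is only a sketch: the helper name `cosmos_trigger_settings`, the `LEASE_PREFIX` app setting, and the numeric values are hypothetical and not taken from this thread, and the parameter names are assumed to match the Python v2 `cosmos_db_trigger` decorator for the v4 Cosmos DB extension.

```python
import os


def cosmos_trigger_settings(env=os.environ):
    """Build keyword arguments for @app.cosmos_db_trigger(...).

    Container and connection names come from the question; the lease
    prefix default and the tuning values are illustrative assumptions.
    """
    return {
        "arg_name": "documents",
        "database_name": "%COSMOS_DATABASE_NAME%",
        "container_name": "site-assets-history",
        "connection": "CosmosDBConnection",
        # Spell out the lease settings instead of relying on defaults.
        "lease_container_name": "leases",
        "lease_container_prefix": env.get("LEASE_PREFIX", "assets-history-"),
        "create_lease_container_if_not_exists": True,
        # Hypothetical tuning values, not recommendations.
        "max_items_per_invocation": 100,
        "feed_poll_delay": 1000,  # milliseconds to wait once the feed is drained
    }


if __name__ == "__main__":
    for key, value in cosmos_trigger_settings().items():
        print(f"{key} = {value!r}")
```

In function_app.py the dictionary would be unpacked into the decorator, for example `@app.cosmos_db_trigger(**cosmos_trigger_settings())`. Changing the prefix makes the trigger start with fresh lease documents, which is one way to rule out stale lease state during testing.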

    Please 'Upvote' (thumbs-up) and 'Accept' as answer if the reply was helpful. This will benefit other community members who face the same issue.

  2. Pilladi Padma Sai Manisha 10,190 Reputation points • Microsoft External Staff • Moderator

    Hi @Craig Webb

    The most interesting detail in your troubleshooting is that the lease continuation token and timestamp continue to advance when new documents are inserted. This indicates that the Cosmos DB Change Feed Processor is successfully reading changes from the container and updating the lease state.

    Because of that, the issue is less likely to be with Cosmos DB itself and more likely related to the Azure Functions runtime, Cosmos DB trigger extension, or the Python worker process.

    A few details would help narrow this down:

    • Azure Functions runtime version
    • Extension bundle version (or Cosmos DB extension version)
    • Contents of your host.json
    • Confirmation that the Cosmos DB account is using the SQL (Core) API and Serverless mode
    • Any host restart, recycle, or scale events visible in Application Insights

    Since you're running on a Flex Consumption plan, I'd also recommend checking Application Insights for worker restarts or host lifecycle events occurring after the first successful trigger execution.

    The continuation token advancing is the key diagnostic clue here: it confirms that Cosmos DB is generating change feed events and that the trigger infrastructure is consuming them. The next step is determining why those events are not consistently reaching the Python function after the initial invocation.

    Could you share the runtime/extension versions and your host.json configuration?

    1. Craig Webb 100 Reputation points

      Hi, thanks for your response.
      Below is the host.json file I am using.

      I also tried 4.0.0 instead of 4.* as I had seen examples of both.

      There were no restarts in the Application Insights that I could see.

      As mentioned in my comments below, the issue seems to be around Azure Function scaling. AI suggested that when the Azure Function scales down to 0 instances, it is the scale controller that handles the Cosmos change feed trigger, and that it is probably not scaling up to 1 instance in time. That doesn't entirely make sense to me, but I followed its advice and set an always-on instance for my function.
      This has resolved the issue.

      So I think the trigger fired once because there was an instance running; after a short period of time it scaled down to 0, and future documents did not cause it to scale back up.

      This doesn't seem to be the expected behavior for the Cosmos Serverless + Azure Flex Consumption change feed. However, I can't get it to work without there always being 1 instance.

      {
        "version": "2.0",
        "logging": {
          "logLevel": {
            "default": "Debug"
          },
          "applicationInsights": {
            "samplingSettings": {
              "isEnabled": false,
              "excludedTypes": "Request"
            }
          }
        },
        "extensionBundle": {
          "id": "Microsoft.Azure.Functions.ExtensionBundle",
          "version": "[4.*, 5.0.0)"
        }
      }
      
    2. Pilladi Padma Sai Manisha 10,190 Reputation points • Microsoft External Staff • Moderator

      Hi Craig,
      Thank you for sharing the details and for reporting back on your findings.

      Your host.json configuration looks reasonable, and the fact that the issue is resolved when maintaining an always-ready instance is an important observation.

      Based on your testing, it does appear that the Cosmos DB trigger is processing changes correctly while a Function host instance is active. The behavior you're seeing suggests that after the Flex Consumption app scales down to zero instances, new change feed events are not consistently causing the app to scale back out and resume processing.

      Under normal circumstances, Cosmos DB triggers are expected to work with event-driven scaling, so requiring a permanently active instance would not be the expected behavior. Since the lease continuation tokens were advancing and no host restarts or runtime errors were observed, this points more toward a scaling/trigger activation issue than a problem with your function code or configuration.

      At this point, I would recommend opening a support case with Azure Functions, providing:

      • The Function App name and region
      • Confirmation that the app is running on Flex Consumption
      • Confirmation that the Cosmos DB account is using Serverless mode
      • The timeframe during which the issue was reproduced
      • The observation that the trigger works consistently when an always-ready instance is configured, but not when the app scales to zero
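
The always-ready workaround described in this thread can be applied from the Azure CLI. The sketch below only builds the command line; it assumes the `az functionapp scale config always-ready set` command and the `function:<name>=<count>` setting form used for per-function scaling on Flex Consumption, so check `--help` before relying on it. The resource group and app names are placeholders, not values from this thread.

```python
import shlex


def always_ready_command(resource_group, app_name, function_name, instances=1):
    """Return the az CLI argv that keeps `instances` always-ready for one function."""
    return [
        "az", "functionapp", "scale", "config", "always-ready", "set",
        "--resource-group", resource_group,
        "--name", app_name,
        # Assumed setting form for a single non-HTTP function.
        "--settings", f"function:{function_name}={instances}",
    ]


if __name__ == "__main__":
    # Placeholder resource names; only the function name comes from the question.
    cmd = always_ready_command("my-rg", "my-flex-app", "process_assets_history")
    print(shlex.join(cmd))
```

Printing the command rather than running it keeps the sketch free of side effects; the resulting list can be passed to `subprocess.run` if desired.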
    3. Pilladi Padma Sai Manisha 10,190 Reputation points • Microsoft External Staff • Moderator

      Hi Craig,
      I hope you had a chance to review the information shared earlier, and I hope this information has been helpful! If you still have questions, please let us know what is needed in the comments so the question can be answered.


  3. Alex Burlachenko 22,120 Reputation points • MVP • Volunteer Moderator

    Hi Craig Webb, and thanks for joining me here on the Q&A portal,

    If the lease tokens move but your function logs only once, the trigger is probably checkpointing changes but not invoking the handler correctly. Configure the lease container explicitly instead of relying on defaults. Cosmos trigger: https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-cosmosdb-v2-trigger

    Add these arguments to the trigger:

    lease_container_name="leases",
    create_lease_container_if_not_exists=True

    Then either delete existing lease documents or use a new lease container name for testing.

    Check the Application Insights requests table, not only traces. If invocations exist but logs do not, it is a logging problem. If leases move but no invocations exist, it is trigger/runtime behavior. https://learn.microsoft.com/en-us/azure/azure-functions/functions-reference-python

    This points to stale lease state or a trigger configuration issue, not the Cosmos change feed itself.

    Regards,

    Alex

    If my answer was helpful, please mark it, and additional thanks if you follow me on the Q&A portal.
    
    1. Craig Webb 100 Reputation points

      I was originally defining those properties, but then cut back to the bare basics following the examples.

      I have found that with the Flex Function App it has started working reliably if I have the function defined with an always-on instance.

      This is not something that I have found documented anywhere as being needed. If this is missing from the documentation, or I have missed it, please share a reference.
      Or if this is a bug and it shouldn't be needed, that is also good to know.


  4. Amira Bedhiafi 42,941 Reputation points • MVP • Volunteer Moderator

    Hello Craig!

    Thank you for posting on MS Learn Q&A.

    The first thing I would check is whether another function app, deployment slot, local process or previous deployment is using the same monitored container with the same lease container.

    For Cosmos DB triggers, the lease container tracks progress. If multiple functions are configured for the same container, each one should use a dedicated lease container or a different lease prefix; otherwise only one of the functions is triggered.

    https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-cosmosdb-v2-trigger

    So I would make the lease container explicit and unique:

    import logging
    import azure.functions as func

    app = func.FunctionApp()

    @app.function_name(name="ProcessAssetsHistory")
    @app.cosmos_db_trigger(
        arg_name="documents",
        database_name="%COSMOS_DATABASE_NAME%",
        container_name="site-assets-history",
        connection="CosmosDBConnection",
        # Dedicated lease container so no other listener shares lease state.
        lease_container_name="site-assets-history-leases",
        # Boolean, not the string "true".
        create_lease_container_if_not_exists=True
    )
    def process_assets_history(documents: func.DocumentList) -> None:
        logging.info(
            "Trigger invoked: container=%s docs=%d",
            "site-assets-history",
            len(documents)
        )
    

    Also make sure the lease container has partition key /id, because partitioned lease containers are required to use /id.

    https://learn.microsoft.com/en-us/azure/cosmos-db/change-feed-functions

    Do not reuse the same leases container for multiple Cosmos DB trigger functions unless each function has a separate lease prefix or a separate lease container. Since your continuation token is advancing, another listener may be consuming the feed and checkpointing it before this function logs anything.
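
The lease-sharing rule above can be checked mechanically across a set of trigger definitions. This is a minimal sketch with a hypothetical helper, `find_lease_collisions`, and hypothetical trigger entries; it assumes the lease container name defaults to `leases` and the prefix to an empty string when they are not set.

```python
from collections import defaultdict


def find_lease_collisions(triggers):
    """Return groups of triggers that watch the same container with the same lease state.

    Each trigger is a dict with a `name` plus the decorator arguments it uses.
    """
    groups = defaultdict(list)
    for trigger in triggers:
        key = (
            trigger["database_name"],
            trigger["container_name"],
            trigger.get("lease_container_name", "leases"),  # assumed default
            trigger.get("lease_container_prefix", ""),       # assumed default
        )
        groups[key].append(trigger["name"])
    # Only keys shared by more than one trigger are a problem.
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical definitions: two listeners on the same container, no prefix.
    triggers = [
        {"name": "process_assets_history", "database_name": "db",
         "container_name": "site-assets-history"},
        {"name": "local_debug_copy", "database_name": "db",
         "container_name": "site-assets-history"},
    ]
    print(find_lease_collisions(triggers))
```

Giving one of the two entries its own `lease_container_prefix` or `lease_container_name` makes the result empty, which is the configuration the answer recommends.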

    To troubleshoot, you can enable the Cosmos DB trigger host logs in host.json.
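
As a sketch of that host.json change, the snippet below adds a log level for the `Host.Triggers.CosmosDB` category to the logging section of a host.json like the one posted earlier in this thread. The helper name is hypothetical, and the `Debug` level is an assumption chosen to match the thread's default; with `"default": "Debug"` already set, this category may be covered without an explicit entry.

```python
import copy
import json


def enable_cosmos_trigger_logs(host_config, level="Debug"):
    """Return a copy of a host.json dict with Cosmos DB trigger logging set."""
    updated = copy.deepcopy(host_config)
    log_levels = updated.setdefault("logging", {}).setdefault("logLevel", {})
    # Log category used by the Cosmos DB trigger listener.
    log_levels["Host.Triggers.CosmosDB"] = level
    return updated


if __name__ == "__main__":
    host = {
        "version": "2.0",
        "logging": {"logLevel": {"default": "Debug"}},
    }
    print(json.dumps(enable_cosmos_trigger_logs(host), indent=2))
```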

    1. Craig Webb 100 Reputation points

      I had confirmed there was only one function app listening to the change feed. This is a test environment, so everything that wasn't required has been removed and only my function was listening.
      I originally had the lease container defined with a prefix, but I removed that to follow the examples.

      I have found that with the Flex Function App it has started working reliably if I have the function defined with an always-on instance. This is not something that I have found documented anywhere as being needed. If this is missing from the documentation, or I have missed it, please share a reference. Or if this is a bug and it shouldn't be needed, that is also good to know.

