Azure Deployment Takes 70 Minutes to Fail with Obscure Error

Mike-E-angelo 631 Reputation points
{
 "code": "ResourceDeploymentFailure",
 "target": "/subscriptions/<subscription>/resourceGroups/<group>/providers/Microsoft.Cache/redisEnterprise/<name>",
 "message": "The resource write operation failed to complete successfully, because it reached terminal provisioning state 'Failed'."
}

Thank you for any assistance to track down what is going on here.

  1. Mike-E-angelo 631 Reputation points

    Hi @Anonymous, thank you for your reply, but that is a lot of information to ask for when I have already provided the correlation ID, which should give you everything you need. Can you please confirm that Redis instances can currently be deployed without error? Thank you for your continued assistance.

  2. Mike-E-angelo 631 Reputation points

    I have tried another deployment, and instead of failing gracefully with a comprehensive error message like you see in every Microsoft product but Azure, it fails with the most obscure and useless message you can think of, as if your team is creatively finding ways to confuse the user. Why are you emitting failure errors that give no indication of why the operation failed? I hope you can understand the concern.



Answer recommended by moderator

Mike-E-angelo 631 Reputation points

So I was able to complete a deployment successfully by switching the region to eastus2. There was ZERO mention of anything region-specific in the error. What you see above is the error that is presented to the user, and it is astounding to me that this is considered acceptable by every person working at Azure.

  1. Mike-E-angelo 631 Reputation points

    Ironically I was using eastus because eastus2 was throwing errors regarding databases at one point. This whole system is truly Jenga, especially when very little information is presented to the user to help them solve the problem they are experiencing.

  2. Anonymous

    Hi @Mike-E-angelo

    The error means the connection didn’t have a valid database context or the Entra identity isn’t fully registered inside that database. You don’t need to add pgaadauth as an extension manually—it’s built into the service. The right approach is to enable Entra authentication at the server and then create the identity as a principal in the target database.

    Make sure you’re connecting with the correct FQDN (not IP), using SSL, and explicitly specifying the database name.

    After that, confirm the identity exists in that database and grant the roles you need. Without that, even if the token looks fine, the database won’t recognize you.

    If you’re on Flexible Server, this is the standard flow. If it’s Cosmos DB for PostgreSQL, the steps differ slightly.

  3. Mike-E-angelo 631 Reputation points

    Hi @Anonymous thank you for your reply.

    The error means the connection didn’t have a valid database context or the Entra identity isn’t fully registered inside that database. You don’t need to add pgaadauth as an extension manually—it’s built into the service. The right approach is to enable Entra authentication at the server and then create the identity as a principal in the target database.

    How can you be so sure about this? Exactly zero of this is relayed in the error message presented to me. Additionally, as I was able to deploy to eastus2 without issue, this also seems to imply this was a different issue altogether than you are suggesting.

  4. Anonymous

    Hi Mike,

    Thank you for the detailed feedback and for your patience here – I completely understand how frustrating it is to wait ~70 minutes only to see a generic “deployment failed” error with no actionable details.

    Based on the behavior you described and the fact that the same Redis Enterprise deployment succeeds in eastus2 but consistently fails in eastus, this strongly suggests a region-specific platform/capacity issue in eastus rather than a problem with your ARM/Bicep template or parameters.

    Ideally, the deployment should fail quickly with a clear reason such as “SKU unavailable or capacity exhausted in this region” instead of a generic ResourceDeploymentFailure after a long timeout. I’ve flagged this feedback to the team so they can review both the error message and the timeout behavior for such cases.
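Until the error reporting improves, the workaround from this thread (retrying in another region) can be scripted. The following is only a minimal sketch under stated assumptions: `deploy` is a hypothetical callable that you would implement with your own tooling, `DeploymentFailed` is a hypothetical exception type, and `fake_deploy` is a stand-in that merely mimics the behavior reported here (eastus fails, eastus2 succeeds). It does not call any Azure API.

```python
from typing import Callable, Dict, Sequence


class DeploymentFailed(Exception):
    """Hypothetical error raised by the caller-supplied deploy function."""


def deploy_with_region_fallback(
    deploy: Callable[[str], None], regions: Sequence[str]
) -> str:
    """Try each region in order and return the first one that deploys.

    `deploy` is an assumption: any function that takes a region name and
    raises DeploymentFailed when the deployment does not succeed.
    """
    failures: Dict[str, str] = {}
    for region in regions:
        try:
            deploy(region)
            return region
        except DeploymentFailed as exc:
            # Remember why this region failed, then move on to the next one.
            failures[region] = str(exc)
    raise DeploymentFailed(f"all regions failed: {failures}")


def fake_deploy(region: str) -> None:
    # Stand-in that mimics this thread: eastus fails, eastus2 succeeds.
    if region == "eastus":
        raise DeploymentFailed("ResourceDeploymentFailure")


chosen = deploy_with_region_fallback(fake_deploy, ["eastus", "eastus2"])
print(chosen)
```

Passing the deploy step in as a function keeps the fallback logic independent of how the deployment is actually performed, so the same loop works whether the real step uses a CLI, an SDK, or a pipeline task.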

  5. Anonymous

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or need clarification.



4 additional answers

  1. Mike-E-angelo 631 Reputation points

    Hi @Anonymous, why are you asking me for a correlation ID when I have already provided it in the chat? You are also asking me for the time of the incident; shouldn't the correlation ID already tell you this? What is the correlation ID correlating if not the time of the incident? Additionally, why are you asking for this information nearly two weeks after the incident? You gave the very distinct impression that this was already being actively investigated. I hope you can understand the concern you are placing upon customers who report active and obvious defects in your software.

  2. Mike-E-angelo 631 Reputation points

    I have tried another deployment, and instead of failing gracefully with a comprehensive error message like you see in every Microsoft product but Azure, it fails with the most obscure and useless message you can think of, as if your team is creatively finding ways to confuse the user. Why are you emitting failure errors that give no indication of why the operation failed? I hope you can understand the concern.

  3. Anonymous

    Hi Mike-E-angelo,

    Welcome to Microsoft Q&A, and thank you for posting your question here.
    Could you please share the details requested in the private message so we can investigate further?

  4. AI answer

    The error message you are encountering indicates that the deployment of the Azure Cache for Redis resource has failed, reaching a terminal provisioning state of 'Failed'. This can happen for a variety of reasons, including issues with resource availability, configuration errors, or permission problems.

    To troubleshoot this issue, you can follow these steps:

    1. Check Deployment Operations: Use the Azure portal to view the deployment operations associated with your deployment. This will provide more detailed error messages that can help identify the specific cause of the failure. You can do this by navigating to the resource group in the Azure portal, selecting Deployments, and then reviewing the error details for the failed deployment.
    2. Review Resource Limits: Ensure that you are not exceeding any quotas or limits for your Azure subscription, particularly those related to the Azure Cache for Redis service.
    3. Validate Configuration: Double-check the configuration settings for your Redis resource. Ensure that all required parameters are correctly specified and that they conform to Azure's naming and configuration guidelines.
    4. Permissions: Make sure that the account or service principal you are using has sufficient permissions to create and manage the resources in the specified resource group.
    5. Retry Deployment: Sometimes, transient issues can cause deployments to fail. You might want to retry the deployment after addressing any potential issues.
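
For step 1, Azure Resource Manager error payloads can carry nested `details` entries that are more specific than the top-level message. The following is a minimal sketch for flattening such a payload once you have it as JSON. The outer error in `sample` is the one posted in this thread; the nested `details` entry is invented purely for illustration (the original post shows no nested details), and the helper name `flatten_arm_error` is hypothetical.

```python
import json
from typing import Iterator, Tuple


def flatten_arm_error(error: dict, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, code, message) for an error and all of its nested details."""
    yield depth, error.get("code", ""), error.get("message", "")
    # "details" may be absent or null, so fall back to an empty list.
    for detail in error.get("details") or []:
        yield from flatten_arm_error(detail, depth + 1)


# Outer error copied from the thread; the "details" entry is hypothetical.
sample = json.loads("""
{
  "code": "ResourceDeploymentFailure",
  "target": "/subscriptions/<subscription>/resourceGroups/<group>/providers/Microsoft.Cache/redisEnterprise/<name>",
  "message": "The resource write operation failed to complete successfully, because it reached terminal provisioning state 'Failed'.",
  "details": [
    {
      "code": "HypotheticalInnerError",
      "message": "Illustrative nested detail; not from the original post."
    }
  ]
}
""")

rows = list(flatten_arm_error(sample))
for depth, code, message in rows:
    # Indent nested entries so the innermost cause is easy to spot.
    print("  " * depth + f"{code}: {message}")
```

If the flattened output contains only the top-level `ResourceDeploymentFailure` entry, as in the original post, the payload itself carries no further cause and the correlation ID is the remaining lead for support.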

    If the problem persists, consider reaching out to Azure support with the correlation ID and tracking ID you provided, as they can help investigate further.


