Comments (25)
@mb290 @schoren this is also how the Azure CLI behaves for the command az ad sp create-for-rbac: it pauses execution for 5 seconds and retries role assignment creation up to 36 times while waiting for server replication.
References:
https://github.com/Azure/azure-cli/blob/master/src/command_modules/azure-cli-role/azure/cli/command_modules/role/custom.py#L959
https://github.com/Azure/azure-cli/blob/master/src/azure-cli-core/azure/cli/core/commands/arm.py#L995
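The pause-and-retry behavior described above can be sketched in plain POSIX shell. This is only an illustration of the pattern, not the Azure CLI's actual code; the function name retry_with_delay and its argument order are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Azure CLI or Terraform).
# Usage: retry_with_delay MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
# Returns 0 on success, 1 if every attempt failed.
retry_with_delay() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only pause when another attempt is still to come.
    if [ "$attempt" -lt "$max_attempts" ]; then
      echo "Attempt $attempt of $max_attempts failed; retrying in ${delay}s..." >&2
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

For instance, `retry_with_delay 36 5 some_command` would mirror the 5-second pause and 36 attempts mentioned above, where some_command is a placeholder for the role assignment call.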
To be clear, the Terraform configuration below works most of the time because it waits 30s for server replication using a hack (but sometimes replication takes longer than 30s, and then it fails with the same error you describe above):
provider "azurerm" {
  version = "~> 1.10.0"
}

data "azurerm_subscription" "current" {}

resource "random_string" "password" {
  length = 32
}

resource "random_id" "name" {
  byte_length = 16
}

variable "role" {
  default = "Contributor"
}

variable "end_date" {
  default = "2020-01-01T01:02:03Z"
}

resource "azurerm_azuread_application" "service_principal" {
  name = "${random_id.name.hex}"
}

resource "azurerm_azuread_service_principal" "service_principal" {
  application_id = "${azurerm_azuread_application.service_principal.application_id}"
}

resource "azurerm_azuread_service_principal_password" "service_principal" {
  service_principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  value = "${random_string.password.result}"
  end_date = "${var.end_date}"

  # wait 30s for server replication before attempting role assignment creation
  provisioner "local-exec" {
    command = "sleep 30"
  }
}

resource "azurerm_role_assignment" "service_principal" {
  scope = "${data.azurerm_subscription.current.id}"
  role_definition_name = "${var.role}"
  principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  depends_on = ["azurerm_azuread_service_principal_password.service_principal"]
}

output "display_name" {
  description = "The Display Name of the Azure Active Directory Application associated with this Service Principal."
  value = "${azurerm_azuread_service_principal.service_principal.display_name}"
}

output "application_id" {
  description = "The Application ID."
  value = "${azurerm_azuread_application.service_principal.application_id}"
}

output "object_id" {
  description = "The Object ID for the Service Principal."
  value = "${azurerm_azuread_service_principal.service_principal.id}"
}

output "password" {
  description = "The Password for this Service Principal."
  value = "${azurerm_azuread_service_principal_password.service_principal.value}"
}
This Terraform configuration, by contrast, does not wait for server replication using the above hack, and always fails:
provider "azurerm" {
  version = "~> 1.10.0"
}

data "azurerm_subscription" "current" {}

resource "random_string" "password" {
  length = 32
}

resource "random_id" "name" {
  byte_length = 16
}

variable "role" {
  default = "Contributor"
}

variable "end_date" {
  default = "2020-01-01T01:02:03Z"
}

resource "azurerm_azuread_application" "service_principal" {
  name = "${random_id.name.hex}"
}

resource "azurerm_azuread_service_principal" "service_principal" {
  application_id = "${azurerm_azuread_application.service_principal.application_id}"
}

resource "azurerm_azuread_service_principal_password" "service_principal" {
  service_principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  value = "${random_string.password.result}"
  end_date = "${var.end_date}"
}

resource "azurerm_role_assignment" "service_principal" {
  scope = "${data.azurerm_subscription.current.id}"
  role_definition_name = "${var.role}"
  principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  depends_on = ["azurerm_azuread_service_principal_password.service_principal"]
}

output "display_name" {
  description = "The Display Name of the Azure Active Directory Application associated with this Service Principal."
  value = "${azurerm_azuread_service_principal.service_principal.display_name}"
}

output "application_id" {
  description = "The Application ID."
  value = "${azurerm_azuread_application.service_principal.application_id}"
}

output "object_id" {
  description = "The Object ID for the Service Principal."
  value = "${azurerm_azuread_service_principal.service_principal.id}"
}

output "password" {
  description = "The Password for this Service Principal."
  value = "${azurerm_azuread_service_principal_password.service_principal.value}"
}
with the error:
Error: Error applying plan:
1 error(s) occurred:
* azurerm_role_assignment.service_principal: 1 error(s) occurred:
* azurerm_role_assignment.service_principal: authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="PrincipalNotFound" Message="Principal 12eab7225e744ca7b617876179b68b95 does not exist in the directory ssssssss-ssss-ssss-ssss-ssssssssssss."
Does anyone have a suggestion for a workaround in Terraform? I don't yet understand how a fix for this would be implemented in any of these resources.
I really don't want to use this very ugly hack:
...
resource "azurerm_azuread_service_principal" "service_principal" {
  application_id = "${azurerm_azuread_application.service_principal.application_id}"
}

resource "azurerm_azuread_service_principal_password" "service_principal" {
  service_principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  value = "${random_string.password.result}"
  end_date = "${var.end_date}"

  # wait 30s for server replication before attempting role assignment creation
  provisioner "local-exec" {
    command = "sleep 30"
  }
}

resource "azurerm_role_assignment" "service_principal" {
  scope = "${data.azurerm_subscription.current.id}"
  role_definition_name = "${var.role}"
  principal_id = "${azurerm_azuread_service_principal.service_principal.id}"
  depends_on = ["azurerm_azuread_service_principal_password.service_principal"]
}
...
Many thanks,
from terraform-provider-azuread.
I've barely tested this, so it's probably flawed, but it worked the first time I tried it:
resource "azuread_service_principal_password" "main" {
  service_principal_id = "${azuread_service_principal.main.id}"
  value = "${var.password}"
  end_date = "${var.end_date}"

  provisioner "local-exec" {
    command = <<EOF
until az ad sp show --id ${azuread_service_principal.main.application_id}
do
  echo "Waiting for service principal..."
  sleep 3
done
EOF
  }
}
At least it's an idea, and someone can probably identify the flaws and improve on it.
from terraform-provider-azuread.
We really want to avoid using the local-exec provisioner and the sleep command as a workaround, since we would have to pause execution for approximately 180 seconds to be really sure that server replication is done (sometimes server replication takes a long time). Also, we run Terraform on multiple OSes/build agents where sleep is not always available. So it would be a really ugly hack. Using az ad sp create-for-rbac would currently be a better alternative for us than using the Terraform resources.
Any suggestions on how to implement a fix for this in Terraform are highly appreciated.
Update 1: Yes, I really have no idea how to approach fixing this in Terraform other than retrying multiple times on failure like the az CLI does, as the error returned from the API is very generic. Maybe @tombuildsstuff could help with what direction to take here.
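One way to narrow a retry so that it does not mask unrelated failures is to retry only when the output mentions the PrincipalNotFound code seen earlier in this thread. The sketch below assumes the failing command prints that code in its error text, which I have not verified for every az CLI version; the function name and the return codes are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: retry COMMAND only while its output contains
# "PrincipalNotFound" (assumed to indicate replication lag).
# Usage: retry_on_principal_not_found MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on success, 1 on any other error, 2 when attempts are exhausted.
retry_on_principal_not_found() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Capture stdout and stderr together so the error text can be inspected.
    if output=$("$@" 2>&1); then
      printf '%s\n' "$output"
      return 0
    fi
    case $output in
      *PrincipalNotFound*)
        # Probably not replicated yet: wait, then try again.
        if [ "$attempt" -lt "$max_attempts" ]; then
          sleep "$delay"
        fi
        attempt=$((attempt + 1))
        ;;
      *)
        # Any other failure is unlikely to resolve itself; stop immediately.
        printf '%s\n' "$output" >&2
        return 1
        ;;
    esac
  done
  printf '%s\n' "$output" >&2
  return 2
}
```

The distinct return code for exhausted attempts lets a caller tell "still not replicated" apart from a genuine configuration error.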
Update 2:
FYI, there is another issue, #841, that seems to have the same kind of problem, where retrying was implemented in the resource; see https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/azurerm/resource_arm_storage_container.go#L111
Update 3:
#1644 is a bad example of a workaround; I would just like to start this discussion. Any advice is appreciated.
Thanks again,
from terraform-provider-azuread.
I am also encountering
Original Error: autorest/azure: Service returned an error. Status=400 Code="PrincipalNotFound" Message="Principal 6b3xxxxxxxxxxxx58755xxxx does not exist in the directory xxxxx-xxxx-xxxx-xxxx-xxxxxxxx."
In my scenario the service principal is pre-existing, so it cannot be a timing issue. I am attempting to give an AKS SP permission to act as "Managed Identity Operator" over a User Managed Identity.
When using the respective AZ CLI command as the same user running Terraform, I have no issues.
az role assignment create --role "Managed Identity Operator" --assignee [SP ID] --scope "/subscriptions/[SUBSCRIPTIONID]/resourcegroups/sandbox/providers/Microsoft.ManagedIdentity/userAssignedIdentities/sandbox-mid"
In this example it looks like (as @liamfoneill above) the issue may lie with the azurerm_role_assignment resource.
Resolved for now by running the az cli command via a local-exec. It works for now, but would much prefer to use the native resource.
from terraform-provider-azuread.
I had the same issue today. In my case, I fixed it by using the azurerm_azuread_application id instead of the azurerm_azuread_service_principal id. Something like this:
resource "azurerm_azuread_application" "test" {
  name = "exampleTFapplication"
  available_to_other_tenants = false
  oauth2_allow_implicit_flow = false
}

resource "azurerm_azuread_service_principal" "test" {
  application_id = "${azurerm_azuread_application.test.application_id}"
}

resource "azurerm_azuread_service_principal_password" "test" {
  service_principal_id = "${azurerm_azuread_service_principal.test.id}"
  value = "BVcKK237/&&)hyz@%nsadasdsa(*&^CC#Nd3"
  end_date = "2020-01-01T01:02:03Z"
}

resource "azurerm_resource_group" "test" {
  name = "testResourceGroup1"
  location = "West US"
}

resource "azurerm_role_assignment" "test" {
  scope = "${azurerm_resource_group.test.id}"
  role_definition_name = "Reader"
  principal_id = "${azurerm_azuread_application.test.application_id}"
}
It's weird behavior, but I got that from the az ad sp create-for-rbac command. When comparing to the Azure Portal, the actual ID used was the application ID.
Hope it helps!
from terraform-provider-azuread.
@schoren thanks for replying. I just tested this, and when I tried the update I got the response:
Error: Error applying plan:
1 error(s) occurred:
* azurerm_role_assignment.test: 1 error(s) occurred:
* azurerm_role_assignment.test: authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="PrincipalNotFound" Message="Principal 408b56eeXXXXXXXXXXX does not exist in the directory #######-#####-######-#########."
I then confirmed the outputs, and they are different values:
Outputs:
azurerm_azuread_application_id = 408b56eeXXXXXXXXXXX
azurerm_azuread_service_principal_id = b711cba7XXXXXXXXXXX
I looked at the azurerm_role_assignment documentation, and it does specifically call out that the principal ID is required.
Am I missing something obvious?
from terraform-provider-azuread.
Yes, yesterday I had a similar issue. I'm checking now to see if it is still happening. In another env, I had successfully deployed and assigned roles to service principals using that method.
from terraform-provider-azuread.
Ok, now it's working with the original solution, using the azurerm_azuread_service_principal id. Not sure why it worked differently before, but it's working as expected now. Is it working for you?
from terraform-provider-azuread.
@joakimhellum-in Thanks for that clarification. It is an ugly workaround, but maybe that's the best we can get. I don't have a very deep understanding of terraform and this provider's inner workings, so I cannot tell if there's a cleaner solution.
For the time being, I think I'll implement what you suggested.
from terraform-provider-azuread.
@tombuildsstuff and/or anyone - would you clarify something for me?
It appears (to me at least) that the solution to the various StatusCode=404, ErrorCode=ResourceNotFound issues in the AzureRM provider is to code a fix/retry into the particular resource component. I've noticed multiple such issues here.
Does this mean that you couldn't do something similar to the max_retries option in the AWS provider?
Thanks for any insight!
from terraform-provider-azuread.
I can confirm I have the same behaviour. This is related to the time to replicate the SP through the Azure AD servers.
My scenario is:
- Create the azurerm_azuread_application,
- Create the azurerm_azuread_service_principal
- Create the azurerm_azuread_service_principal_password
- Create a Keyvault
- Assign a policy to that SP in KeyVault
- Connect to Azure RM provider using that SP to create a secret key
I get the error: AADSTS70001: Application with identifier 'app guid here' was not found in the directory
When I run terraform apply again 1 minute later, everything goes through.
from terraform-provider-azuread.
Have the same issue.
from terraform-provider-azuread.
I have tried with 30s, 60s, 180s and 200s and I am still getting the same issue...
Using the az CLI directly is what worked for me, as @joakimhellum-in mentioned previously:
resource "azurerm_azuread_service_principal_password" "app_spn_password" {
  service_principal_id = "${azurerm_azuread_service_principal.app_spn_id.id}"
  value = "${random_string.password.result}"
  end_date = "${var.spn_end_date}" #2020-01-01T01:02:03Z

  provisioner "local-exec" {
    command = "az role assignment create --role ${var.spn_role_definition_name} --assignee-object-id ${azurerm_azuread_service_principal.app_spn_id.id} --scope ${var.spn_scope}"
  }
}
from terraform-provider-azuread.
Did anybody think to query the AD servers by PowerShell to see if the SPN has been replicated through and then carry on?
I am not sure if you can do this on Azure AD though...
from terraform-provider-azuread.
I'm getting ServicePrincipalNotFound errors for azurerm_kubernetes_cluster resources as well, and a subsequent apply works. @tombuildsstuff, should I open a different issue than this one?
from terraform-provider-azuread.
@clstokes that sounds like the same underlying issue as this, so we can track that here. Thanks!
from terraform-provider-azuread.
I'm getting the same issue but I'm not using depends_on. I created the cluster first then added the configuration to create the role assignment. No matter how many times I try to apply it fails.
from terraform-provider-azuread.
Hi @mb290,
Since we are deprecating all Azure AD resources and data sources in the Azure RM provider in 2.0 in favour of this new provider, I have moved the issue here.
from terraform-provider-azuread.
I can confirm that this issue still exists with the new AzureAD provider.
from terraform-provider-azuread.
I also cannot do role assignments with Terraform for Service Principals. It works fine for AAD groups, but I get the Status=400 Code="PrincipalNotFound" too. The service principal was created days ago, so I don't think it is the race condition that others seem to be experiencing. If this is being tracked in another issue, @tombuildsstuff, can you please post the link here? I cannot find it.
from terraform-provider-azuread.
If you happen to be running on Windows (where until is not available), here's another potential workaround:
Drop wait-for-service-principal.ps1 in your working directory and use a local-exec provisioner (similar to the previous option).
wait-for-service-principal.ps1
param(
    [string]$ApplicationId
)
$elapsed = 0;
$delay = 3;
$limit = 5 * 60;
$checkMsg = "Checking for service principal with Application ID $ApplicationId"
Write-Host $checkMsg
$cmd = "az ad sp show --id $ApplicationId";
Invoke-Expression $cmd
# Poll until az succeeds or the time limit (in seconds) is exceeded.
while($lastExitCode -ne 0 -and $elapsed -le $limit) {
    # Use string interpolation here: $elapsed + "s" would fail because
    # PowerShell tries to convert "s" to an integer.
    $elapsedSeconds = "${elapsed}s";
    Write-Host "Service principal is not yet available. Retrying in $delay seconds... ($elapsedSeconds elapsed)"
    Start-Sleep -Seconds $delay;
    $elapsed += $delay;
    Write-Host $checkMsg
    Invoke-Expression $cmd;
}
if($lastExitCode -eq 0) {
    Write-Host "Service principal is ready."
    exit 0
}
Write-Host "Service principal did not become ready within the allotted time."
exit 1
resource "azuread_service_principal_password" "ad_principal_pw" {
  service_principal_id = "${azuread_service_principal.ad_principal.id}"
  value = "${var.password}"
  end_date = "${var.end_date}"

  provisioner "local-exec" {
    command = ".\\wait-for-service-principal.ps1 -ApplicationId \"${azuread_application.ad_app.application_id}\""
    interpreter = ["PowerShell"]
  }
}
from terraform-provider-azuread.
I am having the same issue. Is there a permanent solution on the roadmap? I see this issue was removed from the 0.3.0 milestone.
The workaround using local-exec to wait for "az ad sp show --id ${azuread_service_principal.main.application_id}" does not work either. The command returns OK and displays the service principal, but the principal is not yet ready to be consumed by AKS. I guess this is a timing/eventual consistency issue between several Azure APIs.
Sleep 30 was the only way forward for me.
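If a single successful lookup is not a reliable signal, one possible mitigation is to require several successful checks in a row before continuing. This is an untested idea rather than a known fix: it cannot guarantee that a different Azure API (such as the one AKS uses) has caught up, and the function name and parameters below are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only after COMMAND has passed
# REQUIRED_SUCCESSES times in a row; any failure resets the streak.
# Usage: wait_until_stable REQUIRED_SUCCESSES MAX_CHECKS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 once the streak is reached, 1 if MAX_CHECKS is used up first.
wait_until_stable() {
  required=$1
  max_checks=$2
  delay=$3
  shift 3
  streak=0
  checks=0
  while [ "$checks" -lt "$max_checks" ]; do
    checks=$((checks + 1))
    if "$@" >/dev/null 2>&1; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$required" ]; then
        return 0
      fi
    else
      # A failed check means at least one lookup still misses the object.
      streak=0
    fi
    sleep "$delay"
  done
  return 1
}
```

Combining this with a short fixed sleep afterwards may still be necessary, given that the fixed sleep was the only thing reported to work here.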
from terraform-provider-azuread.
Hi!
This also affects AKS clusters, as the SP (or its password) is not ready.
from terraform-provider-azuread.
Maybe something like this could replace a resource timeout block.
Also, there is no need to query the API when destroying that resource. (I am not familiar with what local-exec does at destroy time.) It's just another guess.
resource "null_resource" "wait" {
  provisioner "local-exec" {
    command = <<EOF
COUNTER=$RETRIES
until [ $COUNTER -eq 0 ] || az ad sp show --id ${azuread_application.application.application_id} -o none
do
  echo "Waiting for service principal..."
  # POSIX arithmetic; "let" is a bashism and may be missing from /bin/sh
  COUNTER=$((COUNTER - 1))
  sleep $TIMEOUT
done
EOF
    environment = {
      TIMEOUT = "5"
      RETRIES = "20"
    }
  }

  provisioner "local-exec" {
    when = "destroy"
    command = "echo 'Wait hook'"
  }
}
from terraform-provider-azuread.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
from terraform-provider-azuread.