Giter Club home page Giter Club logo

Comments (15)

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024 1

I see.
Thanks for the clarification. The enterprise version should indeed not return a 500 status code. I will look into this!

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

the reason that the result can still be found in the database is a built in retry-strategy.
If either an OptimisticLockException or StaleObjectStateException is thrown during processing the RetryStrategy will open a new transaction and try to repeat the operation from start. This is done up to 3 times. If an error occurred in the previous calls it is still logged.
So in your case the second try of the operation will probably have succeeded.

Does this answer the problem?

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Thanks for the answer but how is entra handling this with a 500 response? And how does this effect the entra state (watermark)?

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

Hmm... I am not able to reproduce this behaviour with a 500 response.
I have some integrationtests that provoke this behaviour. A concurrent modification of the same resource in 100 requests and with an assertion that each request must return response-code 200. In cases in which I get a 500 response beause the retry-counter has reached its limit and the operation still fails, the operation is not saved in my case...

Could you give me some more details or maybe provide your mock-implementation? This might help to pinpoint your problem.

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Sure! I have also tested it with KC18 and received a 500, but as you predicted, the user is not added to the group in KC18.

Code:
public void runKeycloakScimTest() {
    log.info("Run Keycloak-Entra SCIM Test");

    int threadAmount = 10;
    int userAmount = 1000;
    int usersPerThread = userAmount / threadAmount;

    Group group = new Group();
    group.setDisplayName("Test Group");
    try (ExecutorService executorService = Executors.newFixedThreadPool(threadAmount)) {
        for (int i = 0; i < threadAmount; i++) {
            executorService.execute(() -> {
                for (int j = 0; j < usersPerThread; j++) {
                    generateUsersAndAddToGroup(usersPerThread, Thread.currentThread()
                                                                     .getId(), group);
                }
            });
        }
    }
}

private void generateUsersAndAddToGroup(int amount, long threadId, Group group) {
    for (int i = 0; i < amount; i++) {
        User user = generateUser(UUID.randomUUID()
                                     .toString()
                                     .substring(0, 4) + "-" + threadId);
        user = createUser(user);
        createAndPatchGroup(group, user);
    }
}

private User createUser(User user) {
    log.info("Query Users for: {}", user.getDisplayName()
                                        .get());
    Optional<User> optScimObjectNode = get(String.format("?filter=userName+eq+%%22%s%%22", user.getUserName()
                                                                                               .get()), User.class);
    if (getTotalResultsFromGetCall(optScimObjectNode.get()) == 0) {
        user = post(user, User.class).get();
        log.info("Created User: {}", user.getDisplayName()
                                         .get());
    } else {
        // !PATCH not included if not matching!
        String userId = getIdFromGetCall(optScimObjectNode.get());
        user = get(String.format("/%s", userId), User.class).get();
        log.info("Retrieved User: {}", user.getDisplayName()
                                           .get());
    }
    return user;
}

private void createAndPatchGroup(Group group, User user) {
    log.info("Query Groups for: {}", group.getDisplayName()
                                          .get());
    Optional<Group> optScimObjectNode = get(
            String.format("?excludedAttributes=members&filter=displayName+eq+%%22%s%%22", group.getDisplayName()
                                                                                               .get()),
            Group.class);
    if (getTotalResultsFromGetCall(optScimObjectNode.get()) == 0) {
        group = post(group, Group.class).get();
        log.info("Created Group: {}", group.getDisplayName()
                                           .get());
    } else {
        // !PATCH not included if group name not matching!
        String groupId = getIdFromGetCall(optScimObjectNode.get());
        group = get(String.format("/%s", groupId), Group.class).get();
        log.info("Retrieved Group: {}", group.getDisplayName()
                                             .get());
    }
    addUserToGroup(group, user);
}

private void addUserToGroup(Group group, User user) {
    log.info("Trying to add User {} to Group {}", user.getDisplayName()
                                                      .get(), group.getDisplayName()
                                                                   .get());
    PatchRequestOperation patchRequestOperation = new PatchRequestOperation("members", PatchOp.ADD, null,
                                                                            createUserToGroupArrayNode(user));
    PatchOpRequest patchOpRequest = new PatchOpRequest(Collections.singletonList(patchRequestOperation));
    try {
        patch(group.getId()
                   .get(), patchOpRequest, Group.class);
    } catch (HttpServerErrorException e) {
        if (e.getStatusCode().is5xxServerError()) {
            log.info("Received 500 for User {}, continuing", user.getDisplayName().get());
        } else {
            throw e;
        }
    }
    log.info("Added User {} to Group {}", user.getDisplayName()
                                              .get(), group.getDisplayName()
                                                           .get());
}

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

I checked the first comment again and I am a little bit confused. How are you testing with kc-18? The plugin from https://scim-for-keycloak.de does not have support for Keycloak version before 20.

The kc-18 plugin is still based on the OpenSource version from github that you can download here and this version does not have handling for such errors.
So, I am not quite sure where this is heading now. From my point of view the error is handled to a certain degree in the scim-for-keycloak plugin at https://scim-for-keycloak.de. But the error cannot be handled absolutely reliably because I have limited the numbers of retries. It is just less error-prone because it retries it. The ideal way is telling the client not to send a lot of update-requests on a single resource. SCIM is defining the Bulk-Endpoint and allows several operations on patch to handle this.

But how does the kc-24.2.0.0 version into this? Do you have the exactly same problem with this version as with the OpenSource version? I cannot reproduce this with version kc-24-2.0.0

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Sorry for the confusion. I have two separate KC instances running for testing, KC18 with the open source plugin and KC24 with the kc-24-2.0.0-enterprise plugin. I get the same 500 responses from both KC instances when testing the code above, but in KC18, the user is not added to the group. In KC24, on the other hand, the user is added to the group, but it still results in a 500 for me which might confuse leading IDPs.

The ideal way is telling the client not to send a lot of update-requests on a single resource. SCIM is defining the Bulk-Endpoint and allows several operations on patch to handle this.

I totally agree, but unfortunately, we might not be able to pressurize Microsoft into working with the SCIM standard.

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

I extended my integration-test but I am not able to reproduce it. This is my code:

(This test has 1000 users created in the database that are all getting added to a single group in parallel patch-requests) version kc-23-2.0.0

dynamicNodes.add(DynamicTest.dynamicTest("Provoke OptimisticLockException on Groups-endpoint", () -> {
      try
      {
        new ForkJoinPool(20).submit(() -> {
          IntStream.range(0, users.size()).parallel().mapToObj(index -> {

            final int userBulkId = 2 + index;
            final User randomUser = users.get(String.valueOf(userBulkId));
            Assertions.assertNotNull(randomUser, String.valueOf(userBulkId));


            final String patchRequest;
            patchRequest = """
               {
                "schemas": [
                  "urn:ietf:params:scim:api:messages:2.0:PatchOp"
                ],
                "Operations": [
                  {
                    "op": "add",
                    "path": "members",
                    "value": {
                      "value": "%s",
                      "type": "%s"
                    }
                  }
                ]
              }
               """.formatted(randomUser.getId().get(), ResourceTypeNames.USER);
            groupMembers.add(Member.builder()
                                   .value(randomUser.getId().get())
                                   .display(randomUser.getUserName().get())
                                   .type(ResourceTypeNames.USER)
                                   .build());
            return Triple.of(index, patchRequest, randomUser.getId().get());
          }).forEach(triple -> {
            final String patchRequest = triple.getMiddle();
            final String groupId = groups.get(String.valueOf(groupBulkId)).getId().get();

            ServerResponse<Group> response = scimRequestBuilder.patch(Group.class, EndpointPaths.GROUPS, groupId)
                                                               .setPatchResource(patchRequest)
                                                               .sendRequest();
            if (HttpStatus.NO_CONTENT == response.getHttpStatus() || HttpStatus.OK == response.getHttpStatus())
            {
              indexes.add(triple.getLeft());
            }
            else
            {
              ServerResponse<Group> groupResponse = scimRequestBuilder.get(Group.class, EndpointPaths.GROUPS, groupId)
                                                                      .sendRequest();
              Assertions.assertEquals(HttpStatus.OK, groupResponse.getHttpStatus());
              Group group = groupResponse.getResource();
              final String userMemberId = triple.getRight();
              boolean wasNotAdded = group.getMembers()
                                         .stream()
                                         .noneMatch(member -> member.getValue().get().equals(userMemberId));
              Assertions.assertTrue(wasNotAdded);
              membersNotAddedDueToError.add(userMemberId);
            }
          });
        }).get();
      }
      catch (Exception ex)
      {
        throw ex;
      }
    }));

    /*
     * this test will verify that the members that were added to the group in the previous were only added if no error was returned in the previous test
     */
    dynamicNodes.add(DynamicTest.dynamicTest("Verify members successfully added to group", () -> {
      {
        ServerResponse<ListResponse<User>> response = scimRequestBuilder.list(User.class, EndpointPaths.USERS)
                                                                        .filter("not(groups pr)")
                                                                        .get()
                                                                        .sendRequest();
        Assertions.assertEquals(HttpStatus.OK, response.getHttpStatus());
        log.warn("\n{}",
                 response.getResource()
                         .getListedResources()
                         .stream()
                         .map(user -> user.getUserName().get())
                         .collect(Collectors.joining("\n")));
      }
      {
        Collections.sort(indexes);
        ServerResponse<Group> response = scimRequestBuilder.get(Group.class,
                                                                EndpointPaths.GROUPS,
                                                                groups.get(String.valueOf(groupBulkId)).getId().get())
                                                           .sendRequest();
        Assertions.assertEquals(HttpStatus.OK, response.getHttpStatus());
        Group retrievedGroup = response.getResource();
        Assertions.assertEquals(users.size() - membersNotAddedDueToError.size(), retrievedGroup.getMembers().size());
      }
    }));

I think this code is a pretty good example and it provokes a lot of OptimisticLockExceptions in the process. Approximately 1% of the requests end up in status code 500 and the test does then verify that the member was not added to the group which is what we expect.
I am willing to fix the issue but I can't fix it if I cannot reproduce it...

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Hey, I think I have found out the cause.
If E-Tag support is enabled in the SCIM settings, the behavior is exactly as you predicted.
If it is disabled, wrongfully 500 responses appear.

I created 1000 users parallel in KC24 and added them to the same group.

But I also checked the occurrences:

  • with E-Tag enabled:

I received 131–500 responses on the PATCH call without any of the users being added to the group.

  • without E-Tag enabled:

I received 177 - 500 responses to the PATCH call, with 169 users still added and 8 not added to the group.

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

My tests actually never had ETags activated to begin with but this would make the whole thing even more weird. ETags are calculated on the JSON response-body just after the response-body was generated. The ETag is a SHA-1 digest on the minified JSON-string. I have no idea how this should affect the whole thing.

But just to make sure, I tried both variations, with ETag and without and I get the same result in both cases... I still cannot reproduce this...

Which database are you using? Maybe we can find the problem there. I am testing with postgres

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

I totally get your concerns, and I am confused too.
This problem showed up in my Docker setup locally (using KC23 with an h2 DB) and in my KC24 test setup with a Postgres DB.
Could there be a problem with calculating the SHA-1 for the concurrent group resource when ETag is enabled?

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Hey, I have some interesting news about the behavior of Entra for you.
We have collaborated with a customer to run a few tests for Entra Keycloak24 SCIM provisioning, including 5232 users and 8 groups (the number of replaced users and groups does not match these numbers because of old linked replaced ids).

We started two initial and one incremental cycle due to a misconfiguration of a custom attribute mapping.
No users or groups have been deleted in between any cycles.

In the first initial cycle, we tracked a few 500 responses when PATCHING group memberships.
The good news is that Entra retried those requests and received a 200 response.
This can be observed with Group: replacedGroupId11, first initial cycle: overall 993 calls, 35 x 500 responses, 958 x 200 responses, there should be 959 users in the group.
The missing PATCH call depends on Entra. We will look into why that particular call is missing (it is likely because they messed up the state).

In the second initial cycle, each PATCH call received a 200-word response. This is because no user or group has been deleted in Keycloak, but the state is still reset.
Entra does not check if a user is already assigned to a group and just makes a PATCH request.
On the other hand, before provisioning groups and users, they make a GET call and check if the object exists and already matches in the source and target systems. 

Here you can find our access_log: censored_access.log

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

Okay, I have a little bit of trouble following here, because I am not sure anymore if the error still seems to be on my side or if the error was found on Entra side:

  • I understand that from 1000 PATCH requests 35 did fail with a status code of 500. What I did not understand if the users were assigned despite the 500 status code to the group as reported before?

from scim-for-keycloak.

fgrtl avatar fgrtl commented on August 17, 2024

Hey, with the help of our customer we could analyze and now explain all deviations.

--> Everything worked 100% on the Plugin-Side. Microsoft Entra has some specific not very intuitive behaviour.


Details / Facts:

1.) When there are a lot of Users and User-Group Membership, they spawn multiple Threads which PATCH the same group in a very tight interval.
2.) This leads to ~4% of Group Patches overwhelming even the 3 Re-Tries Cycle that was described above and return a 500 HTTP to Entra (but correctly NOT persisting the association in KC)
2b.) Thankfully, Entra also has a Retry Mechanism that fixes all those missing associations within the same provisioning cycle.

3.) What concerned us, was the small delta between "Group Memberships recorded by us in Keycloak", and "Group Memberships according to the Entra Admin UI", the latter showing 9 Users more.

Thankfully that delta is explainable now.

5 Users were inactive in Microsoft Entra when the initial provisioning cycle started. They were never provisioned.
--> But did still count in the Entra UI.

Other Users were synced to Keycloak, but deleted on Entra in the meantime.
--> Due to their "soft-deletion" implementation, these users were just disabled via provisioning in Keycloak (without group associations).
Even tough on Entra, they were already fully deleted and invisible to the Enterprise Application UI there.


TLDR: In Microsoft Entra, the number of users that are shown as members of a group which is assigned to an Entra Enterprise Application, can differ from the number of users and group memberships that are actually provisioned to a third party.

from scim-for-keycloak.

Captain-P-Goldfish avatar Captain-P-Goldfish commented on August 17, 2024

Good to know. Thanks for sharing this .
And I am glad that my plugin is not doing something terribly strange :-)

from scim-for-keycloak.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.