How SCIM provisioning works

As the number of tools your company uses grows, you’ll naturally have to share or transfer user data between tools, and this can lead to many challenges. New employees must have all their accounts set up with the same personal information and that information must be kept in sync as things change.

SCIM (System for Cross-domain Identity Management) is a standard that can help you simplify employee or user management. By providing a common set of rules for how systems exchange user identity information, SCIM makes it easier to integrate internal and external systems by automating the process of adding, updating, and removing users. This ensures that user information is consistent across the board while improving security and reducing the risk of errors.

In this article, you’ll learn what SCIM is, how it works, and how it relates to single sign-on (SSO) and standards like SAML. You’ll also see the data types and resources that make up the SCIM schema, along with some use cases that show how SCIM can improve user management and security across your software ecosystem.

What is SCIM and Why Use It?

SCIM is an open, HTTP-based protocol that describes methods for provisioning, managing, and deprovisioning users across multiple systems.

This comes in handy at large organizations when a new employee joins and needs to be added to email, payroll, project management, benefits, and a dozen other services all at once. If you’re just onboarding 1-2 employees per month, you can probably get away with doing this manually, but if you’re onboarding hundreds of employees every day, you need a standard and automated way to facilitate this.

Similarly, if you’re managing a web application and new users who sign up also need to be added to billing and customer support systems, manually copying this data is sure to lead to errors.

Setup is just the first challenge, though, as things get even more complicated when data changes. What happens when an employee gets married and changes their last name? Will HR, payroll, finance, and operations teams all have to go into their systems and manually update their records?

Initially defined in 2011, SCIM was designed as a solution to this challenge. It defines standard REST API endpoints that any SCIM-compliant client can use to create, read, update, and delete data about users. The SCIM core schema defines a set of common data types, attributes, and resource types (such as users and groups) to allow for maximum interoperability between SCIM services. In other words, SCIM makes it significantly easier for multiple systems (internal and external) to keep user information up-to-date.

SCIM 2.0 replaced the earlier 1.x versions in September 2015 and has since been adopted by most of the major identity providers. In addition to helping you keep user data synced across systems, SCIM also improves security. Because you can remove a user’s access to all applications at once, you can deactivate and block malicious user accounts quickly so bad actors can’t cause further damage.

SCIM is also useful when migrating identities. For example, if your company changes benefits providers from one company to another, both being on the SCIM standard will allow you to seamlessly transfer user information between the two systems and ensure no data is lost in the process.

How SCIM Works

Now that you understand why SCIM is useful and what it does, let’s dig into the specifics of how it works and the terminology it uses.

SCIM Clients and Service Providers

SCIM relies on a central source of truth for user identities called the “client.” Typically, this role is filled by an identity provider like Okta or FusionAuth, which handles user authentication and stores identifying information for that user.

The SCIM protocol can then be implemented by many “service providers,” typically software services that rely on the user’s identity. When a change is made to the user’s information (e.g., they update their email address or phone number), the SCIM client pushes that change to each service provider using the SCIM protocol.

I’ll show you what the actual API calls look like later in this piece, but first, let’s clarify a bit more about how SCIM works with identity providers.

SAML, SCIM, and SSO

If you’re familiar with identity management protocols and standards, you probably know a bit about SAML and SSO, but let’s clarify how these two relate to SCIM.

SSO (single sign-on) is a blanket term that refers to a user’s ability to sign into a single system and get access to multiple systems. For example, you are likely familiar with using a social login, such as a Google account, as an authentication method for other websites. This is a form of SSO.

SAML (Security Assertion Markup Language) is just one of the possible standards for implementing SSO. It defines the way in which a user should be verified and how their identity can be passed between systems. SAML isn’t the only standard used for this, though; OpenID Connect, which builds on OAuth 2.0, is also a popular option.

Finally, as you’ve already read, SCIM defines the way in which a single user can be kept up-to-date across multiple services. So, SCIM typically works with an SSO provider to help keep a user’s identity information correct.

In a typical workflow, a user would log in through an identity provider (a SCIM client). This identity provider might implement a standard like SAML to offer single sign-on to multiple services. When a user makes a change to a key part of their identity in the identity provider, it will call the endpoints defined by the SCIM protocol to update that user in all service providers. This keeps the user’s information up-to-date as they do things like update their name, email, phone number, or contact information.
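To make that workflow a little more concrete, here’s a minimal in-memory sketch in Python. The `ScimClient` and `ServiceProvider` classes and their methods are hypothetical names made up for illustration; a real identity provider would send HTTP requests to each provider’s SCIM endpoints rather than call methods directly.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProvider:
    """Hypothetical stand-in for a downstream app that exposes a SCIM API."""
    name: str
    users: dict = field(default_factory=dict)  # keyed by user id

    def apply_update(self, user_id: str, changes: dict) -> None:
        # A real provider would receive an HTTP PATCH or PUT here instead.
        self.users.setdefault(user_id, {}).update(changes)


@dataclass
class ScimClient:
    """Hypothetical identity provider acting as the source of truth."""
    providers: list

    def push_update(self, user_id: str, changes: dict) -> list:
        # Send the same change to every registered service provider.
        for provider in self.providers:
            provider.apply_update(user_id, changes)
        return [provider.name for provider in self.providers]


payroll = ServiceProvider("payroll")
email = ServiceProvider("email")
client = ScimClient([payroll, email])

# One change in the identity provider reaches every downstream service.
notified = client.push_update("u-1", {"userName": "hpotter"})
```

The point of the sketch is the shape of the flow: one change at the source, one call per service provider, and no manual copying in between.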

Understanding the SCIM Schema

As mentioned, the SCIM protocol defines a REST API that allows data about users or groups to be passed between various services. When a user makes an update to their identity information, all services in the system that implement the SCIM protocol will be notified of the change using this REST API.

Below are some example URLs for users under the SCIM protocol:

  • Create new user: <code class="blog_inline-code">POST https://example.com/v2/Users</code>
  • Retrieve a single user: <code class="blog_inline-code">GET https://example.com/v2/Users/{id}</code>
  • Delete a user by ID: <code class="blog_inline-code">DELETE https://example.com/v2/Users/{id}</code>
  • Update a user by ID: <code class="blog_inline-code">PATCH https://example.com/v2/Users/{id}</code>
  • Search for a user: <code class="blog_inline-code">GET https://example.com/v2/Users?filter={attribute}{op}{value}&sortBy={attributeName}&sortOrder={ascending|descending}</code>

As you can see, SCIM allows for versioning in the base URL and provides standard REST endpoints for creating, reading, updating, deleting, and searching resources. The protocol also defines an optional endpoint for bulk operations, which isn’t shown here. The groups resource has a similar set of endpoints and allows you to add users to groups so you can implement department-level rules about users and how they relate to one another.
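As a rough sketch of how a client might build the search URL, here’s a small Python helper. The `build_user_search_url` function and the `BASE_URL` value are assumptions made for illustration, and the path follows the plural `/Users` form under a `/v2` base used in the SCIM 2.0 specification’s examples. SCIM filter expressions separate the attribute, operator, and value with spaces, so the filter has to be URL-encoded.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://example.com/v2"  # assumed base URL for this sketch


def build_user_search_url(attribute, op, value, sort_by=None, sort_order="ascending"):
    """Build a SCIM user search URL (hypothetical helper, not a library API)."""
    # String values are double-quoted inside the filter expression.
    # This sketch does not escape quotes that appear inside the value itself.
    params = {"filter": f'{attribute} {op} "{value}"'}
    if sort_by:
        params["sortBy"] = sort_by
        params["sortOrder"] = sort_order
    # urlencode percent-encodes the quotes and encodes the spaces.
    return f"{BASE_URL}/Users?{urlencode(params)}"


url = build_user_search_url("userName", "eq", "HarryPotter", sort_by="userName")

# Decoding the query string recovers the original filter expression.
decoded = parse_qs(urlsplit(url).query)
```

Here `decoded["filter"]` comes back as a one-item list holding `userName eq "HarryPotter"`, which is the filter the service provider would evaluate.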

Responses are represented using JSON and have a set of common data types to ensure maximum interoperability between systems. It’s important to know how these data types work if you’re going to implement SCIM across your ecosystem because improperly handling types or converting them incorrectly can lead to big problems.

SCIM’s standard data types include:

  • <code class="blog_inline-code">string</code> - a sequence of characters that represents text data such as names, addresses, and email addresses.
  • <code class="blog_inline-code">boolean</code> - a value that is either true or false. It represents binary choices, such as whether a user is active or inactive.
  • <code class="blog_inline-code">integer</code> - a whole number that can be positive, negative, or zero. It represents numerical data such as counts or limits. Note that resource IDs in SCIM are strings, not integers.
  • <code class="blog_inline-code">decimal</code> - a real number written with a decimal point. This is good for values that need a fractional part, such as monetary values and measurements.
  • <code class="blog_inline-code">dateTime</code> - represents a specific point in time. It’s used for timestamps, such as a user’s creation or modification time.
  • <code class="blog_inline-code">binary</code> - base64-encoded binary data, such as a certificate.
  • <code class="blog_inline-code">reference</code> - a URI that references another resource. This allows you to capture relationships between users and groups.
  • <code class="blog_inline-code">complex</code> - contains multiple sub-attributes and is used to represent more complex data, such as a user’s address. For example, the <code class="blog_inline-code">name</code> attribute in the <code class="blog_inline-code">User</code> resource is a “complex” type that contains <code class="blog_inline-code">givenName</code> and <code class="blog_inline-code">familyName</code>.
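Here’s a small Python sketch of the kind of type handling that tends to go wrong. The `SCIM_TO_PYTHON` mapping and both helper functions are assumptions made for illustration, not part of any SCIM library, and the mapping only covers types that arrive as plain JSON values.

```python
from datetime import datetime, timedelta

# Assumed mapping from SCIM type names to the Python types that a JSON
# parser produces. dateTime and reference values arrive as strings.
SCIM_TO_PYTHON = {
    "string": str,
    "boolean": bool,
    "integer": int,
    "decimal": (int, float),
    "dateTime": str,
    "reference": str,
    "complex": dict,
}


def matches_scim_type(value, scim_type):
    """Return True if a parsed JSON value is acceptable for the SCIM type."""
    # bool is a subclass of int in Python, so reject it for numeric types.
    if scim_type in ("integer", "decimal") and isinstance(value, bool):
        return False
    return isinstance(value, SCIM_TO_PYTHON[scim_type])


def parse_scim_datetime(value):
    """Parse a timestamp such as "2011-05-13T04:42:34Z" into a datetime."""
    # Older Python versions do not accept a trailing "Z", so swap it
    # for an explicit UTC offset before parsing.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


created = parse_scim_datetime("2011-05-13T04:42:34Z")
```

The boolean check is the sort of detail that’s easy to miss: without it, `True` would pass as an integer and could end up stored as `1` in a downstream system.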

Users and groups each have different attributes, but every resource carries the same four top-level attributes:

  • <code class="blog_inline-code">schemas</code> - lists the URIs of the schemas that define the attributes present in the resource.
  • <code class="blog_inline-code">id</code> - a unique identifier for the resource.
  • <code class="blog_inline-code">externalId</code> - stores an identifier that's external to the SCIM service. This can be used to link users across multiple systems.
  • <code class="blog_inline-code">meta</code> - includes information about the resource, such as the resource's creation and last modified time.

The core resource type in the SCIM schema is the <code class="blog_inline-code">User</code>, which represents each account in the system. The <code class="blog_inline-code">User</code> resource type includes attributes that define the user, such as <code class="blog_inline-code">userName</code>, <code class="blog_inline-code">name</code>, and <code class="blog_inline-code">emails</code>. Here’s an example of a user resource in JSON format:

{
    "schemas": [
        "urn:ietf:params:scim:schemas:core:2.0:User"
    ],
    "id": "2819c223-7f76-453a-919d-413861904646",
    "userName": "HarryPotter",
    "name": {
        "givenName": "Harry",
        "familyName": "Potter"
    },
    "emails": [
        {
            "value": "harry.potter@hogwarts.edu",
            "type": "work",
            "primary": true
        }
    ],
    "active": true
}

This is a pretty simple record, but note that the <code class="blog_inline-code">emails</code> attribute is an array of objects, each of which holds an email address, the type of that address, and whether it’s the user’s primary email address.
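To show how a record like this might be used, here’s a short Python sketch that reads the primary email and builds the body of a PATCH request for a last-name change. The helper names are hypothetical and the new name is made up; the `PatchOp` schema URN and the `Operations` layout follow the SCIM 2.0 protocol specification.

```python
import json

# The example user from above, trimmed to the fields this sketch touches.
user = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "id": "2819c223-7f76-453a-919d-413861904646",
    "name": {"givenName": "Harry", "familyName": "Potter"},
    "emails": [
        {"value": "harry.potter@hogwarts.edu", "type": "work", "primary": True}
    ],
}


def primary_email(resource):
    """Return the address flagged as primary, or None if there isn't one."""
    for entry in resource.get("emails", []):
        if entry.get("primary"):
            return entry["value"]
    return None


def build_replace_patch(path, value):
    """Build a SCIM PatchOp body that replaces a single attribute."""
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "path": path, "value": value}],
    }


# Body a client could send with PATCH to the user's URL after a name change.
patch_body = json.dumps(build_replace_patch("name.familyName", "Granger"))
```

Sending only the changed attribute, rather than the whole record, keeps the request small and avoids overwriting fields that another system may have updated in the meantime.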

Each SCIM service provider can also implement custom attributes, but it should offer discovery endpoints (such as <code class="blog_inline-code">/Schemas</code> and <code class="blog_inline-code">/ResourceTypes</code>) that return detailed information about the specific schemas and resource types it exposes. This means that even if one service provider implements things slightly differently than others, it will still offer a “source of truth” for the correct resource schemas.
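In SCIM 2.0, attributes that fall outside the core schema live under their own schema URN inside the resource. The Python sketch below separates core attributes from extension attributes and lists the discovery URLs defined by the specification. The function names and sample values are assumptions made for illustration; the enterprise user extension URN is the one defined alongside the core schema.

```python
DISCOVERY_PATHS = ("/ServiceProviderConfig", "/ResourceTypes", "/Schemas")
ENTERPRISE_URN = "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User"


def split_extensions(resource):
    """Separate core attributes from attributes namespaced by a schema URN."""
    core, extensions = {}, {}
    for key, value in resource.items():
        # Extension attributes are grouped under a key that is a schema URN.
        target = extensions if key.startswith("urn:") else core
        target[key] = value
    return core, extensions


def discovery_urls(base_url):
    """List the URLs a client can query to learn what a provider supports."""
    return [base_url.rstrip("/") + path for path in DISCOVERY_PATHS]


resource = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User", ENTERPRISE_URN],
    "userName": "HarryPotter",
    ENTERPRISE_URN: {"employeeNumber": "701984", "department": "Quidditch"},
}

core, extensions = split_extensions(resource)
urls = discovery_urls("https://example.com/v2/")
```

A client that doesn’t recognize an extension can ignore that part of the resource and still work with the core attributes.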

Conclusion

In this article, you’ve learned about the SCIM standard, why it’s useful, and the basics of how to implement and use it. SCIM is one of the most common protocols for migrating identities, integrating with external identity providers, and improving interoperability throughout your software, so it’s important to understand its strengths and limitations.

As a next step, you can check out Merge, a service that gives you a Unified API to integrate with hundreds of human resources, recruiting, ticketing, CRM, and accounting platforms. Merge can be used to integrate with SCIM-compliant services, such as identity providers and HR systems, making it even easier to put your service onto the SCIM standard.