
Packaged vs Headless CDP: Choosing the Best Fit

Interest in and adoption of packaged Customer Data Platforms (CDPs) have increased significantly over the last few years. However, the landscape is evolving, leading many in our industry to question the effectiveness, future potential, and cost of these platforms.

This blog series will discuss the essential points to consider when choosing between a headless CDP and a traditional, packaged CDP.

What is a Packaged Customer Data Platform (CDP)?

A packaged customer data platform (CDP) is a software solution designed to help businesses collect, unify, analyze, and activate customer data from various sources across their organization.

Unlike traditional customer relationship management (CRM) software, which focuses on managing customer interactions and sales processes, a CDP specifically provides a unified view of the customer that various departments, including marketing, sales, customer service, and product development, can leverage within an organization.

A packaged CDP is a pre-built software solution designed to be deployed and configured quickly and easily, often with minimal IT involvement. These solutions typically come with pre-built integrations with popular marketing and advertising platforms and pre-configured data models and analytics dashboards to help businesses get up and running quickly.

Packaged CDP vendors often present their product as the complete solution, but this overlooks that it takes an ecosystem to solve all customer data use cases across various tools and downstream teams.

Public cloud data (warehouse) platforms offer this ecosystem. Overlap with packaged CDPs is growing as it has become much easier to build, or more accurately assemble, the functionality existing CDPs offer. 

What is a Headless Customer Data Platform (CDP)?

A headless Customer Data Platform (CDP) is an increasingly popular approach in which CDP functionality is implemented without a packaged CDP tool or suite:

  • A cloud data warehouse, such as GCP BigQuery, Snowflake, or AWS Redshift, acts as the data foundation, where different data sources are ingested, transformed and joined.
  • Insights and actions are created at all levels, not only at the customer level. Think of product-level and campaign-level data.
  • Data activation in channels and platforms is implemented directly from the cloud data warehouse environment, where the central view lives.
  • A combination of cloud-native (data) services and specialized tools or frameworks is used.
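
To make the "warehouse as data foundation" idea more concrete, here is a minimal sketch in Python of defining an audience with plain SQL on joined data. It assumes a hypothetical `customers` and `orders` schema and uses the standard-library `sqlite3` module purely as a local stand-in for a cloud data warehouse. The table names, columns, threshold and sample rows are illustrative assumptions, not part of any real implementation.

```python
import sqlite3

# Hypothetical schema and sample rows; sqlite3 only stands in for the
# cloud data warehouse so that the sketch runs locally.
def build_warehouse() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE orders (customer_id INTEGER, amount REAL, order_date TEXT);
        INSERT INTO customers VALUES
            (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
        INSERT INTO orders VALUES
            (1, 120.0, '2024-05-01'),
            (1, 80.0,  '2024-05-20'),
            (2, 40.0,  '2024-05-15'),
            (3, 500.0, '2023-01-10');
        """
    )
    return conn

# The audience definition is just a query on joined source data:
# customers whose spend since a given date reaches a minimum amount.
AUDIENCE_SQL = """
    SELECT c.customer_id, c.email, SUM(o.amount) AS spend
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.order_date >= :since
    GROUP BY c.customer_id, c.email
    HAVING SUM(o.amount) >= :min_spend
    ORDER BY c.customer_id
"""

def high_value_audience(conn: sqlite3.Connection, since: str, min_spend: float) -> list:
    rows = conn.execute(AUDIENCE_SQL, {"since": since, "min_spend": min_spend})
    return [{"customer_id": r[0], "email": r[1], "spend": r[2]} for r in rows]

if __name__ == "__main__":
    warehouse = build_warehouse()
    for member in high_value_audience(warehouse, since="2024-05-01", min_spend=100.0):
        print(member)
```

In a real setup the same kind of query would run inside the warehouse itself, and its result would be the audience that is later activated in channels.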

In our article 'The real CDP revolution is in public clouds', we shared our observations and experiences of this movement to the public cloud among our clients.

The Headless Approach: From Content to Customer Data

Headless architecture gained popularity in content management systems (CMS) and is widely used there today.

By decoupling the back end (content management) from the front end (website), the CMS acts as a central content hub. Through an API, content and actions can be shared with various applications, such as websites, apps or even dynamically generated emails. The front end is no longer intertwined with the content management system. The result: all content is centralized in one system, and developers can create front-end applications independently from the CMS.

In the same way, the headless CDP acts as a (marketing) data, decisions, and activation hub where insights and actions can be created on various levels. These (think of audiences, events, or triggers) are then activated in, or integrated with, channels, tools, or platforms. This is done directly from the cloud data warehouse without needing a stand-alone CDP solution.

However, there is no centralized user interface. The solution consists of a combination of services within the cloud environment (building blocks) and other (specialized) tools that fulfil a specific need (e.g. connectivity or decision-making).

Not limited to customer data

We are comparing the headless approach to existing packaged CDP solutions mainly built around customer data.

However, a headless data management system is not limited to customer data only. It supports interoperability with the enterprise data platform.

Some examples:

  • Identity management and an identity graph to link customer and organizational identifiers.
  • Building a single (360°) customer view and (customer) audiences.
  • Gaining insights into product performance and enriching product data/feeds.
  • Creating models, such as product recommenders, engagement and conversion scoring, CLTV, churn prediction or sales forecasting models. These models can be used in batch and in real time, directly in the platform, so the data doesn't have to leave it.
  • Creating business rules or "triggers" (decisioning) for follow-up/Next-Best Actions.
  • Sales and engagement reports based on multiple data sources. Creating notifications and alerts based on deviations within these reports.
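
As an illustration of the first example, an identity graph that links identifiers, here is a minimal sketch in Python. It uses a union-find (disjoint-set) structure, which is one common way to group identifiers that were observed together; it is not necessarily how any particular platform implements identity resolution. The identifier types (`cookie`, `email`, `crm_id`) and values are hypothetical.

```python
from collections import defaultdict

class IdentityGraph:
    """Groups identifiers that have been observed together (union-find)."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Register unseen identifiers as their own group.
        self._parent.setdefault(node, node)
        root = node
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[node] != root:
            self._parent[node], node = root, self._parent[node]
        return root

    def link(self, a, b):
        """Record that identifiers a and b belong to the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_customer(self, a, b):
        # Unknown identifiers are never matched to anything.
        if a not in self._parent or b not in self._parent:
            return False
        return self._find(a) == self._find(b)

    def profiles(self):
        """Return one set of identifiers per resolved customer."""
        groups = defaultdict(set)
        for node in list(self._parent):
            groups[self._find(node)].add(node)
        return list(groups.values())

if __name__ == "__main__":
    graph = IdentityGraph()
    # Hypothetical observations, e.g. a login event ties a cookie to an email.
    graph.link(("cookie", "ck-1"), ("email", "a@example.com"))
    graph.link(("email", "a@example.com"), ("crm_id", "C-100"))
    graph.link(("cookie", "ck-2"), ("email", "b@example.com"))
    print(graph.same_customer(("cookie", "ck-1"), ("crm_id", "C-100")))
    print(len(graph.profiles()))
```

Each resulting group can serve as the key for a single customer view, on which audiences and models are then built.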

The headless approach overlaps heavily with the idea behind the modern data stack, also called the composable CDP. That is mainly because you can use products from cloud vendors and ISV SaaS solutions for every platform function.

A data warehouse 'with benefits'

Many organizations have recently invested in a data warehouse from one of the well-known cloud providers, such as AWS, Azure, or GCP. You can retrieve data from various sources and join it together.

Traditionally, data warehouses were populated using periodic (daily/hourly) batches, but modern cloud data warehouses have native support for real-time data ingestion. Live order, delivery and clickstream data from websites and apps are therefore increasingly common in cloud data warehouse implementations.

Working with such a cloud data warehouse and all the different services within these cloud platforms has become much more accessible. Services within the cloud platforms can be stacked on top of each other (building blocks), making it easier to import, transform and analyze large amounts of data. The skills needed to work with such a data environment, such as SQL, Java or Python, are also increasingly present within organizations and their suppliers.

Reverse ETL, which means activating this data in channels and tools directly from the data warehouse, increases the overlap with existing CDPs and makes the data warehouse a real alternative to them.

The headless CDP vs. packaged CDP: functionality comparison

In contrast to headless CDPs, the packaged CDPs usually specialize in different areas. Some platforms prioritize activation, personalization, and AI, while others focus on generating insights, identity resolution, and a comprehensive 360° view of the customer.

Despite their differences, nearly all of these systems require collecting and ingesting source data. Typically, this involves implementing additional on-site trackers using JavaScript and uploading first-party data to combine online data with order or CRM data.

When discussing a packaged CDP versus a headless CDP in an environment with an existing cloud data warehouse, consider the following:

  • Integration of Data Sources: The data sources currently feeding into your data warehouse need to be integrated with the CDP.
  • Existing Systems: Systems like Google Analytics or Snowplow already handle real-time behavioral data capture and collection.
  • Potential Issues with Packaged CDPs:
    • Performance Impact: Adding another tracker to your website or app could slow down performance.
    • Measurement Discrepancies: The new tracker's measurement methodology might differ from your existing systems.
  • Duplication of Business Rules: You might end up duplicating business rules already established in your warehouse, such as:
    • Identity resolution
    • Creating a 360-degree customer view
    • Audience selections

If the above applies to your organization, there will be overlap. Most packaged CDP and data management tools need to work together in such a way that they can draw on everything that is already collected and calculated in your data warehouse.

Headless CDP: The Challenges

Packaged CDPs also have their advantages.

First of all, if you don't have a data warehouse (yet) and resources are limited, a packaged CDP suite could fit your needs and get you started quickly, given its out-of-the-box functionality. The cost heavily depends on the type of system, the data volume and the integrations.

When zooming in, the headless CDP has the following disadvantages:

  1. A headless system lacks a central user interface, which makes it harder to adjust settings or to gain insight into all processes. Gaining that insight is certainly possible, but you have to create it yourself or gather it from multiple systems.

    For example, technical/SQL knowledge might be required to set up audiences or journeys. One option is to put a low-code application on top, such as Google's AppSheet, or to use Looker Blocks and Actions.

  2. (Real-time) decision engine/flow builder: making decisions based on real-time data flows (e.g. web or app data). Creating such a real-time decision engine (e.g. for online personalization use cases) within the cloud is possible, but because it is code-based it is more complex than clicking rules together in a user interface.

    Likewise, creating flows/journeys is easily done within most packaged CDP solutions. In a headless setup, we often use Google's Cloud Composer for batch orchestration and a Cloud Function with Pub/Sub for real-time flows.

  3. Connectivity to channels and tools: in a packaged CDP, the platform maintains these connections. In a headless CDP, you have to build some of the integrations yourself (mostly through API connections), although several (open-source) frameworks, such as Google Tentacles, and "reverse ETL" tools, such as Looker, Hightouch, Flywheel Software, and Census, are currently available.

  4. The time-to-market is longer because the platform has to be designed and built.
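
To give a feel for what "code-based" decisioning in point 2 can look like, here is a minimal sketch in Python of a rule-based decision step applied to a single incoming event. It assumes, for illustration only, that events arrive as a dict with a base64-encoded JSON `data` field, similar in spirit to how messaging services such as Pub/Sub deliver payloads. The rule names, event fields, thresholds and action names are all hypothetical.

```python
import base64
import json
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: str

# Hypothetical business rules; the first matching rule wins.
RULES = [
    Rule(
        name="cart_abandonment",
        condition=lambda e: e.get("type") == "cart_abandoned" and e.get("cart_value", 0) >= 50,
        action="send_reminder_email",
    ),
    Rule(
        name="churn_risk",
        condition=lambda e: e.get("churn_score", 0.0) >= 0.8,
        action="offer_discount",
    ),
]

def decide(event: dict) -> Optional[str]:
    """Return the next-best action for an event, or None if no rule applies."""
    for rule in RULES:
        if rule.condition(event):
            return rule.action
    return None

def handle_message(message: dict) -> Optional[str]:
    """Decode an assumed message envelope and run the decision rules."""
    payload = base64.b64decode(message["data"]).decode("utf-8")
    return decide(json.loads(payload))

def encode_event(event: dict) -> dict:
    """Helper for local testing: wrap an event in the assumed envelope."""
    data = base64.b64encode(json.dumps(event).encode("utf-8")).decode("ascii")
    return {"data": data}

if __name__ == "__main__":
    message = encode_event({"type": "cart_abandoned", "cart_value": 75})
    print(handle_message(message))
```

Keeping the rules in a plain list like this is one way to make them easy to review and test, which partly compensates for the missing user interface.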

If your organization is not complex and has basic use cases, a packaged CDP might be a suitable choice. However, keep in mind that while a packaged CDP may meet your current needs, it could face limitations as your organization's requirements evolve and grow.

Headless CDP: The Advantages 

More Flexibility

The headless CDP approach provides the flexibility necessary to quickly innovate and have an agile roadmap by using a modular, loosely coupled architecture and persisting data in a real-time cloud data warehouse.

Reduced Vendor Lock-In

Unlike with pre-built CDP options, data is not "locked" inside a vendor's tool in the headless CDP approach. The "single-source-of-truth" is available for all applications, reducing the risk of vendor lock-in.

Real-Time Loop-Back

The headless CDP approach also allows for an easy and real-time loop-back to source systems that require consolidated data from the CDP.

Cost Efficiency

The use of cloud-native components with a pay-as-you-go pricing model makes the headless CDP approach cost-effective, with a low total cost of ownership.

Reverse ETL: Bridging the Gap Between Data and Action

There are already a lot of tools that ingest data into your warehouse. Getting data out of the warehouse and into your marketing channels or other platforms, however, means developing and maintaining connectors. More recently, a new segment has entered the market that tries to solve this problem: reverse ETL. A reverse ETL tool (e.g. Flywheel Software, Census, Hightouch, Looker Actions, Segment) enables you to focus on building use cases instead of worrying about connector development and maintenance. In addition, these tools provide insight into the process and the data that flows out of your cloud platform.
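
The core of what a reverse ETL sync does can be sketched in a few lines of Python: compare the audience computed in the warehouse with what the destination currently holds, and send only the differences in batches. This is a simplified illustration under stated assumptions, not the behavior of any specific tool named above. The `send` callback stands in for a hypothetical channel API, and the member identifiers are made up.

```python
from typing import Callable, Dict, Iterator, List, Set

def plan_sync(warehouse_members: Set[str], destination_members: Set[str]) -> Dict[str, List[str]]:
    """Work out which members to add to or remove from the destination."""
    return {
        "add": sorted(warehouse_members - destination_members),
        "remove": sorted(destination_members - warehouse_members),
    }

def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def sync_audience(
    warehouse_members: Set[str],
    destination_members: Set[str],
    send: Callable[[str, List[str]], None],
    batch_size: int = 2,
) -> int:
    """Send only the differences to the destination; return the number of calls."""
    plan = plan_sync(warehouse_members, destination_members)
    calls = 0
    for operation in ("add", "remove"):
        for chunk in batches(plan[operation], batch_size):
            send(operation, chunk)  # hypothetical channel API call
            calls += 1
    return calls

if __name__ == "__main__":
    log = []
    sync_audience(
        warehouse_members={"a@example.com", "b@example.com", "c@example.com"},
        destination_members={"b@example.com", "d@example.com"},
        send=lambda operation, chunk: log.append((operation, chunk)),
    )
    for entry in log:
        print(entry)
```

Sending only the differences keeps API usage low, and logging each call gives the kind of insight into outgoing data that the tools above provide out of the box.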

An example setup:

In addition to using Google Cloud-native tools, third-party ISV SaaS point solutions can be used for ingestion, transformation, analysis, or governance. If you do so, you get a hybrid composed CDP.

When to choose a headless CDP approach?

When your organization already has a cloud data warehouse or is planning to set one up, the headless approach is interesting if you:

  1. Don't want to purchase and implement an (expensive) CDP tool, for example because you want to share audiences, triggers or other data directly from the data warehouse with your (marketing) channels or (external) platforms.
  2. Want to realize advanced, more complex use cases based on different sources and combinations, which are impossible or hard to achieve in packaged/stand-alone CDP tools (by clicking things together) or which would require moving data in and out of the CDP platform.
  3. Don't want to duplicate data and recreate business rules in different environments (single source of truth).
  4. Want to make optimal use of existing investments in a cloud data warehouse. Even if your data lake resides on a non-Google cloud, we consider Google Cloud the most suitable option for hosting your CDP.
  5. Want to unlock the potential of, and activate, non-customer data, such as product or campaign data.
  6. Want to reduce vendor lock-in of systems and applications.
  7. See the cost of a packaged CDP grow 10x while your business grows 2x. The cost of a headless approach only increases with the additional processing and storage needed to handle the growth. So far, we have seen headless being more cost-effective.
  8. Want flexibility of choice: various cloud services and dedicated/specialized tools (such as reverse ETL) make it relatively easy to connect data and tools.
  9. Want the flexibility to onboard and offboard point solutions in the future, which you will, because nothing stays the same forever.

Why Integrating GCP Can Enhance Your Existing Non-Google Data Lake Infrastructure

What if your business already has a data lake on AWS, Azure, or another cloud environment? Understandably, you want to safeguard your investments in these clouds and don't want to replicate them.

In such cases, we don't build another data lake. We only bring the data needed for the marketing and sales use cases to the CDP on Google Cloud and link the results back to the data lake and other systems. This fits the multi-cloud trend, in which companies pick the best cloud for each use case.

GCP offers seamless integration with various data sources, enabling you to unify and access your data quickly from a single location. This can save you valuable time and resources that would have otherwise been spent on managing disparate data sources. So, while safeguarding your investments in existing clouds is important, it's worth considering the benefits of integrating GCP to unlock the full potential of your data.

Google Cloud is very well positioned for a headless CDP for several reasons:

  • the native connectors with advertising platforms 
  • the AI capabilities that are nicely tuned for marketing and advertising 

In addition, we use a serverless platform, which means that there are no servers for you to maintain.

Real-time headless CDP applied by Crystalloids

The data flows and integrations that we have developed, leveraging the cost-effective and real-time capabilities of the Google Cloud Platform, are pre-defined building blocks that can be tailored to your specific requirements.

Instead of packaged software, where you depend on the existing functionalities and roadmap of the supplier, we build a solution that can be adapted to your present and future demands. Our API-driven headless CDP data platform therefore offers more flexibility to define your own rules for aspects such as user stitching or building audience segments.

Conclusion

While traditional CDP solutions have served businesses well in managing customer data, headless CDPs represent a significant advancement. Headless CDPs offer increased flexibility, scalability, and agility by separating the front-end user interface from back-end data management. This allows businesses to adapt quickly and efficiently to evolving needs and market conditions.

At Crystalloids, we have been designing and implementing public cloud data ecosystems for over eight years, including what we refer to as a headless Customer Data Platform or Unified Marketing Technology Stack. As the concept of "headless" becomes more prevalent in the industry, it is clear that this approach offers valuable benefits for modern data management strategies.