
Combining Event Data and Object Data to Derive Insights and Drive Action

By Arpit Choudhury • Issue #5
👋 Hey there, welcome to the Data-led newsletter!
In the previous issue, I covered tools and technologies to ingest object data – data from secondary or third-party sources.
In this issue, I am covering the process and tooling to combine event data from first-party sources with object data from third-party sources, both to analyze the data (to derive insights) and to activate it (to drive action).
Disclaimer: Data-led Academy is vendor-neutral and the tools mentioned here are not necessarily the best tools in the categories they operate in, nor are they the only ones.

Event data from your website and apps (primary data sources) helps analyze user behaviour.
Object data from sales, marketing, advertising, and support tools (secondary data sources) helps analyze user engagement.
So why combine event data with object data?
A combination of the two helps measure the impact of engagement activities on user behaviour. 
This is best explained with an example: 
  1. You lead growth for a company that sells a workflow automation tool.
  2. You look at events on a product analytics tool like Mixpanel or Amplitude to analyze the user journey from sign-up to activation and beyond.
  3. You also use event-based engagement tools like Customer.io or Intercom for lifecycle emails and Userflow or Appcues for in-app onboarding to get the user to perform desired actions and reach the aha moment. 
P.S. That’s me during my time at Integromat. ☝️
However, to measure the impact of your engagement activities, you’d want to see if a particular campaign (emails or in-app prompts asking a user to, say, create a workflow) actually made the user perform the desired action (of creating a workflow).
While this sounds like a no-brainer, such analysis requires you to combine event data from your app with object data from the engagement tools using purpose-built tools.
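To make this concrete, here is a minimal sketch of that kind of analysis in plain Python. Everything in it is an assumption for illustration: the record shapes, the campaign and event names, the shared `user_id`, and the 7-day attribution window are hypothetical and not taken from any specific tool.

```python
from datetime import datetime, timedelta

# Hypothetical object data exported from an engagement tool: one row per email sent.
email_sends = [
    {"user_id": "u1", "campaign": "create_workflow_nudge", "sent_at": datetime(2021, 5, 1, 9, 0)},
    {"user_id": "u2", "campaign": "create_workflow_nudge", "sent_at": datetime(2021, 5, 1, 9, 0)},
    {"user_id": "u3", "campaign": "create_workflow_nudge", "sent_at": datetime(2021, 5, 1, 9, 0)},
]

# Hypothetical event data tracked in the app.
app_events = [
    {"user_id": "u1", "event": "workflow_created", "timestamp": datetime(2021, 5, 2, 14, 30)},
    {"user_id": "u2", "event": "workflow_created", "timestamp": datetime(2021, 5, 20, 8, 0)},
    {"user_id": "u3", "event": "signed_in", "timestamp": datetime(2021, 5, 1, 10, 0)},
]


def campaign_conversions(sends, events, target_event, window=timedelta(days=7)):
    """Return the user_ids that performed target_event within `window` after a send."""
    converted = set()
    for send in sends:
        for event in events:
            if (
                event["user_id"] == send["user_id"]
                and event["event"] == target_event
                and send["sent_at"] <= event["timestamp"] <= send["sent_at"] + window
            ):
                converted.add(send["user_id"])
    return converted


converted = campaign_conversions(email_sends, app_events, "workflow_created")
conversion_rate = len(converted) / len(email_sends)
```

The join only works because both datasets share a user identifier, which is exactly what the tooling discussed in this issue is meant to provide.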
You can also go a step further and personalize the engagement campaigns using a combination of this data. Here’s a quick example before I jump into the process and the tooling: 
  1. Your lifecycle email campaigns are running on Customer.io
  2. Your in-app onboarding campaigns are running on Userflow
  3. You now wish to personalize the email campaigns based on the completion of specific steps of the onboarding campaign
Once again, to do this, you’d need to combine event data from your app with object data from Userflow (or your onboarding tool) and make this data available on Customer.io (or your email tool).
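As a rough sketch of what "combining" means here, the snippet below merges onboarding progress with app events into a per-user set of attributes that an email tool could use for personalization. The step names, event names, and attribute names are all hypothetical, and no real Userflow or Customer.io API is being called.

```python
# Hypothetical onboarding steps, in the order the user is expected to complete them.
ONBOARDING_STEPS = ["connect_app", "create_workflow", "schedule_workflow"]

# Hypothetical object data from the onboarding tool: steps completed per user.
onboarding_progress = {
    "u1": {"connect_app", "create_workflow"},
    "u2": {"connect_app"},
}

# Hypothetical event data from the app: event names seen per user.
app_events = {
    "u1": {"signed_in", "workflow_run"},
    "u2": {"signed_in"},
}


def email_attributes(user_id):
    """Combine onboarding progress and app events into attributes for the email tool."""
    completed = onboarding_progress.get(user_id, set())
    remaining = [step for step in ONBOARDING_STEPS if step not in completed]
    return {
        "user_id": user_id,
        "next_onboarding_step": remaining[0] if remaining else None,
        "has_run_workflow": "workflow_run" in app_events.get(user_id, set()),
    }


attributes = [email_attributes(user_id) for user_id in sorted(onboarding_progress)]
```

An email campaign could then branch on `next_onboarding_step` instead of sending the same message to everyone.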
The process and tools to combine event data with object data
There are two distinct approaches to combining event data and object data for the purpose of analysis and activation.
Analysis and Activation of Data
1. Using a Customer Data Platform (CDP)
A Customer Data Platform, by definition, is a software system that collects customer data from various data sources (both primary and secondary), handles identity resolution to unify the data, and offers the ability to build user audiences that are synced with third-party tools.
In essence, a CDP such as Segment Personas or mParticle is an end-to-end solution to collect and combine data, and sync it with third-party tools. 
However, just like any other off-the-shelf, ready-made solution, a CDP has some limitations – here are the main ones:  
  1. Support for third-party data sources is limited and you might have to build your own integrations with tools that the CDP doesn’t support as data sources.
  2. Almost always, combining data from disparate sources requires data transformation, and since a CDP is meant to be used by less-technical users, the transformation capabilities are limited.
  3. Lastly, the biggest limitation of a CDP is that you need to adhere to a fixed data model – you cannot model the data as per the data model of your app (many-to-many relationship between accounts and users, for example) or that of external tools.
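The many-to-many relationship mentioned in the third limitation can be sketched as follows. This is only an illustration of the shape of such a data model: the `Membership` record, its fields, and the sample accounts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    """Link record: one user can belong to many accounts, and one account has many users."""
    user_id: str
    account_id: str
    role: str


memberships = [
    Membership("u1", "acme", "admin"),
    Membership("u1", "globex", "member"),
    Membership("u2", "acme", "member"),
]


def accounts_for(user_id):
    """All accounts a given user belongs to."""
    return sorted(m.account_id for m in memberships if m.user_id == user_id)


def users_in(account_id):
    """All users that belong to a given account."""
    return sorted(m.user_id for m in memberships if m.account_id == account_id)
```

A fixed model that assumes every user belongs to exactly one account has no place to store the second membership of `u1`, which is the kind of constraint this limitation refers to.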
While these limitations might sound like deal-breakers, there are many use cases and reasons to adopt a CDP anyway. To learn more, check out resource [1] at the bottom. 
2. Using a Data Warehouse 
A Data Warehouse is essentially a database designed for the purpose of analytics and is used to store data from all possible data sources – first-party apps, third-party tools, as well as production databases. 
Due to the affordability and the ease of setting up a data warehouse, popular cloud data warehousing solutions (like Snowflake, Google BigQuery, and Amazon Redshift) are experiencing rapid adoption.
It’s pretty straightforward to sync data from all your data sources using purpose-built data collection tools – I won’t go into the details as this was covered in the last two issues. [2] [3]
Once the data is in the data warehouse, you need to write SQL queries in the warehouse to combine and transform data from multiple sources. This process is referred to as data transformation and it enables building custom data models which are then synced to third-party tools for analysis and activation.
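Here is a minimal sketch of such a transformation. It uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse so that the SQL can be run locally; the table names, columns, and the `user_engagement` model are hypothetical, and the SQL dialect of a real warehouse may differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
conn.executescript("""
    -- Event data from the app and object data from an email tool (hypothetical tables).
    CREATE TABLE app_events (user_id TEXT, event TEXT, occurred_at TEXT);
    CREATE TABLE email_sends (user_id TEXT, campaign TEXT, sent_at TEXT);

    INSERT INTO app_events VALUES
        ('u1', 'workflow_created', '2021-05-02'),
        ('u1', 'workflow_created', '2021-05-03'),
        ('u2', 'signed_in', '2021-05-01');
    INSERT INTO email_sends VALUES
        ('u1', 'create_workflow_nudge', '2021-05-01'),
        ('u2', 'create_workflow_nudge', '2021-05-01');

    -- The data model: one row per emailed user, combining both sources.
    -- Each source is aggregated first so the join cannot multiply rows.
    CREATE VIEW user_engagement AS
    WITH sends AS (
        SELECT user_id, COUNT(*) AS emails_received
        FROM email_sends
        GROUP BY user_id
    ),
    workflows AS (
        SELECT user_id, COUNT(*) AS workflows_created
        FROM app_events
        WHERE event = 'workflow_created'
        GROUP BY user_id
    )
    SELECT
        s.user_id,
        s.emails_received,
        COALESCE(w.workflows_created, 0) AS workflows_created
    FROM sends AS s
    LEFT JOIN workflows AS w ON w.user_id = s.user_id;
""")

rows = conn.execute(
    "SELECT user_id, emails_received, workflows_created "
    "FROM user_engagement ORDER BY user_id"
).fetchall()
```

The resulting `user_engagement` view is the kind of custom model that would then be exposed to analysis and activation tools.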
While data can be transformed solely within the warehouse itself, purpose-built data transformation tools (like dbt and Trifacta) bring a lot of flexibility and automation to this process. 
Data Transformation & Reverse ETL
Either way, once the data has been prepared, it needs to be synced from the warehouse to the tools used for data analysis and data activation. 
In terms of analysis, all Business Intelligence (BI) tools (like Looker, Mode, or Tableau) and some Product Analytics tools (like Indicative and Rakam) have native integrations with popular data warehouses (other vendors are also building native support for warehouses).
However, in order to activate the data in the warehouse, you need to use yet another integration tool to sync the data from the warehouse to third-party tools used for sales, marketing, advertising, and support. And this is where Reverse ETL tools (like Hightouch, Census, and Grouparoo) come into play – enabling you to extract modelled data from the warehouse and load the data into third-party tools. 
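Conceptually, a Reverse ETL sync boils down to "query the model, batch the rows, push them to a destination". The sketch below shows that loop under heavy assumptions: `sqlite3` stands in for the warehouse, the `user_engagement` table is hypothetical, and `send_batch` is a placeholder callback rather than the API of Hightouch, Census, Grouparoo, or any destination tool.

```python
import sqlite3


def extract_rows(conn, query):
    """Run the model query and return each row as a dict keyed by column name."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


def to_batches(records, batch_size):
    """Split records into lists of at most batch_size items."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def sync(conn, query, send_batch, batch_size=2):
    """Extract modelled rows and hand them to the destination batch by batch."""
    records = extract_rows(conn, query)
    for batch in to_batches(records, batch_size):
        send_batch(batch)  # hypothetical: a real sync would call the destination's API here
    return len(records)


conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
conn.executescript("""
    CREATE TABLE user_engagement (user_id TEXT, workflows_created INTEGER);
    INSERT INTO user_engagement VALUES ('u1', 2), ('u2', 0), ('u3', 5);
""")

sent_batches = []  # stand-in for a third-party tool receiving the data
synced = sync(
    conn,
    "SELECT user_id, workflows_created FROM user_engagement ORDER BY user_id",
    sent_batches.append,
)
```

Real Reverse ETL tools add the parts that make this hard in practice, such as scheduling, syncing only changed rows, retries, and respecting each destination's rate limits.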
In a nutshell, compared to the CDP approach, the warehouse approach requires more time and more resources, both in terms of talent and tools. And while this approach offers complete flexibility, it is only recommended if you have a dedicated data team that, at the very least, comprises a data engineer and a data analyst. 
Summary and What’s Next
This issue covered the process and tooling to combine event data from first-party sources and object data from third-party sources to derive insights and drive action. 
In the next issue, I am planning to dig deeper into the analysis layer or the activation layer – if you have a preference, please LMK!
And don’t forget to send in your questions via Twitter or LinkedIn.
Thanks for reading! 🙏
Resources to dig deeper
Arpit Choudhury

A free newsletter by the founder of Data-led Academy sent almost twice a month for folks to keep pace with the data space!
