One of the core promises of the Rudder platform is that, when it comes to business lifecycle events, enterprises can generate once and analyze anywhere. The platform frees enterprises from having to reshape the same data differently for each target analytics platform just to suit one or more of the following needs:
- Map/match the model of the generated data to the format in which the target platform expects it
- Break up aggregate events because the target platform cannot operate on collections, so multiple calls must be made to load the data one item at a time, with the items linked through a common identifier (e.g. every line item from a shopping cart uploaded individually and linked through a common order id). The problem is aggravated when the target platform charges based on call volume.
- Selectively forward events to a platform depending on the target platform's core strength and functional focus area
Whatever the driver, the end result is increased code complexity, greater risk of regressions and more time-consuming release cycles when integrating newer analytics platforms.
The Rudder platform changes that. Using the Rudder SDK, developers generate ONE event that is routed to the central processing hub. Destinations for that event are then enabled through the Rudder Management Console.
The Rudder platform comes with support for a plethora of Destinations like Google Analytics, Amplitude, AppsFlyer and many more.
Rudder events already contain additional information regarding the source and associated device context without requiring developers to explicitly supply those.
Developers can also associate a user identifier with the events where applicable.
Additionally, the Rudder event payload contains a few key structures that are built from the additional parameters/properties coded in by the developer. These are:
- rl_type – Broad categories of messages e.g. pageview, screenview, track, identify
- rl_event – An enumeration representing the business lifecycle event e.g. Product List Viewed, Checkout Option Changed and so on
- rl_user_properties – Schema-free JSON data structure representing attributes to be associated with the User who has triggered the business event. Examples include a user identifier specific to the enterprise, address, age and gender
- rl_properties – Schema-free JSON data structure representing attributes associated with the event itself. An example is the collection of Products for a Product List Viewed event, where each Product in turn might have an SKU, Brand, Variant and Size associated with it
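To make these structures concrete, here is a rough sketch of what such a message might look like. Only the four rl_ keys come from the description above; every value, the nested field names such as user_id, and the exact nesting of the products collection are assumptions made for illustration. The automatically captured source and device context is left out.

```javascript
// Illustrative Rudder event message. Only the four rl_* keys are taken from
// the text; all values and nested field names are made-up examples.
const sampleEvent = {
  rl_type: "track",
  rl_event: "Product List Viewed",
  rl_user_properties: {
    user_id: "U-1001", // hypothetical enterprise-specific identifier
    age: 34,
    gender: "F",
  },
  rl_properties: {
    // Assumed nesting: a products collection whose items each wrap a product
    products: [
      { product: { SKU: "SKU-1", Brand: "Acme", Variant: "Blue", Size: "M" } },
      { product: { SKU: "SKU-2", Brand: "Acme", Variant: "Red", Size: "L" } },
    ],
  },
};

console.log(JSON.stringify(sampleEvent, null, 2));
```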
The source and context-specific attributes are automatically mapped, where applicable and without compromising their business significance, to parameters/structures specific to the destination.
When it comes to message transformation, two strategies are supported by the Rudder platform:
- A direct field-to-field mapping that can be added through the Management Console. Example – rl_properties.products.product.SKU : product_id (map the SKU field under the Product structure under the Products collection within rl_properties to the field product_id in the respective calls generated for the Destination)
- A custom transformation function, entered as code through the Management Console, that receives the collection of event messages bound for a Destination and returns the collection to be emitted
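As a rough illustration of the field-to-field strategy, the sketch below resolves a dotted source path against a message and writes the result to the destination field. The helper names resolvePath and applyFieldMappings, the mapping-table shape and the rule that a path fans out across array elements are all assumptions for this example, not the platform's actual implementation.

```javascript
// Resolve a dotted path against a value. When the walk lands on an array,
// the remaining path is applied to every element (assumed semantics).
function resolvePath(value, segments) {
  if (segments.length === 0 || value === undefined || value === null) {
    return value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolvePath(item, segments));
  }
  const [head, ...rest] = segments;
  return resolvePath(value[head], rest);
}

// Hypothetical mapping table: source path -> destination field name.
const fieldMappings = {
  "rl_properties.products.product.SKU": "product_id",
};

// Build a destination-side object holding only the mapped fields.
function applyFieldMappings(message, mappings) {
  const out = {};
  for (const [sourcePath, targetField] of Object.entries(mappings)) {
    const resolved = resolvePath(message, sourcePath.split("."));
    if (resolved !== undefined) out[targetField] = resolved;
  }
  return out;
}

const message = {
  rl_properties: {
    products: [{ product: { SKU: "SKU-1" } }, { product: { SKU: "SKU-2" } }],
  },
};

const mapped = applyFieldMappings(message, fieldMappings);
console.log(mapped); // { product_id: [ 'SKU-1', 'SKU-2' ] }
```

Here the mapped values come back as one array. A real destination integration would more likely place each product_id in its own call, as the example in the text suggests.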
It is this second strategy that is the focus of our blog today.
The drivers remain the same as described above.
But now, one operates at the level where the final payload for a specific destination is about to be emitted. The regression effect is nullified because an error in one transformation cannot adversely affect the payloads emitted to other Destinations.
Code complexity is also reduced for enterprise developers, since the event generation code remains one and the same.
So how do we achieve this?
- The transformation function would be mapped to the particular organization-destination combination and the reference maintained in the database
- The Management Console would persist the code in a file, named according to a pre-defined, fixed convention and saved at a specific location so that it can be loaded at runtime
- There would be destination-specific transformation engines residing within the Rudder hub. Each of these engines is essentially a NodeJS script that operates on a batch of event messages. On each batch, it performs the following functions in sequence
  - Retrieve the name of the transformation function mapped to the destination in question for the particular organization
  - Call the function within the JS file that encapsulates the complete code entered through the Management Console, passing the collection of event messages that it had originally received
  - The function returns another collection (later sections outline how this new collection is generated)
  - On this new collection
    - Map source and context attributes to destination-specific payload fields
    - Perform field-to-field transformations as defined by the administrator/developer through the Management Console
  - Pass on the updated collection to the next stage in the Rudder processing pipeline
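The engine steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual engine. An in-memory transformationRegistry stands in for the database lookup and file loading, toDestinationPayload stands in for the context and field-to-field mapping stages, and the example transformation (one message per cart line item, linked by order id), the "Order Completed" event name and the "acme"/"GA" keys are all hypothetical.

```javascript
// Hypothetical user-supplied transformation: emit one message per cart line
// item, all linked through the shared order id.
function splitCartIntoLineItems(events) {
  return events.flatMap((event) =>
    (event.rl_properties.products || []).map((item) => ({
      rl_type: event.rl_type,
      rl_event: event.rl_event,
      rl_properties: { order_id: event.rl_properties.order_id, ...item.product },
    }))
  );
}

// Stand-in for the database lookup: "organization:destination" -> function.
const transformationRegistry = {
  "acme:GA": splitCartIntoLineItems,
};

// Stand-in for the destination-specific mapping stages.
function toDestinationPayload(msg) {
  return {
    order_id: msg.rl_properties.order_id,
    product_id: msg.rl_properties.SKU,
    event_name: msg.rl_event,
  };
}

// Look up the mapped function, run it on the whole batch, then map each
// resulting message to the destination payload.
function runEngine(organization, destination, batch) {
  const transform = transformationRegistry[`${organization}:${destination}`];
  // No mapped function: pass the batch through unchanged (assumed behaviour).
  const transformed = transform ? transform(batch) : batch;
  return transformed.map(toDestinationPayload);
}

const cartBatch = [
  {
    rl_type: "track",
    rl_event: "Order Completed", // hypothetical event name
    rl_properties: {
      order_id: "O-42",
      products: [{ product: { SKU: "SKU-1" } }, { product: { SKU: "SKU-2" } }],
    },
  },
];

const emitted = runEngine("acme", "GA", cartBatch);
console.log(emitted.length); // 2
```

Because the lookup is keyed by organization and destination, a faulty function registered for one destination never touches the batch handed to another, which is the isolation property described earlier.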
In the next part of the blog we take a deeper dive into the inner workings of the transformation function and outline how developers can develop such functions.