Transformations in Rudder: Part 1

One of the core promises of the Rudder platform is that – when it comes to business lifecycle events – enterprises can generate once, analyze anywhere. The platform frees enterprises from having to mangle the same data differently for different target analytics platforms just to suit one or more of the following needs:

Map/match the model of the generated data to the format in which the target platform expects it
Aggregate events because the target platform cannot operate on collections, so multiple calls need to be made to the platform to load the data one item at a time while linking the items through a common identifier (e.g. every line item from a shopping cart has to be uploaded individually, linked through a common order id). The problem gets worse when the target platform charges based on call volume.
Selectively forward events to a platform depending on the target platform’s core strength and focus functional area

Whatever the driver, the end result is increased code complexity, a greater risk of regressions and more time-consuming release cycles when integrating newer analytics platforms.

The Rudder platform changes that. Using the Rudder SDK, developers generate ONE event that is routed to the central processing hub. One enables Destinations for the event through the Rudder Management Console.

The Rudder platform comes with support for a plethora of Destinations like Google Analytics, Amplitude, AppsFlyer and many more.

Rudder events already contain additional information about the source and the associated device context, without requiring developers to supply it explicitly.

Developers can also associate a user identifier with the events where applicable.

Additionally, the Rudder event payload contains a few key structures that are built from the additional parameters/properties coded in by the developer. These are:

rl_type – Broad categories of messages e.g. pageview, screenview, track, identify
rl_event – An enumeration representing the business lifecycle event e.g. Product List Viewed, Checkout Option Changed and so on
rl_user_properties – Schema-free JSON data structure representing attributes to be associated with the User who has triggered the business event. Examples include an enterprise-specific user identifier, address, age, gender – what have you
rl_properties – Schema-free JSON data structure representing attributes associated with the event itself. An example is the collection of Products for a Product List Viewed event, where each Product in turn might have an SKU, Brand, Variant and Size associated with it
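To make these structures concrete, here is a rough sketch of what such an event might look like as a JavaScript object. Only the four rl_* keys come from the description above; the flat layout, every value, and the helper missingKeyStructures are assumptions made for illustration, and the source and device context that Rudder adds automatically is left out.

```javascript
// Illustrative event payload. Only the four rl_* keys are taken from the
// post; the flat layout and all values are made-up examples.
const sampleEvent = {
  rl_type: "track",
  rl_event: "Product List Viewed",
  rl_user_properties: {
    user_id: "u-1001", // hypothetical enterprise-specific identifier
    age: 34,
  },
  rl_properties: {
    // A collection of products, each with its own attributes
    products: [
      { SKU: "SKU-1", Brand: "Acme", Variant: "Blue", Size: "M" },
      { SKU: "SKU-2", Brand: "Acme", Variant: "Red", Size: "L" },
    ],
  },
};

// Hypothetical helper: report which of the key structures are absent.
function missingKeyStructures(event) {
  const required = [
    "rl_type",
    "rl_event",
    "rl_user_properties",
    "rl_properties",
  ];
  return required.filter((key) => !(key in event));
}
```

Because rl_user_properties and rl_properties are schema-free, a check like this can only confirm that the structures are present, not what they contain.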

The source and context-specific attributes are automatically mapped, as applicable without compromising their business significance, to parameters/structures specific to the destination.

When it comes to message transformation, two strategies are supported by the Rudder platform:

A direct field-to-field mapping that can be added through the Management Console. Example – rl_properties.products.product.SKU : product_id (map the SKU field under the Product structure under the Products collection within rl_properties to the field product_id in the respective calls generated for the Destination)
A Transformation that can operate on an event message collection (with special focus on the rl_event, rl_user_properties, rl_properties) and emit another collection depending on business logic contained in JavaScript code that can be entered through the Management Console
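As a rough illustration of the second strategy, the sketch below takes a collection of events and emits another collection, turning each cart-level event into one event per product linked by a common order id. The function name transform, the order_id and products field names, and the choice to pass other events through unchanged are all assumptions for this example, not the signature the Management Console actually expects.

```javascript
// Hypothetical transformation: expand one cart-level event into one event
// per product, linked by a shared order_id. Events that carry no products
// collection pass through unchanged.
function transform(events) {
  const output = [];
  for (const event of events) {
    const props = event.rl_properties || {};
    const products = props.products;
    if (!Array.isArray(products) || products.length === 0) {
      output.push(event);
      continue;
    }
    for (const product of products) {
      output.push({
        ...event,
        // Each emitted event carries a single product plus the order id
        rl_properties: { ...product, order_id: props.order_id },
      });
    }
  }
  return output;
}
```

Called with one event holding two products, this sketch emits two events, each carrying the same order_id. The event-generation code in the application stays untouched; only the destination-specific output changes.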

It is this second strategy that is the focus of our blog today.

The drivers remain the same as described above.

But now, one operates at the level where the final payload for a specific destination is about to be emitted. The regression effect is nullified because an error in the transformation cannot adversely affect the payloads getting emitted to other Destinations.

Code complexity is also reduced for the enterprise developers, since the event-generation code remains one and the same.

So how do we achieve this?

The enterprise developer/administrator will be able to define a transformation function in JavaScript through the Function Editor in the console
The transformation function would be mapped to the particular organization-destination combination and the reference maintained in the database
The Management Console would persist the code in a file naming it in accordance with a pre-defined and fixed convention and saving it at a specific location to aid its loading at runtime
There would be destination-specific transformation engines residing within the Rudder hub. Each of these engines is essentially a NodeJS script that operates on a batch of event messages. On each batch, it performs the following functions in sequence

Retrieve the name of the transformation function mapped to the destination in question for the particular organization
Dynamically construct the path to the JavaScript file persisted earlier
Call the function within the JS file that encapsulates the complete code entered through the Management Console, passing it the collection of event messages that the engine originally received
The function returns another collection (later sections outline how this new collection is generated)
On this new collection

Map source and context attributes to destination-specific payload fields
Perform field-to-field transformations as defined by the administrator/developer through the Management Console

Pass on the updated collection to the next stage in the Rudder processing pipeline
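The sequence above can be sketched roughly as follows. This is not the actual engine code: the path convention in buildTransformPath, the options object, the mapContext hook and the loadTransform loader are all hypothetical names introduced for illustration, and the field map assumes simple dotted paths into each event.

```javascript
// Hypothetical naming convention: <baseDir>/<orgId>/<destination>.js
function buildTransformPath(baseDir, orgId, destination) {
  return [baseDir, orgId, `${destination}.js`].join("/");
}

// Read a dotted path such as "rl_properties.SKU" from an object.
function getPath(obj, dottedPath) {
  return dottedPath
    .split(".")
    .reduce((node, key) => (node == null ? undefined : node[key]), obj);
}

// Sketch of one destination engine run over a batch of events.
function processBatch(events, options) {
  const {
    baseDir,
    orgId,
    destination,
    fieldMap = {}, // e.g. { "rl_properties.SKU": "product_id" }
    mapContext = () => ({}), // hypothetical source/context mapping hook
    // Assumes the persisted file exports the transformation function
    loadTransform = (file) => require(file),
  } = options;

  // Construct the path to the persisted file and load the function from it
  const transform = loadTransform(
    buildTransformPath(baseDir, orgId, destination)
  );

  // Pass in the original batch; the function returns a new collection
  const transformed = transform(events);

  // On the new collection: map context attributes, then apply the
  // administrator-defined field-to-field mappings
  return transformed.map((event) => {
    const payload = { ...mapContext(event) };
    for (const [from, to] of Object.entries(fieldMap)) {
      const value = getPath(event, from);
      if (value !== undefined) payload[to] = value;
    }
    return payload;
  });
}
```

Injecting loadTransform keeps the sketch testable without touching the file system, and it mirrors the isolation described earlier: each destination resolves and runs only its own transformation file.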

In the next part of the blog we take a deeper dive into the inner workings of the transformation function and outline how developers can develop such functions.
