While coding the transformation function, developers need to keep the following in mind:
- All of their code will go inside a single transform function
- That function receives a collection of event messages as a JSON structure. Key fields of that structure referenced in these guidelines include rl_event, rl_context, and rl_traits
- The function iterates through the collection and, for each message, generates one or more messages (keeping the structure unchanged) for further processing
- Developers must not alter the structure of the event messages; a message with required fields missing might be rejected
- Developers must maintain the semantic integrity of the message payload. For example, a “track” message with rl_event “revenue” must always have either the price and quantity fields or the revenue field
- The following are some typical transformations that developers might choose to employ:
  - The original message type is “pageview”, but for historical reasons a certain analytics tool has previously recorded such events as “track” messages with event category “Page Viewed”. The transformation can generate a 1:1 message for each message in the collection, with the message type changed to “track”, rl_event set to “Page Viewed”, and the appropriate event category populated in the message
  - Include only every fifth message from the original set in the collection to be returned. This kind of sampling is useful when the enterprise expects a huge number of events, the target analytics platform charges by volume, and samples are sufficient for the analytics of interest
  - Aggregate per user: for messages in the received batch with type “track” and event “spin”, sum the “total_payments” field for each user and generate a single event per user carrying the summed total_payments. Again, on a platform that charges by volume, such aggregation can be handy for metrics like Average Revenue Per User
  - For each product included in a “track” message for event “Product List Viewed”, generate an individual message that might flow into, say, Google Analytics with the GA event action “detail”. This is typically done when the target platform (like Google Analytics) offers a high free volume threshold together with sufficient capability for granular analysis. In these situations one input message gives rise to more than one output message
- Developers should ideally leave structures like rl_context and rl_traits unchanged, as they describe application and user specifics. Manipulating them may lead to erroneous results. However, there can be specific requirements that call for such manipulation. For example, a customer might be interested in finding out the distribution of their users across Android OS variants. In that case, if the os_name is “android” and the os_version is anything between 5.0 and 5.1.1, the developer might want to replace the os_version with just “Lollipop”
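
The 1:1 re-typing, the one-to-many product fan-out, and the Android version relabeling described above could be sketched roughly as follows. This is an illustrative sketch in Python, not the platform's own API: the field names `rl_type`, `rl_properties`, `category`, `products`, `product`, and `ga_action`, as well as the placement of `os_name` and `os_version` directly under `rl_context`, are assumptions made for the example. Only `rl_event`, `rl_context`, `os_name`, and `os_version` are named in the text.

```python
import copy

# Assumed (hypothetical) field names: "rl_type" for the message type and
# "rl_properties" for the event payload.


def retype_pageviews(messages):
    """1:1 mapping: turn "pageview" messages into "track" / "Page Viewed"."""
    out = []
    for msg in messages:
        msg = copy.deepcopy(msg)  # never mutate the incoming batch
        if msg.get("rl_type") == "pageview":
            msg["rl_type"] = "track"
            msg["rl_event"] = "Page Viewed"
            # Assumed location of the event category.
            msg.setdefault("rl_properties", {})["category"] = "Page Viewed"
        out.append(msg)
    return out


def fan_out_products(messages):
    """1:N mapping: one message per product of a "Product List Viewed" event."""
    out = []
    for msg in messages:
        products = msg.get("rl_properties", {}).get("products") or []
        is_list_view = (msg.get("rl_type") == "track"
                        and msg.get("rl_event") == "Product List Viewed")
        if not (is_list_view and products):
            out.append(msg)  # pass everything else through unchanged
            continue
        for product in products:
            child = copy.deepcopy(msg)  # same top-level structure as the parent
            props = child["rl_properties"]
            props.pop("products")
            props["product"] = copy.deepcopy(product)
            props["ga_action"] = "detail"  # hypothetical field name
            out.append(child)
    return out


def label_lollipop(messages):
    """Replace Android os_version values in [5.0, 5.1.1] with "Lollipop"."""
    out = []
    for msg in messages:
        msg = copy.deepcopy(msg)
        ctx = msg.get("rl_context", {})
        if ctx.get("os_name") == "android":
            try:
                parts = str(ctx.get("os_version", "")).split(".")
                version = tuple(int(p) for p in parts)
            except ValueError:
                version = None  # non-numeric version: leave it alone
            if version is not None and (5, 0) <= version <= (5, 1, 1):
                ctx["os_version"] = "Lollipop"
        out.append(msg)
    return out


def transform(messages):
    """Single entry point that chains the individual steps."""
    return label_lollipop(fan_out_products(retype_pageviews(messages)))
```

Each step copies a message before changing it and leaves the top-level message structure intact, in line with the guidelines above. Messages that a step does not target pass through unchanged.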
The transformation function is a differentiator for the Rudder platform and can be a powerful tool in the hands of the enterprise.
It can be used for spend optimization (by reducing the number of messages directed to a platform and leveraging aggregated events) as well as for gaining better business insight (by altering message types to facilitate classification).
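
As a rough illustration of the spend-optimization side, the sketch below shows the every-fifth-message sampling and the per-user aggregation of `total_payments` for “spin” events. It is written in Python purely for illustration. The field names `rl_type`, `rl_properties`, and `rl_anonymous_id` (used here as the user key) are assumptions for the example, and passing non-“spin” messages through unchanged is one possible design choice rather than documented behavior.

```python
import copy
from collections import defaultdict


def sample_every_fifth(messages):
    """Keep the 5th, 10th, 15th, ... message of the batch."""
    return messages[4::5]


def aggregate_spin_payments(messages):
    """Collapse "spin" track events into one event per user.

    Each aggregated event reuses the user's first "spin" message as a
    template, so the message structure stays unchanged; only
    total_payments is replaced by the per-user sum.
    """
    totals = defaultdict(float)
    templates = {}
    passthrough = []
    for msg in messages:
        if msg.get("rl_type") == "track" and msg.get("rl_event") == "spin":
            user = msg.get("rl_anonymous_id")  # hypothetical user key
            payment = msg.get("rl_properties", {}).get("total_payments", 0)
            totals[user] += payment
            templates.setdefault(user, msg)
        else:
            passthrough.append(msg)  # other messages are left untouched

    aggregated = []
    for user, total in totals.items():
        event = copy.deepcopy(templates[user])  # do not mutate the input
        event.setdefault("rl_properties", {})["total_payments"] = total
        aggregated.append(event)
    return passthrough + aggregated
```

Note that this sampling works per batch, so a batch with fewer than five messages yields nothing. Sampling across batches would need state that this sketch does not keep.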