Naming Conventions
There's a popular saying:
There are only two hard things in [Data Science]: revenue-attribution and naming things.
Naming events, dimensions, and metrics consistently, in a way that withstands the test of time, is hard. There's a delicate balance between making names as legible and explicit as possible and keeping them forgiving enough that you don't have to refactor your data pipelines every few years.
As such, we try to adhere to some strict rules and conventions to minimize opinion, confusion, and, more importantly, rework.
Casing: snake_case
Introducing case variation into your data collection schema generally adds more room for human error during implementation. snake_case (like other formats that use a special character as a delimiter) is easier to transform into other formats when desired or necessary, without the use of regular expressions.
Most data platforms (not counting "adtech platforms" like Meta) normalize and flatten kebab-case, camelCase, and PascalCase into snake_case in their underlying database tables or datasets. As such, it makes sense to use snake_case as the primary casing.
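To illustrate the point about delimiter-based casing, here is a minimal Python sketch that converts a snake_case name into other formats using only string splitting and joining, with no regular expressions. The function names (to_kebab, to_camel, to_pascal) are hypothetical helpers written for this example and are not part of any platform's API.

```python
def to_kebab(name: str) -> str:
    """Convert snake_case to kebab-case by swapping the delimiter."""
    return "-".join(name.split("_"))


def to_pascal(name: str) -> str:
    """Convert snake_case to PascalCase by capitalizing every part."""
    return "".join(part.capitalize() for part in name.split("_"))


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase: first part stays lowercase."""
    first, *rest = name.split("_")
    return first + "".join(part.capitalize() for part in rest)


if __name__ == "__main__":
    event = "form_step_complete"
    print(to_kebab(event))   # form-step-complete
    print(to_pascal(event))  # FormStepComplete
    print(to_camel(event))   # formStepComplete
```

Going the other way (for example, camelCase to snake_case) typically requires detecting case boundaries, which is where regular expressions or character-by-character scanning tend to come in.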
Event Naming Structure
To keep our events predictable and structured, we follow Segment's Object:Action framework for naming events. For example, events measuring different interactions on a form would be structured something like this:
- form_start
- form_step_load
- form_step_complete
- form_complete
In the above examples, form (or form_step) would be the "object", and start, load, and complete would be the "actions" being measured.
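As a rough sketch of how this structure can be used programmatically, the Python snippet below splits an event name into its object and action. It assumes the action is always the single token after the last underscore, which holds for the examples above but is an assumption of this sketch, not a rule stated by the framework. The names EventName and parse_event_name are hypothetical.

```python
from typing import NamedTuple


class EventName(NamedTuple):
    obj: str     # the thing being interacted with, e.g. "form_step"
    action: str  # what happened to it, e.g. "load"


def parse_event_name(event: str) -> EventName:
    """Split an event name into (object, action).

    Assumption: the action is the single token after the last underscore.
    """
    obj, sep, action = event.rpartition("_")
    if not sep or not obj or not action:
        raise ValueError(f"expected '<object>_<action>', got {event!r}")
    return EventName(obj, action)


if __name__ == "__main__":
    for name in ["form_start", "form_step_load", "form_step_complete", "form_complete"]:
        parsed = parse_event_name(name)
        print(f"{name}: object={parsed.obj}, action={parsed.action}")
```

If a schema ever needs multi-word actions, this last-token rule would no longer be enough, and an explicit list of allowed actions would be a safer way to split the name.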
Using this format has a few benefits.
- Events can be easily ordered alphabetically and grouped together.
- Having a predictable starting string on an event can make regex-based triggers less prone to error.
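Both benefits can be sketched in a few lines of Python. The event list below is illustrative: the cart_ and platform_ events are hypothetical and only there to show grouping and a near-miss match. The pattern and the function name is_form_event are likewise assumptions for this example; a real tag manager would have its own trigger configuration.

```python
import re

# Illustrative events; only the form_* names come from the examples above.
events = [
    "form_step_load",
    "cart_add",
    "form_complete",
    "platform_view",
    "form_start",
    "form_step_complete",
]

# Benefit 1: a plain alphabetical sort groups events by their object prefix.
sorted_events = sorted(events)

# Benefit 2: an anchored prefix is enough to target one object's events.
FORM_EVENT = re.compile(r"^form_")


def is_form_event(event: str) -> bool:
    """Return True when the event name starts with the 'form_' prefix."""
    return FORM_EVENT.match(event) is not None


form_events = [e for e in sorted_events if is_form_event(e)]

if __name__ == "__main__":
    print(sorted_events)
    print(form_events)
```

Anchoring the pattern with ^ matters here: without it, a name such as platform_view would also match, because it contains the substring form_.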