Concepts & Process

Data Normalization

To reduce the amount of post-collection transformation, apply basic normalization to dimension values as they are collected.

Generic Strings

When collecting generic string data that isn't case-sensitive (e.g. button click text), lowercase the values.
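
As a minimal sketch of this rule, the helper below lowercases a collected value. The function name normalize_label is hypothetical and used only for illustration; the same one-line transformation applies in whatever language the collection code is written in.

```python
def normalize_label(value: str) -> str:
    """Lowercase a string dimension value that is not case-sensitive."""
    return value.lower()


# Example: button click text collected with mixed casing.
print(normalize_label("Add To Cart"))  # add to cart
```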

URLs

Never transform the ENTIRE URL string.
Doing so also lowercases the query string, which changes the values of gclid, other advertising click IDs, and any utm_* parameters, and causes issues with downstream reporting.

Transforming the URL path is perfectly fine if needed, as many web servers either force paths to lowercase or use case-insensitive rewrite rules (e.g. IIS).

Additionally, enforcing a trailing / on the end of a URL path is recommended to reduce cardinality. Most CMSs either append a / or leave the path bare, but there will be times when manual cleanup is needed.
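
The URL guidance above can be sketched as follows, using Python's standard urllib.parse module. The function name normalize_url is hypothetical, and skipping the trailing / when the last path segment contains a dot (so that file-like paths such as /guide.pdf are left alone) is an assumption of this sketch, not a rule from this guide.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase only the path and enforce a trailing slash.

    The query string and fragment are passed through untouched so that
    gclid, other click IDs, and utm_* values keep their original casing.
    """
    parts = urlsplit(url)
    path = parts.path.lower()

    # Assumption: a dot in the last segment means a file (e.g. /guide.pdf),
    # so no trailing slash is appended in that case.
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"

    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(normalize_url(
    "https://example.com/Blog/My-Post?utm_source=Newsletter&gclid=AbC123"
))
# https://example.com/blog/my-post/?utm_source=Newsletter&gclid=AbC123
```

Splitting the URL first keeps the case change scoped to the path, which is the only component this guide treats as safe to transform.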

Date/Time

Most date/time dimensions collected will be for providing context around content publish dates. It is not necessary to provide details about data ingestion date/time as it is calculated automatically at the time of collection.

Dates as Strings

Using 2:00 PM on Dec 5th, 1987 as an example:

  • Split month, day, and year into separate dimensions to reduce cardinality.
    E.g.: publish_year: '1987', publish_month: '12', publish_day: '05'.
  • Split hour and minute into separate dimensions to reduce cardinality, and use a 24-hour format for the hour.
    E.g.: event_hour: '14', event_minute: '00'.
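
As an illustration, the split described above could be produced from a datetime value like this. The function name split_datetime is hypothetical; the dimension names are the ones used in the examples above, and the zero-padded string values come from standard strftime format codes.

```python
from datetime import datetime


def split_datetime(dt: datetime) -> dict[str, str]:
    """Split a datetime into separate zero-padded string dimensions."""
    return {
        "publish_year": dt.strftime("%Y"),
        "publish_month": dt.strftime("%m"),
        "publish_day": dt.strftime("%d"),
        "event_hour": dt.strftime("%H"),    # 24-hour format
        "event_minute": dt.strftime("%M"),
    }


# 2:00 PM on Dec 5th, 1987
dims = split_datetime(datetime(1987, 12, 5, 14, 0))
print(dims)
# {'publish_year': '1987', 'publish_month': '12', 'publish_day': '05',
#  'event_hour': '14', 'event_minute': '00'}
```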

1st Party Data

See the applicable documentation under Platform Guides for notes about each specific ad-platform and its nuances.

Some advanced advertising platforms like Google, Microsoft, and Meta Ads allow an advertiser to "share" 1st party data in a "secure" manner via SHA-256 hashed values. In most cases, this is done with email addresses and phone numbers.

Each of these platforms has minor nuances in how the raw data needs to be formatted/normalized before performing the hash. Failure to follow these normalization guidelines may result in lower match rates.
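
The sketch below shows the general normalize-then-hash pattern for an email address, using Python's standard hashlib module. The function name hash_email is hypothetical, and the normalization shown (trim surrounding whitespace, then lowercase) is an assumed baseline only; each platform's own guidelines take precedence and may require additional steps. Phone numbers are not shown because their required format differs between platforms.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email address, then return its SHA-256 hex digest."""
    # Assumed baseline normalization: trim whitespace and lowercase.
    # Check the relevant platform guide for any additional rules.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Differently formatted inputs produce the same hash once normalized,
# which is what allows the platform to match them.
a = hash_email("  Jane.Doe@Example.com ")
b = hash_email("jane.doe@example.com")
print(a == b)  # True
```

Hashing without normalizing first would give a different digest for each casing or whitespace variant, which is one way match rates drop.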

© 2026 Level Agency.