Why do the results of the top marketing channels differ between GA4 and BigQuery?

2023-04-18 | Article | Insights

Challenge

A key question in digital marketing is which marketing channel generated the web or app traffic. Note that in this article the question refers to the session level; more details can be found here. Insights into the top channels are very helpful for future marketing planning, especially for budget allocation. There are two ways to identify the top channels. The simple and quick solution is an exploration report in the GA4 interface that shows the sessions per custom channel group in a given period. A more complex solution is to analyse the raw GA4 data via BigQuery. However, when the results of the two approaches are compared, they differ. Why this happens, how it should be interpreted, and what it ultimately means for marketing decisions is described below.

Approach

The top marketing channels can be identified in the GA4 interface with a custom exploration report. All that is required is to select the analysis period, the dimension "Session custom channel group" and the metric "Sessions", and to visualise the results as a table. "Session custom channel group" goes in the rows and "Sessions" in the values (see figure 1).

The insights on the top marketing channels, i.e. the custom channel groups, can also be generated via BigQuery with the help of the query described here.

The BigQuery result for the top marketing channels differs from the GA4 result (see exemplary table 1).

The results show the same ranking of marketing channels for the analysis period under consideration. The top 3 channels are identical for both approaches: Direct, Organic Search and Social. The total number of allocated sessions per approach is also almost the same. However, the distribution of sessions among the marketing channels differs significantly, which strongly affects how the table is interpreted and therefore which marketing implications are drawn.

A closer look at the sample data shows that in the GA4 exploration report, 83% of all sessions are attributed to Direct, 12% to Organic Search, 1% to Social, and so on. The BigQuery results show a different picture: here, 93% of all sessions are attributed to Direct, 5% to Organic Search, 1% to Social, and so on. Overall, BigQuery reports 13% more Direct sessions than GA4. For all other channels, BigQuery reports fewer sessions per channel (from Δ -20% for Newsletter to Δ -93% for Various). It seems that GA4 allocates actual Direct sessions to other marketing channels. This is indeed the reason for the differences: a small detail in how the measured channel Direct is interpreted. Let us look at a single user journey to understand the differences.
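Before turning to the journey, note that the shares and deltas quoted above rest on two simple calculations: each channel's share of its own total, and the relative difference between the two sources with GA4 as the baseline. The following Python sketch illustrates both. The function names and the session counts are made up for illustration; they are not the figures from table 1.

```python
from typing import Dict


def channel_shares(sessions: Dict[str, int]) -> Dict[str, float]:
    """Return each channel's share of the total number of sessions."""
    total = sum(sessions.values())
    if total == 0:
        return {channel: 0.0 for channel in sessions}
    return {channel: count / total for channel, count in sessions.items()}


def relative_delta(bigquery: Dict[str, int], ga4: Dict[str, int]) -> Dict[str, float]:
    """Relative difference of BigQuery vs. GA4 sessions per channel (GA4 = baseline)."""
    return {
        channel: (bigquery.get(channel, 0) - ga4_count) / ga4_count
        for channel, ga4_count in ga4.items()
        if ga4_count > 0  # skip channels without a GA4 baseline
    }


# Hypothetical session counts, NOT the values from the article's table.
ga4_sessions = {"Direct": 800, "Organic Search": 150, "Social": 50}
bigquery_sessions = {"Direct": 900, "Organic Search": 75, "Social": 25}

for channel, delta in relative_delta(bigquery_sessions, ga4_sessions).items():
    ga4_share = channel_shares(ga4_sessions)[channel]
    bq_share = channel_shares(bigquery_sessions)[channel]
    print(f"{channel}: GA4 {ga4_share:.0%}, BigQuery {bq_share:.0%}, delta {delta:+.1%}")
```

Even with identical totals, a shift of sessions towards Direct produces a modest positive delta for Direct and large negative deltas for the smaller channels, which is the pattern described above.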

Let’s assume a user visits a content site multiple times, and for each new session the marketing channel the user came from is recorded, as in the following user journey:

For this user, a total of 4 Direct sessions, 1 Organic Search session and 1 Social session are measured during the analysis period. This is also how the user journey appears in the BigQuery analysis.

The same user journey would be displayed in GA4 as follows:

Here, only 1 session is attributed to Direct, but 3 sessions to Organic Search and 2 sessions to Social. How can this be? In GA4, actual Direct sessions (see user journey above) are overwritten with the preceding non-direct channel (here Organic Search and Social). As a result, these channels gain in importance, because they are counted more often than they were actually used in the journey. If the BigQuery results are to follow the same logic, the query has to reassign Direct sessions to the preceding non-direct channel.
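The reallocation described above can be sketched in a few lines of Python. This is a simplified illustration and not GA4's actual implementation: it ignores any lookback window GA4 may apply, the function name `last_non_direct` is hypothetical, and the order of sessions in `journey` is an assumption chosen to match the counts given in the text (4 Direct, 1 Organic Search, 1 Social).

```python
from collections import Counter
from typing import Iterable, List, Optional

DIRECT = "Direct"


def last_non_direct(channels: Iterable[str], direct: str = DIRECT) -> List[str]:
    """Replace each Direct session with the most recent non-direct channel.

    Direct sessions that occur before any non-direct session stay Direct.
    """
    attributed: List[str] = []
    previous: Optional[str] = None  # last non-direct channel seen so far
    for channel in channels:
        if channel != direct:
            previous = channel
            attributed.append(channel)
        elif previous is not None:
            attributed.append(previous)  # Direct is overwritten
        else:
            attributed.append(direct)  # nothing to fall back on yet
    return attributed


# Assumed session order for one user; the original journey figure is not reproduced here.
journey = ["Direct", "Organic Search", "Direct", "Direct", "Social", "Direct"]

raw_counts = Counter(journey)                         # "actual" journey, as in BigQuery
ga4_style_counts = Counter(last_non_direct(journey))  # Direct overwritten, as in GA4

print(dict(raw_counts))
print(dict(ga4_style_counts))
```

For the assumed journey, the raw counts are 4 Direct, 1 Organic Search and 1 Social, while the reallocated counts are 1 Direct, 3 Organic Search and 2 Social. The total of 6 sessions is unchanged; only the distribution among channels shifts.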

Result

The above explanation shows that the different shares of the top marketing channels are not caused by an error in the BigQuery query of the raw data. Rather, they are caused by a different interpretation, and thus a different recording, of the Direct channel.

What does this mean in terms of interpretation? First of all, it is important to be aware of this difference. With that in mind, marketers can choose the view of the top marketing channels and their distribution that best matches their own understanding. For example, for a brand whose initial goal is to build brand awareness, it can be important to push precisely those channels that initiate later Direct traffic. In this case, it is helpful to use the data from the GA4 exploration report, because it credits the subsequent Direct sessions to these channels and therefore weights them particularly heavily. If a marketer is interested in the actual user journeys, where Direct is reported whenever it was the actual traffic source, the BigQuery raw data should be analysed.

Finally, it is important to mention that queries on the Universal Analytics (UA) raw data return results by default with the same measurement methodology as the Google Analytics interface. This means that the results do not differ between the UA interface and the UA data in BigQuery. However, the UA data in BigQuery offers the option to display the actual user journeys with the help of "trafficSource.isTrueDirect". This field is currently not (yet) available in the GA4 BigQuery schema; instead, the "actual" user journeys are displayed by default.
