Marketing Analytics Data Warehouse in Google Cloud

2022-07-12 | Article | Insights

Introduction to marketing analytics data warehouse

As part of a continuously growing demand in the market, data warehouses have become essential to one's digital strategy. The primary goal of marketers is to attract a customer base (e.g. number of new customers, decrease in acquisition cost) and then retain (e.g. decrease in churn, improvement in net promoter score) and grow it (e.g. increase in conversion and customer lifetime value, increase in revenue and/ or margin size) by satisfying its needs. Traditionally, marketers would build a funnel alongside the customer journey to outline a user-centric approach to achieve this goal. Each funnel step is determined by data, this can be data used to target a (potential) customer group or data derived from the marketing efforts deployed. In the past, ad servers and Data Management Platforms were the predominant solutions for collecting, modeling, and publishing data for its use in marketing activities. With the rise of privacy-focused discussions, building a solid Data Warehouse has evolved into a more privacy-safe, sophisticated, and contemporary approach for data management and use in marketing endeavors as it has proven to be better aligned with the business needs providing both, a more holistic view to businesses by connecting all dots, as well as respecting user’s privacy.

With marketing analytics, you can bring all your data together to get a clearer picture of the customer journey, predict marketing and business outcomes, and create more personalized experiences for your customers. This is relevant for all businesses as most of them are currently undergoing a data-driven transformation. Implementing a marketing data warehouse in the Google Cloud can help you address the business needs of deriving comprehensive insights from multiple software as a service platforms. This is done by building the architecture needed to consolidate business and advertising data and by getting real-time insights to increase marketing and business performance. Additionally, it can help drive marketing innovation by building machine learning models for business cases, such as customer segmentation, calculating customer lifetime value, and purchase predictions to use for targeting ads. This way, you can improve customer experience by gaining insights into customer preferences consequently making the experience better through personalization.

The following diagram visualizes an example of a marketing analytics reference architecture on the Google Cloud starting with data ingestion from different sources. Data collection generally happens outside of Google Cloud, therefore you need to ingest it. Then you are ready to process it which includes data matching, consolidation, and preparation. Afterward, you can start applying machine learning models and visualize the outcome, understanding and optimizing your activities. Finally, you can transfer the data to Google Analytics and implement it for advertising purposes or onsite personalization. You can also push it through a CDP API to approach a customer match through third-party channels, such as newsletter marketing or social media.

Activation is how Google Cloud customers drive outcomes for their businesses. Being able to push the data back to activation components by creating audiences of users to transfer to these platforms based on existing identifiers such as user IDs or email addresses is a key differentiating factor for marketing analytics DWH in Google Cloud.

Data ingestion

Different data sources may identify the same customer in different ways. For example, Salesforce may capture the email addresses of your customers, while the point-of-sale system (online shops) may return their loyalty card ID. Therefore, you must have a strategy for unifying this data to make it usable.

A solution to this is to transfer and consolidate your data in Google BigQuery as a central location. Many connectors can be used with Google Cloud: Data Transfer Services that Google builds and maintains for Google products; third-party connectors; connectors that are integrated within products; as well as custom connectors built by partners of Google for specific clients and needs.

As data can be ingested from multiple data sources, you can start with the use of BigQuery Data Transfer Service for the Google Marketing Platform or Google Ads, which allows you to schedule and manage recurring load jobs for reporting data. For Google Analytics 4 properties, you can export all of your raw events to BigQuery. To ingest data from sources like Salesforce, SaaS connectors are available in Google Cloud and through Google partners. Additionally, it is possible to ingest data from other public clouds such as Amazon Redshift or on-premises data warehouse systems like Teradata. With BigQuery Omni, Google provides a flexible, multi-cloud analytics solution that allows you to access and analyze data across Google Cloud, Amazon Web Services, and Azure without leaving the BigQuery user interface to break down silos even further and to holistically gain valuable business insights. As BigQuery Omni is powered by Anthos, you can query data without having to manage the underlying infrastructure.

Furthermore, you can ingest data from external sources such as customer relationship management platforms (CRM) or point-of-sale solutions. This can be done using the BQ command-line tool, the BigQuery API, or the Google Cloud Console. In the case of large datasets, Cloud Storage can help optimize your bandwidth use, network speed, and product integration. Moreover, loading data on an event-driven basis can be established with Cloud Function triggers, for example, based on new data availability.

BigQuery provides both batch data processing and streaming capabilities (real-time data processing). With batch processing, you need to decide on the frequency of initiating the process that suits your use case such as once or twice a day. On the other hand, the streaming capabilities enable more up-to-date data at any time.

Data processing

Once ingested, the data can be cleaned and reformatted if needed before starting with the querying process. This ensures consistency in big datasets and can be done with the data processing products available in Google Cloud. Depending on the specific use case, the Data Preparation and Data Warehouse can be merged in one step. The transformation is done directly with SQL queries inside BigQuery. In case of more complexity, an ETL (Extract, Transform, Load) tool can be added (Dataflow or Dataproc).

Dataproc, Dataflow, and Dataprep are three distinct parts of Google Cloud that perform different but related tasks. Dataproc has a Data Science/ ML service for Spark and Hadoop. For marketing analytics, you can use Dataflow, which lets you ingest and analyze both batch and streaming data sources at scale. It creates a new pipeline for data processing with resources that are added or removed as needed. Lastly, Dataprep by Trifacta allows you to visually explore, clean, and prepare data for analysis in BigQuery. It is UI-driven, scalable on-demand, as well as fully automated.

Machine learning application

Machine learning is already integrated into many features of Google’s products - for example, similar audiences or bidding algorithms. However, the built-in machine learning algorithms and capabilities in these tools do not serve the entire marketing funnel - this is where Google Cloud can help. Once you have ingested and processed the data, you can start running machine learning projects. Most marketers focus on their average customer. Instead, AI and ML give you mass precision, providing you with opportunities to evaluate and market at scale based on the specific value of customers or their likelihood to demonstrate a certain behavior. Here are some examples of use cases for machine learning alongside the funnel:

  • Awareness - predict the contact amount a user needs to recall your brand or market to customer segments that share similar characteristics with your most valuable customers;
  • Consideration - actively engage users that are likely to be on the threshold of churning to make them convert;
  • Conversion - predict the value of your (potential) leads and pass them to activation layers, such as the Google Marketing Platform for campaign optimization purposes;
  • Loyalty - find cross-selling opportunities and potential incentives for re-purchasing such as dynamic vouchering based on user paths and user reward system characteristics;
  • Advocacy - optimize your products or services and website based on users’ feedback to increase user satisfaction to turn them into ambassadors.

Based on the level of your knowledge, Google Cloud provides different machine learning components to build and deploy ML models. AutoML Tables can be used for regression analysis and classification models, for example for customer lifetime value prediction. Audience segmentation models can be built using SQL skills and deployed using BigQuery ML. This allows you to execute ML workflows without leaving BigQuery with your data. The most sophisticated methods are served with the Vertex AI. It can be used for both, custom models or AutoML to deploy and test various models simultaneously. Furthermore, MLOps tools, like Vertex AI Pipelines, can be used to easily maintain the model with less complexity.

Insights and activation

Next, you can use your data for insights and/or activation. Whereas gathering insights is about measurement and evaluation of the performance of marketing campaigns based on pre-defined KPIs, activation is about executing a marketing activity against a specific customer segment. Google Cloud has components for both.

A dashboard can help gain access to customer data to spot important trends in user behavior to inform decision-making. It is a catch-all for insight aggregation. For visualization, you can use Data Studio, Looker, Tableau among others. BigQuery and DataStudio are complementary to each other and will allow the creation of on-demand dashboards and reports. Looker is an enterprise business intelligence platform, now owned by Google that incorporates among others the functions of alerting, A/B testing, cohort analysis, and many more. It allows interactive data exploration and activation paths back to the GMP with Actions for Ads through Customer Match. For the others, a third-party integration is provided.

When it comes to activation, this is how you can start outlining your approach:

  • Define your business case;
  • Determine and explore your available datasets;
  • Find the most appropriate analytic approach;
  • Choose the most impactful activation channel and begin activation.

You can build product-level pipelines into any system that can accept the identifier used. This includes Google Analytics where you can push the data back with API calls. Additionally, data can also be pushed to the ad tech tools in GMP with DV360 and SA360, as well as Google Ads, and to third-party vendors such as the Salesforce Marketing Cloud, Adobe Audience Manager, Facebook, etc. You can automate the activation pipeline to streamline the process.

Furthermore, sentiment analysis allows you to customize marketing messages to your customers' moods and use them with email marketing or on social media. This way, you can easily execute sentiment monitoring and targeting. This can help you suppress or differently engage frustrated or angry customers through a push to any system that can activate based on an email address through customer match.


Marketers’ needs have not changed, but the technology to put the customer in the center and focus on customer engagement has significantly improved with the use of data warehouses. For example, these help you understand how much customers will spend. They provide opportunities to predict the likelihood of repurchasing and support your efforts in acquiring customers to scale your business. Customer retention approaches with DWH help you identify the right offers. Furthermore, you can establish ways of engaging your customers into implicit and explicit ambassador activities to use their recommendations for further customer base growth. This means you need to be able to connect all the touchpoints of your customers today and use this intelligence to shape future experiences. The answers to marketers’ questions are hidden in the vast amount of data that you collect from your customers. However, this data is siloed and widely spread by being managed by individuals of separate teams. Marketers in particular have a hard time dealing with this fragmentation. You can build a deep understanding of your customers by centralizing the data in a marketing analytics data warehouse and leveraging Google’s AI. This helps solve all marketing needs of your business of predicting, acquiring, retaining, and growing your customer base to build a more complete view. Without the use of these services it becomes increasingly difficult to deliver the type of relevant experiences people expect today.

Using BigQuery as a marketing data warehouse becomes a self-service asset to marketing and customer teams to accelerate decisions they make about their campaigns, creatives, sales, and customer service and ultimately improves marketing and business performance.

Do you need more Info?

Contact us