Category Archives: Data Science

Advertising is not about revenue to the agency. It’s about revenue to the client. Data science is the road map that ensures the client-agency relationship works.

A Primer For Attribution Models Using Google Analytics

I’ve often read that attribution models are not a panacea for advertising budget allocation, nor a perfect science for programmatic predictions and deterministic models. None of the sources I have found that claim attribution isn’t a panacea provide methods for building attribution models with Google Analytics. Attribution models can be a vital part of analytics, yielding insights that refine budgets and improve campaign performance with mathematical precision. I’ve found that with attribution models the analyst can further identify where campaigns need to be improved, by source and by medium. Analysts can also identify where key changes are required to a campaign’s call-to-action, offer, copy and content, and UX. By fine-tuning the attribution models it is possible to determine the effect of an increased or decreased budget, by channel. Finally, when everything is correctly linked, coded, and set up, an analyst can provide the same level of prediction for combinations of online and offline advertising as for online advertising alone.

The weakness of attribution modeling, if there is one, isn’t in the modeling; it lies in the methodology applied to the modeling setup. Optimized performance cannot start with attribution models; we have to earn our way into it. An evolutionary process, if you will. When using Google Analytics (GA), the starting place for that evolution is data integrity (covered in a previous article) and consumer acquisition. From the acquisition opportunities provided in the GA data management API we can identify where the website traffic is coming from. Acquisition is primarily divided into channels, eight of them as recorded in the out-of-the-box version of GA.

  1. Display
  2. Email
  3. Organic
  4. Referral
  5. Paid Search
  6. Direct
  7. Social
  8. Other

Within the Administration (Admin) opportunities provided in GA, we can connect AdWords, DoubleClick, AdSense, YouTube, Search Console, and a few other Google API resources. These additional API connections reveal even more detail about the acquisition channels. This acquisition (‘A’) section of GA gives an analyst a summary understanding of where traffic and users are coming from, and provides the foundation for insight into why they are coming. The next section of the GA API opportunities reveals what those acquired users are doing on the website.


In this ‘B’ section of the GA API we can better determine what the users are doing and what they are engaging with, and gain insight into how well the acquisition of traffic relates to the design of the website. This combination, which we often call the user experience or UX for short, gives the analyst answers to two primary questions:
  1. How well does the website meet the user’s expectation?
  2. Are the acquisition methods delivering the right users?
Attribution Models using the ABCs of Google Analytics

Google Analytics API: ABCs

The basic, out-of-the-box GA opportunities help an analyst discover the entry pages, exit pages, the time a user spends on site, and other general, non-sophisticated click-tracking metrics. More advanced analysis requires a few additional settings, dependent on the client’s needs, that can be made within the admin console. Also, depending on the website tracking needs, we will use Google Tag Manager (GTM) in combination with GA. The result of using these more advanced techniques is that the analyst can define events and custom dimensions that are key to gaining deeper insight for answering questions 1 and 2.


The level of sophistication and exact measurement protocol, data integrity and governance, as well as the specific tags, scripts, variables, conversions/goals, funnels, etc. are provided through a tracking assessment guide. A site tracking assessment guide (STAG) is a document prepared by a highly skilled Google Analytics integration professional. With this understanding and guidance in hand, the next logical area the analyst requires knowledge of is conversions. Since the STAG is critical to the measurement strategy, it will be the topic of a coming article.


The ‘C’ section of the GA API covers the conversion process. Not just how many conversions were realized: the conversion path and the conversion flow are each provided to the analyst as well. The major difference between this section and the previous A and B sections is that we have to set up Goals through the administration console before GA can provide the basic, baked-right-in flows and paths. While setting up a goal may take only a few simple steps within the console, accurately identifying goal types, sequencing and funnel steps, and KPI strategy, and combining GTM, JavaScript, filters, and custom metrics, requires advanced analytics knowledge generally found only among highly educated and experienced data scientists and senior-level advertising data analysts.

Conversion analysis requires the analyst to apply dollar values to each of the goals. For an ecommerce website, and for websites that provide direct selling opportunities, setting a dollar value on the conversion is obvious. However, even websites that do not give users a direct purchasing opportunity, such as sites designed to generate leads or drive phone calls to a call center, require the analyst to set a dollar value. Without a value for a click, engagement, or goal on sites where the business has a long or non-direct buying cycle, such as medical, banking, insurance, recovery, real estate, etc., the analyst loses key insights into elements of the conversion analysis. In all verticals, and in every case where attribution models are desired, a dollar value for each goal is necessary.

Core Principles of Attribution Models

As we can readily see from progressing through the ABCs of GA, the deeper we drill down in the analysis, the more we discover that, in addition to digital and web analytics, we need to understand several key business principles. These core principles require an advanced understanding of:

  1. Statistical Analysis
  2. Statistical Modeling
  3. Advertising
  4. Sales
  5. Communications
  6. Marketing
  7. Business Intelligence


At the bottom of the C-section of the GA API we find attribution modeling. Here we discover the pinnacle of what GA can provide the analyst. Once the attribution models have been developed and are providing the richest possible insights into revenue and cost cutting, we move forward with other, more sophisticated analysis platforms such as Analytics Canvas, BigQuery, SAS, Minitab, etc. But before we do, let’s back up a step or two and take a closer look at preparations for attribution models.

The admin console of the GA API

GA API Admin Console with Opportunities

Before defining the attribution models that are best for a client’s business, there are many prerequisites to consider. These include determining the website structure, its taxonomy, and its content, which requires a thorough understanding of the target consumer, the stages of the buying journey, and the content needed to meet the consumer at each stage of that journey. Furthermore, we must understand what causes the consumer to buy, what choices determine which competitor the consumer will buy from, and which consumer segments are right for the client and which are not.

Once defined, these segments lead the analyst to the advertising strategy and channels that reach them. The website, landing page content, and engagement that each targeted segment expects are then prepared. Knowing how to communicate with each segment, developing the communication, and providing the exact next steps for the consumer brings us to a point in the advertising process where we need metrics. We have to determine the key performance indicators for measuring the advertising segment, content, and engagement process. When we are able to measure the channels, landing pages, and events, both individually and in combination, we can use the insights gained to optimize the dollars gained against the dollars invested, that is, to maximize the return on ad spend (ROAS).

Key Metrics and Segments

The attribution model at this stage of the advertising implementation requires the analysts to provide two metrics, and additionally the analyst will need to define campaign segments in the GA API.

  1. The cost for each medium
  2. The revenue from each medium
  3. Campaign specific segments

The C-section in the GA API requires that the analyst define and provide the values for each of the two metrics. We provide the values from the Administration console. Additionally, the analyst will need GTM to identify custom dimensions for user ID, session ID, and timestamp. GTM is also necessary so that we can apply advanced techniques, defining data-layer variables and JavaScript that enhance the tracking of the consumer through the site visit, engagement, and each defined conversion step.
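
As a rough illustration of why both metrics are needed, the sketch below computes return on ad spend (ROAS) per medium from cost and revenue. The medium names and dollar figures are hypothetical, and the `roas_by_medium` helper is not part of GA; it only shows the arithmetic an analyst would expect the attribution reports to support.

```python
def roas_by_medium(cost, revenue):
    """Return revenue divided by cost for each medium that has a nonzero cost."""
    return {
        medium: revenue.get(medium, 0.0) / spend
        for medium, spend in cost.items()
        if spend > 0
    }


# Hypothetical monthly figures, in dollars (not taken from any real account).
cost = {"cpc": 4000.0, "display": 2500.0, "email": 500.0}
revenue = {"cpc": 14000.0, "display": 3750.0, "email": 4000.0}

roas = roas_by_medium(cost, revenue)

# List the mediums from highest to lowest return.
for medium, value in sorted(roas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{medium}: {value:.2f}")
```

A medium with no recorded cost is skipped rather than reported as an infinite return, which is one reason cost tracking has to be in place before the comparison means anything.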

It’s Probably Worthwhile

At this point you may be thinking attribution modeling is too sophisticated, too time-consuming, and too expensive. And yes, I will agree that at this level of sophistication the average web or digital analyst probably has neither the time nor the skill to build attribution models. Consequently they are not going to provide the analysis and insights. However, the benefits are tremendous. Consider especially the value of answers to these questions:
  1. Which channel is providing the highest ROAS (return on ad spend)?
  2. Which consumer segments are key to the acquisition process?
  3. Which content is failing to move the segment to the next steps?
  4. What would happen if we increased spending in TV, radio, and online channels and mediums?
  5. What are we losing with the current strategy, and what can we gain if it changes?
  6. What is the impact of pricing changes, coupons, and upsells?


We have to!

We need this.

Evolution of Attribution Steps

The steps in the scientific evolution from the ABCs of GA to the attribution modeling:

  1. Data Integrity
  2. Journey Mapping
  3. Site Taxonomy
  4. Content
  5. Site Tracking
  6. Revenue Tracking
  7. Campaign Tracking
  8. Cost Tracking
  9. Statistical Modeling
  10. Predictive Modeling

Each step in the evolution demands higher levels of education, experience, and commitment. Unfortunately, there are very few people with these skills, experiences, and knowledge. Individuals with certifications in Google Analytics are not educated and skilled in these advanced techniques. The GA certification does little more than recognize an individual who has demonstrated an understanding of the opportunities available in the GA API. It is not a certification of proven know-how in providing analysis and insights, and certainly not a certification in analytical modeling. Many of the Google Analytics Certified Partner (GACP) companies are capable of providing integrity audits and site tracking guidelines. Most GACPs are skilled with GTM, data-layer variables, and advanced segments through custom JavaScript and metrics. Even among Google Authorized Resellers (companies that are authorized to sell GA Premium and GA 360), levels of expertise vary widely in some or most of the skills attribution models require.


In conclusion, if you decide to learn and apply attribution models yourself, you will have to take on the cutting-edge definitions and creativity in the evolution. If you are going to hire an agency to build the models for you, you will have to get comfortable with a cycle of trial, test, change, and repeat to create the right models for your business. At a minimum, you will need to hire an agency with a skilled statistician who understands the core principles listed above and can keep every step of the evolution focused and on track.

The good news, the silver lining to the emerging technology, is that when you follow the 10 steps of the process (Data Integrity, Journey Mapping, and so on) and allow the process to evolve naturally, earning your way along through each step, the panacea for optimizing the advertising ROI can be realized. At the very least you will have realized higher returns, and you will have the right data for moving forward to big data applications such as Analytics Canvas, BigQuery, SAS, or Minitab.

Filed under Data Science

True Conversion Rate For Content Value

The real value of a client’s site taxonomy and webpage content is not truly available from Google Analytics when viewing the provided conversion rate. The conversion rates in the Goals Overview data cubes and built-in tables do not reflect the user experience of your target market. This raises three major concerns for the analysis of advertising value, budgets, and page performance.
  1. Those conversion rate percentages are calculated using sessions as the denominator.
  2. Those conversion rate percentages are calculated using bounced and non-bounced sessions in the denominator.
  3. Effective and actionable analysis requires a true conversion rate.
Sessions are not the best denominator to use for goals, especially when the stakeholders are evaluating advertising budgets based on goal completions by channel, source, and medium. When your digital advertising team is investing a lot of valuable effort in identifying key demographics and in programmatic media buys based on user interest groups and user performance, there is even more reason to stay away from the Google Analytics built-in reports and metrics. The solution is to use Calculated Metrics. Using a calculated metric, available in the Admin section of Google Analytics, you’ll need to define the conversion rate based on Entrances and on Non-bounced Users. While the Entrances metric is still not a one-to-one count of the individuals visiting the website, conversion rates based on Entrances are typically higher than conversion rates based on sessions or users. Why Entrances rather than users or sessions is best explained in the Google Analytics support article on bounce rate. That article defines a bounce as a visitor who exits the entrance page without visiting any other page of the site. Even if the visitor bookmarks the landing page and returns 10 times but does not engage with the conversion, or whatever the call-to-action may be, each return is a bounce.

The Non-Bounced Users metric provides a view of the conversion rate based on visitors who are engaged with the client’s website and content. I call this metric a True Conversion Rate (TCR). The onus for increasing the number of these engaged visitors is on the advertising team. By developing the TCR you are giving the stakeholders and the advertising team analysis that provides evidence of the landing page’s and the website’s ability to convert when the right target audience clicks through from the advertising. When you combine the TCR with audience segments and behavior segments, and drill down by source/medium in your analysis, many key actionable insights unfold.

  1. Does the landing page convert when visited by the target audience?
  2. Is the call-to-action communicating effectively?
  3. Is the landing page the right message for this audience?
  4. Are you targeting the right audience?
  5. Which programmatic logic is working, which isn’t?
  6. How different is the conversion rate: more than 300%, 500%, etc.?
  7. What can organic engineering learn from the paid audience landers?
There are dozens more insight-probing questions that could be added to the list, and depending on the client vertical and offers you can always find significant impact using TCR. Here’s the how-to guide.

Log in to the View level of the client’s Google Analytics Admin tab. From here click on the calculated metric opportunity. 
Calculated metrics for true conversion rate

The View Level of Admin is where to find calculated metrics

Of course we’ll need to click the New Calculated Metric button.


  1. Name the metric.
  2. Leave the second field as is; the system completes it using its own naming convention.
  3. There are five formats to choose from. In this case we are creating a percentage.
  4. Write the formula.
Four steps to create the metric

Make absolutely certain to set the formatting to be a percentage. The data management system will add the variables as you begin to type in the formula box. Use the goal completions in the numerator and make sure to use parentheses to surround the denominator. Remember the order of operations, PEMDAS?

As shown in the image above I’ve set the denominator for the true conversion rate to subtract Bounces from the total number of Entrances.
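
The same formula can be checked outside of GA. This is a minimal sketch of the arithmetic only: goal completions divided by Entrances minus Bounces. The function name and the sample counts are hypothetical and are not taken from the client report.

```python
def true_conversion_rate(goal_completions, entrances, bounces):
    """Goal completions divided by non-bounced entrances, as a fraction."""
    engaged = entrances - bounces  # the parenthesized denominator
    if engaged <= 0:
        return 0.0  # no engaged entrances, so no meaningful rate
    return goal_completions / engaged


# Hypothetical counts for one goal over one reporting period.
completions, entrances, bounces = 50, 4000, 3000

tcr = true_conversion_rate(completions, entrances, bounces)
print(f"True Conversion Rate: {tcr:.2%}")
```

Without the parentheses the expression would divide completions by Entrances first and then subtract Bounces, which is why the order of operations matters in the formula box.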

Now that you have the TCR defined, you will need to create a custom report to use it in analysis. I typically create a single custom report and add a new tab for each goal that I have a TCR for. If you’ve never created a custom report before… shame! But you’ll see that it is a fairly simple process and far more powerful for discovering insights.

Here’s my system for creating a True Conversion Rate report.

The first step is to navigate over to the Customization section in the client’s Google Analytics Admin. And, of course, create a new custom report.
Creating a custom report for true conversion rate analysis.

Then you’ll need to build the report to include the metrics, dimensions, and any filters you require.

  1. Name the report, True Conversion Rate
  2. Add the Tabs, one for each calculated TCR
  3. Set the Metric. First metric is the TCR and the second metric is the conversion rate for the same goal. This gives you a side-by-side comparison of the two conversion rates for the same goal.
  4. The dimensions are a drill down function where the first position is the default and each dimension below it provides you with a deeper view. In my case I like to see the TCR by source. When I click on the source I want to next see the breakdown by medium. Drilling down again I’ll click a medium to see the landing page. Finally I can drill down on a landing page to discover the city the user was in when they completed the conversion engagement.
Here’s an example of the custom report.

  1. Select the tab you want to analyse the TCR for.
  2. Compare the standard Google Analytics CR to the calculated metric for TCR
In this case the goal 9 conversion rate as reported in the built-in Google reports is 1.26%. When the bounces are removed from the Entrances, the True Conversion Rate is 5.27%. That’s roughly 4.2 times the reported rate: 418% of it, or about 318% higher. From here, when using a custom report, you can add custom segments just like you can in all GA reports, and you can drill down to each dimension that was set when you created this custom report.
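
Comparisons between two rates are easy to misstate, so here is the arithmetic for the two goal 9 figures quoted above, as a small sketch. Only the two percentages come from the report; the variable names are illustrative.

```python
standard_cr = 1.26  # percent, built-in GA conversion rate for goal 9
tcr = 5.27          # percent, calculated True Conversion Rate for goal 9

# How many times larger the TCR is than the built-in rate.
ratio = tcr / standard_cr

# Relative increase of the TCR over the built-in rate, in percent.
percent_higher = (tcr - standard_cr) / standard_cr * 100

print(f"TCR is {ratio:.2f}x the built-in rate")
print(f"TCR is {percent_higher:.0f}% higher than the built-in rate")
```

"Percent of" and "percent higher" differ by exactly 100 points, so it is worth stating which one a report uses.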

One last idea on the dimensions to drill down on is to add a Device Category option. There you can see what device is used more frequently for conversions.


In this case, from the image above, I’m looking at phone calls to a call center from a website. The custom report shows there are more conversions from desktop devices than from mobile devices.

Interesting… and true conversion rate is 312% higher.

Because Google still provides the analyst with no metric for the actual number of visitors to the website or the webpage, we have to innovate and define one as closely as possible. Entrances are a far better basis for calculating conversion rate than sessions, but clearly we cannot provide the client with 100% data integrity. Using Entrances in this calculation of true conversion rate is a giant step closer to mastering optimization.

Analytical Insights for Virtual Page Views

The more I use a Tag Management System (TMS) for tracking and defining analytical insights from digital advertising results, the more I value the application. This growing love affair is enhanced even more for me when combined with the use of my data management tools. Google Analytics and SiteCatalyst can take you a long way in simple website analysis, but when you want to drill down to discover how to make more from what your digital advertising is doing, or to discover what you’re missing, you need better data management tools. In this case study I have a local real estate client with a website that provides users with information about select neighborhoods in the nation’s 11th largest city. The website developer used virtual pages for each information article in each neighborhood. There are five virtual pages in each neighborhood, and each of them has the same trigger/name. That makes analytical insights a little easier for tracking events, but I cannot rely on page views in Google Analytics (GA) or SiteCatalyst as the KPI. Instead I’ll have to write custom JavaScript that calculates the depth of a page and then tracks the scrolling depth to trigger the virtual page views.

After creating the script and installing all of the tracking through the tag management system, I can use Google Analytics and the Real Time menu option to view the events. This view provides at-a-glance analytical insights into what users are currently up to on the site.
Real time event tracking

I’m trying to develop insights for virtual page views to help sales convince advertisers to place their ads on this real estate website. A key to the insights of the website is in the Neighborhoods section. I want advertisers to place their ads at the top of these pages for a premium rate, mid-page for a little less of a premium, and at the bottom for the lowest advertising cost. Site visitors scroll the neighborhood pages to read relevant information about the specific neighborhood they are interested in. If I drill down on the GA Real Time events I can see exactly what information (virtual page articles) users have been reading over the last 30 minutes.

Virtual pages with real time tracking

The event labels tell the story about the user’s scrolling and that translates to how far down the page they scroll. As you may be able to surmise from looking at the table and graph above, the further down the page an article is, the fewer views it received.

For deeper analytical insights I want to discover more than just the obvious, “what virtual pages were viewed; how far did the user scroll down the page.” Selling advertisers on the value of their ad position should be easy enough with just simple scroll tracking, but with this custom scroll tracking script I can discover more than simple virtual-page-view metrics.

What we can know, and learn from the user’s scrolling behavior:

  • Does scroll depth correlate to conversion rate?
  • How much time was spent scrolling from section to section?
  • If a user clicks to the neighborhood landing page what percentage of the page did they view?
  • What percentage of users click the call-to-action at the bottom of the page?
  • What testing can be done to increase scrolling depth?

Using my data management tools, I can build a custom analysis table for discovering insights into each of these questions. Once I build the custom analysis data with the KPI collectors in the columns, it’s a simple matter of filtering the landing page for ‘/neighborhoods’ so the view data is precisely at the granular level I’m analyzing.

Custom analysis scroll tracking table

Developing the virtual page view tracking.

Within the script I have the opportunity to define the scroll depth by percentages. For this example there are five segments to the page (options.percentage), so I’m tracking by quintile: 20%, 40%, 60%, 80%, and 100%. In the script I will set the page depth to identify the first virtual page read at 20%. Here’s what I have for tracking triggers:

Any landing page in the ‘/neighborhoods/’ folder of the website will fire the scroll-tracking listener. The virtual page names in the neighborhoods are:

Housing = Fires at 20% page view

The Market = Fires at 40% page view

Living Here = Fires at 60% page view

Things to Do = Fires at 80% page view

Stats and Facts = Fires at 100% page view
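
The threshold logic behind these triggers can be sketched as follows. It is written in Python purely to illustrate the calculation; the actual tracking script is JavaScript running in the browser and is not reproduced here. The function names, the way depth is measured (bottom edge of the viewport against total page height), and the pixel values in the example are all assumptions.

```python
# Virtual page names keyed by the scroll percentage at which each one fires.
VIRTUAL_PAGES = {
    20: "Housing",
    40: "The Market",
    60: "Living Here",
    80: "Things to Do",
    100: "Stats and Facts",
}


def percent_scrolled(scroll_top, viewport_height, page_height):
    """Percentage of the page above the bottom edge of the viewport, capped at 100."""
    if page_height <= 0:
        return 0.0
    return min(100.0, (scroll_top + viewport_height) / page_height * 100)


def pages_viewed(percent):
    """Names of every virtual page whose threshold has been reached, top to bottom."""
    return [name for mark, name in sorted(VIRTUAL_PAGES.items()) if percent >= mark]


# Hypothetical visit: a 5000px page, an 800px viewport, scrolled down 2300px.
depth = percent_scrolled(2300, 800, 5000)
print(pages_viewed(depth))
```

Each name would be sent once per page load, the first time its threshold is crossed, so a visitor who scrolls up and down does not inflate the counts.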

To build these tracking events into the TMS I need to define four custom variables as shown below.

User defined custom variables

Each of these four variables is a Data Layer Variable, and you will follow these five steps to create each one.

  1. Name the Variable (eventAction, eventCategory, eventLable, and eventValue)
  2. Select the Data Layer as the variable type.
  3. Define the Data Layer Variable Name (eventAction, eventCategory, eventLable, and eventValue)
  4. Select Version 2 for the version.
  5. Save the variable

Creating custom variables

Once the custom variables have been created, it’s time to set the triggers for the scroll tracking. For this real estate case study I only want to track page depth, so I’m going to create two triggers. First, I’ll set the Scroll Tracker to fire on any page in the ‘/neighborhoods’ folder of the website. Second, I will exclude the pixel depth tracking that is defined in the custom script, which I’ll get to in just a few minutes.

Create a Page View trigger:

  1. Name the trigger, Scroll Tracker
  2. Set the trigger type to ‘Page View.’
  3. Define the trigger to fire on a Page URL that contains the web site folder.
  4. Save the trigger

Create the pixel depth filter:

  1. Name the trigger, “Scroll Distance.”
  2. Set this to a ‘Custom Event’ trigger
  3. Fire on the ScrollDistance event name
  4. Set the eventAction to not contain Pixel Depth
  5. Save the trigger

Filtering the events
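
In effect, the second trigger is a filter on the events pushed to the data layer. The sketch below mimics that rule in Python for illustration only; GTM evaluates the condition itself, and the sample events, including their category and action values, are hypothetical.

```python
def scroll_distance_trigger_fires(event):
    """True for a ScrollDistance event whose action is not a pixel-depth reading."""
    return (
        event.get("event") == "ScrollDistance"
        and "Pixel Depth" not in event.get("eventAction", "")
    )


# Hypothetical data layer pushes, in the order they might arrive.
events = [
    {"event": "ScrollDistance", "eventCategory": "Scroll Depth", "eventAction": "Percentage"},
    {"event": "ScrollDistance", "eventCategory": "Scroll Depth", "eventAction": "Pixel Depth"},
    {"event": "gtm.js"},
]

fired = [e for e in events if scroll_distance_trigger_fires(e)]
print(len(fired), "of", len(events), "events would fire the tag")
```

Only events that pass this filter reach the GA tag, so the pixel-depth readings never show up as events in the reports.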

The only thing left to do is create the tracking Tags. I want to track these events in Google Analytics (GA), so I will create one Tag to send the custom variable data to GA. I will set the calculations and data layer information in a custom HTML tag with JavaScript.

Setting the GA Scroll Depth tracking:

  1. Name the GA Tag “Scroll Depth”
  2. Choose Universal Analytics as the Tag Type
  3. Select track type to ‘Event’
  4. Set the category, Action, Label, and Value to the custom variables created earlier.
  5. Set the Fire On to the “Scroll Distance” trigger created earlier
  6. Save the Tag

Send the data layer to Google Analytics

All that’s needed now is the Tag containing the custom JavaScript.

You can get the scroll tracking script here.

Create the Tag for a custom HTML and paste the script.

Paste the custom script to an HTML tag

  1. Name the Tag “Scroll Tracker”
  2. Select the Custom HTML product
  3. Paste the scroll tracker script into the html
  4. Select the Scroll Tracker trigger created earlier
  5. Save the TAG
