Making Peace with [Not Provided]

It can be so painful, when reporting on your organic search traffic, to see those powerful words, "[not provided]" at the top of your keyword list. In October 2011, when Google decided to encrypt all keyword referral traffic from users signed into their Google accounts, they said it would only affect a very small percentage of users. Now, with the use of platforms such as Gmail and YouTube keeping users logged in all the time, the number of [not provided] visits has grown exponentially, as well as the headaches of all who work in online marketing.

For the true data-grubber such as myself, the pain of this loss can, at times, feel too big to ignore. What are we supposed to do with all of this lost data? The answer is, of course, Microsoft Excel.

Using Excel to Fill In the Blanks
Based on deductive reasoning, you can be fairly certain that the distribution of keyword percentages across known terms will be consistent among the [not provided] terms. So if, for example, 30% of traffic from your site is coming from "organic oranges" and 40% is coming from "grocery delivery", it is safe to assume that 30% of the [not provided] keywords are "organic oranges" and 40% is "grocery delivery". So here is how to accomplish this in Excel:

1. Pick a landing page.
In order to gain the greatest insight into your site's performance, pick an individual SEO landing page to analyze. Trying to navigate the keyword performance of a large number of pages can get messy very quickly, especially with a large site. So let's say you want to take a look at the keyword performance of a certain product page, because you have been targeting specific keywords for that page and want to see if those keywords are bringing you new traffic. In Google Analytics, go to Content -> Site Content -> Landing Pages and then click on the individual page you want to look at. (Make sure you have the Non-Paid Organic Search advanced filter activated.)

Then, add Keyword as a secondary dimension so you can look at all of the keywords that led people to that specific page.
keyword data

Also, make sure you do not filter out branded search terms in this report. That comes later. If you do it now, you will throw off your percentages because the [not provided] bucket is very likely including those branded terms.

2. Export and organize all your data (including branded searches.)
As you can see, 10.57% of keyword traffic is from [not provided]. To figure out what is hiding in this bucket, send the report to Excel.
Make sure you export all the data, as Google often will only show you only a small set at once. So you may have to export the data a few different times if you have over 500 keywords.
show rows
Compile all of the data into one spreadsheet, being careful not to accidentally lose any keywords in the process.

3. Distribute the percentage.
First, find the percentage of visitors for each keyword. For example, the first provided term has 4241 visits on that specific landing page. Insert a new column between Visits and Pages/Visit and plug in this equation.

[# of visits of keyword x] / [total # of provided visits] = [% of keyword x]
keyword equation
(Make sure to put the absolute reference ($) around the column for total provided visits, otherwise when you drag the equation down the rest of the column, the equation reference will get dragged down as well.)

And then format the column as a percentage and drag the equation down to the bottom.
equation format
So now we can see that the first provided term leads people to the site 33.11% of the time, the next 17.03% of the time and so on and so forth.

Now, based on the deductive reasoning explained above, if one keyword claims 33.11% of total provided keyword visits, then it is safe to assume it can claim 33.11% of not provided keyword visits. If we want to take this one step further, we could find out approximately how many times a keyword appears in [not provided]. Insert another column and type in this equation:

[% of keyword x] * [total # of not provided] = # of times keyword x is in not provided
not provided keyword

Again, add absolute reference around the [not provided] cell so it stays the same the entire way down the column:
not provided stats
Now you can see a highly educated guess as to how many times a specific keyword appears in the [not provided] bucket. When you up all of the keywords in column E, the total should be the same as the number of [not provided] keywords.

4. Distinguish branded and non-branded.
Of course, when working on search engine optimization, what you really want to know is how many people are discovering your site who didn't already know you. The best way to determine this is through looking at branded vs. non-branded terms.

Highlight the column with all of your keywords and set up conditional formatting so you can distinguish all of your branded terms from your non-branded terms.
new formatting title=
In the place of "x" put your branded term(s).

Sort the table by color, placing all of your branded terms at the top. Calculate the sum percentage of all of your branded terms. In this case, it is 73.7%. So 73.7% of 5819 (total number of not provided) is 4288 times a branded keyword is most likely to be found in the [not provided] bucket. Based on your goals for a specific landing page, you might want to see this number go or down and take actions accordingly to make that happen.

But, even more interesting is finding the most highly-used non-branded term. This is the keyword that is bringing people to your site who most likely hasn't heard of you before. If it is one you have tried to optimize the landing page for, then perfect. If not, you may need to re-work some elements on the page to get those targeted terms higher up on this list.

There is so much to learn from the data in this chart, such as the performance of branded vs. non-branded visitors, re-optimize pages around specific keywords, conversion optimization, and more. Even if the numbers are not exact, distributing keyword percentages across [not provided] can give you a much clearer picture into the performance of your site and help you take better actions towards its improvement.

Do you have any solutions for handling the (not provided) problem? Let me know by tweeting me @StephLynette or by leaving a comment below.

Search Engine Optimization Whitepaper

Share this post

Comments on this post

  1. Nice tutorial.

    I think it’s safe to use this method for common keywords in the distribution – branded terms, for example – but not ideal for long tail keywords.

    Like you said though, it’s an educated guess and better than (not provided)

  2. Preben Ormen said:

    I like your workaround for the [not provided] stats. Simple, yet I believe this will be close enough.

    I did not know the real reason behind the [not provided] label, so thanks for sharing.

    I can now parse my Google Analytics data a little differently – maybe I’ll learn something new after that, too.


Comment on this post


Ready to discuss your project? Work With Us
Blue Fountain Media is recognized as a
Top Digital Agency by
Ranked #1 Interactive
Agency by top interactive agencies example
Ranked Top 10 Digital
Agency by awwwards logo
Ranked Top 10 Agency
Worldwide by IMA logo
Our Locations
New York Headquarters
New York Headquarters
102 Madison Avenue - Second Floor
New York, NY 10016
Chicago Office
Chicago Office
222 Merchandise Mart Plaza, Suite 1212
Chicago, IL 60654
Seattle Office
Seattle Office
14980 NE 31st Way, Suite 120
Redmond, WA 98052
  • 102 Madison Ave, 2nd Floor
  • New York, NY 10016
  • 212.260.1978