The analysts amongst us have all been there: we’re working in Google Analytics trying to decipher the latest data set and need to segment a little further. Then we hit the GA sampling wall and suddenly start questioning the accuracy of the data. Of course, if you have GA Premium at that point (because you have a spare $150,000 kicking around), you’re in luck and can simply request an unsampled report.
However, let’s assume you don’t. What is your other choice? Yes, you could simply base your analysis on the sampled data and caveat the insights it generates, but I am always reluctant to do so myself. In this case we have one option open to us: data partitioning. This involves taking chunks of data small enough to avoid sampling and stitching those extracts together. Let’s assume we want a month’s worth of data but can only get unsampled data by pulling 30 single days’ worth. Of course this is laborious to do by hand, but thankfully there is an easy way to do just that.
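
To make the partitioning idea concrete, here is a minimal Python sketch of splitting a month into single-day date ranges, one per query. The function name `daily_partitions` and the example dates are my own, purely for illustration; this is not how the plugin itself is implemented.

```python
from datetime import date, timedelta

def daily_partitions(start, end):
    """Return one (start, end) pair per day, covering start..end inclusive."""
    partitions = []
    current = start
    while current <= end:
        # Each partition starts and ends on the same day.
        partitions.append((current, current))
        current += timedelta(days=1)
    return partitions

# Example month (assumed dates): June has 30 days, so we get 30 queries.
partitions = daily_partitions(date(2015, 6, 1), date(2015, 6, 30))
print(len(partitions))
```

Each pair would become the date range of one small query, and the results are stitched back together afterwards.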
Enter the Excel plugin, Analytics Edge. The plugin is available in a number of forms, ranging from free to the more advanced (and paid-for) “core” add-in. It also offers extra connectors for other services, so you can merge in data from sites such as Facebook. However, for the purposes of getting round the sampling we can simply install the free version. (Note: if you have any problems using the plugin after install, you may also need to install the “.Net 4.5 Framework”, which is available from Microsoft’s site; links at the bottom of the page.) Once you have downloaded and installed the plugin and restarted Excel, you will see a new option in Excel.
Using Analytics Edge we can pull data from Google Analytics via the API and instruct it to pull partitioned data (using smaller chunks to bypass sampling). First, make sure you add your Google Analytics login details to the plugin; you can add accounts under “Free Google Analytics”, “Accounts”. Once logged in, simply click on “Free Google Analytics” and then “Analytics Reporting”. We can now select the view we want to extract data from. Under the “Segments” tab we can choose any segment we wish to apply to our data (e.g. only show mobile data).
Then we move onto one of the crucial parts, the “Fields” tab. To extract our data without sampling we need to include a date dimension. This can be month, week or date, depending on how much sampling you are seeing. If traffic to your site is high enough that week-level queries are still sampled, take it down to date to receive daily extracts (e.g. 30 separate queries for a month’s worth of data).
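
The "start coarse, go finer if still sampled" logic can be sketched as below. Everything here is an assumption for illustration: `run_query` is a hypothetical callable that returns a dict with a `sampled` flag, `fake_query` is a stand-in that pretends any range longer than 7 days is sampled, and fixed 31-day and 7-day chunks only approximate the calendar month and week dimensions.

```python
from datetime import date, timedelta

def split_range(start, end, chunk_days):
    """Split start..end (inclusive) into consecutive chunks of at most chunk_days."""
    chunks = []
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=chunk_days - 1), end)
        chunks.append((current, chunk_end))
        current = chunk_end + timedelta(days=1)
    return chunks

def coarsest_unsampled(start, end, run_query, chunk_sizes=(31, 7, 1)):
    """Return the first (largest) chunk size whose queries all come back unsampled."""
    for size in chunk_sizes:
        chunks = split_range(start, end, size)
        if all(not run_query(s, e)["sampled"] for s, e in chunks):
            return size
    return None  # still sampled even at the smallest chunk size

def fake_query(start, end):
    """Hypothetical stand-in: treat any range longer than 7 days as sampled."""
    days = (end - start).days + 1
    return {"sampled": days > 7}

chosen = coarsest_unsampled(date(2015, 6, 1), date(2015, 6, 30), fake_query)
print(chosen)
```

With this stand-in, the month-sized chunk is sampled and the week-sized chunks are not, so the sketch settles on 7-day chunks. On a busier site the same loop would drop down to single days.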
Then select the other dimensions and metrics you need for your report. In this example we have simply chosen page as the dimension, with page views and unique page views as the metrics. Under the filters tab we can add any specific filters we want based on the dimensions we are pulling (e.g. only extract a particular page’s statistics). This leads on to the date tab, where we choose the specific date range we wish to extract; last 30 days or last calendar month would be sufficient for our example. The sort/count tab allows us to order the extract by any of the dimensions or metrics we chose (e.g. alphabetically by page or numerically by page views).
The next crucial step is the options tab, with its tick boxes for warning about sampled results and minimising sampling. Tick both boxes and click “Finish”. The plugin will then perform the queries needed to extract the data without sampling (provided your date dimension is granular enough to do so). You’ll now see the status of the extraction process (it can take a while, so go and grab a coffee if it’s taking some time). If all went well you should have the data you require and can begin analysing. We’ll cover some techniques for easily merging this data into one set later, along with some of the caveats around unique data, but to all intents and purposes you have just found the solution to data sampling in Google Analytics. Hoorah!
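
As a rough preview of the merging step, here is a minimal Python sketch that adds up per-day rows into one total per page. The rows and the field names (`page`, `pageviews`, `unique_pageviews`) are made up for illustration. The sketch simply sums both metrics and makes no attempt to deduplicate across days, so the unique total is subject to the caveats around unique data mentioned above.

```python
from collections import defaultdict

def merge_extracts(extracts):
    """Combine per-day rows into one total per page by summing each metric."""
    totals = defaultdict(lambda: {"pageviews": 0, "unique_pageviews": 0})
    for rows in extracts:
        for row in rows:
            page_totals = totals[row["page"]]
            page_totals["pageviews"] += row["pageviews"]
            # Plain sum across days: no deduplication is attempted here.
            page_totals["unique_pageviews"] += row["unique_pageviews"]
    return dict(totals)

# Made-up rows for two days, purely for illustration.
day_1 = [
    {"page": "/home", "pageviews": 120, "unique_pageviews": 90},
    {"page": "/about", "pageviews": 30, "unique_pageviews": 25},
]
day_2 = [
    {"page": "/home", "pageviews": 100, "unique_pageviews": 80},
]

merged = merge_extracts([day_1, day_2])
print(merged["/home"])
```

Pages that appear on only some days (like `/about` here) are carried through with whatever totals they have.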