Analytics Query Language (GAQL)

The ShufflePoint Analytics Query Language (GAQL) is used to create Excel® and PowerPoint® documents with refreshable Google™ Analytics data. It is also the basis of ShufflePoint's Google Analytics with Google Gadgets integration.

GAQL is in the genre of the OLAP query language MDX. GAQL, like MDX, allows a query to specify both the data to return and the “geometry” of the result by assigning dimensions and metrics to axes. This is implemented in GAQL using the same “ON ROWS” and “ON COLUMS” as in MDX.

Related Quick Start Guides

Basic Syntax

The basic syntax of GAQL is:

SELECT

<metrics or dimensions> ON COLUMNS <top/by>

<metrics or dimensions> ON ROWS <top/by>

FROM

<profile list>

WHERE

<optional metrics list>

<optional timeframe>

<optional filters>

This query will return the tabular result shown below:

SELECT

METRICS ga:visits, ga:avgPageviews, ga:bounceRate ON COLUMNS

DIMENSIONS ga:continent ON ROWS BY ga:visits

FROM

default

WHERE

TIMEFRAME yearToDate

Continent Visits Pages/Visit Bounce Rate
Americas 143 2 60.14%
Europe 93 2 67.74%
Asia 27 2 44.44%
Oceania 11 2 63.64%
Africa 1 2 .00%

Keywords

The following are reserved keywords in GAQL:

SELECT, COLUMNS, ON, ROWS, FROM, FORMAT, WHERE, DIMENSIONS, METRICS, TIMEFRAME, FILTER, BY, ASC, DESC

GAQL keywords are not case sensitive

Axis Expressions

Expressions with “ON ROWS” and “ON ROWS” are axis expressions. They are used to specify which dimensions or metrics should become the column and row headings of the result.

Things to note about axis expressions

  1. A given axis may contain a list of metrics or a single dimension
  2. Names within a list are separated by commas
  3. The metric and dimension names must have a “ga:” prefix (in the case of Google names) or an “sp:” prefix (in the case of ShufflePoint names)
  4. A given axis can have either metrics or a dimension
  5. Currently, only one dimension is allowed on a given axis. This limitation will be relaxed at some point.

The dimensions and metrics available witin GAQL are based on the ones supported by Google Analytics API. These are documented on this Google reference page .

Timeframe Expressions

A GAQL query must contain a timeframe. The timeframe can be specified as a named timeframe or by specifying a start and end date.

A timeframe specified with start and end dates takes the following form.

SELECT ...

WHERE

TIMEFRAME 2009-03-10:2009-04-10

Like profiles, “DEFAULT” can be specified to say “use the default I have specified for this report or session”. The named timeframes which GAQL recognizes are listed in the table below.

Timeframe nameMeaning
TODAYdata for the current day
YESTERDAYYesterday
YESTERDAY2Two days ago
LASTWEEKLast week
LASTWEEK2Two weeks ago
LAST10DAYSLast 10 days
LAST30DAYSLast 30 days
LAST60DAYSLast 60 days
THISMONTHThis month
LASTMONTHLast month
LASTMONTH2Two months ago
YEARTODATEYear to date
LASTYEARLast 12 months

From Expressions

The keyword “FROM” is followed by either the keyword “DEFAULT” or by a list of profiles. “DEFAULT” says “use the default I have specified for this report or session”. Defaults are a powerful feature in ShufflePoint that allow queryies and containing reports to easily be reused.

Profiles may be specified in several formats. The shortest format is to provide the profile ID, as in:

SELECT ...

FROM 43594

This means that the query will get data for the profile with the id “43594”. A profile name may also be specified by name. Named profiles must be wrapped by double quote or square brackets:

SELECT ...

FROM [www.shufflepoint.com]

Since profile names in Google Analytics are NOT guaranteed to be unique, the GAQL engine will use the first profile in your account which matches the provided name. If you want to be certain that you are using the correct profile, but also wish to see the name in the query, then use the format:

SELECT ...

FROM [43594].[www.shufflepoint.com]

Dimensions on each axis (pivots)

You can place a dimension the columns axis and a different dimension on the rows axis. In database and Excex speak, this is referred to as a pivot. Pivoting will also soon be available in the Google Analytics portal (but we had it first!). When you have dimensions on both axes, you need to specify a metric in the WHERE clause.

An good example would be to ask for unique visitors by continent for the last ten days:

SELECT

DIMENSIONS ga:continent ON COLUMNS

DIMENSIONS ga:date ON ROWS

FROM

default

WHERE

METRICS ga:uniquePageviews

TIMEFRAME last10days

Note that currently GAQL only supports a single metric in the WHERE clause.

Two row axis dimensions

GAQL queries in Gadget data feeds and in Excel Web Queries (interactive and IQY) support having more than one row axis dimension. In our next release cycle we will be adding support for this to the query tool.

SELECT

METRICS ga:visits ON COLUMNS

DIMENSIONS ga:pagePath, ga:pageTitle ON ROWS

FROM

[www.shufflepoint.com]

WHERE

TIMEFRAME default

Sorting and TOP

When the ROWS axis contains a dimension, you can sort the rows in the result table by a metric by using a sort expression on the row axis. The full syntax for such an axis is:

<metrics list> ON COLUMNS

<dimension> ON ROWS TOP <n> BY <metric> [ASC|DESC]

This query will return the tabular result shown below:

SELECT

METRICS ga:visits, ga:avgPageviews, ga:bounceRate ON COLUMNS

DIMENSIONS ga:country ON ROWS TOP 10 BY ga:visits DESC

FROM

default

WHERE

TIMEFRAME yearToDate

Country/TerritoryVisitsPages/VisitBounce Rate
United States2,275439.87%
United Kingdom297350.84%
Netherlands191342.93%
France180350.56%
India170255.29%
Spain156253.85%
Canada149357.72%
Japan144259.72%
Germany142341.55%
Brazil141351.77%

Node that GAQL does not yet support sorting on calculated metrics.

Filter Expressions

The Google Analytics API supports filter expressions which act to limit the data which is returned - either by metric value thresholds or by dimension member values. GAQL passes specified filter expressions directly to Google, so they are not documented here. See this Google reference page for details on the supported filters.

An example filter expressed in a GAQL query is

SELECT ...

WHERE

FILTER ga:pageviews > 10 AND ga:browser != "Safari"

Note that the right side of the filter expression must be placed within double quotes or square brackets.

SELECT ...

WHERE

FILTER ga:country == "United States"

Format Function

The format function uses a format string and one or more metric parameters.

SELECT ...

FROM

default

WHERE

METRICS FORMAT("{0:d} visits so far this year",ga:visits)

TIMEFRAME yearToDate

The format function can also take a dimension name as a third argument. If present, the second placeholder in the format expression will get the value of the count of the returned dimension members. So the following GAQL query would return a single cell with the string “5124 visits from 96 countries so far this year.”

SELECT

FROM

default

WHERE

METRICS FORMAT("{0:d} visits from {1:d} countries so far this year",ga:visits,ga:country )

TIMEFRAME yearToDate

Note that the format() function currently only supports the generation of a single text string result in a WHERE clause.

String literals in metric lists

Lists of metrics in an axis expression may contain string literals. An example of such a list is shown in the following query:

SELECT

METRICS ga:visits, "visits to the site, with an average of", ga:avgPageviews, "pageviews" ON COLUMNS

FROM

default

WHERE

TIMEFRAME lastWeek

String literals are very useful in the context of populating PowerPoint text placeholders because each cell in the GAQL result table will fill the formatted runs you define in ShufflePoint Studio. So the above query might be styled as:

56 visits to the site, with an average of 3 pageviews

Formatting Metrics

The default display formatting of a metric is specified by the Google API or by ShufflePoint if it is a calculated metric. For example, the display format for ga:bounceRate is 0.00%. One of the features of GAQL is the ability to specify a format override within the query. An example column expression without an override is:

METRICS ga:visits, ga:bounceRate ON COLUMNS

To format the bounce rate with no decimals, one would change it to:

METRICS ga:visits, ga:bounceRate:{0%} ON COLUMNS

Special percent formatting

If a count metric is formatted as a percent, then ShufflePoint will interpret this as a request to display the percent of total contributed by the metric for each dimension member.

SELECT

METRICS ga:visits, ga:visits:{%} on columns

DIMENSIONS ga:visitorType ON ROWS

FROM

default

WHERE

TIMEFRAME default

Visitor Type Visits % Visits
New Visitor 1453 .867981
Returning Visitor 221 .132019

Dimension Count

If a dimension is placed in a METRICS expression, then the dimension is converted to a metric of "count of dimension members". For example, the following query returns a value of 44 for languages. This is a count of the number of different languages of the visitors for the designated timeframe.

SELECT

METRICS ga:visits, ga:language ON COLUMNS

FROM

default

WHERE

TIMEFRAME default

Profile Visits Language
70442 1674 44

Shifting time periods

When a time period dimension such as ga:month or ga:week is sent to the Google API, the result set is always ordered by calendar and beginning at first day of the year. So if in June, 2009 you query:

SELECT

METRICS ga:visitors ON COLUMNS

DIMENSIONS ga:month ON ROWS

FROM

default

WHERE

TIMEFRAME lastYear

You will find that your months are ordered like so:

Month Visits
Jan 2009 1656
Feb 2009 1871
March 2009 2083
April 2009 1721
May 2009 1505
June 2008 1593
July 2008 1398
Aug 2008 1428
Sept 2008 1410
Oct 2008 1550
Nov 2008 1708
Dec 2008 1501

You may prefer to display the rows starting with June 2008 and ending with May 2009. This can be accomplished by adding a shift argument to the month dimension:

DIMENSIONS ga:month:5 ON ROWS

This shifts the rows to achieve the desired sequence of months.

Transform clause

The new TRANSFORMS clause in GAQL is used to apply rollup operations on returned results. Currently the only transform operation supported is REPLACE. This operation does regular expression driven rollups of queries with the pagePath dimension. The following example is from our top pages blog article:

TRANSFORMS

REPLACE [(/|/Default\.aspx)$] WITH [/Default.aspx]

SELECT

METRICS ga:visits ON COLUMNS

DIMENSIONS ga:pagePath, ga:pageTitle ON ROWS TOP 10 NOOTHER BY ga:visits

FROM

[www.shufflepoint.com]

WHERE

TIMEFRAME default

The TRANSFORM says "find pages that end in "/" or in "/default.aspx" and aggregate them to the "/default.aspx" version. Without the transform clause present, the first few rows would be:

Page path Page title Visits
/GoogleAnalytics.aspx Features 367
/ ShufflePoint, Inc. 275
/Default.aspx ShufflePoint, Inc. 117
/help/QuickStartExcelWebQuery.aspx Quick Start - Excel Web Queries 115

With the transform present, the 2nd ("/") and 13th ("/default.aspx") entries are merged with the 3rd entry ("/Default.aspx"). The idea is that these three entries are really the same page and so should only have a single row present in the result set, and ordering should be based upon the rolled up metric values from those three entries.

Page path Page title Visits
/GoogleAnalytics.aspx Features 367
/Default.aspx ShufflePoint, Inc. 408
/help/QuickStartExcelWebQuery.aspx Quick Start - Excel Web Queries 115

NOOTHER qualifier

By default, GAQL will add an "other" row to a result set with a TOP clause. So the query:

SELECT

METRICS ga:avgPageDuration ON COLUMNS

DIMENSIONS ga:continent ON ROWS top 4 by ga:avgPageDuration

FROM

default

WHERE

TIMEFRAME default

will be run as

SELECT

METRICS ga:avgPageDuration ON COLUMNS

DIMENSIONS ga:continent ON ROWS top 4 BottomOther by ga:avgPageDuration

FROM

default

WHERE

TIMEFRAME default

because "BottomOther" is the implicit default value for the OTHER qualifier. The result set will have an "other" row added, and it will be the 4th row:

Continent Avg. Time on Page
Americas 00:01:23
Europe 00:01:00
Asia 00:01:35
(other) 00:01:13

If you do not wish to have the "other" row added, add the "NoOther" qualifier to the query after the TOP expression:

SELECT

METRICS ga:avgPageDuration ON COLUMNS

DIMENSIONS ga:continent ON ROWS top 4 NoOther by ga:avgPageDuration

FROM

default

WHERE

TIMEFRAME default

Continent Avg. Time on Page
Americas 00:01:23
Europe 00:01:00
Asia 00:01:35
Oceania 00:01:01

Header qualifiers

There may be times that you need a result set without row or column headings. This could be useful, for example, if you were merging several queries into one big table in Excel and you wish to avoind duplicate column headings. The "NoRowHeaders" and "NoColumnHeaders" qualifiers serve this purpose. For example, the query below will result in the table below.

SELECT NoRowHeaders

metrics ga:uniquePageviews, ga:bounceRate on columns

dimensions ga:country on rows top 5 by ga:pageviews

FROM

default

WHERE

TIMEFRAME default

Unique Pageviews Bounce Rate
873 .744015
208 .747664
162 .753247
151 .758621
142 .604167