BigQuery QuickStart
Configuring Access to M-Lab Data
To search M-Lab’s data using BigQuery, follow these quick steps to get started:
Step 1. Configure Google Cloud Platform Console Project
Create a project, and enable billing in Google Cloud Platform Console (or use an existing project with billing enabled).
You will not be charged for queries against tables in the M-Lab data set. M-Lab is committed to open data, and offering our data free of charge is part of that commitment.
Step 2. Configure M-Lab Table Access
Join the M-Lab Discuss group with the same account you used to create your Google Cloud Platform Console project. Joining this group enables your account so that you can make queries against M-Lab’s BigQuery tables.
Using a Service Account
If you are querying BigQuery via a service account (e.g., @developer.gserviceaccount.com) that cannot join the M-Lab Discuss group, please email support@measurementlab.net so that we can enable your account manually.
BigQuery Web Interface
BigQuery offers a web interface so that you can query M-Lab BigQuery data from your browser. To query from the web interface, go to:
https://bigquery.cloud.google.com
Try the following query as an example:
Calculate how many NDT tests were performed per day since M-Lab epoch.
SELECT
STRFTIME_UTC_USEC(web100_log_entry.log_time * 1000000, '%Y-%m-%d') AS day,
COUNT(*) AS num_tests
FROM
plx.google:m_lab.ndt.all
GROUP BY
day
ORDER BY
day ASC;
BigQuery Tools in the Google Cloud SDK
You may also run queries at the command line using BigQuery tools in the Google Cloud SDK.
Download and install the Google Cloud SDK using the installation and Quick Start instructions for your operating system.
After installation, authentication, and restarting your terminal, BigQuery’s command-line tools are available in your shell. Note additional options such as specifying output format are available in the BigQuery command-line tools.
Try the following query as an example:
$ bq query --format=csv "
-- Calculate how many NDT tests were performed per day since M-Lab epoch.
SELECT
STRFTIME_UTC_USEC(web100_log_entry.log_time * 1000000, '%Y-%m-%d') AS day,
COUNT(*) AS num_tests
FROM
plx.google:m_lab.ndt.all
GROUP BY
day
ORDER BY
day ASC"
If you are new to BigQuery, we suggest that you next consult the following resources:
BigQuery API
Given the resources and expertise, you can develop your own application that uses M-Lab data. To learn more about building a custom application, refer to the Google Cloud Platform Console documentation
Telescope is an example of an application that M-Lab developed that uses the BigQuery Python API to download M-Lab data. Telescope is available from M-Lab on Github.