BigQuery: The story of my life

Me as a toddler:

I am BigQuery. I was born at Google. I am a cloud-based service that combines a NoSQL-style data store with “SQL-like” querying capabilities.

My mom said that before I was born, querying massive datasets was a hurdle without the proper hardware and infrastructure. Now I solve this problem by enabling interactive analysis of massive datasets, working in conjunction with Google Cloud Storage. I am a fully managed, NoOps, low-cost analytics database where you can store petabytes of data, query it with familiar SQL, and pay only for what you use under a pay-as-you-go model.

Me as a crazy teen:

I picked up 4 basic concepts that still help me through life.

  • Projects
  • Tables
  • Datasets
  • Jobs

Don’t you want to know what they all are? Chill, let me explain them to you…

  • Projects:

Projects are top-level containers in Google Cloud Platform. Each project has a friendly name and a unique ID.

  • Tables:

Each table has a schema that describes field names, types, and other information. In addition to tables whose data lives in my managed storage, I also support views, which are virtual tables defined by a SQL query, and external tables, which are tables defined over data stored outside my managed storage.

  • Datasets:

Datasets allow you to organize and control access to your tables. Because tables are contained in datasets, you’ll need to create at least one dataset before loading data into me.

  • Jobs:

Jobs are actions that you construct and I execute on your behalf to load data, export data, query data, or copy data. Since jobs can potentially take a long time to complete, they execute asynchronously and can be polled for their status.
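Here is a minimal sketch of what polling one of my jobs can look like. It assumes a hypothetical `get_job` callable that returns my job resource as a dictionary (for example, the parsed JSON from a jobs.get() call). The function name `wait_for_job` and its parameters are illustrative, not part of any client library. It relies only on the `status.state` and `status.errorResult` fields of the job resource.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(get_job: Callable[[], Dict[str, Any]],
                 poll_interval_s: float = 1.0,
                 max_polls: int = 60) -> Dict[str, Any]:
    """Poll a job until its state is DONE, then return the job resource.

    get_job is a hypothetical callable supplied by the caller; it should
    fetch the latest job resource each time it is invoked.
    """
    for _ in range(max_polls):
        job = get_job()
        status = job.get("status", {})
        # A job moves through PENDING and RUNNING before reaching DONE.
        if status.get("state") == "DONE":
            # DONE only means "finished"; errorResult is set on failure.
            error = status.get("errorResult")
            if error:
                raise RuntimeError(f"Job failed: {error.get('message', error)}")
            return job
        time.sleep(poll_interval_s)
    raise TimeoutError("Job did not finish within the polling budget")
```

A fixed polling interval keeps the sketch short; a real caller might prefer a growing delay between polls.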

Now, a responsible grown-up:

Interacting with me is quite easy. All you have to do is use one of these 3 ways:

(i) Loading and exporting data:

In most cases, you load data into my storage. If you want to get the data back from me, you can export it. You can also set up a table as a federated data source, which allows you to use a query to transform your data as you load it.

(ii) Querying and viewing data:

Once you load your data to me, there are a few ways to query or view the data in your tables:

Querying data

  • Calling the jobs.query() method
  • Calling the jobs.insert() method with a query configuration
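To show how the two query routes differ, here is a rough sketch that only builds the REST requests and does not send them. The helper names `jobs_query_request` and `jobs_insert_request`, the project ID `my-project`, and the sample SQL are assumptions made for illustration. A real call also needs an OAuth 2.0 access token, which is left out here.

```python
import json
from typing import Any, Dict

# Base URL of my v2 REST API.
API_ROOT = "https://bigquery.googleapis.com/bigquery/v2"


def jobs_query_request(project_id: str, sql: str,
                       timeout_ms: int = 10000) -> Dict[str, Any]:
    """Describe a jobs.query() call: runs the query and waits up to timeout_ms."""
    return {
        "method": "POST",
        "url": f"{API_ROOT}/projects/{project_id}/queries",
        "body": {"query": sql, "useLegacySql": False, "timeoutMs": timeout_ms},
    }


def jobs_insert_request(project_id: str, sql: str) -> Dict[str, Any]:
    """Describe a jobs.insert() call with a query configuration.

    The call starts an asynchronous job; the caller polls it afterwards.
    """
    return {
        "method": "POST",
        "url": f"{API_ROOT}/projects/{project_id}/jobs",
        "body": {"configuration": {"query": {"query": sql, "useLegacySql": False}}},
    }


if __name__ == "__main__":
    # Hypothetical project ID and query, for illustration only.
    print(json.dumps(jobs_query_request("my-project", "SELECT 1 AS x"), indent=2))
    print(json.dumps(jobs_insert_request("my-project", "SELECT 1 AS x"), indent=2))
```

The practical difference is that jobs.query() can hand back results directly when the query finishes within the timeout, while jobs.insert() returns a job that you poll for its status.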

Viewing data

  • Calling the tabledata.list() method
  • Calling the jobs.getQueryResults() method
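Both viewing methods hand rows back in a compact shape: each row has an `f` list of cells, and each cell carries its value under `v`. The sketch below turns that shape into plain dictionaries. The function name `rows_to_dicts`, the field names, and the sample response are made up for illustration, and the sketch assumes you already know the column order from the table schema.

```python
from typing import Any, Dict, List


def rows_to_dicts(field_names: List[str],
                  response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert the rows of a tabledata.list()-style response into dicts.

    field_names must be in schema order. Values are passed through
    unchanged; scalar values typically arrive as JSON strings.
    """
    return [
        {name: cell.get("v") for name, cell in zip(field_names, row.get("f", []))}
        for row in response.get("rows", [])
    ]


if __name__ == "__main__":
    # Hypothetical response body, trimmed to the fields used here.
    sample = {
        "totalRows": "2",
        "rows": [
            {"f": [{"v": "alice"}, {"v": "3"}]},
            {"f": [{"v": "bob"}, {"v": "5"}]},
        ],
    }
    for record in rows_to_dicts(["name", "visits"], sample):
        print(record)
```

Larger results are paged, so a real reader would repeat the call with the returned `pageToken` until none comes back.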

(iii) Managing data:

In addition to querying and viewing data, you can manage my data in the following ways:

  • Listing projects, jobs, tables and datasets
  • Getting information about jobs, tables and datasets
  • Defining, updating or patching tables and datasets
  • Deleting tables and datasets

My recent relationship with Congruent:

Phase 1 with Congruent:

My first task with Congruent was to ingest data from specific Salesforce tables.

  • Congruent developed the necessary routines using APIs to retrieve data from Salesforce and transfer it to me, using the ‘chunking’ approach.
  • They also made it possible to configure the time of day at which the download and export process runs.

Boo-ya!!! Nothing can hide from me: even if new fields are added to a Salesforce table, those fields get imported into me too.
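How Congruent actually implemented the ‘chunking’ approach is not described here, so the following is only a generic illustration of the idea: moving records in fixed-size batches instead of all at once. The function name `chunked` and the batch size are assumptions, and nothing in it is specific to the Salesforce APIs.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(records: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive batches of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch


if __name__ == "__main__":
    # Stand-in records; a real transfer would iterate over fetched rows.
    for batch in chunked(range(5), 2):
        print(batch)
```

Working in batches keeps memory use bounded and lets a failed transfer resume from the last completed batch rather than from the start.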

Phase 2 with Congruent:

The scope of this project with Congruent is to:

  • Download the Segment files and Crosswalk files from an FTP site, extract them, and upload them to me.
  • Process the data from these two inputs to identify households that are potential buyers of a product segment, and generate an ‘Audience Builder User’ table in me.

I notify users by email about the success or failure of each process as soon as it completes.

A Start from an end:

I am mesmerizing a lot of data enthusiasts with my lightning-fast analytics. Customers find my performance liberating, allowing them to experiment with enormous datasets without compromise. Finally, I found my purpose in this big data world. Have you?!

Date Published: March 17, 2016

