Introducing Cloudant Query

By Simon Metson

Cloudant Query gives you a declarative way to define and query indexes on your database. We’ve long wanted to build a system that would make the developer user experience of Cloudant productive from day-one of using the system, and more performant for our longtime fans. With Cloudant Query we’re following through on our plan to make our database service easier and more powerful to use.

CQ webcast July 24. Reg & eat your veg!

The dataset

The dataset we’re using in this article is a small subset of IMDb data that the service makes available for non-commercial and educational purposes. Here, we’ve denormalized the separate tables for Actor, Movie, and Person to fit within Cloudant’s JSON document-oriented model.


So in accordance with IMDb’s Conditions of Use statement, we’d like to add:


Information courtesy of

IMDb

(http://www.imdb.com).

Used with permission.

Replicate example data & follow along.

Overview

Cloudant Query gives you two new API endpoints: one to manage indexes, and one to execute queries against.

Instead of writing a javascript map reduce view, you define an index by POSTing a JSON body listing the fields you want to index:

POST https://examples.cloudant.com/movies-demo/_index


{
  "index": {
    "fields": ["Person_name"]
  }
}

NB: Some syntax has been omitted for the sake of clarity. Check out this example app, follow our tutorial, or refer to the API.

Once you’ve defined an index, you can run queries against it. A query is run by POSTing JSON to the _find endpoint. The JSON must contain a selector object (if you know MongoDB Query documents this should look familiar) and can contain other optional parameters (fields, sort and limit, etc.):

{
  "selector": {
	"Person_name": "Robert De Niro"
  },
  "fields": [
	"Movie_name",
	"Movie_year"
  ]
}

Result:

{
  "docs": [
	{"Movie_name": "Goodfellas","Movie_year": 1990},
	{"Movie_name": "Meet the Fockers","Movie_year": 2004},
	{"Movie_name": "Shark Tale","Movie_year": 2004},
	{"Movie_name": "Raging Bull","Movie_year": 1980},
	{"Movie_name": "Untouchables, The","Movie_year": 1987},
	{"Movie_name": "Godfather: Part II, The","Movie_year": 1974},
	{"Movie_name": "Meet the Parents","Movie_year": 2000},
	{"Movie_name": "Deer Hunter, The", "Movie_year": 1978}
  ]
}

This query hits the index we defined above, equivalent to a ?key="Robert De Niro" view query, then dynamically filters the result data to just the document keys defined in the fields list. If a fields list is omitted, I get back the whole document.

What’s really neat about Cloudant Query is that I can run a query against an index and refine the result set by applying conditions on fields beyond the original index. For instance, I could find movies Robert De Niro made in a specific year with the following selector:

{
  "selector": {
	"Movie_year": 1978,
	"Person_name": "Robert De Niro"
  }
}

Result:

{
  "docs": [
    {
      "Movie_genre": "DW",
      "Movie_name": "Deer Hunter, The",
      "Movie_rating": "R",
      "Movie_runtime": 183,
      "Movie_year": 1978,
      "Person_dob": "1943-08-17",
      "Person_name": "Robert De Niro",
      "Person_pob": "New York, New York, USA",
      "_id": "1f003ce73056238720c2e8f7da428f32",
      "_rev": "1-3fa59b11f43719f46c288b9bb9943d1d"
    }
  ]
}

Or find films he was in before a certain year:

{
  "selector": {
	"Movie_year": {"$lt": 1999},
	"Person_name": "Robert De Niro"
  }
}

You’ll want your selector to be fairly restrictive to get the best performance, and you’d maybe want to define a complex index (e.g., "fields": ["Person_name", "Movie_year"]) if you were making a lot of queries like this, or needed to sort by Movie_year.

On indexes

We defined an index above using:

POST https://examples.cloudant.com/movies-demo/_index

{
  "index": {
    "fields": ["Person_name"]
  }
}

Which returned:

{
  "result": "created"
}

If the index already existed the response would have been:

{
  "result": "exists"
}

But what happens if I make a query that doesn’t have a suitable index?

POST https://examples.cloudant.com/movies-demo/_find

{
  "selector": {
    "Movie_earnings_rank": 191
  }
}

Because your database may be terabytes in size and reindexing the whole dataset may be operationally complicated, Cloudant Query won’t automatically index your data, but it will tell you the indexes you need to define. How you choose to handle that error mode is up to you.

{
	"reason": "No index exists for this selector, try indexing one of: Movie_earnings_rank",
	"error": "no_usable_index"
}

What’s next?

Cloudant Query is implemented closer to our core Erlang engine, and we’ve been working on performance optimizations with our colleagues at IBM. So in addition to building on the established concepts (HTTP API, JSON, etc.) you know and love, Cloudant Query has the added benefit of performance over our traditional system of defining javascript map reduce views.

We hope that Cloudant Query works like you’d expect a database to work: set some conditions, and get the data you need using a simple, scalable and concise API. Coupled with map reduce views and search, we hope Query rounds out Cloudant’s ability to let you understand and use your data effectively.

Cloudant Query is now live on our multi-tenant service, and we’re liaising with our dedicated customers to get it onto their clusters soon. We’d love to hear your feedback on Query, and see what applications you build on it via the comments below. In the years to come, we hope Cloudant Query is the first tool you reach for when working with Cloudant.

Watch this Introducing Cloudant Query video to learn the concepts behind Cloudant Query.


Watch this Using Cloudant Query video to see how to build and query using Cloudant Query.


Find more videos and tutorials in the Cloudant Learning Center.


Don’t have an account? Sign up for free today and give it a spin!

Sign Up for Updates!

Recent Posts