Things To Know About Google App Engine Datastore Indexes

July 17, 2013

A property must be indexed to be queried

If a property of an entity is not indexed, it cannot be used in a query filter. In Python, NDB rejects such a filter with a BadFilterError (NeedIndexError is raised when a query needs a composite index that does not exist). For example, for Student.query(Student.name == "James").get() to work, Student.name must be indexed.

Most properties are indexed by default

Most properties in an entity are indexed by default, including StringProperty, IntegerProperty, FloatProperty, BooleanProperty, DateTimeProperty, KeyProperty, ComputedProperty, etc. Indexing can be disabled by passing in indexed=False. Some properties are not indexed by default (usually properties that cater for large values), including TextProperty, BlobProperty, JsonProperty, etc.

Each index incurs additional writes (and money)

Each indexed property requires 2 writes. If an entity has 3 indexed properties, Student(name, age, score), each new entity requires 2 entity writes + 3*2 writes for the indexed property values = 8 writes. Updating an existing entity requires 1 entity write + 4 writes per modified indexed property value. At the time of writing, the cost is $0.09 per 100k write operations.
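The arithmetic above can be sketched as a small calculator. This is only an illustration of the figures quoted in this post (2 + 2 per indexed value for a new entity, 1 + 4 per modified indexed value for an update, $0.09 per 100k writes). The function names are made up, and it assumes each property holds a single value.

```python
# Figures quoted in the post; pricing may have changed since.
COST_PER_100K_WRITES_USD = 0.09


def new_entity_writes(indexed_properties):
    """Writes for a new entity: 2 entity writes + 2 per indexed property value."""
    return 2 + 2 * indexed_properties


def update_entity_writes(modified_indexed_properties):
    """Writes for an update: 1 entity write + 4 per modified indexed property value."""
    return 1 + 4 * modified_indexed_properties


def cost_usd(write_ops):
    """Cost in USD of a number of write operations at the quoted rate."""
    return write_ops * COST_PER_100K_WRITES_USD / 100000.0


# Student(name, age, score): 3 indexed properties.
print(new_entity_writes(3))
# Changing only Student.score on an existing entity.
print(update_entity_writes(1))
# Rough cost of creating one million such students.
print(cost_usd(1000000 * new_entity_writes(3)))
```

Setting indexed=False on properties that are never queried lowers the first argument, which is where the savings come from.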

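Queries work most of the time thanks to the default indexes

Most queries won’t throw a NeedIndexError because of the built-in (default) indexes. If two properties are queried with equality filters, as in Student.query(Student.name == "James", Student.age == 10), the datastore’s “Query Planner” will use its “zigzag merge join algorithm”, as explained in Index Selection and Advanced Search. A query that combines an equality filter with an inequality filter on a different property, such as Student.age > 10, does need a composite index.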
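To give a feel for the zigzag merge join idea, here is a toy sketch. It treats each built-in single-property index as a sorted list of entity keys matching one equality filter, and alternately jumps each list forward to the other's current key until both agree. This is a simplified illustration, not the Datastore's actual implementation; the function name and the sample keys are invented.

```python
from bisect import bisect_left


def zigzag_intersect(keys_a, keys_b):
    """Return the keys present in both sorted lists, zigzagging between them."""
    matches = []
    i = j = 0
    while i < len(keys_a) and j < len(keys_b):
        a, b = keys_a[i], keys_b[j]
        if a == b:
            # Both index scans agree on this key: it satisfies both filters.
            matches.append(a)
            i += 1
            j += 1
        elif a < b:
            # Jump scan A forward to the first key >= b.
            i = bisect_left(keys_a, b, i)
        else:
            # Jump scan B forward to the first key >= a.
            j = bisect_left(keys_b, a, j)
    return matches


# Hypothetical entity keys found in each single-property index.
name_is_james = [2, 5, 9, 14, 21]
age_is_10 = [1, 5, 8, 14, 30]
print(zigzag_intersect(name_is_james, age_is_10))
```

The point of the jumps is that neither list has to be read in full, which is why queries made only of equality filters can often be served without a composite index.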

How to properly use composite indexes (index.yaml)

Performance can be improved by specifying a composite index in index.yaml when a query uses more than one property (read Index Selection and Advanced Search), though this incurs additional index writes. A new entity incurs 1 write per composite index value, while updating an entity incurs 2 writes per modified composite index value.

A composite index is REQUIRED if a sort order is used and the sort property is different from the property in the equality filter.
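As an illustration, an index.yaml entry for a query that filters on name and sorts by age, for example Student.query(Student.name == "James").order(-Student.age), might look like the fragment below. The Student kind and its properties are reused from the examples in this post; the descending direction is only an assumption made for the example.

```yaml
indexes:
# Equality filter on name, then sort by age descending.
- kind: Student
  properties:
  - name: name
  - name: age
    direction: desc
```

The equality-filter property comes first and the sort property last, matching the order in which the query uses them.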

Beware of unnecessary indexes created by autogeneration

Read: https://developers.google.com/appengine/docs/python/config/indexconfig

Test indexes in the development environment

As of the 1.5.3 SDKs, you can test the validity of your index definitions in the development web server. To do so, configure the development server so it does not auto-generate indexes. If you have a Python application, you can disable auto-generated indexes in the development web server using dev_appserver.py: dev_appserver.py --require_indexes

Modifying a Model’s Schema and Indexes

If you modify your model’s schema halfway through and then want to query on the new properties, the old entities will not have these properties and therefore will not be returned by such queries. You need to load all entities, set the value of the new property, and save them back to the Datastore. (read Updating Your Model’s Schema)
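The load, set, and save-back loop can be sketched as follows. This is a minimal, datastore-agnostic illustration: backfill_property, FakeStudent, and the grade property are hypothetical names, and the save function is passed in so the sketch runs without the App Engine SDK.

```python
def backfill_property(entities, put_multi, prop_name, value, batch_size=100):
    """Set prop_name on entities that lack it and save all entities back in batches.

    entities:  any iterable of entity objects.
    put_multi: a callable that saves a list of entities.
    Returns the number of entities saved.
    """
    batch = []
    saved = 0
    for entity in entities:
        # Old entities have no value for the new property yet.
        if getattr(entity, prop_name, None) is None:
            setattr(entity, prop_name, value)
        batch.append(entity)
        if len(batch) >= batch_size:
            put_multi(batch)
            saved += len(batch)
            batch = []
    if batch:
        # Save the final, partially filled batch.
        put_multi(batch)
        saved += len(batch)
    return saved


class FakeStudent(object):
    """Stand-in for a datastore entity, used only for this demonstration."""

    def __init__(self, name, grade=None):
        self.name = name
        self.grade = grade


stored = []  # stands in for the datastore
students = [FakeStudent("James"), FakeStudent("Ann", grade=3)]
count = backfill_property(students, stored.extend, "grade", 0)
print(count, [s.grade for s in students])
```

In an App Engine application one would presumably pass something like Student.query().iter() and ndb.put_multi instead of the stand-ins. For a large number of entities the work would also need to be split across several requests or tasks.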

This work is licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License.