When a entity model class is defined, basic property class such as StringProperty
, IntergerProperty
, DateTimeProperty
, etc are indexed by default (you can disable it by using indexed=False
). TextProperty
is not indexed.
class Read(Model): posted_date = ndb.DateTimeProperty() name = ndb.StringProperty(indexed=False) description = ndb.TextProperty() parent_key = ndb.KeyProperty() is_active = ndb.BooelanProperty(default=True)
Only indexed property can be queried.
items = Read.query(Read.is_active == True).fetch()
NOTE: if you declare a new property after created N items, the new property query doesn't work for old items until you save/put the old items.
When you need to perform sorting on the queried result, you would need to declare composite index.
Edit index.yaml
.
- kind: Read properties: - name: is_active - name: posted_date direction: desc
items = Read.query(Read.is_active == True).order(-Read.posted_date).fetch()
NOTE: To make sure all the necessary composite index are declared during development (no auto generation), use dev_appserver.py --require_indexes.
When you need query Read.is_active
and Read.parent_key
, you have 2 options.
Option 1: Create a full index. The benefit of this index is best performance.
- kind: Read properties: - name: is_active - name: parent_key - name: posted_date direction: desc
It only works for the following query (both query property must be specified).
items = Read.query(Read.is_active == True, Read.parent_key == key).order(-Read.posted_date).fetch()
Options 2: Create 2 partial index. The performance is slower as it might involve multiple query using zigzag merge join
algorithm.
- kind: Read properties: - name: is_active - name: posted_date direction: desc- kind: Read properties: - name: parent_key - name: posted_date direction: desc
An extra benefit of option 2 is that all the following query are supported.
# query by is_activeitems = Read.query(Read.is_active == True).order(-Read.posted_date).fetch()# query by parent_keitems = Read.query(Read.parent_key == key).order(-Read.posted_date).fetch()# query by is_active and parent_keyitems = Read.query(Read.is_active == True, Read.parent_key == key).order(-Read.posted_date).fetch()
For best performance and flexibility, you can declare all 3 composite index.
- kind: Read properties: - name: is_active - name: posted_date direction: desc- kind: Read properties: - name: parent_key - name: posted_date direction: desc - kind: Read properties: - name: is_active - name: parent_key - name: posted_date direction: desc
There are also other considerations (besides performance and flexibility) such as index limit and size. Read the references section for a better understanding.
NOTE: It is advisable not to declare composite index which you don't intent to use.
References:
- https://cloud.google.com/appengine/articles/indexselection
- https://ana-balica.github.io/2016/11/28/intricacies-and-optimization-of-datastore-indexes/
- https://cloud.google.com/datastore/docs/concepts/indexes#index_limits
- https://cloud.google.com/appengine/docs/standard/python/config/indexconfig
- https://cloud.google.com/datastore/docs/concepts/limits