Use Android Paging Library for Synchronous Batch Processing (Kotlin)

December 27, 2018

I want to run a background worker that processes a large number of database records, so I need to load them in batches (paging) to avoid an OutOfMemoryError. I should be using the SQLite LIMIT clause for this, but I decided to check out the new Paging Library to see whether it could be a reasonable solution.

TLDR: Stick with the SQLite LIMIT clause.
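For comparison, here is a minimal sketch of what the plain LIMIT approach could look like. The helper processInBatches and its fetchPage parameter are hypothetical names of mine, not part of Room or the Paging Library. I am assuming fetchPage would be backed by a DAO query that takes LIMIT and OFFSET as bind parameters.

```kotlin
// Hypothetical helper: walks a query page by page using LIMIT/OFFSET semantics.
// fetchPage(limit, offset) is assumed to be backed by a DAO method such as:
//   @Query("SELECT * FROM pin WHERE is_active = 1 ORDER BY created DESC LIMIT :limit OFFSET :offset")
//   fun fetchPage(limit: Int, offset: Int): List<Pin>
fun <T> processInBatches(
    pageSize: Int,
    fetchPage: (limit: Int, offset: Int) -> List<T>,
    handle: (T) -> Unit
): Int {
    require(pageSize > 0) { "pageSize must be positive" }
    var offset = 0
    while (true) {
        // Only one page of records is held in memory at a time.
        val page = fetchPage(pageSize, offset)
        if (page.isEmpty()) break
        page.forEach(handle)
        offset += page.size
        // A short page means the query has no more rows.
        if (page.size < pageSize) break
    }
    // Total number of records handed to handle().
    return offset
}
```

One thing to watch: if handle changes a row so that it no longer matches the WHERE clause (for example, clearing is_active), later OFFSET values will skip rows. Paging by the last seen key instead of OFFSET is one way around that.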

Paging Library

Paging Library is a pretty impressive (and probably complex) library for paging through a data source (which could be a database, the network, or both). Most code examples show the paged result being displayed in a RecyclerView (you just need to call adapter.submitList(pagedList), and all the UI code for scrolling and paging is handled for you). If you use Room for database access, you don’t even need to write your own DataSource.

Sadly, documentation for loading data without RecyclerView or PagedListAdapter is fairly limited. I am guessing this is probably not the primary use case.

Paging Library dependencies

// https://mvnrepository.com/artifact/androidx.paging/paging-runtime?repo=google
def paging_version = "2.1.0-rc01"
implementation "androidx.paging:paging-runtime-ktx:$paging_version"
// implementation "androidx.paging:paging-rxjava2-ktx:$paging_version"

I am using Room as the data source.

def room_version = '2.0.0'
implementation "androidx.room:room-runtime:$room_version"
kapt "androidx.room:room-compiler:$room_version"
androidTestImplementation "androidx.room:room-testing:$room_version"

Setup DataSource with Room

@Dao
interface PinDao : BaseDao<Pin> {
    @Query("SELECT * FROM pin WHERE is_active = 1 ORDER BY created DESC")
    fun fetchAllDataSource(): DataSource.Factory<Int, Pin>
}

NOTE: Learn about Room or create your own DataSource.

Load PagedList using PagedList.Builder

val pageSize = 5
val config: PagedList.Config = PagedList.Config.Builder()
        .setInitialLoadSizeHint(pageSize)
        .setPageSize(pageSize)
        // .setPrefetchDistance(pageSize*2)
        // .setEnablePlaceholders(false)
        .build() 

val pagedList = PagedList.Builder(pinDao.fetchAllDataSource().create(), config)
        .setNotifyExecutor {
            runBlocking(Dispatchers.Main) { it.run() }
        }
        .setFetchExecutor {
            it.run()
        }
        .build()  

NOTE: I am running this code in WorkManager, which runs on a background (worker) thread by default. setNotifyExecutor needs the UI thread, so I use runBlocking(Dispatchers.Main) from Kotlin Coroutines. setFetchExecutor needs a background thread, and since the WorkManager worker is already on one, the task can be run directly.
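To make the "run it directly" part concrete: the lambda passed to setFetchExecutor is converted to a java.util.concurrent.Executor whose execute() simply runs the task on the calling thread. The sketch below shows that behaviour in isolation. The names directExecutor and threadUsedBy are hypothetical and only there for illustration.

```kotlin
import java.util.concurrent.Executor

// Runs each task immediately on whichever thread calls execute().
val directExecutor = Executor { command -> command.run() }

// Returns the name of the thread that actually ran the task.
fun threadUsedBy(executor: Executor): String {
    var name = ""
    executor.execute { name = Thread.currentThread().name }
    return name
}
```

Called from a worker thread, threadUsedBy(directExecutor) reports that same worker thread, which is why the fetch happens synchronously inside the loop.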

Loop PagedList Data

for ((index, it) in pagedList.withIndex()) {
    val pin = (if (it == null) {
        Timber.d("Process ${index+1}: loadAround")

        pagedList.loadAround(index)
        Timber.d("loadedCount=${pagedList.loadedCount}")
        pagedList[index]

    } else {
        it
    })

    if (pin == null) {
        Timber.d("Process ${index + 1}: null")
        continue
    }

    Timber.d("Process ${index + 1}: ${pin.id}")

    // if update here, loadAround doesn't work anymore
    // pinDao.update(pin)
}   

By default, PagedList.Config.setEnablePlaceholders is true and pagedList will hold the full number of items. If there are 100 items in total with a page size of 10, pagedList.size will be 100, with the first 10 items holding valid values while the items from the 11th onward are null until they are loaded. pagedList.loadAround(index) needs to be called to load new items through paging, after which you can access them using pagedList[index].

NOTE: Technically, you can write a loop over pagedList.size, calling pagedList.loadAround(index) on each iteration and accessing the item using pagedList[index].
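A minimal sketch of that index-based loop is below. To keep it independent of Android, it is written against a plain List<T?> plus a loadAround callback. processByIndex is a hypothetical helper of mine, not a Paging Library API. Since PagedList is a List, I would expect to call it as processByIndex(pagedList, pagedList::loadAround) { index, pin -> ... }, assuming the same synchronous executors as above.

```kotlin
// Hypothetical helper, not part of the Paging Library.
// items:      a list whose unloaded positions are null placeholders
// loadAround: asks the list to load the page containing the given index
// handle:     processes one loaded item
// Returns how many positions were still null after loadAround and were skipped.
fun <T : Any> processByIndex(
    items: List<T?>,
    loadAround: (Int) -> Unit,
    handle: (index: Int, item: T) -> Unit
): Int {
    var skipped = 0
    for (index in 0 until items.size) {
        // Trigger paging only when the placeholder has not been filled yet.
        if (items[index] == null) loadAround(index)
        val item = items[index]
        if (item == null) {
            // Still a placeholder: the load did not complete synchronously
            // or the data source is no longer valid.
            skipped++
            continue
        }
        handle(index, item)
    }
    return skipped
}
```

A non-zero return value is a cheap signal that some pages never arrived, which is the symptom described in the caveats.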

NOTE: You might want to look into LivePagedListBuilder or RxPagedListBuilder for PagedList loading.

Caveats

If you update records (pinDao.update(pin)) while looping, pagedList.loadAround(index) will no longer load new records and pagedList[index] will return null. This is probably because the DataSource has been invalidated and gone stale, which requires getting a new PagedList. There is probably a callback for this (maybe via LivePagedListBuilder), or some invalidate method that needs to be called.

I tried setting PagedList.Config.setEnablePlaceholders(false) and couldn’t load the next page of records using pagedList.loadAround(index) (neither pagedList.size nor pagedList.loadedCount increased from the initial size). I am not sure whether I am loading the next page incorrectly or whether there is a bug. Issue Tracker.

I have a feeling Paging Library is better suited to displaying information than to batch processing.

This work is licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License.