I want to run a background worker that processes a large number of database records, so I need to load them in batches (paging) to avoid an OutOfMemoryError. I should be using the SQLite LIMIT clause for this purpose, but I decided to check out the new Paging Library to see if it could be a reasonable solution.
TLDR: Stick with the SQLite LIMIT clause.
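For reference, here is a minimal sketch of what LIMIT-based batching could look like. The helper names processInBatches and limitOffset, and the fetchBatch query in the comment, are my own assumptions for illustration and are not part of Room or of the code later in this post. The in-memory limitOffset stands in for the database so the sketch runs on its own.

```kotlin
// Hypothetical Room query this sketch assumes (not from the code in this post):
// @Query("SELECT * FROM pin WHERE is_active = 1 ORDER BY created DESC LIMIT :limit OFFSET :offset")
// fun fetchBatch(limit: Int, offset: Int): List<Pin>

/**
 * Calls [fetchPage] with an increasing offset until it returns fewer rows than
 * [batchSize], passing every row to [process]. Returns the number of rows processed.
 */
fun <T> processInBatches(
    batchSize: Int,
    fetchPage: (limit: Int, offset: Int) -> List<T>,
    process: (T) -> Unit
): Int {
    require(batchSize > 0) { "batchSize must be positive" }
    var offset = 0
    while (true) {
        val batch = fetchPage(batchSize, offset)
        batch.forEach(process)
        offset += batch.size
        // A short (or empty) batch means there is nothing left to read.
        if (batch.size < batchSize) return offset
    }
}

// In-memory stand-in for a LIMIT/OFFSET query, so the sketch runs without a database.
fun <T> List<T>.limitOffset(limit: Int, offset: Int): List<T> = drop(offset).take(limit)
```

Only one batch is held in memory at a time. One thing to watch: if processing a row changes whether it matches the WHERE clause or where it sorts, a plain OFFSET can skip or repeat rows, so the query and the update need to be designed together.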
Paging Library
Paging Library is a pretty impressive (and probably complex) library for paging a data source (which could be a database, the network, or both). Most code examples show the paged result being displayed with a RecyclerView (you just need to call adapter.submitList(pagedList), and all the UI code for scrolling and paging is handled for you). If you use Room for database access, you don't even need to write your own DataSource.
Sadly, documentation for loading data without RecyclerView or PagedListAdapter is fairly limited. I am guessing this is not the primary use case.
Paging Library dependencies
// https://mvnrepository.com/artifact/androidx.paging/paging-runtime?repo=google
def paging_version = "2.1.0-rc01"
implementation "androidx.paging:paging-runtime-ktx:$paging_version"
// implementation "androidx.paging:paging-rxjava2-ktx:$paging_version"
I am using Room as the data source.
def room_version = '2.0.0'
implementation "androidx.room:room-runtime:$room_version"
kapt "androidx.room:room-compiler:$room_version"
androidTestImplementation "androidx.room:room-testing:$room_version"
Setup DataSource with Room
@Dao
interface PinDao : BaseDao<Pin> {
    @Query("SELECT * FROM pin WHERE is_active = 1 ORDER BY created DESC")
    fun fetchAllDataSource(): DataSource.Factory<Int, Pin>
}
NOTE: Learn about Room or create your own DataSource.
Load PagedList using PagedList.Builder
val pageSize = 5
val config: PagedList.Config = PagedList.Config.Builder()
    .setInitialLoadSizeHint(pageSize)
    .setPageSize(pageSize)
    // .setPrefetchDistance(pageSize*2)
    // .setEnablePlaceholders(false)
    .build()

// fetchAllDataSource() returns a DataSource.Factory; create() builds the DataSource
val pagedList = PagedList.Builder(pinDao.fetchAllDataSource().create(), config)
    .setNotifyExecutor { runBlocking(Dispatchers.Main) { it.run() } }
    .setFetchExecutor { it.run() }
    .build()
NOTE: I am running my code in WorkManager, which by default runs work on a background thread. setNotifyExecutor needs an executor that runs on the UI thread, so I am using runBlocking(Dispatchers.Main) from Kotlin Coroutines. setFetchExecutor needs a background thread, and since the WorkManager worker is already running on one, we can run the task directly.
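To make "run it directly" concrete, here is a small sketch of what a lambda like { it.run() } amounts to when used as an Executor: the task runs immediately on whichever thread calls execute(). The names directExecutor and threadUsedBy are my own, for illustration only.

```kotlin
import java.util.concurrent.Executor

// A "direct" executor: execute() runs the task immediately on the calling thread
// instead of handing it to another thread.
val directExecutor = Executor { command -> command.run() }

// Returns the name of the thread on which the executor ran the task.
// Only meaningful for executors that run the task before execute() returns.
fun threadUsedBy(executor: Executor): String {
    var name = ""
    executor.execute { name = Thread.currentThread().name }
    return name
}
```

Calling threadUsedBy(directExecutor) from a worker thread returns that worker thread's name, which is why the fetch work stays on the WorkManager thread in the setup above.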
Loop PagedList Data
for ((index, it) in pagedList.withIndex()) {
    val pin = (if (it == null) {
        Timber.d("Process ${index+1}: loadAround")
        pagedList.loadAround(index)
        Timber.d("loadedCount=${pagedList.loadedCount}")
        pagedList[index]
    } else {
        it
    })
    if (pin == null) {
        Timber.d("Process ${index + 1}: null")
        continue
    }
    Timber.d("Process ${index + 1}: ${pin.id}")
    // if update here, loadAround doesn't work anymore
    // pinDao.update(pin)
}
By default, PagedList.Config.setEnablePlaceholders is true, so pagedList reports the full number of items. If there are 100 items in total with a page size of 10, pagedList.size will be 100; the first 10 items hold valid values while the 11th item is null. pagedList.loadAround(index) needs to be called to load new items through paging, after which you can access them using pagedList[index].
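The placeholder behaviour described above can be pictured with a toy model. This is not the Paging Library's implementation: PlaceholderList and everything in it are hypothetical, and it loads synchronously from an in-memory list, but it mirrors the observable behaviour (full size up front, null until the page around an index is loaded).

```kotlin
// Toy stand-in for a placeholder-enabled paged list (hypothetical, for illustration only).
class PlaceholderList<T : Any>(private val source: List<T>, private val pageSize: Int) {
    // One slot per row; null marks a placeholder that has not been loaded yet.
    private val slots: MutableList<T?> = MutableList(source.size) { null }

    init {
        require(pageSize > 0) { "pageSize must be positive" }
        loadAround(0) // initial load of the first page
    }

    // The full row count is known immediately, even though most slots are empty.
    val size: Int get() = slots.size

    // Number of slots that currently hold a real value.
    val loadedCount: Int get() = slots.count { it != null }

    operator fun get(index: Int): T? = slots[index]

    // Loads the page containing [index]; a no-op if that page is already loaded.
    fun loadAround(index: Int) {
        val start = (index / pageSize) * pageSize
        val end = minOf(start + pageSize, source.size)
        for (i in start until end) slots[i] = source[i]
    }
}
```

With 100 items and a page size of 10, size is 100 right away, index 9 holds a value, index 10 is null until loadAround(10) is called, and loadedCount then rises from 10 to 20.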
NOTE: Technically, you can create a loop using pagedList.size, calling pagedList.loadAround(index) at each loop cycle and accessing the item using pagedList[index].
NOTE: You might want to look into LivePagedListBuilder or RxPagedListBuilder for PagedList loading.
Caveats
If you update the records (pinDao.update(pin)) while looping, pagedList.loadAround(index) will not load new records and pagedList[index] will return null. This is probably because the DataSource has become stale/invalidated, which requires getting a new PagedList. There is probably a callback (maybe via LivePagedListBuilder) for this purpose, or some invalidate method that needs to be called.
I tried setting PagedList.Config.setEnablePlaceholders(false) and couldn't load new records or the next page using pagedList.loadAround(index) (neither pagedList.size nor pagedList.loadedCount increased from its initial value). I am not sure whether I am loading the next page incorrectly or whether there is a bug. Issue Tracker.
I have a feeling Paging Library is better suited to displaying information than to batch processing.