13.3. apoc.periodic.iterate

With apoc.periodic.iterate you provide two statements. The first, outer statement provides a stream of values to be processed. The second, inner statement processes one element at a time or, with iterateList:true, a whole batch at a time.

The results of the outer statement are passed into the inner statement as parameters and are automatically made available under their names.
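As a minimal sketch of this parameter passing: the outer statement returns a column `p`, and the inner statement refers to `p` by name. The `:Person` label, the `processed` and `source` properties, and the `source` entry in `params` are hypothetical. Values supplied through the `params` config option are assumed to be readable in the inner statement as ordinary query parameters.

```cypher
// Outer statement: streams one row per :Person node, in column p.
// Inner statement: uses p by name; $source comes from the params map (assumed).
CALL apoc.periodic.iterate(
  "MATCH (p:Person) RETURN p",
  "SET p.processed = true, p.source = $source",
  {batchSize: 1000, params: {source: 'import'}})
```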

Table 13.1. Configuration options
param	default	description
batchSize	1000	Number of inner statements run within a single transaction. Params: {_count, _batch}
parallel	false	Run inner statements in parallel. Note that statements might deadlock.
retries	0	If the inner statement fails with an error, sleep 100 ms and retry until the retries count is reached. Param: {_retry}
iterateList	false	Execute the inner statement only once per batch, with the whole batch passed in as a list in the parameter {_batch}
params	{}	Map of parameters passed in externally
concurrency	50	Number of concurrent tasks generated when using `parallel:true`
failedParams	-1	If set to a non-negative value, up to `failedParams` parameter sets are returned for each failed batch in `yield failedParams`.
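A sketch of how several of these options might be combined and how the outcome could be inspected. Only `failedParams` is named as a yielded column in the table; the other YIELD columns used here (`batches`, `total`, `errorMessages`) are assumed from the procedure's result signature and may differ between APOC versions. The `:Person` label and the `checked` property are hypothetical.

```cypher
// Retry each failing inner statement up to 2 times, and report
// up to 10 failed parameter sets per failed batch.
CALL apoc.periodic.iterate(
  "MATCH (p:Person) RETURN p",
  "SET p.checked = true",
  {batchSize: 1000, retries: 2, failedParams: 10})
YIELD batches, total, errorMessages, failedParams
RETURN batches, total, errorMessages, failedParams
```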

We plan to make iterateList:true the default in upcoming releases. Because the batch is unwound automatically and nested results are provided as variables, most queries should continue to work.
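To illustrate what the automatic unwinding stands in for, here is a sketch of an inner statement that unwinds the batch explicitly. It assumes that each entry of the batch is a map keyed by the outer statement's column names, and that `$_batch` is accepted as the parameter form of `{_batch}`; check both against the APOC version in use. The `:Person` label and the `checked` property are hypothetical.

```cypher
// With iterateList:true the inner statement runs once per batch.
// The whole batch arrives as a list in the _batch parameter.
CALL apoc.periodic.iterate(
  "MATCH (p:Person) RETURN p",
  "UNWIND $_batch AS row WITH row.p AS p SET p.checked = true",
  {batchSize: 1000, iterateList: true})
```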

So if you were to add an :Actor label to several million :Person nodes, you would run:
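The statement itself is not reproduced in this section. A minimal sketch that matches the description, assuming actors are the :Person nodes with an outgoing `ACTED_IN` relationship (an assumption, not stated here), might look like this:

```cypher
// Outer statement: stream every person who acted in something (assumed criterion).
// Inner statement: add the :Actor label to each streamed person.
CALL apoc.periodic.iterate(
  "MATCH (p:Person) WHERE (p)-[:ACTED_IN]->() RETURN p",
  "SET p:Actor",
  {batchSize: 10000, parallel: true})
```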

This takes 10k people at a time from the stream and updates them in a single transaction, executing the second statement for each person.

Those executions can happen in parallel because updating the labels or properties of different nodes does not conflict.

If you do more complex operations, like updating or removing relationships, either do not use parallel or make sure that you batch the work so that each subgraph of data is updated in one operation, e.g. by passing in the root objects. For such complex operations, consider a setting such as retries:3 to retry failed operations.

CALL apoc.periodic.iterate(
  "MATCH (o:Order) WHERE o.date > '2016-10-13' RETURN o",
  "MATCH (o)-[:HAS_ITEM]->(i) WITH o, sum(i.value) as value SET o.value = value",
  {batchSize:100, iterateList:true, parallel:true})
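For an operation that removes relationships, a more cautious variant could leave out parallel and add retries. This is only a sketch: the `discontinued` property on items is hypothetical, and the order `o` again acts as the root object for each unit of work.

```cypher
// Runs sequentially (parallel defaults to false) and retries a
// failing inner statement up to 3 times.
CALL apoc.periodic.iterate(
  "MATCH (o:Order) WHERE o.date > '2016-10-13' RETURN o",
  "MATCH (o)-[r:HAS_ITEM]->(i) WHERE i.discontinued = true DELETE r",
  {batchSize: 100, retries: 3})
```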

The data streamed by the outer statement can also come from another source, such as a different database or a CSV or JSON file.
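As a sketch of a CSV-backed outer statement, the following uses Cypher's LOAD CSV clause. The file name `people.csv` and its `id` and `name` columns are hypothetical.

```cypher
// Outer statement: stream one map per CSV line, in column row.
// Inner statement: create or update a :Person node for each row.
CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row RETURN row",
  "MERGE (p:Person {id: row.id}) SET p.name = row.name",
  {batchSize: 1000})
```

parallel is left at its default here, since concurrent MERGE operations on the same key could conflict.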