APOC User Guide 3.0.8.6

Introduction

Note	Separate documentation is available for APOC version 3.1.x and for APOC version 3.2.x.

Neo4j 3.0 introduced user-defined procedures: custom implementations of functionality that can’t be (easily) expressed in Cypher itself. These procedures are implemented in Java, can be easily deployed into your Neo4j instance, and can then be called from Cypher directly.

The APOC library consists of about 300 procedures that help with tasks in areas such as data integration, graph algorithms, and data conversion.

License

Apache License 2.0

"APOC" Name history

Apoc was the technician and driver on board the Nebuchadnezzar in the movie The Matrix. He was killed by Cypher.

APOC was also the name of the first bundled "A Package Of Components" for Neo4j in 2009.

APOC also stands for "Awesome Procedures On Cypher".

Installation

Download latest release

Go to http://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/3.0.8.6 to find the latest release, download the binary jar, and place it in your $NEO4J_HOME/plugins folder.

Version Compatibility Matrix

Since APOC relies in some places on Neo4j’s internal APIs, you need to use the right APOC version for your Neo4j installation.

Any version released after 1.1.0 uses a different, consistent versioning scheme: <neo4j-version>.<apoc>. The trailing <apoc> part of the version number is incremented with every APOC release.

apoc version	neo4j version
3.2.0.2	3.2.0-alpha07 (3.2.x)
3.1.3.6	3.1.3 (3.1.x)
3.1.2.5	3.1.2
3.1.0.4	3.1.0-3.1.1
3.0.8.6	3.0.5-3.0.9 (3.0.x)
3.0.4.3	3.0.4
1.1.0	3.0.0 - 3.0.3
1.0.0	3.0.0 - 3.0.3
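
The versioning scheme described above can be taken apart mechanically. The following Python sketch is only an illustration: the helper name split_apoc_version is hypothetical and not part of APOC, and it assumes a four-part version string, so the legacy 1.0.0 and 1.1.0 releases are rejected.

```python
def split_apoc_version(version):
    """Split '<neo4j-version>.<apoc>' into (neo4j_version, apoc_release)."""
    # Everything before the last dot is the Neo4j version part,
    # the trailing number is the APOC release counter.
    neo4j_version, _, apoc_release = version.rpartition(".")
    if neo4j_version.count(".") != 2 or not apoc_release.isdigit():
        raise ValueError(f"not a <neo4j-version>.<apoc> version: {version!r}")
    return neo4j_version, int(apoc_release)


print(split_apoc_version("3.0.8.6"))  # ('3.0.8', 6)
```

The Neo4j part is only the version the release is named after; the matrix above remains the reference for which Neo4j versions a given release actually supports.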

Using APOC with the Neo4j Docker image

The Neo4j Docker image allows you to supply a volume for the /plugins folder. Download the APOC release that fits your Neo4j version into a local plugins folder and provide it as a data volume:

mkdir plugins
pushd plugins
wget https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/3.0.8.6/apoc-3.0.8.6-all.jar
popd
docker run --rm -e NEO4J_AUTH=none -p 7474:7474 -v $PWD/plugins:/plugins -p 7687:7687 neo4j:3.0.9

Build & install the current development branch from source

git clone http://github.com/neo4j-contrib/neo4j-apoc-procedures
cd neo4j-apoc-procedures
./gradlew shadow
cp build/libs/apoc-<version>-SNAPSHOT-all.jar $NEO4J_HOME/plugins/
$NEO4J_HOME/bin/neo4j restart

A full build, including the tests, can be run with ./gradlew build.

Calling Procedures & Functions within Cypher

User defined Functions can be used in any expression or predicate, just like built-in functions.

Procedures can be called stand-alone with CALL procedure.name();

You can also integrate them into your Cypher statements, which makes them much more powerful.

Load JSON example

WITH 'https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/master/src/test/resources/person.json' AS url
CALL apoc.load.json(url) YIELD value as person
MERGE (p:Person {name:person.name})
   ON CREATE SET p.age = person.age, p.children = size(person.children)

Procedure & Function Signatures

To call procedures correctly, you need to know their parameter names, types, and positions. To YIELD their results, you also need the output column names and types.

You can see a procedure's signature in the output of CALL dbms.procedures(). The same applies to functions with CALL dbms.functions().

CALL dbms.procedures() YIELD name, signature
WITH * WHERE name STARTS WITH 'apoc.algo.dijkstra'
RETURN name, signature

The signature is always name :: TYPE, so in this case:

apoc.algo.dijkstra
 (startNode :: NODE?, endNode :: NODE?,
   relationshipTypesAndDirections :: STRING?, weightPropertyName :: STRING?)
:: (path :: PATH?, weight :: FLOAT?)

Parameters:

Name	Type
Procedure Parameters
startNode	Node
endNode	Node
relationshipTypesAndDirections	String
weightPropertyName	String
Output Return Columns
path	Path
weight	Float
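
When many signatures need to be inspected, the name :: TYPE layout can be split into parameters and output columns programmatically. The Python sketch below is a rough illustration, not an official parser: parse_signature is a hypothetical helper, and it assumes simple signatures like the one shown above (no default values containing commas or parentheses, and a parenthesised output list).

```python
import re


def parse_signature(signature):
    """Return (name, parameters, outputs) for a simple procedure signature."""
    match = re.match(
        r"\s*([\w.]+)\s*\((.*?)\)\s*::\s*\((.*)\)\s*$", signature, re.S
    )
    if match is None:
        raise ValueError("unsupported signature format")
    name, params, outputs = match.groups()

    def fields(text):
        # Each entry looks like 'startNode :: NODE?'
        result = []
        for part in text.split(","):
            part = part.strip()
            if part:
                field, _, type_name = part.partition("::")
                result.append((field.strip(), type_name.strip()))
        return result

    return name, fields(params), fields(outputs)


signature = (
    "apoc.algo.dijkstra(startNode :: NODE?, endNode :: NODE?, "
    "relationshipTypesAndDirections :: STRING?, weightPropertyName :: STRING?) "
    ":: (path :: PATH?, weight :: FLOAT?)"
)
name, parameters, outputs = parse_signature(signature)
print(name)
print(parameters)
print(outputs)
```

The positions in the parameters list correspond to the argument order needed in the CALL, and the outputs list gives the column names available to YIELD.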

Help and Usage

call apoc.help('search')

lists the name, the description text, and whether the procedure performs writes; the search string is checked against the beginning (package) or end (name) of the procedure name

A helpful query to find undocumented procedures:

CALL apoc.help("apoc") YIELD name, text
WITH * WHERE text IS null
RETURN name AS undocumented

To generate the help from @Description annotations, apoc currently scans the jar file with ASM.

Overview of APOC Procedures

Configuration Options

Set these config options in $NEO4J_HOME/conf/neo4j.conf

All boolean options default to false, i.e. are disabled, unless mentioned otherwise.

apoc.trigger.enabled=true

Enable triggers

apoc.ttl.enabled=true

Enable time to live background task

apoc.ttl.schedule=5

Set frequency in seconds to run ttl background task (default 60)

apoc.import.file.enabled=true

Enable reading local files from disk

apoc.export.file.enabled=true

Enable writing local files to disk

apoc.jdbc.<key>.url=jdbc-url-with-credentials

store jdbc-urls under a key to be used by apoc.load.jdbc

apoc.es.<key>.uri=es-url-with-credentials

store es-urls under a key to be used by elasticsearch procedures

apoc.mongodb.<key>.uri=mongodb-url-with-credentials

store mongodb-urls under a key to be used by mongodb procedures

apoc.couchbase.<key>.uri=couchbase-url-with-credentials

store couchbase-urls under a key to be used by couchbase procedures

Manual Indexes

Index Queries

Procedures to add to and query manual indexes

Note	Please note that there are (case-sensitive) automatic schema indexes for equality, non-equality, existence, range queries, starts with, ends with, and contains!

apoc.index.addAllNodes('index-name',{label1:['prop1',…],…})

add all nodes to this full text index with the given fields, additionally populates a 'search' index field with all of them in one place

apoc.index.addNode(node,['prop1',…])

add node to an index for each label it has

apoc.index.addNodeByLabel('Label',node,['prop1',…])

add node to an index for the given label

apoc.index.addNodeByName('name',node,['prop1',…])

add node to an index for the given name

apoc.index.addRelationship(rel,['prop1',…])

add relationship to an index for its type

apoc.index.addRelationshipByName('name',rel,['prop1',…])

add relationship to an index for the given name

apoc.index.removeNodeByName('name',node)

remove node from an index for the given name

apoc.index.removeRelationshipByName('name',rel)

remove relationship from an index for the given name

apoc.index.search('index-name', 'query') YIELD node, weight

search for the first 100 nodes in the given full text index that match the given lucene query, ordered by relevance

apoc.index.nodes('Label','prop:value*') YIELD node, weight

lucene query on node index with the given label name

apoc.index.relationships('TYPE','prop:value*') YIELD rel, weight

lucene query on relationship index with the given type name

apoc.index.between(node1,'TYPE',node2,'prop:value*') YIELD rel, weight

lucene query on relationship index with the given type name bound by either or both sides (each node parameter can be null)

apoc.index.out(node,'TYPE','prop:value*') YIELD node, weight

lucene query on relationship index with the given type name for outgoing relationship of the given node, returns end-nodes

apoc.index.in(node,'TYPE','prop:value*') YIELD node, weight

lucene query on relationship index with the given type name for incoming relationship of the given node, returns start-nodes

Index Management

CALL apoc.index.list() YIELD type,name,config

lists all manual indexes

CALL apoc.index.remove('name') YIELD type,name,config

removes manual indexes

CALL apoc.index.forNodes('name',{config}) YIELD type,name,config

gets or creates manual node index

CALL apoc.index.forRelationships('name',{config}) YIELD type,name,config

gets or creates manual relationship index

Add node to index example

match (p:Person) call apoc.index.addNode(p,["name","age"]) RETURN count(*);
// 129s for 1M People
call apoc.index.nodes('Person','name:name100*') YIELD node, weight return * limit 2

Schema Index Queries

Schema Index lookups that keep order and can apply limits

apoc.index.orderedRange(label,key,min,max,sort-relevance,limit) yield node

schema range scan which keeps index order and adds limit, values can be null, boundaries are inclusive

apoc.index.orderedByText(label,key,operator,value,sort-relevance,limit) yield node

schema string search which keeps index order and adds limit, operator is 'STARTS WITH' or 'CONTAINS'
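
To make the range semantics of apoc.index.orderedRange concrete (inclusive boundaries, a limit applied in index order), here is a small Python sketch over an already sorted list. It is a conceptual model only, not how APOC or the schema index is implemented. The function name is hypothetical, the sort-relevance argument is ignored, and "values can be null" is read here as "either boundary may be omitted", which is an assumption.

```python
from bisect import bisect_left, bisect_right


def ordered_range(sorted_values, minimum=None, maximum=None, limit=None):
    """Inclusive range scan over sorted values that keeps their order."""
    # A missing boundary means the scan is open on that side.
    start = 0 if minimum is None else bisect_left(sorted_values, minimum)
    end = len(sorted_values) if maximum is None else bisect_right(sorted_values, maximum)
    result = sorted_values[start:end]
    return result if limit is None else result[:limit]


ages = [18, 21, 21, 30, 42, 65]
print(ordered_range(ages, 21, 42))       # [21, 21, 30, 42]
print(ordered_range(ages, None, 30, 2))  # [18, 21]
```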

Meta Graph

Returns a virtual graph that represents the labels and relationship-types available in your database and how they are connected.

CALL apoc.meta.graphSample()

examines the database statistics to build the meta graph, very fast, might report extra relationships

CALL apoc.meta.graph

examines the database statistics to create the meta-graph, post filters extra relationships by sampling

CALL apoc.meta.subGraph({labels:[labels],rels:[rel-types],excludes:[label,rel-type,…]})

examines a sample sub graph to create the meta-graph

CALL apoc.meta.data

examines a subset of the graph to provide tabular meta information

CALL apoc.meta.stats yield labelCount, relTypeCount, propertyKeyCount, nodeCount, relCount, labels, relTypes, stats

returns the information stored in the transactional database statistics

CALL apoc.meta.type(value)

type name of a value (INTEGER,FLOAT,STRING,BOOLEAN,RELATIONSHIP,NODE,PATH,NULL,UNKNOWN,MAP,LIST)

CALL apoc.meta.isType(value,type)

returns a row if the type name matches, none if not

CALL apoc.meta.types(node or relationship or map) YIELD value

returns a map of property keys to their type names

isType example

MATCH (n:Person)
CALL apoc.meta.isType(n.age,"INTEGER")
RETURN n LIMIT 5

Schema

apoc.schema.assert({indexLabel:[indexKeys],…},{constraintLabel:[constraintKeys,…]}) yield label, key, unique, action

asserts that at the end of the operation the given indexes and unique constraints are there, each label:key pair is considered one constraint/label

Locking

call apoc.lock.nodes([nodes])

acquires a write lock on the given nodes

call apoc.lock.rels([relationships])

acquires a write lock on the given relationships

call apoc.lock.all([nodes],[relationships])

acquires a write lock on the given nodes and relationships

from/toJson

CALL apoc.convert.toJson([1,2,3])

converts value to json string

CALL apoc.convert.toJson( {a:42,b:"foo",c:[1,2,3]})

converts value to json map

CALL apoc.convert.fromJsonList('[1,2,3]')

converts json list to Cypher list

CALL apoc.convert.fromJsonMap( '{"a":42,"b":"foo","c":[1,2,3]}')

converts json map to Cypher map

CALL apoc.json.setJsonProperty(node,key,complexValue)

sets value serialized to JSON as property with the given name on the node

CALL apoc.json.getJsonProperty(node,key)

converts serialized JSON in property back to original object

CALL apoc.json.getJsonPropertyMap(node,key)

converts serialized JSON in property back to map

CALL apoc.convert.toTree([paths]) yield value

creates a stream of nested documents representing the root or roots (at least one) of these paths

Export / Import

Export to CSV

YIELD file, source, format, nodes, relationships, properties, time, rows

apoc.export.csv.query(query,file,config)

exports results from the cypher statement as csv to the provided file

apoc.export.csv.all(file,config)

exports whole database as csv to the provided file

apoc.export.csv.data(nodes,rels,file,config)

exports given nodes and relationships as csv to the provided file

apoc.export.csv.graph(graph,file,config)

exports given graph object as csv to the provided file

Export to Cypher Script

Data is exported as Cypher statements (for neo4j-shell, and partly apoc.cypher.runFile) to the given file.

YIELD file, source, format, nodes, relationships, properties, time

apoc.export.cypher.all(file,config)

exports whole database incl. indexes as cypher statements to the provided file

apoc.export.cypher.data(nodes,rels,file,config)

exports given nodes and relationships incl. indexes as cypher statements to the provided file

apoc.export.cypher.graph(graph,file,config)

exports given graph object incl. indexes as cypher statements to the provided file

apoc.export.cypher.query(query,file,config)

exports nodes and relationships from the cypher statement incl. indexes as cypher statements to the provided file

GraphML Import / Export

GraphML is used by other tools, like Gephi and Cytoscape, to read graph data.

YIELD file, source, format, nodes, relationships, properties, time

apoc.import.graphml(file-or-url,{batchSize: 10000, readLabels: true, storeNodeIds: false, defaultRelationshipType:"RELATED"})

imports graphml into the graph

apoc.export.graphml.all(file,config)

exports whole database as graphml to the provided file

apoc.export.graphml.data(nodes,rels,file,config)

exports given nodes and relationships as graphml to the provided file

apoc.export.graphml.graph(graph,file,config)

exports given graph object as graphml to the provided file

apoc.export.graphml.query(query,file,config)

exports nodes and relationships from the cypher statement as graphml to the provided file

Loading Data from RDBMS

CALL apoc.load.jdbc('jdbc:derby:derbyDB','PERSON') YIELD row CREATE (:Person {name:row.name})

load from relational database, either a full table or a sql statement

CALL apoc.load.jdbc('jdbc:derby:derbyDB','SELECT * FROM PERSON WHERE AGE > 18')

load from relational database, either a full table or a sql statement

CALL apoc.load.driver('org.apache.derby.jdbc.EmbeddedDriver')

To simplify the JDBC URL syntax and protect credentials, you can configure aliases in conf/neo4j.conf:

apoc.jdbc.myDB.url=jdbc:derby:derbyDB

CALL apoc.load.jdbc('jdbc:derby:derbyDB','PERSON')

becomes

CALL apoc.load.jdbc('myDB','PERSON')

The third segment of the apoc.jdbc.<alias>.url key effectively defines the alias to be used in apoc.load.jdbc('<alias>',….
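
The alias lookup described above can be pictured as a simple key lookup in the configuration. The Python sketch below only models that idea: parse_conf and resolve_jdbc are hypothetical helpers, and falling back to the given string when no alias matches is an assumption, not a statement about APOC's actual implementation.

```python
def parse_conf(text):
    """Read key=value lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def resolve_jdbc(url_or_alias, settings):
    """Return the configured URL for an alias, else the argument itself."""
    return settings.get(f"apoc.jdbc.{url_or_alias}.url", url_or_alias)


conf = parse_conf("# JDBC aliases\napoc.jdbc.myDB.url=jdbc:derby:derbyDB\n")
print(resolve_jdbc("myDB", conf))
print(resolve_jdbc("jdbc:derby:derbyDB", conf))
```

Both calls print jdbc:derby:derbyDB, mirroring how the two apoc.load.jdbc calls shown above refer to the same database.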

Loading Data from Web-APIs (JSON, XML, CSV)

CALL apoc.load.json('http://example.com/map.json') YIELD value as person CREATE (p:Person) SET p = person

load from JSON URL (e.g. web-api) to import JSON as stream of values if the JSON was an array or a single value if it was a map

CALL apoc.load.xml('http://example.com/test.xml') YIELD value as doc CREATE (p:Person) SET p.name = doc.name

load from XML URL (e.g. web-api) to import XML as single nested map with attributes and _type, _text and _children fields.

CALL apoc.load.xmlSimple('http://example.com/test.xml') YIELD value as doc CREATE (p:Person) SET p.name = doc.name

load from XML URL (e.g. web-api) to import XML as single nested map with attributes and _type, _text fields and <childtype> collections per child-element-type.

CALL apoc.load.csv('url',{sep:";"}) YIELD lineNo, list, map

load CSV from URL as stream of values
config contains any of: {skip:1,limit:5,header:false,sep:'TAB',ignore:['tmp'],arraySep:';',mapping:{years:{type:'int',arraySep:'-',array:false,name:'age',ignore:false}}}
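
The apoc.load.json description above distinguishes between a JSON array (streamed as one value per element) and a JSON map (returned as a single value). A minimal Python sketch of that rule, with json_values as a hypothetical helper name that is not part of APOC:

```python
import json


def json_values(text):
    """Yield one value per array element, or the document itself otherwise."""
    data = json.loads(text)
    if isinstance(data, list):
        yield from data
    else:
        yield data


print(list(json_values('[{"name": "Ann"}, {"name": "Bob"}]')))
print(list(json_values('{"name": "Ann"}')))
```

The first call produces two values and the second produces one, which corresponds to the number of rows a YIELD value clause would see in each case.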

Interacting with Elastic Search

apoc.es.stats(host-url-Key)

elastic search statistics

apoc.es.get(host-or-port,index-or-null,type-or-null,id-or-null,query-or-null,payload-or-null) yield value

perform a GET operation

apoc.es.query(host-or-port,index-or-null,type-or-null,query-or-null,payload-or-null) yield value

perform a SEARCH operation

apoc.es.getRaw(host-or-port,path,payload-or-null) yield value

perform a raw GET operation

apoc.es.postRaw(host-or-port,path,payload-or-null) yield value

perform a raw POST operation

apoc.es.post(host-or-port,index-or-null,type-or-null,query-or-null,payload-or-null) yield value

perform a POST operation

apoc.es.put(host-or-port,index-or-null,type-or-null,query-or-null,payload-or-null) yield value

perform a PUT operation

Interacting with MongoDB

CALL apoc.mongodb.get(host-or-port,db-or-null,collection-or-null,query-or-null) yield value

perform a find operation on mongodb collection

CALL apoc.mongodb.count(host-or-port,db-or-null,collection-or-null,query-or-null) yield value

perform a count operation on mongodb collection

CALL apoc.mongodb.first(host-or-port,db-or-null,collection-or-null,query-or-null) yield value

perform a first operation on mongodb collection

CALL apoc.mongodb.find(host-or-port,db-or-null,collection-or-null,query-or-null,projection-or-null,sort-or-null) yield value

perform a find,project,sort operation on mongodb collection

CALL apoc.mongodb.insert(host-or-port,db-or-null,collection-or-null,list-of-maps)

inserts the given documents into the mongodb collection

CALL apoc.mongodb.delete(host-or-port,db-or-null,collection-or-null,list-of-maps)

deletes the given documents from the mongodb collection

CALL apoc.mongodb.update(host-or-port,db-or-null,collection-or-null,list-of-maps)

updates the given documents in the mongodb collection

Copy these jars into the plugins directory:

mvn dependency:copy-dependencies
cp target/dependency/mongodb*.jar target/dependency/bson*.jar $NEO4J_HOME/plugins/

CALL apoc.mongodb.first('mongodb://localhost:27017','test','test',{name:'testDocument'})

Interacting with Couchbase

CALL apoc.couchbase.get(nodes, bucket, documentId) yield id, expiry, cas, mutationToken, content

Retrieves a couchbase json document by its unique ID

CALL apoc.couchbase.exists(nodes, bucket, documentId) yield value

Check whether a couchbase json document with the given ID does exist

CALL apoc.couchbase.insert(nodes, bucket, documentId, jsonDocument) yield id, expiry, cas, mutationToken, content

Insert a couchbase json document with its unique ID

CALL apoc.couchbase.upsert(nodes, bucket, documentId, jsonDocument) yield id, expiry, cas, mutationToken, content

Insert or overwrite a couchbase json document with its unique ID

CALL apoc.couchbase.append(nodes, bucket, documentId, jsonDocument) yield id, expiry, cas, mutationToken, content

Append a couchbase json document to an existing one

CALL apoc.couchbase.prepend(nodes, bucket, documentId, jsonDocument) yield id, expiry, cas, mutationToken, content

Prepend a couchbase json document to an existing one

CALL apoc.couchbase.remove(nodes, bucket, documentId) yield id, expiry, cas, mutationToken, content

Remove the couchbase json document identified by its unique ID

CALL apoc.couchbase.replace(nodes, bucket, documentId, jsonDocument) yield id, expiry, cas, mutationToken, content

Replace the content of the couchbase json document identified by its unique ID.

CALL apoc.couchbase.query(nodes, bucket, statement) yield queryResult

Executes a plain un-parameterized N1QL statement.

CALL apoc.couchbase.posParamsQuery(nodes, bucket, statement, params) yield queryResult

Executes a N1QL statement with positional parameters.

CALL apoc.couchbase.namedParamsQuery(nodes, bucket, statement, paramNames, paramValues) yield queryResult

Executes a N1QL statement with named parameters.

Copy these jars into the plugins directory:

mvn dependency:copy-dependencies
cp target/dependency/java-client-2.3.1.jar target/dependency/core-io-1.3.1.jar target/dependency/rxjava-1.1.5.jar $NEO4J_HOME/plugins/

CALL apoc.couchbase.get(['localhost'], 'default', 'artist:vincent_van_gogh')

Streaming Data to Gephi

apoc.gephi.add(url-or-key, workspace, data)

streams provided data to Gephi

Notes

Gephi has a streaming plugin that can provide and accept JSON graph data in a streaming fashion.

Make sure to install the plugin first and activate it for your workspace (there is a new "Streaming" tab beside "Layout"); right-click "Master"→"start" to start the server.

You can provide your workspace name (you might want to rename it before you start the streaming); otherwise it defaults to workspace0.

The default Gephi-URL is http://localhost:8080, resulting in http://localhost:8080/workspace0?operation=updateGraph

You can also configure it in conf/neo4j.conf via apoc.gephi.url=url or apoc.gephi.<key>.url=url
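
The way the default URL and workspace name combine into the streaming endpoint can be shown in a few lines. This Python sketch only reproduces the URL pattern given in the notes above; gephi_stream_url is a hypothetical helper, not an APOC or Gephi API.

```python
from urllib.parse import urlencode


def gephi_stream_url(base_url="http://localhost:8080", workspace="workspace0"):
    """Build the Gephi streaming endpoint for a workspace."""
    query = urlencode({"operation": "updateGraph"})
    return f"{base_url.rstrip('/')}/{workspace}?{query}"


print(gephi_stream_url())
# http://localhost:8080/workspace0?operation=updateGraph
```

Passing a different workspace name, for example gephi_stream_url(workspace="movies"), changes only the path segment.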

Example

match path = (:Person)-[:ACTED_IN]->(:Movie)
WITH path LIMIT 1000
with collect(path) as paths
call apoc.gephi.add(null,'workspace0', paths) yield nodes, relationships, time
return nodes, relationships, time

Creating Data

CALL apoc.create.node(['Label'], {key:value,…})

create node with dynamic labels

CALL apoc.create.nodes(['Label'], [{key:value,…}])

create multiple nodes with dynamic labels

CALL apoc.create.addLabels( [node,id,ids,nodes], ['Label',…])

adds the given labels to the node or nodes

CALL apoc.create.removeLabels( [node,id,ids,nodes], ['Label',…])

removes the given labels from the node or nodes

CALL apoc.create.relationship(person1,'KNOWS',{key:value,…}, person2)

create relationship with dynamic rel-type

CALL apoc.create.uuid YIELD uuid

creates a UUID

CALL apoc.create.uuids(count) YIELD uuid

creates count UUIDs

CALL apoc.nodes.link([nodes],'REL_TYPE')

creates a linked list of nodes from first to last

CALL apoc.nodes.isDense(node/[nodes]/id/[ids]) yield node, dense

returns each node and a 'dense' flag if it is a dense node

CALL apoc.node.relationship.exists(node, rel-direction-pattern)

yields true when the node has relationships that match the pattern

Virtual Nodes/Rels

Virtual nodes and relationships don’t exist in the graph; they are only returned to the UI/user to represent a graph projection. They can be visualized or processed otherwise. Please note that they have negative ids.

CALL apoc.create.vNode(['Label'], {key:value,…})

returns a virtual node

CALL apoc.create.vNodes(['Label'], [{key:value,…}])

returns virtual nodes

CALL apoc.create.vRelationship(nodeFrom,'KNOWS',{key:value,…}, nodeTo)

returns a virtual relationship

CALL apoc.create.vPattern({_labels:['LabelA'],key:value},'KNOWS',{key:value,…}, {_labels:['LabelB'],key:value})

returns a virtual pattern

CALL apoc.create.vPatternFull(['LabelA'],{key:value},'KNOWS',{key:value,…},['LabelB'],{key:value})

returns a virtual pattern

Example

MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) as count
CALL apoc.create.vNode(['Meta_Node'],{name:l}) yield node as a
CALL apoc.create.vNode(['Meta_Node'],{name:l2}) yield node as b
CALL apoc.create.vRelationship(a,'META_RELATIONSHIP',{name:rel_type, count:count},b) yield rel
RETURN *;

Virtual Graph

Create a graph object (map) from information that’s passed in. Its basic structure is: {name:"Name",properties:{properties},nodes:[nodes],relationships:[relationships]}
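
For readers who think in terms of plain data structures, the graph object can be pictured as a map with four keys. The Python sketch below mirrors only the structure given above; make_graph is a hypothetical helper, and the dictionaries standing in for nodes and relationships are placeholders rather than real Neo4j entities.

```python
def make_graph(name, nodes, relationships, properties=None):
    """Assemble a map with the four keys of the graph object."""
    return {
        "name": name,
        "properties": dict(properties or {}),
        "nodes": list(nodes),
        "relationships": list(relationships),
    }


graph = make_graph(
    "Name",
    nodes=[{"id": 1}, {"id": 2}],  # placeholder node maps
    relationships=[{"start": 1, "end": 2, "type": "KNOWS"}],  # placeholder
    properties={"source": "example"},
)
print(sorted(graph))
# ['name', 'nodes', 'properties', 'relationships']
```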

apoc.graph.from(data,'name',{properties}) yield graph

creates a virtual graph object for later processing; it tries its best to extract the graph information from the data you pass in

apoc.graph.fromData([nodes],[relationships],'name',{properties})