10.4. Text Functions

10.4.1. Overview of Text Functions

`apoc.text.indexOf(text, lookup, offset=0, to=-1==len)`	find the first occurrence of the lookup string in the text, offset inclusive, to exclusive; returns -1 if not found, null if text is null.
`apoc.text.indexesOf(text, lookup, from=0, to=-1==len)`	find all occurrences of the lookup string in the text and return them as a list, from inclusive, to exclusive; returns an empty list if not found, null if text is null.
`apoc.text.replace(text, regex, replacement)`	replace each substring of the given string that matches the given regular expression with the given replacement.
`apoc.text.regexGroups(text, regex)`	returns an array containing a nested array for each match. The inner array contains all match groups.
`apoc.text.join(['text1','text2',…], delimiter)`	join the given strings with the given delimiter.
`apoc.text.format(text,[params],language)`	sprintf format the string with the params given, and optional param language (default value is 'en').
`apoc.text.lpad(text,count,delim)`	left pad the string to the given width
`apoc.text.rpad(text,count,delim)`	right pad the string to the given width
`apoc.text.random(length, [valid])`	returns a random string of the specified length
`apoc.text.capitalize(text)`	capitalise the first letter of the word
`apoc.text.capitalizeAll(text)`	capitalise the first letter of every word in the text
`apoc.text.decapitalize(text)`	decapitalize the first letter of the word
`apoc.text.decapitalizeAll(text)`	decapitalize the first letter of all words
`apoc.text.swapCase(text)`	Swap the case of a string
`apoc.text.camelCase(text)`	Convert a string to camelCase
`apoc.text.upperCamelCase(text)`	Convert a string to UpperCamelCase
`apoc.text.snakeCase(text)`	Convert a string to snake-case
`apoc.text.toUpperCase(text)`	Convert a string to UPPER_CASE
`apoc.text.charAt(text, index)`	Returns the decimal value of the character at the given index
`apoc.text.code(codepoint)`	Returns the unicode character of the given codepoint
`apoc.text.hexCharAt(text, index)`	Returns the hex value string of the character at the given index
`apoc.text.hexValue(value)`	Returns the hex value string of the given value
`apoc.text.byteCount(text,[charset])`	return size of text in bytes
`apoc.text.bytes(text,[charset])`	return bytes of the text
`apoc.text.toCypher(value, {skipKeys,keepKeys,skipValues,keepValues,skipNull,node,relationship,start,end})`	tries its best to convert the value to a cypher-property-string
`apoc.text.base64Encode(text)`	Encode a string with Base64
`apoc.text.base64Decode(text)`	Decode Base64 encoded string
`apoc.text.base64UrlEncode(url)`	Encode a url with Base64

The replace, split and regexGroups functions work with regular expressions.

10.4.2. Data Extraction

`apoc.data.url('url') as {protocol,user,host,port,path,query,file,anchor}`	turn URL into map structure
`apoc.data.email('email_address') as {personal,user,domain}`	extract the personal name, user and domain as a map (needs javax.mail jar)
`apoc.data.domain(email_or_url)`	deprecated; returns the domain part of the value
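To illustrate the kind of map that URL extraction produces, here is a minimal Python sketch built on the standard library's `urlsplit`. It is not how APOC implements `apoc.data.url`. The helper name `url_to_map` and the sample URL are hypothetical, the `file` key is left out, and treating empty parts as null is an assumption.

```python
from urllib.parse import urlsplit


def url_to_map(url):
    """Split a URL into a map with keys similar to those listed for apoc.data.url."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "user": parts.username,            # None when the URL carries no user info
        "host": parts.hostname,
        "port": parts.port,                # None when no explicit port is given
        "path": parts.path,
        "query": parts.query or None,      # assumption: empty parts become null
        "anchor": parts.fragment or None,
    }


result = url_to_map("http://user@example.com:8080/docs/index.html?lang=en#top")
print(result["host"], result["port"], result["anchor"])  # example.com 8080 top
```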

10.4.3. Text Similarity Functions

`apoc.text.distance(text1, text2)`	compare the given strings with the Levenshtein distance algorithm
`apoc.text.levenshteinDistance(text1, text2)`	compare the given strings with the Levenshtein distance algorithm
`apoc.text.levenshteinSimilarity(text1, text2)`	calculate the similarity (a value between 0 and 1) between two texts based on Levenshtein distance.
`apoc.text.hammingDistance(text1, text2)`	compare the given strings with the Hamming distance algorithm
`apoc.text.jaroWinklerDistance(text1, text2)`	compare the given strings with the Jaro-Winkler distance algorithm
`apoc.text.sorensenDiceSimilarity(text1, text2)`	compare the given strings with the Sørensen–Dice coefficient formula, assuming an English locale
`apoc.text.sorensenDiceSimilarityWithLanguage(text1, text2, languageTag)`	compare the given strings with the Sørensen–Dice coefficient formula, with the provided IETF language tag
`apoc.text.fuzzyMatch(text1, text2)`	check if two words can be matched in a fuzzy way. Depending on the length of the string, it allows more characters to be edited to match the second string.

10.4.4. Phonetic Comparison Functions

`apoc.text.phonetic(value)`	Compute the US_ENGLISH phonetic soundex encoding of all words of the text value which can be a single string or a list of strings
`apoc.text.doubleMetaphone(value)`	Compute the Double Metaphone phonetic encoding of all words of the text value which can be a single string or a list of strings
`apoc.text.clean(text)`	strip the given string of everything except alphanumeric characters and convert it to lower case.
`apoc.text.compareCleaned(text1, text2)`	compare the given strings stripped of everything except alphanumeric characters and converted to lower case.

Table 10.1. Procedure
`apoc.text.phoneticDelta(text1, text2) yield phonetic1, phonetic2, delta`	Compute the US_ENGLISH soundex character difference between two given strings

10.4.5. Formatting Text

Format the string with the params given, and optional param language.

without language param ('en' default).

RETURN apoc.text.format('ab%s %d %.1f %s%n',['cd',42,3.14,true]) AS value // abcd 42 3.1 true

with language param.

RETURN apoc.text.format('ab%s %d %.1f %s%n',['cd',42,3.14,true],'it') AS value // abcd 42 3,1 true

10.4.6. String Search

The indexOf function provides the first occurrence of the given lookup string within the text, or -1 if not found. It can optionally take from (inclusive) and to (exclusive) parameters.

RETURN apoc.text.indexOf('Hello World!', 'World') // 6

The indexesOf function provides all occurrences of the given lookup string within the text, or an empty list if not found. It can optionally take from (inclusive) and to (exclusive) parameters.

RETURN apoc.text.indexesOf('Hello World!', 'o',2,9) // [4,7]

If you want to get a substring starting from your index match, you can use the following query.

returns World!

WITH 'Hello World!' AS text
WITH text, size(text) AS len, apoc.text.indexOf(text, 'World', 3) AS index
RETURN substring(text, CASE index WHEN -1 THEN len-1 ELSE index END, len);
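The from/to semantics described above can be sketched in Python on top of `str.find`. This is an illustration, not the APOC implementation. The names `index_of` and `indexes_of` are hypothetical, and the sketch assumes that a match must lie entirely before `to` and that overlapping matches are reported.

```python
def index_of(text, lookup, start=0, end=-1):
    """First index of lookup within text[start:end], -1 if absent, None for null text."""
    if text is None:
        return None
    stop = len(text) if end == -1 else end   # -1 means "up to the end of the text"
    return text.find(lookup, start, stop)


def indexes_of(text, lookup, start=0, end=-1):
    """All indexes of lookup within text[start:end]; empty list if there are none."""
    if text is None:
        return None
    stop = len(text) if end == -1 else end
    hits = []
    pos = text.find(lookup, start, stop)
    while pos != -1:
        hits.append(pos)
        pos = text.find(lookup, pos + 1, stop)  # continue just after the last hit
    return hits


print(index_of("Hello World!", "World"))        # 6
print(indexes_of("Hello World!", "o", 2, 9))    # [4, 7]
```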

10.4.7. Regular Expressions

will return 'HelloWorld'.

RETURN apoc.text.replace('Hello World!', '[^a-zA-Z]', '')

RETURN apoc.text.regexGroups('abc <link xxx1>yyy1</link> def <link xxx2>yyy2</link>','<link (\\w+)>(\\w+)</link>') AS result

// [["<link xxx1>yyy1</link>", "xxx1", "yyy1"], ["<link xxx2>yyy2</link>", "xxx2", "yyy2"]]
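The shape of the `regexGroups` result (one inner list per match, holding the full match followed by its groups) can be reproduced with Python's `re` module. This is a sketch only: APOC uses Java regular expressions, whose syntax differs from Python's in places, and the helper name `regex_groups` is hypothetical.

```python
import re


def regex_groups(text, pattern):
    """Return one list per match: the whole match first, then each capture group."""
    return [[m.group(0), *m.groups()] for m in re.finditer(pattern, text)]


text = "abc <link xxx1>yyy1</link> def <link xxx2>yyy2</link>"
print(regex_groups(text, r"<link (\w+)>(\w+)</link>"))
# [['<link xxx1>yyy1</link>', 'xxx1', 'yyy1'], ['<link xxx2>yyy2</link>', 'xxx2', 'yyy2']]

# The replace example above, expressed the same way:
print(re.sub(r"[^a-zA-Z]", "", "Hello World!"))  # HelloWorld
```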

10.4.8. Split and Join

will split with the given regular expression and return ['Hello', 'World'].

RETURN apoc.text.split('Hello   World', ' +')

will return 'Hello World'.

RETURN apoc.text.join(['Hello', 'World'], ' ')

10.4.9. Data Cleaning

will return 'helloworld'.

RETURN apoc.text.clean('Hello World!')

will return true.

RETURN apoc.text.compareCleaned('Hello World!', '_hello-world_')

will return only 'Hello World!'.

UNWIND ['Hello World!', 'hello worlds'] as text
RETURN apoc.text.filterCleanMatches(text, 'hello_world') as text

The clean functionality can be useful for cleaning up slightly dirty text data with inconsistent formatting for non-exact comparisons.

Cleaning will strip the string of all non-alphanumeric characters (including spaces) and convert it to lower case.
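The cleaning rule described above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits only, which may differ from how APOC treats non-ASCII characters.

```python
import re


def clean(text):
    """Lower-case the text and drop everything except a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def compare_cleaned(text1, text2):
    """True when both strings are equal after cleaning."""
    return clean(text1) == clean(text2)


print(clean("Hello World!"))                              # helloworld
print(compare_cleaned("Hello World!", "_hello-world_"))   # True

# Keep only the values whose cleaned form matches the cleaned lookup.
texts = ["Hello World!", "hello worlds"]
print([t for t in texts if compare_cleaned(t, "hello_world")])  # ['Hello World!']
```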

10.4.10. Case Change Functions

10.4.10.1. Capitalise the first letter of the word with `capitalize`

RETURN apoc.text.capitalize("neo4j") // "Neo4j"

10.4.10.2. Capitalise the first letter of every word in the text with `capitalizeAll`

RETURN apoc.text.capitalizeAll("graph database") // "Graph Database"

10.4.10.3. Decapitalize the first letter of the string with `decapitalize`

RETURN apoc.text.decapitalize("Graph Database") // "graph Database"

10.4.10.4. Decapitalize the first letter of all words with `decapitalizeAll`

RETURN apoc.text.decapitalizeAll("Graph Databases") // "graph databases"

10.4.10.5. Swap the case of a string with `swapCase`

RETURN apoc.text.swapCase("Neo4j") // nEO4J

10.4.10.6. Convert a string to lower camelCase with `camelCase`

RETURN apoc.text.camelCase("FOO_BAR");    // "fooBar"
RETURN apoc.text.camelCase("Foo bar");    // "fooBar"
RETURN apoc.text.camelCase("Foo22 bar");  // "foo22Bar"
RETURN apoc.text.camelCase("foo-bar");    // "fooBar"
RETURN apoc.text.camelCase("Foobar");     // "foobar"
RETURN apoc.text.camelCase("Foo$$Bar");   // "fooBar"

10.4.10.7. Convert a string to UpperCamelCase with `upperCamelCase`

RETURN apoc.text.upperCamelCase("FOO_BAR");   // "FooBar"
RETURN apoc.text.upperCamelCase("Foo bar");   // "FooBar"
RETURN apoc.text.upperCamelCase("Foo22 bar"); // "Foo22Bar"
RETURN apoc.text.upperCamelCase("foo-bar");   // "FooBar"
RETURN apoc.text.upperCamelCase("Foobar");    // "Foobar"
RETURN apoc.text.upperCamelCase("Foo$$Bar");  // "FooBar"
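One way to reproduce the `camelCase` and `upperCamelCase` examples above is to split the input on runs of non-alphanumeric characters, lower-case each word, and then capitalise the words as needed. The Python sketch below follows that approach. It is an assumption about the behaviour inferred from the examples, not the APOC source, and the function names are hypothetical.

```python
import re


def _words(text):
    """Split on runs of non-alphanumeric characters and lower-case every word."""
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", text) if w]


def camel_case(text):
    """First word stays lower case, every following word is capitalised."""
    words = _words(text)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])


def upper_camel_case(text):
    """Every word is capitalised, including the first."""
    return "".join(w.capitalize() for w in _words(text))


print(camel_case("FOO_BAR"))          # fooBar
print(camel_case("Foo22 bar"))        # foo22Bar
print(upper_camel_case("foo-bar"))    # FooBar
print(upper_camel_case("Foo$$Bar"))   # FooBar
```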

10.4.10.8. Convert a string to snake-case with `snakeCase`

RETURN apoc.text.snakeCase("test Snake Case"); // "test-snake-case"
RETURN apoc.text.snakeCase("FOO_BAR");         // "foo-bar"
RETURN apoc.text.snakeCase("Foo bar");         // "foo-bar"
RETURN apoc.text.snakeCase("fooBar");          // "foo-bar"
RETURN apoc.text.snakeCase("foo-bar");         // "foo-bar"
RETURN apoc.text.snakeCase("Foo  bar");        // "foo-bar"

10.4.10.9. Convert a string to UPPER_CASE with `toUpperCase`

RETURN apoc.text.toUpperCase("test upper case"); // "TEST_UPPER_CASE"
RETURN apoc.text.toUpperCase("FooBar");          // "FOO_BAR"
RETURN apoc.text.toUpperCase("fooBar");          // "FOO_BAR"
RETURN apoc.text.toUpperCase("foo-bar");         // "FOO_BAR"
RETURN apoc.text.toUpperCase("foo--bar");        // "FOO_BAR"
RETURN apoc.text.toUpperCase("foo$$bar");        // "FOO_BAR"
RETURN apoc.text.toUpperCase("foo 22 bar");      // "FOO_22_BAR"

10.4.11. Base64 De- and Encoding

Encode or decode a string in base64 or base64Url.

EncodeBase64.

RETURN apoc.text.base64Encode("neo4j") // bmVvNGo=

DecodeBase64.

RETURN apoc.text.base64Decode("bmVvNGo=") // neo4j

EncodeBase64Url.

RETURN apoc.text.base64UrlEncode("http://neo4j.com/?test=test") // aHR0cDovL25lbzRqLmNvbS8_dGVzdD10ZXN0

DecodeBase64Url.

RETURN apoc.text.base64UrlDecode("aHR0cDovL25lbzRqLmNvbS8_dGVzdD10ZXN0") // http://neo4j.com/?test=test
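The difference between the plain and the URL-safe variant is only the alphabet: the URL-safe form uses `-` and `_` where standard Base64 uses `+` and `/`. The Python sketch below shows both with the standard `base64` module and reproduces the values above. It assumes UTF-8 as the text encoding, and the wrapper names are hypothetical.

```python
import base64


def b64_encode(text):
    """Standard Base64 of the UTF-8 bytes of text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def b64_decode(encoded):
    """Inverse of b64_encode."""
    return base64.b64decode(encoded).decode("utf-8")


def b64_url_encode(url):
    """URL-safe Base64: '-' and '_' replace '+' and '/'."""
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")


def b64_url_decode(encoded):
    """Inverse of b64_url_encode."""
    return base64.urlsafe_b64decode(encoded).decode("utf-8")


print(b64_encode("neo4j"))                            # bmVvNGo=
print(b64_url_encode("http://neo4j.com/?test=test"))  # aHR0cDovL25lbzRqLmNvbS8_dGVzdD10ZXN0
print(b64_encode("http://neo4j.com/?test=test"))      # aHR0cDovL25lbzRqLmNvbS8/dGVzdD10ZXN0
```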

10.4.12. Random String

You can generate a random string of a specified length by calling apoc.text.random with a length parameter and an optional string of valid characters.

The valid parameter accepts the following regex patterns. Alternatively, you can provide a string of letters and/or characters.

`Pattern`	Description
`A-Z`	A-Z in uppercase
`a-z`	a-z in lowercase
`0-9`	Numbers 0-9 inclusive

The following call will return a random string including uppercase letters, numbers and . and $ characters.

RETURN apoc.text.random(10, "A-Z0-9.$")
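A rough Python sketch of how such a `valid` string could be turned into an alphabet and sampled is shown below. The helper names and the default value `A-Za-z0-9` are assumptions made for the illustration, and `random.choice` is not suitable for security-sensitive values.

```python
import random
import string

# The three range patterns from the table above and the characters they stand for.
RANGES = {
    "A-Z": string.ascii_uppercase,
    "a-z": string.ascii_lowercase,
    "0-9": string.digits,
}


def expand_valid(valid):
    """Expand known range patterns; any remaining characters are used literally."""
    alphabet = ""
    rest = valid
    for pattern, chars in RANGES.items():
        if pattern in rest:
            alphabet += chars
            rest = rest.replace(pattern, "")
    return alphabet + rest


def random_text(length, valid="A-Za-z0-9"):
    """Draw `length` characters uniformly from the expanded alphabet."""
    alphabet = expand_valid(valid)
    return "".join(random.choice(alphabet) for _ in range(length))


print(random_text(10, "A-Z0-9.$"))  # ten characters drawn from A-Z, 0-9, '.' and '$'
```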

10.4.13. Text Similarity Functions

10.4.13.1. Compare the strings with the Levenshtein distance

Compare the given strings with the StringUtils.distance(text1, text2) method (Levenshtein).

RETURN apoc.text.distance("Levenshtein", "Levenstein") // 1
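For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The Python sketch below uses the usual dynamic-programming formulation. It is not the library code APOC delegates to, and normalising the similarity by the longer string's length is an assumption.

```python
def levenshtein(a, b):
    """Edit distance computed row by row, keeping only the previous row in memory."""
    prev = list(range(len(b) + 1))           # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep) the character
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Value between 0 and 1; assumes normalisation by the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("Levenshtein", "Levenstein"))  # 1
print(levenshtein("kitten", "sitting"))          # 3
```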

10.4.13.2. Compare the given strings with the Sørensen–Dice coefficient formula.

computes the similarity assuming Locale.ENGLISH.

RETURN apoc.text.sorensenDiceSimilarity("belly", "jolly") // 0.5

computes the similarity with an explicit locale.

RETURN apoc.text.sorensenDiceSimilarityWithLanguage("halım", "halim", "tr-TR") // 0.5
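Both results above follow from the coefficient's definition over character bigrams: twice the number of shared bigrams divided by the total number of bigrams in both strings. The Python sketch below works this out. The function names are hypothetical, and Python's `lower()` is not locale-aware, so the sketch does not reproduce language-specific case folding such as the Turkish dotted and dotless i.

```python
from collections import Counter


def bigrams(text):
    """Multiset of adjacent character pairs."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def sorensen_dice(text1, text2):
    """2 * |shared bigrams| / (|bigrams of text1| + |bigrams of text2|)."""
    a, b = text1.lower(), text2.lower()
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:                       # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    shared = sum((ba & bb).values())     # multiset intersection
    return 2 * shared / total


# belly -> be, el, ll, ly; jolly -> jo, ol, ll, ly; shared: ll, ly
print(sorensen_dice("belly", "jolly"))   # 0.5
# halım -> ha, al, lı, ım; halim -> ha, al, li, im; shared: ha, al
print(sorensen_dice("halım", "halim"))   # 0.5
```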

10.4.13.3. Check if 2 words can be matched in a fuzzy way with `fuzzyMatch`

Depending on the length of the string, it allows more characters to be edited to match the second string.

RETURN apoc.text.fuzzyMatch("The", "the") // true