How to create nodes with variable labels in Cypher?

I am using the APOC JSON plugin to create nodes from a JSON file with lists in it, and I am trying to give each node a label taken from an element of its list:
{
"pdf":[
{
"docID": "docid1",
"docLink": "/examplelink.pdf",
"docType": "PDF"
}
],
"jpeg":[
{
"docID": "docid20",
"docLink": "/examplelink20.pdf",
"docType": "JPEG"
}
],
...,}
And I want to both iterate through the doctypes (pdf, jpeg) and set the label as the docType property in the list. Right now I have to do separate blocks for each doctype list (jpeg: [], pdf:[]):
WITH "file:////input.json" AS url
CALL apoc.load.json(url) YIELD value
UNWIND value.pdf as doc
MERGE (d:PDF {docID: doc.docID})
I'd like to loop through the doctype lists, creating the node for each doctype with the label as either the list name (pdf) or the node's docType name (PDF). Something like:
WITH "file:////input.json" AS url
CALL apoc.load.json(url) YIELD value
for each doctypelist in value
for each doc in doctype list
MERGE(d:doc.docType {docID: doc.docID})
Or
WITH "file:////input.json" AS url
CALL apoc.load.json(url) YIELD value
for each doctypelist in value
for each doc in doctype list
MERGE(d {docID: doc.docID})
ON CREATE SET d :doc.docType

Cypher currently does not support this. To set a label, you must hardcode it into the query. You could use filters or multiple MATCH clauses to do this in a tedious way, but if you aren't allowed to install any plugins on your Neo4j database, I would recommend either putting an index on the type property or using a node+relationship instead of a label. (There are a lot of valid doc types, so if you have to support them all, pure Cypher will make it very painful.)
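For instance, here is a minimal sketch of the node+relationship approach (the :Document and :DocType labels, the OF_TYPE relationship type, and the literal list standing in for your parsed JSON are all illustrative, not from the question):
// Model the doc type as data instead of as a label
WITH [{docID: "docid1", docLink: "/examplelink.pdf", docType: "PDF"},
      {docID: "docid20", docLink: "/examplelink20.pdf", docType: "JPEG"}] AS docs
UNWIND docs AS doc
MERGE (t:DocType {name: doc.docType})
MERGE (d:Document {docID: doc.docID})
  ON CREATE SET d.docLink = doc.docLink
MERGE (d)-[:OF_TYPE]->(t)
Querying by type then goes through the :DocType node (or an indexed docType property), which stays manageable no matter how many doc types you have to support.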
Using APOC, however, there is a procedure specifically for this: apoc.create.addLabels.
CREATE (:Movie {title: 'A Few Good Men', genre: 'Drama'});
MATCH (n:Movie)
CALL apoc.create.addLabels( id(n), [ n.genre ] ) YIELD node
REMOVE node.genre
RETURN node;
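Applied to your JSON, a sketch along these lines should work; apoc.merge.node (like apoc.create.addLabels) accepts the label list at runtime, though the exact signature depends on your APOC version:
WITH "file:////input.json" AS url
CALL apoc.load.json(url) YIELD value
UNWIND keys(value) AS docTypeList          // "pdf", "jpeg", ...
UNWIND value[docTypeList] AS doc           // each document map in that list
CALL apoc.merge.node([doc.docType], {docID: doc.docID}, {docLink: doc.docLink}, {}) YIELD node
RETURN node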

Related

How can I query for multiple values after a wildcard?

I have a json object like so:
{
_id: "12345",
identifier: [
{
value: "1",
system: "system1",
text: "text!"
},
{
value: "2",
system: "system1"
}
]
}
How can I use the XDevAPI SearchConditionStr to look for the specific combination of value + system in the identifier array? Something like this, but this doesn't seem to work...
collection.find("'${identifier.value}' IN identifier[*].value && '${identifier.system} IN identifier[*].system")
When you use the IN operator, what happens under the covers is basically a call to JSON_CONTAINS().
So, if you call:
collection.find(":v IN identifier[*].value && :s IN identifier[*].system")
.bind('v', '1')
.bind('s', 'system1')
.execute()
What gets executed, in the end, is (simplified):
JSON_CONTAINS('["1", "2"]', '"2"') AND JSON_CONTAINS('["system1", "system1"]', '"system1"')
In this case, both those conditions are true, and the document will be returned.
The atomic unit is the document (not a slice of that document). So, in your case, regardless of the value of value and/or system, you are still looking for the same document (the one whose _id is '12345'). With such a statement, the document is returned if all the search values are part of it, and it is not returned if any of them is not.
For instance, the following would not yield any results:
collection.find(":v IN identifier[*].value && :s IN identifier[*].system")
.bind('v', '1')
.bind('s', 'system2')
.execute()
EDIT: Potential workaround
I don't think the CRUD API will allow you to perform this kind of "cherry-picking", but you can always use SQL. In that case, one strategy that comes to mind is to use JSON_SEARCH() to retrieve an array of paths corresponding to each value in the scope of identifier[*].value and identifier[*].system (i.e. the array indexes), and then use JSON_OVERLAPS() to ensure they are equal.
session.sql(`select * from collection WHERE json_overlaps(json_search(json_extract(doc, '$.identifier[*].value'), 'all', ?), json_search(json_extract(doc, '$.identifier[*].system'), 'all', ?))`)
.bind('2', 'system1')
.execute()
In this case, the result set will only include documents where the identifier array contains at least one JSON object element whose value is equal to '2' and whose system is equal to 'system1'. The filter is effectively applied over individual array items, not in aggregate as with a basic IN operation.
Disclaimer: I'm the lead developer of the MySQL X DevAPI Connector for Node.js

How to get a list of all attributes (either set or not set) that are applicable for a specific tag in HTML?

For example, how do I get a list of all attributes that are applicable for the <a> tag?
Is there a JS method to do so? The following are my imaginary JS methods describing my requirement:
document.getAllApplicableAttributesForTag('a');
OR
element = document.getElementsByTagName('a')[0];
element.getAllApplicableAttributes();
Is there any valid way to do it?
You can use element.attributes to get a list of the attributes that are currently set on an element.
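For example, a minimal sketch using only standard DOM APIs (note that element.attributes reflects the attributes actually present on the element, not every attribute the tag could take):
const a = document.getElementsByTagName('a')[0];
// a.attributes is a live NamedNodeMap; convert it to a plain array of name/value pairs
const attrs = Array.from(a.attributes).map(attr => ({ attr: attr.name, value: attr.value }));
console.log(attrs); // e.g. [{attr: "href", value: "#"}, {attr: "title", value: "foo"}]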
Or simply use getAttributes() from the hotdom library to get all attributes in a proper format with more options.
const a = dom.select('a');
dom.getAttributes(a);
The above code will return all attributes of the anchor tag:
[
{attr: "href", value: "#"},
{attr: "title", value: "foo"}
]

Creating nodes and relations from JSON (dynamically)

I've got a couple hundred JSONs in a structure like the following example:
{
"JsonExport": [
{
"entities": [
{
"identity": "ENTITY_001",
"surname": "SMIT",
"entityLocationRelation": [
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_001"
},
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_002"
}
],
"entityEntityRelation": [
{
"parentIdentification": "PARENT_ENTITY_001",
"typeRelation": "FRIENDS_WITH",
"childIdentification": "ENTITY_002"
}
]
},
{
"identity": "ENTITY_002",
"surname": "JACKSON",
"entityLocationRelation": [
{
"parentIdentification": "PARENT_ENTITY_002",
"typeRelation": "SEEN_AT",
"locationIdentity": "LOCATION_001"
}
]
},
{
"identity": "ENTITY_003",
"surname": "JOHNSON"
}
],
"identification": "REGISTRATION_001",
"locations": [
{
"city": "LONDON",
"identity": "LOCATION_001"
},
{
"city": "PARIS",
"identity": "LOCATION_002"
}
]
}
]
}
With these JSONs, I want to make a graph consisting of the following nodes: Registration, Entity, and Location. This part I've figured out and wrote the following:
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE(r:Registration {id:data.identification})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..locations.*" ) YIELD value AS locations
MERGE(l:Locations{identity:locations.identity, name:locations.city})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..entities.*" ) YIELD value AS entities
MERGE(e:Entities {name:entities.surname, identity:entities.identity})
All the entities and locations should have a relation with the registration. I thought I could do this by using the following code:
MERGE (e)-[:REGISTERED_ON]->(r)
MERGE (l)-[:REGISTERED_ON]->(r)
However, this code doesn't give the desired output. It creates extra "empty" nodes and doesn't connect them to the registration node. So the first question is: how do I connect the location and entity nodes to the registration node? And, in light of the other JSONs, the entities and locations should only be linked to their specific registration.
Furthermore, I would like to create the entity -> location relations and the entity -> entity relations, using the given type of relation (SEEN_AT or FRIENDS_WITH) as the type of each relationship. How can this be done? I'm kind of lost at this point and don't see how to solve this. If someone could guide me in the right direction I would be much obliged.
Variable names (like e and r) are not stored in the DB, and are bound to values only within individual queries. MERGE on a pattern with an unbound variable will just create the entire pattern (including creating an empty node for unbound node variables).
When you MERGE a node, you should only specify the unique identifying property for that node, to avoid duplicates. Any other properties you want to set at the time of creation should be set using ON CREATE SET.
It is inefficient to parse through the JSON data 3 times to get different areas of the data. And it is especially inefficient the way your query was doing it, since each subsequent CALL/MERGE group of clauses would be executed multiple times (every previous CALL produces multiple rows, and the number of rows increases multiplicatively). You can use aggregation to get around that, but it is unnecessary in your case, since you can do the entire query in a single pass through the JSON data.
This may work for you:
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE(r:Registration {id:data.identification})
FOREACH(ent IN data.entities |
MERGE (e:Entities {identity: ent.identity})
ON CREATE SET e.name = ent.surname
MERGE (e)-[:REGISTERED_ON]->(r)
FOREACH(loc1 IN ent.entityLocationRelation |
MERGE (l1:Locations {identity: loc1.locationIdentity})
MERGE (e)-[:SEEN_AT]->(l1))
FOREACH(ent2 IN ent.entityEntityRelation |
MERGE (e2:Entities {identity: ent2.childIdentification})
MERGE (e)-[:FRIENDS_WITH]->(e2))
)
FOREACH(loc IN data.locations |
MERGE (l:Locations{identity:loc.identity})
ON CREATE SET l.name = loc.city
MERGE (l)-[:REGISTERED_ON]->(r)
)
For simplicity, it hard-codes the SEEN_AT, FRIENDS_WITH, and REGISTERED_ON relationship types, as MERGE only supports hard-coded relationship types.
So, playing with Neo4j/Cypher, I've learned some new stuff and came to another solution for the problem. Based on the given example data, the following creates the nodes and edges dynamically.
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
CALL apoc.merge.node(['Registration'], {id:data.identification}, {},{}) YIELD node AS vReg
UNWIND data.entities AS ent
CALL apoc.merge.node(['Person'], {id:ent.identity}, {}, {id:ent.identity, surname:ent.surname}) YIELD node AS vPer1
UNWIND ent.entityEntityRelation AS entRel
CALL apoc.merge.node(['Person'],{id:entRel.childIdentification},{id:entRel.childIdentification},{}) YIELD node AS vPer2
CALL apoc.merge.relationship(vPer1, entRel.typeRelation, {},{},vPer2) YIELD rel AS ePer
UNWIND data.locations AS loc
CALL apoc.merge.node(['Location'], {id:loc.identity}, {name:loc.city}) YIELD node AS vLoc
UNWIND ent.entityLocationRelation AS locRel
CALL apoc.merge.relationship(vPer1, locRel.typeRelation, {},{},vLoc) YIELD rel AS eLoc
CALL apoc.merge.relationship(vLoc, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg1
CALL apoc.merge.relationship(vPer1, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg2
CALL apoc.merge.relationship(vPer2, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg3
RETURN vPer1,vPer2, vReg, vLoc, eLoc, eReg1, eReg2, eReg3

Format for storing json in Amazon DynamoDB

I've got a JSON file that looks like this:
{
"alliance":{
"name_part_1":[
"Ab",
"Aen",
"Zancl"
],
"name_part_2":[
"aca",
"acia",
"ythrae",
"ytos"
],
"name_part_3":[
"Alliance",
"Bond"
]
}
}
I want to store it in DynamoDB.
The thing is that I want a generator that would take random elements from fields like name_part_1, name_part_2 and others (the number of name_part_x fields is unlimited, and the overall number of items in each part might be several hundred) and join them to create a complete word, like
name_part_1[1] + name_part_2[10] + name_part[3]
My question is: what format should I use to do that effectively? Or should NoSQL not be used for that? Should I refactor the JSON into something like
{
"name": "alliance",
"parts": [ "name_part_1", "name_part_2", "name_part_3" ],
"values": [
{ "name_part_1" : [ "Ab ... ] }, { "name_part_2": [ "aca" ... ] }
]
}
This is a perfect use case for DynamoDB.
You can structure it like this:
NameParts (table)
namepart (partition key)
range (sort key)
An example item: namepart: name_part_1, range: 0, value: "Ab"
This way each name_part will have its own range and will scale; you can extend it to thousands or even millions of items.
You can do a BatchGetItem call from the SDK of your choice and join those values.
REST API reference:
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html
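A rough sketch of that batch read with the AWS SDK for JavaScript (v2) DocumentClient, assuming the NameParts table and key names sketched above (your generator would pick the range values at random; note that BatchGetItem does not guarantee the order of the returned items):
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

const params = {
  RequestItems: {
    NameParts: {
      Keys: [
        { namepart: 'name_part_1', range: 1 },
        { namepart: 'name_part_2', range: 10 },
        { namepart: 'name_part_3', range: 0 }
      ]
    }
  }
};

docClient.batchGet(params, (err, data) => {
  if (err) throw err;
  // Re-order the returned items before joining, since batchGet has no ordering guarantee
  const order = ['name_part_1', 'name_part_2', 'name_part_3'];
  const byPart = Object.fromEntries(data.Responses.NameParts.map(i => [i.namepart, i.value]));
  console.log(order.map(p => byPart[p]).join(''));
});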
Hope it helps.
You can just put the whole document in DynamoDB as it is and then use a document path to access the elements you want.
Document Paths
In an expression, you use a document path to tell DynamoDB where to
find an attribute. For a top-level attribute, the document path is
simply the attribute name. For a nested attribute, you construct the
document path using dereference operators.
The following are some examples of document paths. (Refer to the item
shown in Specifying Item Attributes.)
A top-level scalar attribute: ProductDescription
A top-level list attribute (this will return the entire list, not just some of the elements): RelatedItems
The third element from the RelatedItems list (remember that list elements are zero-based): RelatedItems[2]
The front-view picture of the product: Pictures.FrontView
All of the five-star reviews: ProductReviews.FiveStar
The first of the five-star reviews: ProductReviews.FiveStar[0]
Note: The maximum depth for a document path is 32. Therefore, the number of dereference operators in a path cannot exceed this limit.
Note that each document requires a unique Partition Key.
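As a rough sketch of reading with a document path (the Names table and its id partition key are assumptions; the ProjectionExpression is just a document path into the stored JSON):
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

docClient.get({
  TableName: 'Names',
  Key: { id: 'alliance-doc' },
  // Document paths pull individual nested elements instead of the whole item
  ProjectionExpression: 'alliance.name_part_1[0], alliance.name_part_2[1]'
}, (err, data) => {
  if (err) throw err;
  console.log(data.Item); // e.g. { alliance: { name_part_1: ['Ab'], name_part_2: ['acia'] } }
});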

What's the correct JsonPath expression to search a JSON root object using Newtonsoft.Json.NET?

Most examples deal with the book store example from Stefan Gössner; however, I'm struggling to define the correct JsonPath expression for a simple object (no array):
{ "Id": 1, "Name": "Test" }
I want to check whether this JSON contains Id = 1.
I tried the following expression: $..?[(#.Id == 1]), but this doesn't find any matches using Json.NET.
I also tried Manatee.Json for parsing, and there it seems the JsonPath expression could be something like $[?($.Id == 1)]?
The path that you posted is not valid. I think you meant $..[?(#.Id == 1)] (some characters were out of order). My answer assumes this.
The JSON Path that you're using indicates that the item you're looking for should be in an array.
$ start
.. recursive search (1)
[ array item specification
?( item-based query
#.Id == 1 where the item is an object with an "Id" with value == 1 at the root
) end item-based query
] end array item specification
(1) the conditions following this could match a value no matter how deep in the hierarchy it exists
You want to just navigate the object directly. Using $.Id will return 1, which you can validate in your application.
All of that said...
It sounds to me like you want to validate that the Id property is 1 rather than to search an array for an object where the Id property is 1. To do this, you want JSON Schema, not JSON Path.
JSON Path is a query language for searching for values which meet certain conditions (e.g. an object where Id == 1).
JSON Schema is for validating that the JSON meets certain requirements (i.e. that your data is in the right shape). A JSON Schema to validate that your object has an Id of 1 could be something like:
{
"properties": {
"Id": {"const":1}
}
}
Granted this isn't very useful because it'll only validate that the Id property is 1, which ideally should only be true for one object.