Trying to load relationships from CSV in Neo4j

I'm trying to csv load relationships. My nodes represent 80 priests and 200 churches. I am trying to do this - which works:
MATCH (p:Priest{name: "Baranowski, Alexander Sylvester" }),(c:Church{name: "St Wenceslaus"})
MERGE (p)-[:POSTED {posting:'1955-61', zip: '60618'}]->(c)
but for about 800 relationships.
Each priest is listed in my CSV sheet perhaps 10 times, so each needs to be connected to up to 10 different churches.
The relationship properties are posting years and zip codes. Nothing I have read and tried has worked. Ideas?
Thanks for your help.

You can try this.
Put your CSV into the import folder of your Neo4j instance:
load csv with headers from "file:///postings.csv" as row
MERGE (p:Priest{name: row.priest })
MERGE (c:Church{name: row.church })
MERGE (p)-[:POSTED {posting:row.posting, zip: row.zip}]->(c)

I assume posting is always present in the data. If so, merge the relationship on posting alone and set zip only when the relationship is created:
load csv with headers from "file:///postings.csv" as row
MERGE (p:Priest{name: row.priest })
MERGE (c:Church{name: row.church })
MERGE (p)-[rel:POSTED{posting:row.posting}]->(c)
On Create set rel.zip=row.zip
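To check the result of either load, a query along these lines counts the POSTED relationships per priest. This is only a sketch: it assumes the Priest and Church labels and the POSTED type used above, and the aliases priest, postings and churches are purely illustrative.

```cypher
// Count POSTED relationships per priest after the import.
// Assumes the Priest/Church labels and POSTED type from the statements above.
MATCH (p:Priest)-[r:POSTED]->(c:Church)
RETURN p.name AS priest, count(r) AS postings, collect(c.name) AS churches
ORDER BY postings DESC
```

A priest who appears in 10 distinct CSV rows should come back with postings = 10; a lower number suggests that some rows were identical and were merged into one relationship.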

Related

Problems to load .csv files generated in Social Network Benchmark to Neo4j

I'm having problems with Neo4j and would be very grateful for any help.
My work is something like this. I have to study and evaluate several graph databases, and to do that I must use a benchmark. The benchmark I'm using is the Social Network Benchmark (SNB).
I generate files with different setups, depending on the configuration chosen. For example: forum_0.csv
These .csv files have headers like this: id | title | creationDate | etc...
The next step in my project is to load them into Neo4j and build a database to test with certain queries, and this is where my problems start.
I have loaded some files into Neo4j, but others fail with errors and I don't understand why.
I'm using this code to load those files. In this example I load forum_0.csv into Neo4j.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM ".../forum_0.csv" AS csvLine
FIELDTERMINATOR "|"
CREATE (:FORUM_0 {id:csvLine.id, title:csvLine.title, creationDate:csvLine.creationDate})
With this code, the data from this file is loaded into Neo4j correctly.
But I can't load forum_containerOf_post_0.csv, which has the header Forum.id | Post.id, correctly:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM ".../forum_containerOf_post_0.csv" AS csvLine
FIELDTERMINATOR "|"
CREATE (:FCOP_0 {Forum.id:csvLine.Forum.id, Post.id:csvLine.Post.id})
The problem here is that I can't access the id from forum_0.csv while loading forum_containerOf_post_0.csv. How can I access that id, or another property? Am I missing some Cypher code?
Is there something wrong in the process? Does anyone here work with SNB and Neo4j?
Can anyone help me with this problem?
I tried to explain my problem, but if you have questions about it, feel free to ask.
Thank you for your time.
The problem is with the headers in the second file. If you want to use periods (.) in the header column names, you need to enclose the column names in backticks when you reference them in the LOAD CSV statement.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM ".../forum_containerOf_post_0.csv" AS csvLine
FIELDTERMINATOR "|"
CREATE (:FCOP_0 {Forum.id:csvLine.`Forum.id`, Post.id:csvLine.`Post.id`})
Yes, your answer is right, but with a small correction:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM ".../forum_containerOf_post_0.csv" AS csvLine
FIELDTERMINATOR "|"
CREATE (:FCOP_0 {`Forum.id`:csvLine.Forum.id, `Post.id`:csvLine.Post.id})
But I discovered another problem. This creates the FCOP_0 label, but without the properties from forum_containerOf_post_0.csv. The two properties are Forum.id and Post.id, but with this process they are not loaded onto the respective nodes: the FCOP_0 nodes are created in Neo4j, but they don't have those two properties.
Can you please help me?
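One likely cause, offered as a sketch rather than a confirmed diagnosis: without backticks, csvLine.Forum.id is read as "the id property of csvLine.Forum". The row has no column called Forum, so the expression evaluates to null, and Cypher does not store properties whose value is null. Backticking both the property key and the column reference should avoid that:

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM ".../forum_containerOf_post_0.csv" AS csvLine
FIELDTERMINATOR "|"
// Backticks on the left name the node property; backticks on the right
// look up the CSV column whose header is literally "Forum.id" / "Post.id".
CREATE (:FCOP_0 {`Forum.id`: csvLine.`Forum.id`, `Post.id`: csvLine.`Post.id`})
```

The path is left elided exactly as in the question. If the nodes still come back without properties, returning csvLine for a few rows would show which keys the loader actually sees.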

Uploading CSV in neo4j

I am trying to upload the following csv file (https://www.dropbox.com/s/95j774tg13qsdxr/out.csv?dl=0) into Neo4j with the following command:
LOAD CSV WITH HEADERS FROM
"file:/home/pavan637/Neo4jDemo/out.csv"
AS csvimport
match (uniprotid:UniprotID{Uniprotid: csvimport.Uniprot_ID})
merge (Prokaryotes_Proteins: Prokaryotes_Proteins{UniprotID: csvimport.DBUni, ProteinID: csvimport.ProteinID, IdentityPercentage: csvimport.IdentityPercentage, AlignedLength:csvimport.al, Mismatches:csvimport.mm, QueryStart:csvimport.qs, QueryEnd: csvimport.qe, SubjectStrat: csvimport.ss, SubjectEnd: csvimport.se, Evalue: csvimport.evalue, BitScore: csvimport.bs})
merge (uniprotid)-[:BlastResults]->(Prokaryotes_Proteins)
I used "match" command in the LOAD CSV command in order to match with the "Uniprot_ID's" of previously loaded CSV.
I have first loaded ReactomeDB.csv (https://www.dropbox.com/s/9e5m1629p3pi3m5/Reactomesample.csv?dl=0) with the following cypher
LOAD CSV WITH HEADERS FROM
"file:/home/pavan637/Neo4jDemo/Reactomesample.csv"
AS csvimport
merge (uniprotid:UniprotID{Uniprotid: csvimport.Uniprot_ID})
merge (reactionname: ReactionName{ReactionName: csvimport.ReactionName, ReactomeID: csvimport.ReactomeID})
merge (uniprotid)-[:ReactionInformation]->(reactionname)
That load into Neo4j was successful.
After that I upload out.csv.
Both CSV files have a Uniprot_ID column, and some of the IDs are the same. Although some Uniprot_ID values are common to both files, Neo4j is not returning any rows.
Any solutions?
Thanks in Advance
Pavan Kumar Alluri
Just a few tips:
only use ONE label and ONE property for MERGE
set the others with ON CREATE SET ...
try to create nodes and rels separately, otherwise you might get into memory issues
you should be consistent with the spelling and upper/lowercase of properties and labels, otherwise you will spend hours debugging (labels, rel-types and property-names are case-sensitive)
you probably don't need merge for relationships, create should do fine
for your statement:
CREATE CONSTRAINT ON (up:UniprotID) assert up.Uniprotid is unique;
CREATE CONSTRAINT ON (pp:Prokaryotes_Proteins) assert pp.UniprotID is unique;
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:/home/pavan637/Neo4jDemo/out.csv" AS csvimport
merge (pp: Prokaryotes_Proteins {UniprotID: csvimport.DBUni})
ON CREATE SET pp.ProteinID=csvimport.ProteinID,
pp.IdentityPercentage=csvimport.IdentityPercentage, ...
;
LOAD CSV WITH HEADERS FROM "file:/home/pavan637/Neo4jDemo/out.csv" AS csvimport
match (uniprotid:UniprotID{Uniprotid: csvimport.Uniprot_ID})
match (pp: Prokaryotes_Proteins {UniprotID: csvimport.DBUni})
merge (uniprotid)-[:BlastResults]->(pp);

CSV LOAD and updating existing nodes / creating new ones

I might be on the wrong track so I could use some helpful input. I receive data from other systems by CSV files which I can import into my DB with CSV LOAD. So far so good.
I got stuck when I needed to reload the CSV to pick up updates. I cannot delete the former data, as I might already have additional user input attached to it. So I need a query that imports the CSV data, tries a match, and when it finds the node just uses SET to override the existing properties. That said, I am unsure how to catch the cases where there is no node in the DB (a new record) and we need to create one.
LOAD CSV FROM "file:xxx.csv" AS csvLine
MATCH (c:Customer {code:"ABC"})
SET c.name = name: csvLine[0]
***OPTIONAL MATCH // Here I am unsure how to express when the node is not found***
MERGE (c:Customer { name: csvLine[0], code: csvLine[1]})
So ideally Cypher would check whether the node is there and update it by SETting the new property from the CSV or, if the node cannot be found, create a new one with the CSV data.
And as a side note: how would I find nodes that are in the DB but not in the CSV file, in order to mark them as obsolete? (This might not be possible during the import, but maybe someone has an idea how to solve this so that the DB is kept clean of deleted records, which can only be detected by a comparison with the latest CSV import. I'm happy for every idea.)
Any idea or hint on how to write the query for updating the graph while importing?
You need to use MERGE's ON MATCH and/or ON CREATE handlers, see http://neo4j.com/docs/stable/query-merge.html#_use_on_create_and_on_match. I assume the customer code in the second column is the identifier, so the name in column one might change on updates:
LOAD CSV FROM "file:xxx.csv" AS csvLine
MERGE (c:Customer {code:csvLine[1]})
ON CREATE SET c.name = csvLine[0]
ON MATCH SET c.name = csvLine[0]
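For the side note about obsolete records, one possible approach (a sketch under assumptions, not something stated in the answer above) is to stamp every node the import touches with a run identifier and then flag whatever did not receive the current stamp. The property names importRun and obsolete and the literal run number 42 are hypothetical; the file path is left elided as in the question.

```cypher
// Step 1: upsert each customer and stamp it with the current import run.
// importRun and the value 42 are illustrative; use a fresh value per import.
LOAD CSV FROM "file:xxx.csv" AS csvLine
MERGE (c:Customer {code: csvLine[1]})
SET c.name = csvLine[0], c.importRun = 42;

// Step 2: any customer not stamped by this run was absent from the CSV.
MATCH (c:Customer)
WHERE c.importRun IS NULL OR c.importRun <> 42
SET c.obsolete = true;
```

The two statements are meant to be run one after the other. Flagging rather than deleting keeps any user input attached to those nodes.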

neo4j: LOAD CSV doubled first data record of the CSV

I have a bug somewhere in my query - and any help would be very appreciated. I use LOAD CSV to import data into my DB. The CSV for testing is
"User1","Group1"
"User2","Group2"
"User3","Group3"
"User1","Group1"
"User2","Group2"
Result for the import should be
Every user is imported as a node without double entry
Every group is imported without double entries
The relationship between user and group is implemented
I work with this query:
LOAD CSV FROM "file:....." AS csvLine
MERGE (u:User { name: csvLine[0]})
MERGE (g:Group { name: csvLine[1]})
CREATE (u)-[:IS_MEMBER_OF]->(g)
Whenever I run the import I get everything as expected, with one exception: the first user in the CSV file is always doubled, so I always have two nodes with the first user's name. All other users exist exactly once. I would be happy to learn what's wrong with this approach. Any comments are appreciated.
Thanks
Balael
I tried this and ended up with 3 Users, 3 Groups and the relationships connecting them exactly as you'd expect:
load csv from "https://gist.githubusercontent.com/mneedham/256b809f5622aebc311f/raw/0be2d9fac59ee453314c140f778c25b8fcad4b4c/file.csv" as csvLine
MERGE (u:User { name: csvLine[0]})
MERGE (g:Group { name: csvLine[1]})
CREATE (u)-[:IS_MEMBER_OF]->(g)
Can you show the output of doing:
MATCH (u:User) RETURN u
and:
MATCH (g:Group) RETURN g
Thanks
Mark
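As a further diagnostic, a grouping query like the sketch below can show whether the two "first user" nodes really carry the same name value. The aliases name and nodes are illustrative only.

```cypher
// Group User nodes by their name value and count the nodes in each group.
MATCH (u:User)
RETURN u.name AS name, count(*) AS nodes
ORDER BY nodes DESC
```

If the doubled user shows up as a single row with nodes = 2, the names are identical and the duplicate came from somewhere other than the data. If it shows up as two rows that merely look alike, the two values differ in some character that is not visible, which would point at the file itself. Separately, because the sample CSV repeats rows, CREATE adds another IS_MEMBER_OF relationship for each repeated row; MERGE (u)-[:IS_MEMBER_OF]->(g) would avoid that.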

Having trouble creating relationships from csv import in neo4j

I have used a command like this to successfully create named nodes from csv:
load csv with headers from "file:/Users/lwyglend/Developer/flavourGroups.csv" as
flavourGroup
create (fg {name: flavourGroup.flavourGroup})
set fg:flavourGroup
return fg
However I am not having any luck using load from csv to create relationships with a similar command:
load csv with headers from "file:/Users/lwyglend/Developer/flavoursByGroup.csv" as
relationship
match (flavour {name: relationship.flavour}),
(flavourGroup {name: relationship.flavourGroup})
create flavour-[:BELONGS_TO]->flavourGroup
From a headed csv file that looks a bit like this:
flavour, flavourGroup
fish, marine
caviar, marine
There are no errors, the command seems to execute, but no relationships are actually created.
If I do a simple match on name: fish and name: marine and then construct the belongs to relationship between the fish and marine pre-existing nodes with cypher, the relationship sets up fine.
Is there a problem with importing from csv? Is my code wrong somehow? I have played around with a few different things but as a total newb to neo4j would appreciate any advice you have.
Wiggle,
I don't know for sure if this is your problem, but I discovered that if you have spaces after your commas in your CSV file (as you show in your example), they appear to be included as part of the field names and field contents. When I made a CSV file like the one you showed and tried to load it, I found that it failed. When I took out the spaces, I found that it succeeded.
As a test, try this query:
LOAD CSV WITH HEADERS FROM "file:/Users/lwyglend/Developer/flavoursByGroup.csv" AS line
RETURN line.flavourGroup
then try this query:
LOAD CSV WITH HEADERS FROM "file:/Users/lwyglend/Developer/flavoursByGroup.csv" AS line
RETURN line.` flavourGroup`
Grace and peace,
Jim
I'm a bit late in answering your question, but I don't think the spaces alone are the culprit. In your example cypher there is no association to the actual nodes in your database, only the csv alias named "relationship".
Try something along this line instead:
load csv with headers from "file:/Users/lwyglend/Developer/flavoursByGroup.csv" as
relationship
match (f:flavour), (fg:flavourGroup)
where f.name = relationship.flavour and
fg.name = relationship.flavourGroup
create (f)-[:BELONGS_TO]->(fg)
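If the CSV file itself cannot be cleaned up, the two suggestions above can be combined into one statement that tolerates the padding. This is a sketch under assumptions: it relies on Jim's observation that the space may end up in the header name, it assumes group nodes carry the flavourGroup label from the first import, and it leaves the flavour node unlabelled as in the original query. The aliases flavourName and groupName are illustrative.

```cypher
LOAD CSV WITH HEADERS FROM "file:/Users/lwyglend/Developer/flavoursByGroup.csv" AS relationship
// Accept the header with or without the leading space, then strip padding from the values.
WITH trim(relationship.flavour) AS flavourName,
     trim(coalesce(relationship.flavourGroup, relationship.` flavourGroup`)) AS groupName
MATCH (f {name: flavourName}), (fg:flavourGroup {name: groupName})
CREATE (f)-[:BELONGS_TO]->(fg)
```

coalesce picks whichever header variant actually exists in the row, and trim removes the leading space from the value, so "fish, marine" should match nodes named fish and marine.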