I have a csv file and I want to make 2 nodes with relation (node country-reported_on->node report_date). I have tried this code but it returns empty nodes with numbers instead of country name.
Here is what my dataset looks like:
PEOPLE_POSITIVE_CASES_COUNT;REPORT_DATE;COUNTRY_SHORT_NAME;PEOPLE_DEATH_COUNT;LIFE_EXPECTANCY;GDP;DENSITY_POPULATION;WORKFORCE
0;22.01.2020;Lesotho;0;54.836;875.353432963926;70.5616600790514
134;09.07.2020;Lesotho;1;54.836;875.353432963926;70.5616600790514
79557;02.03.2021;Zambia;1104;64.194;985.132436038869;94.4781600309238
106470;02.03.2021;Kenya;1863;66.991;1878.58070251348;94.4781600309238
Here is the code that I used:
LOAD CSV WITH HEADERS FROM "file:///dataset.csv"
as row WITH row WHERE row.COUNTRY_SHORT_NAME IS NOT NULL
MERGE (c:Country {name: row.COUNTRY_SHORT_NAME,
life_exp: row.LIFE_EXPECTANCY,
gdp: row.GDP,
density_population: row.DENSITY_POPULATION,
worforce: row.WORKFORCE } )
MERGE ( d:Report_date { date: row.REPORT_DATE } )
MERGE (c)-[:reported_on {cases_count: row.PEOPLE_POSITIVE_CASES_COUNT,
death_count: row.PEOPLE_DEATH_COUNT}]->(d)
EDIT
I changed the delimiter to ';' because that is what we had in our dataset however we still get bad results here is how it looks like in neo4j after running this code:
LOAD CSV WITH HEADERS FROM "file:///dataset.csv"
as row FIELDTERMINATOR ';' WITH row WHERE row.COUNTRY_SHORT_NAME IS NOT NULL
MERGE (c:Country {name: row.COUNTRY_SHORT_NAME,
life_exp: row.LIFE_EXPECTANCY,
gdp: row.GDP,
density_population: row.DENSITY_POPULATION,
worforce: row.WORKFORCE } )
MERGE ( d:Report_date { date: row.REPORT_DATE } )
MERGE (c)-[:reported_on {cases_count: row.PEOPLE_POSITIVE_CASES_COUNT,
death_count: row.PEOPLE_DEATH_COUNT}]->(d)
I think you got confused with node caption in Neo4j browser, all nodes get assigned an node id by default and nodes must be showing that. You can change it to country name property by clicking on node label. Screen shot for reference.
I am using the following .csv file for Neo4j import. There are 202 rackets. The numbers below racketX are the rating the user has given that racket.
I want to create the relationships among the users and the rating they have given to each racket. This is my current approach:
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
WITH row
WHERE row.username IS NOT NULL
MERGE (u:User {
username: row.username,
height_m: toInteger(row.height),
weight_kg: toInteger(row.weight)
})
WITH row, u, range(3, 204) as indexes
MATCH (r:Racket)
UNWIND r as racket
UNWIND indexes as i
MERGE (u)-[:RATES {rating:toInteger(row[i])}]->(racket)
I get a "cannot access a map" error. Can you help me?
I would break down the load into multiple steps.
Load the users.
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
WITH row
WHERE row.username IS NOT NULL
MERGE (u:User {
username: row.username,
height_m: toInteger(row.height),
weight_kg: toInteger(row.weight)
})
Load the rackets.
UNWIND RANGE(1,202) as idx
CREATE (:Racket {racketNumber:"racket"+idx})
Load the relationships.
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
UNWIND RANGE (1,202) as idx
MATCH (u:User {username:row.username})
MATCH (r:Racket {racketNumber:"racket"+idx})
MERGE (u)-[:RATES {rating:toInteger(row["racket"+idx])}]->(r)
I have a data set in csv format. One of the fields is "elem_type". Based on this type I need to create different types nodes and give to the "columns" of my csv a different name based on the "elem_type" when loading the data using csv load, any way to do that?
My csv has no header and the data look like this:
0, 123, Marco, Ciao
1, 345, Merceds, Car, Yellow
2, 987, Boat, 150cm
Based on the first colmuns that is my "elem_type" i want to load the data and define 3 types of nodes (Person, Car, Boat) and also based on the elem_type define different header
I highly recommend to pre-parse the csv file into separate files for each label. It will make the cypher for import so much easier. In the following I use a little trick by wrapping the CASE command inside a FOREACH:
load csv from "file:///test.csv" as line
foreach (i in case when line[0] = '0' then [1] else [] end |
merge (p:Person {id: line[1]}) set p.name = line[2] )
foreach (i in case when line[0] = '1' then [1] else [] end |
merge (c:Car {id: line[1]}) set c.name = line[2], c.color = line[4] )
foreach (i in case when line[0] = '2' then [1] else [] end |
merge (b:Boat {id: line[1]}) set b.name = line[2] )
Also, don't forget to add indexes on the properties you are merging on.
I have a CSV file with data like:
ID,Name,Role,Project
1,James,Owner,TST
2,Ed,Assistant,TST
3,Jack,Manager,TST
and want to create people whose relationships to the project are therein specified. I attempted to do it like this:
load csv from 'file:/../x.csv' as line
match (p:Project {code: line[3]})
create (n:Individual {name: line[1]})-[r:line[2]]->(p);
but it barfs with:
Invalid input '[': expected an identifier character, whitespace, '|',
a length specification, a property map or ']' (line 1, column 159
(offset: 158))
as it can't seem to dereference line in the relationship creation. if I hard-code that it works:
load csv from 'file:/../x.csv' as line
match (p:Project {code: line[3]})
create (n:Individual {name: line[1]})-[r:WORKSFOR]->(p);
so how do I make the reference?
Right now you can't as this is structural information.
Either use neo4j-import tool for that.
Or specify it manually as you did, or use this workaround:
load csv with headers from 'file:/../x.csv' as line
match (p:Project {code: line.Project})
create (n:Individual {name: lineName})
foreach (x in case line.Role when "Owner" then [1] else [] end |
create (n)-[r:Owner]->(p)
)
foreach (x in case line.Role when "Assistant" then [1] else [] end |
create (n)-[Assistant]->(p)
)
foreach (x in case line.Role when "Manager" then [1] else [] end |
create (n)-[r:Manager]->(p)
)
Michael's answer stands, however, I've found that what I can do is specify an attribute to the relationship, like this:
load csv from 'file:/.../x.csv' as line
match (p:Project {code: line[3]})
create (i:Individual {name: line[1]})-[r:Role { type: line[2] }]->(p)
and I can make Neo4j display the type attribute of the relationship instead of the label
this question is old, but there is a post by Mark Needham
that provide a great and easy solution using APOC
as follow
load csv with headers from "file:///people.csv" AS row
MERGE (p1:Person {name: row.node1})
MERGE (p2:Person {name: row.node2})
WITH p1, p2, row
CALL apoc.create.relationship(p1, row.relationship, {}, p2) YIELD rel
RETURN rel
note: the "YIELD rel" is essential and so for the return part
I am new to prolog and am considering using it for a small data analysis application. Here is what I am seeking to accomplish:
I have a CSV file with some data of the following from:
a,b,c
d,e,f
g,h,i
...
The data is purely numerical and I need to do the following: 1st, I need to group rows according to the following scheme:
So what's going on above?
I start at the 1st row, which has value 'a' in column one. Then, I keep going down the rows until I hit a row whose value in column one differs from 'a' by a certain amount, 'z'. The process is then repeated, and many "groups" are formed after the process is complete.
For each of these groups, I want to find the mean of columns two and three (as an example, for the 1st group in the picture above, the mean of column two would be: (b+e+h)/3).
I am pretty sure this can be done in prolog. However, I have 50,000+ rows of data and since prolog is declarative, I am not sure how efficient prolog would be at accomplishing the above task?
Is it feasible to work out a prolog program to accomplish the above task, so that efficiency of the program is not significantly lower than a procedural analog?
this snippet could be a starting point for your task
:- [library(dcg/basics)].
rownum(Z, AveList) :- phrase_from_file(row_scan(Z, [], [], AveList), 'numbers.txt').
row_scan(Z, Group, AveSoFar, AveList) -->
number(A),",",number(B),",",number(C),"\n",
{ row_match(Z, A,B,C, Group,AveSoFar, Group1,AveUpdated) },
row_scan(Z, Group1, AveUpdated, AveList).
row_scan(_Z, _Group, AveList, AveList) --> "\n";[].
% row_match(Z, A,B,C, Group,Ave, Group1,Ave1)
row_match(_, A,B,C, [],Ave, [(A,B,C)],Ave).
row_match(Z, A,B,C, [H|T],Ave, Group1,Ave1) :-
H = (F,_,_),
( A - F =:= Z
-> aggregate_all(agg(count,sum(C2),sum(C3)),
member((_,C2,C3), [(A,B,C), H|T]), agg(Count,T2,T3)),
A2 is T2/Count, A3 is T3/Count,
Group1 = [], Ave1 = [(A2,A3)|Ave]
; Group1 = [H,(A,B,C)|T], Ave1 = Ave
).
with this input
1,2,3
4,5,6
7,8,9
10,2,3
40,5,6
70,8,9
16,0,0
yields
?- rownum(6,L).
L = [ (3.75, 4.5), (5, 6)]