Multiple LOAD CSV statements in one Cypher query - csv

I'm trying to import rows and create nodes from several .csv files in one Cypher query:
// User nodes
LOAD CSV WITH HEADERS
FROM 'file:///profile.csv' AS profile_line
CREATE (user:User { userId: profile_line.User })
// Project nodes
LOAD CSV WITH HEADERS
FROM 'file:///project.csv' AS project_line
CREATE (project:Project { projectId: project_line.projectId })
// Image nodes
LOAD CSV WITH HEADERS
FROM 'file:///media.csv' AS image_line
CREATE (image:Image { imageId: '<imageId>' })
Throws the following error:
"WITH is required between CREATE and LOAD CSV (line 9, column 1 (offset: 211))
"CREATE (project:Project { projectId: project_line.projectId })"
I am unclear as to how the WITH statement should be constructed.

If you're using the Neo4j Browser, the easiest way to do this is to just separate your statements with semicolons and turn on the 'multi-statement query editor' (which also gives you a nice little progress indicator as each statement is run):
LOAD CSV WITH HEADERS
FROM 'file:///profile.csv' as profile_line
CREATE (user: User { userId: profile_line.User });
LOAD CSV WITH HEADERS
FROM 'file:///project.csv' as project_line
CREATE (project: Project { projectId: project_line.projectId });
LOAD CSV WITH HEADERS
FROM 'file:///media.csv' as image_line
CREATE (image: Image { imageId: image_line.ImageId });
Otherwise, it is still possible to do this in a single query. What we need is a WITH clause that returns a single row irrespective of how many nodes the previous CREATE created (otherwise the next LOAD CSV would run once for every row carried over), so any aggregation function will do. For example:
LOAD CSV WITH HEADERS
FROM 'file:///profile.csv' as profile_line
CREATE (user: User { userId: profile_line.User })
WITH max(1) as dummy
LOAD CSV WITH HEADERS
FROM 'file:///project.csv' as project_line
CREATE (project: Project { projectId: project_line.projectId })
WITH max(1) as dummy
LOAD CSV WITH HEADERS
FROM 'file:///media.csv' as image_line
CREATE (image: Image { imageId: image_line.ImageId })
Added 9 labels, created 9 nodes, set 9 properties, completed after 17 ms.
Though I think this is super unclear for future-you, and I wouldn't recommend it.

Related

Create nodes from CSV in Neo4j

I have a CSV file and I want to create two kinds of nodes joined by a relationship (node country-reported_on->node report_date). I have tried the code below, but it returns empty-looking nodes showing numbers instead of the country name.
Here is what my dataset looks like:
PEOPLE_POSITIVE_CASES_COUNT;REPORT_DATE;COUNTRY_SHORT_NAME;PEOPLE_DEATH_COUNT;LIFE_EXPECTANCY;GDP;DENSITY_POPULATION;WORKFORCE
0;22.01.2020;Lesotho;0;54.836;875.353432963926;70.5616600790514
134;09.07.2020;Lesotho;1;54.836;875.353432963926;70.5616600790514
79557;02.03.2021;Zambia;1104;64.194;985.132436038869;94.4781600309238
106470;02.03.2021;Kenya;1863;66.991;1878.58070251348;94.4781600309238
Here is the code that I used:
LOAD CSV WITH HEADERS FROM "file:///dataset.csv"
as row WITH row WHERE row.COUNTRY_SHORT_NAME IS NOT NULL
MERGE (c:Country {name: row.COUNTRY_SHORT_NAME,
life_exp: row.LIFE_EXPECTANCY,
gdp: row.GDP,
density_population: row.DENSITY_POPULATION,
workforce: row.WORKFORCE } )
MERGE ( d:Report_date { date: row.REPORT_DATE } )
MERGE (c)-[:reported_on {cases_count: row.PEOPLE_POSITIVE_CASES_COUNT,
death_count: row.PEOPLE_DEATH_COUNT}]->(d)
EDIT
I changed the delimiter to ';' because that is what our dataset uses. However, we still get bad results in Neo4j after running this code:
LOAD CSV WITH HEADERS FROM "file:///dataset.csv"
as row FIELDTERMINATOR ';' WITH row WHERE row.COUNTRY_SHORT_NAME IS NOT NULL
MERGE (c:Country {name: row.COUNTRY_SHORT_NAME,
life_exp: row.LIFE_EXPECTANCY,
gdp: row.GDP,
density_population: row.DENSITY_POPULATION,
workforce: row.WORKFORCE } )
MERGE ( d:Report_date { date: row.REPORT_DATE } )
MERGE (c)-[:reported_on {cases_count: row.PEOPLE_POSITIVE_CASES_COUNT,
death_count: row.PEOPLE_DEATH_COUNT}]->(d)
I think you are confused by the node caption in Neo4j Browser. Every node is assigned a node id by default, and the nodes are probably showing that id as their caption. You can change the caption to the country name property by clicking on the node label.

Is there a way to import a csv into Neo4j using foreach or unwind?

I am using the following .csv file for Neo4j import. There are 202 rackets. The numbers below racketX are the rating the user has given that racket.
I want to create the relationships among the users and the rating they have given to each racket. This is my current approach:
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
WITH row
WHERE row.username IS NOT NULL
MERGE (u:User {
username: row.username,
height_m: toInteger(row.height),
weight_kg: toInteger(row.weight)
})
WITH row, u, range(3, 204) as indexes
MATCH (r:Racket)
UNWIND r as racket
UNWIND indexes as i
MERGE (u)-[:RATES {rating:toInteger(row[i])}]->(racket)
I get a "cannot access a map" error. Can you help me?
I would break down the load into multiple steps.
Load the users.
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
WITH row
WHERE row.username IS NOT NULL
MERGE (u:User {
username: row.username,
height_m: toInteger(row.height),
weight_kg: toInteger(row.weight)
})
Load the rackets.
UNWIND RANGE(1,202) as idx
CREATE (:Racket {racketNumber:"racket"+idx})
Load the relationships.
LOAD CSV WITH HEADERS FROM 'http://spreding.online/racket-recommendation-system/data/formattedFiles/formattedUsers.csv' AS row
UNWIND RANGE (1,202) as idx
MATCH (u:User {username:row.username})
MATCH (r:Racket {racketNumber:"racket"+idx})
MERGE (u)-[:RATES {rating:toInteger(row["racket"+idx])}]->(r)
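The "cannot access a map" error in the question comes from row[i]: with WITH HEADERS each row is a map keyed by column name, not a list, so an integer index does not apply. That is why the query above looks ratings up as row["racket"+idx]. As a loose analogy in plain JavaScript (not Cypher), assuming the columns are named racket1 to racket202 as the query above implies, and using invented sample values:

```javascript
// Hypothetical sample row, shaped like what LOAD CSV WITH HEADERS yields:
// a map from column name to string value. The values are made up.
const row = {
  username: "alice",
  height: "180",
  weight: "75",
  racket1: "4",
  racket2: "",
  racket3: "5"
};

// Collect [racketName, rating] pairs by column name,
// mirroring the row["racket"+idx] lookup in the Cypher above.
function ratingsFor(row, racketCount) {
  const ratings = [];
  for (let idx = 1; idx <= racketCount; idx++) {
    const key = "racket" + idx;
    const raw = row[key];
    // Assumption added for this sketch: skip empty or missing cells.
    if (raw === undefined || raw === "") continue;
    ratings.push([key, parseInt(raw, 10)]);
  }
  return ratings;
}

console.log(row[3]); // a positional index finds nothing in a keyed map
console.log(ratingsFor(row, 3));
```

Skipping empty cells is a choice made for this sketch only; the Cypher above does not filter them.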

How to load data from a CSV-file to a Highcharts sankey diagram

I want to load data from an external CSV-file into a Highcharts sankey diagram. After trying several options, I am not sure whether this is even possible, as the result is always an empty chart. The CSV-file will be on the same server in the final version.
As a simple case, see the fiddle https://jsfiddle.net/oy095kzb/
which is merely copy/paste from the official Highcharts sankey example (where data is included in the code), except that data-module is loaded and csvURL is used instead:
series: [{
keys: ['from', 'to', 'weight'],
data: {
csvURL:'https://www.test.basleratlas.ch/sankey_test.csv'
},
type: 'sankey',
name: 'Sankey demo series'
}]
CSV-file-structure:
'from', 'to', 'weight'
'Brazil', 'Portugal', 5
'Brazil', 'France', 1
'Brazil', 'Spain', 1
'Brazil', 'England', 1
'Canada', 'Portugal', 1
...
Note that the data.csvURL feature is used to fetch data from a CSV that is stored on the server, like this link: https://demo-live-data.highcharts.com/vs-load.csv, whereas your link seems to download the CSV file instead. Also note that your CSV file doesn't have defined column names.
EDIT
Note that the data object should be defined at the top level of the chart config, outside the series object.
Also, I used the complete callback to parse the data into the proper format.
Demo: https://jsfiddle.net/BlackLabel/qLu37548/
API: https://api.highcharts.com/highcharts/data.complete
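A rough sketch of the kind of parsing such a callback might need, written as a standalone function so it runs without Highcharts. It assumes the file looks like the CSV shown in the question (one header row, comma-separated, single-quoted text fields). parseSankeyCsv is a hypothetical helper name, not part of the Highcharts API, and the linked demo may well do this differently.

```javascript
// Hypothetical helper: turn CSV text like the question's file into
// [from, to, weight] rows, the shape implied by keys: ['from', 'to', 'weight'].
function parseSankeyCsv(csvText) {
  return csvText
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .slice(1) // drop the header row
    .map(function (line) {
      // Naive split: assumes no commas inside the quoted values.
      const cells = line.split(",").map(function (cell) {
        // Trim whitespace, then strip one leading and one trailing quote.
        return cell.trim().replace(/^'|'$/g, "");
      });
      // The weight must be a number, not the string read from the file.
      return [cells[0], cells[1], Number(cells[2])];
    });
}

// Sample text copied from the CSV structure in the question.
const sample = [
  "'from', 'to', 'weight'",
  "'Brazil', 'Portugal', 5",
  "'Brazil', 'France', 1"
].join("\n");

console.log(parseSankeyCsv(sample));
```

Rows in this shape line up with the keys: ['from', 'to', 'weight'] setting used in the question's series config.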

D3 Loading in CSV file then using only specific columns

I've had a hard time getting two columns from a CSV file with which I am planning on building a basic bar chart. I was planning on getting 2 arrays (one for each column) within one array that I would use as such to build a bar chart. Just getting started with D3, as you can tell.
Currently loading the data in gets me an array of objects, which is then a mess to get two columns of key-value pairs. I'm not sure my thinking is correct...
I see this similar question:
d3 - load two specific columns from csv file
But how would I use selections and enter() to accomplish my goal?
You can't load just 2 columns of a bigger CSV, but you can load the whole thing and extract the columns you want.
Say your csv is like this:
col1,col2,col3,col4
aaa1,aaa2,aaa3,aaa4
bbb1,bbb2,bbb3,bbb4
ccc1,ccc2,ccc3,ccc4
And you load it with
d3.csv('my.csv', function(err, data) {
console.log(data)
/*
output:
[
{ col1:'aaa1', col2:'aaa2', col3:'aaa3', col4:'aaa4' },
{ col1:'bbb1', col2:'bbb2', col3:'bbb3', col4:'bbb4' },
{ col1:'ccc1', col2:'ccc2', col3:'ccc3', col4:'ccc4' }
]
*/
})
If you only want col2 and col3 (and you don't want to simply leave the other columns' data in there, which shouldn't be an issue anyway), you can do this:
var cols2and3 = data.map(function(d) {
return {
col2: d.col2,
col3: d.col3
}
});
console.log(cols2and3)
/*
output:
[
{ col2:'aaa2', col3:'aaa3' },
{ col2:'bbb2', col3:'bbb3' },
{ col2:'ccc2', col3:'ccc3' }
]
*/
I.e. the above code produced a new array of objects with only two props per object.
If you want just an array of values per column — not objects with both columns' values — you can:
var col2data = data.map(function(d) { return d.col2 });
var col3data = data.map(function(d) { return d.col3 });
console.log(col2data) // outputs: ['aaa2', 'bbb2', 'ccc2']
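The question asked for two arrays (one per column) inside one array. A minimal sketch of that, generalizing the per-column map above; columnsOf is a hypothetical helper name, and the rows are hard-coded here to match the sample CSV so the snippet runs without D3:

```javascript
// Rows shaped like what d3.csv produces for the sample file above.
const data = [
  { col1: 'aaa1', col2: 'aaa2', col3: 'aaa3', col4: 'aaa4' },
  { col1: 'bbb1', col2: 'bbb2', col3: 'bbb3', col4: 'bbb4' },
  { col1: 'ccc1', col2: 'ccc2', col3: 'ccc3', col4: 'ccc4' }
];

// Hypothetical helper: returns one array of values per requested column,
// in the same order as the names passed in.
function columnsOf(rows, names) {
  return names.map(function (name) {
    return rows.map(function (d) { return d[name]; });
  });
}

var cols = columnsOf(data, ['col2', 'col3']);
console.log(cols);
// outputs: [['aaa2', 'bbb2', 'ccc2'], ['aaa3', 'bbb3', 'ccc3']]
```

Keep in mind that d3.csv delivers every value as a string, so a numeric column would need converting (for example with Number) before it is used for bar heights.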

Relationship that connect the same node in Neo4j

I will try to be very succinct with my problem. I have the node Person that I loaded using a .csv file and I have another .csv file to be loaded - person_speaks_language_0.csv
(got this header: idPerson|languagePSL )
How can I relate this? How can I create this relationship?
Here is another example, very similar to the previous one, that I can't solve either. I have the Comment node loaded in Neo4j and I need to load another .csv file - comment_replyOf_comment_0.csv
(it has this header: idComment|idComment)
How can I load this file? How can I connect a relation that goes "in and out" from the same node - that connects the same node?
For the first example, there are two options.
If you want Language to be a separate node, try this cypher:
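// The file is pipe-delimited, so the field terminator is set explicitly.
LOAD CSV FROM 'file:///person_speaks_language_0.csv' AS line FIELDTERMINATOR '|'
MATCH (p:Person)
WHERE p.id = line[0]
MERGE (p)-[r:Speaks]->(l:Language { name: line[1] })
RETURN p, l, r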
Or, probably a better option:
LOAD CSV FROM 'file:///person_speaks_language_0.csv' AS line FIELDTERMINATOR '|'
MERGE (p:Person { id: line[0] })-[r:Speaks]->(l:Language { name: line[1] })
RETURN p, l, r
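If you want Language to be a property, try this:
LOAD CSV FROM 'file:///person_speaks_language_0.csv' AS line FIELDTERMINATOR '|'
// Match the existing Person and set the property on it.
MATCH (p:Person { id: line[0] })
SET p.language = line[1]
RETURN p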
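The RETURN clause is optional, and you don't want to include it for big CSV files (although it can be useful for debugging). Note that LOAD CSV without WITH HEADERS also reads the header row as data; adding WITH line SKIP 1 right after the LOAD CSV clause skips it.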
For the second example, try this:
LOAD CSV FROM 'file:///comment_replyOf_comment_0.csv' AS line FIELDTERMINATOR '|'
MERGE (c1:Comment { id: line[0] })-[r:Commented]->(c2:Comment { id: line[1] })
RETURN c1, r, c2