How to specify relationship type in CSV?

I have a CSV file with data like:
ID,Name,Role,Project
1,James,Owner,TST
2,Ed,Assistant,TST
3,Jack,Manager,TST
and want to create people whose relationships to the project are therein specified. I attempted to do it like this:
load csv from 'file:/../x.csv' as line
match (p:Project {code: line[3]})
create (n:Individual {name: line[1]})-[r:line[2]]->(p);
but it barfs with:
Invalid input '[': expected an identifier character, whitespace, '|',
a length specification, a property map or ']' (line 1, column 159
(offset: 158))
as it can't seem to dereference line in the relationship creation. If I hard-code the relationship type, it works:
load csv from 'file:/../x.csv' as line
match (p:Project {code: line[3]})
create (n:Individual {name: line[1]})-[r:WORKSFOR]->(p);
so how do I make the reference?

Right now you can't, as the relationship type is structural information. Either use the neo4j-import tool for that, specify the type manually as you did, or use this workaround:
load csv with headers from 'file:/../x.csv' as line
match (p:Project {code: line.Project})
create (n:Individual {name: line.Name})
foreach (x in case line.Role when "Owner" then [1] else [] end |
  create (n)-[r:Owner]->(p)
)
foreach (x in case line.Role when "Assistant" then [1] else [] end |
  create (n)-[r:Assistant]->(p)
)
foreach (x in case line.Role when "Manager" then [1] else [] end |
  create (n)-[r:Manager]->(p)
)

Michael's answer stands; however, I've found that what I can do is specify an attribute on the relationship, like this:
load csv from 'file:/.../x.csv' as line
match (p:Project {code: line[3]})
create (i:Individual {name: line[1]})-[r:Role { type: line[2] }]->(p)
and I can make Neo4j display the relationship's type attribute instead of its label.

This question is old, but there is a post by Mark Needham that provides a great and easy solution using APOC, as follows:
load csv with headers from "file:///people.csv" AS row
MERGE (p1:Person {name: row.node1})
MERGE (p2:Person {name: row.node2})
WITH p1, p2, row
CALL apoc.create.relationship(p1, row.relationship, {}, p2) YIELD rel
RETURN rel
Note: the YIELD rel clause is essential, and so is the RETURN part.
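Adapted to the x.csv from the original question, this might look like the following (a sketch, assuming the same ID,Name,Role,Project header row and that APOC is installed):
load csv with headers from 'file:/../x.csv' as line
match (p:Project {code: line.Project})
merge (n:Individual {name: line.Name})
with n, p, line
call apoc.create.relationship(n, line.Role, {}, p) yield rel
return rel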

Related

Create nodes conditionally when loading nodes from csv in Neo4j

I have a data set in CSV format. One of the fields is "elem_type". Based on this type I need to create different types of nodes, and give the "columns" of my CSV different names depending on the "elem_type", when loading the data with LOAD CSV. Is there any way to do that?
My csv has no header and the data look like this:
0, 123, Marco, Ciao
1, 345, Merceds, Car, Yellow
2, 987, Boat, 150cm
Based on the first column, which is my "elem_type", I want to load the data, define 3 types of nodes (Person, Car, Boat), and also define different headers based on the elem_type.
I highly recommend pre-parsing the CSV file into separate files for each label; it will make the Cypher for the import much easier. In the following I use a little trick: wrapping a CASE expression inside a FOREACH:
load csv from "file:///test.csv" as line
foreach (i in case when line[0] = '0' then [1] else [] end |
  merge (p:Person {id: line[1]}) set p.name = line[2] )
foreach (i in case when line[0] = '1' then [1] else [] end |
  merge (c:Car {id: line[1]}) set c.name = line[2], c.color = line[4] )
foreach (i in case when line[0] = '2' then [1] else [] end |
  merge (b:Boat {id: line[1]}) set b.name = line[2] )
Also, don't forget to add indexes on the properties you are merging on.
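For example (a sketch using the Neo4j 3.x index syntax current for this answer; newer versions use CREATE INDEX FOR (n:Label) ON (n.prop)):
CREATE INDEX ON :Person(id);
CREATE INDEX ON :Car(id);
CREATE INDEX ON :Boat(id);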

dumping list to JSON file creates list within a list [["x", "y","z"]], why?

I want to append multiple list items to a JSON file, but it creates a list within a list, and therefore I cannot access the list from Python. Since the code is overwriting existing data in the JSON file, there should not already be any list there. I also tried it with just text in the file, without brackets. It still creates a list within a list, so [["x", "y","z"]] instead of ["x", "y","z"].
import json

filename = 'vocabulary.json'
print("Reading %s" % filename)
try:
    with open(filename, "rt") as fp:
        data = json.load(fp)
    print("Data: %s" % data)  # check
except IOError:
    print("Could not read file, starting from scratch")
    data = []

# Add some data
TEMPORARY_LIST = []
new_word = input("give new word: ")
TEMPORARY_LIST.append(new_word.split())
print(TEMPORARY_LIST)  # check
data = TEMPORARY_LIST
print("Overwriting %s" % filename)
with open(filename, "wt") as fp:
    json.dump(data, fp)
Example run and output, appending the list of split words:
Reading vocabulary.json
Data: [['my', 'dads', 'house', 'is', 'nice']]
give new word: but my house is nicer
[['but', 'my', 'house', 'is', 'nicer']]
Overwriting vocabulary.json
So, if I understand what you are trying to accomplish correctly, it looks like you are trying to overwrite a list in a JSON file with a new list created from user input. For easiest data manipulation, set up your JSON file in dictionary form:
{
    "words": [
        "my",
        "dad's",
        "house",
        "is",
        "nice"
    ]
}
You should then set up functions to separate your functionality to make it more manageable:
def load_json(filename):
    with open(filename, "r") as f:
        return json.load(f)
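The usage below also calls a write_json() helper that isn't shown in this extract; a minimal sketch (assuming it simply serializes the data back to the file) might be:
def write_json(filename, data):
    # overwrite the file with the given data, pretty-printed
    with open(filename, "w") as f:
        json.dump(data, f, indent=4)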
Now, we can use those functions to load the JSON, access the words list, and overwrite it with the new word.
data = load_json("vocabulary.json")
new_word = input("Give new word: ").split()
data["words"] = new_word
write_json("vocabulary.json", data)
If the user inputs "but my house is nicer", the JSON file will look like this:
{
"words": [
"but",
"my",
"house",
"is",
"nicer"
]
}
Edit
Okay, I have a few suggestions to make before I get into solving the issue. Firstly, it's great that you have delegated much of the functionality of the program to respective functions. However, using global variables is generally discouraged, because it makes things extremely difficult to debug: any of the functions that use a global could have mutated it by accident. To fix this, use function parameters and pass the data around accordingly. With small programs like this, you can think of the main() function as the point that all data comes to and from: main() passes data to other functions and receives new or edited data back. One final recommendation: you should only use all-capital variable names for constants. For example, PI = 3.14159 is a constant, so it is conventional to write "pi" in all caps.
Without using global, main() will look much cleaner:
def main():
    choice = input("Do you want to start or manage the list? (start/manage)")
    if choice == "start":
        data = load_json("vocabulary.json")
        words = data["words"]
        dictee(words)
    elif choice == "manage":
        manage_list()
You can use the load_json() function from earlier (notice that I deleted write_json(), more on that later) if the user chooses to start the game. If the user chooses to manage the file, we can write something like this:
def manage_list():
    choice = input("Do you want to add or clear the list? (add/clear)")
    if choice == "add":
        words_to_add = get_new_words()
        add_words("vocabulary.json", words_to_add)
    elif choice == "clear":
        clear_words("vocabulary.json")
We get the user input first and then we can call two other functions, add_words() and clear_words():
def add_words(filename, words):
    with open(filename, "r+") as f:
        data = json.load(f)
        data["words"].extend(words)
        f.seek(0)
        json.dump(data, f, indent=4)

def clear_words(filename):
    with open(filename, "w+") as f:
        data = {"words": []}
        json.dump(data, f, indent=4)
I did not utilize the load_json() function in the two functions above. My reasoning is that it would mean opening the file more times than needed, which would hurt performance. Furthermore, in these two functions we already need to open the file, so it is okay to load the JSON data there as well, since it only takes one line: data = json.load(f). You may also notice that in add_words() the file mode is "r+". This is the basic mode for reading and writing. "w+" is used in clear_words(), because "w+" not only opens the file for reading and writing, it also overwrites the file if it exists (that is also why we don't need to load the JSON data in clear_words()). Because we have these two functions for writing and/or overwriting data, we don't need the write_json() function that I had initially suggested.
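manage_list() also calls get_new_words(), which isn't shown in this extract; judging from the prompt in the session below, a minimal sketch might be:
def get_new_words():
    # split the space-separated input into a list of words
    return input("Please enter the words you want to add, separated by spaces: ").split()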
We can then add to the list like so:
>>> Do you want to start or manage the list? (start/manage)manage
>>> Do you want to add or clear the list? (add/clear)add
>>> Please enter the words you want to add, separated by spaces: these are new words
And the JSON file becomes:
{
"words": [
"but",
"my",
"house",
"is",
"nicer",
"these",
"are",
"new",
"words"
]
}
We can then clear the list like so:
>>> Do you want to start or manage the list? (start/manage)manage
>>> Do you want to add or clear the list? (add/clear)clear
And the JSON file becomes:
{
"words": []
}
Great! Now, we implemented the ability for the user to manage the list. Let's move on to creating the functionality for the game: dictee()
You mentioned that you want to randomly select an item from a list and remove it from that list so it doesn't get asked twice. There are a multitude of ways you can accomplish this. For example, you could use random.shuffle:
def dictee(words):
    correct = 0
    incorrect = 0
    random.shuffle(words)
    for word in words:
        # ask word
        # evaluate response
        # increment correct/incorrect
        # ask if you want to play again
        pass
random.shuffle randomly shuffles the list in place. Then you can iterate through the list with for word in words: and start the game. You don't necessarily need random.choice here, because by shuffling the list and iterating through it you are essentially selecting random values.
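If you would rather literally remove each word from the list so it can't be asked twice (the other approach mentioned above), a sketch using random.choice might look like this (pick_word is a hypothetical helper):
import random

def pick_word(words):
    # choose a random word and remove it so it can't be asked again
    word = random.choice(words)
    words.remove(word)
    return word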
I hope this helped illustrate how powerful functions and function parameters are. They not only help you separate your code, but also make it easier to manage and understand, and help you write cleaner code.

How to use ransack to search MySQL JSON array in Rails 5

Rails 5 now supports the native JSON data type in MySQL, so if I have a column data that contains an array ["a", "b", "c"] and I want to search whether this column contains certain values, I would basically like to have something like: data_json_cont: ["b"]. Can this query be built using ransack?
Well, I found a way to do this with arrays (not sure about JSON contains for hashes in MySQL). First, include this code in your Active Record model:
self.columns.select { |column| column.type == :json }.each do |column|
  ransacker "#{column.name}_json_contains".to_sym,
            args: [:parent, :ransacker_args] do |parent, args|
    query_parts = args.map do |val|
      "JSON_CONTAINS(#{column.name}, '#{val.to_json}')"
    end
    query = query_parts.join(" * ")
    Arel.sql(query)
  end
end
Then, assuming you have a class Shirt with a JSON column size, you can do the following:
search = Shirt.ransack(
  c: [{
    a: {
      '0' => {
        name: 'size_json_contains',
        ransacker_args: ["L", "XL"]
      }
    },
    p: 'eq',
    v: [1]
  }]
)
search.result
It works as follows: it checks that the array stored in the JSON column contains all elements of the requested array, by getting the result of each JSON_CONTAINS call alone, multiplying them all together, and comparing the product to 1 with the Arel eq predicate. :) You can get OR semantics the same way, by using bitwise OR instead of multiplication.
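For example, the OR variant is a one-line change to the join in the ransacker above (a sketch; the rest of the setup is assumed unchanged):
# OR semantics: the row matches if any JSON_CONTAINS returns 1
query = query_parts.join(" | ")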

Counting occurrences of a list item from a list?

(See edit at the bottom of this post)
I'm making a program in Elixir that counts the types of HTML tags from a list of tags that I've already obtained. This means that the key should be the tag and the value should be the count.
e.g. in the following sample file
<html><head><body><sometag><sometag><sometag2><sometag>
My output should be something like the following:
html: 1
head: 1
body: 1
sometag: 3
sometag2: 1
Here is my code:
def tags(page) do
  taglist = Regex.scan(~r/<[a-zA-Z0-9]+/, page)
  dict = Map.new()
  Enum.map(taglist, fn(x) ->
    tag = String.to_atom(hd(x))
    Map.put_new(dict, tag, 1)
  end)
end
I know I should probably be using Enum.each instead, but when I do that my dictionary ends up just being empty instead of incorrect.
With Enum.map, this is the output I receive:
iex(15)> A3.test
[%{"<html" => 1}, %{"<body" => 1}, %{"<p" => 1}, %{"<a" => 1}, %{"<p" => 1},
%{"<a" => 1}, %{"<p" => 1}, %{"<a" => 1}, %{"<p" => 1}, %{"<a" => 1}]
As you can see, there are duplicate entries and it's turned into a list of dictionaries. For now I'm not even trying to get the count working, so long as the dictionary doesn't duplicate entries (which is why the value is always just "1").
Thanks for any help.
EDIT: ------------------
Okay so I figured out that I need to use Enum.reduce
The following code produces the output I'm looking for (for now):
def tags(page) do
  rawTagList = Regex.scan(~r/<[a-zA-Z0-9]+/, page)
  tagList = Enum.map(rawTagList, fn(tag) -> String.to_atom(hd(tag)) end)
  Enum.reduce(tagList, %{}, fn(tag, acc) ->
    Map.put_new(acc, tag, 1)
  end)
end
Output:
%{"<a": 1, "<body": 1, "<html": 1, "<p": 1}
Now I have to complete the challenge of actually counting the tags as I go... If anyone can offer any insight on that, I'd be grateful!
First of all, it is not the best idea to parse HTML with regexes. See this question for more details (especially the accepted answer).
Secondly, you are trying to write imperative code in a functional language (this is about the first version of your code). Variables in Elixir are immutable, so dict will always be an empty map. Enum.map takes a list and always returns a new list of the same length with all elements transformed. Your transformation function takes an empty map and puts one key-value pair into it.
As a result you get a list of one-element maps. The line:
Map.put_new(dict, tag, 1)
doesn't update dict in place, but creates a new map from the old one, which is empty. In your example it is exactly the same as:
%{tag => 1}
You have a couple of options to do it differently. The closest approach would be to use Enum.reduce. It takes a list, an initial accumulator and a function elem, acc -> new_acc:
taglist
|> Enum.reduce(%{}, fn(tag, acc) -> Map.update(acc, tag, 1, &(&1 + 1)) end)
It looks a little complicated, because there are a couple of nice pieces of syntactic sugar here. taglist |> Enum.reduce(%{}, fun) is the same as Enum.reduce(taglist, %{}, fun), and &(&1 + 1) is shorthand for fn(counter) -> counter + 1 end.
Map.update takes four arguments: the map to update, the key to update, an initial value to insert if the key doesn't exist, and a function that transforms the value if it does exist.
So, those two lines of code do this:
iterate over the list (Enum.reduce)
starting with an empty map (%{})
take the current element and the accumulator map (fn(tag, acc)) and either:
insert 1 if the key doesn't exist
or increment the value by one if it does (&(&1 + 1))
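A quick worked example in iex (using a small, hypothetical tag list) shows the counting in action:
iex> taglist = [:"<p", :"<a", :"<p"]
iex> Enum.reduce(taglist, %{}, fn(tag, acc) -> Map.update(acc, tag, 1, &(&1 + 1)) end)
%{"<a": 1, "<p": 2}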

CSV Parser through angularJS

I am building a CSV file parser with Node and Angular. Basically a user uploads a CSV file, and on my server side (Node) the CSV file is traversed and parsed using node-csv.
This works fine and returns an array of objects based on the CSV file given as input. Now on the Angular end I need to display two tables: one is the CSV file data itself and the other is a cross-tabulation analysis. I am facing a problem while rendering the data.
[The question originally included screenshots of the input table, the parse response, and the desired cross-tabulation table.]
I have an object array which I need to manipulate in the best possible way so as to make it easy to render on an HTML page. I can't see a way to do calculations on the data I get so as to store the cross-tabulation result. Any idea on how I should approach this?
The data JSON is:
[{"Sample #":"1","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"2","Gender":"Male","Handedness;":"Left-handed;"},{"Sample #":"3","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"4","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":"5","Gender":"Male","Handedness;":"Left-handed;"},{"Sample #":"6","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":"7","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"8","Gender":"Female","Handedness;":"Left-handed;"},{"Sample #":"9","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":";"}
There are many ways you can do this and since you have not been very specific on the usage, I will go with the simplest one.
Assuming you have an object structure such as this:
[
  {gender: 'female', handedness: 'lefthanded', id: 1},
  {gender: 'male', handedness: 'lefthanded', id: 2},
  {gender: 'female', handedness: 'righthanded', id: 3},
  {gender: 'female', handedness: 'lefthanded', id: 4},
  {gender: 'female', handedness: 'righthanded', id: 5}
]
and in your controller you have exposed this with something like:
$scope.members = [the above array of objects];
and you want to display the total of female members of this object, you could filter this in your HTML:
{{(members | filter:{gender:'female'}).length}}
Now, if you are going to make this a table, it will obviously produce some ugly and unreadable HTML, so especially if you are going to reuse this, it is a good case for making a directive you can repeat anywhere, with the prerequisite of providing a scope object named tabData (or whatever you wish) in your parent scope:
.directive('tabbed', function () {
  return {
    restrict: 'E',
    template: '<table><tr><td>{{(tabData | filter:{gender:"female"}).length}}</td><td>{{(tabData | filter:{handedness:"lefthanded"}).length}}</td></tr></table>'
  }
});
You would use this in your html like so:
<tabbed></tabbed>
And there are of course many ways to improve this as you wish.
This is more of a general data structure/JS question than Angular related.
Functional helpers from Lo-dash come in very handy here:
_(data) // Create a chainable object from the data to execute functions with
  .groupBy('Gender') // Group the data by its `Gender` attribute
  // map these groups, using `mapValues` so the named `Gender` keys persist
  .mapValues(function(gender) {
    // Create named count objects for all handednesses
    var counts = _.countBy(gender, 'Handedness');
    // Calculate the total of all handednesses by summing
    // all the values of this named object
    counts.Total = _(counts)
      .values()
      .reduce(function(sum, num) { return sum + num; });
    // Return this named count object -- this is what each gender will map to
    return counts;
  }).value(); // get the value of the chain
No need to worry about for-loops or anything of the sort, and this code also works without any changes for more than two genders (even for more than two handednesses; think of the aliens and the ambidextrous). If you aren't sure exactly what's happening, it should be easy enough to pick apart the individual steps and their result values in this code example.
Calculating the total row for all genders will work in a similar manner.
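A sketch of that totals row, assuming the same data array and lodash helpers as above:
// Count handedness over the whole data set for the "all genders" row
var totals = _.countBy(data, 'Handedness');
totals.Total = _(totals)
  .values()
  .reduce(function(sum, num) { return sum + num; });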