I have a CSV file that looks something like the following, i.e. not in Prolog format:
james,facebook,intel,samsung
rebecca,intel,samsung,facebook
Ian,samsung,facebook,intel
I am trying to write a Prolog predicate that reads the file and returns a list that looks like
[[james,facebook,intel,samsung],[rebecca,intel,samsung,facebook],[Ian,samsung,facebook,intel]]
to be used further in other predicates.
I am still a beginner and have found some good information on SO, which I modified to see if I could get it to work, but I'm stuck because I only generate a list that looks like this:
[[(james,facebook,intel,samsung)],[(rebecca,intel,samsung,facebook)],[(Ian,samsung,facebook,intel)]]
which means that when I take the head of an inner list I get (james,facebook,intel,samsung) and not james.
Here is the code being used (seen on SO and modified):
stream_representations(Input, Lines) :-
    read_line_to_codes(Input, Line),
    (   Line == end_of_file
    ->  Lines = []
    ;   atom_codes(FinalLine, Line),
        term_to_atom(LineTerm, FinalLine),
        Lines = [[LineTerm] | FurtherLines],
        stream_representations(Input, FurtherLines)
    ).
main(Lines) :-
    open('file.txt', read, Input),
    stream_representations(Input, Lines),
    close(Input).
The problem lies with term_to_atom(LineTerm,FinalLine).
First we read a line of the CSV file into a list of character codes in
read_line_to_codes(Input,Line).
Let's simulate input with atom_codes/2:
?- atom_codes('james,facebook,intel,samsung',Line).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...].
Then we recompose the original atom into FinalLine (this seems wasteful; there must be a way to hoover up a line into an atom directly):
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung'.
Then we try to map the atom in FinalLine into a term, LineTerm, using term_to_atom/2:
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
term_to_atom(LineTerm,FinalLine).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
LineTerm = (james, facebook, intel, samsung).
You see the problem here: LineTerm is not quite a list, but a nested term using the functor ',' to separate elements:
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
term_to_atom(LineTerm,FinalLine),
write_canonical(LineTerm).
','(james,','(facebook,','(intel,samsung)))
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
LineTerm = (james, facebook, intel, samsung).
This term ','(james,','(facebook,','(intel,samsung))) will thus also appear in the final result, just written differently as (james,facebook,intel,samsung) and packed into a list:
[(james,facebook,intel,samsung)]
You do not want this term; you want a list. You could use atomic_list_concat/2 to create a new atom that can be read as a list:
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
atomic_list_concat(['[',FinalLine,']'],ListyAtom),
term_to_atom(LineTerm,ListyAtom),
LineTerm = [V1,V2,V3,V4].
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
ListyAtom = '[james,facebook,intel,samsung]',
LineTerm = [james, facebook, intel, samsung],
V1 = james,
V2 = facebook,
V3 = intel,
V4 = samsung.
But that's rather barbaric.
We must do this whole processing in fewer steps:
1) Read a line of comma-separated values from the input.
2) Transform it into a list of atoms or strings directly.
DCGs seem like the correct solution. Maybe someone can add a two-liner.
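Short of a full DCG, a minimal sketch (assuming SWI-Prolog, where atomic_list_concat/3 also works in split mode) collapses the term_to_atom/2 detour into a single split on commas:

stream_representations(Input, Lines) :-
    read_line_to_codes(Input, Line),
    (   Line == end_of_file
    ->  Lines = []
    ;   atom_codes(FinalLine, Line),
        % split the line atom on commas instead of parsing it as a term
        atomic_list_concat(LineTerm, ',', FinalLine),
        Lines = [LineTerm | FurtherLines],
        stream_representations(Input, FurtherLines)
    ).

With this change, main(Lines) should yield [[james,facebook,intel,samsung],[rebecca,intel,samsung,facebook],[Ian,samsung,facebook,intel]] as desired.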
I'm getting a ValueError: Lock objects should only be shared between processes through inheritance when writing an xarray.DataArray with to_netcdf().
Everything works until writing to disk, but I found a workaround, which is to use dask.config.set(scheduler='single-threaded').
Is everyone supposed to use dask.config.set(scheduler='single-threaded') before writing to disk?
Am I missing something?
I tested two schedulers:
1) from dask.distributed import Client; client = Client()
2) import dask.multiprocessing; dask.config.set(scheduler=dask.multiprocessing.get)
python=2.7, xarray=0.10.9, traceback:
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/site-packages/xarray/core/dataarray.py", line 1746, in to_netcdf
return dataset.to_netcdf(*args, **kwargs)
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/site-packages/xarray/core/dataset.py", line 1254, in to_netcdf
compute=compute)
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/site-packages/xarray/backends/api.py", line 724, in to_netcdf
unlimited_dims=unlimited_dims, compute=compute)
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/site-packages/xarray/core/dataset.py", line 1181, in dump_to_store
store.sync(compute=compute)
...
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/multiprocessing/synchronize.py", line 95, in __getstate__
assert_spawning(self)
File "/home/py_user/miniconda2/envs/v0/lib/python2.7/multiprocessing/forking.py", line 52, in assert_spawning
' through inheritance' % type(self).__name__
As @jhamman mentioned in the comments, this may have been fixed in a newer version of xarray.
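Until an upgrade is possible, the workaround can at least be scoped to just the write, so the rest of the pipeline keeps the parallel scheduler. A minimal sketch (the array and file name are placeholders):

import dask
import numpy as np
import xarray as xr

# placeholder dask-backed DataArray
da = xr.DataArray(np.arange(10), dims='x').chunk({'x': 5})

# apply the single-threaded scheduler only for the duration of the write
with dask.config.set(scheduler='single-threaded'):
    da.to_netcdf('out.nc')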
I am trying to load a *.csv file into Neo4j and, in the same load statement, split the line (which has no delimiters but has fixed positions for the data I need to create nodes from). I want to use the substring function, but I can't figure out how to get it to work. The data reads in as a single line:
0067011990999991958051507004+68750+023550FM-12+038299999V0203301N00671220001CN9999999N9+00001+99999999999
I have tried using the following code:
LOAD CSV WITH HEADERS FROM "file:/c:/itw/Ltemps.csv" AS line
WITH line
WHERE line.year IS split((substring(line, 15, 19))) and line.temp IS split((substring(line, 88, 92))) and line.qlfr IS split((substring(line, 87, 88))) and line.qual IS split((substring(line, 92, 93)))
MERGE (y:Year {year:line.year})
MERGE (t:Temp {temp:line.temp})
MERGE (f:Qlfr {qlfr:line.qlfr})
MERGE (q:Qual {qual:line.qual})
CREATE (y)-[r:HAS_TEMP]->(t);
I am looking to get 4 nodes: year, temp (an absolute value), a qualifier (positive or negative symbol), and a quality number. The indexes for where the data lies in the string should be accurate.
First, let's try to get the indexes and types right. To convert numeric substrings to integers, we use the toInteger function:
WITH '0067011990999991958051507004+68750+023550FM-12+038299999V0203301N00671220001CN9999999N9+00001+99999999999' AS line
RETURN
toInteger(substring(line, 15, 4)) AS year,
toInteger(substring(line, 88, 2)) AS temp,
substring(line, 87, 1) AS qlfr,
toInteger(substring(line, 92, 1)) AS qual
This gives:
╒══════╤══════╤══════╤══════╕
│"year"│"temp"│"qlfr"│"qual"│
╞══════╪══════╪══════╪══════╡
│1958 │0 │"+" │1 │
└──────┴──────┴──────┴──────┘
If the results look good, add back LOAD CSV and the MERGE clauses. Two things:
1) I don't think it makes sense to use WITH HEADERS, as headers are useless in this case. Simply load the row and use row[0] as the line for splitting.
2) It is possible to simplify the MERGE by combining your first two MERGE clauses with the CREATE clause.
So the loader code is the following:
LOAD CSV FROM 'file:/c:/itw/Ltemps.csv' AS row
WITH row[0] AS line
WITH
toInteger(substring(line, 15, 4)) AS year,
toInteger(substring(line, 88, 2)) AS temp,
substring(line, 87, 1) AS qlfr,
toInteger(substring(line, 92, 1)) AS qual
MERGE (y:Year {year: year})-[r:HAS_TEMP]->(t:Temp {temp: temp})
MERGE (f:Qlfr {qlfr: qlfr})
MERGE (q:Qual {qual: qual})
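To spot-check the load afterwards, a quick verification query against the labels created above (hypothetical, purely for inspection):

MATCH (y:Year)-[:HAS_TEMP]->(t:Temp)
RETURN y.year AS year, t.temp AS temp
LIMIT 5;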
The following query, executed against an old MySQL database, should reveal the single UTF-8 character 山 ('yama', mountain).
select convert(sc_cardname using binary) as cn
from mtg.mtg_cdb_set_cards where setcardid = 214400
Instead it yields the following 15-byte array:
[195, 165, 194, 177, 194, 177, 195, 168, 226, 128, 158, 226, 128, 176, 32]
What are these values and how do I get from there to a character identity?
For reference, the expected binary array would be the following:
[229, 177, 177]
Update: the following code fixes the yama problem, but I don't know why:
var Iconv = require('iconv').Iconv;           // converter from the 'iconv' package
var iconv = new Iconv('utf8', 'ISO-8859-1');  // re-encode UTF-8 text as Latin-1 bytes
var shortBuffer = buffer.slice(0, -9);        // keep the first 6 bytes (the double-encoded character)
var result = iconv.convert(shortBuffer).toString('utf8');
The answer was this: everything was actually encoded in LATIN1. Changing the connection properties to reflect that solved the problem.
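To see why the Latin-1 setting matters, here is a sketch of the round trip in Node.js: the first six values of the observed array are exactly the UTF-8 bytes of 山 (0xE5 0xB1 0xB1) mis-read as Latin-1 and encoded to UTF-8 a second time:

// the UTF-8 bytes of the expected character
var utf8 = Buffer.from('山', 'utf8');             // <e5 b1 b1>
// mis-decode them as Latin-1, then re-encode the result as UTF-8
var doubled = Buffer.from(utf8.toString('latin1'), 'utf8');
console.log(Array.from(doubled));                 // [ 195, 165, 194, 177, 194, 177 ]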
I am trying to learn a few basic functions in igraph, but I am having problems computing the degrees from a graph; see the example below (I copied it from this site):
Example of data set:
edges <- matrix(c(103, 86, 24, 103, 103, 2, 92, 103, 87, 103, 103, 101, 103, 44), ncol=2, byrow=T)
Create the graph:
g <- graph(as.vector(t(edges)))
I can compute the degrees from the matrix edges:
degree(edges)
[1] 378 254 210 390 380 408 294 1230 1084
But I cannot compute the degrees from the graph g:
degree(g)
I am getting the following error:
Error in FUN(X[[1L]], ...) :
as.edgelist.sna input must be an adjacency matrix/array, edgelist matrix, network, or sparse matrix, or list thereof.
Does anyone know why I am getting this error?
What happened here is that igraph::degree is masked by sna::degree.
Just use:
igraph::degree
and it should work.
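A minimal sketch with the data from the question (assuming igraph was attached first and sna second, so sna::degree masks igraph::degree):

library(igraph)
library(sna)   # loading sna after igraph masks igraph::degree

edges <- matrix(c(103, 86, 24, 103, 103, 2, 92, 103,
                  87, 103, 103, 101, 103, 44), ncol = 2, byrow = TRUE)
g <- graph(as.vector(t(edges)))

igraph::degree(g)   # the qualified call bypasses the masking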
I ran into the same issue.
This worked for me:
net <- make_ring(10)
deg <- centralization.degree(net)$res