Related
I am having some data coming from csv which has \n character in it and I expect neo4j to add a new line when assigning that string to some attribute in node. Apparently its not working. I can see \n character as it is added in the string.
How to make it work? Thanks in Advance.
Following is one such string example from CSV:
Combo 4 4 4 5 \n\nSpare Fiber Inventory. \nMultimode Individual fibers from 9927/9928 to FDB.\nNo available spares from either BTS to FDB - New conduits would be required\n\nFrom FDB to tower top. 9 of 9 Spares available on 2.5 riser cables.
My load command:
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS
FROM 'file:///abc.csv' AS line
WITH line WHERE line.parent <> "" AND line.type = 'LSD' AND line.parent_type = 'XYZ'
This is a hack that I made to replace the occurrences of \n with a newline. The character \ is an escape character so it will replace \n with a new line in line 4. Do not remove line 5 and combine with line 4.
LOAD CSV WITH HEADERS
FROM 'file:///abc.csv' AS line
WITH line WHERE line.parent <> ""
WITH replace(line.parent,'\\n',"
") as parent
MERGE (p:Parent {parent: parent})
RESULT:
{
"identity": 16,
"labels": [
"Parent"
],
"properties": {
"parent": "Combo 4 4 4 5
Spare Fiber Inventory.
Multimode Individual fibers from 9927/9928 to FDB.
No available spares from either BTS to FDB - New conduits would be required
From FDB to tower top. 9 of 9 Spares available on 2.5 riser cables."
}
}
here is how my dataset looks like, I am trying to filter out country that the 4th column is >= 1000.
Marshall Islands,53127,77,41
Vanuatu,276244,25,70
Solomon Islands,611343,23,142
Sao Tome and Principe,204327,72,147
Belize,374681,46,171
Maldives,436330,39,172
Guyana,777859,27,206
Eswatini,1367254,24,323
Timor-Leste,1296311,30,392
Lesotho,2233339,28,619
Guinea-Bissau,1861283,43,799
Namibia,2533794,49,1242
Gambia,2100568,61,1273
.
.
.
Zimbabwe,16529904,32,5329
(total 77 lines of data)
I have tried to run the following command on my terminal, but it only output 1 line of the dataset to new file.
awk -F, '$4 > 999' original.csv > new.csv
*update, all line except Zimbabwe are ending with ^M$.
Here is desired output
Namibia,2533794,49,1242
Gambia,2100568,61,1273
Burundi,10864245,13,1380
Armenia,2930450,63,1849
Rwanda,12208407,17,2091
Mongolia,3075647,68,2103
Kyrgyzstan,6045117,36,2184
Mauritania,4420184,53,2335
Lao People's Democratic Republic,6858160,34,2357
Liberia,4731906,51,2399
Tajikistan,8921343,27,2407
Sierra Leone,7557212,42,3147
Togo,7797694,41,3210
Chad,14899994,23,3406
Congo,5260750,66,3496
Cambodia,16005373,23,3678
Paraguay,6811297,61,4175
El Salvador,6377853,71,4546
Guinea,12717176,36,4552
Benin,11175692,47,5227
Zimbabwe,16529904,32,5329
Azerbaijan,9827589,55,5439
Burkina Faso,19193383,29,5517
Nepal,29304998,19,5666
Haiti,10981229,54,5968
Somalia,14742523,44,6544
Zambia,17094131,43,7346
Senegal,15850567,47,7409
Bolivia (Plurinational State of),11051600,69,7634
Mali,18541980,42,7708
Tunisia,11532127,69,7916
Guatemala,16913504,51,8572
Dominican Republic,10766998,80,8643
Cuba,11484636,77,8841
Afghanistan,35530082,25,8971
Syrian Arab Republic,18269867,54,9774
Uganda,42862957,23,9942
Yemen,28250420,36,10175
Kazakhstan,18204498,57,10438
Ecuador,16624857,64,10585
Côte d'Ivoire,24294750,50,12227
Kenya,49699863,27,13201
Cameroon,24053727,56,13416
Sudan,40533328,34,13931
Ghana,28833629,55,15976
Myanmar,53370609,30,16183
United Republic of Tanzania,57310020,33,18943
Angola,29784193,65,19312
Ethiopia,104957438,20,21317
Peru,32165484,78,24999
Iraq,38274617,70,26899
Algeria,41318141,72,29771
Viet Nam,95540797,35,33643
Thailand,69037516,49,33966
Democratic Republic of the Congo,81339984,44,35692
South Africa,56717156,66,37348
Colombia,49065613,80,39471
Egypt,97553148,43,41660
Philippines,104918094,47,48978
Bangladesh,164669750,36,59047
Pakistan,197015953,36,71797
Nigeria,190886313,50,94525
Mexico,129163273,80,103159
Indonesia,263991375,55,144295
India,1339180125,34,449965
Does anyone have suggestions on how to fix this issue?
Assuming that your Input_file's last field may have spaces in it. You can also check it by doing cat -e Input_file it will show you where is line ending including hidden spaces at the line end. If this is the case then try following command.
awk 'BEGIN{FS=","} $4+0 > 999' Input_file
sorry if this is a newbie question, but I didn't find the answer to this particular question on stackoverflow.
I have a (very large) fixed-width data file that looks like this:
simplefile.txt
ratno fdate ratname typecode country
12346 31/12/2010 HARTZ 4 UNITED STATES
12444 31/12/2010 CHRISTIE 5 UNITED STATES
12527 31/12/2010 HILL AIR 4 UNITED STATES
15000 31/12/2010 TOKUGAVA INC. 5 JAPAN
37700 31/12/2010 HARTLAND 1 UNITED KINGDOM
37700 31/12/2010 WILDER 1 UNITED STATES
18935 31/12/2010 FLOWERS FINAL SERVICES INC 5 UNITED STATES
37700 31/12/2010 MAPLE CORPORATION 1 CANADA
48614 31/12/2010 SERIAL MGMT L.P. 5 UNITED STATES
1373 31/12/2010 AMORE MGMT GROUP N A 1 UNITED STATES
I am trying to convert it into a csv file using the terminal (the file is too big for Excel) that would look like this:
ratno,fdate,ratname,typecode,country
12346,31/12/2010,HARTZ,4,UNITED STATES
12444,31/12/2010,CHRISTIE,5,UNITED STATES
12527,31/12/2010,HILL AIR,4,UNITED STATES
15000,31/12/2010,TOKUGAVA INC.,5,JAPAN
37700,31/12/2010,HARTLAND,1,UNITED KINGDOM
37700,31/12/2010,WILDER,1,UNITED STATES
18935,31/12/2010,FLOWERS FINAL SERVICES INC,5,UNITED STATES
37700,31/12/2010,MAPLE CORPORATION,1,CANADA
48614,31/12/2010,SERIAL MGMT L.P.,5,UNITED STATES
1373,31/12/2010,AMORE MGMT GROUP N A,1,UNITED STATES
I dug a bit around on this site and found a possible solution that relies on the awk shell command:
awk -v FIELDWIDTHS="5 11 31 9 16" -v OFS=',' '{$1=$1;print}' "simpletestfile.txt"
However, when I execute the above command in the terminal, it inadvertently also inserts commas in all white spaces, inside the separate words of what is supposed to remain a single field. The result of the above execution is as follows:
ratno,fdate,ratname,typecode,country
12346,31/12/2010,HARTZ,4,UNITED,STATES
12444,31/12/2010,CHRISTIE,5,UNITED,STATES
12527,31/12/2010,HILL,AIR,4,UNITED,STATES
15000,31/12/2010,TOKUGAVA,INC.,5,JAPAN
37700,31/12/2010,HARTLAND,1,UNITED,KINGDOM
37700,31/12/2010,WILDER,1,UNITED,STATES
18935,31/12/2010,FLOWERS,FINAL,SERVICES,INC,5,UNITED,STATES
37700,31/12/2010,MAPLE,CORPORATION,1,CANADA
48614,31/12/2010,SERIAL,MGMT,L.P.,5,UNITED,STATES
1373,31/12/2010,AMORE,MGMT,GROUP,N,A,1,UNITED,STATES
How can I avoid inserting commas in white spaces outside of delineated fieldwidths? Thank you!
Your attempt was good, but requires gawk (gnu awk) for the FIELDWIDTHS built-in variable. With gawk:
$ gawk -v FIELDWIDTHS="5 11 31 9 16" -v OFS=',' '{$1=$1;print}' file
ratno, fdate, ratname , typecode, country
12346, 31/12/2010, HARTZ , 4 , UNITED STATES
12444, 31/12/2010, CHRISTIE , 5 , UNITED STATES
12527, 31/12/2010, HILL AIR , 4 , UNITED STATES
Assuming you don't want the extra spaces, you can do instead:
$ gawk -v FIELDWIDTHS="5 11 31 9 16" -v OFS=',' '{for (i=1; i<=NF; ++i) gsub(/^ *| *$/, "", $i)}1' file
ratno,fdate,ratname,typecode,country
12346,31/12/2010,HARTZ,4,UNITED STATES
12444,31/12/2010,CHRISTIE,5,UNITED STATES
12527,31/12/2010,HILL AIR,4,UNITED STATES
If you don't have gnu awk, you can achieve the same results with:
$ awk -v fieldwidths="5 11 31 9 16" '
BEGIN { OFS=","; split(fieldwidths, widths) }
{
rec = $0
$0 = ""
start = 1;
for (i=1; i<=length(widths); ++i) {
$i = substr(rec, start, widths[i])
gsub(/^ *| *$/, "", $i)
start += widths[i]
}
}1' file
ratno,fdate,ratname,typecode,country
12346,31/12/2010,HARTZ,4,UNITED STATES
12444,31/12/2010,CHRISTIE,5,UNITED STATES
12527,31/12/2010,HILL AIR,4,UNITED STATES
perl is handy here:
perl -nE ' # read this bottom to top
say join ",",
map {s/^\s+|\s+$//g; $_} # trim leading/trailing whitespace
/^(.{5}) (.{10}) (.{30}) (.{8}) (.*)/ # extract the fields
' simplefile.txt
ratno,fdate,ratname,typecode,country
12346,31/12/2010,HARTZ,4,UNITED STATES
12444,31/12/2010,CHRISTIE,5,UNITED STATES
12527,31/12/2010,HILL AIR,4,UNITED STATES
15000,31/12/2010,TOKUGAVA INC.,5,JAPAN
37700,31/12/2010,HARTLAND,1,UNITED KINGDOM
37700,31/12/2010,WILDER,1,UNITED STATES
18935,31/12/2010,FLOWERS FINAL SERVICES INC,5,UNITED STATES
37700,31/12/2010,MAPLE CORPORATION,1,CANADA
48614,31/12/2010,SERIAL MGMT L.P.,5,UNITED STATES
1373,31/12/2010,AMORE MGMT GROUP N A,1,UNITED STATES
Although, for proper CSV, we need to be a bit cautious about fields containing commas or quotes. If I was feeling less secure about the contents of the file, I'd use this map block:
map {s/^\s+|\s+$//g; s/"/""/g; qq("$_")}
which outputs
"ratno","fdate","ratname","typecode","country"
"12346","31/12/2010","HARTZ","4","UNITED STATES"
"12444","31/12/2010","CHRISTIE","5","UNITED STATES"
"12527","31/12/2010","HILL AIR","4","UNITED STATES"
"15000","31/12/2010","TOKUGAVA INC.","5","JAPAN"
"37700","31/12/2010","HARTLAND","1","UNITED KINGDOM"
"37700","31/12/2010","WILDER","1","UNITED STATES"
"18935","31/12/2010","FLOWERS FINAL SERVICES INC","5","UNITED STATES"
"37700","31/12/2010","MAPLE CORPORATION","1","CANADA"
"48614","31/12/2010","SERIAL MGMT L.P.","5","UNITED STATES"
"1373","31/12/2010","AMORE MGMT GROUP N A","1","UNITED STATES"
I would like to use a webservice who deliver a binary file with some data, I know the result but I don't know how I can decode this binary with a script.
Here is my binary :
https://pastebin.com/3vnM8CVk
0a39 0a06 3939 3831 3438 1206 4467 616d
6178 1a0b 6361 7264 6963 6f6e 5f33 3222
0d54 6865 204f 6c64 2047 7561 7264 2a02
....
Some part are in ASCII so it easy to catch, at the end of the file you got vehicle name in ASCII and some data, it should be kill/victory/battle/XP/Money data but I don't understand how I can decode these hexa value, I tried to compare 2 vehicles who got same kills but I don't see any match.
There is a way to decode these data ?
Thanks :)
Hello guys, after 1 year I started again to find a solution, so here is the structure of the packet I guessed : (the part between [ ] I still don't know what is it for)
[52 37 08 01 10] 4E [18] EA [01 25] AB AA AA 3E [28] D4 [01 30] EC [01 38] 88 01 [40] 91 05 [48] 9F CA 22 [50] F5 C2 9A 02 [5A 12]
| | | | | | | | |
Victories Victory Ratio| | Air target| Xp Money earned
| | | Ground Target
Battles Deaths Respawns
So here is the result :
Victory : 78
Battles : 234
Victory Ratio : ? (should be arround 33%)
Deaths : 212
Respawns : 236
Air Target : 136
Ground Target : 657
Xp : ? (should be arround 566.56k)
Money : ? (should be arround 4.63M)
Is there a special way to calculate the result of a long hex like this ?
F5 C2 9A 02 (should be arround 4.63M)
I tell you a bit more :
I know the result, but I don't know how to calculate it with these hex from the packet.
If I check a packet with a small amout of money or XP to be compatible with one hex :
[52 1E 08 01 10] 01 [18] [01 25] 00 00 80 3F [28] 01 [30] 01 [48] 24 [50] 6E [5A 09]
6E = 110 Money earned
24 = 36 XP earned
Another exemple :
[52 21 08 01 10] 02 [18] 03 [25] AB AA 2A 3F [28] 02 [30] 03 [40] 01 [48] 78 [50] C7 08 [5A 09]
XP earned = hex 78 = 120
Money earned = hex C7 08 = 705
How C7 08 can do 705 decimal ?
Here is the full content in case but I know how to isolate just these part I don't need to decode all these hex data :
https://pastebin.com/vAKPynNb
What you have asked is nothing but how to reverse engineer a binary file. Lot of threads already on SO
Reverse engineer a binary dictionary file to extract strings
Tools to help reverse engineer binary file formats
https://reverseengineering.stackexchange.com/questions/3495/what-tools-exist-for-excavating-data-structures-from-flat-binary-files
http://www.iwriteiam.nl/Ha_HTCABFF.html
The final take out on all is that no single solution for you, you need to spend effort to figure it out. There are tools to help you, but don't expect a magic wand tool to give you the structure/data.
Any kind of file read operation is done in text or binary format with basic file handlers. And some languages offer type reading of int, float etc. or arrays of them.
The complex operations behind these reading are almost always kept hidden from normal users. And then the user has to learn more when it comes to read/write operations of data structures.
In this case, OFFSET and SEEK are the words one must find value and act accordingly. whence data read, it must be converted to suitable data type too.
The following code shows basics for these operations to write data and read blocks to get numbers back. It is written in PHP as the OP has commented in the question he uses PHP.
Offset is calculated with these byte values to be 11: char: 1 byte, short: 2 bytes, int: 4 bytes, float: 4 bytes.
<?php
$filename = "testdata.dat";
$filehandle = fopen($filename, "w+");
$data=["test string","another test string",77,777,77777,7.77];
fwrite($filehandle,$data[0]);
fwrite($filehandle,$data[1]);
$numbers=array_slice($data,2);
fwrite($filehandle,pack("c1s1i1f1",...$numbers));
fwrite($filehandle,"end"); // gives 3 to offset
fclose($filehandle);
$filename = "testdata.dat";
$filehandle = fopen($filename, "rb+");
$offset=filesize($filename)-11-3;
fseek($filehandle,$offset);
$numberblock= fread($filehandle,11);
$numbersback=unpack("c1a/s1b/i1c/f1d",$numberblock);
var_dump($numbersback);
fclose($filehandle);
?>
Once this example understood, the rest is to find the data structure in the requested file. I had written another example but it uses assumptions. I leave the rest to readers to find what assumptions I made here. Be careful though: I know nothing about real structure and values will not be correct.
<?php
$filename = "testfile";
$filehandle = fopen($filename, "rb");
$offset=17827-2*41; //filesize minus 2 user area
fseek($filehandle,$offset);
print $user1 = fread($filehandle, 41);echo "<br>";
$user1pr=unpack("s1kill/s1victory/s1battle/s1XP/s1Money/f1Life",$user1);
var_dump($user1pr); echo "<br>";
fseek($filehandle,$offset+41);
print $user2 = fread($filehandle, 41);echo "<br>";
$user2pr=unpack("s1kill/s1victory/s1battle/i1XP/i1Money/f1Life",$user2);
var_dump($user2pr); echo "<br>";
echo "<br><br>";
$repackeduser2=pack("s3i2f1",$user2pr["kill"],$user2pr["victory"],
$user2pr["battle"],$user2pr["XP"],$user2pr["Money"],
$user2pr["Life"]
);
print $user2 . "<br>" .$repackeduser2;
print "<br>3*s1=6bytes, 2*i=6bytes, 1*f=*bytes (machine dependent)<br>";
print pack("s1",$user2pr["kill"]) ."<br>";
print pack("s1",$user2pr["victory"]) ."<br>";
print pack("s1",$user2pr["battle"]) ."<br>";
print pack("i1",$user2pr["XP"]) ."<br>";
print pack("i1",$user2pr["Money"]) ."<br>";
print pack("f1",$user2pr["Life"]) ."<br>";
fclose($filehandle);
?>
PS: pack and unpack uses machine dependent size for some data types such as int and float, so be careful with working them. Read Official PHP:pack and PHP:unpack manuals.
This looks more like the hexdump of a binary file. Some methods of converting hex to strings resulted in the same scrambled output. Only some lines are readable like this...
Dgamaxcardicon_32" The Old Guard
As #Tarun Lalwani said, you would have to know the structure of this data to get the in plaintext.
If you have access to the raw binary, you could try using strings https://serverfault.com/questions/51477/linux-command-to-find-strings-in-binary-or-non-ascii-file
I have a list "newdetails" that is a list of lists and it needs to be written to a csv file. Each field needs to take up a cell (without the trailing characters and commas) and each sublist needs to go on to a new line.
The code I have so far is:
file = open(s + ".csv","w")
file.write(str(newdetails))
file.write("\n")
file.close()
This however, writes to the csv in the following, unacceptable format:
[['12345670' 'Iphone 9.0' '500' 2 '3' '5'] ['12121212' 'Samsung Laptop' '900' 4 '3' '5']]
The format I wish for it to be in is as shown below:
12345670 Iphone 9.0 500 5 3 5
12121212 Samsung Laptop 900 5 3 5
You can use csv module to write information to csv file.
Please check below links:
csv module in Python 2
csv module in Python 3
Code:
import csv
new_details = [['12345670','Iphone 9.0','500',2,'3','5'],
['12121212','Samsung Laptop','900',4,'3','5']]
import csv
with open("result.csv","w",newline='') as fh
writer = csv.writer(fh,delimiter=' ')
for data in new_details:
writer.writerow(data)
Content of result.csv:
12345670 "Iphone 9.0" 500 2 3 5
12121212 "Samsung Laptop" 900 4 3 5