EOFError: Ran out of input with pickle while the file is not empty - pickle

This is the part of my code where I open the file and hit this error, even though the file is not empty. I opened and wrote this file hundreds of times during my code without any problem, and its size on disk is 25 MB, so it's definitely not empty. Here is where I open it:
import pickle

file_to_read = open("attribute.pickle", "rb")
unpickler = pickle.Unpickler(file_to_read)
attribute = unpickler.load()
file_to_read.close()
The attribute object is updated (appended to) during the run, and here is where I save it back to the file:
geeky_file = open("attribute.pickle", 'wb')
pickle.dump(attribute, geeky_file)
geeky_file.close()
This opening, appending, and saving has been done in a loop hundreds of times without any problem. But now that I've rerun the code, it fails with the "EOFError: Ran out of input" error when loading at
attribute = unpickler.load()
I tried reading posts about the same error, but all the answers say the file might be empty, while mine definitely isn't.
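A non-empty pickle can still raise this error if a previous run was interrupted part-way through pickle.dump: opening with 'wb' truncates immediately, so a crash mid-dump leaves a large but incomplete file. A minimal sketch of a safer save/load pair, assuming only the standard library (the .tmp file name is just an illustration):

import os
import pickle

def save_attribute(attribute):
    # Dump to a temporary file first, then atomically swap it in,
    # so an interrupted run never leaves a half-written pickle behind.
    with open("attribute.pickle.tmp", "wb") as f:
        pickle.dump(attribute, f)
    os.replace("attribute.pickle.tmp", "attribute.pickle")

def load_attribute():
    # The with-block guarantees the handle is closed even on errors.
    with open("attribute.pickle", "rb") as f:
        return pickle.load(f)

os.replace either fully succeeds or leaves the old file in place, so attribute.pickle always holds a complete pickle.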

Related

What is causing a CSV load error in Weka?

I'm receiving the following error when trying to open a CSV file in Weka version 3.8.5:
File not recognized as an 'CSV data files' file Reason: wrong number
of values. Read 2, expected 12, read Token [EOL], Line 2 Problem
encountered on Line:2
I have read solutions to similar errors on this site and can't seem to find what is wrong with my particular file. However, as a very new Weka user, it may just be my misunderstanding of the issue. Can someone take a look at the sample CSV data below and let me know if you see what I am not understanding or missing?
LossMonth,LossYear,ClaimNumber,PolicyNumber,ClaimBranch,Agency,LocationCounty,CATCode,CauseCode,IncurredLoss,CurrentReserves,"
City",State,ZIPCODE,"
COLLISIONTYPECD","
CLOSEDDT",DaystoCLose,"
FATALITYCNT","
FATALITYIND","
FAULTRATINGIND","
AUTOGLASSIND","
DEERLOSSIND","
WEATHERRELATEDIND","
POLICYTIERCD",ClaimStatus,AgencyHandled,VEHICLEYEAR,DRIVERRELATIONTOINSUREDDESC,TOTALLOSSIND,INSURANCESCORE,Age
10,2016,4125858,20169200,4,113,73,1,comp,2525,0,PADUCAH,KY,42001,x,42692,18,0,0,0,0,0,1,70,1,0,2004,Other third party,0,703,73
1,2018,4265645,20137828,13,106,37,1,hail,3164,0,BAGDAD,KY,40003,x,43214,88,0,0,0,0,0,0,50,1,0,2010,Named Insured,1,799,63
12,2016,4136759,20322058,5,105,105,1,hail,2547,0,GEORGETOWN,KY,40324,x,42713,2,0,0,0,0,0,0,10,1,0,2010,Named Insured,0,999,68
1,2016,4033032,20175699,13,106,106,1,comp,15327,0,SIMPSONVILLE,KY,40067,x,42469,73,0,0,0,0,0,1,80,1,0,2000,Named Insured,1,999,34
9,2016,4116782,20133146,2,115,115,1,wind,7529,0,SPRINGFIELD,KY,40069,x,42649,8,0,0,0,0,0,0,10,1,0,2003,Named Insured,0,783,47
2,2016,4038442,20170355,7,148,10,1,hail,3631,0,ASHLAND,KY,41101,x,42417,1,0,0,0,0,0,0,50,1,0,2010,Named Insured,0,778,42
2,2016,4039439,20218265,7,45,10,1,hail,3579,0,FLATWOODS,KY,41139,x,42444,25,0,0,0,0,0,0,40,1,0,2013,Named Insured,0,820,52
2,2016,4039440,20218265,7,45,10,1,hail,570,0,FLATWOODS,KY,41139,x,42422,3,0,0,0,0,0,0,40,1,0,2012,Named Insured,0,820,52
3,2018,4275810,20126522,15,40,40,1,hail,3747,0,LANCASTER,KY,40444,x,43216,55,0,0,0,0,0,0,10,1,0,2009,Named Insured,1,999,74
5,2016,4071936,20461965,15,40,40,1,hail,525,0,LANCASTER,KY,40444,x,42521,7,0,0,0,0,0,0,50,1,0,2006,Named Insured,0,999,68
3,2016,4046685,20226270,7,35,35,1,hail,3558,0,FLEMINGSBURG,KY,41041,x,42447,2,0,0,0,0,0,0,80,1,0,2012,Named Insured,0,842,69
4,2016,4055942,20439287,7,35,35,1,hail,2551,0,EWING,KY,41039,x,42475,1,0,0,0,0,0,0,70,1,0,2006,Named Insured,0,867,48
1,2016,4026514,20394097,7,148,10,1,hail,1350,0,ASHLAND,KY,41101,x,42376,3,0,0,0,0,0,0,40,1,0,2007,Named Insured,0,637,65
3,2016,4047152,20212062,15,141,76,1,hail,1739,0,BEREA,KY,40403,x,42473,27,0,0,0,0,0,0,80,1,0,2008,Named Insured,0,777,77
2,2016,4035512,20103029,15,40,40,1,hail,2008,0,LANCASTER,KY,40444,x,42405,1,0,0,0,0,0,0,0,1,0,2000,Named Insured,1,885,72
1,2016,4030456,20385643,15,120,40,1,hail,1497,0,LANCASTER,KY,40444,x,42450,62,0,0,0,0,0,0,20,1,0,2013,Named Insured,0,839,65
4,2016,4053299,20251610,5,69,11,1,hail,1535,0,DANVILLE,KY,40422,x,42514,48,0,0,0,0,0,0,100,1,0,2013,Insured,0,999,64
6,2016,4076264,20337992,17,140,1,1,hail,1799,0,MILLTOWN,KY,42728,x,42529,2,0,0,0,0,0,0,50,1,0,2002,Named Insured,0,999,84
8,2017,4217498,20596983,8,86,86,1,hail,660,0,TOMPKINSVILLE,KY,42167,x,42954,0,0,0,0,0,0,0,100,1,0,2012,Named Insured,0,999,45
1,2016,4026053,20511114,4,113,113,1,hail,1310,0,STURGIS,KY,42459,x,42376,3,0,0,0,0,0,0,100,1,0,2003,Named Insured,0,694,44
1,2016,4026766,20656586,4,113,113,1,hail,2360,0,MORGANFIELD,KY,42437,x,42383,9,0,0,0,0,0,0,20,1,0,2010,Named Insured,0,999,89
1,2016,4027473,20085251,6,42,42,1,hail,1699,0,MAYFIELD,KY,42066,x,42381,5,0,0,0,0,0,0,90,1,0,2008,Named Insured,0,747,50
1,2016,4029284,20167051,17,109,109,1,wind,3133,0,CAMPBELLSVILLE,KY,42718,x,42387,5,0,0,0,0,0,0,10,1,0,1993,Named Insured,0,886,78
1,2016,4031937,20326278,3,81,12,1,comp,3385,0,FOSTER,KY,41043,x,42402,8,0,0,0,0,0,1,40,1,0,2003,Named Insured,0,723,79
1,2016,4027931,20339366,8,107,107,1,wind,5858,0,FRANKLIN,KY,42134,x,42447,70,0,0,0,0,0,0,20,1,0,2014,Named Insured,0,940,80
1,2016,4028456,20453076,15,87,87,1,comp,2056,0,JEFFERSONVILLE,KY,40337,x,42387,7,0,0,0,0,0,1,100,1,0,2013,Named Insured,0,999,51
1,2016,4028597,20051661,4,113,113,1,hail,5320,0,WAVERLY,KY,42462,x,42712,332,0,0,0,0,0,0,20,1,0,2014,Named Insured,0,717,58
3,2016,4046687,20018268,6,42,42,1,hail,2736,0,MAYFIELD,KY,42066,x,42450,5,0,0,0,0,0,0,110,1,0,2012,Named Insured,0,735,73
9,2016,4116499,20128172,3,96,59,1,glss,320,0,TAYLOR MILL,KY,41015,x,42660,20,0,0,0,0,0,1,0,1,0,1997,Spouse,0,923,81
1,2016,4026247,20086164,4,113,113,1,hail,1611,0,MORGANFIELD,KY,42437,x,42376,3,0,0,0,0,0,0,10,1,0,2013,Named Insured,0,902,61
1,2016,4027222,20033936,6,79,79,1,glss,105,0,CALVERT CITY,KY,42029,x,42389,14,0,0,0,0,0,1,110,1,0,2001,Named Insured,0,772,57
1,2016,4028311,20059964,4,75,75,1,comp,1040,0,SACRAMENTO,KY,42372,x,42382,2,0,0,0,0,0,1,10,1,0,1996,Named Insured,0,999,64
1,2016,4029164,20541039,6,42,42,1,wind,1495,0,SEDALIA,KY,42079,x,42382,0,0,0,0,0,0,0,0,1,0,2008,Named Insured,0,756,67
1,2016,4027475,20085251,6,42,42,1,hail,940,0,MAYFIELD,KY,42066,x,42381,5,0,0,0,0,0,0,90,1,0,2013,Named Insured,0,747,50
1,2016,4030356,20007300,4,117,117,1,hail,6550,0,DIXON,KY,42409,x,42436,49,0,0,0,0,0,0,40,1,0,2009,Named Insured,0,864,34
Weka's CSVLoader cannot handle rows that span multiple lines (despite quoting). Once all your rows (header and data) are one per line, you should be fine.
The common-csv (unofficial) Weka package should be able to handle rows spanning multiple lines.
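If you want to stay with the built-in CSVLoader, one option is to rewrite the file so each row sits on a single line. A minimal Python sketch of that clean-up, with placeholder file names (csv.reader already understands quoted fields that contain newlines):

import csv

with open("claims.csv", newline="") as src, \
     open("claims_oneline.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for row in csv.reader(src):
        # Strip the embedded line breaks from each field before writing,
        # so every logical record becomes exactly one physical line.
        writer.writerow([field.replace("\n", " ").strip() for field in row])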

Python string replacement in a file gives a different result each time

I have a JSON file and I want to do some replacements in it. I've written some code; it works, but it's unreliable.
This is where the replacement gets done.
replacements1 = {builtTelefon: 'Isim', builtIlce: 'Isim', builtAdres: 'Isim', builtIsim: 'Isim'}
replacements3 = {builtYesterdayTelefon: 'Isim', builtYesterdayIlce: 'Isim', builtYesterdayAdres: 'Isim', builtYesterdayIsim: 'Isim'}
with open('veri3.json', encoding='utf-8') as infile, open('veri2.json', 'w') as outfile:
    for line in infile:
        for src, target in replacements1.items():
            line = line.replace(src, target)
        for src, target in replacements3.items():
            line = line.replace(src, target)
        outfile.write(line)
Here are some examples of what builtAdres and builtYesterdayAdres look like:
01 Temmuz 2018 Pazar.1
30 Haziran 2018 Cumartesi.1
I run this on my data, but it produces a different output each time. Please check the screenshot below, because I don't know how else to describe it. This is the very same code, run on the same input every time, yet the outcome differs on each run.
Here is the original JSON file:
What it should do is scan the entire file for 01 Temmuz 2018 Pazar and, wherever that appears, replace it with the string Isim without touching anything else. On a second pass it checks for 30 Haziran 2018 Cumartesi and replaces that with the string Isim too.
What's causing this?
Example files for re-testing:
pastebin - veri3.json
pastebin - code.py
I think you have just one problem: you're using "Isim" as a key name multiple times within the same object, and that produces broken JSON.
The "different results" most likely come from the client you're using to display the JSON. If you look at the raw data, the file should be fully altered (I ran your script and it is). However, the client cannot handle the repeated keys well and displays each object as best it can.
In fact, I'm not sure how you got "Isim.1", "Isim.2" as keys, since you actually use "Isim" for all of them. The client must be trying to cope with the duplication there.
Try this code, where I use "Isim.1", "Isim.2" etc.:
replacements1 = {builtTelefon:'Isim.3', builtIlce:'Isim.2', builtAdres:'Isim.1', builtIsim:'Isim'}
replacements3 = {builtYesterdayTelefon:'Isim.3', builtYesterdayIlce:'Isim.2', builtYesterdayAdres:'Isim.1', builtYesterdayIsim:'Isim'}
I think you should be able to have all the keys displayed now.
Oh, and PS: to use your code with my locale I had to change line 124 to specify 'utf-8' as the encoding for the outfile as well:
with open('veri3.json', encoding='utf-8') as infile, open('veri2.json', 'w', encoding='utf-8') as outfile:
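If you want to see the duplicate-key behaviour for yourself, Python's own json module keeps only the last value for a repeated key, while object_pairs_hook exposes everything it parsed. A small self-contained demo:

import json

doc = '{"Isim": "a", "Isim": "b", "Isim": "c"}'

# Plain parsing silently keeps only the last duplicate.
print(json.loads(doc))  # {'Isim': 'c'}

# object_pairs_hook reveals every key/value pair that was actually parsed.
print(json.loads(doc, object_pairs_hook=list))
# [('Isim', 'a'), ('Isim', 'b'), ('Isim', 'c')]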

Stata doesn't save output file

I'm having difficulty saving the output of my regression. Stata is supposed to save the file as "output.dta" in the defined directory, but the file never appears in that folder (or anywhere else on my PC). Here is the final piece of code, where I want it to be saved:
if (`counter'==1) {
    save "C:\Users\Milla\Code\output", replace
    local counter = `counter' + 1
}
if (`counter'!=1) {
    cap append using "C:\Users\Milla\Code\output"
    duplicates drop *, force
    cap save "C:\Users\Milla\Code\output", replace
}
Does anyone have an idea why this could happen? The code runs fine and throws no errors or warnings, but it also never prints "file output.dta saved" the way Stata normally does when saving.
Thanks in advance and best regards,
Milla
Try spelling out the file extension and the replace option in full:
save "C:\Users\Milla\Code\output.dta", replace

JSON file without line breaks, can't import file to SAS

I have a large JSON file (250 MB) that shows no line breaks when I open it in Notepad or SAS. But if I open it in WordPad, I get the correct line breaks. From what I have read, this could mean the JSON file uses Unix line endings, which Notepad can't interpret but WordPad can.
I need to import the file into SAS. One way of doing this might be to open the file in WordPad and save it as a text file, which would hopefully keep the correct line breaks so that I can read the file in SAS. I have tried reading the file as-is, but without line breaks I only get the first observation, and I can't get the program to find the next one.
I have tried getting WordPad to save the file, but it crashes every time, probably because of the file size. I also tried doing this through PowerShell, but I can't figure out how to save the file once it is opened, and I see no reason why that should work given that WordPad crashes when I try it through point-and-click.
Is there another way to fix this JSON file? Is there a way to see the Unix characters used for line breaks and replace them with Windows line breaks, or something to that effect?
EDIT:
I have tried adding the TERMSTR=LF option both in filename and infile, without any luck:
filename test "C:\path";
data datatest;
    infile test lrecl=32000 truncover scanover TERMSTR=LF;
    input #'"Id":' ID $9.;
run;
However, if I manually edit a small portion of the file to add line breaks, it works. The TERMSTR option doesn't seem to do anything for me.
EDIT 2:
Solved using RECFM=F
data datatest;
    infile test lrecl=42000 truncover scanover RECFM=F;
    input #'"Id":' ID $9.;
run;
EDIT 3:
Turns out it didn't solve the problem after all. RECFM=F means all records have a fixed length, which they don't, so my data gets mixed up and a lot of information is skipped. I tried RECFM=V(ariable), but that doesn't work either.
I guess you're using Windows, so try:
TYPE input_filename | MORE /P > output_filename
This should convert the Unix-style text file into a Windows/DOS one.
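If the command-line route struggles with a 250 MB file, the same conversion can be done in a few lines of Python (file names are placeholders; this assumes the file uses bare LF endings, as described):

# Stream the file in 1 MB chunks so it is never fully in memory,
# rewriting Unix (LF) line endings as Windows (CRLF).
with open("input.json", "rb") as src, open("output.json", "wb") as dst:
    while True:
        chunk = src.read(1 << 20)
        if not chunk:
            break
        dst.write(chunk.replace(b"\n", b"\r\n"))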
250 MB is not too long to treat as a single record.
data want;
    infile json lrecl=250000000; * 250 MB;
    input #'"Id":' ID :$9. ##;
run;

Trouble following Encrypted BigQuery tutorial document

I wanted to try out the encrypted BigQuery client for Google BigQuery, and I've been having some trouble.
I'm following the instructions outlined in this PDF:
https://docs.google.com/file/d/0B-WB8hYCrhZ6cmxfWFpBci1lOVE/edit
I get to the point where I'm running this command:
ebq load --master_key_filename="key_file" testdataset.cars cars.csv cars.schema
I'm getting an error message that ends with:
raise ValueError("No JSON object could be decoded")
I've tried a few different formats for my .csv and .schema files but none have worked. Here are my latest versions.
cars.schema:
[{"name": "Year", "type": "integer", "mode": "required", "encrypt": "none"}
{"name": "Make", "type": "string", "mode": "required", "encrypt": "pseudonym"}
{"name": "Model", "type": "string", "mode": "required", "encrypt": "probabilistic_searchwords"}
{"name": "Description", "type": "string", "mode": "nullable", "encrypt": "searchwords"}
{"name": "Website", "type": "string", "mode": "nullable", "encrypt": "searchwords","searchwords_separator": "/"}
{"name": "Price", "type": "float", "mode": "required", "encrypt": "probabilistic"}
{"name": "Invoice_Price", "type": "integer", "mode": "required", "encrypt": "homomorphic"}
{"name": "Holdback_Percentage", "type": "float", "mode": "required", "encrypt":"homomorphic"}]
cars.csv:
1997,Ford,E350, "ac\xc4a\x87, abs, moon","www.ford.com",3000.00,2000,1.2
1999,Chevy,"Venture ""Extended Edition""","","www.cheverolet.com",4900.00,3800,2.3
1999,Chevy,"Venture ""Extended Edition, Very Large""","","www.chevrolet.com",5000.00,4300,1.9
1996,Jeep,Grand Cherokee,"MUST SELL! air, moon roof,loaded","www.chrysler.com/jeep/grand­cherokee",4799.00,3950,2.4
I believe the issue may be that you need to move the --master_key_filename argument before the load argument. If that doesn't work, can you send the output of adding --apilog=- as the first argument?
Also, there is an example script file of running ebq here:
https://code.google.com/p/bigquery-e2e/source/browse/#git%2Fsamples%2Fch13
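Separately, the schema as posted is not valid JSON: the objects in the array are not separated by commas, and that by itself would raise "No JSON object could be decoded". A quick sanity check before calling ebq:

import json

# Raises a decode error if cars.schema is not well-formed JSON,
# e.g. when the array elements are missing separating commas.
with open("cars.schema") as f:
    schema = json.load(f)
print("schema OK, %d fields" % len(schema))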