I use Stata 12.
I want to add some country code identifiers from file df_all_cities.csv onto my working data.
However, this line of code:
merge 1:1 city country using "df_all_cities.csv", nogen keep(1 3)
gives me the error:
. run "/var/folders/jg/k6r503pd64bf15kcf394w5mr0000gn/T//SD44694.000000"
file df_all_cities.csv not Stata format
r(610);
This is an attempted fix for my previous problem: the original .dta file would not open in this version of Stata, so I used R to convert it to .csv, but that doesn't work either. I assume it's because the "using" part of the command doesn't work with csv files, but how should I write it instead?
Your intuition is right. The merge command cannot read a .csv file directly. (using is not itself a command here; it is a syntax tag indicating that a file path follows.)
You need to read the .csv file with the command insheet. You can use it like this.
* Preserve saves a snapshot of your data which is brought back at "restore"
preserve
* Read the csv file. clear can safely be used as data is preserved
insheet using "df_all_cities.csv", clear
* Create a tempfile where the data can be saved in .dta format
tempfile country_codes
save `country_codes'
* Bring back into working memory the snapshot saved at "preserve"
restore
* Merge your country codes from the tempfile to the data now back in working memory
merge 1:1 city country using `country_codes', nogen keep(1 3)
Note that insheet also takes using, but unlike merge this command accepts .csv files.
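For intuition about what the merge options do: keep(1 3) keeps master-only (result 1) and matched (result 3) observations, which is a left join in other tools, and nogen drops the _merge indicator. A minimal Python sketch of the same semantics, using made-up data (the city/country rows and the ccode column are illustrative, not from the question):

```python
import csv
import io

# Hypothetical master data (the data already in memory in Stata).
master = [
    {"city": "Paris", "country": "FR", "pop": "2.1m"},
    {"city": "Lagos", "country": "NG", "pop": "15m"},
]

# Hypothetical contents of df_all_cities.csv (the "using" data).
using_csv = "city,country,ccode\nParis,FR,250\nBerlin,DE,276\n"

# Index the using data by the merge key; 1:1 assumes the key is unique.
codes = {(r["city"], r["country"]): r["ccode"]
         for r in csv.DictReader(io.StringIO(using_csv))}

# keep(1 3): keep master-only and matched rows, i.e. a left join.
# Unmatched master rows get a missing (None) country code.
merged = [dict(row, ccode=codes.get((row["city"], row["country"])))
          for row in master]

print(merged)
```

Berlin appears only in the using data (merge result 2), so it is dropped; Lagos appears only in the master data, so it is kept with a missing code.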
I have a 3 column csv file. The 2nd column contains numbers with a leading zero. For example:
044934343
I need to convert a .csv file into a .xls and to do that I'm using the command line tool called 'unoconv'.
It's converting as expected, however when I load up the .xls in Excel, instead of showing '044934343' the cell shows '44934343' (the leading 0 has been removed).
I have tried surrounding the number in the .csv file with a single quote and a double quote however the leading 0 is still removed after conversion.
Is there a way to tell unoconv that a particular column should be of a TEXT type? I've tried to read the man page of unoconv, however the options are a little confusing.
Any help would be greatly appreciated.
Perhaps I'm late to the scene, but in case someone is looking for an answer to a similar question, this is how to do it:
unoconv -i FilterOptions=44,34,76,1,1/1/2/2/3/1 --format xls <csvFileName>
The key here is the "1/1/2/2/3/1" part, which tells unoconv that the second column's type should be "Text", leaving the first and third as "Standard".
You can find more info here: https://wiki.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options#Token_7.2C_csv_import
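Per that documentation, the token is a sequence of column/format pairs joined by "/": format code 1 means Standard and 2 means Text. A small hypothetical helper (not part of unoconv) that builds the token for any column layout:

```python
# Format codes from the OpenOffice csv-import filter options:
# 1 = Standard, 2 = Text (other codes exist for date formats).
FORMATS = {"standard": "1", "text": "2"}

def column_tokens(col_types):
    """Build the column-format token: one 'column/format' pair per
    column (1-based), all joined by '/'."""
    pairs = [f"{i}/{FORMATS[t]}" for i, t in enumerate(col_types, start=1)]
    return "/".join(pairs)

# Three columns, with the second imported as Text:
print(column_tokens(["standard", "text", "standard"]))  # prints 1/1/2/2/3/1
```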
Is there an efficient command-line tool for prepending lines to a file inside a ZIP archive?
I have several large ZIP files containing CSV files missing their header, and I need to insert the header line. It's easy enough to write a script to extract them, prepend the header, and then re-compress, but the files are so large, it takes about 15 minutes to extract each one. Is there some tool that can edit the ZIP in-place without extracting?
Short answer: no.
A ZIP file contains 1 to N file entries, and each entry is an unsplittable unit: to change anything inside an entry, you need to process that entry completely (i.e. decompress it).
The only fast operation you can do is adding a new file to your archive. It creates a new entry and appends it to the file, but that is probably not what you need.
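The rewrite itself can at least avoid extracting anything to disk by streaming each entry straight into a new archive. A minimal sketch with Python's stdlib zipfile (the archive and member names are illustrative):

```python
import shutil
import zipfile

def prepend_header(src_zip, dst_zip, member, header_line):
    """Copy every entry of src_zip into dst_zip, prepending header_line
    to the named CSV member as it streams through. The whole archive is
    still rewritten once, but no entry is extracted to disk."""
    with zipfile.ZipFile(src_zip) as zin, \
         zipfile.ZipFile(dst_zip, "w") as zout:
        for info in zin.infolist():
            with zin.open(info) as f:
                if info.filename == member:
                    zi = zipfile.ZipInfo(info.filename)
                    zi.compress_type = zipfile.ZIP_DEFLATED
                    with zout.open(zi, "w") as out:
                        out.write(header_line.encode() + b"\n")
                        shutil.copyfileobj(f, out)  # stream the original rows
                else:
                    zout.writestr(info, f.read())
```

Writing to a member with ZipFile.open(..., "w") requires Python 3.6+; the time cost is still dominated by decompressing and recompressing the large entry.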
I am having trouble converting an xlsx file to csv format. Somehow it does not copy the contents of the columns that contain text.
I tried: python xlsx2csv-0.20/xlsx2csv.py -s 2 -d ';' 'testin.xlsx' 'testout.csv'
The result should look like:
"www.vistaheads.com";"http://www.vistaheads.com/forums/microsoft-public-windows-vista-general/200274-vista-mbr-vs-xp-mbr-4.html";"YahooBossAPIv2";;"eng";"ie";;9/8/2010;TRUE;FALSE;;0;-8.2666666667;0;0;0;0
"www.drpletsch.com";"http://www.drpletsch.com/elos-acne-treatment.html";"Oxyme.Searchv3.0.0";;"eng";;;7/31/2012;TRUE;FALSE;;;;0;0;0;0
"www.charterhouse-aquatics.co.uk";"http://www.charterhouse-aquatics.co.uk/catalog/elos-systemmini-marine-litre-aquarium-black-p-7022.html";"YahooBossAPIv2";;"eng";"us";;7/11/2012;TRUE;FALSE;;1;5.6666666667;0;0;0;0
"www.proz.com";"http://www.proz.com/kudoz/latin_to_english/religion/4794760-concio_melos_tinnulo.html";"YahooBossAPIv2";;"eng";"in";;5/7/2012;TRUE;FALSE;;1;3;0;0;0;0
"schoee.blogspot.co.uk";"http://schoee.blogspot.co.uk/2010/08/review-body-shop-vitamin-c-facial.html";"YahooBossAPIv2";;"eng";;;8/1/2010;TRUE;FALSE;;1;1;0;0;0;0
But instead I get:
;;;;;;;09-08-10;TRUE;FALSE;;0.0;-8.266666666666666;0.0;0.0;0.0;0.0;
;;;;;;;07-31-12;TRUE;FALSE;;;;0.0;0.0;0.0;0.0;
;;;;;;;07-11-12;TRUE;FALSE;;1.0;5.666666666666667;0.0;0.0;0.0;0.0;
;;;;;;;05-07-12;TRUE;FALSE;;1.0;3.0;0.0;0.0;0.0;0.0;
;;;;;;;08-01-10;TRUE;FALSE;;1.0;1.0;0.0;0.0;0.0;0.0;
;;;;;;;09-08-10;TRUE;FALSE;;0.0;0.033333333333333354;0.0;0.0;0.0;0.0;
;;;;;;;07-03-12;TRUE;FALSE;;1.0;2.0;0.0;0.0;0.0;0.0;
;;;;;;;10-18-11;TRUE;FALSE;;1.0;4.666666666666667;0.0;0.0;0.0;0.0;
I also tried using ssconvert, but I get a similar result:
ssconvert -S 'testin.xlsx' testout2.csv
Here too, the textual content has somehow vanished:
2010/09/08,TRUE,FALSE,,0,-8.26666666666667,0,0,0,0
"2012/07/31 09:58:39.823",TRUE,FALSE,,,,0,0,0,0
"2012/07/11 13:35:09.220",TRUE,FALSE,,1,5.66666666666667,0,0,0,0
2012/05/07,TRUE,FALSE,,1,3,0,0,0,0
2010/08/01,TRUE,FALSE,,1,1,0,0,0,0
2010/09/08,TRUE,FALSE,,0,0.03333333333333,0,0,0,0
"2012/07/03 22:24:03.467",TRUE,FALSE,,1,2,0,0,0,0
2011/10/18,TRUE,FALSE,,1,4.66666666666667,0,0,0,0
"2012/07/22 02:10:58.313",TRUE,FALSE,,1,2,0,0,0,0
"2012/08/02 17:01:39.637",TRUE,FALSE,,1,1,0,0,0,0
2010/06/05,TRUE,FALSE,,1,4,0,0,0,0
"2012/07/25 16:11:47.843",TRUE,FALSE,,1,2,0,0,0,0
2012/09/26,TRUE,TRUE,1,,,1,0,0,1
2012/04/29,TRUE,TRUE,2,,,8,3,1,4
2012/07/22,TRUE,FALSE,,0,0.03333333333333,0,0,0,0
2012/05/01,TRUE,FALSE,,1,14,0,0,0,0
"2012/08/07 06:17:39.647",TRUE,FALSE,,1,1,0,0,0,0
"2012/07/18 15:15:19.283",TRUE,FALSE,,1,3,0,0,0,0
2012/07/27,TRUE,FALSE,,1,0.33333333333333,0,0,0,0
2010/09/08,TRUE,FALSE,,1,0.33333333333333,0,0,0,0
"2012/07/21 18:10:57.700",TRUE,FALSE,,1,0.33333333333333,0,0,0,0
The Excel file looks fine to me. Any ideas what could be going wrong?
The Excel file is generated using Apache POI, maybe that's a clue?
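It may be. One plausible explanation (an assumption, not confirmed by the question): Apache POI can store cell text as inline strings (t="inlineStr") rather than in the shared-strings table, and a converter that only reads the shared-strings table would drop exactly the text cells while numbers, dates, and booleans survive, which matches the output above. A stdlib sketch that checks how string cells are stored in a worksheet's XML, using a tiny hand-made fragment with the standard SpreadsheetML namespace:

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

def string_cell_kinds(sheet_xml):
    """Count cells stored as inline strings (t="inlineStr") versus
    shared-string references (t="s") in a worksheet's XML."""
    kinds = {"inlineStr": 0, "s": 0}
    for c in ET.fromstring(sheet_xml).iter("{%s}c" % NS):
        t = c.get("t")
        if t in kinds:
            kinds[t] += 1
    return kinds

# Illustrative fragment: one inline-string cell, one shared-string cell.
sample = (
    '<worksheet xmlns="%s"><sheetData><row r="1">'
    '<c r="A1" t="inlineStr"><is><t>www.vistaheads.com</t></is></c>'
    '<c r="B1" t="s"><v>0</v></c>'
    '</row></sheetData></worksheet>' % NS
)
print(string_cell_kinds(sample))  # prints {'inlineStr': 1, 's': 1}
```

To inspect a real file, the same function could be run on xl/worksheets/sheet2.xml extracted from the xlsx (an xlsx is a ZIP archive); if the text cells show up as inlineStr, trying a newer xlsx2csv release or re-saving the file once in a spreadsheet application would be worth a shot.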