I have a CSV file exported from a legacy dBase DBF file. The data contains a few columns of number values with trailing hyphens, like '661-'. I am trying to import the CSV into MySQL using 'Import External Data' in SQLyog. The issue is that the columns whose values have hyphens are getting imported as decimals, resulting in '-661.0000'.
This is odd, as the column format in the CSV (viewed via MS Excel) is 'General', not 'Number', and I am trying to import these values into varchar fields. It seems the import is ignoring the settings.
Has anyone faced something like this, or have any suggestions on how I can get the data in as a string, not a decimal?
Thanks
ANSWER - sorry to be answering my own question, but I did solve it with some group input.
The file needs to be saved with all fields (that you want treated as strings) in quotes. MS Excel DOES NOT seem to have an option for this. Apache OpenOffice does. Open the file in AOO and save it as text.csv. From there you can edit the filter settings and set all cells to be in quotes. Problem solved. (Presumably this works because a CSV file carries no cell formats, so Excel's 'General' setting never reaches the importer; the quotes are the only type hint it gets.)
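If you'd rather bypass the GUI import entirely, a plain LOAD DATA statement lets you state the quoting explicitly. A minimal sketch, assuming a file legacy.csv with a header row and a table legacy_data with varchar columns (all names invented):

LOAD DATA LOCAL INFILE 'legacy.csv'
INTO TABLE legacy_data
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the header row

Because the target columns are varchar, values such as '661-' arrive as literal strings and no numeric coercion takes place.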
Related
So I've seen this question asked many times, but I have not found an answer to my issue. I'm using phpMyAdmin; I have 1 table with 2 columns, and I have a .csv with 2 columns. My csv does not have headers and my columns are separated by ";". I already changed it in phpMyAdmin to "Columns separated with ;", but I still get the same error. Can anyone help?
edit: I'm using the "Import" option of phpMyAdmin to import my csv
edit2: So I decided to export my table to see how the csv was generated. It exports like this:
"1", "1001"
"2", "1002"
Do you know why? Or how can I create a CSV file with the same format?
As it says in the manual:
When importing data into a table from a CSV file where the table has an 'auto_increment' field, make the 'auto_increment' value for each record in the CSV to be '0' (zero). This allows the 'auto_increment' field to populate correctly.
see https://docs.phpmyadmin.net/en/latest/import_export.html
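For example, with the two-column, semicolon-separated file above, rows like these let MySQL assign the auto_increment ids itself (illustrative values only):

0;1001
0;1002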
So, what worked for me was: I exported the table as a CSV and worked over that. After I was done, I imported the modified CSV and it worked. It must've been what nbk said about line termination, so I guess in a way that was the answer.
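For what it's worth, MySQL itself can also write a file in exactly that quoted format; a hedged sketch, with table, column, and path names made up:

SELECT id, code
FROM my_table
INTO OUTFILE '/tmp/export.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

The ENCLOSED BY '"' option is what wraps every value in double quotes, matching the "1","1001" layout above.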
I'm trying to import some data from a CSV into my database with phpMyAdmin.
Here's a row from the CSV:
20101,1,grams,Good,AU,0.9999,Caesar,2017-06-14,12:33:44,RP
The first number I have set as a unique bigint(16). Somehow, though, this gets imported as "101" instead of "20101", which causes a duplicate error because I already have a "101".
Why wouldn't the number fully read as "20101"?
I think I figured this out. When saving the CSV from Excel, I was using the "CSV (UTF-8)" option. I don't know why that would make a difference, but when I switched to saving it as a plain comma-delimited CSV, the file imported with no problems.
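The likely culprit is that Excel's "CSV (UTF-8)" option writes a byte-order mark (BOM, the bytes EF BB BF) at the start of the file, which attaches itself to the first value of the first row. If re-saving the file is not an option, the BOM can be stripped during the load; a hedged sketch using MySQL's LOAD DATA (phpMyAdmin also offers a "CSV using LOAD DATA" import format), with invented table and column names:

LOAD DATA LOCAL INFILE 'data.csv'
INTO TABLE readings
FIELDS TERMINATED BY ','
(@raw_id, qty, unit, grade, country, purity, name, obs_date, obs_time, initials)
-- strip a leading UTF-8 BOM from the first field, if present, then cast
SET id = CAST(REPLACE(@raw_id, UNHEX('EFBBBF'), '') AS UNSIGNED);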
I am trying to import data from an Excel CSV into MS Access. A column in the csv has values that are mostly like "F0000123"; a few values are like "E0000123". While importing this using TransferText into a Text column in Access, F0000123 is changed to 123, and E0000123 is imported as blank with a datatype conversion failure. If importing into a new blank table (no columns defined), F0000123 imports as $123, and again E0000123 is imported as blank with a datatype conversion failure. Please help: why do values starting with F have this issue?
Link the csv file as a table. Then create a query to read and convert (purify) the data.
Use this query as the source for further processing of the data, like appending it to other tables.
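A minimal sketch of such a purifying query, assuming the linked CSV table is named csv_link and the raw column is code_raw (both names invented; Access's Like operator accepts the [EF] character list):

SELECT IIf(code_raw Like "[EF]0000*", code_raw, Null) AS code_clean
FROM csv_link;

Rows that match the expected pattern come through unchanged as text; anything else surfaces as Null instead of failing the import.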
I am downloading CSV files which are comma-separated. The problem I'm having is that the commas are screwing up my import into a database table (SQL Server). For example, I have a column called HOTEL_NAME, but some of the names are like the following:
HOTEL_NAME
hilton
cambridge,the
The problem is that fields containing a comma in the hotel name shift into the adjacent column. I'm wondering if converting from CSV to a pipe-delimited format would work.
The problem I'm having is that I'm not sure how to get started. I've tried following the PowerShell documentation but get basic errors. I think this is because I'm new to PowerShell and not understanding something. Can someone please post a script for changing a comma-separated file to a pipe-delimited file?
Sorry if this is confusing; I'm finding the formatting on StackOverflow to be a bit crazy.
Taken from Dealing with commas in a CSV file
Use " to wrap data that contains a comma.
For example
Server000,"Microsoft(R) Windows(R) Server 2003, Enterprise Edition"
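If the data is quoted like that, no pipe conversion is needed on the SQL Server side: from SQL Server 2017 onward, BULK INSERT can parse quoted CSVs natively. A hedged sketch, with invented table and path names:

BULK INSERT dbo.hotels
FROM 'C:\data\hotels.csv'
WITH (
    FORMAT = 'CSV',    -- RFC 4180 parsing: commas inside quotes stay in the field
    FIELDQUOTE = '"',
    FIRSTROW = 2       -- skip the HOTEL_NAME header row
);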
So we found a 3.6GB CSV that we have uploaded to S3 and now want to import into Redshift, then do the querying and analysis from IPython.
Problem 1:
This comma-delimited file contains free-text values that themselves contain commas, and this interferes with the delimiting, so we can't upload it to Redshift.
Yet when we tried opening the sample dataset in Excel, Excel surprisingly put everything into the right columns.
Problem 2:
A column that is supposed to contain integers has some records containing letters to indicate some other scenario.
So the only way to get the import through is to declare this column as varchar. But then we can't do calculations on it later.
Problem 3:
The datetime data type requires the value to be in the format YYYY-MM-DD HH:MM:SS, but the CSV doesn't contain the SS part, and the database is rejecting the import.
We can’t manipulate the data on a local machine because it is too big, and we can’t upload onto the cloud for computing because it is not in the correct format.
The last resort would be to scale the instance running IPython all the way up so that we can read the big CSV directly from S3, but this approach doesn't make sense as a long-term solution.
Your suggestions?
Train: https://s3-ap-southeast-1.amazonaws.com/bucketbigdataclass/stack_overflow_train.csv (3.4GB)
Train Sample: https://s3-ap-southeast-1.amazonaws.com/bucketbigdataclass/stack_overflow_train-sample.csv (133MB)
For the first issue, try using a different delimiter or escape the embedded commas; see http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_preparing_data.html
For the second issue, if you want to extract only the numbers from the column after loading it as char, use regexp_replace or other functions, e.g. regexp_replace(column_name, '[^0-9]', '').
For the third issue, you can load the value into a VARCHAR field in a staging table and then use string functions to rebuild the timestamp when loading the final table, e.g. cast(left(column_name, 10) || ' ' || right(column_name, 5) || ':00' as timestamp).
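Putting those pieces together, a hedged end-to-end sketch (bucket path from the question; IAM role, table, and column names are all placeholders). Since Excel parses the sample correctly, the embedded commas are most likely already inside quoted fields, in which case COPY's CSV option handles them as-is:

-- 1. Load everything as text into a staging table
CREATE TABLE staging (
    id_raw  varchar(20),
    qty_raw varchar(20),
    ts_raw  varchar(20)   -- holds 'YYYY-MM-DD HH:MM'
);

COPY staging
FROM 's3://bucketbigdataclass/stack_overflow_train.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
CSV;  -- quoted fields survive intact, embedded commas and all

-- 2. Clean and cast on the way into the final table
INSERT INTO final_table (id, qty, ts)
SELECT cast(id_raw AS bigint),
       cast(regexp_replace(qty_raw, '[^0-9]', '') AS int),  -- drop stray letters; empty results still need handling
       cast(ts_raw || ':00' AS timestamp)                   -- append the missing seconds
FROM staging;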
For the first issue, you need to find a way to differentiate between the two kinds of commas - the delimiters and the commas inside the text. Once you have done that, replace the delimiters with a different character and specify that character as the delimiter in the COPY command for Redshift.
For the second issue, you need to first figure out whether this column needs to be available for numerical aggregations once loaded. If yes, you need to get the data cleaned up before loading. If no, you can load it directly as a char/varchar field. All your queries will still work, but you will not be able to do any aggregations (sum/avg and the like) on this field.
For problem 3, you can use the TEXT(date, "yyyy-mm-dd hh:mm:ss") function in Excel to do a mass replace for this field.
Let me know if this works out.