I've been given a mySQL database from a custom-coded CMS and I need to get its data into a CSV file for importing into Excel for further futzing.
The problem is that the data in the database has a lot of HTML code in it (<p class="foo"> and that type of thing), so exporting as a CSV gets screwed up as some of the text has commas and other control characters in it.
Looked at all the export options via phpMyAdmin but couldn't really find anything that would work.
How can I get this into Excel?
Try tu use MySQL Workbench and there's excel xml export format, it should work also with html inside.
Actually, by searching here I found the answer tho had to take a bit of a different approach.
I used \t as the separator (tab character), Columns enclosed with ~, Columns escaped by \, put column names in the first row.
I had to use LibreOffice to import the sheet as Excel didn't have the flexibility needed. Got it working now!
Related
I am downloading CSV files which are comma-separated. The problem i'm having is that the commas are screwing-up my import into a database table (SQL Server). For example, I have a header row called hotel_name, but some of the names are like the following:
HOTEL_NAME
hilton
cambridge,the
The problem is that fields containing a comma in the hotel name will move to the adjacent column, like this I'm wondering if converting from CSV to a pipe-delimited format will work.
The problem i'm having is that i'm not sure how to get started. I've tried following the Powershell documentation but get basic errors. I think this is because i'm new to Powershell and not understanding something. Can someone please post a script of how to change the comma-separated file to a pipe-delimited file?
Sorry if this is confusing, i'm finding the formatting on StackOverflow to be a bit crazy.
Taken from Dealing with commas in a CSV file
Use " to wrap data that contains a comma.
For example
Server000,"Microsoft(R) Windows(R) Server 2003, Enterprise Edition"
(SQL Server 2008)
So here's my task ..
I need to export query results to file, and then import that file using SSIS to another DB.
Specific to the task, the data contains every awkward unicode character you can think of, so delimiting with commas, pipes etc is out of the question.
Here are the options SSMS gives me for export format:
Column Aligned
Comma/Tab/Space delimited
Custom delimiter
And here are the options SSIS gives me for a flat file data source:
Delimited (custom)
Fixed Width
Ragged Right
So given that a delimiter character is out of the question ... I cannot see another method that both SSMS & SSIS agree on.
Such as fixed width ?
Seems strange that the 2 closely related MS products have such different options.
Or have I missed something here ?
Any advice appreciated !!
It seems you need to try out different combination of options while creating delimited flat file(for your exported query result).
Try setting Code page to UTF-8 with and without Unicode. Also use Text qualifier as " or any of your choice which you thought might work. Also try using different option for column delimiter.
Once you are able to create delimited file then you have to apply same setting on file while importing to another DB.
I've been trying to export data from a Virtuemart installation into an excel file, so that it can be easily imported into Magento. The problem I'm having is that any fields containing HTML are causing line breaks and breaking the formatting of the file.
I've tried using semicolon as the delimiter as well as tab, but that didn't seem to address the issue because the odd line breaks were still there.
Is removing the line breaks and praying for it to work the only way around this?
Thanks!
It's not clear whether the problem comes from unescaped commas or newlines in your CSV file, but either way there should be a means to properly escape them so they don't affect your import.
I'm also not quite clear what programs you're using in what ways; you've tagged this as phpMyAdmin and in the title ask about an export from phpMyAdmin, but reference Virtuemart and Magento in the post, so I'm guessing you're using phpMyAdmin to do the import/export of the database used by those other ecommerce programs.
Can I strongly suggest using the SQL file type instead?
Within phpMyAdmin, you can select custom values for "Lines terminated with" on both import and export for CSV files. Perhaps you can leverage that to make, for instance, ยง your line termination value. Incidentally, my understanding is that as long as each field is properly escaped ("Columns enclosed with" and/or "Columns escaped with"), an extra newline or comma in your content shouldn't matter to your import/export. Open up the exported file in a text editor and look at a few of the entries to make sure they're properly escaped and perhaps post a few lines that fail as an example here (obscuring any sensitive information, of course).
I have a CSV file that I need to format (i.e., turn into) a SQL file for ingestion into MySQL. I am looking for a way to add the text delimiters (single quote) to the text, but not to the numbers, booleans, etc. I am finding it difficult because some of the text that I need to enclose in single quotes have commas themselves, making it difficult to key in to the commas for search and replace. Here is an example line I am working with:
1239,1998-08-26,'Severe Storm(s)','Texas,Val Verde,"DEL RIO, PARKS",'No',25,"412,007.74"
This is FEMA data file, with 131246 lines, I got off of data.gov that I am trying to get into a MySQL database. As you can see, I need to insert a single quote after Texas and before Val Verde, so I tried:
s/,/','/3
But that only replaced the first occurrence of the comma on the first three lines of the file. Once I get past that, I will need to find a way to deal with "DEL RIO, PARKS", as that has a comma that I do not want to place a single quote around.
So, is there a "nice" way to manipulate this data to get it from plain CSV to a proper SQL format?
Thanks
CSV files are notoriously dicey to parse. Different programs export CSV in different ways, possibly including strangeness like embedding new lines within a quoted field or different ways of representing quotes within a quoted field. You're better off using a tool specifically suited to parsing CSV -- perl, python, ruby and java all have CSV parsing libraries, or there are command line programs such as csvtool or ffe.
If you use a scripting language's CSV library, you may also be able to leverage the language's SQL import as well. That's overkill for a one-off, but if you're importing a lot of data this way, or if you're transforming data, it may be worthwhile.
I think that I would also want to do some troubleshooting to find out why the CSV import into MYSql failed.
I would take an approach like this:
:%s/,\("[^"]*"\|[^,"]*\)/,'\1'/g
:%s/^\("[^"]*"\|[^,"]*\)/'\1'/g
In words, look for a double quoted set of characters or , \|, a non-double quoted set of characters beginning with a comma and replace the set of characters in a single quotation.
Next, for the first column in a row, look for a double quoted set of characters or , \|, a non-double quoted set of characters beginning with a comma and replace the set of characters in a single quotation.
Try the csv plugin. It allows to convert the data into other formats. The help includes an example, how to convert the data for importing it into a database
Just to bring this to a close, I ended up using #Eric Andres idea, which was the MySQL load data option:
LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE MYTABLE FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';
The initial .csv file still took a little massaging, but not as much as I were to do it by hand.
When I commented that the LOAD DATA had truncated my file, I was incorrect. I was treating the file as a typical .sql file and assumed the "ID" column I had added would auto-increment. This turned out to not be the case. I had to create a quick script that prepended an ID to the front of each line. After that, the LOAD DATA command worked for all lines in my file. In other words, all data has to be in place within the file to load before the load, or the load will not work.
Thanks again to all who replied, and #Eric Andres for his idea, which I ultimately used.
I have 12 excel files, each one with lots of data organized in 2 fields (columns): id and text.
Each excel file uses a diferent language for the text field: spanish, italian, french, english, german, arabic, japanese, rusian, korean, chinese, japanese and portuguese.
The id field is a combination of letters and numbers.
I need to import every excel into a different MySQL table, so one table per language.
I'm trying to do it the following way:
- Save the excel as a CSV file
- Import that CSV in phpMyAdmin
The problem is that I'm getting all sorts of problems and I can't get to import them properly, probably because of codification issues.
For example, with the Arabic one, I set everything to UTF-8 (the database table field and the CSV file), but when I do the import, I get weird characters instead of the normal arabic ones (if I manually copy them, they show fine).
Other problems I'm getting are that some texts have commas, and since the CSV file uses also commas to separate fields, in texts that are imported are truncated whenever there's a comma.
Other problems are that, when saving as CSV, the characters get messed up (like the chinese one), and I can't find an option to tell excel what encoding I want to use in the CSV file.
Is there any "protocol" or "rule" that I can follow to make sure that I do it the right way? Something that works for each different language? I'm trying to pay attention to the character encoding, but even with that I still get weird stuff.
Maybe I should try a different method instead of CSV files?
Any advice would be much appreciated.
OK, how do I solved all my issues? FORGET ABOUT EXCEL!!!
I uploaded the excels to Googledocs spreadsheets, downloaded them as CSV, and all the characters were perfect.
Then I just imported into their corresponding fields of the tables, using a "utf_general_ci" collation, and now everything is uploaded perfectly in the database.
One standard thing to do in a CSV is to enclose fields containing commas with double quotes. So
ABC, johnny cant't come out, can he?, newfield
becomes
ABC, "johnny cant't come out, can he?", newfield
I believe Excel does this if you choose to save as file type CSV. A problem you'll have is that CSV is ANSI-only. I think you need to use the "Unicode Text" save-as option and live with the tab delimiters or convert them to commas. The Unicode text option also quotes comma-containing values. (checked using Excel 2007)
EDIT: Add specific directions
In Excel 2007 (the specifics may be different for other versions of Excel)
Choose "Save As"
In the "Save as type:" field, select "Unicode Text"
You'll get a Unicode file. UCS-2 Little Endian, specifically.