MySQL: getting specific lines from TEXT

OK, so what I wish to do is get specific lines from a TEXT column without loading all of its data into memory.
So let's say I have 100k lines of text stored in TEXT and I wish to get lines 9000-9100 from there.
I can do it with files, but is it possible with MySQL as well?
Or is it better to use a file for this?

To my knowledge it is not possible to address values in a column of type TEXT by line. (You can do it by byte with SUBSTRING.)
So do it with a file, or use another structure to store it in the database, e.g. save each line in a separate row along with its line number; then you can easily query the lines you want, as sketched below.
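A minimal sketch of that structure, assuming table and column names of my own choosing (document_lines, document_id, line_no, line_text are illustrative, not from the question):

-- One row per line; (document_id, line_no) identifies a line within a document.
CREATE TABLE document_lines (
    document_id INT NOT NULL,
    line_no     INT NOT NULL,
    line_text   TEXT,
    PRIMARY KEY (document_id, line_no)
);

-- Fetch lines 9000-9100 of one document; the primary key makes this a
-- range scan, so only the requested rows are read.
SELECT line_text
FROM document_lines
WHERE document_id = 1
  AND line_no BETWEEN 9000 AND 9100
ORDER BY line_no;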

Related

Exporting to CSV file in SSIS from query is mixing up rows and columns

I am trying to export data from a query to CSV in SSIS, but the resulting CSV file is mixing up rows and columns.
I am using | as the delimiting character. The problem seems to come when the column data is short, I think. Could it be a datatype size issue?
Update:
Here is an example of the text I am trying to export.
at MyGeneration.dOOdads.BusinessEntity.Save() at HCMIS.Desktop.Forms.WorkFlow.Receipt.HandleReceiveDocShortage(DataRowView dr, ReceiveDoc rec, Int32 receiveDocID) at HCMIS.Desktop.Forms.WorkFlow.Receipt.SaveReceive() at HCMIS.Desktop.Forms.WorkFlow.Receipt.btnSaveReceipt_Click(Object sender, EventArgs e)
but such long texts get put into different columns.
I have removed newlines, tabs and '|' from the data, as I am using '|' as the column delimiter.
I have also tried using different string types for the flat file connection.
Data length shouldn't be an issue. You'd get a truncation error in your Flat File Destination if the data were longer than your defined field size.
Usually when I've had things like this happen, it's because the source schema changed and I wasn't told about it. SSIS doesn't do well when it thinks there are 10 columns and there are really 11. Check to be sure you have the right number and order of columns in your Flat File Destination and Flat File connection. Make sure the column mappings are right.
Then take a look at your Flat File Connection in SSIS. Check the Columns page and verify the Row delimiter and the Column delimiter. Also, on the General page, check the Header row delimiter and the Text qualifier. Then make sure none of those delimiters or qualifiers are embedded in your data.

How to tell what type of newline is being used in a txt file?

I have a txt file which contains quoted, comma-delimited text, and I am trying to figure out what type of newline is being used.
The reason is that I am trying to import it into MySQL using LOAD DATA LOCAL INFILE, and obviously I need to tell it the correct LINES TERMINATED BY.
When I use either \n or \r\n it imports exactly half of the records, skipping a line each time.
But when I use \r it imports exactly double, giving me exactly as many all-NULL rows as there are records.
When I open the file in Notepad, there is no space between lines. However, if I open it in a browser, there is a blank line between each line, as though there is a paragraph break in there somewhere. Likewise, if I choose "Open with > Excel" it does not split the data into columns, and has a blank line between each row. The only way to open it properly in Excel is to use "Get external data > From text" and choose the comma delimiter.
I provide a couple of lines below, copied and pasted exactly, and obviously it would be great if someone could let me know the correct settings to use for importing. But it would be even better if there were a way for me to quickly tell what type of newline any particular file is using (there is also a blank line at the very end of the file, as with the other rows).
"Item No.","Description","Description 2","Customers Price","Home stock","Brand Name","Expected date for delivery","Item Group No.","Item Group Name","Item Product Link (Web)","Item Picture Link (Web)","EAN/UPC","Weight","UNSPSC code","Product type code","Warranty"
"/5PL0006","Drum Unit","DK-23","127.00","32","Kyocera","04/11/2013","800002","Drums","http://uk.product.com/product.aspx?id=%2f5PL0006","http://s.product.eu/products/2_PICTURE-TAKEN.JPG","5711045360824","0.30","44103109","","3M"
"/DK24","DK-24 Drum Unit FS-3750","","119.00","8","Dell","08/11/2013","800002","Drums","http://uk.product.com/product.aspx?id=%2fDK24","http://s.product.eu/products/2_PICTURE-TAKEN.JPG","5711045360718","0.20","44103109","","3M"

How can I quickly reformat a CSV file into SQL format in Vim?

I have a CSV file that I need to format (i.e., turn into) a SQL file for ingestion into MySQL. I am looking for a way to add the text delimiters (single quotes) to the text, but not to the numbers, booleans, etc. I am finding it difficult because some of the text that I need to enclose in single quotes has commas in it, making it difficult to key on the commas for search and replace. Here is an example line I am working with:
1239,1998-08-26,'Severe Storm(s)','Texas,Val Verde,"DEL RIO, PARKS",'No',25,"412,007.74"
This is a FEMA data file with 131246 lines that I got off of data.gov and am trying to get into a MySQL database. As you can see, I need to insert a single quote after Texas and before Val Verde, so I tried:
s/,/','/3
But that only replaced the first occurrence of the comma on the first three lines of the file. Once I get past that, I will need to find a way to deal with "DEL RIO, PARKS", as that contains a comma that I do not want to wrap in single quotes.
So, is there a "nice" way to manipulate this data to get it from plain CSV to a proper SQL format?
Thanks
CSV files are notoriously dicey to parse. Different programs export CSV in different ways, possibly including strangeness like embedding newlines within a quoted field or different ways of representing quotes within a quoted field. You're better off using a tool specifically suited to parsing CSV -- Perl, Python, Ruby and Java all have CSV parsing libraries, and there are command line programs such as csvtool or ffe.
If you use a scripting language's CSV library, you may also be able to leverage the language's SQL import as well. That's overkill for a one-off, but if you're importing a lot of data this way, or if you're transforming data, it may be worthwhile.
I think I would also want to do some troubleshooting to find out why the CSV import into MySQL failed.
I would take an approach like this:
:%s/,\("[^"]*"\|[^,"]*\)/,'\1'/g
:%s/^\("[^"]*"\|[^,"]*\)/'\1'/g
In words: after each comma, match either a double-quoted run of characters or an unquoted run of characters (the \| is Vim's alternation), and wrap whatever matched in single quotes.
The second command does the same for the first column of each row, anchoring at the start of the line (^) instead of at a comma.
Try the csv plugin. It allows you to convert the data into other formats, and its help includes an example of how to convert the data for importing into a database.
Just to bring this to a close, I ended up using @Eric Andres' idea, which was the MySQL LOAD DATA option:
LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE MYTABLE
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n';
The initial .csv file still took a little massaging, but not as much as if I had done it by hand.
When I commented that LOAD DATA had truncated my file, I was incorrect. I was treating the file as a typical .sql file and assumed the "ID" column I had added would auto-increment. That turned out not to be the case. I had to create a quick script that prepended an ID to the front of each line. After that, the LOAD DATA command worked for all lines in my file. In other words, all of the data has to be in place in the file before the load, or the load will not work.
Thanks again to all who replied, and to @Eric Andres for his idea, which I ultimately used.
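One note for data like the sample above, which has double-quoted fields with embedded commas: LOAD DATA can be told about the quoting directly, which spares some of the massaging (same placeholder table name as in the answer):

LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE MYTABLE
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';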

How do I remove DB entries based on text file in SQL Server 2008 R2?

I have a list of words in a text file, each word separated by a new line. I want to read all of the words, and then, for each word, look up the DB and remove the rows that contain that word. How do I do that? I am a newbie to DB programming, and I guess we don't have loops in SQL, right?
1 - Read all the words from the text file
2 - For each word from the text file
3 - Remove the entry from the DB, e.g. delete from TABLE where ITEMNAME like 'WORDFROMFILE'
Thanks
Here's the general idea:
Step 1: Import the text file into a table.
Step 2: Write a DELETE against the target table that uses an INNER JOIN to the imported keyword table, removing the rows whose text matches a keyword, as sketched below.
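A rough T-SQL sketch of those two steps; the staging table, file path, and dbo.MyTable are placeholders I'm assuming, while ITEMNAME comes from the question:

-- Step 1: import the word list (one word per line) into a staging table.
CREATE TABLE #words (word NVARCHAR(200));

BULK INSERT #words
FROM 'C:\path\to\words.txt'
WITH (ROWTERMINATOR = '\n');

-- Step 2: delete every row whose ITEMNAME contains one of the words.
DELETE t
FROM dbo.MyTable AS t
INNER JOIN #words AS w
    ON t.ITEMNAME LIKE '%' + w.word + '%';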
You could use this technique to read the text from the file. If you want to do more complicated stuff, I'd suggest doing it from the front end (e.g. C#/VB etc.) rather than the DB.

Loading large tables of students, but school only identified on first line

I'm loading large text files of high school students into MySQL, but the school itself is only identified on the first line of each text file, like so:
897781234Metropolitan High
340098 1001X 678 AS Reading 101KAS DOE KEITH A1 340089 A 7782...
Using SQL code, how can I generate a column of the school number (e.g., 897781234) in the first column of the receiving table so that the school will be identified with each row?
To load the text files, I'm using:
LOAD DATA INFILE "f:/school_files/school897781234.txt"
INTO TABLE my_table FIELDS TERMINATED BY ''
IGNORE 1 LINES;
Thanks!
Hmmm ... it looks like you're doing this under Windows. I prefer Unix/Linux for large text manipulation, but you should be able to use similar techniques under Windows (try installing Cygwin). PowerShell also has some useful capabilities, if you're familiar with that. With that in mind, here are some ideas for you:
Write a script that munges your data files to make them MySQL friendly, by creating a new file that contains everything but the first line, with the school information prepended to every line. Do your data load from the munged file.
(munge-schools.sh)
#!/bin/bash
# Prepend the school line (the file's first line) to every data line.
ifile=$1                      # input file; first line holds the school info
ofile=$2                      # output file to load into MySQL
school=$(head -1 ${ifile})    # capture the school line
# Skip line 1, then prefix each remaining line with the school info.
tail --lines=+2 ${ifile} | sed "s/^/${school}/" > ${ofile}
./munge-schools.sh school897781234.txt school897781234.munged
For each school, do the load as-is (skipping the first line), but load into a temporary table; then add a column for the school, defaulting to the school information, and copy from the temp table into your final table. A sketch follows.
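A sketch of that temp-table variant; the staging table and the raw_line/school_id column names are assumptions of mine, while the file path, school number, and my_table come from the question:

-- One-column staging table matching the FIELDS TERMINATED BY '' load.
CREATE TEMPORARY TABLE staging (raw_line TEXT);

LOAD DATA INFILE 'f:/school_files/school897781234.txt'
INTO TABLE staging FIELDS TERMINATED BY ''
IGNORE 1 LINES;

-- Add the school column; existing rows pick up the DEFAULT value.
ALTER TABLE staging
    ADD COLUMN school_id CHAR(9) NOT NULL DEFAULT '897781234';

-- Copy into the final table with the school number on every row.
INSERT INTO my_table (school_id, raw_line)
SELECT school_id, raw_line FROM staging;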
Given a choice, I will always go with doing text manipulation outside of the database to make the input files friendlier -- there are lots of text manipulation tools that will be much faster at reformatting your data than your database's bulk load tools.