How do I get data fast in MySQL? - mysql

I have a messages table with 10,000,000 or more records. How can I run a query and get the data fast? The simple query is select title,message from messages. Please tell me the proper query to retrieve the data quickly.

Filter your result by some column and place an index on that filter-column, e.g.,
SELECT title,message from messages
WHERE `date` > somedate
and the corresponding index
CREATE INDEX dateIndex ON messages ( `date` );
I don't think you want to fetch all 10^7 rows, do you?

On such a broad question the answer can't really be specific, so here's the broad answer:
set up proper index columns (title and message in your example) and use EXPLAIN on your query to see whether those indexes are used.
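A minimal sketch of that check (the `date` column and cutoff value are assumptions carried over from the first answer above):
EXPLAIN SELECT title, message
FROM messages
WHERE `date` > '2012-01-01';
-- the "key" column of the output shows whether dateIndex is used,
-- and "rows" estimates how many rows MySQL expects to examine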

Add an index on columns title and message.
CREATE INDEX title_message_index ON messages (title, message);
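Note that if title or message is a TEXT column, MySQL requires a key prefix length for the index; a hedged variant (the 100-character prefixes are illustrative):
CREATE INDEX title_message_index ON messages (title(100), message(100));
-- TEXT/BLOB columns cannot be indexed without a prefix length,
-- so only the first 100 characters of each column go into the index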

Using an index will greatly help optimize the SQL operation.

Related

Check before inserting a record into a database table

I have a problem with inserting data into a database table using a CRON script that runs every day.
I am using that script to insert orders into the order tables, which have Order ID and Order number as keys.
In the code I generate the order ID dynamically, and the article number is incremented by one for each order.
But with this solution I cannot add a check (IF NOT EXISTS ...) to my SQL query, so orders will be duplicated all the time... and at the moment I don't have an idea for a smart solution...
Could someone give me a suggestion for this problem?
Thank you.
I think you can use the INSERT ... ON DUPLICATE KEY UPDATE syntax.
Here you can find more information on how to use it.
Hope this helps.
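A minimal sketch of that syntax (the orders table and its columns are assumptions; it relies on a PRIMARY KEY or UNIQUE index on order_number):
INSERT INTO orders (order_number, article_number)
VALUES ('ORD-1001', 1)
ON DUPLICATE KEY UPDATE
    article_number = article_number + 1;
-- if ORD-1001 already exists, the UPDATE branch runs
-- instead of inserting a duplicate row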

Read-only after inner join (MySQL)

In MySQL Workbench (Mac OS), I wanted to join two tables so that I could update the second one. The code I put in was as follows:
select f.company, f.remarks, c.pic
from feedback f, customers c
where f.col = c.col
order by f.company;
The output is a read-only table, which prevented me from updating the table "customers" based on the f.remarks column.
Your advice/suggestion is appreciated. Thank you.
By hovering above the "Read Only" icon, I got the following message:
"Statement must be a SELECT from a single table with a primary key for its results to be editable".
After some research based on the advice given by fellow coders, here are some points to note:
In MySQL Workbench, one cannot edit the result of any JOIN, because it does not come from a single table;
when SELECTing from a single table, the primary key must be included for the result to be editable.
Thank you to everyone who contributed to the question. I appreciate it.
The problem is that, as you mentioned, the SELECT returns a "read only" result set.
So basically you can't use MySQL Workbench to update a field in the read-only result set that is returned when using a JOIN statement.
From what I understand, you want to update the table "customers" based on a query.
Maybe this post will help you:
update table based on subquery of table
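For example, a multi-table UPDATE does the join and the update in one statement; a sketch using the columns from the question (which column of customers you actually want to set is an assumption):
UPDATE customers c
JOIN feedback f ON f.col = c.col   -- same join condition as the SELECT
SET c.pic = f.remarks;             -- update customers from the matching feedback row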

Optimized SELECT query in MySQL

I have a very large number of rows in my table, table_1. Sometimes I just need to retrieve a particular row.
I assume that, when I use a SELECT query with a WHERE clause, it scans from the very first row until it finds a match.
Is there any way to make the query jump to a particular row and then start from that row?
Example:
Suppose there are 50,000,000 rows and the id I want to search for is 53750. What I need is for the search to start from row 50000, so that it saves the time of scanning the first 49999 rows.
I don't know the exact term for this, since I am not an SQL expert!
You need to create an index: http://dev.mysql.com/doc/refman/5.1/en/create-index.html
ALTER TABLE table_1 ADD UNIQUE INDEX (id);
The way I understand it, you want to select a row with id 53750. If you have a field named id you could do this:
SELECT * FROM table_1 WHERE id = 53750
Along with indexing the id field, that's the fastest way to do it, as far as I know.
ALTER TABLE table_1 ADD UNIQUE INDEX (<column>);
would be a great first step, if the index has not been generated automatically. You can also use:
EXPLAIN <your query here>
to see how the query is executed. Note that if you later change the WHERE clause and a column keeps turning up in it, it is a good idea to put an index on that column as well.
Create an index on the column you want to do the SELECT on:
CREATE INDEX index_1 ON table_1 (id);
Then, select the row just like you would before.
But also, please read up on databases, database design and optimization. Your question is full of false assumptions. Don't just copy and paste our answers verbatim. Get educated!
There are several things to know about optimizing SELECT queries, like range and WHERE clause optimization; the documentation is pretty informative about this issue. Read the section Optimizing SELECT Statements. Creating an index on the column you filter on helps performance a great deal too.
One possible solution: you can create a view and then query the view. Here are the details on creating a view and obtaining data from it:
http://www.w3schools.com/sql/sql_view.asp
Now you just split that huge number of rows across several views (i.e. rows 1-10000 in one view, rows 10001-20000 in another view)
and then query the relevant view.
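A sketch of that idea (assuming an id column; note that a plain view like this is just a stored query and does not by itself make the scan faster):
CREATE VIEW table_1_part1 AS
    SELECT * FROM table_1 WHERE id BETWEEN 1 AND 10000;
CREATE VIEW table_1_part2 AS
    SELECT * FROM table_1 WHERE id BETWEEN 10001 AND 20000;
-- query only the slice you care about
SELECT * FROM table_1_part1 WHERE id = 5375;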
I am pretty sure that any SQL database with a little self-respect does not loop from the first row to find the desired row. But I am also not sure how it does it, so I can't give an exact answer.
You could check what's in your WHERE clause and how the table is indexed. Do you have a proper primary key, using a numeric data type? Do you have indexes on the other columns that are used in your queries?
There is also a lot to consider when installing the database server, like where to put the data and log files, how much memory to give the server, and setting the growth. There's a lot you can do to tune your server.
You could try to split your table into partitions.
More about alter tables to add partitions
Selecting from a specific partition
In your case you could create a partition on id for every 50,000 rows, and when you want to skip the first 50,000 you just select from partition 2, as sketched below. How to do this is explained quite well in the MySQL documentation.
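A sketch of that approach (assuming id is the primary key; explicit partition selection needs MySQL 5.6 or later):
ALTER TABLE table_1 PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (50000),
    PARTITION p1 VALUES LESS THAN (100000),
    PARTITION p2 VALUES LESS THAN MAXVALUE
);
-- read only the partition that can contain id 53750
SELECT * FROM table_1 PARTITION (p1) WHERE id = 53750;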
You may try something as simple as this one:
query = "SELECT * FROM tblname LIMIT 50000, 10"
I just tried it with phpMyAdmin; the 50000 is the starting row (offset) to look up, and the 10 is the number of rows to return.
EDIT:
But if I were you I wouldn't use this one, because it simply skips records 1-49999 instead of searching them.

Index of a record in a sorted relation

Given I have a sorted relation (perhaps produced by
SELECT id
FROM model
WHERE type = 'a'
ORDER BY name
...), now I want to quickly get the index of a specific record, e.g. record id #15003.
How should I do it in MySQL? [I'm a Rails developer]
Assuming by index you mean row number. That is, if the results come back '1, 7, 9, ...' then the "index of 9" is 3; it is the third row.
You want what are variously called "Window functions" or "statistical functions" like row_number().
MySQL doesn't have them. Sorry.
However, though I'm not a Rails developer, I have to assume you can get the results into an array, search the array, and return the index number?
EDIT: Based on your comment on Brad's answer, if you are doing this for the sake of paginating results, then look at LIMIT and OFFSET. http://dev.mysql.com/doc/refman/5.0/en/select.html
Are you looking for something such as this: http://www.xaprb.com/blog/2006/12/02/how-to-number-rows-in-mysql/
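That link describes the user-variable trick for numbering rows; a minimal sketch of it against the query from the question (the classic workaround for MySQL versions without ROW_NUMBER()):
SET @row := 0;
SELECT row_num
FROM (
    SELECT @row := @row + 1 AS row_num, id
    FROM model
    WHERE type = 'a'
    ORDER BY name
) numbered
WHERE id = 15003;  -- row_num is the position of record 15003 in the sorted result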
I believe this is going to require you to select all results up to your desired result. So, if your record is at index 15003, you must select records 1-15003. Obviously this is not efficient or scalable... Perhaps there is a better way to approach the problem? Why do you need the index anyway? Why not place a more restrictive WHERE condition on the original SQL?

Generate numeric id from text/url for fast "SELECT"

I have the following problem:
I have a feed capturer that captures news from different sources every half an hour.
I only insert entries whose URLs are not already in the database (the URL is used to check whether the record is already in the database).
Even with that, I get some repeated entries, because some sites report the same news (which usually come from a news source like Reuters). I could look for these repeated entries during insertion, but I think this would slow the insertion down even more.
Instead, I can find these repeated entries later by title. But I think searching by title is slow. So my idea is to generate a numeric field from the title and then search by this number for repeated titles.
What kind of encoding could I use (I thought of something like the reverse of base64) to encode the titles?
I'm supposing that searching for repeated numbers is a lot faster than searching for repeated strings. Is that true or not?
Do you suggest a better solution for this problem?
Note that I don't mind having the repeated entries in the database; I just don't want to show them to the user. Like Google, which filters out repeated results but shows them if you want.
I hope I explained it well. Thanks in advance.
Store the MD5 hashes of the URL and the title and build a UNIQUE index on them:
CREATE UNIQUE INDEX ux_mytable_title_url ON mytable (title_hash, url_hash);
INSERT
INTO mytable (url, title, url_hash, title_hash)
VALUES ('url', 'title', MD5('url'), MD5('title'));
To select like Google does (one result per title), use this query:
SELECT *
FROM (
    SELECT DISTINCT title_hash
    FROM mytable
) md
JOIN mytable mo
ON mo.title_hash = md.title_hash
AND mo.url_hash =
(
    SELECT url_hash
    FROM mytable mi
    WHERE mi.title_hash = md.title_hash
    ORDER BY mi.title_hash, mi.url_hash
    LIMIT 1
);
So you can use a new table containing only the encoded keys based on title and url; you then have to add a key on it to speed up the search. But I don't think there is an efficient algorithm to transform strings into numbers.
For the hash, use
SELECT MD5(CONCAT('title', 'url'));
and before every insertion, test whether the encoded concatenation of title and url already exists in this table.
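A sketch of that pre-insert check (the table name and the combined_hash column are assumptions; combined_hash would store MD5(CONCAT(title, url))):
SELECT EXISTS (
    SELECT 1
    FROM mytable
    WHERE combined_hash = MD5(CONCAT('title', 'url'))
) AS already_present;  -- 1 if this title/url pair is already stored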
@Quassnoi can explain this better than I can, but I think there is no visible performance difference between using a VARCHAR/CHAR or an INT in an index that is later used for GROUPing or some other way of finding duplicates. So you could use the solution he proposed, but with a normal INDEX instead of a UNIQUE index, keeping the duplicates in the database and filtering them out only when showing results to the user.
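A sketch of that display-time filtering (assuming an id column; every captured row stays stored, and a plain index on title_hash keeps the grouping fast):
SELECT title_hash, MIN(id) AS first_id, COUNT(*) AS copies
FROM mytable
GROUP BY title_hash;  -- one row per distinct title, duplicates remain in the table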