Calculate sum from MySQL server or from application - mysql

I have a MySQL table with two columns: ID and count. It has an index on the ID field.
Now if I have to get the sum of all the counts between two IDs, I can write a query like:
Select SUM(count) from table where id between x and y
or I can run
select count from table where id between x and y
and then loop through the result and calculate the sum in my application code.
Which one is better, given that speed is the essential thing here? Would indexing the count column help in any way, or can I write a different SQL query?
I have around 10000 requests per second coming in and I am using a load balancer and 5 servers for this.

The second one is the correct one. There's no need to sum a count, as the count comes back as a single value. It only needs to be run once.
Unless you have a column named count, in which you want to sum all the values...
EDIT
Because you are saying you have a column named Count, you would use the first query:
Select SUM(count) from table where id between x and y

Use approach 1 as you would save on fetching data from MySQL and iterating over it.
The time taken by MySQL to execute either of your queries would be nearly the same but the second approach would require looping through the results and summing them; unnecessary overhead.
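A minimal sketch of the two approaches side by side. It uses Python's sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table name counts and column name cnt are made up for the example. Both approaches give the same total, but the second has to transfer every matching row first.

```python
import sqlite3

# SQLite stands in for MySQL here; "counts" and "cnt" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (id INTEGER PRIMARY KEY, cnt INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO counts (id, cnt) VALUES (?, ?)",
    [(i, i * 2) for i in range(1, 1001)],
)

low, high = 100, 200

# Approach 1: the server aggregates and sends back a single row.
(sum_in_db,) = conn.execute(
    "SELECT SUM(cnt) FROM counts WHERE id BETWEEN ? AND ?", (low, high)
).fetchone()

# Approach 2: every matching row is fetched, then the application adds them up.
rows = conn.execute(
    "SELECT cnt FROM counts WHERE id BETWEEN ? AND ?", (low, high)
).fetchall()
sum_in_app = sum(cnt for (cnt,) in rows)

print(sum_in_db, sum_in_app, len(rows))
```

On the index question: an index on count alone probably would not help, because the filter is on id.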

Related

Combine 2 MySQL queries into 1 query

I'm trying to combine these two queries into a single query but I'm not having much luck and I'm starting to think that it's not possible.
The value returned in transfer_dtl from the first query is passed into the FROM clause of the second query. This value is an actual table name that is returned by the first query. This is currently done in PHP with a loop, but I wanted to eliminate that step if possible.
SELECT id, client, loc, transfer_dtl, update_processed FROM tbl_master WHERE downtime_update='Y'
transfer_dtl
-------------
load_dtl_1
load_dtl_5
load_dtl_7
load_dtl_9
SELECT added_datetime FROM $transfer_dtl WHERE client='1' AND loc='2' AND scale='3' AND type='D' ORDER BY added_datetime DESC LIMIT 1
pseudo code...
if($added_datetime > $update_processed)
then add to a list of scales to update
Basically by combining these two queries together what I need is a distinct list of any scale that is marked to update.

Dynamically Pivoting Table Data | mySQL

Essentially I have a table in my database called Table1 with the following data:
The table has a ProductID that repeats because the values of AssignedColour, ColourFinding and ColourPower vary.
I would like to present all ProductID data in one single row, meaning if there is more than one AssignedColour, ColourFinding and ColourPower listed, it will contain a number at the end.
The final result of the SELECT query should look like the following:
The number of columns presented horizontally is based on the number of AssignedColour per ProductID
Is something like this possible to accomplish in a mySQL SELECT Query?
An SQL query cannot expand the number of columns of the result set depending on the data values it discovers during query execution. The columns in the SELECT-list must be fixed at the time the query is prepared, before it reads any data.
Also the column names cannot be changed during the query execution. They must be set at the time the query is prepared.
There's no way to do what you are describing in a single SQL query. Your options are:
Do two queries: one to enumerate the colors per product, and then use the result of the first to format a second query with the columns you want.
Do one query to fetch the data in rows as it exists in your table, then write code in your app to display it in rows however you think is best.
Either way, you have to write at least a bit of code in the client. You can't do this in one query.
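A rough sketch of the second option, pivoting in application code. The sample rows are invented, because the table data from the question is not shown; the column names come from the question and the numbering suffix follows its description.

```python
from collections import defaultdict

# Hypothetical rows: (ProductID, AssignedColour, ColourFinding, ColourPower).
rows = [
    (1, "Red", "F1", 10),
    (1, "Blue", "F2", 20),
    (2, "Green", "F3", 30),
]


def pivot(rows):
    """Collapse repeated ProductID rows into one record with numbered columns."""
    grouped = defaultdict(list)
    for product_id, colour, finding, power in rows:
        grouped[product_id].append((colour, finding, power))

    result = []
    for product_id, items in grouped.items():
        record = {"ProductID": product_id}
        # The number of columns depends on the data, which is why this
        # step happens in the client and not in a single SQL query.
        for n, (colour, finding, power) in enumerate(items, start=1):
            record[f"AssignedColour{n}"] = colour
            record[f"ColourFinding{n}"] = finding
            record[f"ColourPower{n}"] = power
        result.append(record)
    return result


pivoted = pivot(rows)
```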

Questions on how to randomly Query multiple rows from Mysql without using "ORDER BY RAND()"

I need to query MySQL with a condition and get five different random rows from the result.
Say, I have a table named 'user', and a field named 'cash'. I can compose a SQL like:
SELECT * FROM user where cash < 1000 order by RAND() LIMIT 5.
The result is good: totally random, unsorted and different from each other, exactly what I want.
But I read that the efficiency is bad when the table gets large, because MySQL creates a temporary table with all the result rows and assigns each one of them a random sorting index. The results are then sorted and returned.
I kept searching and found a solution like:
SELECT * FROM `user` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `user`)- (SELECT MIN(id) FROM `user`))+(SELECT MIN(id) FROM `user`)) AS id) AS t2 WHERE t1.id >= t2.id AND cash < 1000 ORDER BY t1.id LIMIT 5;
This method uses JOIN and MAX(id), and according to my testing it is more efficient than the first one. However, there is a problem: since I also need the condition "cash<1000", if the random id is so large that no row after it has cash<1000, no result is returned.
Does anyone have a good idea how to compose SQL that has the same effect as the first query but better efficiency?
Or should I just run a simple query in MySQL and let PHP randomly pick 5 different rows from the result?
Your help is appreciated.
To make the first query faster, SELECT only id. That keeps the temporary table small (it holds only IDs rather than every field of each row), so it may fit in memory (temporary tables with TEXT/BLOB columns are always created on disk, for example). Once you have the IDs, run another query: SELECT * FROM xy WHERE id IN (a,b,c,d,...). As you mentioned, this approach is not very efficient, but as a quick fix this modification will make it several times faster.
One of the best approaches seems to be getting the total number of rows, choosing random numbers and for each result run a new query SELECT * FROM xy WHERE abc LIMIT $random,1. It should be quite efficient for random 3-5, but not good if you want 100 random rows each time :)
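Here is a small sketch of that count-then-offset idea, run against sqlite3 in place of MySQL. The user table and the cash < 1000 condition come from the question; the sample data and the random_rows helper are made up. The ORDER BY id is my addition so that an offset refers to the same row across the separate queries.

```python
import random
import sqlite3

# SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, cash INTEGER)")
conn.executemany(
    "INSERT INTO user (id, cash) VALUES (?, ?)",
    [(i, (i * 37) % 2000) for i in range(1, 501)],
)


def random_rows(conn, k, rng=random):
    """Pick up to k distinct matching rows via random offsets."""
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM user WHERE cash < 1000"
    ).fetchone()
    # Distinct offsets in 0..total-1 guarantee distinct rows.
    offsets = rng.sample(range(total), min(k, total))
    picked = []
    for offset in offsets:
        row = conn.execute(
            "SELECT id, cash FROM user WHERE cash < 1000 "
            "ORDER BY id LIMIT 1 OFFSET ?",
            (offset,),
        ).fetchone()
        picked.append(row)
    return picked


sample = random_rows(conn, 5)
```

Each pick costs one query, which matches the remark that this suits a handful of rows better than a hundred.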
Also consider caching your results. Often you don't need different random rows on every page load; generating them once per minute is enough. If you generate the data via cron, for example, you can even live with a query that takes several seconds, since users will see the old data while the new data is being generated.
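The caching idea could look roughly like this in-process sketch. TimedCache is a hypothetical helper, not part of any library, and a real setup might use cron or an external cache as described above.

```python
import time


class TimedCache:
    """Keeps one loaded value and refreshes it after ttl_seconds."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # callable that runs the expensive query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._value = None
        self._expires_at = None    # None means nothing has been loaded yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Reload only when the cached value is missing or too old.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

Wrapping the random-row query in such a loader means most page loads never touch the database for it.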
Here are some of my bookmarks for this problem for reference:
http://jan.kneschke.de/projects/mysql/order-by-rand/
http://www.titov.net/2005/09/21/do-not-use-order-by-rand-or-how-to-get-random-rows-from-table/

Best way to combine multiple advanced mysql select queries

I have multiple select statements from different tables on the same database. I was using multiple, separate queries then loading to my array and sorting (again, after ordering in query).
I would like to combine into one statement to speed up results and make it easier to "load more" (see bottom).
Each query uses SELECT, LEFT JOIN, WHERE and ORDER BY commands which are not the same for each table.
I may not need order by in each statement, but I want the end result, ultimately, to be ordered by a field representing a time (not necessarily the same field name across all tables).
I would want to limit total query results to a number, in my case 100.
I then use a loop through results and for each row I test if OBJECTNAME_ID (ie; comment_id, event_id, upload_id) isset then LOAD_WHATEVER_OBJECT which takes the row and pushes data into an array.
I won't have to sort the array afterwards because it was loaded in order via mysql.
Later in the app, I will "load more" by skipping the first 100, 200 or whatever page*100 is and limit by 100 again with the same query.
The end result from the database would preferably look like this:
RESULT - selected fields from a table - field to sort on is greatest
RESULT - selected fields from a possibly different table - field to sort on is next greatest
RESULT - selected fields from a possibly different table - field to sort on is third greatest
etc, etc
I see a lot of simpler combined statements, but nothing quite like this.
Any help would be GREATLY appreciated.
The easiest way might be a UNION here ( http://dev.mysql.com/doc/refman/5.0/en/union.html ):
(SELECT a,b,c FROM t1)
UNION
(SELECT d AS a, e AS b, f AS c FROM t2)
ORDER BY a DESC
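To show how that could extend to the "load more" paging from the question, here is a sketch using sqlite3 in place of MySQL. The comments and events tables, their columns and the load_page helper are all invented. I used UNION ALL instead of UNION because the kind column already keeps rows distinct, so duplicate removal is not needed, and I dropped the parentheses because SQLite does not accept them around the individual SELECTs.

```python
import sqlite3

# SQLite stands in for MySQL; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, body TEXT, posted_at INTEGER);
CREATE TABLE events (event_id INTEGER PRIMARY KEY, title TEXT, starts_at INTEGER);
INSERT INTO comments VALUES (1, 'first', 10), (2, 'second', 30);
INSERT INTO events VALUES (1, 'launch', 20), (2, 'party', 40);
""")

# Each branch aliases its own time column to a shared name so that a
# single ORDER BY and LIMIT apply to the combined result.
FEED_SQL = """
SELECT 'comment' AS kind, comment_id AS item_id, body AS label, posted_at AS sort_time
FROM comments
UNION ALL
SELECT 'event', event_id, title, starts_at
FROM events
ORDER BY sort_time DESC
LIMIT ? OFFSET ?
"""


def load_page(conn, page, page_size=100):
    """Return one page of the merged feed, newest first."""
    return conn.execute(FEED_SQL, (page_size, page * page_size)).fetchall()


first = load_page(conn, 0, page_size=3)
second = load_page(conn, 1, page_size=3)
```

The kind column plays the role of the isset test on comment_id or event_id, telling the loop which object to build from each row.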

randomizing large dataset

I am trying to find a way to get a random selection from a large dataset.
We expect the set to grow to ~500K records, so it is important to find a way that keeps performing well while the set grows.
I tried a technique from: http://forums.mysql.com/read.php?24,163940,262235#msg-262235 but it's not exactly random and it doesn't play well with a LIMIT clause: you don't always get the number of records that you want.
So I thought, since the PK is auto_increment, I could just generate a list of random IDs and use an IN clause to select the rows I want. The problem with that approach is that sometimes I need a random set of records with a specific status, one that is found in at most 5% of the total set. To make that work I would first need to find out which IDs have that specific status, so that's not going to work either.
I am using mysql 5.1.46, MyISAM storage engine.
It might be important to know that the query to select the random rows is going to be run very often and the table it is selecting from is appended to frequently.
Any help would be greatly appreciated!
You could solve this with some denormalization:
Build a secondary table that contains the same pkeys and statuses as your data table
Add and populate a status group column which will be a kind of sub-pkey that you auto number yourself (1-based autoincrement relative to a single status)
Pkey Status StatusPkey
1 A 1
2 A 2
3 B 1
4 B 2
5 C 1
... C ...
n C m (where m = # of C statuses)
When you don't need to filter you can generate rand #s on the pkey as you mentioned above. When you do need to filter then generate rands against the StatusPkeys of the particular status you're interested in.
There are several ways to build this table. You could have a procedure that you run on an interval or you could do it live. The latter would be a performance hit though, since calculating the StatusPkey could get expensive.
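As an illustration only, the interval-rebuild variant might look like this, with sqlite3 standing in for MySQL. The data and status_index tables and both helper functions are hypothetical names; the sample rows mirror the small table above.

```python
import random
import sqlite3
from collections import defaultdict

# SQLite stands in for MySQL; table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pkey INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "B"), (5, "C")],
)
conn.execute("""
CREATE TABLE status_index (
    pkey INTEGER PRIMARY KEY,
    status TEXT,
    status_pkey INTEGER,
    UNIQUE (status, status_pkey)
)""")


def rebuild_status_index(conn):
    """Renumber rows 1..m within each status; returns m per status."""
    conn.execute("DELETE FROM status_index")
    counters = defaultdict(int)
    rows = conn.execute("SELECT pkey, status FROM data ORDER BY pkey").fetchall()
    for pkey, status in rows:
        counters[status] += 1
        conn.execute(
            "INSERT INTO status_index VALUES (?, ?, ?)",
            (pkey, status, counters[status]),
        )
    return dict(counters)


def random_pkey(conn, status, max_status_pkey, rng=random):
    """Pick a random row of one status through its dense status_pkey."""
    target = rng.randint(1, max_status_pkey)
    (pkey,) = conn.execute(
        "SELECT pkey FROM status_index WHERE status = ? AND status_pkey = ?",
        (status, target),
    ).fetchone()
    return pkey


maxima = rebuild_status_index(conn)
picked = random_pkey(conn, "B", maxima["B"])
```

Because status_pkey has no gaps within a status, every random number in range maps to exactly one row, so a lookup never comes back empty.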
Check out this article by Jan Kneschke... It does a great job at explaining the pros and cons of different approaches to this problem...
You can do this efficiently, but you have to do it in two queries.
First get a random offset scaled by the number of rows that match your 5% conditions:
SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM MyTable WHERE ...conditions...))
This returns an integer. Next, use the integer as an offset in a LIMIT expression:
SELECT * FROM MyTable WHERE ...conditions... LIMIT 1 OFFSET ?
Not every problem must be solved in a single SQL query.