At the end of this process I need to have a maximum of 15 records for each type in a table.
My (hypothetical) table "stickorder" has 3 columns: StickColor, OrderNumber, PrimeryKey. (OrderNumber and PrimeryKey are unique.)
I can only handle 15 orders for each stick color, so I need to delete all the extra orders. (They will be processed another day and are in a master table, so I don't need them in this table.)
I have tried some similar solutions on this site but nothing seems to work. This is the closest:
INSERT INTO stickorder2
(select posts_ordered.*
 from (
   select
     stickorder.*,
     @row:=if(@last_order=stickorder.OrderNumber, @row+1, 1) as row,
     @last_orders:=stickorder.OrderNumber
   from
     stickorder inner join
     (select OrderNumber from
       (select distinct OrderNumber
        from stickorder
        order by OrderNumber) limit_orders
     ) limit_orders
     on stickorder.OrderNumber = limit_orders.OrderNumber,
     (select @last_order:=0, @row:=0) r
 ) posts_ordered
 where row<=15);
When using INSERT, you should always list the columns. Alternatively, you might really want CREATE TABLE ... AS SELECT.
There are also other issues with your query. For instance, you say you want a limit on the number of rows for each color, and yet your query has no reference to StickColor. I think you want something more along these lines:
INSERT INTO stickorder2(col1, . . . col2)
    select so.StickColor, so.OrderNumber, so.PrimeryKey
    from (select so.*,
                 -- count rows within the current color; restart at 1 when the color changes
                 @rn := if(@lastcolor = so.StickColor, @rn + 1,
                           if(@lastcolor := so.StickColor, 1, 1)
                          ) as rn
          from stickorder so cross join
               (select @lastcolor := '', @rn := 0) vars
          order by so.StickColor
         ) so
    where rn <= 15;
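For comparison, the same per-color cap can be written without user variables by counting, for each row, how many rows of the same color have an equal or lower OrderNumber. The sketch below is only an illustration under assumptions: it runs against an in-memory SQLite database from Python rather than MySQL, the `cap_per_color` helper and the sample data are made up, and it assumes that keeping the lowest order numbers per color is acceptable.

```python
import sqlite3


def cap_per_color(conn, limit):
    """Return (StickColor, OrderNumber) rows, at most `limit` per color."""
    sql = """
        SELECT s.StickColor, s.OrderNumber
        FROM stickorder s
        WHERE (SELECT COUNT(*)
               FROM stickorder x
               WHERE x.StickColor = s.StickColor
                 AND x.OrderNumber <= s.OrderNumber) <= ?
        ORDER BY s.StickColor, s.OrderNumber
    """
    return conn.execute(sql, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stickorder ("
    "StickColor TEXT, OrderNumber INTEGER UNIQUE, PrimeryKey INTEGER PRIMARY KEY)"
)
# Hypothetical data: 20 red orders (numbers 1..20) and 5 blue orders (21..25).
sample = [("red", n) for n in range(1, 21)] + [("blue", n) for n in range(21, 26)]
conn.executemany(
    "INSERT INTO stickorder (StickColor, OrderNumber) VALUES (?, ?)", sample
)

kept = cap_per_color(conn, 15)
```

The correlated count is easy to read but does more work per row than the variable-based query, so it is better suited to small tables or to checking the result of the faster version.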
I have two tables, Races and RacesTimes. I want to select everything from Races, and from RacesTimes only Finisher and Time, taken from the row with the best RacesTimes.TotalTime (ordered ASC with LIMIT 1) for each RaceID (a column of RacesTimes).
So the result would be:
Races.*, RacesTimes.Finisher, RacesTimes.Time
This is what I made:
SELECT
Races.*,
(
SELECT
`TotalTime`
FROM
`RacesTimes`
WHERE
`RaceID` = Races.ID
ORDER BY
`TotalTime` ASC
LIMIT 1
) AS `BestTime`,
(
SELECT
`Time`
FROM
`RacesTimes`
WHERE
`RaceID` = Races.ID
ORDER BY
`TotalTime` ASC
LIMIT 1
) AS `BestTimeS`,
(
SELECT
`Finisher`
FROM
`RacesTimes`
WHERE
`RaceID` = Races.ID
ORDER BY
`TotalTime` ASC
LIMIT 1
) AS `BestFinisher`
FROM `Races`
It extracts everything correctly, but the query is way too long. Can't it be simplified? I think the simplified version uses LEFT JOIN or something like that, but I don't know how to write queries with JOIN.
The approach here is to aggregate RacesTimes by race. The trick is to get the finisher with the minimum time.
MySQL offers a solution for this, by using group_concat() and substring_index() in a clever way. group_concat() accepts an ORDER BY clause, so it can order the concatenated values by the time. Then the best finisher is in the first position, and substring_index() cuts it off at the first separator.
The SQL looks like this:
select r.*, rtr.mintt as TotalTime, rtr.Finisher
from Races r join
     (select RaceID, MIN(TotalTime) as mintt,
             substring_index(group_concat(Finisher order by TotalTime separator ','), ',', 1) as Finisher
      from RacesTimes rt
      group by RaceID
     ) rtr
     on rtr.RaceID = r.ID
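As a cross-check on the idea of "the row with the minimum TotalTime per race", here is a small sketch that gets the same result by joining back to the per-race minimum instead of using group_concat(). It is an illustration only: it uses an in-memory SQLite database from Python, the `Name` column and all sample rows are invented, and if two finishers tie for the best time this form returns both of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Races (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE RacesTimes (RaceID INTEGER, Finisher TEXT, Time TEXT, TotalTime INTEGER);
    INSERT INTO Races VALUES (1, 'Sprint'), (2, 'Marathon');
    INSERT INTO RacesTimes VALUES
        (1, 'Ann', '0:12', 12), (1, 'Bo', '0:11', 11),
        (2, 'Cy', '3:05', 185), (2, 'Di', '3:20', 200);
""")

# m holds the best TotalTime per race; rt is joined back to fetch that row's columns.
best = conn.execute("""
    SELECT r.ID, r.Name, rt.Finisher, rt.Time, rt.TotalTime
    FROM Races r
    JOIN (SELECT RaceID, MIN(TotalTime) AS mintt
          FROM RacesTimes
          GROUP BY RaceID) m ON m.RaceID = r.ID
    JOIN RacesTimes rt ON rt.RaceID = m.RaceID AND rt.TotalTime = m.mintt
    ORDER BY r.ID
""").fetchall()
```

This variant also returns the Time column, which the question asked for, at the cost of a second join.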
I managed to write the query below, which nearly works. The problem is that any NULL values zeroed out by IFNULL(likes.num, 0) get put at the end of the result table.
SELECT t.*, IFNULL(likes.num, 0)
FROM `textagname` as t
LEFT JOIN likes
ON t.tex = likes.tex
ORDER BY num DESC
Is there another way to write this query, not like this:
SELECT *
FROM (
SELECT t.*, IFNULL(likes.num, 0) AS num
FROM `textagname` as t
LEFT JOIN likes
ON t.tex = likes.tex
) d
ORDER BY d.num DESC
Preferably a way that doesn't make it take much longer.
The trick was to order by the generated value, not the one coming from the table. ORDER BY num sorts on the raw likes.num column, which is still NULL for unmatched rows. Notice the addition of the numLikes alias below.
SELECT t.*, IFNULL(likes.num, 0) as numLikes
FROM `textagname` as t
LEFT JOIN likes
ON t.tex = likes.tex
ORDER BY numLikes DESC
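The difference between sorting on the raw column and sorting on the alias can be seen in a small experiment. This is a sketch under assumptions: it uses SQLite from Python rather than MySQL (both put NULLs last in a DESC sort), and the sample rows, including a negative like count, are made up so that the two orderings actually differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE textagname (tex TEXT);
    CREATE TABLE likes (tex TEXT, num INTEGER);
    INSERT INTO textagname VALUES ('a'), ('b'), ('c');
    INSERT INTO likes VALUES ('a', 3), ('c', -2);  -- 'b' has no likes row
""")

base = """
    SELECT t.tex, IFNULL(likes.num, 0) AS numLikes
    FROM textagname AS t
    LEFT JOIN likes ON t.tex = likes.tex
"""

# Sorting on likes.num: the NULL for 'b' sorts after every real value.
by_column = [row[0] for row in conn.execute(base + " ORDER BY num DESC")]
# Sorting on the alias: 'b' is treated as 0 and lands between 3 and -2.
by_alias = [row[0] for row in conn.execute(base + " ORDER BY numLikes DESC")]
```

No extra subquery is needed, so this should cost about the same as the original query.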
I've got a MySQL statement that selects some data and also ranks it.
I want the record for 'Bob', for example, to be selected but NOT included in the ranking. So I need Bob's row returned by the main SELECT statement, but excluded from the sub-SELECT that handles the ranking. I need Bob's data, but he should not be counted in the rankings.
I tried AND t.name != 'Bob' after WHERE x.category = t.category, but that's not working.
SELECT t.name,
t.category,
t.score1,
t.score2,
(SELECT COUNT(*)
FROM my_table x
WHERE x.category = t.category
AND (x.score1 + x.score2) >= (t.score1 + t.score2)
) AS rank
FROM my_table t ORDER BY rank ASC
Any suggestions?
Thank you.
-Laxmidi
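You should probably use AND x.name != 'Bob' instead of t.name. The x rows are the ones being counted for the ranking, while t is the row being returned, so filtering on t only blanks out Bob's own count and leaves him in everyone else's.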
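A quick way to check the effect of filtering on x is to run the query on a few rows. The sketch below is illustrative only: it uses SQLite from Python, the three sample rows are invented, and the rank column is aliased as `rnk` to stay clear of reserved words. Note that Bob's own value becomes the number of other rows at or above his score, so it should not be read as a real rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (name TEXT, category TEXT, score1 INTEGER, score2 INTEGER);
    INSERT INTO my_table VALUES
        ('Ann', 'x', 5, 5), ('Bob', 'x', 9, 9), ('Cy', 'x', 3, 3);
""")

rows = conn.execute("""
    SELECT t.name,
           (SELECT COUNT(*)
            FROM my_table x
            WHERE x.category = t.category
              AND x.name != 'Bob'   -- Bob is never counted against anyone
              AND (x.score1 + x.score2) >= (t.score1 + t.score2)
           ) AS rnk
    FROM my_table t
    ORDER BY rnk ASC, t.name
""").fetchall()
```

Ann is ranked 1 and Cy 2 even though Bob has the highest score, which is the behaviour the question asks for.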
I'd like to optimize my queries, so I looked into mysql-slow.log.
Most of my slow queries contain ORDER BY RAND(). I cannot find a real solution to this problem. There is a possible solution at MySQLPerformanceBlog, but I don't think it is enough. On poorly optimized (or frequently updated, user-managed) tables it doesn't work, or I need to run two or more queries before I can select my PHP-generated random row.
Is there any solution for this issue?
A dummy example:
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
ORDER BY
RAND()
LIMIT 1
Try this:
SELECT  *
FROM    (
        SELECT  @cnt := COUNT(*) + 1,
                @lim := 10
        FROM    t_random
        ) vars
STRAIGHT_JOIN
        (
        SELECT  r.*,
                @lim := @lim - 1
        FROM    t_random r
        WHERE   (@cnt := @cnt - 1)
                AND RAND(20090301) < @lim / @cnt
        ) i
This is especially efficient on MyISAM (since the COUNT(*) is instant), but even in InnoDB it's 10 times more efficient than ORDER BY RAND().
The main idea here is that we don't sort, but instead keep two variables and calculate the running probability of a row to be selected on the current step.
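The running-probability idea can be sketched outside SQL as well. The following Python function is an illustration, not the answer's code: `sample_in_one_pass` is a made-up name, `remaining` plays the role of @cnt and `needed` the role of @lim, and it assumes the rows are available as a list whose length is known up front.

```python
import random


def sample_in_one_pass(rows, k, rng=random):
    """Pick k rows in a single sequential pass, without sorting."""
    remaining = len(rows)  # rows not yet examined (like @cnt)
    needed = k             # rows still to pick (like @lim)
    picked = []
    for row in rows:
        if needed == 0:
            break
        # Keep the current row with probability needed / remaining.
        # When needed == remaining the probability is 1, so the quota is always met.
        if rng.random() < needed / remaining:
            picked.append(row)
            needed -= 1
        remaining -= 1
    return picked


sample = sample_in_one_pass(list(range(1000)), 10, random.Random(42))
```

Because the probability is adjusted after every row, the pass returns exactly k rows (or all rows if there are fewer than k), and the picked rows come out in table order.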
See this article in my blog for more detail:
Selecting random rows
Update:
If you need to select but a single random record, try this:
SELECT aco.*
FROM (
SELECT minid + FLOOR((maxid - minid) * RAND()) AS randid
FROM (
SELECT MAX(ac_id) AS maxid, MIN(ac_id) AS minid
FROM accomodation
) q
) q2
JOIN accomodation aco
ON aco.ac_id =
COALESCE
(
(
SELECT accomodation.ac_id
FROM accomodation
WHERE ac_id > randid
AND ac_status != 'draft'
AND ac_images != 'b:0;'
AND NOT EXISTS
(
SELECT NULL
FROM accomodation_category
WHERE acat_id = ac_category
AND acat_slug = 'vendeglatohely'
)
ORDER BY
ac_id
LIMIT 1
),
(
SELECT accomodation.ac_id
FROM accomodation
WHERE ac_status != 'draft'
AND ac_images != 'b:0;'
AND NOT EXISTS
(
SELECT NULL
FROM accomodation_category
WHERE acat_id = ac_category
AND acat_slug = 'vendeglatohely'
)
ORDER BY
ac_id
LIMIT 1
)
)
This assumes your ac_id's are distributed more or less evenly.
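To see why the even-distribution assumption matters, here is a small simulation of the same selection rule: pick a random value between the smallest and largest id, then take the first id above it. It is a sketch only; `pick_by_random_id` and the id list are hypothetical, and the filters on status, images and category are left out.

```python
import bisect
import random
from collections import Counter


def pick_by_random_id(sorted_ids, rng=random):
    """Return the first id greater than a random point in [min_id, max_id)."""
    lo, hi = sorted_ids[0], sorted_ids[-1]
    randid = lo + int((hi - lo) * rng.random())
    pos = bisect.bisect_right(sorted_ids, randid)  # index of first id > randid
    if pos == len(sorted_ids):
        return sorted_ids[0]  # fallback, like the second COALESCE branch
    return sorted_ids[pos]


rng = random.Random(0)
# Hypothetical ids with one large gap between 3 and 100.
counts = Counter(pick_by_random_id([1, 2, 3, 100], rng) for _ in range(1000))
```

With this id list almost every draw lands in the gap and returns 100, and the lowest id is never reached through the first branch, so the method is only close to uniform when the ids are dense.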
It depends on how random you need to be. The solution you linked works pretty well IMO. Unless you have large gaps in the ID field, it's still pretty random.
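However, you should be able to do it in one query using something like this (for selecting a single value). MAX(id) cannot be used directly in the WHERE clause, so the random threshold is computed once in a derived table:
SELECT [fields] FROM [table] CROSS JOIN (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM [table])) AS rid) r WHERE id >= r.rid ORDER BY id LIMIT 1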
Other solutions:
Add a permanent float field called rnd to the table and fill it with random numbers. You can then generate a random number in PHP and do "SELECT ... WHERE rnd > $random ORDER BY rnd LIMIT 1"
Grab the entire list of IDs and cache them in a text file. Read the file and pick a random ID from it.
Cache the results of the query as HTML and keep it for a few hours.
Here's how I'd do it:
-- FLOOR keeps the offset within 0 .. COUNT(*)-1; a rounded value equal to COUNT(*) would select nothing
SET @r := (SELECT FLOOR(RAND() * (SELECT COUNT(*)
  FROM    accomodation a
  JOIN    accomodation_category c
    ON (a.ac_category = c.acat_id)
  WHERE   a.ac_status != 'draft'
    AND c.acat_slug != 'vendeglatohely'
    AND a.ac_images != 'b:0;')));
SET @sql := CONCAT('
  SELECT  a.ac_id,
    a.ac_status,
    a.ac_name,
    a.ac_status,
    a.ac_images
  FROM    accomodation a
  JOIN    accomodation_category c
    ON (a.ac_category = c.acat_id)
  WHERE   a.ac_status != ''draft''
    AND c.acat_slug != ''vendeglatohely''
    AND a.ac_images != ''b:0;''
  LIMIT ', @r, ', 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
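The count-then-offset approach above translates directly to application code. The sketch below is a simplified illustration: it uses SQLite from Python, drops the category join to keep the example short, and `random_row_by_offset`, the `FILTER` constant and the sample rows are all made up.

```python
import random
import sqlite3

# Same row filter for the count and for the fetch, so the offset is always valid.
FILTER = "ac_status != 'draft' AND ac_images != 'b:0;'"


def random_row_by_offset(conn, rng=random):
    """Count the qualifying rows, then fetch the row at a random offset."""
    (total,) = conn.execute(
        f"SELECT COUNT(*) FROM accomodation WHERE {FILTER}"
    ).fetchone()
    if total == 0:
        return None
    offset = rng.randrange(total)  # 0 .. total-1, never past the last row
    return conn.execute(
        f"SELECT ac_id, ac_name FROM accomodation WHERE {FILTER} "
        "ORDER BY ac_id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accomodation ("
    "ac_id INTEGER PRIMARY KEY, ac_name TEXT, ac_status TEXT, ac_images TEXT)"
)
conn.executemany(
    "INSERT INTO accomodation VALUES (?, ?, ?, ?)",
    [(1, "A", "live", "x"), (2, "B", "draft", "x"),
     (3, "C", "live", "b:0;"), (4, "D", "live", "y")],
)
picked = random_row_by_offset(conn)
```

Every qualifying row is equally likely regardless of gaps in the ids, but the database still has to skip `offset` rows, so large offsets on large tables remain relatively slow.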
(Yeah, I will get dinged for not having enough meat here, but can't you be a vegan for one day?)
Case: Consecutive AUTO_INCREMENT without gaps, 1 row returned
Case: Consecutive AUTO_INCREMENT without gaps, 10 rows
Case: AUTO_INCREMENT with gaps, 1 row returned
Case: Extra FLOAT column for randomizing
Case: UUID or MD5 column
Those 5 cases can be made very efficient for large tables. See my blog for the details.
This gives you a single subquery that uses the index to pick a random id; the outer query then fetches the joined row for that id.
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
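AND accomodation.ac_id IN (
    -- wrapped in a derived table because MySQL rejects LIMIT directly inside an IN subquery
    SELECT ac_id FROM (
        SELECT accomodation.ac_id FROM accomodation ORDER BY RAND() LIMIT 1
    ) AS pick
)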
The solution for your dummy-example would be:
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation
JOIN
  accomodation_category
  ON accomodation.ac_category = accomodation_category.acat_id
JOIN
  (
    SELECT CEIL(RAND()*(SELECT MAX(ac_id) FROM accomodation)) AS ac_id
  ) AS Choices
  ON accomodation.ac_id >= Choices.ac_id
WHERE accomodation.ac_status != 'draft'
  AND accomodation_category.acat_slug != 'vendeglatohely'
  AND ac_images != 'b:0;'
ORDER BY accomodation.ac_id
LIMIT 1
To read more about alternatives to ORDER BY RAND(), you should read this article.
I am optimizing a lot of existing queries in my project. Quassnoi's solution has helped me speed up the queries a lot! However, I find it hard to incorporate the said solution in all queries, especially for complicated queries involving many subqueries on multiple large tables.
So I am using a less optimized solution. Fundamentally it works the same way as Quassnoi's solution.
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
AND rand() <= $size * $factor / [accomodation_table_row_count]
LIMIT $size
$size * $factor / [accomodation_table_row_count] works out the probability of picking a row. rand() generates a random number for each row, and the row is selected if rand() is less than or equal to that probability. This effectively performs a random selection that limits the table size. Since there is a chance it will return fewer rows than the requested limit, we need to increase the probability to make sure enough rows are selected. Hence we multiply $size by a $factor (I usually set $factor = 2, which works in most cases). Finally we apply LIMIT $size.
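The probability filter can be sketched in a few lines of Python. This is an illustration under assumptions, not the query itself: `bernoulli_sample` is a made-up name, the rows are a plain list, and `size` and `factor` stand in for $size and $factor.

```python
import random


def bernoulli_sample(rows, size, factor=2, rng=random):
    """Keep each row with probability size * factor / len(rows), stop at `size` rows."""
    if not rows:
        return []
    probability = min(1.0, size * factor / len(rows))
    picked = []
    for row in rows:
        if rng.random() <= probability:
            picked.append(row)
            if len(picked) == size:  # the LIMIT $size part
                break
    return picked


sample = bernoulli_sample(list(range(10000)), 10, factor=2, rng=random.Random(7))
```

Two consequences are worth keeping in mind: the result can still contain fewer than `size` rows when the filter happens to accept too few, and because the scan stops at the first `size` hits, rows near the end of the scan order are picked less often as `factor` grows.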
The problem now is working out the accomodation_table_row_count.
If we know the table size, we COULD hard-code it. This would run the fastest, but obviously it is not ideal. If you are using MyISAM, getting the table count is very efficient. Since I am using InnoDB, I am just doing a simple count+selection. In your case, it would look like this:
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
AND rand() <= $size * $factor / (select (SELECT count(*) FROM `accomodation`) * (SELECT count(*) FROM `accomodation_category`))
LIMIT $size
The tricky part is working out the right probability. As you can see, the following code only calculates a rough temp table size (in fact, too rough!): (select (SELECT count(*) FROM accomodation) * (SELECT count(*) FROM accomodation_category)). You can refine this logic to give a closer approximation of the table size. Note that it is better to OVER-select than to under-select rows, i.e. if the probability is set too low, you risk not selecting enough rows.
This solution runs slower than Quassnoi's solution since we need to recalculate the table size. However, I find this code a lot more manageable. It is a trade-off between accuracy and performance on one side and coding complexity on the other. Having said that, on large tables this is still far faster than ORDER BY RAND().
Note: If the query logic permits, perform the random selection as early as possible before any join operations.
My recommendation is to add a column with a UUID (version 4) or other random value, with a unique index (or just use it as the primary key).
Then you can simply generate a random value at query time and select rows greater than the generated value, ordering by the random column.
If you receive fewer than the expected number of rows, make sure you repeat the query without the greater-than clause (to select rows at the "beginning" of the result set).
uuid = generateUUIDV4()
select * from foo
where uuid > :uuid
order by uuid
limit 42
if count(results) < 42 {
select * from foo
order by uuid
limit :remainingResultsRequired
}
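The wrap-around step in the pseudocode above can be made concrete. The sketch below is an assumption-laden illustration: it keeps the random keys in a sorted Python list instead of an indexed column, and `pick_after` is a hypothetical helper that stands in for the two queries.

```python
import bisect
import uuid


def pick_after(sorted_keys, start, n):
    """Take up to n keys greater than `start`, wrapping to the beginning if needed."""
    pos = bisect.bisect_right(sorted_keys, start)  # like: where uuid > :uuid
    picked = sorted_keys[pos:pos + n]              # like: order by uuid limit n
    missing = n - len(picked)
    if missing > 0:
        # Second query: start again from the smallest keys, but never go past
        # `pos`, so no key is returned twice.
        picked += sorted_keys[:min(missing, pos)]
    return picked


keys = sorted(str(uuid.uuid4()) for _ in range(100))
# Start just before the end of the key space to force the wrap-around.
wrapped = pick_after(keys, keys[-3], 5)
```

In the SQL version the same duplicate guard can be expressed by limiting the second query to keys that are not greater than the generated value.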
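// Pick random ids until one of them matches an existing row.
// NUM_OF_ROWS_OR_CLOSE_TO_IT should be close to the highest rowid in the table.
function getRandomRow(PDO $db) {
    do {
        $id  = mt_rand(1, NUM_OF_ROWS_OR_CLOSE_TO_IT);
        $res = getRowById($db, $id);
    } while (empty($res)); // a miss (gap in the ids) just triggers another try
    return $res;
}

// rowid is a key on the table
function getRowById(PDO $db, $rowid) {
    $stmt = $db->prepare('SELECT * FROM `table` WHERE rowid = ?');
    $stmt->execute([$rowid]);
    return $stmt->fetch(PDO::FETCH_ASSOC); // false when no row has this id
}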