I have two tables, Writer and Book. A writer can produce many books. I want to get all the writers who have produced the maximal number of books.
First, my SQL query looks like this:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
)
WHERE NUMBER=(SELECT MAX(NUMBER) FROM
(SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE Writer.ID=Book.ID
GROUP BY Writer.Name
))
It works. However, I think this query is too long and contains duplication. I want to make it shorter, so I tried another query:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
HAVING NUMBER = MAX(NUMBER)
)
However, this HAVING clause doesn't work, and SQLite reports an error.
I don't know why. Can anyone explain? Thank you!
The HAVING clause filters the final result set (typically after a GROUP BY) and does not provide any additional grouping. Think of it as a WHERE clause that is applied after the GROUP BY.
Your query with the HAVING NUMBER = MAX(NUMBER) implies grouping of the set of NUMBER values across all records and doesn't make sense in this example (even though we all get what you want it to do).
Each query provides you with one level of aggregation, so you cannot use Max on COUNT in the same query. You need a sub-query like you did in your first query.
However, your first query can be simplified on MySQL to:
SELECT Writer.Name
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
HAVING COUNT(Book.ID) = (SELECT COUNT(Book.ID) AS n
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
ORDER BY n DESC
LIMIT 1)
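If the goal is mainly to avoid writing the counting query twice, a common table expression is one option. The following is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. The WriterID column and the sample rows are assumptions made for the illustration; the original schema joins on Writer.ID = Book.ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Writer (ID INTEGER PRIMARY KEY, Name TEXT);
    -- WriterID is an assumed foreign-key column for this sketch
    CREATE TABLE Book (ID INTEGER PRIMARY KEY, WriterID INTEGER, Title TEXT);
    INSERT INTO Writer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Book VALUES
        (1, 1, 'A1'), (2, 1, 'A2'),
        (3, 2, 'B1'), (4, 2, 'B2'),
        (5, 3, 'C1');
""")

# The per-writer counts are defined once in the CTE and referenced twice:
# once as the row source and once to find the maximum count.
query = """
    WITH counts AS (
        SELECT w.Name AS Name, COUNT(b.ID) AS n
        FROM Writer w JOIN Book b ON b.WriterID = w.ID
        GROUP BY w.Name
    )
    SELECT Name FROM counts
    WHERE n = (SELECT MAX(n) FROM counts)
    ORDER BY Name
"""
top_writers = [row[0] for row in conn.execute(query)]
print(top_writers)  # ['Ann', 'Bob']
```

Both writers with two books are returned, so ties for first place are kept.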
In MySQL (but not SQLite), you can use variables to reduce the amount of work and make a simpler query. However, there are nuances there, because variables with group by require an extra level of subqueries:
SELECT name
FROM (SELECT t.*, (@m := if(@m = 0, NUMBER, @m)) as maxn
FROM (SELECT w.Name, COUNT(b.ID) AS NUMBER
FROM Writer w JOIN
Book b
ON w.ID = b.ID
GROUP BY w.Name
) t CROSS JOIN
(SELECT @m := 0) params
ORDER BY NUMBER desc
) t
WHERE maxn = number;
It looks like you are nesting aggregate functions, which is not allowed.
HAVING NUMBER = MAX(NUMBER) is like HAVING COUNT(Book.ID) = MAX(COUNT(Book.ID))
Nesting COUNT inside MAX seems to be the issue here
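A small sketch of that point, using Python's sqlite3 module with an in-memory database. The two-column Book table and its rows are made up for the illustration. The nested form is rejected, while taking MAX over a sub-query that already holds the counts is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal table, only for this demonstration
    CREATE TABLE Book (ID INTEGER PRIMARY KEY, WriterID INTEGER);
    INSERT INTO Book VALUES (1, 1), (2, 1), (3, 2);
""")

# One query level, two nested aggregates: SQLite refuses to run it.
try:
    conn.execute("SELECT MAX(COUNT(ID)) FROM Book GROUP BY WriterID").fetchall()
    nested_error = None
except sqlite3.Error as exc:
    nested_error = str(exc)
print("nested aggregate error:", nested_error)

# Two query levels: the inner one counts, the outer one takes the maximum.
max_count = conn.execute("""
    SELECT MAX(n) FROM (
        SELECT COUNT(ID) AS n FROM Book GROUP BY WriterID
    )
""").fetchone()[0]
print(max_count)  # 2
```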
I have rows that get updated automatically. Sometimes a row is updated via a new insert (an almost duplicate row) in which some columns stay the same and others have new values. I want to pull the most recent, up-to-date row with all of its values. Here's what I've got:
SELECT * FROM
(SELECT * FROM
(SELECT * FROM entries
WHERE dataset_id = xxx
ORDER BY time_added DESC
) alias1 GROUP BY title
) alias2 ORDER BY timestamp
Working backwards through this query:
SELECT #1 > Reorders the rows for display by the timestamp at which they were initially created (not added)
SELECT #2 > Filters SELECT #3 down to distinct title values (the most recent row per title)
SELECT #3 > The first query actually executed. Gets the dataset ordered by time_added
Is there a more efficient way to do this? I get a serious code smell from it.
Use a group by and join:
select e.*
from entries e join
(select title, max(time_added) as maxta
from entries e
where dataset_id = xxx
group by title
) emax
on emax.title = e.title and e.time_added = emax.maxta
where dataset_id = xxx
order by e.timestamp;
Your method uses a MySQL extension to group by, where you have columns in the select list that are not in the group by. This is explicitly documented to return indeterminate results. Don't use features that are documented not to work, even if they seem to under some circumstances.
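As a rough check of the group-by-and-join pattern, here is a sketch against an in-memory SQLite database via Python's sqlite3 module. The column types, the value column and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed shape of the entries table; only the columns the query needs
    CREATE TABLE entries (
        id INTEGER PRIMARY KEY,
        dataset_id INTEGER,
        title TEXT,
        value TEXT,
        timestamp INTEGER,
        time_added INTEGER
    );
    INSERT INTO entries (dataset_id, title, value, timestamp, time_added) VALUES
        (1, 'alpha', 'old',           100, 1),
        (1, 'alpha', 'new',           100, 5),
        (1, 'beta',  'only',           50, 2),
        (2, 'alpha', 'other dataset',  10, 9);
""")

# The derived table finds the newest time_added per title; joining back on
# (title, time_added) recovers the full row that carries that value.
query = """
    SELECT e.title, e.value
    FROM entries e
    JOIN (SELECT title, MAX(time_added) AS maxta
          FROM entries
          WHERE dataset_id = :ds
          GROUP BY title) emax
      ON emax.title = e.title AND emax.maxta = e.time_added
    WHERE e.dataset_id = :ds
    ORDER BY e.timestamp
"""
latest = conn.execute(query, {"ds": 1}).fetchall()
print(latest)  # [('beta', 'only'), ('alpha', 'new')]
```

If two rows for the same title share the same time_added, both are returned, so a tie-breaker column would be needed in that case.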
Let's say we've got a high scores table with columns app_id, best_score, best_time, most_drops, longest_something and a couple more.
I'd like to collect the top three results IN EACH CATEGORY, grouped by app_id.
For now I'm using separate rank queries on each category in a loop:
SELECT app_id, best_something1,
FIND_IN_SET( best_something1,
(SELECT GROUP_CONCAT( best_something1
ORDER BY best_something1 DESC)
FROM highscores )) AS rank
FROM highscores
ORDER BY best_something1 DESC LIMIT 3;
Two things worth adding:
All columns for a specific app are updated at the same time (so I could consider creating a helper table).
The result of the prospective "turbo query" might be requested quite often, as often as the values are updated.
I'm quite basic with SQL and suspect that it has many more commands that combined together could do the magic?
What I'd expect from this post is that some wise owl would at least point the direction where to go or how to go.
The sample table:
http://sqlfiddle.com/#!2/eef053/1
Here is a sample result too (already in JSON format, sorry):
{"total_blocks":[["13","174","1"],["9","153","2"],["10","26","3"]],"total_games":[["13","15","1"],["9","12","2"],["10","2","3"]],"total_score":[["13","410","1"],["9","332","2"],["11","88","3"]],"aver_pps":[["11","4.34011","1"],["13","2.64521","2"],["12","2.60623","3"]],"aver_drop_per_game":[["11","20","1"],["10","13","2"],["9","12.75","3"]],"aver_drop_val":[["11","4.4","1"],["13","2.35632","2"],["9","2.16993","3"]],"aver_score":[["11","88","1"],["9","27.6667","2"],["13","27.3333","3"]],"best_pps":[["13","4.9527","1"],["11","4.34011","2"],["9","4.13076","3"]],"most_drops":[["11","20","1"],["9","16","2"],["13","16","2"]],"longest_drop":[["9","3","1"],["13","2","2"],["11","2","2"]],"best_drop":[["11","42","1"],["13","36","2"],["9","30","3"]],"best_score":[["11","88","1"],["13","78","2"],["9","58","3"]]}
When I encounter this scenario, I prefer to employ the UNION clause, and combine the queries tailored to each ORDERing and LIMIT.
http://dev.mysql.com/doc/refman/5.1/en/union.html
UNION combines the result rows vertically (top 3 rows for 5 sort categories yields 15 rows).
For your specific purpose, you might then pivot them as sub-SELECTs, rolling them up with GROUP_CONCAT GROUPed on user so that each has the delimited list.
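A minimal sketch of the UNION approach, run on SQLite through Python's sqlite3 module. The highscores table is cut down to two categories with made-up rows. Each branch is wrapped in a sub-select because SQLite does not accept ORDER BY and LIMIT directly on the individual parts of a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical version of the high scores table
    CREATE TABLE highscores (
        app_id INTEGER PRIMARY KEY,
        best_score INTEGER,
        most_drops INTEGER
    );
    INSERT INTO highscores VALUES
        (9, 58, 16), (10, 12, 5), (11, 88, 20), (12, 30, 7), (13, 78, 15);
""")

categories = ["best_score", "most_drops"]

def top_three(column):
    # column comes from the fixed list above, never from user input
    return (
        "SELECT category, app_id, val FROM ("
        f"SELECT '{column}' AS category, app_id, {column} AS val "
        f"FROM highscores ORDER BY {column} DESC LIMIT 3)"
    )

# One branch per category, stacked vertically with UNION ALL.
query = " UNION ALL ".join(top_three(c) for c in categories)
query += " ORDER BY category, val DESC"
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Two categories with three rows each yield six rows, each labelled with the category it belongs to.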
I'd test something like this query, to see if the performance is any better or not. I think this comes pretty close to satisfying the specification:
( SELECT 99 AS seq_
, a.category
, CONVERT(a.val,DOUBLE) AS val
, FIND_IN_SET(a.val,r.highest_vals) AS rank
, a.user_id
FROM ( SELECT 'total_blocks' AS category
, b.`total_blocks` AS val
, b.user_id
FROM app b
ORDER BY b.`total_blocks` DESC
LIMIT 3
) a
CROSS
JOIN ( SELECT GROUP_CONCAT(s.val ORDER BY s.val DESC) AS highest_vals
FROM ( SELECT t.`total_blocks` AS val
FROM app t
ORDER BY t.`total_blocks` DESC
LIMIT 3
) s
) r
ORDER BY a.val DESC
)
UNION ALL
( SELECT 97 AS seq_
, a.category
, CONVERT(a.val,DOUBLE) AS val
, FIND_IN_SET(a.val,r.highest_vals) AS rank
, a.user_id
FROM ( SELECT 'XXX' AS category
, b.`XXX` AS val
, b.user_id
FROM app b
ORDER BY b.`XXX` DESC
LIMIT 3
) a
CROSS
JOIN ( SELECT GROUP_CONCAT(s.val ORDER BY s.val DESC) AS highest_vals
FROM ( SELECT t.`XXX` AS val
FROM app t
ORDER BY t.`XXX` DESC
LIMIT 3
) s
) r
ORDER BY a.val DESC
)
ORDER BY seq_ DESC, val DESC
To unpack this a little bit... these are essentially separate queries that are combined with the UNION ALL set operator.
Each of the queries returns a literal value to allow for ordering. (In this case, I've given the column a rather anonymous name, seq_ (sequence). If the specific order isn't important, this column could be removed.)
Each query is also returning a literal value that tells which "category" the row is for.
Because some of the values returned are INTEGER, and others are FLOAT, I'd cast all of those values to floating point, so the datatypes of each query line up.
For the FLOAT (floating point) type values, there can be a problem with comparison. So I'd go with casting those to decimal and stringing them together into a list using GROUP_CONCAT (as the original query does).
Since we are returning only three rows from each query, we only need to concatenate together the three largest values. (If there's a two way "tie" for first place, we'll return rank values of 1, 1, 3.)
Suitable indexes for each query will improve performance for large sets.
... ON app (total_blocks, user_id)
... ON app (best_pps,user_id)
... ON app (XXX,user_id)
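The tie behaviour described above (rank values of 1, 1, 3) comes from looking up the first position of a value in the sorted list. A small Python sketch of that ranking rule follows; the function name is made up for the illustration, and it works on plain numbers rather than on the comma-separated string that FIND_IN_SET searches.

```python
def competition_ranks(values):
    """Rank values in descending order; tied values share the first position."""
    ordered = sorted(values, reverse=True)
    # list.index returns the first match, so equal values get the same rank
    # and the rank after a tie skips ahead.
    return [ordered.index(v) + 1 for v in ordered]

ranks = competition_ranks([20, 16, 16])
ranks_tie_first = competition_ranks([16, 20, 20])
print(ranks)            # [1, 2, 2]
print(ranks_tie_first)  # [1, 1, 3]
```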
I have a lot of users with websites, and I want to select all websites and sort them by visitor amount. The users can specify the visitor amount in two ways. The first way is to input it manually as a string, which is stored in fb.visitors in the query below.
The second way is that the user installs a JavaScript tracking code on their site, which then adds entries to the table tracking_visits; the total number of visits is COUNT(tv.id) below.
I want to be able to sort this result in 2 ways.
1) I want the highest value on top and the lowest at the bottom, using both columns. For example, the result should be:
99'947 ( COUNT(tv.id) )
75'412 ( COUNT(tv.id) )
40'000 ( fb.visitors )
37'482 ( COUNT(tv.id) )
30'000 ( fb.visitors )
2) For the second sort, I would like all COUNT(tv.id) values on top, highest first, followed by the fb.visitors values, highest first. Example:
99'947 ( COUNT(tv.id) )
75'412 ( COUNT(tv.id) )
37'482 ( COUNT(tv.id) )
40'000 ( fb.visitors )
30'000 ( fb.visitors )
My current Query looks like this:
SELECT cs.userid, fb.visitors, COUNT( tv.id )
FROM campaigns_signups cs
INNER JOIN fe_blogs fb ON cs.userid = fb.userid
INNER JOIN tracking_visits tv ON tv.blogid = cs.userid
WHERE tv.visitdate
BETWEEN "2013-09-04"
AND "2013-10-04"
AND cs.campaignid = "97"
AND cs.status < "4"
GROUP BY tv.blogid
ORDER BY COUNT( tv.id ) , fb.visitors DESC
Note that the dates and integers in the query are just examples.
The problem with this query is that it only selects blogs that have entries in tracking_visits. I want a result that contains BOTH the blogs that have a visitor amount in tracking_visits AND the blogs that have a visitor amount in fb.visitors.
For your first task, you can use ORDER BY GREATEST(COUNT(tv.id), fb.visitors) DESC (see the MySQL documentation on GREATEST). For your second, you will want to use UNION (see the MySQL documentation on UNION).
If for your first task you want each site to yield two rows (one for the greatest of the two values and the other for the least), you can again achieve this using UNION.
You are looking for GREATEST:
select greatest(ifnull(fb.visitors,0),count(tv.id)) from.... order by 1 desc
select greatest(ifnull(fb.visitors,0),count(tv.id)) from....
order by
case when greatest(ifnull(fb.visitors,0),count(tv.id))=fb.visitors then 2 else 1 end, greatest(ifnull(fb.visitors,0),count(tv.id)) desc
The CASE in the second ORDER BY sorts first by the source of the value (tracked counts before manually entered visitors) and then by the size of the value.
For the second option of selecting the COUNT(tv.id) first, I was able to accomplish this by the following query:
SELECT *, tv.tracked_visits
FROM campaigns_signups cs
INNER JOIN fe_blogs fb ON cs.userid = fb.userid
LEFT JOIN (
SELECT blogid, COUNT( id ) AS "tracked_visits"
FROM tracking_visits
WHERE visitdate
BETWEEN "2013-09-04"
AND "2013-10-04"
GROUP BY blogid
) AS tv ON tv.blogid = cs.userid
WHERE cs.campaignid = :campaignid
AND cs.status < :status
ORDER BY tv.tracked_visits DESC , fb.visitors DESC
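Both requested sort orders can be tried out on a small data set. The following sketch uses SQLite through Python's sqlite3 module, so SQLite's multi-argument max() stands in for MySQL's GREATEST. The tables are reduced to the columns needed, fb.visitors is assumed to be an integer rather than a string, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of the two tables
    CREATE TABLE fe_blogs (userid INTEGER PRIMARY KEY, visitors INTEGER);
    CREATE TABLE tracking_visits (id INTEGER PRIMARY KEY, blogid INTEGER);
    INSERT INTO fe_blogs VALUES (1, NULL), (2, NULL), (3, 3), (4, NULL), (5, 1);
""")
# Blogs 1, 2 and 4 use tracking; blogs 3 and 5 entered a number manually.
tracked = {1: 5, 2: 4, 4: 2}
conn.executemany(
    "INSERT INTO tracking_visits (blogid) VALUES (?)",
    [(blogid,) for blogid, n in tracked.items() for _ in range(n)],
)

# LEFT JOIN keeps blogs without tracking rows; COALESCE turns NULL into 0
# because SQLite's scalar max() returns NULL if any argument is NULL.
base = """
    SELECT fb.userid,
           MAX(COALESCE(tv.tracked_visits, 0), COALESCE(fb.visitors, 0)) AS best
    FROM fe_blogs fb
    LEFT JOIN (SELECT blogid, COUNT(id) AS tracked_visits
               FROM tracking_visits GROUP BY blogid) tv
           ON tv.blogid = fb.userid
"""
# Sort 1: highest value first, regardless of where it came from.
by_value = conn.execute(base + " ORDER BY best DESC").fetchall()
# Sort 2: tracked blogs first (IS NULL is 0 for them), then by value.
tracked_first = conn.execute(
    base + " ORDER BY tv.tracked_visits IS NULL, best DESC"
).fetchall()
print(by_value)       # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(tracked_first)  # [(1, 5), (2, 4), (4, 2), (3, 3), (5, 1)]
```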
I run a website where users can post items (e.g. pictures). The items are stored in a MySQL database.
I want to query for the last ten posted items, BUT with the constraint that at most 3 items can come from any single user.
What is the best way of doing it? My preferred solution is a constraint placed on the SQL query that requests the last ten items, but ideas on how to set up the database design are very welcome.
Thanks in advance!
BR
It's pretty easy with a correlated sub-query:
SELECT `img`.`id` , `img`.`userid`
FROM `img`
WHERE 3 > (
SELECT count( * )
FROM `img` AS `img1`
WHERE `img`.`userid` = `img1`.`userid`
AND `img`.`id` < `img1`.`id` )
ORDER BY `img`.`id` DESC
LIMIT 10
The query assumes that a larger id means the item was added later.
Correlated sub-queries are a powerful tool! :-)
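A sketch of the per-user cap on an in-memory SQLite database, using Python's sqlite3 module. The rows are made up, and a larger id is assumed to mean a later post. For each row, the correlated sub-query counts how many newer rows the same user has; a row is kept only if fewer than three exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE img (id INTEGER PRIMARY KEY, userid INTEGER)")
# Hypothetical posts: user 1 posts seven items, user 2 two, user 3 one.
conn.executemany(
    "INSERT INTO img (id, userid) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (4, 1), (5, 3),
     (6, 1), (7, 2), (8, 1), (9, 1), (10, 1)],
)

query = """
    SELECT img.id, img.userid
    FROM img
    WHERE 3 > (SELECT COUNT(*)
               FROM img AS newer
               WHERE newer.userid = img.userid
                 AND newer.id > img.id)
    ORDER BY img.id DESC
    LIMIT 10
"""
latest = conn.execute(query).fetchall()
print(latest)
# [(10, 1), (9, 1), (8, 1), (7, 2), (5, 3), (2, 2)]
```

User 1 contributes only ids 10, 9 and 8 even though that user has seven rows. The sub-query runs once per candidate row, so an index on (userid, id) is likely to matter on a large table.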
This is difficult because MySQL does not support the LIMIT clause in IN/ALL/ANY/SOME sub-queries. If it did, this would be rather trivial... But alas, here is my naïve approach:
SELECT
i.UserId,
i.ImageId
FROM
UserSuppliedImages i
WHERE
/* first valid ImageId (the user's latest) */
ImageId = (
SELECT MAX(ImageId)
FROM UserSuppliedImages
WHERE UserId = i.UserId
)
OR
/* second valid ImageId */
ImageId = (
SELECT MAX(ImageId)
FROM UserSuppliedImages
WHERE UserId = i.UserId
AND ImageId < (
SELECT MAX(ImageId)
FROM UserSuppliedImages
WHERE UserId = i.UserId
)
)
/* you get the picture...
the more "per user" images you want, the more complex this will get */
LIMIT 10;
You did not comment on having a preferred result order, so this selects the latest images (assuming ImageId is an ascending auto-incrementing value).
For comparison, on SQL Server the same would look like this:
SELECT TOP 10
img.ImageId,
img.ImagePath,
img.UserId
FROM
UserSuppliedImages img
WHERE
ImageId IN (
SELECT TOP 3 ImageId
FROM UserSuppliedImages
WHERE UserId = img.UserId
)
I would first select 10 distinct users, then select images from each of those users with a LIMIT 3, possibly as a union of all those queries, and limit the combined result to 10.
That would at least narrow the data you need to process down to a reasonable amount.