Suppose we have a table like the one below.
Id | Name | Group
-----------------
1 | John | 1
2 | Zayn | 2
3 | Four | 2
4 | Ben_ | 3
5 | Joe_ | 2
6 | Anna | 1
The query below will select all of them.
SELECT `Name` FROM `Table` WHERE 1;
How would I select only one person from each group? Who it is doesn't really matter, as long as there's only one name from group 1 and one name from group 2 etc.
The GROUP BY clause isn't fit for this (according to my error console) because I am selecting non aggregated values, which makes sense.
The DISTINCT clause isn't great here either, since I don't want to select the "Group" and definitely not group by their names.
If it doesn't matter which name is returned, you can use an aggregate function such as MAX() or MIN(). Note that Group is a reserved word in MySQL, so the column name has to be quoted with backticks:
select max(name) from your_table
group by `Group`;
Otherwise you can use a subquery:
select name from your_table
where Id in (select min(Id) from your_table group by `Group`);
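The subquery variant can be tried out with a minimal sketch like the one below. It assumes an in-memory SQLite database as a stand-in for MySQL, and the table name people is hypothetical (the question only calls it Table). SQLite quotes the reserved column name with double quotes where MySQL uses backticks.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
# The table name "people" is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE people (Id INTEGER PRIMARY KEY, Name TEXT, "Group" INTEGER)')
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", 1), (2, "Zayn", 2), (3, "Four", 2),
     (4, "Ben_", 3), (5, "Joe_", 2), (6, "Anna", 1)],
)

# Keep only the row with the smallest Id in each group.
one_per_group = conn.execute(
    'SELECT Name FROM people '
    'WHERE Id IN (SELECT MIN(Id) FROM people GROUP BY "Group") '
    'ORDER BY Id'
).fetchall()

print(one_per_group)  # [('John',), ('Zayn',), ('Ben_',)]
```

Picking MIN(Id) makes the choice deterministic: each group always yields the row that was inserted first.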
I am not very good at SQL but I am getting there. I have searched Stack Overflow but I can't seem to find the solution and I hope someone out there can help me. I have a table (users) with data like the following. The book_id column is a key to another table that contains a book the user is subscribed to.
|--------|---------------------|------------------|
| id | book_id | name |
|--------|---------------------|------------------|
| 1 | 1 | jim |
| 2 | 1 | joyce |
| 3 | 1 | mike |
| 4 | 1 | eleven |
| 5 | 2 | max |
| 6 | 2 | dustin |
| 7 | 2 | lucas |
|--------|---------------------|------------------|
I have a function in my PHP code that returns two random users from a specific book id (either 1 or 2). Query one returns the result in column 1 and result two returns the results in column 2 like:
|---------------------|------------------|
| 1 | 2 |
|---------------------|------------------|
| jim | max |
| joyce | dustin |
|---------------------|------------------|
I have achieved this by running two separate queries as seen below. I want to know if it's possible to achieve this functionality with one query and how.
$random_users_with_book_id_1 = SELECT name FROM users WHERE book_id=1 LIMIT 2
$random_users_with_book_id_2 = SELECT name FROM users WHERE book_id=2 LIMIT 2
Again, I apologise if it's too specific. The query below is the closest I have come to what I am trying to achieve:
SELECT a.name AS book_id_1, b.name AS book_id_2
FROM users a, users b
WHERE a.book_id=1 AND b.book_id = 2
LIMIT 2
EDIT: I have created a fiddle to play around with this. I appreciate any help! Thank you!! http://sqlfiddle.com/#!9/7fcbca/1
It is easy actually :)
You can use UNION like this:
SELECT * FROM (
(SELECT * FROM users WHERE book_id=1 LIMIT 2)
UNION
(SELECT * FROM users WHERE book_id=2 LIMIT 2))
collection;
According to the MySQL documentation on UNION, you can wrap the individual queries in parentheses and put the UNION between them. Without the parentheses, the LIMIT 2 would apply to the whole result and only the first two rows would be shown. From the documentation: "To apply ORDER BY or LIMIT to an individual SELECT, place the clause inside the parentheses that enclose the SELECT:"
If you want to combine the queries in MySQL, you can just use parentheses:
(SELECT name
FROM users
WHERE book_id = 1
LIMIT 2
) UNION ALL
(SELECT name
FROM users
WHERE book_id = 2
LIMIT 2
);
First, only use UNION if you specifically want to incur the overhead of removing duplicates. Otherwise, use UNION ALL.
Second, this does not return random rows. This returns arbitrary rows. In many cases, this might be two rows near the beginning of the data. If you want random rows, then use ORDER BY rand():
(SELECT name
FROM users
WHERE book_id = 1
ORDER BY rand()
LIMIT 2
) UNION ALL
(SELECT name
FROM users
WHERE book_id = 2
ORDER BY rand()
LIMIT 2
);
There are other methods that are more efficient, but this should be fine for up to a few thousand rows.
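As a rough way to experiment with the per-group random pick, the sketch below runs the same idea against an in-memory SQLite database, which is an assumption standing in for MySQL. SQLite does not accept parenthesised SELECTs around UNION ALL and spells the function random() rather than rand(), so each limited query is wrapped in a derived table instead.

```python
import sqlite3

# In-memory SQLite database with the users table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, book_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (1, 1, "jim"), (2, 1, "joyce"), (3, 1, "mike"), (4, 1, "eleven"),
    (5, 2, "max"), (6, 2, "dustin"), (7, 2, "lucas"),
])

# Each derived table applies its own ORDER BY random() LIMIT 2,
# then UNION ALL concatenates the two results without de-duplication.
query = """
SELECT book_id, name FROM (
    SELECT book_id, name FROM users WHERE book_id = 1 ORDER BY random() LIMIT 2
)
UNION ALL
SELECT book_id, name FROM (
    SELECT book_id, name FROM users WHERE book_id = 2 ORDER BY random() LIMIT 2
)
"""
rows = conn.execute(query).fetchall()

# Group the picked names by book_id for display.
picked = {}
for book_id, name in rows:
    picked.setdefault(book_id, []).append(name)

print(picked)  # two random names per book_id; the names vary between runs
```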
I want to show first two top voted Posts then others sorted by id
This is table
+----+-------+--------------+--------+
| Id | Name | Post | Votes |
+====+=======+==============+========+
| 1 | John | John's msg | -6 |
| 2 |Joseph |Joseph's msg | 8 |
| 3 | Ivan | Ivan's msg | 3 |
| 4 |Natalie|Natalie's msg | 10 |
+----+-------+--------------+--------+
After query result should be:
+----+-------+--------------+--------+
| Id | Name | Post | Votes |
+====+=======+==============+========+
| 4 |Natalie|Natalie's msg | 10 |
| 2 |Joseph |Joseph's msg | 8 |
-----------------------------------------------
| 1 | John | John's msg | -6 |
| 3 | Ivan | Ivan's msg | 3 |
+----+-------+--------------+--------+
I have one solution, but I feel like there is a better and faster way to do it.
I run two queries, one to get the top 2 and a second to get the others:
SELECT * FROM table order by Votes desc LIMIT 2
SELECT * FROM table order by Id desc
And then in PHP I make sure that I show the first query's result as it is, and when displaying the second query's result I remove the entries that are already in the first, so they don't appear twice.
Can this be done in single query to select first two top voted, then others?
You would have to use subqueries or a UNION, meaning a single outer query that contains several queries inside it. I would simply retrieve the Ids from the first query and add an Id NOT IN (...) condition to the WHERE clause of the second query, thus filtering out the posts already retrieved by the first query:
SELECT * FROM table WHERE Id NOT IN (...) ORDER BY Id DESC
With a UNION the query would look as follows. MySQL does not allow LIMIT directly inside an IN subquery, so the inner query is wrapped in a derived table:
(SELECT table.*, 1 as o FROM table order by Votes desc LIMIT 2)
UNION
(SELECT table.*, 0 FROM table
WHERE Id NOT IN (SELECT Id FROM (SELECT Id FROM table order by Votes desc LIMIT 2) AS top2))
ORDER BY o DESC, if(o=1,Votes,Id) DESC
As you can see, it wraps several queries into one and needs a more complicated ordering as well, because the order of the rows returned by a UNION is not guaranteed.
Two simple queries seem to be a lot more efficient to me in this particular case.
There could be different ways to write a query that returns the rows in the order you want. My solution is this:
select
table.*
from
table left join (select id from table order by votes desc limit 2) l
on table.id = l.id
order by
case when l.id is not null then votes end desc,
table.id
The subquery returns the ids of the two rows with the most votes. The join succeeds whenever the row is one of those two; otherwise l.id is NULL.
The ORDER BY sorts by votes descending for the top two rows (l.id is not null). For the remaining rows the CASE expression yields NULL, which sorts last under DESC, so they go to the bottom and are ordered by id instead.
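The ordering described above can be checked with a small sketch. It assumes an in-memory SQLite database in place of MySQL and a hypothetical table name posts (the question only calls it table). Like MySQL, SQLite sorts NULLs last under DESC, which is what the CASE expression relies on.

```python
import sqlite3

# In-memory SQLite database; "posts" is a hypothetical name for the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, name TEXT, post TEXT, votes INTEGER)")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    (1, "John", "John's msg", -6),
    (2, "Joseph", "Joseph's msg", 8),
    (3, "Ivan", "Ivan's msg", 3),
    (4, "Natalie", "Natalie's msg", 10),
])

# l.id is non-NULL only for the two most-voted posts, so the CASE expression
# yields their votes (sorted first under DESC) and NULL for everything else.
query = """
SELECT p.id, p.name, p.votes
FROM posts AS p
LEFT JOIN (SELECT id FROM posts ORDER BY votes DESC LIMIT 2) AS l
       ON p.id = l.id
ORDER BY
    CASE WHEN l.id IS NOT NULL THEN p.votes END DESC,
    p.id
"""
ordered_ids = [row[0] for row in conn.execute(query)]

print(ordered_ids)  # [4, 2, 1, 3]
```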
I have a table tbl with three columns:
id | fk | dateof
1 | 1 | 2016-01-01
2 | 1 | 2016-01-02
3 | 2 | 2016-02-01
4 | 2 | 2016-03-01
5 | 3 | 2016-04-01
I want to get the results like this
Id count of Id max(dateof)
2 | 2 | 2016-01-02
4 | 2 | 2016-03-01
5 | 1 | 2016-04-01
My try
SELECT id,tbl.dateof dateof
FROM tbl
INNER JOIN
(SELECT fk, MAX(dateof) dateof ,
count(id) cnt_of_id -- How to get this count value in the result
FROM tbl
GROUP BY fk) temp
ON tbl.fk = temp.fk AND tbl.dateof = temp.dateof
This is an aggregation query, but you don't seem to want the column you are grouping by. That is OK (although you cannot distinguish the fk that defines each row):
select count(*) as CountOfId, max(dateof) as maxdateof
from tbl
group by fk;
In other words, your subquery is pretty much all you need.
If you have a reasonable amount of data, you can use a MySQL trick:
select substring_index(group_concat(id order by dateof desc), ',', 1) as id,
count(*) as CountOfId, max(dateof) as maxdateof
from tbl
group by fk;
Note: this is limited by the maximum intermediate size for group_concat(). This parameter can be changed and it is typically large enough for this type of query on a moderately sized table.
You obviously want one result row per fk, so group by it. Then you want the max ID, the row count and the max date for each fk:
select
max(id) as max_id,
count(*) as cnt,
max(dateof) as max_date_of
from tbl
group by fk;
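If the id should come from the row that actually holds the latest date, rather than simply being the largest id, the join from the question can be kept and the count selected from the derived table. The sketch below assumes an in-memory SQLite database in place of MySQL; the alias max_dateof is introduced here for readability.

```python
import sqlite3

# In-memory SQLite database with the tbl data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, fk INTEGER, dateof TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
    (1, 1, "2016-01-01"),
    (2, 1, "2016-01-02"),
    (3, 2, "2016-02-01"),
    (4, 2, "2016-03-01"),
    (5, 3, "2016-04-01"),
])

# The derived table computes the per-fk count and latest date; joining back on
# (fk, dateof) picks the id of the row that carries that latest date.
query = """
SELECT tbl.id, temp.cnt_of_id, temp.max_dateof
FROM tbl
INNER JOIN (
    SELECT fk, MAX(dateof) AS max_dateof, COUNT(id) AS cnt_of_id
    FROM tbl
    GROUP BY fk
) AS temp
   ON tbl.fk = temp.fk AND tbl.dateof = temp.max_dateof
ORDER BY tbl.id
"""
latest_rows = conn.execute(query).fetchall()

print(latest_rows)
# [(2, 2, '2016-01-02'), (4, 2, '2016-03-01'), (5, 1, '2016-04-01')]
```

If two rows of the same fk share the latest date, this join returns both of them.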
I have a table like this:
Table: p
+----------------+
| id | w_id |
+---------+------+
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 6 | 5 |
| 6 | 8 |
| 6 | 10 |
| 6 | 10 |
| 7 | 8 |
| 7 | 10 |
+----------------+
What is the best SQL to get the following result?
+-----------------------------+
| id | most_used_w_id |
+---------+-------------------+
| 5 | 8 |
| 6 | 10 |
| 7 | 8 |
+-----------------------------+
In other words, to get, per id, the most frequent related w_id.
Note that on the example above, id 7 is related to 8 once and to 10 once.
So, either (7, 8) or (7, 10) will do as result. If it is not possible to
pick up one, then both (7, 8) and (7, 10) on result set will be ok.
I have come up with something like:
select counters2.p_id as id, counters2.w_id as most_used_w_id
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters2
join (
select p_id, max(count_of_w_ids) as max_counter_for_w_ids
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters
group by p_id
) as p_max
on p_max.p_id = counters2.p_id
and p_max.max_counter_for_w_ids = counters2.count_of_w_ids
;
but I am not sure at all whether this is the best way to do it. And I had to repeat the same sub-query two times.
Any better solution?
Try using user-defined variables:
select id, w_id
FROM
( select T.*,
if(@id<>id,1,0) as row,
@id:=id FROM
(
select id, w_id, Count(*) as cnt FROM p Group by id, w_id
) as T,(SELECT @id:=0) as T1
ORDER BY id,cnt DESC
) as T2
WHERE Row=1
SQLFiddle demo
Formal SQL
In fact, your solution is correct in terms of standard SQL. Why? Because you have to join values from the original data back to the grouped data, so your query cannot be simplified. MySQL allows mixing non-grouped columns with aggregate functions, but the result is unreliable, so I would not recommend relying on that behaviour.
MySQL
Since you're using MySQL, you can use variables. I'm not a big fan of them, but for your case they may be used to simplify things:
SELECT
c.*,
IF(@id!=id, @i:=1, @i:=@i+1) AS num,
@id:=id AS gid
FROM
(SELECT id, w_id, COUNT(w_id) AS w_count
FROM p
GROUP BY id, w_id
ORDER BY id DESC, w_count DESC) AS c
CROSS JOIN (SELECT @i:=-1, @id:=-1) AS init
HAVING
num=1;
So for your data result will look like:
+------+------+---------+------+------+
| id | w_id | w_count | num | gid |
+------+------+---------+------+------+
| 7 | 8 | 1 | 1 | 7 |
| 6 | 10 | 2 | 1 | 6 |
| 5 | 8 | 3 | 1 | 5 |
+------+------+---------+------+------+
Thus, you've found each id and its corresponding w_id. The idea is to count the rows and enumerate them, relying on the fact that they are ordered in the subquery. So we need only the first row per id (because it represents the data with the highest count).
This could be replaced with a single GROUP BY id but, again, the server is free to choose any row in that case (it will work because it takes the first row, but the documentation does not guarantee that in the general case).
One nice thing about this approach is that you can also select, for example, the 2nd or 3rd most frequent value; it's very flexible.
Performance
To improve performance, you can create an index on (id, w_id); it will be used for ordering and grouping the records. The variables and HAVING, however, cause a row-by-row scan of the set derived by the inner GROUP BY. That is not as bad as a full scan of the original data, but it is still a drawback of doing this with variables. On the other hand, doing it with a JOIN and subquery as in your query won't be much different, because a temporary table is created for the subquery result set too.
But to be certain, you'll have to test. And keep in mind that you already have a valid solution which, by the way, isn't bound to DBMS-specific features and is good in terms of standard SQL.
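One way to keep the join-based approach from the question while writing the counting subquery only once is a common table expression. This is a sketch of a swapped-in technique, not something the answers above use, and it assumes a server that supports WITH (MySQL 8.0 or later); it is run here against an in-memory SQLite database as a stand-in.

```python
import sqlite3

# In-memory SQLite database with table p from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (id INTEGER, w_id INTEGER)")
conn.executemany("INSERT INTO p VALUES (?, ?)", [
    (5, 8), (5, 10), (5, 8), (5, 10), (5, 8),
    (6, 5), (6, 8), (6, 10), (6, 10),
    (7, 8), (7, 10),
])

# "counters" is defined once and referenced twice: once for the per-(id, w_id)
# counts and once to find the maximum count per id.
query = """
WITH counters AS (
    SELECT id, w_id, COUNT(*) AS cnt
    FROM p
    GROUP BY id, w_id
)
SELECT c.id, c.w_id AS most_used_w_id
FROM counters AS c
JOIN (
    SELECT id, MAX(cnt) AS max_cnt
    FROM counters
    GROUP BY id
) AS m
  ON m.id = c.id AND m.max_cnt = c.cnt
ORDER BY c.id, c.w_id
"""
most_used = conn.execute(query).fetchall()

print(most_used)  # [(5, 8), (6, 10), (7, 8), (7, 10)]
```

Ties are kept, so id 7 appears with both 8 and 10, which the question says is acceptable.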
Try this query
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having max(ccc)
Here is the SQLFiddle link
You can also use this code if you do not want to rely on the first record of non-grouping columns
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having ccc=max(ccc);
Had a good read through similar topics but I can't quite a) find one to match my scenario, or b) understand others enough to fit / tailor / tweak to my situation.
I have a table, the important fields being;
+------+------+--------+--------+
| ID | Name | Price |Status |
+------+------+--------+--------+
| 1 | Fred | 4.50 | |
| 2 | Fred | 4.50 | |
| 3 | Fred | 5.00 | |
| 4 | John | 7.20 | |
| 5 | John | 7.20 | |
| 6 | John | 7.20 | |
| 7 | Max | 2.38 | |
| 8 | Max | 2.38 | |
| 9 | Sam | 21.00 | |
+------+------+--------+--------+
ID is an auto-incrementing value as records get added throughout the day.
NAME is a Primary Key field, which can repeat 1 to 3 times in the whole table.
Each NAME will have a PRICE value, which may or may not be the same per NAME.
There is also a STATUS field that needs to be populated based on the following, which is actually the part I am stuck on.
Status = 'Y' if each DISTINCT name has only one price attached to it.
Status = 'N' if each DISTINCT name has multiple prices attached to it.
Using the table above, ID's 1, 2 and 3 should be 'N', whilst 4, 5, 6, 7, 8 and 9 should be 'Y'.
I think this may well involve some form of combination of JOINs, GROUPs, and DISTINCTs but I am at a loss on how to put that into the right order for SQL.
In order to get the count of distinct Price values per name, we must use a GROUP BY on the Name field, but since you also want to display all names ungrouped but with an additional Status field, we must first create a subselect in the FROM clause which groups by the name and determines whether the name has multiple price values or not.
When we GROUP BY Name in the subselect, COUNT(DISTINCT price) will count the number of distinct price values for each particular name. Without the DISTINCT keyword, it would simply count the number of rows where price is not null.
In conjunction with that, we use a CASE expression to insert N into the Status column if there is more than one distinct Price value for the particular name, otherwise, it will insert Y.
The subselect only returns one row per Name, so to get all names ungrouped, we join that subselect to the main table on the condition that the subselect's Name = the main table's Name:
SELECT
b.ID,
b.Name,
b.Price,
a.Status
FROM
(
SELECT Name, CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
FROM tbl
GROUP BY Name
) a
INNER JOIN
tbl b ON a.Name = b.Name
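The SELECT above can be tried against the sample data with a minimal sketch like the following, assuming an in-memory SQLite database in place of MySQL. The ORDER BY is added here only to make the output stable.

```python
import sqlite3

# In-memory SQLite database with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Name TEXT, Price REAL, Status TEXT)")
conn.executemany("INSERT INTO tbl (ID, Name, Price) VALUES (?, ?, ?)", [
    (1, "Fred", 4.50), (2, "Fred", 4.50), (3, "Fred", 5.00),
    (4, "John", 7.20), (5, "John", 7.20), (6, "John", 7.20),
    (7, "Max", 2.38), (8, "Max", 2.38),
    (9, "Sam", 21.00),
])

# The derived table yields one Status per Name; joining it back to tbl
# attaches that Status to every original row.
query = """
SELECT b.ID, b.Name, b.Price, a.Status
FROM (
    SELECT Name,
           CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
    FROM tbl
    GROUP BY Name
) AS a
INNER JOIN tbl AS b ON a.Name = b.Name
ORDER BY b.ID
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)  # Fred's three rows get 'N', all other rows get 'Y'
```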
Edit: In order to facilitate an update, you can incorporate this query using JOINs in the UPDATE like so:
UPDATE
tbl a
INNER JOIN
(
SELECT Name, CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
FROM tbl
GROUP BY Name
) b ON a.Name = b.Name
SET
a.Status = b.Status
This assumes you have an empty Status column in your table.
If you want to update the status column, you could do:
UPDATE mytable s
SET status = (
SELECT IF(COUNT(DISTINCT price)=1, 'Y', 'N') c
FROM (
SELECT *
FROM mytable
) s1
WHERE s1.name = s.name
GROUP BY name
);
Technically, it should not be necessary to have this:
FROM (
SELECT *
FROM mytable
) s1
but there is a MySQL limitation that prevents you from selecting from the table you're updating. By wrapping the table in a derived table, we force MySQL to create a temporary table, and then it suddenly becomes possible.
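The effect of the correlated-subquery UPDATE can be illustrated with the sketch below. It assumes an in-memory SQLite database rather than MySQL; SQLite does not have the restriction described above, so the extra derived table is left out here and the subquery reads mytable directly.

```python
import sqlite3

# In-memory SQLite database with the sample data; status starts out NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, price REAL, status TEXT)")
conn.executemany("INSERT INTO mytable (id, name, price) VALUES (?, ?, ?)", [
    (1, "Fred", 4.50), (2, "Fred", 4.50), (3, "Fred", 5.00),
    (4, "John", 7.20), (5, "John", 7.20), (6, "John", 7.20),
    (7, "Max", 2.38), (8, "Max", 2.38),
    (9, "Sam", 21.00),
])

# For every row, count the distinct prices recorded for that row's name.
conn.execute("""
UPDATE mytable
SET status = (
    SELECT CASE WHEN COUNT(DISTINCT s1.price) = 1 THEN 'Y' ELSE 'N' END
    FROM mytable AS s1
    WHERE s1.name = mytable.name
)
""")

# Map each id to its new status.
statuses = dict(conn.execute("SELECT id, status FROM mytable ORDER BY id"))

print(statuses)
# {1: 'N', 2: 'N', 3: 'N', 4: 'Y', 5: 'Y', 6: 'Y', 7: 'Y', 8: 'Y', 9: 'Y'}
```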