How to order group by and get first row? - mysql

I have this php mysql statement
SELECT a.*, p.filename, m.`first name`, m.`last name`, m.`mobile number`, m.`status`, m.`email address`
FROM map a
join members m on a.members_id = m.id
join pictures p on m.pictures_id = p.id
WHERE a.active = 1
GROUP BY a.members_id
order by a.`date added` DESC
limit 1;
However it's not working. The map table has records, and many of them can have the same members_id value. I want to group them by the members_id, then order them by date added, so the most recent is on top of each group, then only get the top row (i.e. get most recent of each group).
Does anyone know what's wrong here?
Thanks

Try:
select * from
(SELECT a.*,
p.filename,
m.`first name`, m.`last name`, m.`mobile number`, m.`status`, m.`email address`
FROM map a
join members m on a.members_id = m.id
join pictures p on m.pictures_id = p.id
WHERE a.active = 1
order by a.members_id, a.`date added` DESC) sq
GROUP BY members_id;
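Note that MySQL returning the first row of each group is undocumented behaviour and may change in future releases, so although this query should work with current versions of MySQL, it is not guaranteed to keep working. It also selects non-aggregated columns that are not in the GROUP BY clause, which MySQL rejects when the ONLY_FULL_GROUP_BY SQL mode is enabled.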

If you want one result per member, you have to select it in two steps, using a subquery. The inner query finds the newest map date per member and the outer query fetches the rest of the data. Make sure the join columns are indexed, otherwise it will be very slow.
I think it will be something like:
SELECT a.*, p.filename, m.`first name`, m.`last name`, m.`mobile number`, m.`status`, m.`email address`
FROM map a
inner join members m on a.members_id = m.id
inner join pictures p on m.pictures_id = p.id
inner join (
select ia.members_id, max(ia.`date added`) as maxdate from map ia where ia.active = 1 group by ia.members_id
) as sub_a on sub_a.members_id = a.members_id and sub_a.maxdate = a.`date added`
WHERE a.active = 1
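This assumes each member has exactly one row with the maximal date added. If two rows tie on that date, both are returned, and you will need some more tricks (such as a unique tie-breaker column).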
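A minimal sketch of the join-on-max-date idea, run against SQLite (via Python's standard `sqlite3` module) as a stand-in for MySQL. The table contents are invented for illustration, and the `members` and `pictures` joins are left out to keep it short. Member 20 has two rows tied on the newest date, so it also shows the tie caveat.

```python
import sqlite3

# Hypothetical miniature of the question's `map` table (SQLite stand-in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE map (id INTEGER PRIMARY KEY, members_id INTEGER,
                      "date added" TEXT, active INTEGER);
    INSERT INTO map VALUES
        (1, 10, '2013-01-01', 1),
        (2, 10, '2013-03-01', 1),
        (3, 20, '2013-02-01', 1),
        (4, 20, '2013-02-01', 1),  -- tie on the newest date for member 20
        (5, 30, '2013-04-01', 0);  -- inactive, must not appear
""")

# Inner query: newest date per member. Outer query: the full rows for that date.
latest_rows = conn.execute("""
    SELECT a.id, a.members_id, a."date added"
    FROM map a
    JOIN (SELECT members_id, MAX("date added") AS maxdate
          FROM map
          WHERE active = 1
          GROUP BY members_id) sub_a
      ON sub_a.members_id = a.members_id
     AND sub_a.maxdate = a."date added"
    WHERE a.active = 1
    ORDER BY a.members_id, a.id
""").fetchall()

for row in latest_rows:
    print(row)
```

Member 10 yields only its newest row, while member 20 yields both tied rows. That is why a tie-breaker is needed when dates are not unique.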

One approach is to use a where clause to filter out the records you are not interested in:
SELECT a.*, p.filename, m.`first name`, m.`last name`, m.`mobile number`, m.`status`, m.`email address`
FROM map a
join members m on a.members_id = m.id
join pictures p on m.pictures_id = p.id
WHERE a.active = 1 and
a.`date added` = (select max(map.`date added`)
from map
where map.members_id = a.members_id and
map.active = 1
)
GROUP BY a.members_id
order by a.`date added` DESC;

You cannot do this with a single-level query. Replace the from map part with a subselect:
select max(map_id) as map_id, members_id, max(`date added`) as `date added` from map where active = 1 group by members_id
This gives you every member with its latest date. I have assumed that a map_id column exists and is auto_increment. Use this in place of the original map table, and you will not need the group by, order by and limit parts at all.

Related

How to properly join these three tables in SQL?

I'm currently creating a small application where users can post a text which can be commented and the post can also be voted (+1 or -1).
This is my database:
Now I want to select all information of all posts with status = 1 plus two extra columns: One column containing the count of comments and one column containing the sum (I call it score) of all votes.
I currently use the following query, which correctly adds the count of the comments:
SELECT *, COUNT(comments.fk_commented_post) as comments
FROM posts
LEFT JOIN comments
ON posts.id_post = comments.fk_commented_post
AND comments.status = 1
WHERE posts.status = 1
GROUP BY posts.id_post
Then I tried to additionally add the sum of the votes, using the following query:
SELECT *, COUNT(comments.fk_commented_post) as comments, SUM(votes_posts.type) as score
FROM posts
LEFT JOIN comments
ON posts.id_post = comments.fk_commented_post
AND comments.status = 1
LEFT JOIN votes_posts
ON posts.id_post = votes_posts.fk_voted_post
WHERE posts.status = 1
GROUP BY posts.id_post
The result is no longer correct for either the votes or the comments. Somehow some of the values seem to be getting multiplied...
This is probably simpler using correlated subqueries:
select p.*,
(select count(*)
from comments c
where c.fk_commented_post = p.id_post and c.status = 1
) as num_comments,
(select sum(vp.type)
from votes_posts vp
where vp.fk_voted_post = p.id_post
) as num_score
from posts p
where p.status = 1;
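The problem with the joins is that the counts get inflated because the two other tables are not related to each other, so you get a Cartesian product of comments and votes for each post.
You want to join comment counts and vote counts to the posts. So, aggregate first to get the counts, then join. (The votes subquery below counts votes; use sum(type) instead of count(*) to get the score the question asks for.)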
select
p.*,
coalesce(c.cnt, 0) as comments,
coalesce(v.cnt, 0) as votes
from posts p
left join
(
select fk_commented_post as id_post, count(*) as cnt
from comments
where status = 1
group by fk_commented_post
) c on c.id_post = p.id_post
left join
(
select fk_voted_post as id_post, count(*) as cnt
from votes_posts
group by fk_voted_post
) v on v.id_post = p.id_post
where p.status = 1
order by p.id_post;
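To make the row multiplication concrete, here is a small sketch using SQLite through Python's `sqlite3` module as a stand-in for MySQL. The sample rows are made up: one post with two comments and three votes (+1, +1, -1). The second query uses SUM(type) for the score, which is an adaptation of the pre-aggregation approach rather than a copy of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id_post INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE comments (fk_commented_post INTEGER, status INTEGER);
    CREATE TABLE votes_posts (fk_voted_post INTEGER, type INTEGER);
    INSERT INTO posts VALUES (1, 1);
    INSERT INTO comments VALUES (1, 1), (1, 1);              -- 2 comments
    INSERT INTO votes_posts VALUES (1, 1), (1, 1), (1, -1);  -- score = 1
""")

# Joining both child tables at once yields 2 x 3 = 6 rows for the post.
naive = conn.execute("""
    SELECT COUNT(c.fk_commented_post), SUM(v.type)
    FROM posts p
    LEFT JOIN comments c ON c.fk_commented_post = p.id_post AND c.status = 1
    LEFT JOIN votes_posts v ON v.fk_voted_post = p.id_post
    WHERE p.status = 1
    GROUP BY p.id_post
""").fetchone()

# Aggregating each child table first keeps one row per post on each side.
fixed = conn.execute("""
    SELECT COALESCE(c.cnt, 0), COALESCE(v.score, 0)
    FROM posts p
    LEFT JOIN (SELECT fk_commented_post AS id_post, COUNT(*) AS cnt
               FROM comments WHERE status = 1
               GROUP BY fk_commented_post) c ON c.id_post = p.id_post
    LEFT JOIN (SELECT fk_voted_post AS id_post, SUM(type) AS score
               FROM votes_posts
               GROUP BY fk_voted_post) v ON v.id_post = p.id_post
    WHERE p.status = 1
""").fetchone()

print("naive:", naive)  # comment count tripled, score doubled
print("fixed:", fixed)
```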

MySQL - Ordering IN within INNER JOIN

I'm creating a system that lets a user search a database of photo album images for a keyword. It works well. I rank relevancy by the number of times the keyword appears in an album, using:
SELECT collections_ids.collection_id
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collection_id
ORDER BY COUNT(*) DESC
As said, this works great.
The only issue is that when this is included in a "WHERE IN" query, it loses its order and the rows come back in arbitrary order. For clarity, here is the query:
SELECT collections.id,
collections.title,
images.img_small
FROM `collections`
INNER JOIN images ON images.id = collections.cover_photo
WHERE collections.`id` IN
(SELECT collections_ids.collection_id
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collection_id
ORDER BY COUNT(*) DESC)
I've tried researching, and people have suggested using the FIELD function, but I don't see that working in this context.
Any suggestions?
You can join the subquery instead of using IN, and order by its count(*), as below:
SELECT collections.id,
collections.title,
images.img_small
FROM `collections`
INNER JOIN images ON images.id = collections.cover_photo
INNER JOIN
(SELECT collections_ids.collection_id AS collection_id, COUNT(*) AS total
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collections_ids.collection_id
) AS A
ON A.collection_id = collections.id
ORDER BY A.total DESC
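A short sketch of why the join keeps the ranking: the count becomes a column of the joined derived table, so the outer query can sort on it. It runs on SQLite via Python's `sqlite3` as a stand-in for MySQL. The sample data is invented, and the `images` join is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collections (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE keywords (id INTEGER, keyword TEXT);  -- id is the photo id
    CREATE TABLE collections_ids (photo_id INTEGER, collection_id INTEGER);
    INSERT INTO collections VALUES (1, 'Parks'), (2, 'Forests'), (3, 'Cities');
    INSERT INTO keywords VALUES (101, 'trees'), (102, 'trees'), (103, 'trees'),
                                (104, 'cars');
    INSERT INTO collections_ids VALUES (101, 1), (102, 2), (103, 2), (104, 3);
""")

# The derived table exposes the per-collection match count as `total`,
# which the outer ORDER BY can then use.
ranked = conn.execute("""
    SELECT c.id, c.title, a.total
    FROM collections c
    JOIN (SELECT ci.collection_id, COUNT(*) AS total
          FROM keywords k
          JOIN collections_ids ci ON k.id = ci.photo_id
          WHERE k.keyword = 'trees'
          GROUP BY ci.collection_id) a ON a.collection_id = c.id
    ORDER BY a.total DESC
""").fetchall()

for row in ranked:
    print(row)
```

An ORDER BY inside an IN subquery only affects which values are in the set, not the order of the outer rows, which is why the original query lost its ranking.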

Getting max record on varchar field

I have this query
SELECT
s.account_number,
a.id AS 'ASPIRION ID',
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS 'STATUS',
astat.definition,
latest_note.content AS 'LAST NOTE',
a.insurance_company
FROM
accounts a
INNER JOIN
services s ON a.id = s.account_id
INNER JOIN
facilities f ON f.id = a.facility_id
INNER JOIN
account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(SELECT
account_id, MAX(content) content, MAX(created)
FROM
notes
GROUP BY account_id) latest_note ON latest_note.account_id = a.id
WHERE
a.facility_id = 56
My problem comes from
(SELECT
account_id, MAX(content) content, MAX(created)
FROM
notes
GROUP BY account_id)
Content is a varchar field and I need to get the most recent record. I now understand that MAX will not work on a varchar field the way I want. I am not sure how to get the content that belongs to the row with the MAX id, grouped by account id, in this join.
What would be the best way to do this?
My notes table looks like this...
id  account_id  content         created
1   1           This is a test  2011-03-16 02:06:40
2   1           More test       2012-03-16 02:06:40
Here are two choices. If your content is not very long (GROUP_CONCAT output is truncated at group_concat_max_len) and doesn't contain funky characters such as the '|' separator, you can use the substring_index()/group_concat() trick:
(SELECT account_id,
SUBSTRING_INDEX(GROUP_CONCAT(content ORDER BY created desc SEPARATOR '|'
), '|', 1) as content
FROM notes
GROUP BY account_id
) latest_note
ON latest_note.account_id = a.id
Given the names of the columns and tables, that is unlikely to work. In that case you need an additional join or a correlated subquery in the select clause. I think the subquery is easiest in this case:
select . . .,
(select n.content
from notes n
where n.account_id = a.id
order by created desc
limit 1
) as latest_note
from . . .
The advantage to this method is that it only gets the notes for the rows you need. And, you don't need a left join to keep all the rows. For performance, you want an index on notes(account_id, created).
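A runnable sketch of the correlated-subquery approach, using SQLite through Python's `sqlite3` as a stand-in for MySQL. It uses the two sample notes from the question plus a hypothetical second account with no notes, and a trimmed-down `accounts` table. The index name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY);
    CREATE TABLE notes (id INTEGER PRIMARY KEY, account_id INTEGER,
                        content TEXT, created TEXT);
    -- Composite index so the newest note per account can be found quickly.
    CREATE INDEX idx_notes_account_created ON notes (account_id, created);
    INSERT INTO accounts VALUES (1), (2);
    INSERT INTO notes VALUES
        (1, 1, 'This is a test', '2011-03-16 02:06:40'),
        (2, 1, 'More test',      '2012-03-16 02:06:40');
""")

# For each account, pick the content of its most recently created note.
rows = conn.execute("""
    SELECT a.id,
           (SELECT n.content
            FROM notes n
            WHERE n.account_id = a.id
            ORDER BY n.created DESC
            LIMIT 1) AS latest_note
    FROM accounts a
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Account 2 has no notes and still appears, with a NULL (None) latest_note, which is the "no left join needed" behaviour described above.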
SELECT
s.account_number,
a.id AS 'ASPIRION ID',
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS 'STATUS',
astat.definition,
latest_note.content AS 'LAST NOTE',
a.insurance_company
FROM
accounts a
INNER JOIN services s ON a.id = s.account_id
INNER JOIN facilities f ON f.id = a.facility_id
INNER JOIN account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(SELECT account_id, MAX(created) mxcreated
FROM notes GROUP BY account_id) mx ON mx.account_id = a.id
INNER JOIN notes latest_note ON latest_note.account_id = mx.account_id
AND latest_note.created = mx.mxcreated
WHERE a.facility_id = 56
You have to join on the max(created) which would give the latest content.
Or you can change the query to
SELECT account_id, content, MAX(created) mxcreated
FROM notes GROUP BY account_id
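MySQL allows this (when the ONLY_FULL_GROUP_BY SQL mode is disabled) even though content is neither aggregated nor listed in the group by clause, but the content value then comes from an arbitrary row in each group. So unless you join on the max date, you won't get the correct results.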
The last created record is the one for which no newer record exists. Hence:
SELECT
s.account_number,
a.id AS "ASPIRION ID",
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS "STATUS",
astat.definition,
latest_note.content AS "LAST NOTE",
a.insurance_company
FROM accounts a
INNER JOIN services s ON a.id = s.account_id
INNER JOIN facilities f ON f.id = a.facility_id
INNER JOIN account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(
SELECT account_id, content
FROM notes
WHERE NOT EXISTS
(
SELECT *
FROM notes newer
WHERE newer.account_id = notes.account_id
AND newer.created > notes.created
)
) latest_note ON latest_note.account_id = a.id
WHERE a.facility_id = 56;

Mysql SUM Float give wrong value [duplicate]

I'm looking for help using sum() in my SQL query:
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
I use DISTINCT because I'm doing "group by" and this ensures the same row is not counted more than once.
The problem is that SUM(conversions.value) counts the "value" for each row more than once (due to the group by)
I basically want to do SUM(conversions.value) for each DISTINCT conversions.id.
Is that possible?
I may be wrong but from what I understand
conversions.id is the primary key of your table conversions
stats.id is the primary key of your table stats
Thus for each conversions.id you have at most one links.id impacted.
Your request is a bit like taking the Cartesian product of two sets:
[clicks]
SELECT *
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
[conversions]
SELECT *
FROM links
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
and for each link, you get sizeof([clicks]) x sizeof([conversions]) lines
As you noted the number of unique conversions in your request can be obtained via a
count(distinct conversions.id) = sizeof([conversions])
this distinct manages to remove all the [clicks] lines in the cartesian product
but clearly
sum(conversions.value) = sum([conversions].value) * sizeof([clicks])
In your case, since
count(*) = sizeof([clicks]) x sizeof([conversions])
count(*) = sizeof([clicks]) x count(distinct conversions.id)
you have
sizeof([clicks]) = count(*)/count(distinct conversions.id)
so I would test your request with
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
Keep me posted !
Jerome
Jerome's solution is actually wrong and can produce incorrect results!
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
let's assume the following table
conversions
id value
1 5
1 5
1 5
2 2
3 1
the correct sum of value for distinct ids would be 8.
Jerome's formula produces:
sum(conversions.value) = 18
count(distinct conversions.id) = 3
count(*) = 5
18*3/5 = 10.8 != 8
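The arithmetic can be checked with a few lines of Python. The rows are the (id, value) pairs from the example table above; nothing else is assumed.

```python
# (id, value) pairs from the example conversions table
rows = [(1, 5), (1, 5), (1, 5), (2, 2), (3, 1)]

total = sum(value for _, value in rows)             # sum(conversions.value) = 18
distinct_ids = len({row_id for row_id, _ in rows})  # count(distinct id) = 3
row_count = len(rows)                               # count(*) = 5

# The formula under test: sum * count(distinct id) / count(*)
formula_result = total * distinct_ids / row_count

# Reference answer: each distinct id contributes its value once.
correct_result = sum(dict(rows).values())

print(formula_result, correct_result)
```

The formula only works when every id is duplicated the same number of times. Here id 1 appears three times and the others once, so the two results differ.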
For an explanation of why you were seeing incorrect numbers, read this.
I think that Jerome has a handle on what is causing your error. Bryson's query would work, though having that subquery in the SELECT could be inefficient.
Use the following query:
SELECT links.id
, (
SELECT COUNT(*)
FROM stats
WHERE links.id = stats.parent_id
) AS clicks
, conversions.conversions
, conversions.conversion_value
FROM links
LEFT JOIN (
SELECT link_id
, COUNT(id) AS conversions
, SUM(conversions.value) AS conversion_value
FROM conversions
GROUP BY link_id
) AS conversions ON links.id = conversions.link_id
ORDER BY links.created DESC
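As a quick check of the query above, here is a sketch that runs it on SQLite (Python's `sqlite3`, standing in for MySQL) with made-up data: link 1 has three clicks and two conversions worth 7.5 in total, and link 2 has neither. COALESCE is added here, as an assumption about the desired output, so that links without conversions show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value REAL);
    INSERT INTO links VALUES (1, '2012-01-01'), (2, '2012-01-02');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);          -- 3 clicks on link 1
    INSERT INTO conversions VALUES (1, 1, 5.0), (2, 1, 2.5);  -- 7.5 total on link 1
""")

# Double join as in the question: every conversion row repeats once per click.
joined_sum = conn.execute("""
    SELECT SUM(c.value)
    FROM links l
    LEFT JOIN stats s ON l.id = s.parent_id
    LEFT JOIN conversions c ON l.id = c.link_id
    WHERE l.id = 1
""").fetchone()[0]

# Conversions are aggregated per link before joining, so nothing is repeated.
per_link = conn.execute("""
    SELECT l.id,
           (SELECT COUNT(*) FROM stats s WHERE s.parent_id = l.id) AS clicks,
           COALESCE(c.conversions, 0) AS conversions,
           COALESCE(c.conversion_value, 0) AS conversion_value
    FROM links l
    LEFT JOIN (SELECT link_id, COUNT(id) AS conversions,
                      SUM(value) AS conversion_value
               FROM conversions
               GROUP BY link_id) c ON l.id = c.link_id
    ORDER BY l.created DESC
""").fetchall()

print(joined_sum)
print(per_link)
```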
I use a subquery to do this. It eliminates the problems with grouping.
So the query would be something like:
SELECT COUNT(DISTINCT conversions.id)
...
(SELECT SUM(conversions.value) FROM ....) AS Vals
How about something like this:
select l.id, count(s.id) clicks, count(c.id) clicks, sum(c.value) conversion_value
from (SELECT l.id id, l.created created,
s.id clicks,
c.id conversions,
max(c.value) conversion_value
FROM links l
LEFT JOIN stats s ON l.id = s.parent_id
LEFT JOIN conversions c ON l.id = c.link_id
GROUP BY l.id, l.created, s.id, c.id) t
order by t.created
This will do the trick: just divide the sum by the number of duplicated rows for each conversion id.
SELECT a.id,
a.clicks,
SUM(a.conversion_value/a.conversions) AS conversion_value,
a.conversions
FROM (SELECT links.id,
COUNT(DISTINCT stats.id) AS clicks,
COUNT(conversions.id) AS conversions,
SUM(conversions.value) AS conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY conversions.id,links.id
ORDER BY links.created DESC) AS a
GROUP BY a.id
Select sum(x.value) as conversion_value,count(x.clicks),count(x.conversions)
FROM
(SELECT links.id, links.created,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
conversions.value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY conversions.id) x
GROUP BY x.id
ORDER BY x.created desc;
I believe this will give you the answer that you are looking for.

MySQL how to select 1 if MAX() equals current row SUM() aggregate value?

I'd like to select a new column named sliced (value can be 1/0 or true/false it doesn't matter) if area of the current row equals MAX(SUM(c.area)), that is flag the row with highest aggregate value:
SELECT p.name AS name, SUM(c.area) AS area
FROM City AS c
INNER JOIN Province AS p ON c.province_id = p.id
INNER JOIN Region AS r ON p.region_id = r.id
WHERE r.id = ?
GROUP BY p.id
ORDER BY p.name ASC
I've tried adding area = MAX(area) AS sliced, or even area = SUM(MAX(c.area)) AS sliced, to the select list, but I'm getting a syntax error. I have to admit I'm not very good at SQL. Thank you.
As I understand your question, this should do it. It creates a pseudo-column that returns 1 when the row's area equals max(area) over the whole result, without adding any conditions that restrict your selection. Note that window functions require MySQL 8.0 or later.
SELECT name
, area
, case area when max_area then 1 else 0 end as sliced
FROM ( SELECT name
, area
, max(area) over (partition by 1) as max_area
FROM ( SELECT p.name AS name
, SUM(c.area) AS area
FROM City AS c
INNER JOIN Province AS p ON c.province_id = p.id
INNER JOIN Region AS r ON p.region_id = r.id
WHERE r.id = ?
GROUP BY p.id
) t1
) t2
ORDER BY name ASC
EDIT: As @Glide says, you can't nest aggregate functions, so sum(max(area)) won't work; you need to perform these operations one query level at a time.
Here's a way to do it with just one group by:
set @row := 0;
select name, area, sliced
from (
select name, area, (@row := @row + 1) = 1 as sliced
from (
SELECT p.name, SUM(c.area) AS area
FROM City AS c
INNER JOIN Province AS p ON c.province_id = p.id
INNER JOIN Region AS r ON p.region_id = r.id
WHERE r.id = ?
GROUP BY 1
ORDER BY 2 desc) t1
) t2
order by 1;
The inner query (t1) does the group by and orders by total area largest first.
The next query (t2) gives the first row a value of true for column sliced, all other rows false.
The outer query orders the rows in the way you want - by name.
Since there's only one table scan and group by, this should be very efficient.
As comments have mentioned, you'd have to check all the values against another query. This is normal practice in SQL.
SELECT
p.name AS name,
SUM(c.area) AS area,
CASE WHEN SUM(c.area) = (SELECT MAX(area) FROM (<repeat your query here>) AS totals) THEN 1 ELSE 0 END AS sliced
FROM
City AS c
INNER JOIN
Province AS p
ON c.province_id = p.id
INNER JOIN
Region AS r
ON p.region_id = r.id
WHERE
r.id = ?
GROUP BY
p.id
ORDER BY
p.name ASC
The biggest downside to this is that you've had to repeat the code, which is just messy and a maintenance headache.
The alternative is to insert all the data into a temporary table, with the sliced field being 0 for all records. Then update that table, setting sliced to 1 for the record(s) with the highest area.
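One further way to avoid repeating the aggregate query is a common table expression. This technique is not used in the answers above, and on MySQL it needs version 8.0 or later. The sketch below runs on SQLite via Python's `sqlite3` as a stand-in, with invented provinces and areas. The CTE name `totals` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Region (id INTEGER PRIMARY KEY);
    CREATE TABLE Province (id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER);
    CREATE TABLE City (id INTEGER PRIMARY KEY, province_id INTEGER, area REAL);
    INSERT INTO Region VALUES (1);
    INSERT INTO Province VALUES (1, 'Alpha', 1), (2, 'Beta', 1);
    INSERT INTO City VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0);
""")

# The aggregate is written once in the CTE and referenced twice:
# once for the rows themselves and once for the overall maximum.
flagged = conn.execute("""
    WITH totals AS (
        SELECT p.name AS name, SUM(c.area) AS area
        FROM City AS c
        INNER JOIN Province AS p ON c.province_id = p.id
        INNER JOIN Region AS r ON p.region_id = r.id
        WHERE r.id = ?
        GROUP BY p.id
    )
    SELECT name, area,
           CASE WHEN area = (SELECT MAX(area) FROM totals) THEN 1 ELSE 0 END AS sliced
    FROM totals
    ORDER BY name ASC
""", (1,)).fetchall()

for row in flagged:
    print(row)
```

If several provinces tie for the largest area, all of them are flagged with sliced = 1.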