Mysql count same values and group by max - mysql

I am writing a report system so users can report any content by choosing from some specific reasons.
I have a reports table, and one of its columns, "reason", is an enum.
On the moderation page I'm trying to list the reported posts according to the reason value that has the highest count in the table.
The reports table:
Reports
"violation" appears twice; every other reason appears once. So the most frequent reason should be shown on top of the post.
The posts table:
Posts
What I tried:
SELECT * FROM reports r
INNER JOIN posts p
ON r.contentid=p.id
WHERE r.type=:type
AND (SELECT reason, count(*)
AS NUM FROM reports
GROUP BY reason)
GROUP BY p.id
No data shows up on the moderation page. Please help.

With GROUP BY, every selected column that is not in the GROUP BY clause should be inside an aggregate function.
But I think you only want an ORDER BY instead of a GROUP BY:
SELECT r.*,p.*,NUM
FROM reports r
INNER JOIN posts p
ON r.contentid=p.id
INNER JOIN (SELECT reason, count(*)
AS NUM FROM reports
GROUP BY reason) t1
ON t1.reason = r.reason
WHERE r.type=:type
ORDER BY NUM DESC
LIMIT 10
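To see what the join against the per-reason counts returns, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The table layout, the `title` column, the sample rows and the `'post'` type value are assumptions made up for illustration; only `contentid`, `type` and `reason` come from the question.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE reports (
        id INTEGER PRIMARY KEY,
        contentid INTEGER,
        type TEXT,
        reason TEXT
    );
    INSERT INTO posts (id, title) VALUES (1, 'first post'), (2, 'second post');
    INSERT INTO reports (contentid, type, reason) VALUES
        (1, 'post', 'spam'),
        (1, 'post', 'violation'),
        (2, 'post', 'violation'),
        (2, 'post', 'other');
""")

# Same shape as the answer's query: join each report to the count of its reason,
# then sort by that count. Extra sort keys make the order deterministic.
rows = conn.execute("""
    SELECT p.id, r.reason, t1.num
    FROM reports r
    INNER JOIN posts p ON r.contentid = p.id
    INNER JOIN (SELECT reason, COUNT(*) AS num
                FROM reports
                GROUP BY reason) t1 ON t1.reason = r.reason
    WHERE r.type = :type
    ORDER BY t1.num DESC, p.id, r.reason
    LIMIT 10
""", {"type": "post"}).fetchall()

for row in rows:
    print(row)
```

Note that the count subquery, as in the answer, counts reasons across all report types; add the same type filter inside it if the counts should be per type.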

Related

SQL Query for getting maximum value from a column Joining from Another Table

This is a slight variant of the question I asked here
SQL Query for getting maximum value from a column
I have a Person Table and an Activity Table with the following data
-- PERSON-----
------ACTIVITY------------
This data in the database records the time users spent on a particular activity.
For every user, I want to get the row where that user spent the maximum number of hours.
My Query is
Select p.Id as 'PersonId',
p.Name as 'Name',
act.HoursSpent as 'Hours Spent',
act.Date as 'Date'
From Person p
Left JOIN (Select MAX(HoursSpent), Date from Activity
Group By HoursSpent, Date) act
on act.personId = p.Id
but it returns all the rows for Person, not just the ones with the maximum number of hours spent.
This should be my result.
You have several issues with your query:
The subquery to get hours is aggregated by date, not person.
You don't have a way to bring in other columns from activity.
You can take this approach -- joins and group by, but it requires two joins:
select p.*, a.* -- the columns you want
from Person p left join
(activity a join
(select personid, max(HoursSpent) as max_hoursspent
from activity
group by personid
) ma
on ma.personId = a.personId and
ma.max_hoursspent = a.hoursspent -- inner join keeps only the maximum rows
)
on a.personId = p.id;
Note that this can return duplicates for a given person -- if there are ties for the maximum.
This is written more colloquially using row_number():
select p.*, a.* -- the columns you want
from Person p left join
(select a.*,
row_number() over (partition by a.personid order by a.hoursspent desc) as seqnum
from activity a
) a
on a.personId = p.id and a.seqnum = 1;
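The join-back-to-the-maximum idea, including the duplicate rows it produces on ties, can be sketched with Python's built-in sqlite3 module. The sample people and activity rows are invented for illustration, and the derived-table alias `best` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activity (personid INTEGER, hoursspent INTEGER, date TEXT);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO activity VALUES
        (1, 2, '2020-01-01'),
        (1, 5, '2020-01-02'),
        (2, 4, '2020-01-01'),
        (2, 4, '2020-01-03');
""")

# Inner join activity to the per-person maximum first, so only maximum rows
# survive; then left join that result to person so people without any
# activity still appear (with NULLs).
rows = conn.execute("""
    SELECT p.id, p.name, best.hoursspent, best.date
    FROM person p
    LEFT JOIN (
        SELECT a.personid, a.hoursspent, a.date
        FROM activity a
        JOIN (SELECT personid, MAX(hoursspent) AS max_hoursspent
              FROM activity
              GROUP BY personid) ma
          ON ma.personid = a.personid
         AND ma.max_hoursspent = a.hoursspent
    ) best ON best.personid = p.id
    ORDER BY p.id, best.date
""").fetchall()

for row in rows:
    print(row)
```

Bob has two rows because both of his activities tie for the maximum; Cid has no activity and comes back with NULL columns.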

MYSQL count listings for each user where listing published in past

I have a MYSQL query that I am having difficulties getting to do what I want.
I have a users table (userstbl) containing all my user records, and a listings table (listings) contains all listings posted by each user. I am trying to select the name and address of each user and provide a count of listings for each user which was listed between a certain date range, but only count adverts for unique category_id's which is working fine.
The issue is that I only want to count listings that have been published. I have another table, "listings_log", which has the same structure as my listings table and contains a record for every change made to every listing record. If one of the records in "listings_log" for the listing has "listings_log.published=1", then the listing was published. Each record in the "listings_log" table has a "listing_id" which is the same as in the "listings" table.
This is the query I have now :
SELECT
userstbl.userid,
userstbl.fullname,
userstbl.fulladdress,
COUNT(DISTINCT(
CASE WHEN listings.ad_type = 1
AND DATE(listings.date_listed) BETWEEN '2018-01-01' AND '2018-04-01'
THEN listings.category_id
END )
) AS Listings_Count_2018,
DATE_FORMAT(userstbl.reg_date, "%d/%m/%Y") AS RegisteredDate
FROM
userstbl
LEFT JOIN listings ON listings.userid = userstbl.userid
GROUP BY userstbl.userid
This counts the number of unique listings records between the correct dates for each user.
But I somehow only need to count listings records, where there is a corresponding listings_log record for that listing with published set to "1". The "listings_log" table and "listings" table both have a common listing_id column, but the listings_log table can have multiple records for each listing showing every change to each listing.
So I want to also join on the listings_log.listing_id = listings.listing_id and at least one of the "listings_log" records for that "listing_id" has listings_log.published = "1".
As you did not provide sample tables and a minimal reproducible example, a lot of this is guesswork. I am assuming for each user you want the total number of listing records. I built up the SQL with subqueries that are meant to be read "from the inside out."
select u.userid, u.fullname, u.fulladdress, sq.count from userstbl u join (
select u.userid, sum(c.count) as count from userstbl u join (
select count(*) as count, l.userid, l.listing_id from listings l join (
select distinct listing_id from listings_log where listings_log.published = "1"
) ll on l.listing_id = ll.listing_id
and l.ad_type = 1
and date(l.date_listed) between '2018-01-01' and '2018-04-01'
group by l.userid, l.listing_id
) c on u.userid = c.userid
group by u.userid
) sq on u.userid = sq.userid
;
See DB Fiddle
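An alternative to the nested subqueries, not taken from the answer above, is to keep the question's COUNT(DISTINCT category_id) and add an EXISTS test against listings_log to the join condition. The sketch below uses Python's built-in sqlite3 module rather than MySQL, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userstbl (userid INTEGER PRIMARY KEY, fullname TEXT);
    CREATE TABLE listings (listing_id INTEGER PRIMARY KEY, userid INTEGER,
                           category_id INTEGER, ad_type INTEGER, date_listed TEXT);
    CREATE TABLE listings_log (listing_id INTEGER, published INTEGER);
    INSERT INTO userstbl VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO listings VALUES
        (10, 1, 7, 1, '2018-02-01'),
        (11, 1, 7, 1, '2018-02-05'),
        (12, 1, 8, 1, '2018-03-01'),
        (13, 2, 9, 1, '2018-03-10');
    INSERT INTO listings_log VALUES
        (10, 0), (10, 1), (11, 1), (12, 0), (13, 0);
""")

# A listing only joins if at least one of its log rows has published = 1.
# The filters live in the ON clause so users with no qualifying listings
# are kept by the LEFT JOIN and get a count of 0.
rows = conn.execute("""
    SELECT u.userid, u.fullname,
           COUNT(DISTINCT l.category_id) AS listings_count_2018
    FROM userstbl u
    LEFT JOIN listings l
           ON l.userid = u.userid
          AND l.ad_type = 1
          AND DATE(l.date_listed) BETWEEN '2018-01-01' AND '2018-04-01'
          AND EXISTS (SELECT 1 FROM listings_log ll
                      WHERE ll.listing_id = l.listing_id
                        AND ll.published = 1)
    GROUP BY u.userid, u.fullname
    ORDER BY u.userid
""").fetchall()

for row in rows:
    print(row)
```

Ann has two published listings in category 7 and one unpublished listing in category 8, so her distinct-category count is 1; Bob's only listing was never published, so his count is 0.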

Wrong use of inner join function / group function?

I have the following problem with my query:
I have two tables:
Customer
Subscriber
linked together by customer.id=subscriber.customer_id
In the subscriber table, I have records with customer_id=0 (these are email records that do not have a full customer account).
Now I want to show how many customers I have per day, how many subscribers with a customer_id, and how many subscribers with customer_id=0 (I call them emailonlies).
Somehow, I cannot manage to get those emailonlies.
Perhaps it has something to do with not using the right join type.
When I use a left join, I get the right number of customers, but not the right number of emailonlies. When I use an inner join, I get the wrong number of customers. Am I using the group function correctly? I think it has something to do with that.
THIS IS MY QUERY:
SELECT DATE(c.date_register),
COUNT(DISTINCT c.id) AS newcustomers,
COUNT(DISTINCT s.customer_id) AS newsubscribedcustomers,
COUNT(DISTINCT s.subscriber_id AND s.customer_id=0) AS emailonlies
FROM customer c
LEFT JOIN subscriber s ON s.customer_id=c.id
GROUP BY DATE(c.date_register)
ORDER BY DATE(c.date_register) DESC
LIMIT 10
;
I'm not entirely sure, but I think in DISTINCT s.subscriber_id AND s.customer_id=0, it runs the AND before the DISTINCT, so the DISTINCT only ever sees true and false.
Why don't you just take
COUNT(DISTINCT s.subscriber_id) - (COUNT(DISTINCT s.customer_id) - 1)?
(The -1 is there because DISTINCT s.customer_id will count 0.)
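The point about the AND being evaluated before the DISTINCT can be checked with a small sketch. It uses Python's built-in sqlite3 module, where `=` also binds tighter than AND, and an invented subscriber table with three email-only rows. The CASE form in the second column is a common alternative and is not taken from the answers here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriber (subscriber_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO subscriber VALUES (1, 0), (2, 0), (3, 0), (4, 17), (5, 18);
""")

# First column: DISTINCT is applied to the boolean result of
# (subscriber_id AND customer_id = 0), so it can only see 0 and 1.
# Second column: CASE yields the subscriber_id for email-only rows and
# NULL otherwise, and COUNT ignores NULLs.
bool_count, case_count = conn.execute("""
    SELECT COUNT(DISTINCT subscriber_id AND customer_id = 0),
           COUNT(DISTINCT CASE WHEN customer_id = 0 THEN subscriber_id END)
    FROM subscriber
""").fetchone()

print(bool_count, case_count)
```

With this data the boolean form reports 2 (the distinct values 0 and 1), while the CASE form reports the 3 email-only subscribers.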
Got it. The only risk is that I get no emailonlies if there are no customers on that day, because of the left join. But this one works:
SELECT customers.regdatum,customers.customersqty,subscribers.emailonlies
FROM (
(SELECT DATE(c.date_register) AS regdatum,COUNT(DISTINCT c.id) AS customersqty
FROM customer c
GROUP BY DATE(c.date_register)
) AS customers
LEFT JOIN
(SELECT DATE(s.added) AS voegdatum,COUNT(DISTINCT s.subscriber_id) AS emailonlies
FROM subscriber s
WHERE s.customer_id=0
GROUP BY DATE(s.added)
) AS subscribers
ON customers.regdatum=subscribers.voegdatum
)
ORDER BY customers.regdatum DESC
;

MySQL is not using INDEX in subquery

I have these tables and queries as defined in sqlfiddle.
My first problem was to group people, showing the LEFT JOINed visits row with the newest year. I solved that using a subquery.
Now my problem is that the subquery is not using the INDEX defined on the visits table. That causes my query to run nearly indefinitely on tables with approx. 15000 rows each.
Here's the query. The goal is to list every person once with his newest (by year) record in the visits table.
Unfortunately, on large tables it gets really slow because it's not using the INDEX in the subquery.
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id
Does anyone know how to force MySQL to use INDEX already defined on visits table?
Your query:
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id;
First, it uses non-standard SQL syntax (items appear in the SELECT list that are not part of the GROUP BY clause, are not aggregate functions and do not depend on the grouping items). This can give indeterminate (semi-random) results.
Second, (to avoid the indeterminate results) you have added an ORDER BY inside a subquery. Non-standard or not, nothing in the MySQL documentation says this should work as expected. So it may be working now, but it may stop working in the not so distant future, when you upgrade to MySQL version X (where the optimizer is clever enough to understand that an ORDER BY inside a derived table is redundant and can be eliminated).
Try using this query:
SELECT
p.*, v.*
FROM
people AS p
LEFT JOIN
( SELECT
id_people
, MAX(year) AS year
FROM
visits
GROUP BY
id_people
) AS vm
JOIN
visits AS v
ON v.id_people = vm.id_people
AND v.year = vm.year
ON v.id_people = p.id;
The: SQL-fiddle
A compound index on (id_people, year) would help efficiency.
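A minimal sketch of the MAX(year) derived-table approach together with the suggested compound index, using Python's built-in sqlite3 module. SQLite's planner is not MySQL's, so the printed plan only illustrates how to inspect index use and says nothing about MySQL's behavior. The index name `idx_visits_people_year`, the alias `lv` and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                         year INTEGER, note TEXT);
    CREATE INDEX idx_visits_people_year ON visits (id_people, year);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO visits (id_people, year, note) VALUES
        (1, 2010, 'old'), (1, 2012, 'new'), (2, 2011, 'only');
""")

# Find each person's newest year, join back to visits to fetch the full row,
# then left join to people so that people without visits are still listed.
query = """
    SELECT p.id, p.name, lv.year, lv.note
    FROM people AS p
    LEFT JOIN (
        SELECT v.id_people, v.year, v.note
        FROM visits AS v
        JOIN (SELECT id_people, MAX(year) AS year
              FROM visits
              GROUP BY id_people) AS vm
          ON v.id_people = vm.id_people AND v.year = vm.year
    ) AS lv ON lv.id_people = p.id
    ORDER BY p.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

# The plan text differs between SQLite versions, so it is printed, not asserted.
for step in conn.execute("EXPLAIN QUERY PLAN " + query).fetchall():
    print(step)
```

In MySQL the equivalent check would be to run EXPLAIN on the query and look at the key column for the visits table.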
A different approach. It works fine if you limit the persons to a sensible limit (say 30) first and then join to the visits table:
SELECT
p.*, v.*
FROM
( SELECT *
FROM people
ORDER BY name
LIMIT 30
) AS p
LEFT JOIN
visits AS v
ON v.id_people = p.id
AND v.year =
( SELECT
year
FROM
visits
WHERE
id_people = p.id
ORDER BY
year DESC
LIMIT 1
)
ORDER BY name ;
Why do you have a subquery when all you need is a table name for joining?
It is also not obvious to me why your query has a GROUP BY clause in it. GROUP BY is ordinarily used with aggregate functions like MAX or COUNT, but you don't have those.
How about this? It may solve your problem.
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
If you need to show the person, the most recent visit, and the note from the most recent visit, you're going to have to explicitly join the visits table again to the summary query (virtual table) like so.
SELECT a.id, a.name, a.year, v.note
FROM (
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
)a
JOIN visits v ON (a.id = v.id_people and a.year = v.year)
Go fiddle: http://www.sqlfiddle.com/#!2/d67fc/20/0
If you need to show something for people that have never had a visit, you should try switching the JOIN items in my statement with LEFT JOIN.
As someone else wrote, an ORDER BY clause in a subquery is not standard, and generates unpredictable results. In your case it baffled the optimizer.
Edit: GROUP BY is a big hammer. Don't use it unless you need it. And, don't use it unless you use an aggregate function in the query.
Notice that if you have more than one row in visits for a person and the most recent year, this query will generate multiple rows for that person, one for each visit in that year. If you want just one row per person, and you DON'T need the note for the visit, then the first query will do the trick. If you have more than one visit for a person in a year, and you only need the latest one, you have to identify which row IS the latest one. Usually it will be the one with the highest ID number, but only you know that for sure. I added another person to your fiddle with that situation. http://www.sqlfiddle.com/#!2/4f644/2/0
This is complicated. But: if your visits.id numbers are automatically assigned and they are always in time order, you can simply report the highest visit id, and be guaranteed that you'll have the latest year. This will be a very efficient query.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT id_people, max(id) id
FROM visits
GROUP BY id_people
)m
JOIN people p ON (p.id = m.id_people)
JOIN visits v ON (m.id = v.id)
http://www.sqlfiddle.com/#!2/4f644/1/0 But this is not the way your example is set up. So you need another way to disambiguate your latest visit, so you just get one row per person. The only trick we have at our disposal is to use the largest id number.
So, we need to get a list of the visit.id numbers that are the latest ones, by this definition, from your tables. This query does that, with a MAX(year)...GROUP BY(id_people) nested inside a MAX(id)...GROUP BY(id_people) query.
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON (p.id_people = v.id_people AND p.year = v.year)
GROUP BY v.id_people
The overall query (http://www.sqlfiddle.com/#!2/c2da2/1/0) is this.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON ( p.id_people = v.id_people
AND p.year = v.year)
GROUP BY v.id_people
)m
JOIN people p ON (m.id_people = p.id)
JOIN visits v ON (m.id = v.id)
Disambiguation in SQL is a tricky business to learn, because it takes some time to wrap your head around the idea that there's no inherent order to rows in a DBMS.
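The two-level disambiguation (newest year first, then largest visit id within that year) can be sketched with Python's built-in sqlite3 module. The sample data is invented so that the largest id for Ann belongs to an older year, which is the case where taking MAX(id) alone would pick the wrong row. The aliases `latest` and `m` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                         year INTEGER, note TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO visits VALUES
        (1, 1, 2012, 'spring'),
        (2, 1, 2012, 'autumn'),
        (3, 1, 2010, 'late entry'),
        (4, 2, 2011, 'only');
""")

# Inner level: newest year per person.
# Middle level: largest visit id among the rows of that newest year.
# Outer level: fetch the person and the single chosen visit row.
rows = conn.execute("""
    SELECT p.id, p.name, v.year, v.note
    FROM (
        SELECT v.id_people, MAX(v.id) AS id
        FROM (SELECT id_people, MAX(year) AS year
              FROM visits
              GROUP BY id_people) latest
        JOIN visits v ON latest.id_people = v.id_people
                     AND latest.year = v.year
        GROUP BY v.id_people
    ) m
    JOIN people p ON m.id_people = p.id
    JOIN visits v ON m.id = v.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Ann gets exactly one row, the 2012 visit with the higher id, even though visit 3 has the largest id overall.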

Stepped table not sorting

I'm trying to create a stepped table report using SQL report builder 3.0. The stepped report contains Groups/devices/users along with associated totals for each group/device/user.
I want the entire report to be sorted by these totals along with each individual step sorted this way also.
Currently users are sorted by their totals, but not devices or groups.
Is there a way to sort the other steps?
You can just do this in SQL using some nested queries. Let's assume you have the following tables: Transaction, User, Device, and Group. The transaction table records the transactions of the User on a Device and has an Amount field to sum. A user belongs to a Group.
So you need to sum the Amount for the User, for the Groups and for the Devices used within a Group which will give you SQL that looks like this:
SELECT G.Description AS [Group], D.Description AS Device, U.Description AS UserName, MAX(GT.GroupTotal) AS GroupTotal, MAX(GDT.GroupDeviceTotal) AS GroupDeviceTotal, SUM(T.Amount) AS UserTotal
FROM [Transaction] AS T
INNER JOIN [User] AS U ON U.UserId = T.UserId
INNER JOIN [Group] AS G ON G.GroupId = U.GroupId
INNER JOIN Device AS D ON D.DeviceId = T.DeviceId
INNER JOIN
(SELECT [User].GroupId, SUM([Transaction].Amount) AS GroupTotal
FROM [Transaction]
INNER JOIN [User] ON [User].UserId = [Transaction].UserId
WHERE ([Transaction].TxDate >= '2011-01-01')
GROUP BY [User].GroupId) AS GT ON GT.GroupId = U.GroupId
INNER JOIN
(SELECT [User].GroupId, [Transaction].DeviceId, SUM([Transaction].Amount) AS GroupDeviceTotal
FROM [Transaction]
INNER JOIN [User] ON [User].UserId = [Transaction].UserId
WHERE ([Transaction].TxDate >= '2011-01-01')
GROUP BY [User].GroupId, [Transaction].DeviceId) AS GDT ON GDT.GroupId = U.GroupId AND GDT.DeviceId = T.DeviceId
WHERE (T.TxDate >= '2011-01-01')
GROUP BY G.GroupId, G.Description, D.DeviceId, D.Description, U.UserId, U.Description
ORDER BY GroupTotal DESC, GroupDeviceTotal DESC, UserTotal DESC
Note that the where clause you use has to be the same in the main query and each nested query (this is the "WHERE (T.TxDate >= '2011-01-01')" bit).
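A reduced sketch of the same idea, with only a group level and a user level, using Python's built-in sqlite3 module in place of SQL Server. The `users` and `tx` tables and their rows are hypothetical stand-ins for the User and Transaction tables assumed above, and the date filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, group_id INTEGER, name TEXT);
    CREATE TABLE tx (user_id INTEGER, amount INTEGER);
    INSERT INTO users VALUES (1, 100, 'ann'), (2, 100, 'bob'), (3, 200, 'cid');
    INSERT INTO tx VALUES (1, 5), (2, 20), (2, 10), (3, 50), (1, 1);
""")

# The derived table computes one total per group; joining it to every user row
# makes the group total available as a sort key next to the user total.
rows = conn.execute("""
    SELECT u.group_id, u.name, gt.group_total, SUM(t.amount) AS user_total
    FROM tx t
    JOIN users u ON u.user_id = t.user_id
    JOIN (SELECT u2.group_id, SUM(t2.amount) AS group_total
          FROM tx t2
          JOIN users u2 ON u2.user_id = t2.user_id
          GROUP BY u2.group_id) gt ON gt.group_id = u.group_id
    GROUP BY u.group_id, u.name, gt.group_total
    ORDER BY gt.group_total DESC, user_total DESC
""").fetchall()

for row in rows:
    print(row)
```

Group 200 sorts first because its total (50) beats group 100's total (36); within group 100, bob (30) sorts ahead of ann (6).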
You can try going to the Row/Column groups area... then for each group you have, double click the group, select "Sorting" and then add as many sorting fields as you need for the info contained at that group level.
If you have other sorts applied to the data, such as on the tablix/matrix, SSRS can sometimes get confused. So if my suggestion gets you close to the effect you're going for but there are still some issues, try removing all other sorting you've applied elsewhere in the report besides on those groups. I would start with the innermost group and work outward, trying not to repeat a field that is in a lower group's data (if that makes sense).
edit:
So, let's say we have a report for a vet's office that shows client information, and we want to group by personID, petID and visitID. The tablix as a whole would be sorted by the person's name (or last name, then first name, or whatever). Then your first group would group on the personID and be sorted by the petName. The second, lower group would group on the petID and be sorted by the visitDate. The third level would group on the visitID, and this doesn't really need to be sorted, unless by visitTime if it's not included in visitDate.