What's wrong with this query? - mysql

I'm selecting the total count of villages and the total population per user from my tables to build statistics. However, something is wrong. It returns everything in the first row (530 pop, and there are 530 pop in total; 106 villages, and there are 106 users in total), and the following rows are all NULL.
SELECT s1_users.id userid, (
SELECT count( s1_vdata.wref )
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = userid
)totalvillages, (
SELECT SUM( s1_vdata.pop )
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = userid
)pop
FROM s1_users
WHERE s1_users.dp >=0
ORDER BY s1_users.dp DESC

Try removing s1_users from the inner SELECTs.

You're already using INNER JOINs. When you list tables separated by a comma, it is shorthand for an INNER JOIN.
Now, the most obvious answer is that your subqueries using aggregate functions (COUNT and SUM) are missing GROUP BY clauses.
SELECT s1_users.id userid, (
SELECT count( s1_vdata.wref )
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = userid
GROUP BY s1_vdata.owner
)totalvillages, (
SELECT SUM( s1_vdata.pop )
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = userid
GROUP BY s1_vdata.owner
)pop
FROM s1_users
WHERE s1_users.dp >=0
ORDER BY s1_users.dp DESC
However, using subqueries in the column list is really inefficient. It causes the subqueries to be run once for each row in the outer query.
Try this instead:
SELECT
s1_users.id AS userid,
COUNT(s1_vdata.wref) AS totalvillages,
SUM(s1_vdata.pop) AS pop
FROM
s1_users, s1_vdata -- I'm cheating here! There's a hidden INNER JOIN in this line ;P
WHERE
s1_users.dp >= 0
AND s1_users.id = s1_vdata.owner
GROUP BY
s1_users.id
ORDER BY
s1_users.dp DESC

SELECT s1_users.id AS userid,
(
SELECT COUNT(*)
FROM s1_vdata
WHERE s1_vdata.owner = userid
) AS totalvillages,
(
SELECT SUM(pop)
FROM s1_vdata
WHERE s1_vdata.owner = userid
) AS pop
FROM s1_users
WHERE dp >= 0
ORDER BY
dp DESC
Note that this is less efficient than this query:
SELECT s1_users.id AS user_id, COUNT(s1_vdata.owner), SUM(s1_vdata.pop)
FROM s1_users
LEFT JOIN
s1_vdata
ON s1_vdata.owner = s1_users.id
GROUP BY
s1_users.id
ORDER BY
dp DESC
since the aggregation needs to be done twice in the former.
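To make the comparison concrete, here is a minimal sketch that runs both forms against a tiny in-memory dataset. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration; only the table and column names come from the question. Unlike the LEFT JOIN query above, this sketch also keeps the dp >= 0 filter so that both forms select the same users.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s1_users (id INTEGER PRIMARY KEY, dp INTEGER);
    CREATE TABLE s1_vdata (wref INTEGER PRIMARY KEY, owner INTEGER, pop INTEGER);
    INSERT INTO s1_users VALUES (1, 30), (2, 20), (3, 10);
    INSERT INTO s1_vdata VALUES (101, 1, 5), (102, 1, 7), (103, 2, 4);
""")

# Form 1: two correlated subqueries, each touching only s1_vdata.
subquery_rows = conn.execute("""
    SELECT u.id,
           (SELECT COUNT(*)   FROM s1_vdata v WHERE v.owner = u.id),
           (SELECT SUM(v.pop) FROM s1_vdata v WHERE v.owner = u.id)
    FROM s1_users u
    WHERE u.dp >= 0
    ORDER BY u.dp DESC
""").fetchall()

# Form 2: one LEFT JOIN and a single aggregation pass.
join_rows = conn.execute("""
    SELECT u.id, COUNT(v.owner), SUM(v.pop)
    FROM s1_users u
    LEFT JOIN s1_vdata v ON v.owner = u.id
    WHERE u.dp >= 0
    GROUP BY u.id
    ORDER BY u.dp DESC
""").fetchall()

print(subquery_rows)
print(join_rows)
```

User 3 owns no villages, so both forms report a count of 0 and a NULL population for that user. Wrap the SUM in IFNULL(..., 0) if a zero is preferred.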

SELECT tabl1.userid, totalvillages, pop FROM
(
SELECT s1_users.id AS userid, COUNT( s1_vdata.wref ) AS totalvillages
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = s1_users.id
GROUP BY s1_users.id) tabl1 INNER JOIN
(
SELECT s1_users.id AS userid, SUM( s1_vdata.pop ) AS pop
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = s1_users.id
GROUP BY s1_users.id) tabl2 ON tabl1.userid = tabl2.userid

Related

Duplicates in pre-aggregated sub-query sql

I have two tables with a many-to-many relationship. I am trying to get values from both tables where UserId is unique (I'm joining the tables on this value).
I am trying to use a pre-aggregated query, but I get this error:
Column 'clv.ProbabilityAlive' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I understand that I should add all of these values to the GROUP BY clause, but then I get duplicates because the PeakClv values repeat.
If I use a simple join, it takes forever because of the many-to-many relationship.
This is my query:
SELECT
distinct(s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
GROUP BY [UserId]
) s
LEFT JOIN (
SELECT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
GROUP BY [UserId]
) a ON a.[UserId] = s.[UserId]
I am a bit out of ideas. Could someone lend a hand?
I also tried using distinct like one answer suggested:
SELECT
distinct (s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT ([UserId]), ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '10/09/2020 00:00:00'
AND sb.[EstimatedNumberOfTransactionsLong] >= 0
AND sb.[EstimatedNumberOfTransactionsLong] <= 5680
AND sb.[ClientId] = '16'
AND sb.[Product] = 'Total'
ORDER BY sb.[userId] asc OFFSET (1 - 1) * 10 ROWS FETCH NEXT 10 ROWS ONLY
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData]
) a ON a.[UserId] = s.[UserId]
but I still get duplicates.
If you have no aggregate function such as SUM() or MAX(), GROUP BY is the wrong tool: every selected column would have to appear in the GROUP BY clause. Use DISTINCT instead:
SELECT
distinct s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
If you only need distinct (non-repeated) rows, use DISTINCT as above.
But judging by your image, it seems you need an aggregate function on PeakClv, e.g. MAX(), together with GROUP BY:
SELECT
s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
max(a.PeakClv) as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
GROUP BY s.userid,
s.ProbabilityAlive
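The difference between the two variants can be seen on a small example. The following is only a sketch: it uses Python's sqlite3 module instead of SQL Server, the rows are invented, and the PeakClv > 1 filter is placed in the ON clause rather than in a derived table (for a LEFT JOIN the effect is the same).

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clv (UserId INTEGER, ProbabilityAlive REAL, CalculationDate TEXT);
    CREATE TABLE AdditionalClvData (UserId INTEGER, PeakClv REAL);
    INSERT INTO clv VALUES (1, 0.9, '20200311'), (2, 0.4, '20200311');
    -- user 1 has two different PeakClv values above 1; user 2 has none
    INSERT INTO AdditionalClvData VALUES (1, 2.0), (1, 5.0), (2, 0.5);
""")

# DISTINCT only removes fully identical rows, so user 1 still appears twice.
distinct_rows = conn.execute("""
    SELECT DISTINCT s.UserId, s.ProbabilityAlive, a.PeakClv
    FROM clv s
    LEFT JOIN AdditionalClvData a
           ON a.UserId = s.UserId AND a.PeakClv > 1
    WHERE s.CalculationDate = '20200311'
    ORDER BY s.UserId, a.PeakClv
""").fetchall()

# MAX() with GROUP BY collapses the repeats to one row per user.
grouped_rows = conn.execute("""
    SELECT s.UserId, s.ProbabilityAlive, MAX(a.PeakClv) AS PeakClv
    FROM clv s
    LEFT JOIN AdditionalClvData a
           ON a.UserId = s.UserId AND a.PeakClv > 1
    WHERE s.CalculationDate = '20200311'
    GROUP BY s.UserId, s.ProbabilityAlive
    ORDER BY s.UserId
""").fetchall()

print(distinct_rows)
print(grouped_rows)
```

Whether MAX() is the right aggregate depends on which PeakClv value should win; MIN() or AVG() drop in the same way.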

Finding top 5 results for multiple values in sql result

I have the following sql query:
SELECT v.venue_id, s.zip, COUNT( * )
FROM bcs_scans s
JOIN bcs_scanners sc ON s.uuid = sc.uuid
JOIN bcs_venues v ON sc.venue_id = v.venue_id
WHERE v.banlist_id = '625'
AND s.del =0
GROUP BY s.zip
ORDER BY COUNT( * ) DESC
This returns each zip code, its count, and the associated venue.
How do I go about selecting the top 5 zip codes for each unique venue id?
I believe I can run a subquery that groups results by venue id with the top 5 zip counts, but I am unsure where to start.
You could select the result this way, although it is a bit complex:
use HAVING to keep only the rows whose count matches the maximum count per venue_id, computed from your original query.
SELECT v.venue_id as venue_id, s.zip as zip, COUNT( * ) as num
FROM bcs_scans s
JOIN bcs_scanners sc ON s.uuid = sc.uuid
JOIN bcs_venues v ON sc.venue_id = v.venue_id
WHERE v.banlist_id = '625'
AND s.del =0
GROUP BY v.venue_id, s.zip
HAVING ( v.venue_id, COUNT( * )) in
(select venue_id, max(num)
from
(SELECT v.venue_id as venue_id, s.zip as zip, COUNT( * ) as num
FROM bcs_scans s
JOIN bcs_scanners sc ON s.uuid = sc.uuid
JOIN bcs_venues v ON sc.venue_id = v.venue_id
WHERE v.banlist_id = '625'
AND s.del =0
GROUP BY v.venue_id, s.zip
ORDER BY COUNT( * ) DESC ) t
group by venue_id)
ORDER BY COUNT( * ) limit 5
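A different technique for "top N per group" is to number the rows within each venue and keep the first N. The sketch below is not the approach from the answer above: it assumes window functions are available (MySQL 8.0+, SQLite 3.25+), runs on Python's sqlite3, and uses a hypothetical flattened scans table with invented rows in place of the three-table join.

```python
import sqlite3

TOP_N = 2  # the question asks for 5; 2 keeps the sample data small

# Assumption: "scans" is a made-up, pre-joined stand-in for
# bcs_scans + bcs_scanners + bcs_venues.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scans (venue_id INTEGER, zip TEXT);
    INSERT INTO scans VALUES
        (1, '10001'), (1, '10001'), (1, '10001'),
        (1, '10002'), (1, '10002'),
        (1, '10003'),
        (2, '20001'), (2, '20001'),
        (2, '20002');
""")

rows = conn.execute("""
    WITH counts AS (
        -- one row per (venue, zip) with its scan count
        SELECT venue_id, zip, COUNT(*) AS num
        FROM scans
        GROUP BY venue_id, zip
    ),
    ranked AS (
        -- rank zips inside each venue; zip breaks ties deterministically
        SELECT venue_id, zip, num,
               ROW_NUMBER() OVER (PARTITION BY venue_id
                                  ORDER BY num DESC, zip) AS rn
        FROM counts
    )
    SELECT venue_id, zip, num
    FROM ranked
    WHERE rn <= ?
    ORDER BY venue_id, num DESC, zip
""", (TOP_N,)).fetchall()

print(rows)
```

ROW_NUMBER() always returns exactly N rows per venue; RANK() could be substituted if tied zip codes should all be kept.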

combining multiple sql queries together

I have multiple tables for a project (sessions, charges and payments).
To get the sessions I'm doing the following:
SELECT
sess.file_id, SUM(sess.rate * sess.length) AS total
FROM
sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
This returns the amount that a specific student should pay.
I also have another table, "charges":
SELECT
file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM
file_charges
GROUP BY file_charges.file_id
And finally the payment query:
SELECT
file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM
file_payments
GROUP BY file_payments.file_id
Can I combine those three so that I get:
Total = Payments - (Session + Charges)
Note that the total could be negative: a file_id could exist in sessions and charges but not in payments, and I could have a payment without sessions or charges.
Edit : http://sqlfiddle.com/#!2/a90d9
One issue that needs to be addressed is whether one of these queries can be the "driver" in cases where one or more of the queries return no rows for a given file_id (e.g. there might be rows from sess, but none from file_payments). If we want to be sure to include every possible file_id that appears in any of the queries, we can get a list of all possible file_id values with a query like this:
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
(NOTE: The UNION operator will remove any duplicates)
To get the specified resultset, we can use that query, along with "left joins" of the other three original queries. The outline of the query will be:
SELECT a.file_id, p.total_payment - ( s.total + c.total_charges)
FROM a
LEFT JOIN s ON s.file_id = a.file_id
LEFT JOIN c ON c.file_id = a.file_id
LEFT JOIN p ON p.file_id = a.file_id
ORDER BY a.file_id
In that statement, a is a stand-in for the query that gets the set of all file_id values (as shown above). The s, c and p are stand-ins for your three original queries on sess, file_charges and file_payments, respectively.
If any of the file_id values is "missing" from any of the queries, we are going to need to substitute a zero for the missing value. We can use the IFNULL function to handle that for us.
This query should return the specified resultset:
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS t
FROM ( -- all possible values of file_id
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT
JOIN ( -- the amount that a specific student should pay
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT
JOIN ( -- charges
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT
JOIN ( -- payments
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id
(The EXPLAIN for this query is not going to be pretty, with four derived tables. On really large sets, performance may be horrendous. But the resultset returned should meet the specification.)
Beware of queries that JOIN all three tables together: that will likely give incorrect results when there are (for example) two or more rows for the same file_id in the file_payments table.
There are other approaches to getting an equivalent result set, but the query above answers the question: "how can i get the results of these queries joined together into a total".
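The row-multiplication problem and the pre-aggregated fix can both be checked on a handful of rows. This is a sketch under stated assumptions: Python's sqlite3 stands in for MySQL (IFNULL exists in both), and the sample data is invented.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sess (file_id INTEGER, rate REAL, length REAL, sessionDone INTEGER);
    CREATE TABLE file_charges (file_id INTEGER, price REAL);
    CREATE TABLE file_payments (file_id INTEGER, paymentAmount REAL);
    -- file 1: one session (10 * 2 = 20), one charge (5), two payments (15 + 15)
    INSERT INTO sess VALUES (1, 10, 2, 1);
    INSERT INTO file_charges VALUES (1, 5);
    INSERT INTO file_payments VALUES (1, 15), (1, 15);
    -- file 2: a payment only
    INSERT INTO file_payments VALUES (2, 40);
    -- file 3: a session only (8 * 1 = 8)
    INSERT INTO sess VALUES (3, 8, 1, 1);
""")

# Joining the raw tables: the two payment rows duplicate the session and
# charge rows of file 1, and files 2 and 3 drop out of the inner join.
naive = conn.execute("""
    SELECT s.file_id,
           SUM(p.paymentAmount) - (SUM(s.rate * s.length) + SUM(c.price))
    FROM sess s
    JOIN file_charges c  ON c.file_id = s.file_id
    JOIN file_payments p ON p.file_id = s.file_id
    WHERE s.sessionDone = 1
    GROUP BY s.file_id
""").fetchall()

# Aggregating each table first, then LEFT JOINing onto the full id list.
correct = conn.execute("""
    SELECT a.file_id,
           IFNULL(p.total_payment, 0)
             - (IFNULL(s.total, 0) + IFNULL(c.total_charges, 0)) AS t
    FROM (SELECT file_id FROM sess
          UNION SELECT file_id FROM file_charges
          UNION SELECT file_id FROM file_payments) a
    LEFT JOIN (SELECT file_id, SUM(rate * length) AS total
               FROM sess WHERE sessionDone = 1 GROUP BY file_id) s
           ON s.file_id = a.file_id
    LEFT JOIN (SELECT file_id, SUM(price) AS total_charges
               FROM file_charges GROUP BY file_id) c
           ON c.file_id = a.file_id
    LEFT JOIN (SELECT file_id, SUM(paymentAmount) AS total_payment
               FROM file_payments GROUP BY file_id) p
           ON p.file_id = a.file_id
    ORDER BY a.file_id
""").fetchall()

print(naive)
print(correct)
```

For file 1 the naive join reports -20 instead of 5, because the single session and charge are each counted once per payment row.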
Using correlated subqueries
Here's another approach, using correlated subqueries in the SELECT list...
SELECT a.file_id
, IFNULL( ( SELECT SUM(file_payments.paymentAmount) FROM file_payments
WHERE file_payments.file_id = a.file_id )
,0)
- ( IFNULL( ( SELECT SUM(sess.rate * sess.length) FROM sess
WHERE sess.file_id = a.file_id )
,0)
+ IFNULL( ( SELECT SUM(file_charges.price) FROM file_charges
WHERE file_charges.file_id = a.file_id )
,0)
) AS tot
FROM ( -- all file_id values
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
ORDER BY a.file_id
Try this:
SELECT sess.file_id, SUM(file_payments.paymentAmount) - (SUM(sess.rate * sess.length)+SUM(file_charges.price)) as total_payment FROM sess , file_charges , file_payments
WHERE sess.sessionDone = 1
GROUP BY total_payment
EDIT.
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS tot
FROM (
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT JOIN (
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT JOIN (
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT JOIN (
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id

MySQL - Combining multiple queries + counts

I'm attempting to combine a few queries and can't seem to nail it down. I was wondering if someone could point me in the right direction.
Here are the statements:
SELECT
I.id,
I.custname,
I.custemail,
I.sku,
DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
FROM images I
WHERE I.stat = 0
SELECT
COUNT(*) AS total1
FROM images
WHERE stat = 1 AND sku = ?
SELECT
COUNT(*) AS total2
FROM images
WHERE stat = 1
AND sku IN (SELECT subsku FROM combo WHERE sku = ?)
Right now I'm using the three separate queries and adding the two totals in code to display them. But I now need to be able to sort by the sum of the totals, so I'd like to get all of that data into one statement, something like:
SELECT
I.id,
I.custname,
I.custemail,
I.sku,
DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts,
SUM(total1+total2)
FROM images I
WHERE I.stat = 0
But I'm unsure how to do that. I tried the code below, but it failed:
SELECT
I.id,
I.custname,
I.custemail,
I.sku,
DATE_FORMAT(FROM_UNIXTIME(I.ts),'%l:%i:%s %p, %c/%e/%Y') AS ts,
(
SELECT COUNT(*) AS total1 FROM images WHERE stat = 1 AND (
sku IN (
SELECT subsku FROM combo WHERE sku = I.sku
) OR sku = I.sku)
) AS skuct
FROM images I
WHERE stat = 0
Any help would be greatly appreciated. Many thanks!
UPDATE
First off thanks to everyone who has offered assistance. I've been working on the query and think I'm getting closer, but I'm now hitting a 'subquery returns more than 1 row' error:
SELECT
I.id,
I.custname,
I.custemail,
I.sku,
DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts,
(
SELECT COUNT(*)
FROM images
WHERE stat = 1
AND sku = I.sku
OR sku IN(
SELECT subsku FROM combo WHERE sku = I.sku
)
GROUP BY sku
) AS total
FROM images I
WHERE stat = 0
The problem is that the subquery SELECT subsku FROM combo WHERE... returns a resultset (0+ rows) vs a scalar. If I can figure out that part, I think this will work.
Select I.id, I.custname, I.custemail, I.sku
, DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
, (
Select Sum( Case
When I1.sku = ? Then 1
When I1.sku In( Select subsku From combo As S1 Where S1.sku = ? ) Then 1
Else 0
End ) As Total
From images As I1
Where I1.stat = 1
) As Total
From images As I
Where stat = 0
Another solution
Select I.id, I.custname, I.custemail, I.sku
, DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
, (
Select Count(*)
From images As I1
Where I1.stat = 1
And (
I1.sku = ?
Or I1.sku In ( Select subsku From combo As S1 Where S1.sku = ? )
)
) As Total
From images As I
Where stat = 0
Addition
Another possible solution:
Select I.id, I.custname, I.custemail, I.sku
, DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
, (
Select Sum( Cnt )
From (
Select Count(*) As Cnt
From images As I1
Left Join combo As C1
On C1.sku = I1.sku
Where I1.stat = 1
And I1.sku = ?
And C1.PrimaryKeyCol Is Null
Union All
Select Count( Distinct I1.PrimaryKeyCol )
From images As I1
Join combo As C1
On C1.sku = I1.sku
Where I1.stat = 1
And I1.sku = ?
) As Z
) As Total
From images As I
Where stat = 0
Edit
If you are looking for the count by image in which you correlate the counts to the outer table images sku column, that's entirely different. For that, I would use a derived table:
Select I.id, I.custname, I.custemail, I.sku
, DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
, Counts.Total
From images As I
Join (
Select Z.sku, Sum(Z.Cnt) As Total
From (
Select I1.sku, Count(*) As Cnt
From images As I1
Left Join combo As C1
On C1.sku = I1.sku
Where I1.stat = 1
And C1.PrimaryKeyCol Is Null
Group By I1.sku
Union All
Select I1.sku, Count( Distinct I1.PrimaryKeyCol )
From images As I1
Join combo As C1
On C1.sku = I1.sku
Where I1.stat = 1
Group By I1.sku
) As Z
Group By Z.sku
) As Counts
On Counts.sku = I.sku
Where stat = 0
Obviously in all cases, replace PrimaryKeyCol with the name of the actual primary key column of the images table.
Have you considered changing the WHERE logic to perform the count in one query?
Assuming the two counts are mutually exclusive, you could just OR the two conditions.
Use a cross join:
select t1.*, t2.total1, t3.total2
from
(
SELECT I.id, I.custname, I.custemail, I.sku,
DATE_FORMAT(FROM_UNIXTIME(I.ts), '%l:%i:%s %p, %c/%e/%Y') AS ts
FROM images I
WHERE I.stat = 0
) t1
cross join
(
SELECT COUNT(*) AS total1
FROM images
WHERE stat = 1 AND sku = ?
) t2
cross join
(
SELECT COUNT(*) AS total2
FROM images
WHERE stat = 1 AND sku IN (SELECT subsku FROM combo WHERE sku = ?)
) t3
You might be able to make this more efficient, since the queries all go after the same table. A single aggregation with one CASE expression per WHERE clause would probably be more efficient.
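A single-pass version of the two counts might look like the following sketch. It runs on Python's sqlite3 as a stand-in for MySQL, the sample rows are invented, and the named parameter :sku replaces the ? placeholders from the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id INTEGER PRIMARY KEY, sku TEXT, stat INTEGER);
    CREATE TABLE combo (sku TEXT, subsku TEXT);
    INSERT INTO images VALUES
        (1, 'A', 0), (2, 'A', 1), (3, 'A', 1), (4, 'A1', 1), (5, 'B', 1);
    INSERT INTO combo VALUES ('A', 'A1'), ('A', 'A2');
""")

# One scan of images; each CASE mirrors one of the original WHERE clauses.
# COALESCE covers the case where no row has stat = 1 (SUM would be NULL).
totals = conn.execute("""
    SELECT COALESCE(SUM(CASE WHEN i.sku = :sku THEN 1 ELSE 0 END), 0) AS total1,
           COALESCE(SUM(CASE WHEN i.sku IN (SELECT c.subsku
                                            FROM combo c
                                            WHERE c.sku = :sku)
                             THEN 1 ELSE 0 END), 0) AS total2
    FROM images i
    WHERE i.stat = 1
""", {"sku": "A"}).fetchone()

print(totals)
```

Both totals come back in one row, so they can be added or sorted on in SQL rather than in application code.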

MySQL - "Most social User" with the most comments in multiple tables

I have 2 tables of concern - 'videoComments', 'storyComments'.
I need to find the 'posterID' that has the most entries in videoComments and storyComments. Here's the code I have so far, but it only calls videoComments:
$sql = "SELECT (SELECT posterID
FROM videoComments
GROUP BY posterID
ORDER BY COUNT(posterID) DESC LIMIT 1) AS mostSocialUser ";
How do I pull it and compare the COUNT of posterID from both tables?
Use:
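SELECT x.posterid,
IFNULL(y.cnt, 0) + IFNULL(z.cnt, 0) AS numComments
FROM (SELECT vc.posterid
FROM VIDEOCOMMENTS vc
UNION
SELECT sc.posterid
FROM STORYCOMMENTS sc) x
-- count each table separately so the two joins cannot multiply each other's rows
LEFT JOIN (SELECT posterid, COUNT(*) AS cnt
FROM VIDEOCOMMENTS
GROUP BY posterid) y ON y.posterid = x.posterid
LEFT JOIN (SELECT posterid, COUNT(*) AS cnt
FROM STORYCOMMENTS
GROUP BY posterid) z ON z.posterid = x.posterid
ORDER BY numComments DESC
LIMIT 1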
Try this:
SELECT (
SELECT posterID FROM (
SELECT posterID FROM videoComments
UNION ALL
SELECT posterID FROM storyComments
) AS allComments GROUP BY posterID
ORDER BY COUNT(posterID) DESC LIMIT 1
) AS mostSocialUser
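A quick way to check a combined count is to stack both tables and count rows per poster. The sketch below uses Python's sqlite3 as a stand-in for MySQL with invented rows. It relies on UNION ALL, which keeps every comment row; a plain UNION would collapse duplicates, so every poster would be counted once.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videoComments (posterID INTEGER);
    CREATE TABLE storyComments (posterID INTEGER);
    -- poster 1 leads on videos alone (2 vs 1) ...
    INSERT INTO videoComments VALUES (1), (1), (2);
    -- ... but poster 2 leads once story comments are included (4 vs 2)
    INSERT INTO storyComments VALUES (2), (2), (2);
""")

most_social = conn.execute("""
    SELECT posterID, COUNT(*) AS numComments
    FROM (SELECT posterID FROM videoComments
          UNION ALL
          SELECT posterID FROM storyComments) AS allComments
    GROUP BY posterID
    ORDER BY numComments DESC
    LIMIT 1
""").fetchone()

print(most_social)
```

If two posters tie for first place, LIMIT 1 picks one of them arbitrarily; add a second ORDER BY key if the choice matters.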