Duplicates in pre-aggregated sub-query sql - mysql

I have two tables with a many-to-many relationship. I am trying to get values from both tables where UserId is unique (I'm joining the tables on this value).
I am trying to use a pre-aggregated query, but I get this error:
Column 'clv.ProbabilityAlive' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I understand that I should add all of these values to the GROUP BY clause, but then I get duplicates because the PeakClv values repeat.
If I use a simple join, it takes forever because of the many-to-many relationship.
This is my query:
SELECT
distinct(s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
GROUP BY [UserId]
) s
LEFT JOIN (
SELECT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
GROUP BY [UserId]
) a ON a.[UserId] = s.[UserId]
I am a bit out of ideas. Could someone lend a hand?
I also tried using DISTINCT, as one answer suggested:
SELECT
distinct (s.userid) as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT ([UserId]), ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '10/09/2020 00:00:00'
AND sb.[EstimatedNumberOfTransactionsLong] >= 0
AND sb.[EstimatedNumberOfTransactionsLong] <= 5680
AND sb.[ClientId] = '16'
AND sb.[Product] = 'Total'
ORDER BY sb.[userId] asc OFFSET (1 - 1) * 10 ROWS FETCH NEXT 10 ROWS ONLY
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData]
) a ON a.[UserId] = s.[UserId]
but I still get duplicates.

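Every selected column that is not listed in GROUP BY must be wrapped in an aggregate function such as SUM() or MAX(). If you have no aggregate function, GROUP BY is the wrong tool; use DISTINCT instead: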
SELECT
distinct s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
a.PeakClv as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
If you only need distinct (non-repeated) rows, use DISTINCT as above.
But judging by your image, it seems you need an aggregate function on PeakClv, e.g. MAX(), together with GROUP BY:
SELECT
s.userid as userId,
s.ProbabilityAlive AS ProbabilityAlive,
max(a.PeakClv) as PeakClv
FROM (
SELECT DISTINCT [UserId], ([sb].[ProbabilityAlive]) AS ProbabilityAlive
FROM clv as sb
WHERE sb.[CalculationDate] = '20200311'
) s
LEFT JOIN (
SELECT DISTINCT [UserId], PeakClv
FROM [dbo].[AdditionalClvData] where peakClv > 1
) a ON a.[UserId] = s.[UserId]
GROUP BY s.userid,
s.ProbabilityAlive
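As a rough illustration of the MAX() plus GROUP BY approach, here is a small self-contained sketch that uses Python's built-in sqlite3 module rather than the asker's actual database. The sample rows and the helper name peak_clv_per_user are made up for the demonstration, and SQLite's dialect differs from the bracketed syntax in the question, so treat it as a sketch of the idea only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clv (UserId INTEGER, ProbabilityAlive REAL, CalculationDate TEXT);
    CREATE TABLE AdditionalClvData (UserId INTEGER, PeakClv REAL);
    -- Hypothetical sample data: user 1 has several PeakClv rows, user 2 has none above 1.
    INSERT INTO clv VALUES (1, 0.9, '20200311'), (2, 0.4, '20200311'), (3, 0.7, '20200310');
    INSERT INTO AdditionalClvData VALUES (1, 5.0), (1, 8.0), (1, 0.5), (2, 0.2);
""")


def peak_clv_per_user(conn, calculation_date):
    """Return one row per user: (UserId, ProbabilityAlive, highest PeakClv or None)."""
    query = """
        SELECT s.UserId, s.ProbabilityAlive, MAX(a.PeakClv) AS PeakClv
        FROM (
            SELECT DISTINCT UserId, ProbabilityAlive
            FROM clv
            WHERE CalculationDate = ?
        ) s
        LEFT JOIN (
            SELECT DISTINCT UserId, PeakClv
            FROM AdditionalClvData
            WHERE PeakClv > 1
        ) a ON a.UserId = s.UserId
        GROUP BY s.UserId, s.ProbabilityAlive
        ORDER BY s.UserId
    """
    return conn.execute(query, (calculation_date,)).fetchall()


rows = peak_clv_per_user(conn, "20200311")
for row in rows:
    print(row)
```

Because the join can produce several PeakClv rows per user, MAX() collapses them into one value per group, while the LEFT JOIN keeps users that have no qualifying PeakClv at all.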

Related

How to join two queries in one and get results in a single query?

SELECT SUM(commission) as regularincome FROM `tbl_member_commission` where mem_id=2 AND MONTH(cdate) = MONTH(CURRENT_DATE())
SELECT SUM(commission) as crowdfund FROM `tbl_member_comm_month` where mem_id=2 AND MONTH(cdate) = MONTH(CURRENT_DATE())
Note: both tables have the same column names: commission, mem_id, cdate.
If each subquery returns exactly one row, you can use CROSS JOIN:
select a.regularincome, b.crowdfund
FROM
(SELECT SUM(commission) as regularincome FROM `tbl_member_commission` where mem_id=2 AND MONTH(cdate) = MONTH(CURRENT_DATE())) as a
cross join
(SELECT SUM(commission) as crowdfund FROM `tbl_member_comm_month` where mem_id=2 AND MONTH(cdate) = MONTH(CURRENT_DATE())) as b
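The CROSS JOIN idea can be tried out with the sketch below, which uses Python's sqlite3 module. It is only an approximation of the MySQL query: SQLite has no MONTH() function, so strftime('%m', ...) stands in for it, and the sample rows and the function name monthly_income are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_member_commission (mem_id INTEGER, commission REAL, cdate TEXT);
    CREATE TABLE tbl_member_comm_month (mem_id INTEGER, commission REAL, cdate TEXT);
    -- Hypothetical rows; one crowdfund row is dated 40 days ago so it falls in another month.
    INSERT INTO tbl_member_commission VALUES
        (2, 100.0, date('now')), (2, 50.0, date('now')), (3, 999.0, date('now'));
    INSERT INTO tbl_member_comm_month VALUES
        (2, 30.0, date('now')), (2, 10.0, date('now', '-40 days'));
""")


def monthly_income(conn, mem_id):
    """Return (regularincome, crowdfund) for the current month as a single row."""
    query = """
        SELECT a.regularincome, b.crowdfund
        FROM (SELECT SUM(commission) AS regularincome
              FROM tbl_member_commission
              WHERE mem_id = ? AND strftime('%m', cdate) = strftime('%m', 'now')) AS a
        CROSS JOIN
             (SELECT SUM(commission) AS crowdfund
              FROM tbl_member_comm_month
              WHERE mem_id = ? AND strftime('%m', cdate) = strftime('%m', 'now')) AS b
    """
    return conn.execute(query, (mem_id, mem_id)).fetchone()


print(monthly_income(conn, 2))
print(monthly_income(conn, 99))
```

An aggregate without GROUP BY always yields exactly one row, even when nothing matches (the sum is then NULL), which is why the cross join of the two subqueries returns exactly one combined row.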

How to get all jobs that a user has not applied to in MySQL

I have two tables, job_postings and job_applies. How can I get all the jobs that a user has not applied to?
Below are columns of my job_postings table:
id, user_id, title, description, duties, salary, child_count, benefits, created_at, updated_at
Below are columns of the job_applies table
user_id, posting_id, status, created_at, updated_at
What I tried:
$job_postings = DB::table('job_postings')
->select(
'job_postings.title',
'job_postings.description',
'job_postings.duties',
'job_postings.salary',
'job_postings.child_count',
'job_postings.benefits',
'job_postings.created_at',
'job_postings.id AS posting_id',
'job_postings.user_id')
->join('job_applies', 'job_applies.posting_id', '!=', 'job_postings.id')
->where('job_applies.user_id', "=" , user()->id)
->get();
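In MySQL you would write one of the following (replace * with your column list and bind ? to the current user's ID); you can adapt it to your query builder. Note that job_applies.user_id is matched against the applying user, not against job_postings.user_id, which is the poster.
SELECT *
FROM job_postings A
LEFT JOIN job_applies B ON B.posting_id = A.id AND B.user_id = ?
WHERE B.posting_id IS NULL
or
SELECT *
FROM job_postings A
WHERE NOT EXISTS (SELECT 1 FROM job_applies B WHERE B.posting_id = A.id AND B.user_id = ?)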
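Here is a small runnable sketch of both anti-join forms, using Python's sqlite3 module in place of MySQL. It assumes that job_applies.user_id identifies the applicant (as the where clause in the question suggests) and passes that ID as a bound parameter; the sample rows and function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job_postings (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    CREATE TABLE job_applies (user_id INTEGER, posting_id INTEGER, status TEXT);
    -- Hypothetical data: users 10 and 11 post jobs, users 20 and 21 apply.
    INSERT INTO job_postings VALUES (1, 10, 'Nanny'), (2, 10, 'Tutor'), (3, 11, 'Driver');
    INSERT INTO job_applies VALUES (20, 1, 'pending'), (21, 1, 'pending'), (21, 3, 'accepted');
""")


def unapplied_not_exists(conn, applicant_id):
    """Titles of postings with no application row from this applicant (NOT EXISTS)."""
    query = """
        SELECT p.title
        FROM job_postings p
        WHERE NOT EXISTS (
            SELECT 1 FROM job_applies a
            WHERE a.posting_id = p.id AND a.user_id = ?
        )
        ORDER BY p.id
    """
    return [title for (title,) in conn.execute(query, (applicant_id,))]


def unapplied_left_join(conn, applicant_id):
    """Same result via LEFT JOIN, keeping only rows where the join found nothing."""
    query = """
        SELECT p.title
        FROM job_postings p
        LEFT JOIN job_applies a ON a.posting_id = p.id AND a.user_id = ?
        WHERE a.posting_id IS NULL
        ORDER BY p.id
    """
    return [title for (title,) in conn.execute(query, (applicant_id,))]


print(unapplied_not_exists(conn, 20))
print(unapplied_left_join(conn, 21))
```

The user filter has to sit in the ON clause (or inside the subquery), not in the outer WHERE; otherwise the rows that the LEFT JOIN padded with NULLs would be filtered out again.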

Mysql union how to Group returned 3 rows into single row

How can I group the three rows returned by this MySQL UNION into a single row?
(
select CONCAT(c.Email,';', c.CCEmail,';', c.AdminEmail,';', c.HREmail) as Email
from companies c
Where companyid=#companyid#
)
union
(
select ClaimAdministratorEmail
from claimadminregion
where FIND_IN_SET(#companyid#, companyid)
)
union
(
select LossPreventionPersonEmail
from losspreventionregion
where FIND_IN_SET(#companyid#, companyid)
)
It returns 3 rows, but I want them in a single row and don't know how.
Try this:
SELECT CONCAT(c.Email,';', c.CCEmail,';', c.AdminEmail,';', c.HREmail,';', A.ClaimAdministratorEmail,';', B.LossPreventionPersonEmail) AS Email
FROM companies c
LEFT JOIN (SELECT #companyid# AS companyid, GROUP_CONCAT(ClaimAdministratorEmail SEPARATOR ';') ClaimAdministratorEmail
FROM claimadminregion WHERE FIND_IN_SET(#companyid#, companyid)
) A ON c.companyid = A.companyid
LEFT JOIN (SELECT #companyid# AS companyid, GROUP_CONCAT(LossPreventionPersonEmail SEPARATOR ';') LossPreventionPersonEmail
FROM losspreventionregion WHERE FIND_IN_SET(#companyid#, companyid)
) B ON c.companyid = B.companyid
WHERE c.companyid=#companyid#
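The same "aggregate each side to one string, then concatenate" idea can be sketched with Python's sqlite3 module. This is an approximation, not the MySQL query itself: SQLite has no FIND_IN_SET, so a LIKE test on a comma-padded list stands in for it, scalar subqueries replace the joined derived tables, and COALESCE guards against a NULL when a region table has no match. All sample data and the function name all_emails are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (companyid INTEGER, Email TEXT, CCEmail TEXT,
                            AdminEmail TEXT, HREmail TEXT);
    CREATE TABLE claimadminregion (companyid TEXT, ClaimAdministratorEmail TEXT);
    CREATE TABLE losspreventionregion (companyid TEXT, LossPreventionPersonEmail TEXT);
    -- Hypothetical data; the region tables store comma-separated company ID lists.
    INSERT INTO companies VALUES
        (7, 'main@example.com', 'cc@example.com', 'admin@example.com', 'hr@example.com');
    INSERT INTO claimadminregion VALUES
        ('7,8', 'claims-a@example.com'), ('17', 'claims-b@example.com');
    INSERT INTO losspreventionregion VALUES
        ('7', 'lp-a@example.com'), ('5,7', 'lp-b@example.com');
""")


def all_emails(conn, company_id):
    """Return every address for the company as one ';'-separated string, or None."""
    query = """
        SELECT c.Email || ';' || c.CCEmail || ';' || c.AdminEmail || ';' || c.HREmail
               || COALESCE(';' || (SELECT group_concat(r.ClaimAdministratorEmail, ';')
                                   FROM claimadminregion r
                                   WHERE ',' || r.companyid || ','
                                         LIKE '%,' || c.companyid || ',%'), '')
               || COALESCE(';' || (SELECT group_concat(r.LossPreventionPersonEmail, ';')
                                   FROM losspreventionregion r
                                   WHERE ',' || r.companyid || ','
                                         LIKE '%,' || c.companyid || ',%'), '')
               AS Email
        FROM companies c
        WHERE c.companyid = ?
    """
    row = conn.execute(query, (company_id,)).fetchone()
    return row[0] if row else None


print(all_emails(conn, 7))
```

In MySQL itself, CONCAT returns NULL as soon as one argument is NULL, so a similar guard (for example CONCAT_WS, which skips NULL arguments) may be worth considering when a company has no matching region rows.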

combining multiple sql queries together

I have multiple tables for a project (sessions, charges and payments).
To get the sessions I'm doing the following:
SELECT
sess.file_id, SUM(sess.rate * sess.length) AS total
FROM
sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
This returns the amount that a specific student should pay.
I also have another table, "charges":
SELECT
file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM
file_charges
GROUP BY file_charges.file_id
And finally the payment query :
SELECT
file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM
file_payments
GROUP BY file_payments.file_id
Can I combine those three so as to have:
Total = Payments - (Session + Charges)
Note that the total could be negative: a file_id may exist in sessions and charges but not in payments, and there may be a payment without any sessions or charges.
Edit : http://sqlfiddle.com/#!2/a90d9
One issue that needs to be addressed is whether one of these queries can be the "driver", in cases where we don't have rows for a given file_id returned by one or more of the queries (e.g. there might be rows from sess, but none from file_payments). If we want to be sure to include every possible file_id that appears in any of the queries, we can get a list of all possible file_id values with a query like this:
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
(NOTE: The UNION operator will remove any duplicates)
To get the specified resultset, we can use that query, along with "left joins" of the other three original queries. The outline of the query will be:
SELECT a.file_id, p.total_payment - ( s.total + c.total_charges)
FROM a
LEFT JOIN s ON s.file_id = a.file_id
LEFT JOIN c ON c.file_id = a.file_id
LEFT JOIN p ON p.file_id = a.file_id
ORDER BY a.file_id
In that statement a is a standin for the query that gets the set of all file_id values (as shown above). The s, c and p are standins for your three original queries, on sess, file_charges and file_payments, respectively.
If any of the file_id values is "missing" from any of the queries, we are going to need to substitute a zero for the missing value. We can use the IFNULL function to handle that for us.
This query should return the specified resultset:
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS t
FROM ( -- all possible values of file_id
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT
JOIN ( -- the amount that a specific student should pay
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT
JOIN ( -- charges
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT
JOIN ( -- payments
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id
(The EXPLAIN for this query is not going to be pretty, with four derived tables. On really large sets, performance may be horrendous. But the resultset returned should meet the specification.)
Beware of queries that JOIN all three tables together directly: they will likely give incorrect results when there are (for example) two or more rows for the same file_id in the file_payments table, because each of those rows is repeated for every matching row in the other tables.
There are other approaches to getting an equivalent result set, but the query above answers the question: "how can i get the results of these queries joined together into a total".
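To make the warning about joining the raw tables concrete, the sketch below runs both variants against a tiny in-memory SQLite database from Python. The sample rows are invented, and SQLite is used here only because it ships with Python; the pre-aggregated query mirrors the one above, while the direct three-table join is a deliberately wrong counterexample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sess (file_id INTEGER, rate REAL, length REAL, sessionDone INTEGER);
    CREATE TABLE file_charges (file_id INTEGER, price REAL);
    CREATE TABLE file_payments (file_id INTEGER, paymentAmount REAL);
    -- Hypothetical rows: file 1 has two finished sessions and two payments.
    INSERT INTO sess VALUES (1, 10, 2, 1), (1, 10, 1, 1), (2, 20, 1, 1), (1, 10, 5, 0);
    INSERT INTO file_charges VALUES (1, 5), (3, 7);
    INSERT INTO file_payments VALUES (1, 20), (1, 15), (4, 50);
""")

# Each table is summed on its own first, so no row is ever counted twice.
PREAGGREGATED = """
    SELECT a.file_id,
           IFNULL(p.total_payment, 0)
           - (IFNULL(s.total, 0) + IFNULL(c.total_charges, 0)) AS tot
    FROM (SELECT file_id FROM sess
          UNION SELECT file_id FROM file_charges
          UNION SELECT file_id FROM file_payments) a
    LEFT JOIN (SELECT file_id, SUM(rate * length) AS total
               FROM sess WHERE sessionDone = 1 GROUP BY file_id) s
           ON s.file_id = a.file_id
    LEFT JOIN (SELECT file_id, SUM(price) AS total_charges
               FROM file_charges GROUP BY file_id) c
           ON c.file_id = a.file_id
    LEFT JOIN (SELECT file_id, SUM(paymentAmount) AS total_payment
               FROM file_payments GROUP BY file_id) p
           ON p.file_id = a.file_id
    ORDER BY a.file_id
"""

# Counterexample: joining the raw rows multiplies them before SUM() runs.
DIRECT_JOIN = """
    SELECT s.file_id,
           SUM(p.paymentAmount) - (SUM(s.rate * s.length) + SUM(c.price)) AS tot
    FROM sess s
    JOIN file_charges c ON c.file_id = s.file_id
    JOIN file_payments p ON p.file_id = s.file_id
    WHERE s.sessionDone = 1
    GROUP BY s.file_id
"""

correct = conn.execute(PREAGGREGATED).fetchall()
inflated = conn.execute(DIRECT_JOIN).fetchall()
print("pre-aggregated:", correct)
print("direct join:   ", inflated)
```

For file 1 the direct join produces 2 sessions x 1 charge x 2 payments = 4 rows, so every amount is summed more than once, and files that are missing from any one table drop out of the result entirely.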
Using correlated subqueries
Here's another approach, using correlated subqueries in the SELECT list...
SELECT a.file_id
, IFNULL( ( SELECT SUM(file_payments.paymentAmount) FROM file_payments
WHERE file_payments.file_id = a.file_id )
,0)
- ( IFNULL( ( SELECT SUM(sess.rate * sess.length) FROM sess
WHERE sess.file_id = a.file_id AND sess.sessionDone = 1 )
,0)
+ IFNULL( ( SELECT SUM(file_charges.price) FROM file_charges
WHERE file_charges.file_id = a.file_id )
,0)
) AS tot
FROM ( -- all file_id values
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
ORDER BY a.file_id
Try this:
SELECT sess.file_id, SUM(file_payments.paymentAmount) - (SUM(sess.rate * sess.length)+SUM(file_charges.price)) as total_payment FROM sess , file_charges , file_payments
WHERE sess.sessionDone = 1
GROUP BY total_payment
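Edit: the query above has no join conditions between the three tables, so aggregate each table separately and join the results instead: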
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS tot
FROM (
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT JOIN (
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT JOIN (
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT JOIN (
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id

What's wrong with this query?

I'm selecting the total number of villages and the total population for each user from my tables to build statistics. However, something is wrong: the first row contains the grand totals (530 pop, which is the population of all villages together, and 106 villages, although there are 106 users in total), and the following rows are all NULL.
SELECT s1_users.id userid, (
SELECT count( s1_vdata.wref )
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = userid
)totalvillages, (
SELECT SUM( s1_vdata.pop )
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = userid
)pop
FROM s1_users
WHERE s1_users.dp >=0
ORDER BY s1_users.dp DESC
Try removing s1_users from the inner SELECTs.
You're already using inner joins: when you list tables separated by commas, it is a shortcut for INNER JOIN.
Now, the most obvious answer is that your subqueries using aggregate functions (COUNT and SUM) are missing GROUP BY clauses.
SELECT s1_users.id userid, (
SELECT count( s1_vdata.wref )
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = userid
GROUP BY s1_vdata.owner
)totalvillages, (
SELECT SUM( s1_vdata.pop )
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = userid
GROUP BY s1_vdata.owner
)pop
FROM s1_users
WHERE s1_users.dp >=0
ORDER BY s1_users.dp DESC
However, using subqueries in the column list is really inefficient. It causes the subqueries to be run once for each row of the outer query.
Try it like this instead:
SELECT
s1_users.id AS userid,
COUNT(s1_vdata.wref) AS totalvillages,
SUM(s1_vdata.pop) AS pop
FROM
s1_users, s1_vdata -- I'm cheating here! There's a hidden INNER JOIN in this line ;P
WHERE
s1_users.dp >= 0
AND s1_users.id = s1_vdata.owner
GROUP BY
s1_users.id
ORDER BY
s1_users.dp DESC
SELECT s1_users.id AS userid,
(
SELECT COUNT(*)
FROM s1_vdata
WHERE s1_vdata.owner = userid
) AS totalvillages,
(
SELECT SUM(pop)
FROM s1_vdata
WHERE s1_vdata.owner = userid
) AS pop
FROM s1_users
WHERE dp >= 0
ORDER BY
dp DESC
Note that this is less efficient than this query:
SELECT s1_users.id AS user_id, COUNT(s1_vdata.owner) AS totalvillages, SUM(s1_vdata.pop) AS pop
FROM s1_users
LEFT JOIN
s1_vdata
ON s1_vdata.owner = s1_users.id
WHERE s1_users.dp >= 0
GROUP BY
s1_users.id
ORDER BY
s1_users.dp DESC
since the aggregation needs to be done twice in the former.
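As a quick sanity check that the two formulations agree, the sketch below runs a correlated-subquery version and a LEFT JOIN plus GROUP BY version side by side using Python's sqlite3 module. The table contents are invented, table aliases u and v are added for brevity, and both queries keep the dp >= 0 filter from the question; it says nothing about relative speed on a real MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s1_users (id INTEGER PRIMARY KEY, dp INTEGER);
    CREATE TABLE s1_vdata (wref INTEGER PRIMARY KEY, owner INTEGER, pop INTEGER);
    -- Hypothetical data: user 3 has negative dp, user 4 owns no villages.
    INSERT INTO s1_users VALUES (1, 5), (2, 3), (3, -1), (4, 0);
    INSERT INTO s1_vdata VALUES (100, 1, 10), (101, 1, 20), (102, 2, 7), (103, 3, 99);
""")

# One subquery per output column, each evaluated for every outer row.
CORRELATED = """
    SELECT u.id AS userid,
           (SELECT COUNT(*) FROM s1_vdata v WHERE v.owner = u.id) AS totalvillages,
           (SELECT SUM(v.pop) FROM s1_vdata v WHERE v.owner = u.id) AS pop
    FROM s1_users u
    WHERE u.dp >= 0
    ORDER BY u.dp DESC
"""

# A single pass: join once, then aggregate per user.
JOINED = """
    SELECT u.id AS userid, COUNT(v.owner) AS totalvillages, SUM(v.pop) AS pop
    FROM s1_users u
    LEFT JOIN s1_vdata v ON v.owner = u.id
    WHERE u.dp >= 0
    GROUP BY u.id
    ORDER BY u.dp DESC
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
print(correlated_rows)
print(joined_rows)
```

Counting v.owner rather than * matters in the joined version: a user without villages still produces one padded row from the LEFT JOIN, and COUNT(v.owner) ignores its NULL so the village count stays at 0.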
SELECT tabl1.userid, totalvillages, pop FROM
(
SELECT s1_users.id AS userid, COUNT( s1_vdata.wref ) AS totalvillages
FROM s1_vdata, s1_users
WHERE s1_vdata.owner = s1_users.id
GROUP BY s1_users.id) tabl1 INNER JOIN
(
SELECT s1_users.id AS userid, SUM( s1_vdata.pop ) AS pop
FROM s1_users, s1_vdata
WHERE s1_vdata.owner = s1_users.id
GROUP BY s1_users.id) tabl2 ON tabl1.userid = tabl2.userid