Sum columns depending on another column value

I'm having trouble summing one field's values based on another field's value.
I need SUM(activities.points) split by activities.activity_type: one total for used_points and one for added_points, returned AS used_points and added_points.
Table activities:
id | subscription_id | activity_type | points
--------------------------------------------------
1 | 1 | used_points | 10
2 | 1 | used_points | 50
3 | 1 | added_points | 20
4 | 1 | added_points | 30
5 | 2 | used_points | 20
6 | 2 | used_points | 45
7 | 2 | added_points | 45
8 | 2 | added_points | 45
Table subscriptions:
id | name | current_points
-------------------------------------
1 | card_1 | 700
2 | card_2 | 900
What I need:
name | current_points | used_points | added_points
-----------------------------------------------------------
card_1 | 700 | 60 | 50
card_2 | 900 | 65 | 90
What I tried :
SELECT
subscriptions.name,
subscriptions.current_points,
IF(activities.activity_type="used_points", SUM(activities.points), null)
AS used_points,
IF(activities.activity_type="added_points", SUM(activities.points), null)
AS added_points
FROM activities
JOIN subscriptions
ON activities.subscription.id = subscription.id
GROUP BY subscriptions.name
Which is wrong.
Thanks

You want to use SUM(IF( )). You want to add up the values returned from the IF. You want that IF expression to be evaluated for each individual row. Then, use the SUM aggregate to add up the value returned for each row.
Remove the SUM aggregate from inside the IF expression and instead, wrap the IF inside a SUM.
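As a minimal sketch of wrapping the condition inside the SUM, the snippet below loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. Using SQLite instead of a MySQL server is an assumption made so the example runs anywhere, and CASE WHEN stands in for MySQL's IF() because it works in both. The function name summarize is hypothetical.

```python
import sqlite3


def summarize():
    """Return (name, current_points, used_points, added_points) per subscription."""
    con = sqlite3.connect(":memory:")
    # Sample data copied from the question.
    con.executescript("""
        CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, name TEXT, current_points INTEGER);
        CREATE TABLE activities (id INTEGER PRIMARY KEY, subscription_id INTEGER,
                                 activity_type TEXT, points INTEGER);
        INSERT INTO subscriptions VALUES (1, 'card_1', 700), (2, 'card_2', 900);
        INSERT INTO activities VALUES
            (1, 1, 'used_points', 10), (2, 1, 'used_points', 50),
            (3, 1, 'added_points', 20), (4, 1, 'added_points', 30),
            (5, 2, 'used_points', 20), (6, 2, 'used_points', 45),
            (7, 2, 'added_points', 45), (8, 2, 'added_points', 45);
    """)
    # The condition is evaluated per row; SUM then adds up the per-row results.
    rows = con.execute("""
        SELECT s.name,
               s.current_points,
               SUM(CASE WHEN a.activity_type = 'used_points' THEN a.points ELSE 0 END) AS used_points,
               SUM(CASE WHEN a.activity_type = 'added_points' THEN a.points ELSE 0 END) AS added_points
        FROM subscriptions s
        JOIN activities a ON a.subscription_id = s.id
        GROUP BY s.name, s.current_points
        ORDER BY s.name
    """).fetchall()
    con.close()
    return rows


print(summarize())  # [('card_1', 700, 60, 50), ('card_2', 900, 65, 90)]
```

In MySQL the same shape can be written as SUM(IF(a.activity_type='used_points', a.points, 0)).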
Followup
Q But why SUM() inside of IF doesn't work ?
A Well, it does work. It's just not working the way you want it to work.
The MySQL SUM function is an "aggregate" function. It aggregates rows together, and returns a single value.
For an expression of this form: IF(col='foo',SUM(numcol),0)
What MySQL is doing is aggregating all the rows into the SUM, and returning a single value.
Other databases would pitch a fit and throw an error over the reference to the non-aggregate col in that expression. MySQL is more lenient (as long as the ONLY_FULL_GROUP_BY SQL mode is not enabled) and treats the col reference as if it were an aggregate (like MIN(col) or MAX(col)), working on a group of rows and returning a single value. In this case, MySQL is selecting a single sample row. (Which row gets "chosen" as the sample row is not deterministic.) So that reference to col is sort of like a GET_VALUE_FROM_SAMPLE_ROW(col). Once the aggregates are completed, that IF expression gets evaluated once.
If you start with this query, this is the set of rows you want to operate on.
SELECT s.name
, s.current_points
, a.activity_type
, a.points
, IF(a.activity_type='used_points',a.points,NULL) AS used_points
, IF(a.activity_type='added_points',a.points,NULL) AS added_points
FROM subscriptions s
JOIN activities a
ON a.subscription_id = s.id
When you add a GROUP BY clause, that's going to aggregate some of those rows together. What you will get back for the non-aggregates is values from a sample row.
Try adding GROUP BY s.name to the query, and see what is returned.
Also try adding in some aggregates, such as SUM(a.points)
SELECT s.name
, s.current_points
, a.activity_type
, a.points
, IF(a.activity_type='used_points',a.points,NULL) AS used_points
, IF(a.activity_type='added_points',a.points,NULL) AS added_points
, SUM(a.points) AS total_points
FROM subscriptions s
JOIN activities a
ON a.subscription_id = s.id
GROUP BY s.name
Finally, we can add in the expressions in your query into the SELECT list:
, IF(a.activity_type='used_points',SUM(a.points),NULL) AS if_used_sum
, IF(a.activity_type='added_points',SUM(a.points),NULL) AS if_added_sum
What we discover is that the value returned from these expressions will either be SUM(a.points), which will match total_points, or it will be NULL. We can also see the value of the activity_type column, retrieved from a single sample row for each group, and we can see that this expression is "working"; it's just not doing what you really want to happen: for the conditional test to run on each individual row, returning a value for points or a NULL, and then summing that up for the group.

Your code is only slightly out:
SELECT
subscriptions.name,
subscriptions.current_points,
SUM(IF(activities.activity_type="used_points", activities.points, 0))
AS used_points,
SUM(IF(activities.activity_type="added_points", activities.points, 0))
AS added_points
FROM activities
JOIN subscriptions
ON activities.subscription_id = subscriptions.id
GROUP BY subscriptions.name, subscriptions.current_points
Note the fixed typo in the second last line - you wrote subscription.id instead of subscription_id. Also you only grouped by name instead of name and current_points, not sure if that's allowed in mysql (I use T-SQL), it's good practice to have it there anyway.

Well, I did it not using the IF statement. Here's the example (http://sqlfiddle.com/#!2/076c3f/12):
SELECT
subs.name,
subs.current_points,
(SELECT SUM(points) FROM activities WHERE type = 1 AND subs_id = subs.id) AS used_points,
(SELECT SUM(points) FROM activities WHERE type = 2 AND subs_id = subs.id) AS added_points
FROM subs
NOTE: I changed the type from VARCHAR to INT to simplify.

Try changing
IF(activities.activity_type="used_points", SUM(activities.points), null)
AS used_points,
IF(activities.activity_type="added_points", SUM(activities.points), null)
AS added_points
to
SUM(IF(activities.activity_type="used_points", activities.points, 0))
AS used_points,
SUM(IF(activities.activity_type="added_points", activities.points, 0))
AS added_points
This way the condition is checked on each row, and either that row's points or 0 is added to the sum.

To sum a column of integer values (c1) based on another column of character values (c2), counting only the rows where c2 is neither empty nor NULL, the code below will help.
SELECT SUM(c1) FROM table_name WHERE c2 <> '' AND c2 IS NOT NULL

Related

MySQL COUNT(DISTINCT) giving wrong values with GROUP BY

I have a table that contains custom user analytics data. I was able to pull the number of unique users with a query:
SELECT COUNT(DISTINCT(user_id)) AS 'unique_users'
FROM `events`
WHERE client_id = 123
And this will return 16728
This table also has a column of type DATETIME that I would like to group the counts by. However, if I add a GROUP BY to the end of it, everything groups properly it seems except the totals don't match. My new query is this:
SELECT COUNT(DISTINCT(user_id)) AS 'unique_users', DATE(server_stamp) AS 'date'
FROM `events`
WHERE client_id = 123
GROUP BY DATE(server_stamp)
Now I get the following values:
|-----------------------------|
| unique_users | date |
|---------------|-------------|
| 2650 | 2019-08-26 |
| 3486 | 2019-08-27 |
| 3475 | 2019-08-28 |
| 3631 | 2019-08-29 |
| 3492 | 2019-08-30 |
|-----------------------------|
These total 16734. I tried using a subquery to get the distinct users and then counting and grouping in the main query, but had no luck. Any help would be greatly appreciated. Let me know if further information would help with the diagnosis.

A user who is connected with events on multiple days (e.g. a session that starts before midnight and ends after it) will be counted once for each of those days in the new query. This is because the first query performs the DISTINCT over all rows at once, while the second only removes duplicates inside each group. Identical values in different groups stay untouched.
So if you combine a DISTINCT inside the COUNT with a GROUP BY, the grouping is applied first and the DISTINCT is then evaluated within each group. Thus, without further restrictions, you cannot assume that the COUNT(DISTINCT user_id) of the first query equals the sum of the COUNT(DISTINCT user_id) values over all groups.
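The effect can be reproduced with three rows. The sketch below uses an in-memory SQLite database via Python's sqlite3 module (an assumption, chosen so it runs without a MySQL server) and made-up data in which user 1 has events on both sides of midnight.

```python
import sqlite3

# Tiny illustrative data set (not the asker's data): user 1 has events on two days.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (user_id INTEGER, server_stamp TEXT)")
con.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2019-08-26 23:59:00"),
        (1, "2019-08-27 00:01:00"),
        (2, "2019-08-26 12:00:00"),
    ],
)

# DISTINCT over all rows at once: each user is counted once.
overall = con.execute("SELECT COUNT(DISTINCT user_id) FROM events").fetchone()[0]

# DISTINCT within each day: user 1 is counted on both days.
per_day = con.execute(
    """
    SELECT DATE(server_stamp) AS day, COUNT(DISTINCT user_id)
    FROM events
    GROUP BY DATE(server_stamp)
    ORDER BY day
    """
).fetchall()

print(overall)                     # 2
print(per_day)                     # [('2019-08-26', 2), ('2019-08-27', 1)]
print(sum(n for _, n in per_day))  # 3
```

The per-day counts add up to 3 although there are only 2 distinct users, which is the same kind of gap as 16734 versus 16728 in the question.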

Xandor is absolutely correct. If a user logged in on 2 different days, there is no way your 2nd query can remove them. If you need the data grouped by date, you can try the query below, which counts each user only on the first day they appear:
SELECT COUNT(user_id) AS 'unique_users', MIN_DATE AS 'date'
FROM (SELECT user_id, MIN(DATE(server_stamp)) MIN_DATE -- Might be MAX
FROM `events`
WHERE client_id = 123
GROUP BY user_id) X
GROUP BY MIN_DATE;
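A small sketch of that idea, attributing every user to the first day on which they appear so that the daily counts add up to the overall distinct count. It runs against in-memory SQLite through Python's sqlite3 module (an assumption, not the asker's MySQL setup); the function name first_day_counts and the sample rows are hypothetical.

```python
import sqlite3


def first_day_counts(events):
    """Count each user once, on the earliest date they have an event."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE events (user_id INTEGER, server_stamp TEXT)")
    con.executemany("INSERT INTO events VALUES (?, ?)", events)
    rows = con.execute(
        """
        SELECT first_day, COUNT(*)
        FROM (SELECT user_id, MIN(DATE(server_stamp)) AS first_day
              FROM events
              GROUP BY user_id) x
        GROUP BY first_day
        ORDER BY first_day
        """
    ).fetchall()
    con.close()
    return rows


sample = [
    (1, "2019-08-26 23:59:00"),
    (1, "2019-08-27 00:01:00"),  # same user again on the next day
    (2, "2019-08-26 12:00:00"),
    (3, "2019-08-27 09:30:00"),
]
print(first_day_counts(sample))  # [('2019-08-26', 2), ('2019-08-27', 1)]
```

With this data the daily counts sum to 3, matching the 3 distinct users, because user 1 is no longer counted a second time on 2019-08-27.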

ORDER BY does not work if COUNT is used

I have a table with following content
loan_application
+----+---------+
| id | user_id |
+----+---------+
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
+----+---------+
I want to fetch 3rd record only if there are 3 records available, in this case i want id 3 and total count must be 3, here is what i expect
+--------------+----+
| COUNT(la.id) | id |
+--------------+----+
| 3 | 3 |
+--------------+----+
Here is the query i tried.
SELECT COUNT(la.id), la.id FROM loan_application la HAVING COUNT(la.id) = 3 ORDER BY la.id DESC;
However this gives me following result
+--------------+----+
| COUNT(la.id) | id |
+--------------+----+
| 3 | 1 |
+--------------+----+
The problem is that it returns id 1 even if i use order by id descending, whereas i am expecting the id to have value of 3, where am i going wrong ?
Thanks.

In your case you can use this query:
SELECT COUNT(la.id), max(la.id) FROM loan_application la
GROUP BY user_id
I tried it against your table in my MySQL database.

When you have an aggregate function (in this instance COUNT()) in the select list without a GROUP BY clause, MySQL returns a single record only, with the function applied to the whole table.
Under certain configuration settings, MySQL allows you to include fields in the select list which are neither in the GROUP BY clause nor aggregated. MySQL pretty much picks up the first value it encounters while scanning the data as the value for such fields; in your case, the value 1 for id.
If you want to fetch the record where id=count of records within the table, then I would use the following query:
select *
from loan_application
join (select count(*) as numrows from loan_application) t
where id=t.numrows and t.numrows=3
However, this implies that the values within the id field are continuous and there are no gaps.

You are selecting la.id along with an aggregated function (COUNT). So after iterating the first record the la.id is selected but the count goes on. So in this case you will get the first la.id not the last. In order to get the last la.id you need to use the max function on that field.
Here's the updated query:
SELECT
COUNT(la.id),
MAX(la.id)
FROM
loan_application la
GROUP BY user_id
HAVING
COUNT(la.id) = 3
N.B.: Your original query uses COUNT without a GROUP BY clause, so the aggregate function is applied to the whole table.
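As a quick check of the COUNT plus MAX approach, the sketch below runs it against the question's rows in an in-memory SQLite database via Python's sqlite3 module (an assumption; the asker uses MySQL). The two extra rows for user 11 are made up to show the HAVING filter at work, and the function name count_and_last_id is hypothetical.

```python
import sqlite3


def count_and_last_id():
    """Return (count, highest id) for every user with exactly 3 applications."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE loan_application (id INTEGER PRIMARY KEY, user_id INTEGER)")
    con.executemany(
        "INSERT INTO loan_application VALUES (?, ?)",
        [(1, 10), (2, 10), (3, 10),  # rows from the question
         (4, 11), (5, 11)],          # made-up user with only 2 rows
    )
    rows = con.execute(
        """
        SELECT COUNT(la.id), MAX(la.id)
        FROM loan_application la
        GROUP BY la.user_id
        HAVING COUNT(la.id) = 3
        """
    ).fetchall()
    con.close()
    return rows


print(count_and_last_id())  # [(3, 3)]
```

MAX(la.id) is an aggregate, so its value does not depend on which row the engine happens to read first, unlike a bare la.id in the select list.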

how to find duplicate count without counting original

I need to count the number of duplicate emails in a mysql database, but without counting the first one (considered the original). In this table, the query result should be the single value "3" (2 duplicate x@q.com plus 1 duplicate f@q.com).
TABLE
ID | Name | Email
1 | Mike | x@q.com
2 | Peter | p@q.com
3 | Mike | x@q.com
4 | Mike | x@q.com
5 | Frank | f@q.com
6 | Jim | f@q.com
My current query produces not one number, but multiple rows, one per email address regardless of how many duplicates of this email are in the table:
SELECT value, count(lds1.leadid) FROM leads_form_element lds1 LEFT JOIN leads lds2 ON lds1.leadID = lds2.leadID
WHERE lds2.typesID = "31" AND lds1.formElementID = '97'
GROUP BY lds1.value HAVING ( COUNT(lds1.value) > 1 )

It's not one query so I'm not sure if it would work in your case, but you could do one query to select the total number of rows, a second query to select distinct email addresses, and subtract the two. This would give you the total number of duplicates...
select count(*) from someTable;
select count(distinct Email) from someTable;
In fact, I don't know if this will work, but you could try doing it all in one query:
select (count(*)-(count(distinct Email))) from someTable
Like I said, untested, but let me know if it works for you.

Try doing a group by in a sub query and then summing up. Something like:
select sum(tot)
from
(
select email, count(1)-1 as tot
from your_table
group by email
having count(1) > 1
) dups
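Both suggestions can be compared on the question's six rows. The sketch below is a hedged check using an in-memory SQLite database through Python's sqlite3 module (an assumption; the question targets MySQL), with a hypothetical table name leads_demo.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE leads_demo (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
con.executemany(
    "INSERT INTO leads_demo VALUES (?, ?, ?)",
    [
        (1, "Mike", "x@q.com"),
        (2, "Peter", "p@q.com"),
        (3, "Mike", "x@q.com"),
        (4, "Mike", "x@q.com"),
        (5, "Frank", "f@q.com"),
        (6, "Jim", "f@q.com"),
    ],
)

# Approach 1: total rows minus distinct emails.
by_subtraction = con.execute(
    "SELECT COUNT(*) - COUNT(DISTINCT email) FROM leads_demo"
).fetchone()[0]

# Approach 2: per-email surplus (count - 1), summed over emails that repeat.
by_subquery = con.execute(
    """
    SELECT SUM(tot)
    FROM (SELECT email, COUNT(*) - 1 AS tot
          FROM leads_demo
          GROUP BY email
          HAVING COUNT(*) > 1) dups
    """
).fetchone()[0]

print(by_subtraction, by_subquery)  # 3 3
```

One difference worth keeping in mind: COUNT(DISTINCT email) ignores NULLs, so the subtraction counts every row with a NULL email as a duplicate, while the grouped version treats all NULL emails as one group.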

MySQL join query or sub query

I'm trying to do a query that selects mike if it isn't in the three highest bids for a keyword. Rows 4 and 7 should be selected.
So in final, if mike isn't in the three highest bids for a keyword, then select.
How do I solve this? With a sub query?
$construct = "SELECT child.* FROM `temp-advertise` child
LEFT JOIN `temp-advertise` parent on child.keyword=parent.keyword
WHERE child.name='mike'
ORDER BY child.id DESC";
id | name| keyword | bid |
1 | mike| one | 7 |
2 | tom | one | 4 |
3 | ced | one | 6 |
4 | mike| two | 1 |
5 | tom | two | 5 |
6 | har | two | 5 |
7 | mike| one | 3 |
8 | har | two | 3 |

SELECT *
FROM `temp-advertise` ta
WHERE ta.keyword = 'one'
AND ta.name = 'mike'
AND ta.bid <
(
SELECT bid
FROM `temp-advertise` tai
WHERE tai.keyword = 'one'
ORDER BY
bid DESC
LIMIT 2, 1
)

Your structure doesn't look too promising, nor your sample data. However, that said, you want to know if "Mike" was in the top 3 per keyword... and that he has 3 bids.... 2 for "one", 1 for "two". From the raw data, it looks like Mike is in 1st place and 4th place for the "one" keyword, and 4th place for "two" keyword.
This should get you what you need with SOME respect to not doing a full query of all keywords. The first innermost query is to just get keywords bid on by "mike" (hence alias "JustMike"). Then join that to the temp-advertise on ONLY THOSE keywords.
Next, by using MySQL variables, we can keep track of the rank PER KEYWORD. The trick is the ORDER BY clause needs to return them in the order that represents proper ranking. In this case, each keyword first, then within each keyword, ordered by highest bid first.
By querying the records and then using the @variables, we increase the counter, restarting at 1 every time the keyword changes, and then preserve the keyword in the @grpKeyword variable for comparison with the next record. Once ALL bids are processed for the respective keywords, the outer query reads THAT result but ONLY for the rows bid on by "mike". These records will have whatever his rank position was.
select RankPerKeyword.*
from
( SELECT ta.*,
@grpCnt := if( @grpKeyword = ta.Keyword, @grpCnt +1, 1 ) as KWRank,
@grpKeyword := ta.Keyword as carryForward
FROM
( select distinct ta1.keyword
from `temp-advertise` ta1
where ta1.name = "mike" ) as JustMike
JOIN `temp-advertise` ta
on JustMike.Keyword = ta.Keyword,
( select @grpCnt := 0,
@grpKeyword := '' ) SqlVars
ORDER BY
ta.Keyword,
ta.Bid DESC ) RankPerKeyword
where
RankPerKeyword.name = "mike"
where
RankPerKeyword.name = "mike"
(Run above to just preview the results... should show 3 records)
So, if you want to know if it was WITHIN the top 3 for a keyword you could just change to
select RankPerKeyword.keyword, MIN( RankPerKeyword.KWRank ) as BestRank
from (rest of query)
group by RankPerKeyword.Keyword

Try this:
Select ID, name, keyword from `temp-advertise` e
where e.name = 'mike'
and 3 <= (select count(name) from `temp-advertise`
where e.keyword = keyword and bid > e.bid)
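The counting idea can be tried on the question's eight rows: a bid is outside the top three when at least three bids for the same keyword are higher. This is a sketch under assumptions, run in an in-memory SQLite database via Python's sqlite3 module rather than MySQL, with the table renamed to temp_advertise (hypothetical) to avoid quoting the hyphen and with an explicit filter on the name.

```python
import sqlite3


def mike_outside_top_three():
    """Return ids of mike's bids that have at least 3 higher bids on the same keyword."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE temp_advertise (id INTEGER PRIMARY KEY, name TEXT, keyword TEXT, bid INTEGER)"
    )
    con.executemany(
        "INSERT INTO temp_advertise VALUES (?, ?, ?, ?)",
        [
            (1, "mike", "one", 7), (2, "tom", "one", 4), (3, "ced", "one", 6),
            (4, "mike", "two", 1), (5, "tom", "two", 5), (6, "har", "two", 5),
            (7, "mike", "one", 3), (8, "har", "two", 3),
        ],
    )
    rows = con.execute(
        """
        SELECT e.id
        FROM temp_advertise e
        WHERE e.name = 'mike'
          AND 3 <= (SELECT COUNT(*)
                    FROM temp_advertise o
                    WHERE o.keyword = e.keyword
                      AND o.bid > e.bid)
        ORDER BY e.id
        """
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


print(mike_outside_top_three())  # [4, 7]
```

Rows 4 and 7 come back, as the question asks. Tied bids count as equal here, so a bid that ties for third place is still treated as being in the top three.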

Try
SELECT .. ORDER BY bid LIMIT 3,999

MySQL - Exclude rows from Select based on duplication of two columns

I am attempting to narrow results of an existing complex query based on conditional matches on multiple columns within the returned data set. I'll attempt to simplify the data as much as possible here.
Assume that the following table structure represents the data that my existing complex query has already selected (here ordered by date):
+----+-----------+------+------------+
| id | remote_id | type | date |
+----+-----------+------+------------+
| 1 | 1 | A | 2011-01-01 |
| 3 | 1 | A | 2011-01-07 |
| 5 | 1 | B | 2011-01-07 |
| 4 | 1 | A | 2011-05-01 |
+----+-----------+------+------------+
I need to select from that data set based on the following criteria:
If the pairing of remote_id and type is unique to the set, return the row always
If the pairing of remote_id and type is not unique to the set, take the following action:
Of the sets of rows for which the pairing of remote_id and type are not unique, return only the single row for which date is greatest and still less than or equal to now.
So, if today is 2011-01-10, I'd like the data set returned to be:
+----+-----------+------+------------+
| id | remote_id | type | date |
+----+-----------+------+------------+
| 3 | 1 | A | 2011-01-07 |
| 5 | 1 | B | 2011-01-07 |
+----+-----------+------+------------+
For some reason I'm having no luck wrapping my head around this one. I suspect the answer lies in good application of group by, but I just can't grasp it. Any help is greatly appreciated!

/* Rows with exactly one date - always return regardless of when date occurs */
SELECT id, remote_id, type, date
FROM YourTable
GROUP BY remote_id, type
HAVING COUNT(*) = 1
UNION
/* Rows with more than one date - Return Max date <= NOW */
SELECT yt.id, yt.remote_id, yt.type, yt.date
FROM YourTable yt
INNER JOIN (SELECT remote_id, type, max(date) as maxdate
FROM YourTable
WHERE date <= DATE(NOW())
GROUP BY remote_id, type
HAVING COUNT(*) > 1) sq
ON yt.remote_id = sq.remote_id
AND yt.type = sq.type
AND yt.date = sq.maxdate

The GROUP BY clause groups all rows that have identical values in one or more columns together and returns one row in the result set for them. If you use aggregate functions (min, max, sum, avg etc.), they will be applied to each "group".
SELECT id, remote_id, type, max(date)
FROM blah
GROUP BY remote_id, type;
I'm not sure where today's date comes in, but I assumed that was part of the complex query that you didn't describe and isn't directly relevant to your question here.

Try this:
SELECT a.*
FROM YourTable a INNER JOIN
(
select remote_id, type, MAX(date) date, COUNT(1) cnt from YourTable
group by remote_id, type
) b
ON a.remote_id = b.remote_id
AND a.type = b.type
AND a.date = b.date
AND ( (b.cnt = 1) OR (b.cnt>1 AND b.date <= DATE(NOW())))
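One caveat with joining on the overall MAX(date): a group whose newest row lies in the future would match nothing. A possible variant, sketched here and not taken from any of the answers, computes the latest date up to today inside the grouped subquery. It runs in an in-memory SQLite database via Python's sqlite3 module (an assumption), takes today's date as a parameter instead of calling NOW(), and uses a hypothetical table name items and function name pick_rows.

```python
import sqlite3


def pick_rows(today):
    """Return ids: unique (remote_id, type) pairs always, otherwise the latest row dated <= today."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, remote_id INTEGER, type TEXT, date TEXT)"
    )
    con.executemany(
        "INSERT INTO items VALUES (?, ?, ?, ?)",
        [
            (1, 1, "A", "2011-01-01"),
            (3, 1, "A", "2011-01-07"),
            (5, 1, "B", "2011-01-07"),
            (4, 1, "A", "2011-05-01"),
        ],
    )
    rows = con.execute(
        """
        SELECT t.id
        FROM items t
        JOIN (SELECT remote_id, type,
                     COUNT(*) AS cnt,
                     -- latest date that is not in the future (NULL if there is none)
                     MAX(CASE WHEN date <= :today THEN date END) AS latest_past
              FROM items
              GROUP BY remote_id, type) g
          ON g.remote_id = t.remote_id AND g.type = t.type
        WHERE g.cnt = 1 OR t.date = g.latest_past
        ORDER BY t.id
        """,
        {"today": today},
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


print(pick_rows("2011-01-10"))  # [3, 5]
```

The dates are stored as ISO strings, which compare correctly as text; with a real DATE column in MySQL the CASE expression would compare against CURDATE() instead.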

Try this:
select id, remote_id, type, MAX(date) from YourTable
group by remote_id, type

Hey Carson! You could try using the "distinct" keyword on those two fields, and in a union you can use Count() along with group by and some operators to pull non-unique (greatest and less-than today) records!
