In MySQL - What is the difference between using SUM or COUNT?
SELECT SUM(USER_NAME = 'JoeBlow')
SELECT COUNT(USER_NAME = 'JoeBlow')
To answer the OP's question more directly and literally, consider what happens if you total integers in your column instead of strings.
+----+------+
| id | vote |
+----+------+
| 1 | 1 |
| 2 | -1 |
| 3 | 1 |
| 4 | -1 |
| 5 | 1 |
+----+------+
COUNT = 5 votes
SUM = 1 vote
(-2 + 3 = 1)
SUM computes the mathematical sum, whereas COUNT simply counts every non-NULL value as 1, regardless of its data type.
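The vote table above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name votes is an assumption, since the answer does not name the table.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (id INTEGER PRIMARY KEY, vote INTEGER)")  # hypothetical table name
conn.executemany(
    "INSERT INTO votes (id, vote) VALUES (?, ?)",
    [(1, 1), (2, -1), (3, 1), (4, -1), (5, 1)],
)

# COUNT counts the non-NULL values; SUM adds them up.
vote_count, vote_sum = conn.execute(
    "SELECT COUNT(vote), SUM(vote) FROM votes"
).fetchone()
print(vote_count, vote_sum)
```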
It is a big difference because the result is not the same.
The first query returns the number of times the condition is true, because true is 1 and false is 0.
The second query returns the complete record count (for rows where USER_NAME is not NULL), because COUNT() does not care about the content inside it as long as the content is NOT NULL: 1 and 0 are both values, so both get counted.
To get the correct return value from the second query, you would have to make the condition evaluate to NULL (instead of 0) so that non-matching rows are not counted. Like this:
SELECT COUNT(case when USER_NAME = 'JoeBlow' then 'no matter what' else NULL end)
from your_table
Or simply omit the ELSE part of the CASE expression, which makes the result NULL by default.
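The three variants can be compared side by side. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, since both evaluate a comparison to 1 or 0. The sample names other than 'JoeBlow' are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (user_name TEXT)")
conn.executemany(
    "INSERT INTO your_table (user_name) VALUES (?)",
    [("JoeBlow",), ("JaneDoe",), ("JoeBlow",), ("Sam",)],  # hypothetical sample rows
)

sum_matches, count_all, count_matches = conn.execute("""
    SELECT SUM(user_name = 'JoeBlow'),                       -- adds up the 1s and 0s
           COUNT(user_name = 'JoeBlow'),                     -- counts every non-NULL 1 or 0
           COUNT(CASE WHEN user_name = 'JoeBlow' THEN 1 END) -- non-matches become NULL
    FROM your_table
""").fetchone()
print(sum_matches, count_all, count_matches)
```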
I guess COUNT() returns the number of non-NULL values in a column, whereas SUM() returns the sum of the values in the column.
select count(field) from table
is slower than
select sum(1) from table
Consider using the second option, but only if field is never NULL, since otherwise the two queries count different things.
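Speed aside, it is worth checking what each of these two aggregates returns. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL. The table t and its values are made up. It shows that COUNT(field) skips NULLs, while SUM(1) counts every row and yields NULL rather than 0 on an empty table.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field INTEGER)")  # hypothetical table

# On an empty table COUNT gives 0, but SUM gives NULL (None in Python).
empty = conn.execute("SELECT COUNT(field), SUM(1) FROM t").fetchone()

# With one NULL among three rows, COUNT(field) skips it; SUM(1) does not.
conn.executemany("INSERT INTO t (field) VALUES (?)", [(10,), (None,), (30,)])
filled = conn.execute("SELECT COUNT(field), SUM(1) FROM t").fetchone()
print(empty, filled)
```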
Related
Goal: I want to create a SELECT query whose result contains all records from both tables, except the time slot. In addition, if the minutes of parking are 0 or NULL, the value of the field should be set to -1.
Progress: So far I have merged the two tables and can set the 0 value to -1. Because I am quite new to SQL, I couldn't find a way to keep the original values for minutes and integrate a 'When Null Then -1' clause. Many solutions suggest an UPDATE query, but the operation needs to be in a SELECT result. MySQL 2017. This is my code so far:
Select c.ID, c.status, c.Date, Case When c.Minutes = 0 Then -1 End as Minutes
From Customer_1 as c
Union
Select c1.ID, c1.status, c1.Date, Case When c1.Minutes = 0 Then -1 End as Minutes
From Customer_2 as c1
Original Dataset: I have two tables with exactly the same column names, each keyed by user ID.
Customer_1:
ID| Date| Minutes| Time| status
1 | 2019| 3 | 2019| A
2 | 2019| 0 | 2019| A
Customer_2:
ID| Date| Minutes| Time| status
3 | 2019| Null | 2019| A
4 | 2019| 0 | 2019| A
What the final query should look like:
ID| Date| Minutes| status
1 | 2019| 3 | A
2 | 2019| -1 | A
3 | 2019| -1 | A
4 | 2019| -1 | A
Any suggestion on how to build a working query that fulfills these criteria would be much appreciated!
You just need the COALESCE() function and an ELSE part in your CASE .. WHEN expression:
Select ID, status, Date, Case When Coalesce(Minutes,0) = 0 Then -1 Else Minutes End as Minutes
From Customer_1
Union
Select ID, status, Date, Case When Coalesce(Minutes,0) = 0 Then -1 Else Minutes End
From Customer_2
Using aliases for the tables is redundant in this case, since the two queries are independent apart from the UNION. The alias (Minutes) on the second query is also redundant, because a UNION takes its column names from the first SELECT.
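The CASE and COALESCE query can be tried against the sample data from the question. This is a sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL. The column types are guesses, and an ORDER BY ID is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
for name in ("Customer_1", "Customer_2"):
    # Column types are assumptions; the question only shows the column names.
    conn.execute(
        f"CREATE TABLE {name} "
        "(ID INTEGER, Date INTEGER, Minutes INTEGER, Time INTEGER, status TEXT)"
    )
conn.executemany(
    "INSERT INTO Customer_1 VALUES (?, ?, ?, ?, ?)",
    [(1, 2019, 3, 2019, "A"), (2, 2019, 0, 2019, "A")],
)
conn.executemany(
    "INSERT INTO Customer_2 VALUES (?, ?, ?, ?, ?)",
    [(3, 2019, None, 2019, "A"), (4, 2019, 0, 2019, "A")],
)

# COALESCE turns NULL into 0, so one comparison covers both cases;
# the ELSE branch keeps every other value unchanged.
rows = conn.execute("""
    SELECT ID, Date,
           CASE WHEN COALESCE(Minutes, 0) = 0 THEN -1 ELSE Minutes END AS Minutes,
           status
    FROM Customer_1
    UNION
    SELECT ID, Date,
           CASE WHEN COALESCE(Minutes, 0) = 0 THEN -1 ELSE Minutes END,
           status
    FROM Customer_2
    ORDER BY ID
""").fetchall()
for row in rows:
    print(row)
```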
Another alternative might be to use the IF() function along with the COALESCE() function:
Select ID, status, Date, IF(Coalesce(Minutes,0) , Minutes, -1) as Minutes
From Customer_1
Union
Select ID, status, Date, IF(Coalesce(Minutes,0) , Minutes, -1)
From Customer_2
Looking for a query that takes the following table ProductList
id| column_1 | column_2 | Sum
================================
1 | Product-A | Product-B | 67
2 | Product-A | Product-C | 55
3 | Product-A | Product-D | 23
4 | Product-B | Product-C | 95
5 | Product-C | Product-D | 110
and returns the first record Product-A_Product-B and then skips all records that contain Product-A or Product-B in either column and returns Product-C_Product-D.
I only want to return the row if everything in the row is appearing for the first time.
Assuming that the product names don't contain a comma (,), you could use a comma-delimited session variable to store the already selected products and check for every row whether one of its columns is already contained in that variable:
select column_1, column_2
from (
select l.*,
case when find_in_set(l.column_1, @products) or find_in_set(l.column_2, @products)
then 1
else (@products := concat(@products, ',', l.column_1, ',', l.column_2)) = ''
end as skip
from ProductList l
cross join (select @products := '') init
order by l.id
) t
where skip = 0;
Demo: http://rextester.com/NDVBW87988
But you should know the risks:
ORDER BY in a subquery is not really valid and usually doesn't make sense. The engine may skip it or move it to the outer query.
If you read and write the same session variable in one statement, the execution order is not defined. So the query might not work for all (future) versions.
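Given those risks, one alternative is to apply the same "skip any row that contains an already selected product" rule in application code. This swaps the session variable for a plain loop and is not the answer's method. The function name first_appearance_pairs is made up for this sketch.

```python
def first_appearance_pairs(rows):
    """Keep a row only if neither of its products was selected before.

    rows: iterable of (id, column_1, column_2) tuples.
    Returns the kept (column_1, column_2) pairs in id order.
    """
    seen = set()   # products that belong to an already selected row
    kept = []
    for _id, first, second in sorted(rows):  # process in id order
        if first in seen or second in seen:
            continue  # skipped rows do not mark their products as seen
        seen.update((first, second))
        kept.append((first, second))
    return kept


product_list = [
    (1, "Product-A", "Product-B"),
    (2, "Product-A", "Product-C"),
    (3, "Product-A", "Product-D"),
    (4, "Product-B", "Product-C"),
    (5, "Product-C", "Product-D"),
]
result = first_appearance_pairs(product_list)
print(result)
```

As in the SQL version, a skipped row does not add its products to the set of seen products, which is why row 5 is still returned.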
I have a scenario where I need to display total number of attendees of an event. With the help of registration form I have already captured the details of people who are attending and my table looks like below.
ID | NAME | PHONE_NUMBER | IS_LIFE_PARTNER_ATTENDING
1 | ABC | 1234567890 | N
2 | PQR | 1234567891 | Y
3 | XYZ | 1234567892 | N
I can easily display the number of registrations by using count(id). But when displaying the number of attendees, I have to count a registrant as two attendees if he/she is coming with a partner (identified by the IS_LIFE_PARTNER_ATTENDING column).
So, in the above case, the number of registrants is 3, but the number of attendees is 4, because "PQR" is coming with his/her life partner.
How can we do this in a MySQL query?
You can use the following query:
SELECT
SUM(1 + (IS_LIFE_PARTNER_ATTENDING = 'Y')) AS totalAttendees
FROM your_table;
Since a boolean expression resolves to 0 or 1 in MySQL, you can capitalize on this in your case.
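Note:
The comparison (a = b) evaluates to 1 if a is equal to b and to 0 otherwise, so SUM(a = b) counts the rows where the comparison is true.
Caution:
Never underestimate these parentheses: (IS_LIFE_PARTNER_ATTENDING = 'Y'). If you omit them, the whole summation results in zero (0) because of operator precedence: + binds more tightly than =, so 1 + IS_LIFE_PARTNER_ATTENDING = 'Y' is evaluated as (1 + IS_LIFE_PARTNER_ATTENDING) = 'Y'.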
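The effect of the parentheses can be checked on the sample registrations. This sketch assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and it spells the column as in the table header, IS_LIFE_PARTNER_ATTENDING. SQLite also gives + a higher precedence than =, so the unparenthesized version sums to 0 there too, although its type-conversion rules differ in detail from MySQL's.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(ID INTEGER, NAME TEXT, PHONE_NUMBER TEXT, IS_LIFE_PARTNER_ATTENDING TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "ABC", "1234567890", "N"),
        (2, "PQR", "1234567891", "Y"),
        (3, "XYZ", "1234567892", "N"),
    ],
)

# Parenthesized comparison: each row contributes 1, plus 1 more for a partner.
with_parens = conn.execute(
    "SELECT SUM(1 + (IS_LIFE_PARTNER_ATTENDING = 'Y')) FROM your_table"
).fetchone()[0]

# Without parentheses this is (1 + column) = 'Y', which is false for every row.
without_parens = conn.execute(
    "SELECT SUM(1 + IS_LIFE_PARTNER_ATTENDING = 'Y') FROM your_table"
).fetchone()[0]
print(with_parens, without_parens)
```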
Use SUM with CASE
-- One row per name; remove Name and GROUP BY Name to get a single overall total
SELECT
Name,
SUM(CASE WHEN IS_LIFE_PARTNER_ATTENDING = 'Y' THEN 2 ELSE 1 END) AS Attendees
FROM
your_table
GROUP BY Name
I have the following table :
id | command_id | started_at | ended_at | rows_involved | completed
-----------------------------------------------------------------------------------------------
1 | 1 | 2015-05-20 12:02:25 | 2015-05-20 12:02:28 | 1 | 1
2 | 1 | 2015-05-20 12:02:47 | NULL | NULL | 0
3 | 1 | 2015-05-20 12:11:10 | NULL | NULL | 0
4 | 1 | 2015-05-20 12:11:46 | NULL | NULL | 0
5 | 1 | 2015-05-20 12:12:25 | NULL | NULL | 0
I want to fetch a COUNT of rows where started_at is '2015-05-20' AND command_id = 1, and I want to get 2 subtotals: one is the total of these rows where completed = 1 and the other is the total of these rows where completed = 0.
Expected data set is then the following :
array(4) {
["totalRows"]=> 5
["name"]=> "evo:send_post_registration_mail_1"
["totalCompleted"] => 1
["totalUncompleted"] => 4
}
The "name" column is not important; it comes from a join with another table on the command_id field.
My current query is the following, but it doesn't fetch the 2 subtotals :
SELECT COUNT(s0_.id) AS totalRows, s1_.name AS name
FROM sf_command_executions s0_
INNER JOIN sf_commands s1_ ON (s1_.id = s0_.command_id)
WHERE DATE_FORMAT(s0_.started_at,'%Y-%m-%d') = '2015-05-20'
GROUP BY s0_.command_id
Can I fetch these 2 subtotals within that single query ?
You can use conditional aggregation. Use an expression like this in your SELECT list...
SELECT ...
, SUM(IF(s0_.completed=1,1,0)) AS tot_completed_1
, SUM(IF(s0_.completed=0,1,0)) AS tot_completed_0
You can achieve the same thing using a (more ANSI-standards compliant) CASE expression:
, SUM(CASE WHEN s0_.completed = 1 THEN 1 ELSE 0 END) AS tot_completed_1
Or you can use even shorter MySQL shorthand, since boolean expressions return a value of 1, 0 or NULL:
, SUM(s0_.completed=1) AS tot_completed_1
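To see the conditional aggregation produce the expected data set, here is a sketch that loads the sample rows into SQLite (through Python's sqlite3 module), used as a stand-in for MySQL. SQLite has no DATE_FORMAT, so its date() function stands in for the question's predicate. The column types are assumptions.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sf_commands (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sf_command_executions (id INTEGER PRIMARY KEY, command_id INTEGER, "
    "started_at TEXT, ended_at TEXT, rows_involved INTEGER, completed INTEGER)"
)
conn.execute("INSERT INTO sf_commands VALUES (1, 'evo:send_post_registration_mail_1')")
conn.executemany(
    "INSERT INTO sf_command_executions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "2015-05-20 12:02:25", "2015-05-20 12:02:28", 1, 1),
        (2, 1, "2015-05-20 12:02:47", None, None, 0),
        (3, 1, "2015-05-20 12:11:10", None, None, 0),
        (4, 1, "2015-05-20 12:11:46", None, None, 0),
        (5, 1, "2015-05-20 12:12:25", None, None, 0),
    ],
)

# Each CASE yields 1 for rows in its bucket and 0 otherwise; SUM adds them up.
totals = conn.execute("""
    SELECT COUNT(s0_.id) AS totalRows,
           s1_.name AS name,
           SUM(CASE WHEN s0_.completed = 1 THEN 1 ELSE 0 END) AS totalCompleted,
           SUM(CASE WHEN s0_.completed = 0 THEN 1 ELSE 0 END) AS totalUncompleted
    FROM sf_command_executions s0_
    INNER JOIN sf_commands s1_ ON s1_.id = s0_.command_id
    WHERE date(s0_.started_at) = '2015-05-20'  -- SQLite stand-in for DATE_FORMAT(...)
    GROUP BY s0_.command_id, s1_.name
""").fetchone()
print(totals)
```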
EDIT
The following doesn't address the question you asked (see above for an answer to the question you asked). But I wanted to point out the predicate on the started_at column (i.e. the WHERE clause).
WHERE DATE_FORMAT(s0_.started_at,'%Y-%m-%d') = '2015-05-20'
      ^^^^^^^^^^^^              ^^^^^^^^^^^^
The DATE_FORMAT function wrapped around the column reference prevents MySQL from using an index range scan operation to satisfy that predicate.
That is, MySQL has to evaluate that function on every row in the table, and then compare the result from the expression to a literal value.
If started_at is defined as a DATETIME or TIMESTAMP, we can rewrite that to an equivalent condition, but on the bare started_at column. That would allow MySQL to use an index range scan operation. For example, we could get the same rows writing it like this:
WHERE s0_.started_at >= '2015-05-20'
AND s0_.started_at < '2015-05-20' + INTERVAL 1 DAY
If started_at is defined as a DATE, we could reference the bare column with an equality comparison. There's no need for a DATE_FORMAT function.
If we have to use a function to do some sort of conversion so the values can be compared, we'd prefer the function to be wrapped around the literal rather than the column reference. Around the literal, that function only has to be evaluated once.
This isn't actually required in this case, but just as an example of wrapping the literal in a function:
WHERE s0_.started_at >= STR_TO_DATE('2015-05-20','%Y-%m-%d')
AND s0_.started_at < STR_TO_DATE('2015-05-20','%Y-%m-%d') + INTERVAL 1 DAY
Note (again) that using the STR_TO_DATE function isn't actually required; this is just demonstrating a pattern. If we did need to do a conversion, we'd prefer that to be on the literal side, rather than on the column, to allow MySQL to make use of an available index on started_at.
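The half-open range is meant to select exactly the same rows as the function-wrapped predicate. The sketch below checks that on a few made-up boundary timestamps, using SQLite (through Python's sqlite3 module) as a stand-in for MySQL. strftime() stands in for DATE_FORMAT, date(..., '+1 day') stands in for + INTERVAL 1 DAY, and the index name is hypothetical. It compares results only and does not measure index usage.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sf_command_executions (id INTEGER PRIMARY KEY, started_at TEXT)"
)
conn.execute(
    "CREATE INDEX idx_started_at ON sf_command_executions (started_at)"  # hypothetical name
)
conn.executemany(
    "INSERT INTO sf_command_executions VALUES (?, ?)",
    [
        (1, "2015-05-19 23:59:59"),  # just before the day
        (2, "2015-05-20 00:00:00"),  # first instant of the day
        (3, "2015-05-20 12:02:25"),
        (4, "2015-05-20 23:59:59"),  # last second of the day
        (5, "2015-05-21 00:00:00"),  # first instant of the next day
    ],
)

# Function wrapped around the column: evaluated for every row.
wrapped = [r[0] for r in conn.execute("""
    SELECT id FROM sf_command_executions
    WHERE strftime('%Y-%m-%d', started_at) = '2015-05-20'
    ORDER BY id
""")]

# Bare column compared against a half-open range: same rows, index-friendly shape.
ranged = [r[0] for r in conn.execute("""
    SELECT id FROM sf_command_executions
    WHERE started_at >= '2015-05-20'
      AND started_at < date('2015-05-20', '+1 day')
    ORDER BY id
""")]
print(wrapped, ranged)
```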
You can use a conditional sum:
SELECT
COUNT(s0_.id) AS totalRows,
s1_.name AS name ,
sum(s0_.completed=1) as totalCompleted,
sum(s0_.completed=0) as totalUncompleted
FROM sf_command_executions s0_
INNER JOIN sf_commands s1_ ON (s1_.id = s0_.command_id)
WHERE DATE_FORMAT(s0_.started_at,'%Y-%m-%d') = '2015-05-20'
GROUP BY s0_.command_id
Try with this:
SELECT COUNT(s0_.id) AS totalRows, s1_.name AS name,
(select count(s2_.id) from sf_command_executions s2_ where s0_.command_id=s2_.command_id and s2_.completed = 1) AS totalCompleted,
(select count(s2_.id) from sf_command_executions s2_ where s0_.command_id=s2_.command_id and s2_.completed = 0) AS totalUncompleted
FROM sf_command_executions s0_
INNER JOIN sf_commands s1_ ON (s1_.id = s0_.command_id)
WHERE DATE_FORMAT(s0_.started_at,'%Y-%m-%d') = '2015-05-20'
GROUP BY s0_.command_id
I have a table with multiple rows that contain the same data. I used SELECT DISTINCT to get unique rows and it works fine. But when I use ORDER BY with SELECT DISTINCT, it gives me unsorted data.
Can anyone tell me how distinct works?
Based on what criteria it selects the row?
From your comment earlier, the query you are trying to run is
Select distinct id from table where id2 =12312 order by time desc.
As I expected, here is your problem: your SELECT column and ORDER BY column are different. Your output rows are ordered by time, but that order doesn't necessarily need to be preserved in the id column. Here is an example.
id | id2 | time
-------------------
1 | 12312 | 34
2 | 12312 | 12
3 | 12312 | 48
If you run
SELECT * FROM table WHERE id2=12312 ORDER BY time DESC
you will get the following result
id | id2 | time
-------------------
3 | 12312 | 48
1 | 12312 | 34
2 | 12312 | 12
Now if you select only the id column from this, you will get
id
--
3
1
2
This is why your results are not sorted.
When you specify SELECT DISTINCT it will give you all the rows, eliminating duplicates from the result set. By "duplicates" I mean rows where all fields have the same values. For example, say you have a table that looks like:
id | num
--------------
1 | 1
2 | 3
3 | 3
SELECT DISTINCT * would return all rows above, whereas SELECT DISTINCT num would return two rows:
num
-----
1
3
Note that which actual row it selects (e.g. whether it's row 2 or row 3) is irrelevant, as the result would be indistinguishable.
Finally, DISTINCT should not affect how ORDER BY works.
Reference: MySQL SELECT statement
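The id/num example from this answer can be reproduced with a short script. This is a sketch using SQLite (through Python's sqlite3 module) as a stand-in for MySQL. The table name t is an assumption, and the ORDER BY clauses only make the printed order predictable.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, num INTEGER)")  # hypothetical table name
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 1), (2, 3), (3, 3)])

# No two rows agree on every column, so DISTINCT * removes nothing.
all_columns = conn.execute("SELECT DISTINCT * FROM t ORDER BY id").fetchall()

# Rows 2 and 3 share num = 3, so only one of them survives.
only_num = conn.execute("SELECT DISTINCT num FROM t ORDER BY num").fetchall()
print(all_columns, only_num)
```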
The behaviour you describe happens when you ORDER BY an expression that is not present in the SELECT clause. The SQL standard does not allow such a query but MySQL is less strict and allows it.
Let's try an example:
SELECT DISTINCT column1, column2
FROM table1
WHERE ...
ORDER BY column3
Let's say the content of the table table1 is:
id | column1 | column2 | column3
----+---------+---------+---------
1 | A | B | 1
2 | A | B | 5
3 | X | Y | 3
Without the ORDER BY clause, the above query returns the following two records (without ORDER BY the order is not guaranteed):
column1 | column2
---------+---------
A | B
X | Y
But with ORDER BY column3 the order is also not guaranteed.
The DISTINCT clause operates on the values of the expressions present in the SELECT clause. If row #1 is processed first then (A, B) is placed in the result set and it is associated with row #1. Then, when row #2 is processed, the values of the SELECT expressions produce the record (A, B) that is already in the result set. Because of DISTINCT it is dropped. Row #3 produces (X, Y) that is also put in the result set. Then, the ORDER BY column3 clause makes the records be sorted in the result set as (A, B), (X, Y).
But if row #2 is processed before row #1 then, following the same logic exposed in the previous paragraph, the records in the result set are sorted as (X, Y), (A, B).
There is no rule imposed on the database engine about the order in which it processes the rows when it runs a query. The database is free to process the rows in whatever order it considers best for performance.
Your query is invalid SQL and the fact that it can return different results using the same input data proves it.
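The argument above can be replayed with a toy model in plain Python. This is only an illustration of the reasoning, not a description of how MySQL executes the query. The function distinct_then_order is made up for this sketch: it keeps the first row seen for each (column1, column2) pair and then sorts by that row's column3.

```python
def distinct_then_order(rows):
    """Toy model: DISTINCT keeps the first row seen per pair, then sort by column3."""
    first_seen = {}
    for column1, column2, column3 in rows:
        # Later duplicates of the same pair are dropped, along with their column3.
        first_seen.setdefault((column1, column2), column3)
    return sorted(first_seen, key=first_seen.get)


row1 = ("A", "B", 1)
row2 = ("A", "B", 5)
row3 = ("X", "Y", 3)

# Same data, two different processing orders.
row1_first = distinct_then_order([row1, row2, row3])
row2_first = distinct_then_order([row2, row1, row3])
print(row1_first, row2_first)
```

The two calls receive the same rows and differ only in processing order, yet they return the pairs in opposite order.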