MySQL SUM when using GROUP BY not working - mysql

Let's say we have this table:
Symbol | Size
A | 12
B | 5
A | 3
A | 6
B | 8
And we want a view like this:
Symbol | Size
A | 21
B | 13
So we use this:
Select Symbol, sum(Size) from table group by Symbol order by Symbol ASC
But instead we get this:
Symbol | Size
A | 12
B | 5
What am I doing wrong?!

You are doing it right, you should expect the correct results. Could you please supply more information about the DB you are using, additional schemas, etc?
Maybe you have some unique index on Symbol?
Try to execute the following to "sanity-test" your system:
SELECT SUM(Size) FROM table
Should result in 34
SELECT Symbol, Count(*) FROM table GROUP BY Symbol
Should result in 3 and 2
If both of the above work perfectly as you noted, please try:
SELECT Symbol, Count(*), Sum(Size) FROM table GROUP BY Symbol
This is your code, with the additions of Count(*) and without the ORDER BY clause. If that does not work after the two above do, I'm really puzzled...
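The three sanity checks above can be replayed on the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name symbol_sizes is made up for the example (the question just calls it "table").

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# "symbol_sizes" is a hypothetical table name; the question only says "table".
conn.execute("CREATE TABLE symbol_sizes (Symbol TEXT, Size INTEGER)")
conn.executemany(
    "INSERT INTO symbol_sizes VALUES (?, ?)",
    [("A", 12), ("B", 5), ("A", 3), ("A", 6), ("B", 8)],
)

# Check 1: the overall total.
total = conn.execute("SELECT SUM(Size) FROM symbol_sizes").fetchone()[0]

# Check 2: the number of rows per symbol.
counts = conn.execute(
    "SELECT Symbol, COUNT(*) FROM symbol_sizes GROUP BY Symbol ORDER BY Symbol"
).fetchall()

# Check 3: count and sum per symbol together.
sums = conn.execute(
    "SELECT Symbol, COUNT(*), SUM(Size) FROM symbol_sizes "
    "GROUP BY Symbol ORDER BY Symbol"
).fetchall()

print(total)   # 34
print(counts)  # [('A', 3), ('B', 2)]
print(sums)    # [('A', 3, 21), ('B', 2, 13)]
```

If the real table gives different numbers for the second check, rows are probably being lost before the GROUP BY runs, for example in a join or view that feeds the query.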

I found out that somewhere in the SELECT statements that led to the un-SUMable table, a plain join was used instead of a left join. Although I still don't get why that should mess up the calculation, I changed that and now it works... I'm sorry I couldn't upload the whole thing...


Union as sub query using MySQL 8

I want to optimize a query using a union as a subquery.
I'm not really sure how to construct the query, though.
I'm using MySQL 8.0.12
Here is the original query:
---------------
| c1 | c2 |
---------------
| 18182 | 0 |
| 18015 | 0 |
---------------
2 rows in set (0.35 sec)
I'm sorry, but the question doesn't get saved if I paste the SQL query as text and format it using Ctrl+K.
Output expected
---------------
| c1 | c2 |
---------------
| 18182 | 167 |
| 18015 | 0 |
---------------
As output, I would like to have the difference in the number of rows between the two tables in the UNION ALL.
I processed this question using the wizard https://stackoverflow.com/questions/ask
Since a parenthesized SELECT can be used almost anywhere an expression can go:
SELECT
ABS( (SELECT COUNT(*) FROM tbl_aaa) -
(SELECT COUNT(*) FROM tbl_bbb) ) AS diff;
Also, MySQL is happy to allow a SELECT without a FROM.
There are several ways to go for this, including UNION, but I wouldn't recommend it, as it is IMO a bit 'hacky'. Instead, I suggest you use subqueries or use CTEs.
With subqueries
SELECT
ABS(c_tbl_aaa.size - c_tbl_bbb.size) as diff
FROM (
SELECT
COUNT(*) as size
FROM tbl_aaa
) c_tbl_aaa
CROSS JOIN (
SELECT
COUNT(*) as size
FROM tbl_bbb
) c_tbl_bbb
With CTEs, also known as WITHs
WITH c_tbl_aaa AS (
SELECT
COUNT(*) as size
FROM tbl_aaa
), c_tbl_bbb AS (
SELECT
COUNT(*) as size
FROM tbl_bbb
)
SELECT
ABS(c_tbl_aaa.size - c_tbl_bbb.size) as diff
FROM c_tbl_aaa
CROSS JOIN c_tbl_bbb
In a practical sense, they are the same. Depending on the needs, you might want to define and join the results though, and in said cases, you could use a single number as a "pseudo id" in the select statement.
Since you only want to know the differences, I used the ABS function, which returns the absolute value of a number.
Let me know if you want a solution with UNIONs anyway.
Edit: As @Rick James pointed out, COUNT(*) should be used in the subqueries to count the number of rows, as COUNT(id_***) will only count the rows with non-null values in that field.
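To make the COUNT(*) versus COUNT(column) point concrete, here is a small sketch. It runs on SQLite through Python's sqlite3 module rather than on MySQL, and the column names id_aaa and id_bbb as well as the sample rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# id_aaa / id_bbb are hypothetical column names; the data is made up.
conn.executescript("""
    CREATE TABLE tbl_aaa (id_aaa INTEGER);
    CREATE TABLE tbl_bbb (id_bbb INTEGER);
    INSERT INTO tbl_aaa VALUES (1), (2), (NULL), (4);
    INSERT INTO tbl_bbb VALUES (1), (2);
""")

# COUNT(*) counts rows; COUNT(id_aaa) skips the row where id_aaa is NULL.
count_star, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(id_aaa) FROM tbl_aaa"
).fetchone()

# Row-count difference using two scalar subqueries and no FROM clause.
diff = conn.execute(
    "SELECT ABS((SELECT COUNT(*) FROM tbl_aaa) - "
    "(SELECT COUNT(*) FROM tbl_bbb)) AS diff"
).fetchone()[0]

print(count_star, count_col, diff)  # 4 3 2
```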

Mysql-> Group after rand()

I have the following table in Mysql
Name Age Group
abel 7 A
joe 6 A
Rick 7 A
Diana 5 B
Billy 6 B
Pat 5 B
I want to randomize the rows, but they should still remain grouped by the Group column.
For example, I want my result to look something like this.
Name Age Group
joe 6 A
abel 7 A
Rick 7 A
Billy 6 B
Pat 5 B
Diana 5 B
What query should I use to get this result? The entire table should be randomised and then grouped by the "Group" column.
What you describe in your question as GROUPing is more correctly described as sorting. This is a particular issue when talking about SQL databases where "GROUP" means something quite different and determines the scope of aggregation operations.
Indeed, "group" is a reserved word in SQL, so although MySQL and some other SQL databases can work around this, it is a poor choice as an attribute name.
SELECT *
FROM yourtable
ORDER BY `group`
Using random values also has a lot of semantic confusion. A truly random number would have a different value every time it is retrieved - which would make any sorting impossible (and databases do a lot of sorting which is normally invisible to the user). As long as the implementation uses a finite time algorithm such as quicksort that shouldn't be a problem - but a bubble sort would never finish, and a merge sort could get very confused.
There are also degrees of randomness. There are different algorithms for generating random numbers. For encryption it's critical that the random numbers be evenly distributed and completely unpredictable. Such generators often use hardware events (sometimes even dedicated hardware), but I don't expect you would need that. But do you want the ordering to be repeatable across invocations?
SELECT *
FROM yourtable
ORDER BY `group`, RAND()
...will give different results each time.
OTOH
SELECT *
FROM yourtable
ORDER BY `group`, MD5(CONCAT(age, name, `group`))
...would give the results always sorted in the same order. While
SELECT *
FROM yourtable
ORDER BY `group`, MD5(CONCAT(CURDATE(), age, name, `group`))
...will give different results on different days.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(name VARCHAR(12) NOT NULL
,age INT NOT NULL
,my_group CHAR(1) NOT NULL
);
INSERT INTO my_table VALUES
('Abel',7,'A'),
('Joe',6,'A'),
('Rick',7,'A'),
('Diana',5,'B'),
('Billy',6,'B'),
('Pat',5,'B');
SELECT * FROM my_table ORDER BY my_group,RAND();
+-------+-----+----------+
| name | age | my_group |
+-------+-----+----------+
| Joe | 6 | A |
| Abel | 7 | A |
| Rick | 7 | A |
| Pat | 5 | B |
| Diana | 5 | B |
| Billy | 6 | B |
+-------+-----+----------+
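The same idea can be checked programmatically. The sketch below uses Python's sqlite3 module instead of MySQL, so the random function is spelled RANDOM() rather than RAND(); everything else mirrors the my_table example. It only verifies that the groups stay together, since the order inside each group changes from run to run.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER, my_group TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("Abel", 7, "A"), ("Joe", 6, "A"), ("Rick", 7, "A"),
        ("Diana", 5, "B"), ("Billy", 6, "B"), ("Pat", 5, "B"),
    ],
)

# Sort by group first, then shuffle within each group.
# SQLite calls the function RANDOM(); in MySQL it would be RAND().
rows = conn.execute(
    "SELECT name, age, my_group FROM my_table ORDER BY my_group, RANDOM()"
).fetchall()

groups = [row[2] for row in rows]
names = sorted(row[0] for row in rows)

print(groups)  # ['A', 'A', 'A', 'B', 'B', 'B']
for row in rows:
    print(row)  # order inside each group varies between runs
```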
Do the random first then sort by column group.
select Name, Age, `Group`
from (
select *
FROM yourtable
order by RAND()
) t
order by `Group`
Try this:
SELECT * FROM yourtable ORDER BY `Group`, RAND()

SQL calculating difference between columns

I'm a bit of a newbie at SQL and I don't really understand what to do here, so any help is really appreciated. I have a table full of readings from different readers. There are about 500,000 of them, so I can't do this by hand.
I received the table without the difference in it. I managed to calculate it, but there's a bit of a problem there...
It looks a bit like this:
reader_id | date | reading | difference
1 | 01-01-2013 | 205 | 0
1 | 02-01-2013 | 210 | 5
1 | 03-01-2013 | 213 | 3
... | ... | ... | ...
1 | 31-12-2013 | 2451 | 4
2 | 01-01-2013 | 8543 | 6092
2 | 02-01-2013 | 8548 | 5
reader_id and date form the primary key. The combination is unique.
How can I make sure I don't get the difference calculated when the last column contained a different reader_id?
When querying my data with a query like this one, the data get skewed by the incorrect difference between the two reader_ids:
SELECT AVG(difference), reader_id FROM table GROUP BY reader_id
For "I just want to get the average difference for each reader", your query is perfectly good. I think something went wrong in your difference calculation. The first value for reader_id = 2, 6092, is the difference between the last reading from reader 1 and the first reading from reader 2, which I don't think makes sense. If I'm not mistaken, the difference value is the current day's reading minus the previous day's reading. Therefore, you should set the difference value of the first reading of each reader to 0.
You can do this with the following query:
UPDATE table t
INNER JOIN (
SELECT reader_id, MIN(date) AS first_day
FROM table
GROUP BY reader_id
) AS tmp
ON tmp.reader_id = t.reader_id AND tmp.first_day = t.date
SET t.difference = 0
Then
SELECT AVG(difference), reader_id FROM table GROUP BY reader_id
will do what you expect.
If you simply want the average difference, you can use the following query:
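SELECT
reader_id,
(MAX(reading) - MIN(reading)) / COUNT(*) average_difference
FROM table
GROUP BY reader_id
ORDER BY reader_id;
It works on the logic that the total difference for a given reader_id should be equal to MAX(reading) - MIN(reading), which holds as long as the readings never decrease. The parentheses matter: without them, only MIN(reading) would be divided by COUNT(*).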
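As a rough check that the two approaches agree, the sketch below compares AVG(difference) with the range-based formula on a few of the sample rows. It assumes the first difference of each reader has already been set to 0, uses the reader_id column name from the question, and runs on SQLite via Python's sqlite3 module rather than MySQL. The table name readings and the column name day are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# "readings" and "day" are hypothetical names; dates are stored as ISO text.
conn.execute(
    "CREATE TABLE readings "
    "(reader_id INTEGER, day TEXT, reading INTEGER, difference INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, "2013-01-01", 205, 0),
        (1, "2013-01-02", 210, 5),
        (1, "2013-01-03", 213, 3),
        (2, "2013-01-01", 8543, 0),  # first difference reset to 0
        (2, "2013-01-02", 8548, 5),
    ],
)

# The "* 1.0" forces floating-point division in SQLite, which would
# otherwise divide integers; MySQL's "/" already returns a decimal.
rows = conn.execute("""
    SELECT reader_id,
           AVG(difference) AS avg_diff,
           (MAX(reading) - MIN(reading)) * 1.0 / COUNT(*) AS avg_from_range
    FROM readings
    GROUP BY reader_id
    ORDER BY reader_id
""").fetchall()

for reader_id, avg_diff, avg_from_range in rows:
    print(reader_id, avg_diff, avg_from_range)
```

Both columns should match for each reader, provided the readings only ever increase.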

Return the query when count of a query is greater than a number?

I want to return all rows that have a certain value in a column and have more than 5 instances in which a number is that certain value. For example, I would like to return all rows of the condition in which if the value in the column M has the number 1 in it and there are 5 or more instances of M having the number 1 in it, then it will return all rows with that condition.
select *
from tab
where M = 1
group by id -- id is the primary key of the table
having count(M) > 5;
EDIT: Here is my table:
id | M | price
--------+-------------+-------
1 | | 100
2 | 1 | 50
3 | 1 | 30
4 | 2 | 20
5 | 2 | 10
6 | 3 | 20
7 | 1 | 1
8 | 1 | 1
9 | 1 | 1
10 | 1 | 1
11 | 1 | 1
Originally I just want to insert into a trigger so that if the number of M = 1's is greater than 5, then I want to create an exception. The query I asked for would be inserted into the trigger. END EDIT.
But my result is always empty. Can anyone help me out? Thanks!
Try this :
select *
from tab
where M in (select M from tab where M = 1 group by M having count(id) > 5);
please try
select *,count(M) from table where M=1 group by id having count(M)>5
Since you group on your PK (which seems a futile exercise), you are counting per ID, which will indeed always return 1.
As I explain after this code, this query is NOT good, it is NOT the answer, and I also explain WHY. Please do not expect this query to run correctly!
select *
from tab
where M = 1
group by M
having count(*) > 5;
Like this, you group on what you are counting, which makes a lot more sense. At the same time, this will have unexpected behaviour, as you are selecting all kinds of columns that are not in the GROUP BY or in any aggregate. I know MySQL is lenient on that, but I don't even want to know what it will produce.
Instead, try a subquery along these lines:
select *
from tab
where M in
(SELECT M
from tab
group by M
having count(*) > 5)
I've built a SQLFiddle demo (I used 'Test' as the table name out of habit) accomplishing this (I don't have a MySQL instance at hand right now to test it).
-- Made up a structure for testing
CREATE TABLE Test (
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
M int
);
SELECT id, M FROM Test
WHERE M IN (
SELECT M
FROM Test
WHERE M = 1
GROUP BY M
HAVING COUNT(M) > 5
)
The subquery is a common "find the duplicates" kind of query, with the added condition of a specific value for the column M, also stating that there must be more than 5 dupes.
It will spit out a series of values of M which you can use to query the table against, ending with the rows you need.
You shouldn't use SELECT *; it's bad practice in general. Don't retrieve data you aren't actually using, and if you are using it, take the little time needed to type in a list of fields. You'll likely see faster querying, and the code will be far more readable.
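Here is a small sketch that runs both the original per-id grouping and the subquery approach against the table from the question. It uses Python's sqlite3 module as a stand-in for MySQL, so treat it as an illustration of the logic rather than a MySQL test.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, M INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        (1, None, 100), (2, 1, 50), (3, 1, 30), (4, 2, 20),
        (5, 2, 10), (6, 3, 20), (7, 1, 1), (8, 1, 1),
        (9, 1, 1), (10, 1, 1), (11, 1, 1),
    ],
)

# Grouping by the primary key makes every group a single row,
# so COUNT(M) is never greater than 1 and nothing comes back.
per_id = conn.execute(
    "SELECT id FROM tab WHERE M = 1 GROUP BY id HAVING COUNT(M) > 5"
).fetchall()

# Grouping by M in a subquery counts all rows sharing that value;
# the outer query then returns every row with a qualifying M.
matching = conn.execute("""
    SELECT id, M, price
    FROM tab
    WHERE M IN (
        SELECT M FROM tab WHERE M = 1 GROUP BY M HAVING COUNT(*) > 5
    )
    ORDER BY id
""").fetchall()

print(per_id)                        # []
print([row[0] for row in matching])  # [2, 3, 7, 8, 9, 10, 11]
```

For the trigger mentioned in the question, a plain SELECT COUNT(*) FROM tab WHERE M = 1 compared against 5 would probably be enough, since only the count is needed there.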

How to optimize this query with multiple substring and subquery

Okay, I'm working on a website right now that shows information about parts of electronic devices. These parts sometimes get a revision. The part number stays the same, but they append an A, B, C, etc. to the part number, so the 'higher' the letter, the newer it is. Also a date is added. So the table looks something like this:
------------------------------------------------------------
| Partcode | Description | Partdate |
------------------------------------------------------------
| 12345A | Some description 1 | 2009-11-10 |
| 12345B | Some description 2 | 2010-12-30 |
| 17896A | Some description 3 | 2009-01-12 |
| 12345C | Some description 4 | 2011-08-06 |
| 17896B | Some description 5 | 2009-07-10 |
| 12345D | Some description 6 | 2012-05-04 |
------------------------------------------------------------
What I need right now is the data from the newest revision of a part. So for this example I need:
12345D and 17896B
The query that someone built before me is something along the lines of this:
SELECT substring(Partcode, 1, 5) AS Part,
(
SELECT pt.Partcode
FROM Parttable pt
WHERE substring(pt.PartCode, 1, 5) = Part
ORDER BY pt.Partdate DESC
LIMIT 0,1
),
(
SELECT pt.Description
FROM Parttable pt
WHERE substring(pt.PartCode, 1, 5) = Part
ORDER BY pt.Partdate DESC
LIMIT 0,1
),
(
SELECT pt.Partdate
FROM Parttable pt
WHERE substring(pt.PartCode, 1, 5) = Part
ORDER BY pt.Partdate DESC
LIMIT 0,1
)
FROM Parttable
GROUP BY Part
As you will understand, this query is insanely slow and feels really inefficient. But I just can't get my head around how to optimize this query.
So I really hope someone can help.
Thanks in advance!
PS. I'm working on a MySQL database and before anyone asks, I can't change the database.
First: why not store your version in a separate column? That way you wouldn't need to call substring to extract it first. If you really need the code and version to be concatenated, I think it's good practice to do it at the end.
In your place, I would first split the code and version, and simply use a MAX in an aggregate query, like:
SELECT code, MAX(version) FROM
(SELECT SUBSTRING(Partcode, 1, 5) AS code,
SUBSTRING(Partcode, 6, 1) AS version
FROM Parttable
)
AS part
GROUP BY code;
Note: I haven't tested this query, so you may need to fix a few parameters, like the substring indexes.
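To also get the description and date of the newest revision, the aggregate can be joined back to the table. The sketch below is one possible way to do that, not a tested MySQL query: it runs on SQLite through Python's sqlite3 module, and it assumes every part number is exactly five characters followed by a single revision letter, as in the sample data. The alias newest is made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Parttable (Partcode TEXT, Description TEXT, Partdate TEXT)"
)
conn.executemany(
    "INSERT INTO Parttable VALUES (?, ?, ?)",
    [
        ("12345A", "Some description 1", "2009-11-10"),
        ("12345B", "Some description 2", "2010-12-30"),
        ("17896A", "Some description 3", "2009-01-12"),
        ("12345C", "Some description 4", "2011-08-06"),
        ("17896B", "Some description 5", "2009-07-10"),
        ("12345D", "Some description 6", "2012-05-04"),
    ],
)

# Find the highest revision letter per part number, then join back to
# fetch the full row. SQLite uses substr() and "||" for concatenation;
# in MySQL this would be SUBSTRING() and CONCAT(code, version).
latest = conn.execute("""
    SELECT pt.Partcode, pt.Description, pt.Partdate
    FROM Parttable pt
    JOIN (
        SELECT substr(Partcode, 1, 5) AS code,
               MAX(substr(Partcode, 6, 1)) AS version
        FROM Parttable
        GROUP BY substr(Partcode, 1, 5)
    ) newest ON pt.Partcode = newest.code || newest.version
    ORDER BY pt.Partcode
""").fetchall()

for row in latest:
    print(row)
# ('12345D', 'Some description 6', '2012-05-04')
# ('17896B', 'Some description 5', '2009-07-10')
```

This scans the table once for the aggregate instead of running three correlated subqueries per group, which is likely where the original query loses most of its time.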