SQL to add a summary row to MySQL result set - mysql

If I have a MySQL table such as:
I want to use SQL to calculate the sum of the PositiveResult column and also the NegativeResult column. Normally I could simply do SUM(PositiveResult) in a query.
But what if I wanted to go a step further and place the totals in a row at the bottom of the result set:
Can this be achieved at the data level or is it a presentation layer issue? If it can be done by SQL, how might I do this? I am a bit of an SQL newbie.
Thanks to the respondents. I will now check things with the customer.
Also, can a text column be added so that the value of the last row of data is not shown in the summary row? Like this:

I would also do this in the presentation layer, but you can do it in MySQL...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,pos DECIMAL(5,2)
,neg DECIMAL(5,2)
);
INSERT INTO my_table VALUES
(1,0,0),
(2,1,-2.5),
(3,1.6,-1),
(4,1,-2);
SELECT COALESCE(id,'total') my_id,SUM(pos),SUM(neg) FROM my_table GROUP BY id WITH ROLLUP;
+-------+----------+----------+
| my_id | SUM(pos) | SUM(neg) |
+-------+----------+----------+
|     1 |     0.00 |     0.00 |
|     2 |     1.00 |    -2.50 |
|     3 |     1.60 |    -1.00 |
|     4 |     1.00 |    -2.00 |
| total |     3.60 |    -5.50 |
+-------+----------+----------+
5 rows in set (0.02 sec)
Here's a hack for the amended problem - it ain't pretty but I think it works...
SELECT COALESCE(id,'') my_id
, SUM(pos)
, SUM(neg)
, COALESCE(string,'') n
FROM my_table
GROUP
BY id
, string
WITH ROLLUP
HAVING n <> '' OR my_id = ''
;
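Note that the hack assumes my_table has an extra text column called string, which the DDL at the top of this answer does not create. A minimal sketch of that assumed column (the names and values here are purely illustrative):
-- hypothetical extension: a label column whose value should not appear in the summary row
ALTER TABLE my_table ADD COLUMN string VARCHAR(50);
UPDATE my_table SET string = CONCAT('label ', id);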

select keyword, sum(positiveResults) + sum(NegativeResults)
from mytable
group by keyword;
If you need the absolute value, use sum(abs(NegativeResults)) instead.

This should be handled at least one layer above the SQL query layer.
The initial query can fetch the detail info and then the application layer can calculate the aggregation (summary row). Or, a second db call to fetch the summary directly can be used (although this would be efficient only for cases where the calculation of the summary is very resource-intensive and a second db call is really necessary - most of the time the app layer can do it more efficiently).
The ordering/layout of the results (i.e. the detail rows followed by the "footer" summary row) should be handled at the presentation layer.
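For the two-call variant, a rough sketch of the queries involved (assuming the question's columns live in a table called my_table - the real table name isn't given in the question):
-- call 1: the detail rows for the grid
SELECT id, PositiveResult, NegativeResult FROM my_table;
-- call 2: the totals, which the application appends as the footer row
SELECT SUM(PositiveResult) AS total_pos, SUM(NegativeResult) AS total_neg FROM my_table;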

I'd recommend doing this at the presentation layer, but doing it in SQL is also possible.
create table test (
keywordid int,
positiveresult decimal(10,2),
negativeresult decimal(10,2)
);
insert into test values
(1, 0, 0), (2, 1, -2.5), (3, 1.6, -1), (4, 1, -2);
select * from (
select keywordid, positiveresult, negativeresult
from test
union all
select null, sum(positiveresult), sum(negativeresult) from test
) main
order by
case when keywordid is null then 1000000 else keywordid end;
I added ordering using an arbitrarily high number when keywordid is null, so the ordered recordset can be pulled straight into the view for display.
Result:
+-----------+----------------+----------------+
| keywordid | positiveresult | negativeresult |
+-----------+----------------+----------------+
|         1 |           0.00 |           0.00 |
|         2 |           1.00 |          -2.50 |
|         3 |           1.60 |          -1.00 |
|         4 |           1.00 |          -2.00 |
|      NULL |           3.60 |          -5.50 |
+-----------+----------------+----------------+

Related

Inserting random data from a list

These are my table columns:
ID || Date || Description || Priority
My goal is to insert random test data of 2000 rows with date ranging between (7/1/2019 - 7/1/2020) and randomize the priority from list (High, Medium, Low).
I know how to insert random numbers but I am stuck with the date and the priority fields.
If I need to write code, any pointers on how do I do it?
Just to be clear - my issue is with randomizing the date and picking randomly from a given list.
CREATE TABLE mytable (
id SERIAL PRIMARY KEY,
date DATE NOT NULL,
description TEXT,
priority ENUM('High','Medium','Low') NOT NULL
);
INSERT INTO mytable (date, priority)
SELECT '2019-07-01' + INTERVAL FLOOR(RAND()*365) DAY,
ELT(1+FLOOR(RAND()*3), 'High', 'Medium', 'Low')
FROM DUAL;
The fake table DUAL is a special keyword. You can select from it, and it always returns exactly one row. But it has no real columns with data, so you can only select expressions.
Do this INSERT a few times and you get:
mysql> select * from mytable;
+----+------------+-------------+----------+
| id | date       | description | priority |
+----+------------+-------------+----------+
|  1 | 2019-10-20 | NULL        | Medium   |
|  2 | 2020-05-17 | NULL        | High     |
|  3 | 2020-06-25 | NULL        | Low      |
|  4 | 2020-05-06 | NULL        | Medium   |
|  5 | 2019-09-30 | NULL        | High     |
|  6 | 2019-08-06 | NULL        | Low      |
|  7 | 2020-02-21 | NULL        | High     |
|  8 | 2019-11-10 | NULL        | High     |
|  9 | 2019-07-30 | NULL        | High     |
+----+------------+-------------+----------+
Here's a trick to use the number of rows in the table itself to insert the same number of rows, basically doubling the number of rows:
INSERT INTO mytable (date, priority)
SELECT '2019-07-01' + INTERVAL FLOOR(RAND()*365) DAY,
ELT(1+FLOOR(RAND()*3), 'High', 'Medium', 'Low')
FROM mytable;
Just changing FROM DUAL to FROM mytable I change from selecting one row, to selecting the current number of rows from the table. But the values I insert are still random expressions, not the values already in those rows. So I get new rows with new random values.
Then repeat this INSERT as many times as you want to double the number of rows.
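For example, starting from a single seeded row, roughly eleven repetitions of the doubling insert take you past the 2000 rows the question asks for (1, 2, 4, ..., 1024, 2048); any surplus rows can simply be deleted afterwards.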
Read also about the ELT() function.
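As a quick standalone illustration (my own example, not from the original answer): ELT(N, str1, str2, ...) returns its Nth string argument, so the expression below picks one of the three priorities at random.
SELECT ELT(1 + FLOOR(RAND()*3), 'High', 'Medium', 'Low') AS random_priority;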
You seem to be looking for something like this. A basic random sample is:
select t.*
from t
where date >= '2019-07-01' and date < '2020-07-01'
order by random()
fetch first 2000 rows only;
Of course, the function for random() varies by database, as does the logic for limiting rows. This should get about the same distribution of priorities as in the original data.
If you want the rows to come by priority first, then use:
select t.*
from t
where date >= '2019-07-01' and date < '2020-07-01'
order by (case when priority = 'High' then 1 when priority = 'Medium' then 2 else 3 end),
random()
fetch first 2000 rows only;
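Since the question is about MySQL, a sketch of what the first query would look like there (RAND() and LIMIT instead of random() and FETCH FIRST; same assumed table and column names as above):
select t.*
from t
where date >= '2019-07-01' and date < '2020-07-01'
order by rand()
limit 2000;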

Insert data in table using two or more tables

I have two existing tables and want to create a third table using a few of their columns. The first two tables are:
Table one: users
| id | name | sid   |
| 1  | demo | test1 |
| 2  | anu  | test2 |
Table two: insights
| id | description | name    |
| 1  | yes         | demoone |
| 2  | no          | demotwo |
I want to insert data into a new table called insight_owner. I wrote the query below, but it gives me this error:
ERROR 1242 (21000): Subquery returns more than 1 row
The query used is
insert into insight_owner (column_one, column_two, column_three, column_four, column_five) VALUES ('1', '0', NULL, (select u.id from users u where u.sid='test1'), (select i.id from insights i)) ;
Expected output is
| column_one | column_two | column_three | column_four | column_five | column_six |
+------------+------------+--------------+-------------+-------------+------------+
|          1 |          1 |            1 |        NULL |           1 |          1 |
|          2 |          1 |            1 |        NULL |           1 |          2 |
column_five = Users id
column_six = Insight id
INSERT...SELECT syntax is what you're looking for (instead of INSERT...VALUES, which is limited to single values per column in each value list). That allows you to select the data directly from the table(s) concerned, using normal SELECT and JOIN syntax. You can also hard-code values which you want to appear on every row, just as you can in a normal SELECT statement. Basically, write the SELECT statement, get it to output what you want. Then stick an INSERT at the start of it and it sends the output to the desired table.
insert into insight_owner (column_one, column_two, column_three, column_four, column_five)
select '1', '0', NULL, (select u.id from users u where u.sid='test1'), i.id
from insights i
You are using
insert into insight_owner (column_one, column_two, column_three, column_four, column_five) VALUES ('1', '0', NULL, (select u.id from users u where u.sid='test1'), (select i.id from insights i));
Which basically inserts one row in your new table.
So, when you add subquery
select i.id from insights i
It will return all rows from the insights table, and you actually want just one value.
The result you will get is
| id |
| 1 |
| 2 |
And you want
| id |
| 1 |
So you should add a condition that makes sure you get only one result, as you do in the first subquery (where u.sid='test1'), or use a LIMIT.
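For instance, a sketch of the constrained version (the id = 1 filter here is purely illustrative - use whatever condition identifies the row you actually want):
insert into insight_owner (column_one, column_two, column_three, column_four, column_five)
VALUES ('1', '0', NULL,
(select u.id from users u where u.sid='test1'),
(select i.id from insights i where i.id=1));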
I hope this helps.

How to count numbers or occurrences in all columns and list them using MySQL?

I have a table looking like:
| A | B | C | ... | Z | <- names of columns
-----------------------
| 1 | 0 | 1 | ... | 1 |
| 0 | 1 | 1 | ... | 1 |
| 1 | 1 | 1 | ... | 1 |
| 0 | 1 | 1 | ... | 0 |
| 1 | 0 | 1 | ... | 1 |
And I would like to sum up all the 1s in each column and list them out. How can I do that using MySQL? The number of columns is about 80; if possible, I would like not to list them all in the SQL call.
I would like to get a response similar to this one:
A: 3
B: 3
C: 5
...
Z: 4
This table has been designed in a way that makes the query you describe more difficult.
Using many columns for data values that should be compared or counted together because they're the same type of value is called repeating groups. It's a violation of database normalization rules.
The more traditional way to store this data would be over 80 rows, not 80 columns.
CREATE TABLE mytable (
id INT AUTO_INCREMENT PRIMARY KEY,
label CHAR(1) NOT NULL,
value TINYINT NOT NULL
);
INSERT INTO mytable (label, value) VALUES
('A', 1), ('B', 0), ('C', 1), ...
Then you could use a simple query with an aggregate function like this:
SELECT label, SUM(value)
FROM mytable
GROUP BY label;
There are times when it's worth using a denormalized table design (like your current table), but that time is when you want to optimize for a particular query. Be careful about using denormalized designs, because they optimize for one query at the expense of all other queries you might run against the same data. The query you want to make is one of those that is made more difficult by using the denormalized design you currently have.
There is no easy way; you will need to list the columns explicitly. A UNION query should be what you need, like:
SELECT 'A' column_name, SUM(A) cnt FROM mytable
UNION ALL SELECT 'B', SUM(B) FROM mytable
UNION ALL SELECT 'C', SUM(C) FROM mytable
...
NB: it should be possible to generate the query programmatically using any text manipulation tool (Excel, perl, ...), or dynamically using a prepared statement.
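As a sketch of the prepared-statement route (this assumes the wide table is called mytable and lives in the current schema - adjust the names to your setup):
-- build the UNION ALL query text from the column list, then execute it
SET SESSION group_concat_max_len = 1000000;
SELECT GROUP_CONCAT(
CONCAT('SELECT ''', column_name, ''' AS column_name, SUM(`', column_name, '`) AS cnt FROM mytable')
SEPARATOR ' UNION ALL ')
INTO @sql
FROM information_schema.columns
WHERE table_schema = DATABASE()
AND table_name = 'mytable';
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;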

Unexpected result in MyISAM when grouping by bit and selecting distinct values

We have a MyISAM table with a single column named bit and two rows, containing 0 and 1. We group by this column, compute a count, and select it. The result below is as expected.
select count( bit), bit from tab GROUP BY bit;
| count(bit) | bit |
|------------|-----|
|          1 |   0 |
|          1 |   1 |
But when using the distinct keyword, the output value of the column is always 1. Why?
select count(distinct bit), bit from tab GROUP BY bit;
| count(distinct bit) | bit |
|---------------------|-----|
|                   1 |   1 |  # WHYYY
|                   1 |   1 |
I've been crawling the documentation and the internet but with no luck.
Here is the setup:
CREATE TABLE `tab` (
`bit` bit(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8; # When using InnoDB everything's fine
INSERT INTO `tab` (`bit`) VALUES
(CONV('1', 2, 10) + 0),
(CONV('0', 2, 10) + 0);
PS: One more thing. I've been running several experiments. Using group_concat, the bit column shows its per-group values correctly again.
select count(distinct bit), group_concat(bit) from tab GROUP BY bit;
| count(distinct bit) | group_concat(bit) |
|---------------------|-------------------|
|                   1 | 1 byte (0)        |
|                   1 | 1 byte (1)        |
Thanks to the comments, I'm now convinced not to use the BIT column type at all. The more reliable alternative is TINYINT(1).
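A sketch of that switch, in case it is useful (column and table names from the question; test on a copy first):
ALTER TABLE tab MODIFY `bit` TINYINT(1) NOT NULL;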
Inspired by the way the Adminer application handles bit values, I recommend using the BIN() function to cast the bit to a readable value every time you select it:
select count(distinct bit), BIN(bit) from tab GROUP BY bit;

mysql insert external data with join

I'm usually pretty resourceful, but I'm stuck on this one. Any help would be appreciated.
Say I've got a table for produce, like this, including counts of sold/in stock for each produce type.
+--------------+--------------+------+-----+
| Field        | Type         | Null | Key |
+--------------+--------------+------+-----+
| produce_type | varchar(100) | NO   | PRI |
| sold_count   | int(8)       | YES  |     |
| stock_count  | int(8)       | YES  |     |
+--------------+--------------+------+-----+
I'm doing a separate insert using external data for each of the 'stock' and 'sold' counts, with hundreds to thousands of produce_types at a time. I may have data with a given produce_type existing only in the 'stock' or 'sold' data to be inserted, but want all to be present in the table.
So, e.g., doing one insert for sold_count ('potato', 3), ('onion', 5) and one for stock_count ('potato', 8), ('carrots', 6), I'd want to end up with this:
+--------------+------------+-------------+
| produce_type | sold_count | stock_count |
+--------------+------------+-------------+
| potato       |          3 |           8 |
| onion        |          5 |        NULL |
| carrots      |       NULL |           6 |
+--------------+------------+-------------+
So I'd need to join against the existing data in the second column's insert statement, but all I can find here or elsewhere on the web are instructions for joins when inserting from another table.
INSERT IGNORE doesn't do it, as one of the 'potato' columns wouldn't get written to.
INSERT ... ON DUPLICATE KEY UPDATE gets closer but I can't figure out how to set the update field to the value from the dataset I'm inserting.
Do I need to create a temp table for the 2nd insert (+ outer join)? Any structurally simpler way of doing this?
Thanks in advance.
Edit: I think I can probably use this:
https://stackoverflow.com/a/3466/2540707
Does this work?
insert into produce ( produce_type, sold_count )
select produce_type, sold_count from sold_data
on duplicate key update sold_count = ( select sold_count from sold_data
where produce.produce_type = sold_data.produce_type
);
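Alternatively, a more conventional sketch of the same idea uses VALUES() to refer to the value that would have been inserted for the current row (this assumes sold_data has produce_type and sold_count columns, as in the attempt above):
insert into produce (produce_type, sold_count)
select produce_type, sold_count from sold_data
on duplicate key update sold_count = VALUES(sold_count);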