I want to optimize a query that uses a UNION as a subquery.
I'm not really sure how to construct the query, though.
I'm using MySQL 8.0.12.
Here is the original query:
---------------
| c1 | c2 |
---------------
| 18182 | 0 |
| 18015 | 0 |
---------------
2 rows in set (0.35 sec)
I'm sorry, but the question would not save when I pasted the SQL query as text and formatted it with Ctrl+K.
Output expected
---------------
| c1 | c2 |
---------------
| 18182 | 167 |
| 18015 | 0 |
---------------
As output, I would like the difference in row counts between the two tables in the UNION ALL.
I processed this question using the wizard https://stackoverflow.com/questions/ask
Since a parenthesized SELECT can be used almost anywhere an expression can go:
SELECT
ABS( (SELECT COUNT(*) FROM tbl_aaa) -
(SELECT COUNT(*) FROM tbl_bbb) ) AS diff;
Also, MySQL is happy to allow a SELECT without a FROM.
There are several ways to do this, including UNION, but I wouldn't recommend that, as it is IMO a bit 'hacky'. Instead, I suggest using subqueries or CTEs.
With subqueries
SELECT
ABS(c_tbl_aaa.size - c_tbl_bbb.size) as diff
FROM (
SELECT
COUNT(*) as size
FROM tbl_aaa
) c_tbl_aaa
CROSS JOIN (
SELECT
COUNT(*) as size
FROM tbl_bbb
) c_tbl_bbb
With CTEs, also known as WITHs
WITH c_tbl_aaa AS (
SELECT
COUNT(*) as size
FROM tbl_aaa
), c_tbl_bbb AS (
SELECT
COUNT(*) as size
FROM tbl_bbb
)
SELECT
ABS(c_tbl_aaa.size - c_tbl_bbb.size) as diff
FROM c_tbl_aaa
CROSS JOIN c_tbl_bbb
In practice, the two are equivalent. Depending on your needs, you might want to join the results on a key instead; in that case, you could select a constant number in each subquery as a "pseudo id" to join on.
Since you only want to know the differences, I used the ABS function, which returns the absolute value of a number.
Let me know if you want a solution with UNIONs anyway.
Edit: As @Rick James pointed out, COUNT(*) should be used in the subqueries to count the number of rows, as COUNT(id_***) will only count the rows with non-null values in that field.
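For anyone who wants to try the approaches above without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table contents and the `note` column are invented for illustration; only the table names `tbl_aaa` and `tbl_bbb` come from the answers. It runs the scalar-subquery form and the CTE form, and also shows the COUNT(*) versus COUNT(column) difference mentioned in the edit.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption: the
# queries below use only syntax that both engines accept).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_aaa (id INTEGER, note TEXT);
    CREATE TABLE tbl_bbb (id INTEGER, note TEXT);
    -- Hypothetical sample rows: 3 in tbl_aaa, 5 in tbl_bbb.
    INSERT INTO tbl_aaa VALUES (1, 'x'), (2, NULL), (3, 'z');
    INSERT INTO tbl_bbb VALUES (1, 'x'), (2, 'y'), (3, NULL), (4, NULL), (5, 'w');
""")

# Scalar subqueries inside an expression, no FROM clause needed.
diff_scalar = conn.execute("""
    SELECT ABS((SELECT COUNT(*) FROM tbl_aaa) -
               (SELECT COUNT(*) FROM tbl_bbb)) AS diff
""").fetchone()[0]

# The same result with CTEs and a CROSS JOIN of two one-row results.
diff_cte = conn.execute("""
    WITH c_tbl_aaa AS (SELECT COUNT(*) AS size FROM tbl_aaa),
         c_tbl_bbb AS (SELECT COUNT(*) AS size FROM tbl_bbb)
    SELECT ABS(c_tbl_aaa.size - c_tbl_bbb.size) AS diff
    FROM c_tbl_aaa CROSS JOIN c_tbl_bbb
""").fetchone()[0]

# COUNT(*) counts rows; COUNT(note) skips rows where note IS NULL.
count_all, count_note = conn.execute(
    "SELECT COUNT(*), COUNT(note) FROM tbl_bbb"
).fetchone()

print(diff_scalar, diff_cte, count_all, count_note)
```

With this sample data both forms give the same difference, and COUNT(note) is smaller than COUNT(*) because two of the `note` values are NULL.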
How to retrieve odd rows from the table?
In the base table, each Cr_id always appears twice.
Base table
I want a SELECT statement that retrieves only those rows with c_id = 1 where that row is the first one for its Cr_id, as shown in the output table.
Output table
The base table and the output table should make clear what I want. Thanks.
Testing for the minimum date should be enough:
drop table if exists t;
create table t(c_id int,cr_id int,dt date);
insert into t values
(1,56,'2020-12-17'),(56,56,'2020-12-17'),
(1,8,'2020-12-17'),(56,8,'2020-12-17'),
(123,78,'2020-12-17'),(1,78,'2020-12-18');
select c_id,cr_id,dt
from t
where c_id = 1 and
dt = (select min(dt) from t t1 where t1.cr_id = t.cr_id);
+------+-------+------------+
| c_id | cr_id | dt |
+------+-------+------------+
| 1 | 56 | 2020-12-17 |
| 1 | 8 | 2020-12-17 |
+------+-------+------------+
2 rows in set (0.002 sec)
What you're looking for could be "PARTITION BY", at least if you're working with MS SQL Server.
(In the future, please include more background; SQL is not just SQL.)
https://codingsight.com/grouping-data-using-the-over-and-partition-by-functions/
I have an old query lying around that can put a sorting index on data that lacks one, although the underlying reason is 99.9% sure to be bad data design.
Typically I use this query to remove bad data, but you can rewrite it as a join instead, so that you can identify the data you need.
I'm not posting that rewritten query here, in order to point out that bad data design results in more work when reading the data afterwards, which seems to be the real root cause here.
DELETE t
FROM
(
SELECT ROW_NUMBER () OVER (PARTITION BY column_1 ,column_2, column_3 ORDER BY column_1,column_2 ,column_3 ) AS Seq
FROM Table
)t
WHERE Seq > 1
Edit: If it makes any difference, I am using mysql 5.7.19.
I have a table A, and am trying to randomly sample on average 10% of the rows. I have decided that using rand() in a subquery, and then filtering out on that random result would do the trick, but it is giving unexpected results. When I print out the randomly generated value after filtering, I get random values that do not match my main query's "where" clause, so I suppose it is regenerating the random value in the outer select.
I guess I'm missing something to do with subqueries and when things are executed, but I'm really not sure what's going on.
Can anyone explain what I might be doing wrong? I've checked out this post: "In which sequence are queries and sub-queries executed by the SQL engine?", and my subquery is correlated, so I assume that my subquery is being executed first, and then the main query is filtering off of it. Given my assumptions, I do not understand why the result has values that should have been filtered away.
Query:
select
*
from
(
select
*,
rand() as rand_value
from
A
) a_rand
where
rand_value < 0.1;
Result:
--------------------------------------
| id | events | rand_value |
--------------------------------------
| c | 1 | 0.5512495763145849 | <- not what I expected
--------------------------------------
I am not able to reproduce this using this SQL Fiddle; use that link and click the blue [Run SQL] button a few times.
CREATE TABLE Table1
(`x` int)
;
INSERT INTO Table1
(`x`)
VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
;
Query 1:
select
*
from (
select
*
, rand() as rand_value
from Table1
) a_rand
where
rand_value < 0.1
[Results]:
| x | rand_value |
|---|---------------------|
| 1 | 0.03006686086772649 |
| 1 | 0.09353976332912199 |
| 1 | 0.08519635823107917 |
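The behaviour the question expects (one random value per row, used both for filtering and for display) can be sketched outside the database. This is only an illustration in plain Python, not an explanation of what MySQL does internally; `sample_rows`, `threshold`, and `seed` are hypothetical names introduced here.

```python
import random


def sample_rows(rows, threshold=0.1, seed=None):
    """Keep each row with probability `threshold`, returning the row
    together with the single random value that decided its fate."""
    rng = random.Random(seed)
    sampled = []
    for row in rows:
        rand_value = rng.random()  # drawn exactly once per row
        if rand_value < threshold:
            # The value shown is the same value that was filtered on.
            sampled.append((*row, rand_value))
    return sampled


# Hypothetical stand-in for table A: 1000 rows of (id, events).
rows = [(i, 1) for i in range(1000)]
sampled = sample_rows(rows, threshold=0.1, seed=42)
print(len(sampled), "rows kept out of", len(rows))
```

Because the random value is generated once and stored alongside the row, every value in the output is below the threshold by construction.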
I need to make several select statements to get simple data (only one row containing one or several fields for each select).
Simplified example:
select name, price from article where id=125
select log, email from user where uid=241
I want to process only a single statement on the PHP side (that is, I do NOT want to prepare several statements, execute them, catch and handle exceptions for each execution, and finally fetch the result for each statement...).
I tried:
select * from (
(select name, price from article where id=125) as a,
(select log, email from user where uid=241) as b
)
which works great if every subselect returns values:
name | price | log | email
------------------------------------------
dummy | 12,04 | john | john@example.com
But if one of the subselects returns empty, the whole select returns empty.
What I want is: null values for empty resulting subselects.
I tried many things with ifnull() and coalesce(), but couldn't get the expected result (I know how to use them with null values, but I didn't find a way to deal with them in the case of an empty result set).
I finally found a solution with left joins:
select * from (
(select 1) as thisWillNeverReturnEmpty
left join (select name, price from article where id=125) as a on 1
left join (select log, email from user where uid=241) as b on 1
)
which works perfectly even if one of the subqueries returns empty (or even both, therefore the "select 1").
Another way I found on SO would be to add a count(*) in each subquery to make sure there's a value.
But it all looks quite dirty and I can't believe there's no simple way just using something like ifnull().
What is the right way to do it?
The best way I finally found was:
select * from (
(select count(*) as nbArt, name, price from article where id=125) as a,
(select count(*) as nbUser, log, email from user where uid=241) as b
)
This way, no subquery ever returns empty, which solves the problem (there's always at least a "zero" count followed by null values).
Sample result when no article is found:
nbArt | name | price | nbUser | log | email
----------------------------------------------------------------
0 | null | null | 1 | john | john@example.com
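The LEFT JOIN variant from the question can be checked quickly with Python's built-in sqlite3 module, used here as a stand-in for MySQL. The schema below is a guess based on the column names in the question, and the alias `anchor` is a name made up for the one-row `select 1` table. The sketch runs the query once with no matching article and once with one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, reconstructed from the column names in the question.
    CREATE TABLE article (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE user (uid INTEGER PRIMARY KEY, log TEXT, email TEXT);
    INSERT INTO user VALUES (241, 'john', 'john@example.com');
""")

# A one-row anchor table guarantees exactly one output row; each
# LEFT JOIN then contributes either its values or NULLs.
LEFT_JOIN_QUERY = """
    SELECT a.name, a.price, b.log, b.email
    FROM (SELECT 1 AS one) AS anchor
    LEFT JOIN (SELECT name, price FROM article WHERE id = 125) AS a ON 1
    LEFT JOIN (SELECT log, email FROM user WHERE uid = 241) AS b ON 1
"""

# Article 125 does not exist yet: name and price come back as NULL.
missing_article = conn.execute(LEFT_JOIN_QUERY).fetchall()

# After inserting the article, all four columns are filled.
conn.execute("INSERT INTO article VALUES (125, 'dummy', 12.04)")
both_present = conn.execute(LEFT_JOIN_QUERY).fetchall()

print(missing_article)
print(both_present)
```

In the PHP code the NULL columns can then be tested directly, without a separate count column.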
I have the following data:
Name | Condition
Mike | Good
Mike | Good
Steve | Good
Steve | Alright
Joe | Good
Joe | Bad
I want to write an if statement: if Bad exists for a name, classify the name as Bad. If Bad does not exist but Alright exists, classify it as Alright. If only Good exists, classify it as Good.
So my data would turn into:
Name | Condition
Mike | Good
Steve | Alright
Joe | Bad
Is this possible in SQL?
An Access query would be easy if you first create a table which maps Condition to a rank number.
Condition rank
--------- ----
Bad 1
Alright 2
Good 3
Then a GROUP BY query would give you the minimum rank for each Name:
SELECT y.Name, Min(c1.rank) AS MinOfrank
FROM
[YourTable] AS y
INNER JOIN conditions AS c1
ON y.Condition = c1.Condition
GROUP BY y.Name;
If you want to display the Condition string for those ranks, join back to the conditions table again:
SELECT sub.Name, sub.MinOfrank, c2.Condition
FROM
(
SELECT y.Name, Min(c1.rank) AS MinOfrank
FROM
[YourTable] AS y
INNER JOIN conditions AS c1
ON y.Condition = c1.Condition
GROUP BY y.Name
) AS sub
INNER JOIN conditions AS c2
ON sub.MinOfrank = c2.rank;
Performance should be fine with indexes on those conditions fields.
Seems to me this approach could also work in those other databases (MySQL and SQL Server) tagged in the question.
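To try the rank-table approach outside Access, here is a minimal sketch using Python's built-in sqlite3 module. The table name `your_table` and the column name `rank_no` are substitutions made for this example (RANK is a reserved word in MySQL 8.0, so a different column name avoids quoting there); the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (name TEXT, condition TEXT);
    INSERT INTO your_table VALUES
        ('Mike', 'Good'), ('Mike', 'Good'),
        ('Steve', 'Good'), ('Steve', 'Alright'),
        ('Joe', 'Good'), ('Joe', 'Bad');

    -- Lookup table: a lower rank_no means a worse condition.
    CREATE TABLE conditions (condition TEXT PRIMARY KEY, rank_no INTEGER);
    INSERT INTO conditions VALUES ('Bad', 1), ('Alright', 2), ('Good', 3);
""")

# Inner query: worst (minimum) rank per name.
# Outer join: translate that rank back into its condition string.
rows = conn.execute("""
    SELECT sub.name, c2.condition
    FROM (
        SELECT y.name, MIN(c1.rank_no) AS min_rank
        FROM your_table AS y
        INNER JOIN conditions AS c1 ON y.condition = c1.condition
        GROUP BY y.name
    ) AS sub
    INNER JOIN conditions AS c2 ON sub.min_rank = c2.rank_no
    ORDER BY sub.name
""").fetchall()

for name, condition in rows:
    print(name, condition)
```

Adding a new condition later only requires a new row in `conditions`, with no change to the query.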
You can use a CASE expression to rank the conditions, then MAX() or MIN() to summarize the results before returning them to the user in the same format.
Query:
SELECT [Name]
, case min(case condition when 'bad' then 0 when 'alright' then 1 else 2 end)
when 0 then 'bad' when 1 then 'alright' when 2 then 'good' end as Condition
from mytable
group by [name]
MySQL has an IF() function.
Here, have a look at it: https://dev.mysql.com/doc/refman/5.1/en/control-flow-functions.html#function_if
Let's say we have this table:
Symbol | Size
A | 12
B | 5
A | 3
A | 6
B | 8
And we want a view like this:
Symbol | Size
A | 21
B | 13
So we use this:
Select Symbol, sum(Size) from table group by Symbol order by Symbol ASC
But instead we get this:
Symbol | Size
A | 12
B | 5
What am I doing wrong?!
You are doing it right, you should expect the correct results. Could you please supply more information about the DB you are using, additional schemas, etc?
Maybe you have some unique index on Symbol?
Try to execute the following to "sanity-test" your system:
SELECT SUM(Size) FROM table
Should result in 34
SELECT Symbol, Count(*) FROM table GROUP BY Symbol
Should result in 3 and 2
If both of the above work perfectly as you noted, please try:
SELECT Symbol, Count(*), Sum(Size) FROM table GROUP BY Symbol
This is your code, with the additions of Count(*) and without the ORDER BY clause. If that does not work after the two above do, I'm really puzzled...
I found out that somewhere in the SELECT commands that led to the un-SUMable table, there was a plain join instead of a left join. Although I still don't get why that should mess up the calculation, I changed it and now it works... I'm sorry I couldn't upload the whole thing...
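One possible way a plain (inner) join can change a sum is by dropping rows that have no match in the joined table. The sketch below is a guess at that mechanism, not a reconstruction of the actual queries, which were never posted: the tables `sizes` and `refs` and the column `ref_id` are invented for illustration, and SQLite (via Python's sqlite3 module) stands in for the unnamed database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: only the first A row and the first B row
    -- have a matching row in refs.
    CREATE TABLE sizes (symbol TEXT, size INTEGER, ref_id INTEGER);
    CREATE TABLE refs (id INTEGER PRIMARY KEY);
    INSERT INTO sizes VALUES
        ('A', 12, 1), ('B', 5, 2), ('A', 3, NULL), ('A', 6, NULL), ('B', 8, NULL);
    INSERT INTO refs VALUES (1), (2);
""")

# Inner join: rows without a match in refs vanish before SUM runs.
inner_sums = conn.execute("""
    SELECT s.symbol, SUM(s.size)
    FROM sizes AS s
    JOIN refs AS r ON r.id = s.ref_id
    GROUP BY s.symbol
    ORDER BY s.symbol
""").fetchall()

# Left join: every row of sizes survives, so SUM sees all of them.
left_sums = conn.execute("""
    SELECT s.symbol, SUM(s.size)
    FROM sizes AS s
    LEFT JOIN refs AS r ON r.id = s.ref_id
    GROUP BY s.symbol
    ORDER BY s.symbol
""").fetchall()

print(inner_sums)
print(left_sums)
```

With this made-up data the inner join reproduces the unexpected 12 and 5, while the left join gives the expected 21 and 13.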