I scratch my head with UNION ALL - mysql

I am not an SQL query wizard at all, and here is my problem:
I have these 3 separate queries that work very well, and each one gives me a nice-looking frame with results on my website.
SELECT arretsautressb AS Raison, SUM(minutesarrets) AS Minutes
FROM rapport_production_salles_blanches_2_repeat
GROUP BY Raison
ORDER BY Minutes DESC;
SELECT redresseuseminutesarrets AS Raison, SUM(minutesarretsredresseuse) AS Minutes
FROM rapport_production_salles_blanches_3_repeat
GROUP BY Raison
ORDER BY Minutes DESC;
SELECT raisonarretsconvoyeurair AS Raison, SUM(minutesarretsconvoyeurair) AS Minutes
FROM rapport_production_salles_blanches_4_repeat
GROUP BY Raison
ORDER BY Minutes DESC;
So everything is fine with those 3 results: the Raison column returns all the rows, and Minutes holds the SUM of all rows grouped by Raison.
But I would like to merge those queries so that I get 1 big table with the results instead of 3 tables.
No matter how I format my UNION ALL code, I get only 1 row from each Raison query instead of all the rows I get when the queries are run separately. The Minutes column is fine, though: it holds the SUM over all the rows.
It would be great if someone could show me how to do it, because I have been reading documentation for a couple of hours and I am still stuck on this one.
This is what I have tried so far. There is no error, but only 1 row of Raison is taken from each table instead of all rows:
SELECT *
FROM ( (SELECT arretsautressb AS Raison,
SUM(minutesarrets) AS Minutes
FROM rapport_production_salles_blanches_2_repeat t1)
UNION ALL
(SELECT redresseuseminutesarrets AS Raison,
SUM(minutesarretsredresseuse) AS Minutes
FROM rapport_production_salles_blanches_3_repeat t2)
UNION ALL
(SELECT raisonarretsconvoyeurair AS Raison,
SUM(minutesarretsconvoyeurair) AS Minutes
FROM rapport_production_salles_blanches_4_repeat t3)
) AS t123
GROUP BY Raison
ORDER BY Minutes DESC
This is what I get from my UNION ALL query:
(screenshot: UNION ALL result)
But this is what I get from the 3 separate queries:
(screenshot: results of the 3 queries)

I think your query doesn't return your desired result for the following reasons:
It's fine to use a subquery where you specify the three tables and union them. However, when you use an aggregate (in this case SUM) without GROUP BY, each SELECT collapses to a single row, and the Raison value shown next to the sum is taken from an arbitrary row. Each of the three SELECTs needs its own GROUP BY.
Next, in each GROUP BY I refer to the underlying column rather than the alias: I changed GROUP BY Raison to GROUP BY t1.arretsautressb. MySQL also accepts the alias there, but the column name is unambiguous and portable.
I have used an ORDER BY on the outer query and I order by the second column, which is the summed minutes (sum_minutes).
The query I would use is the following:
SELECT *
FROM (
SELECT arretsautressb AS Raison
, SUM(minutesarrets) AS sum_minutes
FROM rapport_production_salles_blanches_2_repeat AS t1
GROUP BY t1.arretsautressb
UNION ALL
SELECT redresseuseminutesarrets AS Raison
, SUM(minutesarretsredresseuse) AS sum_minutes
FROM rapport_production_salles_blanches_3_repeat AS t2
GROUP BY t2.redresseuseminutesarrets
UNION ALL
SELECT raisonarretsconvoyeurair AS Raison
, SUM(minutesarretsconvoyeurair) AS sum_minutes
FROM rapport_production_salles_blanches_4_repeat AS t3
GROUP BY t3.raisonarretsconvoyeurair
) AS t123
ORDER BY 2 DESC
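To see why the per-SELECT GROUP BY matters, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The tables stops_a and stops_b and their rows are made up for illustration. SQLite, like MySQL without ONLY_FULL_GROUP_BY, accepts a bare column next to an aggregate, so the collapsed result can be reproduced there. SQLite does not allow parentheses around the members of a UNION, so they are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data, two tables instead of three for brevity.
    CREATE TABLE stops_a (reason TEXT, minutes INTEGER);
    CREATE TABLE stops_b (reason TEXT, minutes INTEGER);
    INSERT INTO stops_a VALUES ('jam', 10), ('jam', 5), ('cleaning', 20);
    INSERT INTO stops_b VALUES ('power', 30), ('jam', 7);
""")

# No GROUP BY: each SELECT collapses to ONE row (the whole-table sum).
collapsed = conn.execute("""
    SELECT reason, SUM(minutes) FROM stops_a
    UNION ALL
    SELECT reason, SUM(minutes) FROM stops_b
""").fetchall()

# GROUP BY inside every SELECT: one row per reason per table.
grouped = conn.execute("""
    SELECT reason, SUM(minutes) AS total FROM stops_a GROUP BY reason
    UNION ALL
    SELECT reason, SUM(minutes) AS total FROM stops_b GROUP BY reason
    ORDER BY 2 DESC
""").fetchall()

# Optional outer GROUP BY: merges reasons that occur in several tables.
merged = conn.execute("""
    SELECT reason, SUM(total) AS grand_total
      FROM (
            SELECT reason, SUM(minutes) AS total FROM stops_a GROUP BY reason
            UNION ALL
            SELECT reason, SUM(minutes) AS total FROM stops_b GROUP BY reason
           ) AS both_tables
     GROUP BY reason
     ORDER BY grand_total DESC
""").fetchall()

print(len(collapsed), "rows without GROUP BY")
print(grouped)
print(merged)
```

Whether the outer GROUP BY is wanted depends on whether the same Raison value in two tables should count as one reason or as two separate rows.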

Try this:
SELECT t123.Raison, SUM(t123.Minutes) AS Minutes FROM (
SELECT * FROM (
(SELECT arretsautressb AS Raison, SUM(minutesarrets) AS Minutes FROM rapport_production_salles_blanches_2_repeat t1 GROUP BY arretsautressb)
UNION ALL
(SELECT redresseuseminutesarrets AS Raison, SUM(minutesarretsredresseuse) AS Minutes FROM rapport_production_salles_blanches_3_repeat t2 GROUP BY redresseuseminutesarrets)
) t1
UNION ALL
(SELECT raisonarretsconvoyeurair AS Raison, SUM(minutesarretsconvoyeurair) AS Minutes FROM rapport_production_salles_blanches_4_repeat t3 GROUP BY raisonarretsconvoyeurair)
) AS t123 GROUP BY t123.Raison ORDER BY Minutes DESC

Related

Getting missing time period value with an interval in MySQL

I'm trying to fetch today's records in half-hour intervals, with the count of records for each interval.
My output comes out as expected. But if there are no records in a particular period, let's say 7:00 - 7:30, I don't get a row for that period with a zero count.
My query is as follows:
SELECT time_format( FROM_UNIXTIME(ROUND(UNIX_TIMESTAMP(start_time)/(30* 60)) * (30*60)) , '%H:%i')
thirtyHourInterval , COUNT(bot_id) AS Count FROM bot_activity
WHERE start_time BETWEEN CONCAT(CURDATE(), ' 00:00:00') AND CONCAT(CURDATE(), ' 23:59:59')
GROUP BY ROUND(UNIX_TIMESTAMP(start_time)/(30* 60))
My output, for reference:
We need a source for that 7:30 row; a row source for all the time values.
Suppose we have a clock table that contains all of the time values we want to return, so that we can write a query that returns the first column (the thirty-minute interval values),
as an example:
SELECT c.hhmm AS thirty_minute_interval
FROM clock c
WHERE c.hhmm ...
ORDER BY c.hhmm
then we can outer join that to the results of the query with the missing rows:
SELECT c.hhmm AS _thirty_minute_interval
, IFNULL(r._cnt_bot,0) AS _cnt_bot
FROM clock c
LEFT
JOIN ( -- query with missing rows
SELECT time_format(...) AS thirtyMinuteInterval
, COUNT(...) AS _cnt_bot
FROM bot_activity
WHERE
GROUP BY time_format(...)
) r
ON r.thirtyMinuteInterval = c.hhmm
WHERE c.hhmm ...
ORDER BY c.hhmm
The point is that the SELECT will not generate "missing" rows from a source where they don't exist; we need a source for them. We don't necessarily have to have a separate clock table; we could have an inline view generate the rows. But we do need to be able to SELECT those values from a source.
(Note that bot_id in the original query is indeterminate; the value will be from some row in the collapsed set of rows, but there is no guarantee which one. If we add ONLY_FULL_GROUP_BY to sql_mode, the query will throw an error, as most other relational databases do when non-aggregate expressions in the SELECT list don't appear in the GROUP BY and aren't functionally dependent on the GROUP BY.)
EDIT
In place of a clock table, we can use an inline view. For small sets, we could use something like this.
SELECT c.tmi
FROM ( -- thirty minute interval
SELECT CONVERT(0,TIME) + INTERVAL h.h+r.h HOUR + INTERVAL m.mm MINUTE AS tmi
FROM ( SELECT 0 AS h UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11
) h
CROSS JOIN ( SELECT 0 AS h UNION ALL SELECT 12 ) r
CROSS JOIN ( SELECT 0 AS mm UNION ALL SELECT 30 ) m
ORDER BY tmi
) c
ORDER
BY c.tmi
(Inline view c is a stand-in for a clock table; it returns time values on thirty-minute boundaries.)
That's kind of ugly. We can see where if we had a rowsource of just integer values, we could make this much simpler. But if we pick that apart, we can see how to extend the same pattern to generate fifteen minute intervals, or shorten it to generate two hour intervals.
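The clock-table pattern can be tried end to end with a small sketch. This one uses Python's standard-library sqlite3 module as a stand-in for MySQL, so the date functions differ: strftime replaces time_format, and the bucket is floored to the half hour rather than rounded as in the original query. The sample rows in bot_activity are invented, and the clock table is filled from a Python loop, which plays the role of the "rowsource of just integer values" mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clock (hhmm TEXT PRIMARY KEY);
    CREATE TABLE bot_activity (bot_id INTEGER, start_time TEXT);
    -- Hypothetical sample rows: nothing happens between 07:00 and 08:00.
    INSERT INTO bot_activity VALUES
        (1, '2024-01-01 06:40:00'),
        (2, '2024-01-01 06:55:00'),
        (3, '2024-01-01 08:05:00');
""")

# One row per half-hour slot of the day: '00:00', '00:30', ..., '23:30'.
slots = [(f"{h:02d}:{m:02d}",) for h in range(24) for m in (0, 30)]
conn.executemany("INSERT INTO clock VALUES (?)", slots)

rows = conn.execute("""
    SELECT c.hhmm, IFNULL(r.cnt, 0) AS cnt
      FROM clock c
      LEFT JOIN (
            -- Query with missing rows: only slots that have activity.
            SELECT strftime('%H', start_time) || ':' ||
                   CASE WHEN CAST(strftime('%M', start_time) AS INTEGER) < 30
                        THEN '00' ELSE '30' END AS slot,
                   COUNT(*) AS cnt
              FROM bot_activity
             GROUP BY slot
           ) r
        ON r.slot = c.hhmm
     WHERE c.hhmm BETWEEN '06:00' AND '08:30'
     ORDER BY c.hhmm
""").fetchall()

for hhmm, cnt in rows:
    print(hhmm, cnt)
```

Every slot in the requested range comes back, and the slots with no activity carry a count of zero because the LEFT JOIN keeps the clock row and IFNULL replaces the missing count.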

group by and order by count desc taking time

The query takes more than 6 seconds for 4 million records. Is there any other approach that would reduce the query time?
SELECT title_id, count(title_id) as count
FROM `title_keywords`
WHERE keyword_id in (1,2,3,4,5,6,7,8,9)
GROUP BY title_id
ORDER BY count desc
(screenshot: index and unique columns)
(screenshot: added composite index too)
Because the COUNT function needs to potentially touch every record in each group, there may not be much which can speed up the aggregation. However, we might be able to take advantage of an index to speed up the WHERE clause:
CREATE INDEX idx ON title_keywords (keyword_id, title_id);
You could also try reversing the order of the index columns, and in either case check the execution plan using EXPLAIN. The reason this index might work is that it would allow MySQL to quickly locate the matching keyword_id records. The index also covers title_id, so that value is available in the leaf nodes of the B-tree and the table rows themselves need not be read.
Try to avoid the IN clause by using an INNER JOIN:
SELECT title_id, count(title_id) as count
FROM title_keywords
INNER JOIN (
SELECT 1 col1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6
UNION
SELECT 7
UNION
SELECT 8
UNION
SELECT 9
) t ON t.col1 = title_keywords.keyword_id
group by title_id
order by count desc
and be sure you have a proper index on
table title_keywords columns( keyword_id, title_id )
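As a quick sanity check that the IN form and the JOIN form return the same rows, here is a small sketch using Python's standard-library sqlite3 module in place of MySQL. The sample rows are invented, and the printed plan comes from SQLite's EXPLAIN QUERY PLAN, whose output and optimizer choices differ from MySQL's EXPLAIN, so treat it only as an illustration of how to inspect a plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title_keywords (title_id INTEGER, keyword_id INTEGER);
    CREATE INDEX idx ON title_keywords (keyword_id, title_id);
    -- Hypothetical sample rows; keyword 42 is outside the searched range.
    INSERT INTO title_keywords VALUES
        (100, 1), (100, 2), (100, 3),
        (200, 2), (200, 9),
        (300, 42);
""")

in_query = """
    SELECT title_id, COUNT(title_id) AS cnt
      FROM title_keywords
     WHERE keyword_id IN (1, 2, 3, 4, 5, 6, 7, 8, 9)
     GROUP BY title_id
     ORDER BY cnt DESC
"""

# Derived table of keyword ids 1..9, built from integer literals only.
keys_sql = " UNION ALL ".join(f"SELECT {k} AS col1" for k in range(1, 10))
join_query = f"""
    SELECT tk.title_id, COUNT(tk.title_id) AS cnt
      FROM title_keywords tk
      JOIN ({keys_sql}) k ON k.col1 = tk.keyword_id
     GROUP BY tk.title_id
     ORDER BY cnt DESC
"""

with_in = conn.execute(in_query).fetchall()
with_join = conn.execute(join_query).fetchall()
print(with_in)
print(with_join)

# Show how SQLite plans the IN form; the last column is the description.
for step in conn.execute("EXPLAIN QUERY PLAN " + in_query):
    print(step[-1])
```

Since both forms return the same result, which one is faster on the real 4-million-row table is something only the execution plan and timing on that table can answer.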

MYSQL: Query 2 tables with union is very slow, how to improve?

I want to query 2 tables with (almost) identical columns at the same time. As a result, I want to get the 5 most recent entries (ordered by date, across both tables), no matter which table they come from.
So far, I tried this:
SELECT date, name, text FROM `table_A`
UNION
SELECT date, name, text FROM `table_B` ORDER BY date desc LIMIT 5
Unfortunately, this query takes about 20 seconds (both tables have ~300.000 rows).
When I just do:
SELECT date, name, text FROM `table_A` ORDER BY date desc LIMIT 5
or
SELECT date, name, text FROM `table_B` ORDER BY date desc LIMIT 5
the query takes only a few milliseconds.
So my question is: How can I improve my query to be faster or what select query should I use to get the 5 latest rows from both tables?
Select the most recent 5 rows in each table before combining them.
SELECT *
FROM (
(SELECT date, name, text FROM table_A ORDER BY date DESC LIMIT 5)
UNION
(SELECT date, name, text FROM table_B ORDER BY date DESC LIMIT 5)
) x
ORDER BY date DESC
LIMIT 5
The problem with your query is that it's first merging the entire tables and removing duplicates before doing the ordering and limiting. The merged table doesn't have an index, so that part is slow.
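The claim that limiting each table first gives the same answer as sorting the full union can be checked with a small sketch. It uses Python's standard-library sqlite3 module as a stand-in for MySQL, with invented rows and hypothetical index names idx_a_date and idx_b_date. SQLite does not accept parenthesized SELECTs inside a UNION, so each limited query is wrapped in its own subquery instead. The sketch uses UNION ALL, which skips the duplicate-removal step; that is an assumption that identical rows in both tables either do not occur or should both be kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A (date TEXT, name TEXT, text TEXT);
    CREATE TABLE table_B (date TEXT, name TEXT, text TEXT);
    CREATE INDEX idx_a_date ON table_A (date);
    CREATE INDEX idx_b_date ON table_B (date);
""")

# Hypothetical data: odd days go to table_A, even days to table_B.
for day in range(1, 21):
    table = "table_A" if day % 2 else "table_B"
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        (f"2024-01-{day:02d}", f"user{day}", f"entry {day}"),
    )

# Take the 5 newest rows of each table first, then sort only those 10 rows.
top_each_first = conn.execute("""
    SELECT date, name, text FROM (
        SELECT * FROM (SELECT date, name, text FROM table_A
                        ORDER BY date DESC LIMIT 5)
        UNION ALL
        SELECT * FROM (SELECT date, name, text FROM table_B
                        ORDER BY date DESC LIMIT 5)
    ) x
    ORDER BY date DESC
    LIMIT 5
""").fetchall()

# Reference result: merge everything, then sort and limit.
brute_force = conn.execute("""
    SELECT date, name, text FROM (
        SELECT date, name, text FROM table_A
        UNION ALL
        SELECT date, name, text FROM table_B
    ) x
    ORDER BY date DESC
    LIMIT 5
""").fetchall()

print(top_each_first == brute_force)
print([row[0] for row in top_each_first])
```

The overall top 5 can never contain more than 5 rows from a single table, which is why pre-limiting each side to 5 loses nothing.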

Mysql: Get records from last date

I want to get all records which are not "older" than 20 days. If there are no records within 20 days, I want all records from the most recent day. I'm doing this:
SELECT COUNT(DISTINCT t.id) FROM t
WHERE
(DATEDIFF(NOW(), t.created) <= 20
OR
(date(t.created) >= (SELECT max(date(created)) FROM t)));
This works so far, but it is awfully slow. created is a datetime, so the slowness might be due to the conversion to a date. Any ideas how to speed this up?
SELECT COUNT(*) FROM (
SELECT * FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT * FROM (SELECT * FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
I used the between clause just in case there might be dates in the future in the table. These will be excluded. The ORDER BY created DESC makes sure that LIMIT 1 picks the most recent record rather than an arbitrary one. If you just need the count, you can also simplify the select to
SELECT COUNT(*) FROM (
SELECT id FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT id FROM (SELECT id FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
otherwise, in the first select version you can leave out the outer select if you want all the data of the chosen records. The UNION will make sure that duplicates will be excluded (in other cases I always use UNION ALL since it is faster).
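A different angle on the speed problem, offered only as a sketch: wrapping created in DATEDIFF() or DATE() typically prevents an ordinary index on that column from being used, so one option is to compare the bare column against a precomputed boundary. In MySQL that could look like created >= NOW() - INTERVAL 20 DAY OR created >= (SELECT DATE(MAX(created)) FROM t). The example below tries the same idea with Python's standard-library sqlite3 module, where datetime() and date() replace the MySQL functions. The helper count_recent, the index name and the sample rows are made up, and "now" is passed in as a parameter so the result is reproducible. Note that this compares against an exact point in time 20 days back, which is slightly different from DATEDIFF's whole-day arithmetic.

```python
import sqlite3


def count_recent(conn, now):
    """Count rows from the last 20 days, or from the most recent day if none."""
    # Both conditions compare the bare `created` column to a constant,
    # so an index on `created` remains usable for the lookup.
    return conn.execute(
        """
        SELECT COUNT(*) FROM t
         WHERE created >= datetime(?, '-20 days')
            OR created >= (SELECT date(MAX(created)) FROM t)
        """,
        (now,),
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, created TEXT);
    CREATE INDEX idx_t_created ON t (created);
    -- Hypothetical rows, all older than 20 days relative to `now` below.
    INSERT INTO t (created) VALUES
        ('2024-01-05 08:00:00'),
        ('2024-01-10 09:30:00'),
        ('2024-01-10 17:45:00');
""")

now = "2024-03-01 12:00:00"
old_only = count_recent(conn, now)

# Add one row inside the 20-day window and count again.
conn.execute("INSERT INTO t (created) VALUES ('2024-02-25 09:00:00')")
with_recent = count_recent(conn, now)

print(old_only, with_recent)
```

With only old rows, the fallback returns every record from the most recent day; once a record inside the window exists, only the records inside the window are counted.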

Grab x amount of records from MySQL and get duplicates if not enough

Not really sure how to do this, but is it possible in one query to fetch x amount of records from a table, and if not enough is found, it will just randomly select duplicates.
I have a photos table, let's say it has 5 records in it, and I want to pull out 10 records and order them randomly, so I have something like:
SELECT * FROM TABLE
ORDER BY RAND()
LIMIT 10
This will just pull back 5 randomly, cos that is all I have in the table. Can I tell MySQL, hey, if you find less than 10, just randomly grab more until you reach that number?
Any help appreciated!
Thanks
This will do it:
select * from Table1
union all
select * from
(
select * from
(
(select * from Table1 limit 10)
union all
(select * from Table1 limit 10)
union all
(select * from Table1 limit 10)
union all
(select * from Table1 limit 10)
-- more unions...
) t2 order by rand()
) rand_ordered
limit 10
Union the table with itself as many times as the number of records you need (10 times in this example), so that it works even with only one row in the table; order the result by rand() and append it to your table with another union all.
This might not be the best-performing solution, but it will do it.
Example here: SQLFIDDLE