I have a performance issue with a query that uses multiple UNION ALL statements. I need to append (row by row) data from different tables into the same columns. The query needs to be used to create a view in MySQL. Here is an example:
CREATE OR REPLACE
ALGORITHM = UNDEFINED
DEFINER = usr
SQL SECURITY DEFINER
VIEW my_view AS
SELECT DISTINCT
column 1,
column 2,
column 3
FROM
table 1
WHERE
condition 1
UNION ALL
SELECT DISTINCT
column 1,
column 2,
column 3
FROM
table 2
WHERE
condition 2
UNION ALL
SELECT DISTINCT
column 1,
column 2,
column 3
FROM
table 3
WHERE
condition 3
It seems wasteful to chain multiple UNION ALLs just to append (row by row) the same features coming from different tables (not just 3 columns as in the example, I have many more). This requires a lot of resources from the DB, and the query takes so long that it fails with a "lost connection error during the query".
Is there any way to optimize this kind of query?
Thanks in advance.
UNION ALL is the most performant way of concatenating result sets. (UNION is slower because it removes duplicates.)
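To illustrate the difference between the two operators, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables t1 and t2 are hypothetical; the point is only that UNION has to compare rows to drop duplicates, while UNION ALL just appends.

```python
import sqlite3

# SQLite stands in for MySQL here; t1 and t2 are made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (c1 INTEGER, c2 TEXT);
    CREATE TABLE t2 (c1 INTEGER, c2 TEXT);
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (2, 'b'), (3, 'c');
""")

# UNION ALL appends the second result set to the first, duplicates included.
union_all_rows = conn.execute(
    "SELECT c1, c2 FROM t1 UNION ALL SELECT c1, c2 FROM t2"
).fetchall()

# UNION additionally removes the duplicate row (2, 'b').
union_rows = conn.execute(
    "SELECT c1, c2 FROM t1 UNION SELECT c1, c2 FROM t2"
).fetchall()

print(len(union_all_rows), len(union_rows))
```

The row (2, 'b') exists in both tables, so the UNION ALL result has one more row than the UNION result.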
Surely your timeout occurs when you use the view, not when you create it.
Your performance issue stems from one or more of the SELECT queries in your UNION ALL cascade being very slow. You, or your "data engineer" colleagues, may need to create appropriate indexes on your table 1, table 2, table 3 tables.
To figure this out, do these things.
Read Optimizing Queries With EXPLAIN.
Run SHOW CREATE TABLE whateverTableName;. Look at the output. It will show you the indexes.
Run the SELECT queries using that same table prefixed with EXPLAIN. It will show you the indexes it used to satisfy the query.
Ask another question here showing us the output from those two steps.
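The steps above can be tried out on a small scale. The sketch below uses Python's sqlite3 module as a stand-in, so the commands differ from MySQL: PRAGMA index_list plays roughly the role of SHOW CREATE TABLE, and EXPLAIN QUERY PLAN plays roughly the role of EXPLAIN. The table, column, and index names are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; table1, c1..c3 and idx_table1_c1 are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (c1 INTEGER, c2 TEXT, c3 TEXT);
    CREATE INDEX idx_table1_c1 ON table1 (c1);
""")

# Rough analogue of SHOW CREATE TABLE: list the indexes defined on the table.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('table1')")]

# Rough analogue of MySQL's EXPLAIN: ask the planner how it would run
# one branch of the UNION ALL. The last column holds the plan description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT c1, c2, c3 FROM table1 WHERE c1 = 5"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(index_names)
print(plan_text)
```

If the plan text names the index, the branch is using it. If it describes a full scan instead, that branch is a likely candidate for a new index.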
Or, it's possible your resultset from your big query is vast. There's no magic that can process millions of rows faster than O(n).
Related
I am in the process of learning MySQL right now, and while I get how to do UNIONs and JOINs, I'm not seeing the advantages of a UNION over any type of JOIN. They both combine results from tables, but it seems like you have to jump through more hoops to combine tables using UNION if their columns are not identical. Is there an advantage to using a UNION sometimes, or is it just another command we can use?
UNION appends rows from multiple tables/views.
A join, by contrast, matches rows from different related tables within a single SQL statement.
Union: used to combine the result sets of two or more SELECT statements whose columns have the same number and data types.
Join: used to retrieve matching records from 2 or more tables.
I have a dozen tables with the same structure. All of their names match question_20%. Each table has an indexed column named loaded, which can have the values 0 and 1.
I want to count all of the records where loaded = 1. If I had only one table, I would run select count(*) from question_2015 where loaded = 1.
Is there a query I can run that finds the tables in INFORMATION_SCHEMA.TABLES, sums over all of these counts, and produces a single output?
You can do what you want with dynamic SQL.
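As a rough illustration of the dynamic SQL idea, here is a sketch in Python with sqlite3 as a stand-in for MySQL. It reads the table names from SQLite's catalog (sqlite_master) where MySQL would use INFORMATION_SCHEMA.TABLES, and it builds the statement in Python where MySQL would typically use PREPARE and EXECUTE. The sample tables and rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL; the data below is invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question_2014 (id INTEGER, loaded INTEGER);
    CREATE TABLE question_2015 (id INTEGER, loaded INTEGER);
    CREATE TABLE other_table (id INTEGER, loaded INTEGER);
    INSERT INTO question_2014 VALUES (1, 1), (2, 0);
    INSERT INTO question_2015 VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO other_table VALUES (1, 1);
""")

# Step 1: find the matching tables in the catalog
# (INFORMATION_SCHEMA.TABLES would be the MySQL equivalent).
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'table' AND name LIKE 'question_20%' ORDER BY name"
)]

# Step 2: build one COUNT(*) branch per table and glue them with UNION ALL.
# The names come from the catalog, not from user input.
branches = [
    f'SELECT COUNT(*) AS n FROM "{name}" WHERE loaded = 1'
    for name in table_names
]
query = "SELECT SUM(n) FROM (" + " UNION ALL ".join(branches) + ") AS counts"

# Step 3: run the generated statement and read the single total.
total_loaded = conn.execute(query).fetchone()[0]
print(table_names, total_loaded)
```

Only the two question_20% tables are picked up, so other_table does not contribute to the total.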
However, you have a problem with your data structure. Having multiple parallel tables is usually a very bad idea. SQL supports very large tables, so having all the information in one table is a great convenience, from the perspective of querying (as you are now learning) and maintainability.
SQL offers indexes and partitioning schemes for addressing performance issues on large tables.
Sometimes, separate tables are necessary, to meet particular system requirements. If so, then a view should be available to combine all the tables:
create view v_tables as
select t1.*, 'table1' as which from table1 t1 union all
select t2.*, 'table2' as which from table2 t2 union all
. . .
If you had such a view, then your query would simply be:
select which, count(*)
from v_tables
where loaded = 1
group by which;
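Here is a small runnable sketch of that view and query, using Python's sqlite3 module as a stand-in for MySQL. It assumes just two tables, table1 and table2, with invented rows.

```python
import sqlite3

# SQLite stands in for MySQL; table1/table2 and their rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, loaded INTEGER);
    CREATE TABLE table2 (id INTEGER, loaded INTEGER);
    INSERT INTO table1 VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO table2 VALUES (1, 1);

    -- Each branch tags its rows with the name of the table they came from.
    CREATE VIEW v_tables AS
        SELECT t1.*, 'table1' AS which FROM table1 t1
        UNION ALL
        SELECT t2.*, 'table2' AS which FROM table2 t2;
""")

# Count the loaded rows per source table through the view.
counts = dict(conn.execute(
    "SELECT which, COUNT(*) FROM v_tables WHERE loaded = 1 GROUP BY which"
).fetchall())
print(counts)
```

Dropping the group by and selecting only count(*) would give the single overall total the question asks for.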
My problem is this:
select * from
(
select * from barcodesA
UNION ALL
select * from barcodesB
)
as barcodesTotal, boxes
where barcodesTotal.code=boxes.code;
Table barcodesA has 4000 entries
Table barcodesB has 4000 entries
Table boxes has about 180,000 entries
It takes 30 seconds to process the query.
Another problematic query:
select * from
viewBarcodesTotal, boxes
where viewBarcodesTotal.code=boxes.code;
viewBarcodesTotal contains the UNION ALL from both barcodes tables. It also takes forever.
Meanwhile,
select * from barcodesA , boxes where barcodesA.code=boxes.code
UNION ALL
select * from barcodesB , boxes where barcodesB.code=boxes.code
This one takes less than 1 second.
The question is obviously: why? Is my code bugged? Is MySQL bugged?
I have to migrate from Access to MySQL, and I would have to rewrite all my code if the first option is bugged.
Add an index on boxes.code if you don't already have one. Joining 8000 records (4K+4K) to the 180,000 will benefit from an index on the 180K side of the equation.
Also, be explicit and list the fields you need in your SELECT statements. Using * in a production query is bad form because it lets you avoid thinking about which fields you return and how big they might be. It is also risky here: your example has 2 different tables, barcodesA and barcodesB, with potentially different data types and column orders, and you are UNIONing them.
The REASON for the performance difference...
The first query says: first, build the complete union of EVERY record in A with EVERY record in B, THEN join that result to boxes on the code. The union result does not have an index for the join to be optimized against.
In your SECOND form, each table is joined to boxes individually, so each join IS optimized (judging by the speed of the second query there apparently IS an index, but I would make sure both tables have an index on the "code" column).
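To check that the two forms are interchangeable, here is a sketch in Python with sqlite3 as a stand-in for MySQL. The label column and the sample rows are invented, and the toy data shows only that both forms return the same rows, not the timing difference.

```python
import sqlite3

# SQLite stands in for MySQL; the label column and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE barcodesA (code TEXT);
    CREATE TABLE barcodesB (code TEXT);
    CREATE TABLE boxes (code TEXT, label TEXT);
    CREATE INDEX idx_boxes_code ON boxes (code);
    INSERT INTO barcodesA VALUES ('a1'), ('a2');
    INSERT INTO barcodesB VALUES ('b1');
    INSERT INTO boxes VALUES ('a1', 'box 1'), ('b1', 'box 2'), ('zz', 'box 3');
""")

# Form 1: union the barcode tables first, then join the derived table to boxes.
union_then_join = conn.execute("""
    SELECT t.code, boxes.label
    FROM (SELECT code FROM barcodesA
          UNION ALL
          SELECT code FROM barcodesB) AS t
    JOIN boxes ON t.code = boxes.code
""").fetchall()

# Form 2: join each barcode table to boxes separately, then union the results.
join_then_union = conn.execute("""
    SELECT a.code, boxes.label FROM barcodesA a JOIN boxes ON a.code = boxes.code
    UNION ALL
    SELECT b.code, boxes.label FROM barcodesB b JOIN boxes ON b.code = boxes.code
""").fetchall()

print(sorted(union_then_join) == sorted(join_then_union))
```

Since the results match, rewriting a slow union-then-join query into the join-then-union form should not change what your code receives, apart from row order.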
1000 apologies if I've repeated a question; I couldn't find an answer to it here.
I'm trying to retrieve the data from 2 separate columns from 2 unrelated tables in the same query.
I've tried using a UNION statement, but the problem is that I need to be able to separate the results into 'venues' and 'programmes' - here was what I did:
SELECT venue_name
FROM my_venues
UNION
SELECT programme_title
FROM my_programmes;
Maybe it's not necessary to combine the query and I can just do 2 separate queries? The database won't be especially large, but it seems unnecessary...
Help and thanks!
Just add a constant column in both selects, with the same name, but different values:
SELECT 'venues' as source, venue_name as thing_name
FROM my_venues
UNION ALL
SELECT 'programmes' as source, programme_title as thing_name
FROM my_programmes;
Now:
Rows with value "venues" for column source will come from the table my_venues,
rows with value "programmes" for column source will come from the table my_programmes.
I have a very big table (nearly 2,000,000 records) that got split into 2 smaller tables. One table contains only records from the last week and the other contains all the rest (which is a lot...).
Now I have some stored procedures / functions that used to query the big table before it got split.
I still need them to query the union of both tables, but a view that uses a union between the two tables seems to take forever...
that's my view:
CREATE VIEW `united_tables_view` AS select * from table1 union select * from table2;
Then I'd like to change every stored procedure that selects from 'oldBigTable' to select from 'united_tables_view'...
I've tried adding indexes to cut the time down, but nothing helps...
Any ideas?
PS
the view and union are my idea but any other creative idea would be perfect!
bring it on!
thanks!
Unless there is a reason not to, you should merge the tables rather than constantly query both of them.
Here is a question on StackOverflow about doing that:
How can I merge two MySQL tables?
If you need to keep them separate, you can use syntax along the lines of:
SELECT table1.column1, table2.column2 FROM table1, table2 WHERE table1.column1 = table2.column1;
Here is an article about when to use SELECT, JOIN and UNION
https://web.archive.org/web/1/http://articles.techrepublic%2ecom%2ecom/5100-10878_11-1050307.html