This is an extension of the "dual" table concept (a temporary table created on the fly for one query and discarded straight after).
I am trying to join a multi-row dual table with another table in a single statement, to avoid running the same query several times with different parameters.
One issue I am having is that UNION is very slow for dual tables (about 100 ms when joining 50 dual tables together), and I am not aware of a more efficient way to accomplish the following:
SELECT
b.id,
b.ref_unid,
a.date
FROM
(
SELECT
'b8518a84-c501-11dd-b0b6-001d7dc91168' as unid,
'2010-01-05' as date
UNION
SELECT
'b853a1f2-c501-11dd-b0b6-001d7dc91168',
'2010-01-06'
UNION
SELECT
'b8557bd0-c501-11dd-b0b6-001d7dc91168',
'2010-01-07'
/* ... */
) as a
join other_table b
ON
b.ref_unid = a.unid
Is there another way of accomplishing this goal?
Is there any syntax similar to that of the INSERT INTO ... VALUES statement that would accomplish this, such as:
SELECT
unid,
id
FROM
(
WITH (unid, date) USING VALUES
(
('b8518a84-c501-11dd-b0b6-001d7dc91168','2010-01-05'),
('b853a1f2-c501-11dd-b0b6-001d7dc91168','2010-01-06'),
('b8557bd0-c501-11dd-b0b6-001d7dc91168','2010-01-07'),
/* ... */
)
) as a
join other_table b
ON
b.ref_unid = a.unid
I'm looking for a 1-statement solution. Multiple trips to the database aren't possible.
There's no other convention I'm aware of that's available in MySQL to construct a derived table in a single statement. If this dealt with a single column, at ~50 values it could be converted to use an IN clause.
The best performing approach is to load the data into a table of one form or another. In MySQL, for temporary use I'd recommend the MEMORY engine. At ~50 tuples, I have to wonder why the data isn't already in the database...
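A rough sketch of the load-into-a-table approach, driven from application code. This uses Python's standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere (that is an assumption on my part; in MySQL the table would be created with CREATE TEMPORARY TABLE ... ENGINE=MEMORY). The table name lookup, the function name, and the rows in other_table are made up for illustration. Note that it takes several statements, so it only helps when more than one statement per request is acceptable.

```python
import sqlite3


def join_with_temp_table(conn, pairs):
    """Load (unid, date) pairs into a temporary table, then join once."""
    cur = conn.cursor()
    # Hypothetical scratch table; in MySQL this would use ENGINE=MEMORY.
    cur.execute("CREATE TEMP TABLE lookup (unid TEXT PRIMARY KEY, date TEXT)")
    cur.executemany("INSERT INTO lookup (unid, date) VALUES (?, ?)", pairs)
    cur.execute(
        "SELECT b.id, b.ref_unid, a.date "
        "FROM lookup AS a "
        "JOIN other_table AS b ON b.ref_unid = a.unid "
        "ORDER BY b.id"
    )
    rows = cur.fetchall()
    cur.execute("DROP TABLE lookup")
    return rows


# Made-up sample data: two rows match the lookup pairs, one does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE other_table (id INTEGER PRIMARY KEY, ref_unid TEXT)")
conn.executemany(
    "INSERT INTO other_table (id, ref_unid) VALUES (?, ?)",
    [
        (1, "b8518a84-c501-11dd-b0b6-001d7dc91168"),
        (2, "b8557bd0-c501-11dd-b0b6-001d7dc91168"),
        (3, "ffffffff-0000-0000-0000-000000000000"),
    ],
)

pairs = [
    ("b8518a84-c501-11dd-b0b6-001d7dc91168", "2010-01-05"),
    ("b853a1f2-c501-11dd-b0b6-001d7dc91168", "2010-01-06"),
    ("b8557bd0-c501-11dd-b0b6-001d7dc91168", "2010-01-07"),
]
rows = join_with_temp_table(conn, pairs)
```

The inserts go through executemany with bound parameters, so the 50 or so tuples never have to be spliced into the SQL text.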
I have about 20 tables. Each has only an id (primary key) and a description (varchar). There is a fair amount of data, up to about 400 rows in a single table.
Right now I have to fetch data from at least 15 tables at a time.
I am querying them one by one, which means 15 calls in one session. This is making my process slow.
Can anyone suggest a better way to get the results from the database?
I am using a MySQL database with Java Spring on the server side. Will creating a view that combines all the tables help?
The application is becoming slow because of this issue, and I need a solution that will make my process faster.
It sounds like your schema isn't so great. 20 tables of id/varchar sounds like a broken EAV, which is generally considered broken to begin with. Just the same, I think a UNION query will help out. This would be the "View" to create in the database so you can just SELECT * FROM thisviewyoumade and let it worry about hitting all the tables.
A UNION query works by having multiple SELECT statements "stacked" on top of one another. It's important that each SELECT statement has the same number, order, and types of fields so that when it stacks the results, everything matches up.
In your case, it makes sense to manufacture an extra field so you know which table each row came from. Something like the following:
SELECT 'table1' as tablename, id, col2 FROM table1
UNION ALL
SELECT 'table2', id, col2 FROM table2
UNION ALL
SELECT 'table3', id, col2 FROM table3
... and on and on
The names or aliases of the fields in the first SELECT statement are the field names used in the result set that is returned, so no worries about adding a bunch of AS blahblahblah aliases in subsequent SELECT statements.
The real question is whether this union query will perform faster than 15 individual calls on such a tiny tiny tiny amount of data. I think the better option would be to change your schema so this stuff is already stored in one table just like this UNION query outputs. Then you would need a single select statement against a single table. And 400x20=8000 is still a dinky little table to query.
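To make the single-round-trip idea concrete, here is a small sketch of how the application side might consume such a view. It is written in Python with the standard-library sqlite3 module purely so that it runs as-is (the question uses Java Spring and MySQL, so treat this as an illustration, not a drop-in). The view name all_lookups, the function name, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, description TEXT);
    INSERT INTO table1 VALUES (1, 'red'), (2, 'green');
    INSERT INTO table2 VALUES (1, 'small');

    -- Hypothetical view name; one stacked SELECT per lookup table.
    CREATE VIEW all_lookups AS
        SELECT 'table1' AS tablename, id, description FROM table1
        UNION ALL
        SELECT 'table2', id, description FROM table2;
""")


def load_lookups(conn):
    """Fetch every lookup table in one query, grouped by source table."""
    lookups = {}
    query = "SELECT tablename, id, description FROM all_lookups"
    for tablename, row_id, description in conn.execute(query):
        lookups.setdefault(tablename, {})[row_id] = description
    return lookups


lookups = load_lookups(conn)
```

The manufactured tablename column is what lets the caller split the stacked rows back into one dictionary per table.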
To get a row of all descriptions into app code in a single round trip, send a query along these lines:
select t1.description, ... t15.description
from t -- this should contain all needed ids
join table1 t1 on t1.id = t.t1id
...
join table15 t15 on t15.id = t.t15id
I can't give you exactly what you need, but here is a way to merge the values from all those tables into a single table:
CREATE TABLE table_name AS (
SELECT *
FROM table1 t1
LEFT JOIN table2 t2 ON t1.ID=t2.ID
...
LEFT JOIN tableN tN ON tN-1.ID=tN.ID
)
[Summary of the question: 2 SQL statements produce same results, but at different speeds. One statement uses JOIN, other uses IN. JOIN is faster than IN]
I tried two kinds of SELECT statement on two tables, named booking_record and inclusions. The table inclusions has a many-to-one relation with the table booking_record.
(Table definitions not included for simplicity.)
First statement: (using IN clause)
SELECT
id,
agent,
source
FROM
booking_record
WHERE
id IN
( SELECT DISTINCT
foreign_key_booking_record
FROM
inclusions
WHERE
foreign_key_bill IS NULL
AND
invoice_closure <> FALSE
)
Second statement: (using JOIN)
SELECT
id,
agent,
source
FROM
booking_record
JOIN
( SELECT DISTINCT
foreign_key_booking_record
FROM
inclusions
WHERE
foreign_key_bill IS NULL
AND
invoice_closure <> FALSE
) inclusions
ON
id = foreign_key_booking_record
With 300,000+ rows in the booking_record table and 6,100,000+ rows in the inclusions table, the 2nd statement delivered 127 rows in just 0.08 seconds, but the 1st statement took nearly 21 minutes for the same records.
Why is JOIN so much faster than the IN clause?
This behavior is well-documented. See here.
The short answer is that until MySQL version 5.6.6, MySQL did a poor job of optimizing these types of queries. What would happen is that the subquery would be run each time for every row in the outer query. Lots and lots of overhead, running the same query over and over. You could improve this by using good indexing and removing the distinct from the in subquery.
This is one of the reasons that I prefer exists instead of in, if you care about performance.
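For reference, here is roughly what the EXISTS form of the query from the question would look like, wrapped in a small Python/sqlite3 harness so it can be run and compared against the IN form. SQLite is only a stand-in here (an assumption, it says nothing about MySQL timings), the sample rows are invented, and FALSE is written as 0 for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking_record (id INTEGER PRIMARY KEY, agent TEXT, source TEXT);
    CREATE TABLE inclusions (
        id INTEGER PRIMARY KEY,
        foreign_key_booking_record INTEGER,
        foreign_key_bill INTEGER,
        invoice_closure INTEGER
    );
    INSERT INTO booking_record VALUES
        (1, 'a1', 'web'), (2, 'a2', 'phone'), (3, 'a3', 'web');
    INSERT INTO inclusions VALUES
        (10, 1, NULL, 1),   -- qualifies
        (11, 1, NULL, 1),   -- qualifies (second match for booking 1)
        (12, 2, 7,    1),   -- already billed
        (13, 3, NULL, 0);   -- invoice_closure is false
""")

IN_QUERY = """
    SELECT id, agent, source
    FROM booking_record
    WHERE id IN (
        SELECT DISTINCT foreign_key_booking_record
        FROM inclusions
        WHERE foreign_key_bill IS NULL AND invoice_closure <> 0
    )
    ORDER BY id
"""

# Correlated EXISTS: the subquery only has to find one matching row.
EXISTS_QUERY = """
    SELECT br.id, br.agent, br.source
    FROM booking_record AS br
    WHERE EXISTS (
        SELECT 1
        FROM inclusions AS i
        WHERE i.foreign_key_booking_record = br.id
          AND i.foreign_key_bill IS NULL
          AND i.invoice_closure <> 0
    )
    ORDER BY br.id
"""


def run(sql):
    return conn.execute(sql).fetchall()


in_rows = run(IN_QUERY)
exists_rows = run(EXISTS_QUERY)
```

Both forms return each booking once even when several inclusions match, which is why the DISTINCT inside the IN subquery is not needed for correctness.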
EXPLAIN should give you some clues (see the MySQL EXPLAIN syntax documentation).
I suspect that the IN version is constructing a list which is then scanned for each item. (IN is generally considered a very inefficient construct; I only use it if I have a short list of items to enter manually.)
The JOIN is more likely constructing a temp table for the results, making it more like normal JOINs between tables.
You should explore this by using EXPLAIN, as said by Ollie.
But in advance, note that the second command has one more filter: id = foreign_key_booking_record.
Check if this has the same performance:
SELECT
id,
agent,
source
FROM
booking_record
WHERE
id IN
( SELECT DISTINCT
foreign_key_booking_record
FROM
inclusions
WHERE
id = foreign_key_booking_record -- new filter
AND
foreign_key_bill IS NULL
AND
invoice_closure <> FALSE
)
This may sound like an odd question, but I'm curious to know if it's possible...
Is there a way to simulate MySQL records using inline data? For instance, if it is possible, I would expect it to work something like this:
SELECT inlinedata.*
FROM (
('Emily' AS name, 26 AS age),
('Paul' AS name, 56 AS age)
) AS inlinedata
ORDER BY age
Unfortunately, MySQL does not support the standard VALUES row constructor for this kind of thing (MySQL 8.0.19 and later add a VALUES ROW(...) form), so you need to use a "dummy" SELECT for each row and combine the rows using UNION ALL:
SELECT *
FROM (
select 'Emily' AS name, 26 AS age
union all
select 'Paul', 56
) AS inlinedata
ORDER BY age
The UNION ALL serves two purposes:
It preserves any duplicates you might have on purpose.
It's a (tiny) bit faster than a plain UNION (because it does not check for duplicates).
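The duplicate handling is easy to check for yourself. The snippet below is a small sketch using Python's standard-library sqlite3 module as a stand-in for MySQL (an assumption; the UNION and UNION ALL semantics shown are standard SQL). The repeated 'Emily' row and the helper name are added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three inline rows, one of them a deliberate duplicate.
INLINE_ROWS = (
    "SELECT 'Emily' AS name, 26 AS age "
    "{op} SELECT 'Paul', 56 "
    "{op} SELECT 'Emily', 26"
)


def inline_rows(op):
    """Run the inline-data query with either UNION or UNION ALL."""
    sql = (
        "SELECT name, age FROM ("
        + INLINE_ROWS.format(op=op)
        + ") AS inlinedata ORDER BY age, name"
    )
    return conn.execute(sql).fetchall()


with_duplicates = inline_rows("UNION ALL")
deduplicated = inline_rows("UNION")
```

With UNION ALL the duplicate 'Emily' row survives; with plain UNION it is collapsed into one.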
No, not without making it complicated, but you can create a temporary table and query that instead. Temporary tables are deleted when the current client session terminates.
You can query them and insert data into them just like with other tables. When you create them, you have to use the TEMPORARY keyword, like so:
CREATE TEMPORARY TABLE ...
This way, you can also reuse the data for multiple queries if needed, no data gets stored, and all records that you query have the right structure (whereas the syntax you give in your example would create problems when you spell a column name wrong)...
with cte as (
select '2012-04-04' as student_dob, '%test1%' as student_pat
union all
select '2012-05-04', '%test2%'
union all
select '2012-07-04', '%test3%'
union all
select '2012-05-11', '%test-n%'
)
select *
from students s
inner join cte c
on s.student_dob = c.student_dob and s.student_name like c.student_pat
Arguably that's not a lot more readable, but taking a lead from that, you can just store those values in a table or go through a temporary table, like Roy suggested.
Also, it's not a great idea to group by student id and also select other columns, like you did in the 2nd query.
I have a query like this:
SELECT * FROM (SELECT linktable FROM adm_linkedfields WHERE name = 'company') as cbo WHERE group='BEST'
Basically, the table name for the main query is fetched through the subquery.
I get an error that #1054 - Unknown column 'group' in 'where clause'
When I investigate (removing the where clause), I find that the query only returns the subquery result at all times.
Subquery table adm_linkedfields has structure id | name | linktable
I am currently using MySQL with PDO, but the query should be compatible with the major DBs (viz. Oracle, MSSQL, PgSQL and MySQL).
Update:
The subquery should return the name of the table for the main query. In this case it will return tbl_company
The table tbl_company for the main query has this structure :
id | name | group
Thanks in advance.
Dynamic SQL doesn't work like that. What you created is an inline view; read up on that. What's more, you can't create a dynamic SQL query that will work on every DB. If you have a limited number of linktables, you could try using left joins or unions to select from all the tables, but unless you have a good reason, you don't want that.
Just select the tablename in one query and then make another one to access the right table (by creating the query string in php).
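A minimal sketch of that two-step approach, in Python with the standard-library sqlite3 module instead of PHP/PDO so that it runs as-is (that swap is mine, as are the function name, the ALLOWED_TABLES whitelist and the sample rows). Because a table name cannot be passed as a bound parameter, the sketch checks it against a whitelist before building the second query string.

```python
import sqlite3

# Hypothetical whitelist of tables that may appear in adm_linkedfields.
ALLOWED_TABLES = {"tbl_company"}


def fetch_linked_rows(conn, name, group):
    """Step 1: look up the table name. Step 2: query that table."""
    row = conn.execute(
        "SELECT linktable FROM adm_linkedfields WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return []
    table = row[0]
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    # "group" is a reserved word, so it has to be quoted as an identifier.
    sql = f'SELECT id, name, "group" FROM "{table}" WHERE "group" = ? ORDER BY id'
    return conn.execute(sql, (group,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adm_linkedfields (id INTEGER PRIMARY KEY, name TEXT, linktable TEXT);
    CREATE TABLE tbl_company (id INTEGER PRIMARY KEY, name TEXT, "group" TEXT);
    INSERT INTO adm_linkedfields VALUES (1, 'company', 'tbl_company');
    INSERT INTO tbl_company VALUES (1, 'Acme', 'BEST'), (2, 'Globex', 'OTHER');
""")

best_companies = fetch_linked_rows(conn, "company", "BEST")
```

Identifier quoting differs between databases (MySQL uses backticks by default), so the string-building line is the part that would need adapting per DB.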
Here is an issue:
SELECT * FROM (SELECT linktable FROM adm_linkedfields WHERE name = 'company') as cbo
WHERE group='BEST';
You are selecting from a derived table (DT) that contains only one column, "linktable", so you can't reference any other column in the WHERE clause of the outer block. Think in terms of blocks: the outer SELECT refers to a DT that contains only one column.
Your problem is similar to what happens when you try:
create table t1(x1 int);
select * from t1 where z1 = 7; -- error: t1 has no column z1
Your query is:
SELECT *
FROM (SELECT linktable
FROM adm_linkedfields
WHERE name = 'company'
) cbo
WHERE group='BEST'
First, if you are interested in cross-database compatibility, do not name columns or tables after SQL reserved words. group is a really, really bad name for a column.
Second, the from clause is returning a table containing a list of names (of tables, but that is irrelevant). There is no column called group, so that is the problem you are having.
What can you do to fix this? A naive solution would be to run the subquery and use the resulting table name in a dynamic statement to execute the query you want.
The fundamental problem is your data structure. Having multiple tables with the same structure is generally a sign of a bad design. You basically have two choices.
One. If you have control over the database structure, put all the data in a single table, linktable for instance. This would have the information for all companies, and a column for group (or whatever you rename it). This solution is compatible across all databases. If you have lots and lots of data in the tables (think tens of millions of rows), then you might think about partitioning the data for performance reasons.
Two. If you don't have control over the data, create a view that concatenates all the tables together. Something like:
create view vw_linktable as
select 'table1' as which, t.* from table1 t union all
select 'table2', t.* from table2 t
This is also compatible across all databases.
Avoid using IN(...) when selecting on indexed fields. It will kill the performance of the SELECT query.
I found this here: https://wikis.oracle.com/pages/viewpage.action?pageId=27263381
Can you explain it? Why would that kill performance? And what should I use instead of IN? An OR statement, maybe?
To tell the truth, that statement contradicts many hints that I have read in books and articles on MySQL.
Here is an example: http://www.mysqlperformanceblog.com/2010/01/09/getting-around-optimizer-limitations-with-an-in-list/
Moreover, expr IN(value, ...) itself has additional enhancements for dealing with large value lists, since it is supposed to be used as a useful alternative to certain range queries:
If all values are constants, they are evaluated according to the type of expr and sorted. The search for the item then is done using a binary search. This means IN is very quick if the IN value list consists entirely of constants.
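To picture what "sorted, then binary search" means in the quoted passage, here is a toy sketch in Python. It only illustrates the idea and is not MySQL's actual implementation; the function names are made up.

```python
from bisect import bisect_left


def make_in_list(values):
    """Sort the constant list once, up front."""
    return sorted(values)


def in_list(sorted_values, item):
    """Binary search: O(log n) per lookup instead of scanning the list."""
    i = bisect_left(sorted_values, item)
    return i < len(sorted_values) and sorted_values[i] == item


constants = make_in_list([567899, 1, 42])
found = in_list(constants, 42)
missing = in_list(constants, 7)
```

The sort is paid once per statement, after which each candidate row costs only a logarithmic number of comparisons against the list.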
Still overusing INs may result in slow queries. Some cases are noted in the article.
Because MySQL can't optimize it.
Here is an example:
explain select * from keywordmaster where id in (1, 567899);
plan (sorry for external link. Doesn't show correctly here)
here is another query:
explain
select * from keywordmaster where id = 1
union
select * from keywordmaster where id = 567899
plan
As you can see in the second query we get ref as const and type is const instead of range. MySQL can't optimize range scans.
Prior to MySQL 5.0 it seems that MySQL would only use a single index for a table. So, if you had SELECT * FROM tbl WHERE (a = 6 OR b = 33), it could choose to use either the a index or the b index, but not both. Note that it says fields, plural. I suspect the advice comes from that time, and the workaround was to UNION the OR results, like so:
SELECT * FROM tbl WHERE (a = 6)
UNION
SELECT * FROM tbl WHERE (b = 33)
I believe IN is treated the same as a group of ORs, so using ORs won't help.
An alternative is to create a temporary table to hold the values of your IN-clause and then join with that temporary table in your SELECT.
For example:
CREATE TEMPORARY TABLE temp_table (v VARCHAR(255)); -- length is an assumption; MySQL requires one
INSERT INTO temp_table VALUES ('foo');
INSERT INTO temp_table VALUES ('bar');
SELECT * FROM temp_table tmp, orig_table orig
WHERE tmp.v = orig.value; -- use the alias, since temp_table is aliased as tmp
DROP TEMPORARY TABLE temp_table;