Is there any way to reference a subquery in a union?
I am trying to do something like the following and would like to avoid a temporary table. The subquery draws from a much larger dataset, so it makes sense to run it only once.
SELECT * FROM (SELECT * FROM ads WHERE state='FL' AND city='Maitland' AND page='home' ORDER BY RAND()) AS sq WHERE spot = 'full-banner' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'leaderboard' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'rectangle1' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'rectangle2' LIMIT 1
... etc.
It's a shame that DISTINCT can't be specified for a single column of a result set.
Well, there is no way to do what you're trying to do without repeating the creation of the derived table.
If querying ads is really expensive then you should try adding an index like:
alter table ads add index (state, city, page, spot);
If the query still takes too long after adding that index, then I'd recommend creating a table to store this data and then querying that table for each spot.
Depending on your data, you could play around with GROUP BY to get similar results.
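On MySQL 8.0 or later (or any engine with window functions), one alternative to repeating the derived table is to number the rows within each spot in random order and keep the first of each. This is a different technique from the ones suggested above, shown only as a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL (so RANDOM() replaces RAND(), and SQLite 3.25 or later is assumed); the column names come from the question, while the id column, the sample rows, and the function name are made up for illustration.

```python
import sqlite3

def pick_one_ad_per_spot(conn):
    """Return one randomly chosen (id, spot) per spot for the Maitland, FL home page."""
    sql = """
        SELECT id, spot
        FROM (
            SELECT id, spot,
                   ROW_NUMBER() OVER (PARTITION BY spot ORDER BY RANDOM()) AS rn
            FROM ads
            WHERE state = 'FL' AND city = 'Maitland' AND page = 'home'
        ) AS ranked
        WHERE rn = 1
        ORDER BY spot
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ads (id INTEGER PRIMARY KEY, state TEXT, city TEXT, page TEXT, spot TEXT)"
)
# Hypothetical sample data: ids 1-5 match the filter, ids 6-7 do not.
conn.executemany(
    "INSERT INTO ads (state, city, page, spot) VALUES (?, ?, ?, ?)",
    [
        ("FL", "Maitland", "home", "full-banner"),
        ("FL", "Maitland", "home", "full-banner"),
        ("FL", "Maitland", "home", "leaderboard"),
        ("FL", "Maitland", "home", "leaderboard"),
        ("FL", "Maitland", "home", "rectangle1"),
        ("FL", "Orlando", "home", "rectangle2"),
        ("FL", "Maitland", "about", "leaderboard"),
    ],
)
picked = pick_one_ad_per_spot(conn)
```

The filtered rows are read once, and each spot yields exactly one row, which is what the chain of UNIONs was trying to achieve.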
I'm building a sample "recent" screen that displays a list of records, with id as the primary key.
The query below returns the expected results, but on a table with a large amount of data it can be slow.
This is the sample query:
SELECT distinct H.id, -- (Primary Key)
H.partnerid as PartnerId,
H.partnername AS partner, H.accountname AS accountName,
H.accountid as AccountNo
FROM myschema.mytransactionstable H
INNER JOIN (
SELECT S.accountid, S.partnerid, S.accountname,
max(S.transdate) AS maxDate
from myschema.mytransactionstable S
group by S.accountid, S.partnerid, S.accountname
) ms ON H.accountid = ms.accountid
AND H.partnerid = ms.partnerid
AND H.accountname =ms.accountname
AND H.transdate = maxDate
WHERE H.accountid = ms.accountid
AND H.partnerid = ms.partnerid
AND H.accountname = ms.accountname
AND H.transdate = maxDate
GROUP BY H.partnerid,H.accountid, H.accountname
ORDER BY H.id DESC
LIMIT 5
In my case, some rows have the same values in the selected columns and differ only in their ids.
The image linked below shows the data before the query above is applied, that is, all the records before filtering.
[Image link: sample result]
I only want the 5 most recent rows by id, but the other columns (accountname, accountid, partnerid) can contain duplicate values.
The query already returns the correct result, but I want to improve its performance. Any suggestions?
You can try using row_number()
select * from
(
select *,row_number() over(order by transdate desc) as rn
from myschema.mytransactionstable
)A where rn<=5
Don't repeat ON and WHERE clauses. Use ON to say how the tables (or subqueries) are "related"; use WHERE for filtering (that is, which rows to keep). Probably in your case, all the WHERE should be removed.
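To illustrate the point about ON and WHERE, here is a minimal sketch with the join conditions stated once, in ON, and no WHERE at all. It uses Python's built-in sqlite3 as a stand-in for MySQL. The table is reduced to the columns the question uses and renamed transactions, the index is the composite one suggested in this answer, and the sample rows and function name are hypothetical.

```python
import sqlite3

def latest_transactions(conn, limit=5):
    """Latest transaction per (partnerid, accountid, accountname), newest ids first."""
    sql = """
        SELECT H.id, H.partnerid, H.accountid, H.accountname
        FROM transactions AS H
        INNER JOIN (
            SELECT partnerid, accountid, accountname, MAX(transdate) AS maxdate
            FROM transactions
            GROUP BY partnerid, accountid, accountname
        ) AS ms
            ON  H.partnerid   = ms.partnerid
            AND H.accountid   = ms.accountid
            AND H.accountname = ms.accountname
            AND H.transdate   = ms.maxdate
        ORDER BY H.id DESC
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, partnerid INTEGER, "
    "accountid INTEGER, accountname TEXT, transdate TEXT)"
)
conn.execute(
    "CREATE INDEX idx_partner_account_date "
    "ON transactions (partnerid, accountid, accountname, transdate)"
)
# Hypothetical rows: two accounts with several transactions each.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 100, "alice", "2020-01-01"),
        (2, 10, 100, "alice", "2020-02-01"),
        (3, 20, 200, "bob", "2020-01-15"),
        (4, 10, 100, "alice", "2020-03-01"),
        (5, 20, 200, "bob", "2020-01-10"),
    ],
)
recent = latest_transactions(conn)
```

If two rows of the same account share the maximum transdate, this sketch returns both; whether that matters depends on the data.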
Please provide SHOW CREATE TABLE
This 'composite' index would probably help because of dealing with the subquery and the JOIN:
INDEX(partnerid, accountid, accountname, transdate)
That would also avoid a separate sort for the GROUP BY.
But then the ORDER BY is different, so it cannot avoid a sort.
This might avoid the sort without changing the result set ordering: ORDER BY partnerid, accountid, accountname, transdate DESC
Please provide EXPLAIN SELECT ... and EXPLAIN FORMAT=JSON SELECT ... if you have further questions.
If we cannot get an index to handle the WHERE, GROUP BY, and ORDER BY, the query will generate all the rows before seeing the LIMIT 5. If the index does work, then the outer query will stop after 5 -- potentially a big savings.
I have a framework that generates SQL. One of the queries uses my index "A" and returns results in 7 seconds. I saw that I could optimize this, so I created an index "B".
Now if I run EXPLAIN on my query, it still uses index A. However, if I force the use of index B, I get my results in 1 second (7x faster).
So clearly index B is faster than index A. I can't use FORCE INDEX or USE INDEX because my SQL is generated by a framework that does not support them.
So why is MySQL not naturally using the fastest index? And is there a way to tell MySQL to always use a certain index without adding USE or FORCE?
The query:
SELECT *
FROM soumission
LEFT OUTER JOIN region_administrative
ON soumission.region_administrative_oid=region_administrative.oid
WHERE (soumission.statut=2
AND ((soumission.telephone LIKE '%007195155134070067132211046052045128049212213255%'
OR (soumission.autre_telephone LIKE '%007195155134070067132211046052045128049212213255%'))
OR (soumission.cellulaire LIKE '%007195155134070067132211046052045128049212213255%')))
ORDER BY soumission.date_confirmation DESC, soumission.numero;
I added an index on multiple columns: statut, telephone, autre_telephone, cellulaire.
If I force this index, my query is 7x faster, but if I don't specify which index to use, it uses another index (on the statut field only), which is 7x slower.
Here is the EXPLAIN when I select a large date period (using the wrong index).
Here is the EXPLAIN when I select a small date window.
This seems to be what you are doing...
SELECT s.*, ra.*
FROM soumission AS s
LEFT OUTER JOIN region_administrative AS ra ON s.region_administrative_oid=ra.oid
WHERE s.statut = 2
AND ( s.telephone LIKE '%007195155134070067132211046052045128049212213255%'
OR s.autre_telephone LIKE '%007195155134070067132211046052045128049212213255%'
OR s.cellulaire LIKE '%007195155134070067132211046052045128049212213255%'
)
ORDER BY s.date_confirmation DESC, s.numero;
If you don't need ra.*, get rid of the LEFT JOIN.
The multi-column index you propose is useless and won't be used unless... statut = 2 for less than 20% of the rows. In that case, it will only use the first column of the index.
OR defeats indexing. (See below)
Leading wildcard on LIKE defeats indexing. Do you need the leading or trailing wild cards?
The mixing of DESC and ASC in the ORDER BY defeats using an index to avoid sorting.
So, what to do? Instead of having 3 columns for exactly 3 phone numbers, have another table for phone numbers. Then have any number of rows for a given soumission. Then searching that table may be faster because of avoiding OR -- but only if you get rid of the leading wildcard.
(That's an awfully long phone number! Is it real?)
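The separate phone table suggested above might look like the sketch below. It uses Python's built-in sqlite3 in place of MySQL; the table name soumission_phone, its columns, the index, and the sample data are all assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE soumission (oid INTEGER PRIMARY KEY, statut INTEGER);
    -- Hypothetical child table: one row per phone number, any number per soumission.
    CREATE TABLE soumission_phone (
        soumission_oid INTEGER NOT NULL REFERENCES soumission(oid),
        phone TEXT NOT NULL
    );
    CREATE INDEX soumission_phone_idx ON soumission_phone (phone, soumission_oid);
""")
conn.executemany(
    "INSERT INTO soumission (oid, statut) VALUES (?, ?)",
    [(1, 2), (2, 2), (3, 1)],
)
conn.executemany(
    "INSERT INTO soumission_phone (soumission_oid, phone) VALUES (?, ?)",
    [(1, "5551230000"), (1, "5559990000"), (2, "5551234567"), (3, "5551231111")],
)

def find_by_phone_prefix(conn, prefix):
    """Return oids of soumission rows with statut = 2 and a phone starting with prefix."""
    sql = """
        SELECT DISTINCT s.oid
        FROM soumission_phone AS p
        JOIN soumission AS s ON s.oid = p.soumission_oid
        WHERE s.statut = 2
          AND p.phone LIKE ? || '%'   -- trailing wildcard only, no OR
        ORDER BY s.oid
    """
    return [row[0] for row in conn.execute(sql, (prefix,))]
```

A single LIKE with only a trailing wildcard replaces the three OR-ed conditions. In MySQL the pattern would be built with CONCAT(?, '%') rather than the || operator.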
As to the query itself:
Try avoiding the leading LIKE wildcard (removed in the query below).
Split the query to several parts, combined with a UNION clause, so that indexes can be used.
So, create these indexes:
ALTER TABLE `region_administrative` ADD INDEX `region_administrativ_idx_oid` (`oid`);
ALTER TABLE `soumission` ADD INDEX `soumission_idx_statut_oid_cellulaire` (`statut`,`region_administrative_oid`,`cellulaire`);
ALTER TABLE `soumission` ADD INDEX `soumission_idx_statut_oid_autre_telephone` (`statut`,`region_administrative_oid`,`autre_telephone`);
ALTER TABLE `soumission` ADD INDEX `soumission_idx_statut_oid_telephone` (`statut`,`region_administrative_oid`,`telephone`);
Then try this query:
SELECT
*
FROM
((SELECT
*
FROM
soumission
LEFT OUTER JOIN
region_administrative
ON soumission.region_administrative_oid = region_administrative.oid
WHERE
(
soumission.statut = 2
AND (
(
soumission.cellulaire LIKE '007195155134070067132211046052045128049212213255%'
)
)
)
ORDER BY
soumission.date_confirmation DESC,
soumission.numero)
UNION
DISTINCT (SELECT
*
FROM
soumission
LEFT OUTER JOIN
region_administrative
ON soumission.region_administrative_oid = region_administrative.oid
WHERE
(soumission.statut = 2
AND (((soumission.autre_telephone LIKE '007195155134070067132211046052045128049212213255%'))))
ORDER BY
soumission.date_confirmation DESC,
soumission.numero)
UNION
DISTINCT (SELECT
*
FROM
soumission
LEFT OUTER JOIN
region_administrative
ON soumission.region_administrative_oid = region_administrative.oid
WHERE
(soumission.statut = 2
AND ((soumission.telephone LIKE '007195155134070067132211046052045128049212213255%')))
ORDER BY
soumission.date_confirmation DESC,
soumission.numero)
) AS union1
ORDER BY
union1.date_confirmation DESC,
union1.numero
I have been stuck on an SQL query for the past few hours. I need to get the latest few rows from four tables, as follows.
The table names are events, contactinfo, video, and news.
I need the last 3 results from events and news, and the last single result from video and contactinfo.
I tried the following query but, as expected, it didn't work.
SELECT * FROM
((SELECT * FROM EVENTS ORDER BY eventid DESC LIMIT 3)EV) INNER JOIN
((SELECT * FROM NEWS ORDER BY newsid DESC LIMIT 3)NE) INNER JOIN
((SELECT * FROM VIDEOS ORDER BY videoid DESC LIMIT 1)VI) INNER JOIN
((SELECT * FROM CONTACTINFO ORDER BY cid DESC LIMIT 1)AB);
I am not a DB expert; I am a developer and I really don't know much about MySQL.
Any help would be appreciated.
If these tables have the same columns you can do a UNION (instead of your INNER JOIN). If not, I suggest doing 4 queries.
A JOIN suggests that the joined data are correlated with each other; if that's not the case, then doing a JOIN seems like the wrong solution.
If you need the result as a single table, use SELECT and UNION to combine the data, providing the same number of columns with the same data types in each query (CAST columns and provide default values if needed). Otherwise, if you need results with different structures, run 4 queries.
JOINs don't make sense for your task, as the last N rows from one table are unlikely to have corresponding rows within the last N rows of another table.
UPDATE
See example:
SELECT * FROM
(SELECT TOP 5 n.ID, n.Content, n.CreatedOn as CreatedOn, n.UserID as NewsUserID, 1 as SourceType FROM News n ORDER BY n.CreatedOn DESC) t1
UNION ALL
SELECT * FROM
(SELECT TOP 5 e.ID, e.Description as Content, e.CreatedAt as CreatedOn, NULL as NewsUserID, 2 as SourceType FROM Events e ORDER BY e.CreatedAt DESC) t2
ORDER BY SourceType, CreatedOn DESC
So I decided I want ID, Content, and CreatedOn from every source, plus UserID from the News table. I built 2 queries that return the same columns with the same data types. Each query takes only the first 5 rows from its source (TOP 5 is MS SQL syntax; use your database's equivalent). I also added an extra field, SourceType, that records the type of entity. The main query unions all the results and orders them by source type first, then by CreatedOn.
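With LIMIT in place of TOP, the same idea can be sketched against the question's tables as follows. This is only an illustration using Python's built-in sqlite3 as a stand-in for MySQL; eventid and newsid come from the question, while the title column, the source_type labels, and the sample rows are assumptions, since the question does not show the table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (eventid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE news   (newsid  INTEGER PRIMARY KEY, title TEXT);
""")
# Hypothetical sample rows: 5 events and 4 news items.
conn.executemany("INSERT INTO events (title) VALUES (?)",
                 [(f"event {i}",) for i in range(1, 6)])
conn.executemany("INSERT INTO news (title) VALUES (?)",
                 [(f"news {i}",) for i in range(1, 5)])

def latest_items(conn):
    """Last 3 events and last 3 news items as one result set with a common shape."""
    sql = """
        SELECT source_type, id, title
        FROM (SELECT 'event' AS source_type, eventid AS id, title
              FROM events ORDER BY eventid DESC LIMIT 3) AS e
        UNION ALL
        SELECT source_type, id, title
        FROM (SELECT 'news' AS source_type, newsid AS id, title
              FROM news ORDER BY newsid DESC LIMIT 3) AS n
        ORDER BY source_type, id DESC
    """
    return conn.execute(sql).fetchall()

items = latest_items(conn)
```

Each branch applies its own ORDER BY and LIMIT inside a derived table, so the limits do not interfere with the final ordering of the combined rows.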
This is not a logical way to get the data from four tables in one call, since all the tables are independent.
I think you want to minimise database calls.
To minimise database hits, you should use memcache instead of such a query.
Memcache:
It saves data as key-value pairs; for each key you get a result set.
It's very fast.
I'm not positive but I believe sub-selects are less than optimal (speedwise?).
Is there a way to remove this sub-select from my query? (I think the query is self-explanatory.)
select *
from tickers
where id = (select max(id) from tickers where annual_score is not null);
Would something like:
SELECT *
FROM tickers
WHERE annual_score IS NOT NULL
ORDER BY id desc
LIMIT 1
work?
That particular sub-select shouldn't be inefficient at all. It should be run once before the main query begins.
There is a certain class of subqueries that are inefficient (correlated subqueries, which reference columns of the main query) because they end up running the subquery for every single row of the main query.
But this shouldn't be one of them, unless MySQL is severely brain-damaged, which I doubt.
However, if you remain keen to get rid of the subquery, you can order the rows by id (descending) and only fetch the first, something like:
select * from tickers
where annual_score is not null
order by id desc
limit 0, 1
Not too familiar with MySQL, but if you want to eliminate the subquery then you could try something like this:
select *
from tickers
where annual_score is not null
order by id desc
limit 1
I don't know if this is more or less performant as MySQL is not my background.
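As a quick check that the two forms discussed above return the same row, here is a minimal sketch using Python's built-in sqlite3 rather than MySQL. The symbol column and the sample rows are assumptions; only id and annual_score appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickers (id INTEGER PRIMARY KEY, symbol TEXT, annual_score REAL)"
)
# Hypothetical rows: the highest id (4) has no annual_score, so id 3 should win.
conn.executemany(
    "INSERT INTO tickers (id, symbol, annual_score) VALUES (?, ?, ?)",
    [(1, "AAA", 0.5), (2, "BBB", None), (3, "CCC", 0.9), (4, "DDD", None)],
)

# Original form: uncorrelated subquery, evaluated once.
with_subquery = conn.execute("""
    SELECT * FROM tickers
    WHERE id = (SELECT MAX(id) FROM tickers WHERE annual_score IS NOT NULL)
""").fetchall()

# Rewritten form: filter, sort by id descending, keep the first row.
with_limit = conn.execute("""
    SELECT * FROM tickers
    WHERE annual_score IS NOT NULL
    ORDER BY id DESC
    LIMIT 1
""").fetchall()
```

Both queries select the row with the largest id that has a non-NULL annual_score; which one is faster on a given MySQL version is best confirmed with EXPLAIN.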
I have a large dataset in MySQL and I would like to speed up the SELECT statement when reading data. Assuming that there are 1000 records, I would like to issue a SELECT statement that retrieves, for example, half of them, but based on the timestamp.
Something like this will not work, since id is not tightly coupled with the timestamp:
select * from table where table.id mod 5 = 0;
Retrieving all the data and selecting the needed rows afterwards is not a solution, since I want to avoid retrieving the large dataset. Thus, I'm looking for something that would distinguish the records at SELECT time.
Thanks
If you need speed, then try this:
select * from table ORDER BY table.id DESC LIMIT 0,500;
select * from table ORDER BY table.id DESC LIMIT 500,500;
and so on...
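The paging idea above can be sketched as a loop that fetches one page at a time. Since the question cares about timestamps rather than ids, this sketch orders by a timestamp column instead of id; that substitution, the readings table, its ts column, and the function name are all assumptions. It uses Python's built-in sqlite3 in place of MySQL, and writes MySQL's LIMIT offset, count in the equivalent LIMIT count OFFSET offset form.

```python
import sqlite3

def iter_pages(conn, page_size):
    """Yield successive pages of (id, ts) rows, newest timestamp first."""
    offset = 0
    while True:
        page = conn.execute(
            "SELECT id, ts FROM readings ORDER BY ts DESC, id DESC LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not page:
            break
        yield page
        offset += page_size

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, ts INTEGER)")
conn.execute("CREATE INDEX readings_ts ON readings (ts)")
# Hypothetical data: ids are deliberately not in timestamp order.
conn.executemany(
    "INSERT INTO readings (id, ts) VALUES (?, ?)",
    [(i, (i * 7) % 10) for i in range(1, 11)],
)
pages = list(iter_pages(conn, 4))
```

Each call returns only page_size rows to the client, although with large offsets the server still has to step over the skipped rows.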