I have been stuck on an SQL query for the past few hours. I need to get the latest few rows from four tables, as follows.
The table names are: events, contactinfo, video, news.
I need the last 3 results from events and news, and the last single result from video and contactinfo.
I tried the following query but, as expected, it didn't work:
SELECT * FROM
((SELECT * FROM EVENTS ORDER BY eventid DESC LIMIT 3)EV) INNER JOIN
((SELECT * FROM NEWS ORDER BY newsid DESC LIMIT 3)NE) INNER JOIN
((SELECT * FROM VIDEOS ORDER BY videoid DESC LIMIT 1)VI) INNER JOIN
((SELECT * FROM CONTACTINFO ORDER BY cid DESC LIMIT 1)AB);
Actually, I am not a DB expert; I am a developer and I really don't know much about MySQL.
Any help would be appreciated.
If these tables have the same columns you can do a UNION (instead of your INNER JOIN). If not, I suggest doing 4 queries.
A JOIN suggests that the joined data correlates with each other, and if that's not the case then doing a JOIN seems like the wrong solution.
If you need the result as a single table, use SELECT and UNION to combine the data, providing the same number of columns and the same data types in each query (CAST columns and provide default values if needed). Otherwise, if you need results with different structures, run 4 queries.
JOINs don't make sense for your task, as the last N rows of one table are unlikely to have corresponding rows within the last N rows of another table.
UPDATE
See example:
SELECT * FROM
(SELECT TOP 5 n.ID, n.Content, n.CreatedOn as CreatedOn, n.UserID as NewsUserID, 1 as SourceType FROM News n ORDER BY n.CreatedOn DESC) t1
UNION ALL
SELECT * FROM
(SELECT TOP 5 e.ID, e.Description as Content, e.CreatedAt as CreatedOn, NULL as NewsUserID, 2 as SourceType FROM Events e ORDER BY e.CreatedAt DESC) t2
ORDER BY SourceType, CreatedOn DESC
So I decided I want ID, Content and CreatedOn from every source, and also UserID from the News table. I built 2 queries so that they return the same columns with the same data types. Each query takes only the first 5 rows from its source (TOP 5 is MS SQL syntax; please use your database's equivalent, e.g. LIMIT 5 in MySQL). I also added an extra field, SourceType, that records the type of entity. In the main query I union all the results and order by source type first, then by CreatedOn.
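For the LIMIT-based databases, the same idea can be sketched as below. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `title` column and the sample rows are made up (the question does not show the real columns).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (eventid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE news   (newsid  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE videos (videoid INTEGER PRIMARY KEY, title TEXT);
""")
# Hypothetical sample data: 5 events, 5 news items, 3 videos.
conn.executemany("INSERT INTO events (title) VALUES (?)",
                 [(f"event {i}",) for i in range(1, 6)])
conn.executemany("INSERT INTO news (title) VALUES (?)",
                 [(f"news {i}",) for i in range(1, 6)])
conn.executemany("INSERT INTO videos (title) VALUES (?)",
                 [(f"video {i}",) for i in range(1, 4)])

# Each derived table picks the latest N rows of one source and exposes
# the same three columns, so the pieces can be stacked with UNION ALL.
query = """
    SELECT source, id, title FROM
        (SELECT 'event' AS source, eventid AS id, title
         FROM events ORDER BY eventid DESC LIMIT 3) AS ev
    UNION ALL
    SELECT source, id, title FROM
        (SELECT 'news' AS source, newsid AS id, title
         FROM news ORDER BY newsid DESC LIMIT 3) AS ne
    UNION ALL
    SELECT source, id, title FROM
        (SELECT 'video' AS source, videoid AS id, title
         FROM videos ORDER BY videoid DESC LIMIT 1) AS vi
    ORDER BY source, id DESC
"""
latest_rows = conn.execute(query).fetchall()
for row in latest_rows:
    print(row)
```

The `source` column plays the role of SourceType above, so the application can tell which table each row came from.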
This is not a logical way to get data from four tables in one call, since all the tables are independent.
I think you want to minimise database calls.
To minimise database hits, you should use memcache instead of such a query.
Memcache:
It saves data as key-value pairs; for each key you get back a result set.
It's very fast.
Related
I'm making a sample "recent" screen that displays a list, with id set as the primary key.
My query returns the expected result, but the table holds a large amount of data, which can cause slow performance.
This is the sample query:
SELECT distinct H.id, -- (Primary Key)
H.partnerid as PartnerId,
H.partnername AS partner, H.accountname AS accountName,
H.accountid as AccountNo
FROM myschema.mytransactionstable H
INNER JOIN (
SELECT S.accountid, S.partnerid, S.accountname,
max(S.transdate) AS maxDate
from myschema.mytransactionstable S
group by S.accountid, S.partnerid, S.accountname
) ms ON H.accountid = ms.accountid
AND H.partnerid = ms.partnerid
AND H.accountname =ms.accountname
AND H.transdate = maxDate
WHERE H.accountid = ms.accountid
AND H.partnerid = ms.partnerid
AND H.accountname = ms.accountname
AND H.transdate = maxDate
GROUP BY H.partnerid,H.accountid, H.accountname
ORDER BY H.id DESC
LIMIT 5
In my case, some rows have the same values in the selected columns and differ only in their ids.
Below is a link to an image of the data before running the query above: all the records, not yet filtered.
Sample result query click here
I only want the 5 most recent rows by id, but the other columns (accountname, accountid, partnerid) can contain the same values.
I already have a query that returns the correct result, but I want to improve its performance. Any suggestions?
You can try using row_number()
select * from
(
select *,row_number() over(order by transdate desc) as rn
from myschema.mytransactionstable
)A where rn<=5
Don't repeat ON and WHERE clauses. Use ON to say how the tables (or subqueries) are "related"; use WHERE for filtering (that is, which rows to keep). Probably in your case, all the WHERE should be removed.
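As a rough sketch of what that looks like, here is the "latest row per group" join with the conditions only in ON and no WHERE at all. It runs on Python's built-in sqlite3 as a stand-in for MySQL, and the cut-down `transactions` table and its rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified version of the transactions table.
    CREATE TABLE transactions (
        id        INTEGER PRIMARY KEY,
        accountid INTEGER,
        partnerid INTEGER,
        transdate TEXT
    );
    INSERT INTO transactions (accountid, partnerid, transdate) VALUES
        (1, 10, '2020-01-01'),
        (1, 10, '2020-01-05'),
        (2, 10, '2020-01-03'),
        (2, 20, '2020-01-02'),
        (2, 20, '2020-01-04');
""")

# ON describes how the outer table relates to the per-group maximum;
# no WHERE is needed because nothing else is being filtered.
query = """
    SELECT h.id, h.accountid, h.partnerid, h.transdate
    FROM transactions h
    INNER JOIN (
        SELECT accountid, partnerid, MAX(transdate) AS maxdate
        FROM transactions
        GROUP BY accountid, partnerid
    ) ms ON  h.accountid = ms.accountid
         AND h.partnerid = ms.partnerid
         AND h.transdate = ms.maxdate
    ORDER BY h.id DESC
    LIMIT 5
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data the query keeps one row per (accountid, partnerid) pair, the one with the latest transdate.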
Please provide SHOW CREATE TABLE
This 'composite' index would probably help because of dealing with the subquery and the JOIN:
INDEX(partnerid, accountid, accountname, transdate)
That would also avoid a separate sort for the GROUP BY.
But then the ORDER BY is different, so it cannot avoid a sort.
This might avoid the sort without changing the result set ordering: ORDER BY partnerid, accountid, accountname, transdate DESC
Please provide EXPLAIN SELECT ... and EXPLAIN FORMAT=JSON SELECT ... if you have further questions.
If we cannot get an index to handle the WHERE, GROUP BY, and ORDER BY, the query will generate all the rows before seeing the LIMIT 5. If the index does work, then the outer query will stop after 5 -- potentially a big savings.
My table is called training_session.
I am trying to print some information from my data, but I do not want any duplicates. I still get them somehow; can someone tell me what I am doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.
I think you misunderstood the use of DISTINCT.
There is a big difference between using DISTINCT and GROUP BY.
Both have a similar goal, but they serve different purposes.
You use DISTINCT if you want to show a set of columns without repeating any combination of values. That means you don't care about calculations or aggregate (group) functions. DISTINCT will show different results as you keep adding more columns to your SELECT (if the table has many columns).
You use GROUP BY if you want one row per distinct value of certain selected columns and you use group functions to calculate data related to each group. Therefore, use GROUP BY if you want to use group functions.
Please check group functions you can use in this link.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems you are trying to get the "latest" record for each athlete; I'll assume the current scenario, where there is no ID column.
Here is my alternate solution:
SELECT a.athlete_id ,
( SELECT b.duration
FROM training_session as b
WHERE b.athlete_id = a.athlete_id -- connect
ORDER BY [latest column to sort] DESC
LIMIT 1
) last_duration
FROM training_session as a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a scalar subquery in the SELECT list (here a correlated one, since it refers to the outer row). With the help of LIMIT 1, it returns only the topmost record. A subquery used this way must return at most one row, or else the query raises an error.
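Here is a small runnable version of that query, just as a sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, and `session_date` is a made-up name for the "[latest column to sort]" placeholder; the sample rows are invented too.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE training_session (
        id           INTEGER PRIMARY KEY,
        athlete_id   INTEGER,
        duration     INTEGER,
        session_date TEXT      -- hypothetical "latest" column
    );
    INSERT INTO training_session (athlete_id, duration, session_date) VALUES
        (1, 30, '2021-03-01'),
        (1, 45, '2021-03-08'),
        (2, 60, '2021-03-02'),
        (2, 20, '2021-03-05'),
        (3, 50, '2021-03-04');
""")

# For every athlete, the subquery in the SELECT list looks up the
# duration of that athlete's most recent session.
query = """
    SELECT a.athlete_id,
           (SELECT b.duration
            FROM training_session AS b
            WHERE b.athlete_id = a.athlete_id
            ORDER BY b.session_date DESC
            LIMIT 1) AS last_duration
    FROM training_session AS a
    GROUP BY a.athlete_id
    ORDER BY a.athlete_id
"""
last_durations = conn.execute(query).fetchall()
print(last_durations)
```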
MySQL's DISTINCT clause is used to filter out duplicate rows.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT applies to the combination of columns, so every distinct (athlete_id, duration) pair is returned; hence the results you're getting. In other words, the query is working correctly.
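To see the difference on a tiny example, the sketch below runs the one-column DISTINCT, the two-column DISTINCT and a GROUP BY with aggregates side by side. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER)")
# Hypothetical data: athlete 1 has a repeated 30 and a 45, athlete 2 has 60 twice.
conn.executemany(
    "INSERT INTO training_session (athlete_id, duration) VALUES (?, ?)",
    [(1, 30), (1, 30), (1, 45), (2, 60), (2, 60)],
)

# One column: one row per athlete.
one_column = conn.execute(
    "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
).fetchall()

# Two columns: one row per distinct (athlete_id, duration) pair,
# so athlete 1 appears twice.
two_columns = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

# GROUP BY: one row per athlete again, with aggregates deciding
# what is shown for the other column.
grouped = conn.execute(
    "SELECT athlete_id, COUNT(*) AS sessions, MAX(duration) AS longest "
    "FROM training_session GROUP BY athlete_id ORDER BY athlete_id"
).fetchall()

print(one_column)
print(two_columns)
print(grouped)
```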
Hi, I have an issue with a MySQL SELECT statement that I can't get my head around.
Table client_directory_data
id int,
verified int,
client_id int,
created timestamp,
description longtext
select * from client_directory_data where verified = 1 order by created desc
but this selects multiple rows for each client_id.
What I need is to select every client_id that has verified = 1, but only get the most recent row for each client_id. I hope that makes sense.
This is an issue I face all the time. Fortunately there's a nice little trick for doing this:
SELECT
client_id,
SUBSTRING_INDEX(GROUP_CONCAT(id ORDER BY created DESC),",",1) AS `id`
FROM client_directory_data
WHERE verified = 1
GROUP BY client_id
And if you want the whole row you can just join onto it like so:
SELECT
*
FROM (
SELECT
client_id,
SUBSTRING_INDEX(GROUP_CONCAT(id ORDER BY created DESC),",",1) AS `id`
FROM client_directory_data
WHERE verified = 1
GROUP BY client_id
) ids
JOIN client_directory_data USING (id);
Of course, if you're ordering by an indexed field anyway (that you could therefore join on efficiently), it's better to use MAX(id) AS id, although it actually has very little impact on performance. The main reason to use MAX() is really to make the code a little simpler. It also avoids the pitfalls you may encounter if the field contains commas (which you can get around with a different separator for the GROUP_CONCAT) or hitting the maximum GROUP_CONCAT length (which can be extended with SET group_concat_max_len = xxx; and only causes warnings anyway).
I can see why this would intuitively seem like it would have performance issues; however, it's actually the best-performing method I've found for these queries, especially on large tables.
Here are some benchmarks I've taken from some of the larger tables currently available to me comparing the three methods in this thread.
Query A: (~5,000 records, ~900 results, non-indexed field)
GROUP_CONCAT method: 0.0100 seconds
MAX method: 0.102 seconds
LEFT JOIN method: 0.0082 seconds
Query B : (~300,000 records, ~95,000 results)
GROUP_CONCAT method: 1.8618 seconds
MAX method: 1.7904 seconds
LEFT JOIN method: 6.4649 seconds
Query C : (~300,000 records, ~7 results)
GROUP_CONCAT method: 0.103 seconds
MAX method: 0.0102 seconds
LEFT JOIN method: (I got bored after 4 hours)
Query D : (~500,000 records, ~5,000 different values of the field being grouped)
GROUP method: 0.1355 seconds
MAX Method : 0.0429 seconds
LEFT JOIN method: (I got bored after 10 minutes)
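For reference, the MAX(id) variant mentioned above could look roughly like the sketch below. It assumes that a higher id always means a more recent row, runs on Python's built-in sqlite3 as a stand-in for MySQL, and uses invented sample rows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_directory_data (
        id          INTEGER PRIMARY KEY,
        verified    INTEGER,
        client_id   INTEGER,
        created     TEXT,
        description TEXT
    );
    -- Hypothetical rows; ids grow together with the created timestamp.
    INSERT INTO client_directory_data (verified, client_id, created, description) VALUES
        (1, 1, '2020-01-01', 'a'),
        (1, 1, '2020-02-01', 'b'),
        (0, 1, '2020-03-01', 'c'),
        (1, 2, '2020-01-15', 'd'),
        (0, 3, '2020-01-20', 'e');
""")

# Inner query: newest verified id per client. Outer join: fetch the whole row.
query = """
    SELECT d.id, d.client_id, d.description
    FROM (
        SELECT client_id, MAX(id) AS id
        FROM client_directory_data
        WHERE verified = 1
        GROUP BY client_id
    ) ids
    JOIN client_directory_data d ON d.id = ids.id
    ORDER BY d.client_id
"""
latest_verified = conn.execute(query).fetchall()
print(latest_verified)
```

Client 3 does not appear because it has no verified row, and client 1's unverified row 'c' is ignored even though it is newer.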
That makes sense and is a classic question.
Assuming that the most recent row is the one with highest id, you can use:
SELECT *
FROM client_directory_data c
LEFT JOIN client_directory_data d ON c.client_id = d.client_id AND d.verified = 1 AND d.id > c.id
WHERE d.id IS NULL
AND c.verified = 1;
You can have an explanation of this query pattern here.
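A quick way to convince yourself that the LEFT JOIN ... IS NULL pattern works is to run it on a few rows. The sketch below does that with Python's built-in sqlite3 as a stand-in for MySQL; the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_directory_data (
        id          INTEGER PRIMARY KEY,
        verified    INTEGER,
        client_id   INTEGER,
        created     TEXT,
        description TEXT
    );
    -- Hypothetical rows for three clients.
    INSERT INTO client_directory_data (verified, client_id, created, description) VALUES
        (1, 1, '2020-01-01', 'a'),
        (1, 1, '2020-02-01', 'b'),
        (0, 1, '2020-03-01', 'c'),
        (1, 2, '2020-01-15', 'd'),
        (0, 3, '2020-01-20', 'e');
""")

# A row c is kept only when no verified row d with a higher id exists
# for the same client, i.e. when the LEFT JOIN finds no partner.
query = """
    SELECT c.id, c.client_id, c.description
    FROM client_directory_data c
    LEFT JOIN client_directory_data d
           ON c.client_id = d.client_id
          AND d.verified = 1
          AND d.id > c.id
    WHERE d.id IS NULL
      AND c.verified = 1
    ORDER BY c.client_id
"""
newest_rows = conn.execute(query).fetchall()
print(newest_rows)
```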
Make id the primary key of the table client_directory_data.
Is there any way to reference a subquery in a union?
I am trying to do something like the following, and would like to avoid a temporary table, but the subquery will be drawn from a much larger dataset, so it makes sense to run it only once.
SELECT * FROM (SELECT * FROM ads WHERE state='FL' AND city='Maitland' AND page='home' ORDER BY RAND()) AS sq WHERE spot = 'full-banner' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'leaderboard' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'rectangle1' LIMIT 1
UNION
SELECT * FROM sq WHERE spot = 'rectangle2' LIMIT 1
.... etc,,
It's a shame that DISTINCT can't be specified for a single column of a result set.
Well, there is no way to do what you're trying to do without repeating the creation of the derived table.
If querying ads is really expensive then you should try adding an index like:
alter table ads add index (state, city, page, spot);
If the query still takes too long after adding that index, then I'd recommend creating a table to store this data and then querying that table for each spot.
Depending on your data, you could play around with GROUP BY to get similar results.
I would like to know the impact on performance if I run this query in the following conditions.
Query:
select `players`.*, count(`clicks`.`id`) as `clicks_count`
from `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc
limit 1
Conditions:
In the clicks table I expect about 1,000 inserts per minute.
The clicks table will contain more than 1,000,000 rows.
The players table will contain 10,000 rows.
The players table gets inserted into every 5 minutes.
I would like to know what to expect performance-wise if I run the query 1000 times in 1 minute.
Thanks
That query will never run in milliseconds with any meaningful amounts of data in your tables. It'll run two full table scans, join the two together, aggregate the mess, and fetch the top row from that.
Use a trigger to store the total in the players table, and index that field. You'll then be able to avoid the join altogether:
select p.* from players p order by clicks_count desc limit 1
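A minimal sketch of the trigger idea follows. It runs on Python's built-in sqlite3 as a stand-in, so the trigger syntax is SQLite's (MySQL needs FOR EACH ROW and has a slightly different body); the trigger and index names, the `name` column and the sample rows are all made up for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (
        id           INTEGER PRIMARY KEY,
        name         TEXT,
        clicks_count INTEGER NOT NULL DEFAULT 0  -- maintained by the trigger
    );
    CREATE TABLE clicks (
        id        INTEGER PRIMARY KEY,
        player_id INTEGER NOT NULL REFERENCES players(id)
    );
    -- Index so that ORDER BY clicks_count DESC LIMIT 1 can avoid a full sort.
    CREATE INDEX idx_players_clicks_count ON players (clicks_count);

    -- Every inserted click bumps the counter of its player.
    CREATE TRIGGER clicks_after_insert AFTER INSERT ON clicks
    BEGIN
        UPDATE players
        SET clicks_count = clicks_count + 1
        WHERE id = NEW.player_id;
    END;
""")

conn.executemany("INSERT INTO players (name) VALUES (?)",
                 [("alice",), ("bob",), ("carol",)])
conn.executemany("INSERT INTO clicks (player_id) VALUES (?)",
                 [(1,), (2,), (2,), (3,), (2,)])

# No join and no aggregation at read time.
top_player = conn.execute(
    "SELECT id, name, clicks_count FROM players "
    "ORDER BY clicks_count DESC LIMIT 1"
).fetchone()
print(top_player)
```

The cost moves to write time: each insert into clicks now also updates one players row, which is worth keeping in mind at 1,000 inserts per minute. If clicks can be deleted, a matching AFTER DELETE trigger would be needed as well.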
First & foremost, you should worry about your schema if you want decent performance with that number of records and frequent writes; i.e. proper indexes and constraints must be created if not already in place.
Next, in the query itself, select only the fields you need (so if you do not need ALL the players fields, avoid using "players.*").
Personal pref, I'd restructure tables (e.g. playerID in place of id) and query like so:
SELECT p.*, COUNT(c.id) as clicks_count
FROM players p
JOIN clicks c USING(playerID)
GROUP BY p.playerID
ORDER BY clicks_count desc
LIMIT 1
Again, see if you really need ALL player table fields; if not, omit "p.*" and replace with p.foo, p.bar, etc.