I have the following tables:
CREATE TABLE `data` (
`date_time` decimal(26,6) NOT NULL,
`channel_id` mediumint(8) unsigned NOT NULL,
`value` varchar(40) DEFAULT NULL,
`status` tinyint(3) unsigned DEFAULT NULL,
`connected` tinyint(1) unsigned NOT NULL,
PRIMARY KEY (`channel_id`,`date_time`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `channels` (
`channel_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`channel_name` varchar(40) NOT NULL,
PRIMARY KEY (`channel_id`),
UNIQUE KEY `channel_name` (`channel_name`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I was wondering if anyone could give me some advice on how to optimize or rewrite the following query:
SELECT channel_name, t0.date_time, t0.value, t0.status, t0.connected, t1.date_time, t1.value, t1.status, t1.connected FROM channels,
(SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
WHERE date_time <= 1300818330
GROUP BY channel_id) AS t0
RIGHT JOIN
(SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
WHERE date_time <= 1300818334
GROUP BY channel_id) AS t1
ON t0.channel_id = t1.channel_id
WHERE channels.channel_id = t1.channel_id
Basically I am getting the value, status and connected fields for each channel_name at two different times. Since t0 is always <= t1, the fields could exist for t1, but not t0, and I want that to be shown. That is why I am using the RIGHT JOIN. If it does not exist for t1, then it won't exist for t0, so no row should be returned.
The problem seems to be that since I am joining sub queries, no index can be used? I tried rewriting it to do a self join on the channel_id of the data table first but that is millions of rows.
It would also be nice to be able to add a boolean field to each of the final rows that is true when t0.value = t1.value & t0.status = t1.status & t0.connected = t1.connected.
Thank you very much for your time.
You can reduce the two subqueries to one:
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
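GROUP BY is notoriously misleading in MySQL: unless ONLY_FULL_GROUP_BY is enabled, it lets you select columns that are neither aggregated nor listed in GROUP BY, and takes their values from an arbitrary row of each group. Imagine you had MIN() and MAX() in the same SELECT: which row should the non-grouped columns come from? That is why value, status and connected in your subqueries are not guaranteed to come from the row that holds MAX(date_time); the result is not deterministic.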
To get the full t0 and t1 rows
SELECT x.channel_id,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected
FROM (
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
And finally a join to get the channel name
SELECT c.channel_name,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected,
t0.value=t1.value AND t1.status=t0.status
AND t0.connected=t1.connected name_me
FROM (
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
) x
INNER JOIN channels c on c.channel_id = x.channel_id
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
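To see what the final query returns, here is a minimal sketch that runs the same pattern from Python against an in-memory SQLite database. SQLite is only a stand-in for MySQL so that the example is self-contained; the sample rows, the values bound to p1 and p2, and the alias same_state (in place of name_me) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, channel_name TEXT);
CREATE TABLE data (
    date_time REAL NOT NULL,
    channel_id INTEGER NOT NULL,
    value TEXT,
    status INTEGER,
    connected INTEGER NOT NULL,
    PRIMARY KEY (channel_id, date_time)
);
INSERT INTO channels VALUES (1, 'a'), (2, 'b'), (3, 'c');
-- channel 1: nothing new between p1 and p2
-- channel 2: a new row arrives between p1 and p2
-- channel 3: its first row appears only after p1
INSERT INTO data VALUES
    (100, 1, 'x', 0, 1),
    (100, 2, 'y', 0, 1),
    (105, 2, 'z', 0, 1),
    (105, 3, 'w', 0, 1),
    (200, 1, 'late', 0, 1);   -- later than p2, must be ignored
""")

QUERY = """
SELECT c.channel_name,
       t0.date_time, t0.value,
       t1.date_time, t1.value,
       t0.value = t1.value AND t0.status = t1.status
           AND t0.connected = t1.connected AS same_state
FROM (
    -- one pass over data: latest row up to p2, and latest row up to p1
    SELECT channel_id,
           MAX(date_time) AS t1_date_time,
           MAX(CASE WHEN date_time <= :p1 THEN date_time END) AS t0_date_time
    FROM data
    WHERE date_time <= :p2
    GROUP BY channel_id
) x
INNER JOIN channels c ON c.channel_id = x.channel_id
INNER JOIN data t1 ON t1.channel_id = x.channel_id AND t1.date_time = x.t1_date_time
LEFT JOIN data t0 ON t0.channel_id = x.channel_id AND t0.date_time = x.t0_date_time
ORDER BY c.channel_name
"""

rows = conn.execute(QUERY, {"p1": 102, "p2": 110}).fetchall()
for row in rows:
    print(row)
```

Channel c has no row at or before p1, so its t0 columns come back as NULL and the comparison flag is NULL rather than 0. If you need a strict true/false there, wrap the flag in COALESCE(..., 0).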
EDIT
To perform an RLIKE on the channel name, it is simple enough to add a WHERE clause on c.channel_name at the end of the query. It may, however, perform better to filter inside the subquery, making use of MySQL's feature of processing comma-notation joins left to right.
SELECT x.channel_name,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected,
t0.value=t1.value AND t1.status=t0.status
AND t0.connected=t1.connected name_me
FROM (
SELECT c.channel_id, c.channel_name,
MAX(d.date_time) AS t1_date_time,
MAX(case when d.date_time <= {$p1} then d.date_time end) AS t0_date_time
FROM channels c, data d
WHERE c.channel_name RLIKE {$expr}
AND c.channel_id = d.channel_id
AND d.date_time <= {$p2}
GROUP BY c.channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
select Fname, Lname
from Patientinformation
where PatientID in (Select distinctrow PatientID
from Preconditions
where (Patientinformation.PatientID = Preconditions.PatientID)>=2
);
This shows no error, but I'm getting zero results
(Patientinformation.PatientID = Preconditions.PatientID) may be NULL, TRUE or FALSE which is treated as NULL, 1 or 0 respectively.
So (Patientinformation.PatientID = Preconditions.PatientID)>=2 is NULL >= 2, 0 >= 2 or 1 >= 2 - i.e. it is never TRUE.
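The evaluation can be checked directly. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite evaluates these comparisons to NULL, 1 or 0 in the same way as described above for MySQL; the literal values are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First three columns: what a comparison evaluates to.
# Last three columns: the same comparisons followed by ">= 2".
row = conn.execute("""
    SELECT (1 = 1), (1 = 2), (NULL = 1),
           (1 = 1) >= 2, (1 = 2) >= 2, (NULL = 1) >= 2
""").fetchone()
print(row)
```

None of the last three columns is ever 1, which is why the subquery's WHERE clause filters out every row.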
how can I check if the PatientID matches more than one PreCondition, without using the count Function? – sandeep venkat
Why you want NOT to use COUNT()? – Akina
Part of my assignment. – sandeep venkat
Use, for example
SELECT DISTINCT t1.columns
FROM t1
JOIN t2 t21 ON t1.somecolumn = t21.somecolumn
JOIN t2 t22 ON t1.somecolumn = t22.somecolumn AND t21.id != t22.id
or
SELECT t1.columns
FROM t1
WHERE EXISTS ( SELECT NULL
FROM t2 t21
WHERE t1.somecolumn = t21.somecolumn
AND EXISTS ( SELECT NULL
FROM t2 t22
WHERE t21.somecolumn = t22.somecolumn
AND t21.id != t22.id ) )
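Applied to the tables from the question, the self-join variant could look like the sketch below. It runs in an in-memory SQLite database from Python purely so it is self-contained. The column PreconditionID is hypothetical: the question does not show the columns of Preconditions, and the technique only needs some column that tells two rows of the same patient apart. The sample names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patientinformation (PatientID INTEGER PRIMARY KEY, Fname TEXT, Lname TEXT);
-- PreconditionID is a hypothetical column that distinguishes rows.
CREATE TABLE Preconditions (PreconditionID INTEGER PRIMARY KEY, PatientID INTEGER);
INSERT INTO Patientinformation VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
-- Ann has three preconditions, Bob has one, Cy has none.
INSERT INTO Preconditions VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# A patient qualifies if two *different* precondition rows both point at them.
names = conn.execute("""
    SELECT DISTINCT pi.Fname, pi.Lname
    FROM Patientinformation pi
    JOIN Preconditions p1 ON p1.PatientID = pi.PatientID
    JOIN Preconditions p2 ON p2.PatientID = pi.PatientID
                         AND p2.PreconditionID != p1.PreconditionID
""").fetchall()
print(names)
```

DISTINCT is needed because a patient with three preconditions matches several (p1, p2) pairs.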
You can get the patient ids using:
select p.PatientID
from Preconditions p
group by p.PatientID
having count(*) >= 2;
Then you can combine this with the patient information. IN is fine (although it would not be my first choice):
select pi.Fname, pi.Lname
from Patientinformation pi
where pi.PatientID in (select p.PatientID
from Preconditions p
group by p.PatientID
having count(*) >= 2
);
I've been trying to figure this one out for days but can't come up with a solution.
Here are the table schemes.
This is my current query.
SELECT DISTINCT `address`, `order`.`id`
FROM `order`, `ordered_articles`
WHERE `order`.`id` = `f_order_id`
AND `Status` > 1
AND `Status` <4;
The problem is that the query returns as long as there is one article with status bigger than 1. I need a query where all the articles of that order have a status bigger than 1.
You can do it with NOT EXISTS:
SELECT o.`address`, o.`id`
FROM `order` o
WHERE NOT EXISTS (
SELECT 1 FROM `ordered_articles`
WHERE `f_order_id` = o.`id`
AND (`Status` <= 1 OR `Status` >= 4 )
);
or:
SELECT o.`address`, o.`id`
FROM `order` o INNER JOIN `ordered_articles` i
ON i.`f_order_id` = o.`id`
GROUP BY o.`address`, o.`id`
HAVING SUM(`Status` <= 1 OR `Status` >= 4) = 0;
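The two queries are not fully equivalent, which the following sketch illustrates: NOT EXISTS also returns orders that have no articles at all, while the INNER JOIN version drops them. The example runs in an in-memory SQLite database from Python as a stand-in for MySQL; the table definitions are assumptions inferred from the column names in the question, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE `order` (id INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE ordered_articles (f_order_id INTEGER, Status INTEGER);
INSERT INTO `order` VALUES (1, 'all ok'), (2, 'one bad'), (3, 'no articles');
-- order 1: statuses 2 and 3; order 2: statuses 2 and 1; order 3: nothing
INSERT INTO ordered_articles VALUES (1, 2), (1, 3), (2, 2), (2, 1);
""")

# Variant 1: keep orders for which no out-of-range article exists.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT o.id FROM `order` o
    WHERE NOT EXISTS (
        SELECT 1 FROM ordered_articles
        WHERE f_order_id = o.id AND (Status <= 1 OR Status >= 4)
    )
    ORDER BY o.id
""")]

# Variant 2: count out-of-range articles per order and require zero.
having_ids = [r[0] for r in conn.execute("""
    SELECT o.id FROM `order` o
    INNER JOIN ordered_articles i ON i.f_order_id = o.id
    GROUP BY o.address, o.id
    HAVING SUM(i.Status <= 1 OR i.Status >= 4) = 0
    ORDER BY o.id
""")]

print(not_exists_ids, having_ids)
```

Pick the variant that matches how you want empty orders to be treated.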
I have MySQL tables that look like this:
user_messages
id | user_id | phone_number | message | direction | created_at
users
id | name
I want to GROUP BY user_messages twice and UNION the results. Why? Because user_id sometimes holds a valid user id (anything but -1), in which case I group by it; if it is -1, I group by phone_number instead.
I also want to LEFT JOIN the result with the users table to get the user name when user_id is set to a valid user.
I'm almost done with the query, but the problem is:
- I want the record returned for each group to be the latest one, that is, the one with the biggest created_at value.
select * from (
(
select *, count(*) as `total` from
(select `user_id`, `message`, `created_at`, `phone_number`,`direction` from `users_messages` where `user_id` != -1 order by `created_at` desc)
as `t1` group by `user_id`
)
union
(
select *, count(*) as `total` from
(select `user_id`, `message`, `created_at`, `phone_number`,`direction` from `users_messages` where `user_id` = -1 order by `created_at` desc)
as `t2` group by `phone_number`
)
) as `t3`
left join (select `id`,`name` from `users`) as `t4` on `t3`.`user_id` = `t4`.`id` order by `created_at` desc
What this gets me is the results not sorted by created_at DESC
Update:
The query actually works on my local machine but not on the production server. On my local machine I have 5.5.42 - Source distribution and on the server Ver 14.14 Distrib 5.7.17, for Linux (x86_64) using EditLine wrapper ... What could be wrong?
On my local machine it correctly returns the max created_at, but on the server it returns the FIRST created_at for each grouped record.
Something like this should work:
SELECT s.`user_id`, um.`phone_number`, s.msgCount
, um.`message`, um.`created_at`, um.`direction`
, u.`name` AS userName
FROM (
SELECT `user_id`, IF(`user_id` = -1, `phone_number`, '') AS altID, MAX(`created_at`) AS lastCreatedAt, COUNT(*) AS msgCount
FROM `users_messages`
GROUP BY user_id, altID
) AS s
INNER JOIN `users_messages` AS um
ON s.user_id = um.user_id
AND s.altID = IF(um.`user_id` = -1, um.`phone_number`, '')
AND s.lastCreatedAt = um.created_at
LEFT JOIN `users` AS u
ON s.user_id = u.id
ORDER BY um.created_at DESC
;
The s subquery gets the summary information for each user and userless phone number; the summary information calculated includes the most recent created_at value for use in the following....
The join to um gets the row data for their last messages (by including the lastCreatedAt value from s in the join criteria)
The final join to users is used to get the user.name for the known users (and assumes there will be no -1 user, or that such a user would have an appropriate 'unknown' name.)
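The three steps above can be tried out with the sketch below, which runs in an in-memory SQLite database from Python so that it is self-contained. SQLite has no guarantee of an IF() function in every version, so CASE WHEN is swapped in for it; the column types and sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users_messages (
    id INTEGER PRIMARY KEY, user_id INTEGER, phone_number TEXT,
    message TEXT, direction TEXT, created_at INTEGER
);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO users_messages VALUES
    (1,  1, '111', 'old',   'in',  10),
    (2,  1, '111', 'new',   'out', 20),
    (3, -1, '222', 'first', 'in',  5),
    (4, -1, '222', 'last',  'in',  30),
    (5, -1, '333', 'only',  'in',  15);
""")

rows = conn.execute("""
    SELECT s.user_id, um.phone_number, s.msgCount, um.message, u.name
    FROM (
        -- step 1: one summary row per user, or per phone number when user_id = -1
        SELECT user_id,
               CASE WHEN user_id = -1 THEN phone_number ELSE '' END AS altID,
               MAX(created_at) AS lastCreatedAt,
               COUNT(*) AS msgCount
        FROM users_messages
        GROUP BY user_id, altID
    ) AS s
    -- step 2: fetch the full row of the latest message in each group
    INNER JOIN users_messages AS um
        ON s.user_id = um.user_id
       AND s.altID = CASE WHEN um.user_id = -1 THEN um.phone_number ELSE '' END
       AND s.lastCreatedAt = um.created_at
    -- step 3: attach the name for known users
    LEFT JOIN users AS u ON s.user_id = u.id
    ORDER BY um.created_at DESC
""").fetchall()
for row in rows:
    print(row)
```

Note that if two messages in the same group share the same created_at, the join in step 2 returns both of them.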
Since you're grouping by user_id and phone_number, you can't keep message or direction. Add a max function for created_at in each subquery. I think this would work.
select * from (
(
select user_id
,'' as phone_number
,max(`created_at`) as `created_at`
,count(*) as `total` from
(select `user_id`
,`created_at`
from `users_messages`
where `user_id` != -1)
as `t1` group by `user_id`
)
union
(
select '' as user_id
,phone_number
,max(`created_at`) as `created_at`
,count(*) as `total` from
(select `created_at`
,`phone_number`
from `users_messages`
where `user_id` = -1)
as `t2` group by `phone_number`
)
) as `t3`
left join (select `id`,`name` from `users`) as `t4`
on `t3`.`user_id` = `t4`.`id`
order by `created_at` desc
I'm concerned about the performance of the query below once the tables are fully populated. So far it's under development and performs well with dummy data.
The table "adress_zoo" will contain about 500 million records once fully populated. "adress_zoo" table looks like this:
CREATE TABLE `adress_zoo`
( `adress_id` int(11) NOT NULL, `zoo_id` int(11) NOT NULL,
UNIQUE KEY `pk` (`adress_id`,`zoo_id`),
KEY `adress_id` (`adress_id`) )
ENGINE=InnoDB DEFAULT CHARSET=latin1;
The other tables will contain maximum 500 records each.
The full query looks like this:
SELECT a.* FROM jos_zoo_item AS a
JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id
WHERE a.id IN (
SELECT r.id FROM (
SELECT zi.id AS id, Max(zi.priority) as prio
FROM jos_zoo_item AS zi
JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id
LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id
JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id
JOIN adress_zoo AS az ON az.zoo_id = zi.id
WHERE 1=1
AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL))
OR (zci.category_id = 0 AND zt.name = 'prolong') )
AND zi.type = 'telefoni'
AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef'
AND zsi.value = 'Small'
AND zci.category_id > 15
AND az.adress_id = 5
GROUP BY zci.category_id ) AS r
)
AND a.application_id = 6
AND a.access IN (1,1)
AND a.state = 1
AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26')
AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26')
AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14'
GROUP BY a.id
ORDER BY zsi2.value ASC
The query will usually return about 25 records.
Based on your experience, will this query perform acceptable (respond within say 3 seconds)?
What can I do to optimise this?
As advised by @Jack, I ran the query with EXPLAIN and got this:
This part is an important limiter:
az.adress_id = 5
MySQL will limit the table to only those records where adress_id matches before joining it with the rest of the statement, so it will depend on how big you think that result set might be.
Btw, you have a UNIQUE(adress_id, zoo_id) and a separate INDEX. Is there a particular reason? Because the first part of a spanning key can be used by MySQL to select with as well.
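The leftmost-prefix idea can be illustrated with the small sketch below. It uses SQLite's EXPLAIN QUERY PLAN from Python, so it shows SQLite's planner and not MySQL's; treat it only as an illustration of the principle and confirm the real behaviour with EXPLAIN on your MySQL server. The index is created explicitly under the name pk so that it can be recognised in the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE adress_zoo (adress_id INTEGER NOT NULL, zoo_id INTEGER NOT NULL);
-- Only the composite unique index exists; there is no separate index on adress_id.
CREATE UNIQUE INDEX pk ON adress_zoo (adress_id, zoo_id);
""")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT zoo_id FROM adress_zoo WHERE adress_id = 5"
).fetchall()

# The last column of each plan row is the human-readable description.
details = [row[-1] for row in plan]
print(details)
```

The plan mentions the composite index pk even though the filter is on adress_id alone, which is the reason the separate single-column index looks redundant.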
What's also important is to use EXPLAIN to understand how MySQL will "attack" your query and return the results. See also: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
To avoid subquery you can try to rewrite your query as:
SELECT a.* FROM jos_zoo_item AS a
JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id
INNER JOIN
(
SELECT DISTINCT r.id FROM (
SELECT zi.id AS id, Max(zi.priority) as prio
FROM jos_zoo_item AS zi
JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id
LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id
JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id
JOIN adress_zoo AS az ON az.zoo_id = zi.id
WHERE 1=1
AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL))
OR (zci.category_id = 0 AND zt.name = 'prolong') )
AND zi.type = 'telefoni'
AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef'
AND zsi.value = 'Small'
AND zci.category_id > 15
AND az.adress_id = 5
GROUP BY zci.category_id ) AS r
) T
on a.id = T.id
WHERE a.application_id = 6
AND a.access IN (1,1)
AND a.state = 1
AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26')
AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26')
AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14'
GROUP BY a.id
ORDER BY zsi2.value ASC
This approach doesn't run the subquery for each candidate row. Performance will improve only if T can be computed in a few milliseconds.
I am creating a small message board and I am stuck
I can select the subject, the original author, the number of replies but what I can't do is get the username, topic or date of the last post.
There are 3 tables, boards, topics and messages.
I want to get the author, date and topic of the last message in the messages table. The author and date fields are already on the messages table, but I would need to join the messages and topics tables on the topicid field.
This is my query that selects the subject, author, and number of replies:
SELECT t.topicname, t.author, count( message ) AS message
FROM topics t
INNER JOIN messages m
ON m.topicid = t.topicid
INNER JOIN boards b
ON b.boardid = t.boardid
WHERE b.boardid = 1
GROUP BY t.topicname
Can anyone please help me get this finished?
This is what my tables look like
CREATE TABLE `boards` (
`boardid` int(2) NOT NULL auto_increment,
`boardname` varchar(255) NOT NULL default '',
PRIMARY KEY (`boardid`)
);
CREATE TABLE `messages` (
`messageid` int(6) NOT NULL auto_increment,
`topicid` int(4) NOT NULL default '0',
`message` text NOT NULL,
`author` varchar(255) NOT NULL default '',
`date` timestamp(14) NOT NULL,
PRIMARY KEY (`messageid`)
);
CREATE TABLE `topics` (
`topicid` int(4) NOT NULL auto_increment,
`boardid` int(2) NOT NULL default '0',
`topicname` varchar(255) NOT NULL default '',
`author` varchar(255) NOT NULL default '',
PRIMARY KEY (`topicid`)
);
If your SQL supports the LIMIT clause:
SELECT m.author, m.date, t.topicname FROM messages m
JOIN topics t ON m.topicid = t.topicid
ORDER BY date desc LIMIT 1
otherwise:
SELECT m.author, m.date, t.topicname FROM messages m
JOIN topics t ON m.topicid = t.topicid
WHERE m.date = (SELECT max(m2.date) from messages m2)
EDIT: if you want to combine this with the original query, it has to be rewritten using subqueries to extract the message count and the date of last message:
SELECT t.topicname, t.author,
(select count(message) from messages m where m.topicid = t.topicid) AS messagecount,
lm.author, lm.date
FROM topics t
INNER JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
INNER JOIN boards b
ON b.boardid = t.boardid
WHERE b.boardid = 1
GROUP BY t.topicname
also notice that if you don't pick any field from table boards, you don't need the last join:
SELECT t.topicname, t.author,
(select count(message) from messages m where m.topicid = t.topicid) AS messagecount,
lm.author, lm.date
FROM topics t
INNER JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
WHERE t.boardid = 1
GROUP BY t.topicname
EDIT: if mysql doesn't support subqueries in the field list, you can try this:
SELECT t.topicname, t.author, mc.messagecount, lm.author, lm.date
FROM topics t
JOIN (select m.topicid, count(*) as messagecount from messages m group by m.topicid) as mc
ON mc.topicid = t.topicid
JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
WHERE t.boardid = 1
GROUP BY t.topicname
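Here is a minimal sketch of the derived-table variant, run from Python against an in-memory SQLite database so it is self-contained. It assumes the last message is wanted per topic, so the MAX(date) subquery is correlated on topicid; the date column is stored as a plain integer and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (topicid INTEGER PRIMARY KEY, boardid INTEGER,
                     topicname TEXT, author TEXT);
CREATE TABLE messages (messageid INTEGER PRIMARY KEY, topicid INTEGER,
                       message TEXT, author TEXT, date INTEGER);
INSERT INTO topics VALUES (1, 1, 'cats', 'ann'), (2, 1, 'dogs', 'bob');
INSERT INTO messages VALUES
    (1, 1, 'hi', 'ann', 10),
    (2, 1, 're', 'cy',  20),
    (3, 2, 'yo', 'bob', 15);
""")

rows = conn.execute("""
    SELECT t.topicname, t.author, mc.messagecount, lm.author, lm.date
    FROM topics t
    -- message count per topic
    JOIN (SELECT topicid, COUNT(*) AS messagecount
          FROM messages GROUP BY topicid) AS mc
      ON mc.topicid = t.topicid
    -- latest message of *this* topic
    JOIN messages lm
      ON lm.topicid = t.topicid
     AND lm.date = (SELECT MAX(m2.date) FROM messages m2
                    WHERE m2.topicid = t.topicid)
    WHERE t.boardid = 1
    ORDER BY t.topicname
""").fetchall()
for row in rows:
    print(row)
```

Each topic comes back once with its original author, its message count, and the author and date of its most recent message.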
If you want to get the latest entry in a table, you should have a DateTime field that shows when the entry was created (or updated). You can then sort on this column and select the latest one.
But if your id field is a number, you could find the highest. But I would recommend against this because it makes many assumptions and you would be fixed to numerical ids in the future.
You can use a subselect. Eg.:
select * from messages where id = (select max(id) from messages)
edit: And if you identify the newest record by a timestamp, you'd use:
select * from messages where id = (
select id
from messages
order by post_time desc
limit 1)
With MySQL this should work:
SELECT author, date, topicname AS topic FROM messages LEFT JOIN topics ON messages.topicid = topics.topicid ORDER BY date DESC LIMIT 0, 1;