MySQL: Get latest row by two dates

Maybe someone could give me some advice on how to achieve my goal.
I'm using MySQL.
I have a table with historical data:
CREATE TABLE `history` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`parent_id` int(11) DEFAULT NULL,
`from_dt` date NOT NULL,
`date_create` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`approved` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `parent_id` (`parent_id`)
)
Is there an easier way to get the latest record for each user (user_id) in this table, where from_dt is earlier than NOW()?
from_dt can contain any date, so there might be records in the future and in the past.
What I have so far:
SELECT * FROM `history` right join (SELECT
history.user_id, MAX(date_create)
FROM
history
RIGHT JOIN
(SELECT
user_id, MAX(from_dt) max_from
FROM
history
WHERE
from_dt < NOW()
GROUP BY user_id , from_dt) AS hf ON hf.max_from = history.from_dt
AND hf.user_id = history.user_id
GROUP BY user_id) as hdt on hdt.user_id = history.user_id
But joining the table to itself three times looks a little messy to me, because I also have to join additional data here (user info, etc.).
Many thanks,
Max

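You can try this. The derived table finds each user's latest from_dt that is already in the past, and the join picks up the matching rows:
SELECT H1.*
FROM `history` H1
INNER JOIN (SELECT user_id, MAX(from_dt) max_from
FROM history
WHERE from_dt < NOW()
GROUP BY user_id) users
ON H1.user_id = users.user_id
AND H1.from_dt = users.max_from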
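If two rows for the same user share the same from_dt, a join on MAX(from_dt) alone returns both of them. Below is a minimal sketch of one way to break that tie with date_create. It runs against an in-memory SQLite database from Python only so that it is self-contained. The sample rows and the fixed `today` value are made up for illustration, and in MySQL you would keep NOW() instead of the bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE history (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        from_dt TEXT NOT NULL,
        date_create TEXT NOT NULL
    )
""")
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO history (id, user_id, from_dt, date_create) VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2024-01-01", "2023-12-01 09:00:00"),
        (2, 10, "2024-03-01", "2024-02-01 09:00:00"),
        (3, 10, "2024-03-01", "2024-02-15 09:00:00"),  # same from_dt, created later
        (4, 10, "2030-01-01", "2024-02-20 09:00:00"),  # future row, must be ignored
        (5, 20, "2024-02-01", "2024-01-10 09:00:00"),
    ],
)

today = "2024-06-01"  # stand-in for NOW()
latest_rows = conn.execute(
    """
    SELECT h.id, h.user_id, h.from_dt, h.date_create
    FROM history h
    WHERE h.id = (
        -- newest past from_dt first, then newest date_create as tie-breaker
        SELECT h2.id
        FROM history h2
        WHERE h2.user_id = h.user_id
          AND h2.from_dt < ?
        ORDER BY h2.from_dt DESC, h2.date_create DESC, h2.id DESC
        LIMIT 1
    )
    ORDER BY h.user_id
    """,
    (today,),
).fetchall()

for row in latest_rows:
    print(row)
```

Because the result is a plain single-table SELECT, additional tables such as user info can be joined onto h without adding more self-joins.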

Related

Refactor query to work with MySQL (unknown column error)

Schema:
CREATE TABLE IF NOT EXISTS `user` (
`id` BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
`deleted` TIMESTAMP NOT NULL,
`email` VARCHAR(254) NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS `userVersion` (
`userId` BIGINT UNSIGNED,
`effective` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`created` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`name` VARCHAR(100) NOT NULL,
PRIMARY KEY (`userId`, `effective`, `created`),
FOREIGN KEY (`userId`) REFERENCES `user`(`id`)
);
The query I'm trying to perform:
SELECT u.id
FROM `user` u
INNER JOIN userVersion uv
ON u.id = uv.userId
AND uv.effective = (
SELECT MAX(uv1.effective)
FROM userVersion uv1
WHERE uv1.userId = u.id
AND uv1.effective <= NOW())
AND uv.created = (
SELECT MAX(uv2.created)
FROM userVersion uv2
WHERE uv2.userId = u.id
AND uv2.effective = uv1.effective
AND uv2.created <= NOW())
I'm getting an unknown column error for uv1.effective (on the second-to-last line of the query). I believe this query works in other databases (e.g. Oracle), but it doesn't seem to work with MySQL. How could I change this query to get the same behavior?
PS: The created column is supposed to represent when the row was inserted in the database while effective is supposed to represent when that row should start being used (this allows me to add changes in the present that will work in the future).
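The alias uv1 is declared inside the first subquery, so it is not visible inside the second one. The joined row uv already carries the chosen effective value, so one possible change is to compare against uv.effective instead. The following is a hedged sketch of that change, run against in-memory SQLite from Python only to keep it self-contained. The sample rows and the bound `now` value are invented, the schema is trimmed to the columns the query uses, and MySQL would use NOW() in place of the parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE userVersion (
        userId INTEGER, effective TEXT, created TEXT, name TEXT NOT NULL,
        PRIMARY KEY (userId, effective, created)
    );
    INSERT INTO user VALUES (1, 'a@example.com');
    INSERT INTO userVersion VALUES
        (1, '2024-01-01 00:00:00', '2023-12-01 00:00:00', 'old'),
        (1, '2024-03-01 00:00:00', '2024-02-01 00:00:00', 'first draft'),
        (1, '2024-03-01 00:00:00', '2024-02-10 00:00:00', 'corrected'),
        (1, '2030-01-01 00:00:00', '2024-02-20 00:00:00', 'future');
""")

now = "2024-06-01 00:00:00"  # stand-in for NOW()
current_versions = conn.execute(
    """
    SELECT u.id, uv.name
    FROM user u
    INNER JOIN userVersion uv
        ON u.id = uv.userId
       AND uv.effective = (
            SELECT MAX(uv1.effective)
            FROM userVersion uv1
            WHERE uv1.userId = u.id
              AND uv1.effective <= :now)
       AND uv.created = (
            SELECT MAX(uv2.created)
            FROM userVersion uv2
            WHERE uv2.userId = u.id
              AND uv2.effective = uv.effective  -- outer alias uv, not uv1
              AND uv2.created <= :now)
    """,
    {"now": now},
).fetchall()

print(current_versions)
```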

select taking 8 seconds. improve ideas

I have this select to get chat (like facebook inbox).
It will show most recent messages, grouping by user who sent them.
SELECT c.id, c.from, c.to, c.sent, c.message, c.recd FROM chat c
WHERE c.id IN(
SELECT MAX(id) FROM chat
WHERE (`to` = 1 and `del_to_status` = '0') or (`from` = 1 and `del_from_status` = '0')
GROUP BY CASE WHEN 1 = `to` THEN `from` ELSE `to` END
)
ORDER BY id DESC
limit 60
The problem is it is taking about 8 seconds.
CREATE TABLE `chat` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`from` int(11) UNSIGNED NOT NULL,
`to` int(11) UNSIGNED NOT NULL,
`message` text NOT NULL,
`sent` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`recd` tinyint(1) NOT NULL DEFAULT '0',
`del_from_status` tinyint(1) NOT NULL DEFAULT '0',
`del_to_status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `from` (`from`),
KEY `to` (`to`),
FOREIGN KEY (`from`) REFERENCES cadastro (`id`),
FOREIGN KEY (`to`) REFERENCES cadastro (`id`)
)
Any ideas for indexing or rewriting this SELECT to get better speed?
I am assuming chat.id is indexed. If not, of course you should add an index.
Even if it is indexed, MySQL is often very slow with subselects.
One thing you can do is convert your subselect to a temporary table and join with it.
It will look something like this:
CREATE TEMPORARY TABLE IF NOT EXISTS max_chat_ids
( INDEX(id) )
ENGINE=MEMORY
AS ( SELECT MAX(id) AS id FROM chat
WHERE (`to` = 1 and `del_to_status` = '0') or (`from` = 1 and `del_from_status` = '0')
GROUP BY CASE WHEN 1 = `to` THEN `from` ELSE `to` END );
Then you just need to join with the temp table:
SELECT c.id, c.from, c.to, c.sent, c.message, c.recd FROM chat c
join max_chat_ids d on c.id=d.id
ORDER BY c.id DESC
limit 60
Temp tables only live for the duration of the session, so if you test this in phpMyAdmin, remember to execute both queries together with ';' between them.
If you try this, please share your result.
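The two-step idea (materialise the newest message id per conversation partner, then join back to chat) can be sketched as follows. This is only an illustration on in-memory SQLite driven from Python: SQLite has no ENGINE=MEMORY clause or inline INDEX in CREATE TABLE ... AS, and it quotes the reserved column names with double quotes rather than backticks. The sample messages are hypothetical, and the sketch says nothing about timings on a real MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chat (
        id INTEGER PRIMARY KEY,
        "from" INTEGER NOT NULL,
        "to" INTEGER NOT NULL,
        message TEXT NOT NULL,
        del_from_status INTEGER NOT NULL DEFAULT 0,
        del_to_status INTEGER NOT NULL DEFAULT 0
    )
""")
# Hypothetical messages; user 1 is the inbox owner, as in the question.
conn.executemany(
    'INSERT INTO chat (id, "from", "to", message, del_to_status) VALUES (?, ?, ?, ?, ?)',
    [
        (1, 2, 1, "hi from 2", 0),
        (2, 1, 2, "reply to 2", 0),
        (3, 3, 1, "hi from 3", 0),
        (4, 3, 1, "deleted by recipient", 1),
        (5, 4, 5, "unrelated conversation", 0),
    ],
)

# Step 1: newest visible message id per conversation partner of user 1.
conn.execute("""
    CREATE TEMPORARY TABLE max_chat_ids AS
    SELECT MAX(id) AS id
    FROM chat
    WHERE ("to" = 1 AND del_to_status = 0)
       OR ("from" = 1 AND del_from_status = 0)
    GROUP BY CASE WHEN "to" = 1 THEN "from" ELSE "to" END
""")

# Step 2: join back to chat to fetch the full rows.
inbox = conn.execute("""
    SELECT c.id, c."from", c."to", c.message
    FROM chat c
    JOIN max_chat_ids d ON c.id = d.id
    ORDER BY c.id DESC
    LIMIT 60
""").fetchall()

for row in inbox:
    print(row)
```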
I'll assume the column id is already indexed since it probably is the primary key of the table. If it's not the case, add the index:
create index ix1_chat on chat (id);
Then, if the selectivity of the subquery is good then an index will help. The selectivity is the percentage of rows the select is reading compared to the total number of rows. Is it 50%, 5%, 0.5%? If it's 5% or less then the following index will help:
create index ix2_chat on chat (`to`, del_to_status, `from`, del_from_status);
As a side note, please don't use reserved words for column names: I'm talking about the from column. It just makes life difficult for everyone.

Optimize a query

How can I make my response time faster? The average response time is approximately 0.2 s (8039 records in my items table and 81 records in my tracking table).
Query
SELECT a.name, b.cnt FROM `items` a LEFT JOIN
(SELECT guid, COUNT(*) cnt FROM tracking WHERE
date > UNIX_TIMESTAMP(NOW() - INTERVAL 1 day ) GROUP BY guid) b ON
a.`id` = b.guid WHERE a.`type` = 'streaming' AND a.`state` = 1
ORDER BY b.cnt DESC LIMIT 15 OFFSET 75
Tracking table structure
CREATE TABLE `tracking` (
`id` bigint(11) NOT NULL AUTO_INCREMENT,
`guid` int(11) DEFAULT NULL,
`ip` int(11) NOT NULL,
`date` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `i1` (`ip`,`guid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4303 DEFAULT CHARSET=latin1;
Items table structure
CREATE TABLE `items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`guid` int(11) DEFAULT NULL,
`type` varchar(255) DEFAULT NULL,
`name` varchar(255) DEFAULT NULL,
`embed` varchar(255) DEFAULT NULL,
`url` varchar(255) DEFAULT NULL,
`description` text,
`tags` varchar(255) DEFAULT NULL,
`date` int(11) DEFAULT NULL,
`vote_val_total` float DEFAULT '0',
`vote_total` float(11,0) DEFAULT '0',
`rate` float DEFAULT '0',
`icon` text CHARACTER SET ascii,
`state` int(11) DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9258 DEFAULT CHARSET=latin1;
Your query, as written, doesn't make much sense. It produces all possible combinations of rows in your two tables and then groups them.
You may want this:
SELECT a.*, b.cnt
FROM `items` a
LEFT JOIN (
SELECT guid, COUNT(*) cnt
FROM tracking
WHERE `date` > UNIX_TIMESTAMP(NOW() - INTERVAL 1 day)
GROUP BY guid
) b ON a.guid = b.guid
ORDER BY b.cnt DESC
The high-volume data in this query come from the relatively large tracking table. So, you should add a compound index to it, using the columns (date, guid). This will allow your query to random-access the index by date and then scan it for guid values.
ALTER TABLE tracking ADD INDEX guid_summary (`date`, guid);
I suppose you'll see a nice performance improvement.
Pro tip: Don't use SELECT *. Instead, give a list of the columns you want in your result set. For example,
SELECT a.guid, a.name, a.description, b.cnt
Why is this important?
First, it makes your software more resilient against somebody adding columns to your tables in the future.
Second, it tells the MySQL server to sling around only the information you want. That can improve performance really dramatically, especially when your tables get big.
Since tracking has significantly fewer rows than items, I will propose the following.
SELECT i.name, c.cnt
FROM
(
SELECT guid, COUNT(*) cnt
FROM tracking
WHERE date > UNIX_TIMESTAMP(NOW() - INTERVAL 1 day )
GROUP BY guid
) AS c
JOIN items AS i ON i.id = c.guid
WHERE i.type = 'streaming'
AND i.state = 1
ORDER BY c.cnt DESC
LIMIT 15 OFFSET 75;
It will fail to display any items for which cnt is 0. (Your version displays the items with NULL for the count.)
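The difference between the two join directions can be seen on a tiny data set. The sketch below uses in-memory SQLite from Python purely for illustration. The rows, the `since` cutoff (a stand-in for UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)), and the COALESCE that turns a missing count into 0 are assumptions added here, not part of either query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, type TEXT, state INTEGER);
    CREATE TABLE tracking (id INTEGER PRIMARY KEY, guid INTEGER, date INTEGER);
    INSERT INTO items VALUES
        (1, 'alpha', 'streaming', 1),
        (2, 'beta',  'streaming', 1),
        (3, 'gamma', 'download',  1);
    INSERT INTO tracking (guid, date) VALUES (1, 1000), (1, 1001), (3, 1002), (1, 10);
""")

since = 500  # hypothetical cutoff timestamp

# Per-guid hit counts for the recent window, shared by both variants.
counts_sql = """
    SELECT guid, COUNT(*) AS cnt
    FROM tracking
    WHERE date > :since
    GROUP BY guid
"""

# Inner join: items without recent hits disappear from the result.
inner_rows = conn.execute(
    f"""
    SELECT i.name, c.cnt
    FROM ({counts_sql}) AS c
    JOIN items AS i ON i.id = c.guid
    WHERE i.type = 'streaming' AND i.state = 1
    ORDER BY c.cnt DESC
    """,
    {"since": since},
).fetchall()

# Left join: every matching item is kept; a missing count becomes 0.
left_rows = conn.execute(
    f"""
    SELECT i.name, COALESCE(c.cnt, 0) AS cnt
    FROM items AS i
    LEFT JOIN ({counts_sql}) AS c ON i.id = c.guid
    WHERE i.type = 'streaming' AND i.state = 1
    ORDER BY cnt DESC, i.name
    """,
    {"since": since},
).fetchall()

print(inner_rows)
print(left_rows)
```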
Composite indexes needed:
items: The PRIMARY KEY(id) is sufficient.
tracking: INDEX(date, guid) -- "covering"
Other issues:
If ip is an IP-address, it needs to be INT UNSIGNED. But that covers only IPv4, not IPv6.
It seems like date is not just a "date", but really a date+time. Please rename it to avoid confusion.
float(11,0) -- Don't use FLOAT for integers. Don't use (m,n) on FLOAT or DOUBLE. INT UNSIGNED makes more sense here.
OFFSET is naughty when it comes to performance -- it must scan over the skipped records. But, in your query, there is no way to avoid collecting all the possible rows, sorting them, stepping over 75, and only finally delivering 15 rows. (And, with no more than 81, it won't be a full 15.)
What version are you using? There have been important changes to the Optimization of LEFT JOIN ( SELECT ... ). Please provide EXPLAIN SELECT for each query under discussion.

How to join two tables without messing up the query

I have this query for example (good, it works how I want it to)
SELECT `discusComments`.`memberID`, COUNT( `discusComments`.`memberID`) AS postcount
FROM `discusComments`
GROUP BY `discusComments`.`memberID` ORDER BY postcount DESC
Example Results:
memberid postcount
3 283
6 230
9 198
Now I want to join the discusComments table with the discusTopics table on memberID, because what I really want is to get results only from a specific group, and the group id is only in the topics table, not in the comments table (hence the join).
SELECT `discusComments`.`memberID`, COUNT( `discusComments`.`memberID`) AS postcount
FROM `discusComments`
LEFT JOIN `discusTopics` ON `discusComments`.`memberID` = `discusTopics`.`memberID`
GROUP BY `discusComments`.`memberID` ORDER BY postcount DESC
Example Results:
memberid postcount
3 14789
6 8678
9 6987
How can I stop this huge increase happening in the postcount? I need to preserve it as before.
Once I have this sorted, I want to add a condition such as WHERE discusTopics.groupID = 6.
CREATE TABLE IF NOT EXISTS `discusComments` (
`id` bigint(255) NOT NULL auto_increment,
`topicID` bigint(255) NOT NULL,
`comment` text NOT NULL,
`timeStamp` bigint(12) NOT NULL,
`memberID` bigint(255) NOT NULL,
`thumbsUp` int(15) NOT NULL default '0',
`thumbsDown` int(15) NOT NULL default '0',
`status` int(1) NOT NULL default '1',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=7190 ;
CREATE TABLE IF NOT EXISTS `discusTopics` (
`id` bigint(255) NOT NULL auto_increment,
`groupID` bigint(255) NOT NULL,
`memberID` bigint(255) NOT NULL,
`name` varchar(255) NOT NULL,
`views` bigint(255) NOT NULL default '0',
`lastUpdated` bigint(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `groupID` (`groupID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=913 ;
SELECT `discusComments`.`memberID`, COUNT( `discusComments`.`memberID`) AS postcount
FROM `discusComments`
JOIN `discusTopics` ON `discusComments`.`topicID` = `discusTopics`.`id`
GROUP BY `discusComments`.`memberID` ORDER BY postcount DESC
Joining on the topic ID (discusComments.topicID = discusTopics.id) solved the memberID issue. Thanks @Andiry M
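The jump in postcount comes from the join condition: matching on memberID pairs every comment by a member with every topic that member started, so the count is multiplied. A small sketch of the effect, using in-memory SQLite from Python with made-up rows; the helper `post_counts` is hypothetical and exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE discusTopics (id INTEGER PRIMARY KEY, groupID INTEGER, memberID INTEGER);
    CREATE TABLE discusComments (id INTEGER PRIMARY KEY, topicID INTEGER, memberID INTEGER);
    -- member 3 started two topics in group 6 and one in group 7
    INSERT INTO discusTopics VALUES (1, 6, 3), (2, 6, 3), (3, 7, 3);
    -- member 3 wrote three comments, member 9 wrote one
    INSERT INTO discusComments VALUES (1, 1, 3), (2, 1, 3), (3, 3, 3), (4, 2, 9);
""")

def post_counts(join_condition, where=""):
    """Count comments per member for a given join condition (illustration only)."""
    sql = f"""
        SELECT c.memberID, COUNT(c.memberID) AS postcount
        FROM discusComments c
        JOIN discusTopics t ON {join_condition}
        {where}
        GROUP BY c.memberID
        ORDER BY postcount DESC, c.memberID
    """
    return conn.execute(sql).fetchall()

by_member = post_counts("c.memberID = t.memberID")   # 3 comments x 3 topics
by_topic = post_counts("c.topicID = t.id")           # one topic per comment
group_6 = post_counts("c.topicID = t.id", "WHERE t.groupID = 6")

print(by_member)
print(by_topic)
print(group_6)
```

Joining on the topic ID matches each comment to exactly one topic, which is why the counts stay intact and the groupID filter can then be applied in a WHERE clause.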
You need to use just JOIN, not LEFT JOIN, and you can add AND discusTopics.memberID = 6 after ON discusComments.memberID = discusTopics.memberID.
You can use a subquery like this:
SELECT `discusComments`.`memberID`, COUNT( `discusComments`.`memberID`) AS postcount
FROM `discusComments` WHERE `discusComments`.`memberID` IN
(SELECT DISTINCT memberID FROM `discusTopics` WHERE groupID = 6)
GROUP BY `discusComments`.`memberID` ORDER BY postcount DESC
If I understand your question correctly, you do not need to use JOIN here at all. JOINs are needed when you have many-to-many relationships and, for each value in one table, you need to select all corresponding values in another table.
But here you have a many-to-one relationship, if I got it right. Then you can simply select from two tables like this:
SELECT a.*, b.id FROM a, b WHERE a.pid = b.id
This is a simple request and won't create a giant overhead as JOIN does.
PS: In the future try to experiment with your queries, try to avoid JOINs especially in MySQL. They are slow and dangerous in their complexity. For 90% of cases when you want to use JOIN there is simple and much faster solution.

Some help needed with a SQL query

I need some help with a MySQL query. I have two tables, one with offers and one with statuses. An offer can have one or more statuses. What I would like to do is get all the offers and their latest status. Each status has a field named 'added', which can be used for sorting.
I know this can be easily done with two queries, but I need to make it with only one because I also have to apply some filters later in the project.
Here's my setup:
CREATE TABLE `test`.`offers` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`client` TEXT NOT NULL ,
`products` TEXT NOT NULL ,
`contact` TEXT NOT NULL
) ENGINE = MYISAM ;
CREATE TABLE `statuses` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`offer_id` int(11) NOT NULL,
`options` text NOT NULL,
`deadline` date NOT NULL,
`added` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
This should work, but it is not very optimal IMHO:
SELECT *
FROM offers
INNER JOIN statuses ON (statuses.offer_id = offers.id
AND statuses.id =
(SELECT allStatuses.id
FROM statuses allStatuses
WHERE allStatuses.offer_id = offers.id
ORDER BY allStatuses.added DESC LIMIT 1))
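Try this. It joins each offer to the status row whose added value is the latest for that offer:
SELECT
o.*, s.*
FROM offers o
INNER JOIN statuses s ON o.id = s.offer_id
INNER JOIN (SELECT offer_id, MAX(added) AS max_added
FROM statuses
GROUP BY offer_id) latest
ON latest.offer_id = s.offer_id
AND latest.max_added = s.added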
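If offers that have no status yet should still appear in the result, a LEFT JOIN variant is one option. Below is a minimal sketch on in-memory SQLite from Python, with trimmed-down tables and invented rows. It assumes that added is unique per offer; if two statuses of one offer share the same added value, both would be returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offers (id INTEGER PRIMARY KEY, client TEXT NOT NULL);
    CREATE TABLE statuses (
        id INTEGER PRIMARY KEY, offer_id INTEGER NOT NULL,
        options TEXT NOT NULL, added TEXT NOT NULL
    );
    INSERT INTO offers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO statuses VALUES
        (1, 1, 'sent',     '2024-01-01 10:00:00'),
        (2, 1, 'accepted', '2024-01-05 10:00:00'),
        (3, 2, 'sent',     '2024-01-03 10:00:00');
""")

# Each offer with its newest status; offers without any status get NULLs.
latest = conn.execute("""
    SELECT o.id, o.client, s.options
    FROM offers o
    LEFT JOIN statuses s
        ON s.offer_id = o.id
       AND s.added = (SELECT MAX(s2.added)
                      FROM statuses s2
                      WHERE s2.offer_id = o.id)
    ORDER BY o.id
""").fetchall()

for row in latest:
    print(row)
```

Further filters on offers or on the chosen status can then be added in a WHERE clause of the same query.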