Simple SQL query but wrong results returned - mysql

I have a simple click tracking system that consists of three tables "tracking" (which holds unique views), "views" (which holds raw views) and "products" (which holds products).
Here's how it works: each time a user clicks on a tracking link, if the hash present in the link does not exist in the database, it is saved in the "tracking" table as a unique view and also in the "views" table as a raw view. If the hash present in the link does exist in the database, it is saved only in the "views" table. So the number of "raw views" cannot be smaller than the number of "unique views", because each "unique view" also counts as a "raw view".
I wrote a query to create reports based on products, but the number of "raw views" returned is not correct.
I've also created a fiddle, which I hope will give a better overview of my problem.
Here's the table structure:
CREATE TABLE `products` (
`id` int(10) UNSIGNED NOT NULL,
`name` varchar(128) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `products` (`id`, `name`) VALUES
(1, 'Test product');
CREATE TABLE `tracking` (
`id` int(10) UNSIGNED NOT NULL,
`product_id` int(11) NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `tracking` (`id`, `product_id`, `hash`, `created`) VALUES
(1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
(2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
CREATE TABLE `views` (
`id` int(10) UNSIGNED NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `views` (`id`, `hash`, `created`) VALUES
(1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
(4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
(5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
(6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
And here's the query I wrote so far:
SELECT products.name AS `param`,
SUM(IF(tracking.product_id<>24, 1, 0)) AS `uniques`,
IF(SUM(IF(tracking.product_id<>24, 1, 0))=0, 0,
(SELECT COUNT(`hash`)
FROM `views` WHERE tracking.hash = views.hash)) AS `views`
FROM tracking
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name
As you can see I have 2 unique views and 6 raw views (4 for one hash and 2 for the other hash).
I expect the query to return 2 uniques and 6 raw views for this product, but instead I'm getting 2 uniques and 4 raw views. It looks as if it's counting the views only for the first hash.

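Your subquery is correlated on tracking.hash, which is neither aggregated nor listed in the GROUP BY, so MySQL most likely evaluates it for only one hash per group (here, the one with 4 views). Joining "views" directly and counting the joined rows avoids that. The following query should solve your problem: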
SELECT
products.name,
COUNT(DISTINCT `tracking`.`hash`) AS `uniques`, -- count unique hashes
COUNT(*) AS `views` -- count total
FROM `tracking`
JOIN `views` ON `views`.hash = tracking.hash
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name;
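To check the join-based counting against the sample rows from the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL (the Python wrapper and the simplified column types are only for illustration, not part of the original setup).

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tracking (id INTEGER PRIMARY KEY, product_id INTEGER NOT NULL,
                       hash TEXT NOT NULL, created TEXT NOT NULL);
CREATE TABLE views (id INTEGER PRIMARY KEY, hash TEXT NOT NULL,
                    created TEXT NOT NULL);
INSERT INTO products VALUES (1, 'Test product');
INSERT INTO tracking VALUES
  (1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
  (2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
INSERT INTO views VALUES
  (1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
  (2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
  (3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
  (4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
  (5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
  (6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
""")

# Each tracking row is joined to all of its raw views, so COUNT(*) counts
# raw views and COUNT(DISTINCT hash) counts unique views.
rows = conn.execute("""
SELECT products.name,
       COUNT(DISTINCT tracking.hash) AS uniques,
       COUNT(*) AS views
FROM tracking
JOIN views ON views.hash = tracking.hash
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name
""").fetchall()

print(rows)  # [('Test product', 2, 6)]
```

Note that the inner join only works because every hash in "tracking" also has at least one row in "views", which is what the question describes.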

Related

Return all data from left table, even if where clause is not attended

I have a seller_commissions table, which is related to two other tables: products and sellers (users).
I need to build a panel where an admin can update seller commissions for each product.
Products will be created over time, and I don't want to insert rows into the seller_commissions table every time that happens, because I would need to do it many times. So my solution was:
get all product data for the user being updated. If seller_commissions is null for a specific product, it means the target seller's commission has never been updated. In other words, all sellers start with commission = 0.
I try the following queries:
-- This is close to the result I want, but unfortunately it returns all products for every seller (I want to specify the seller_id)
select fpp.name as product_name,
fsc.seller_id,
fsc.commission
from fp_products as fpp
left join fp_sellers_commissions as fsc
on fsc.product_id = fpp.id
left join fp_users as fpu
on fpu.id = fsc.seller_id;
-- If I use a 'where' clause, not all products are returned, because seller_id is NULL for some of them
select fpp.name as product_name,
fsc.seller_id,
fsc.commission
from fp_products as fpp
left join fp_sellers_commissions as fsc
on fsc.product_id = fpp.id
left join fp_users as fpu
on fpu.id = fsc.seller_id
where seller_id = 1;
result for first query:
result for second query:
expected results:
product_name | seller_id | commission
shirt        | 1         | 250
shoes        | null      | 0
black shirt  | null      | 0
The first query is close to what I want: it gets all products and their seller commissions. But I want this for a specific seller, and when I use a WHERE clause I don't get all products, because seller_id can be null. I tried some variations of these queries but can't get the expected result. I'd appreciate any help.
to build the schema, use:
-- Create schema
CREATE TABLE `fp_sellers_commissions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`commission` float NOT NULL DEFAULT '0',
`product_id` int(11) NOT NULL,
`seller_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `fp_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(64) CHARACTER SET latin1 NOT NULL,
`createdAt` datetime DEFAULT CURRENT_TIMESTAMP,
`disabled` tinyint(4) DEFAULT '0',
PRIMARY KEY (`id`)
);
CREATE TABLE `fp_users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(32) CHARACTER SET latin1 NOT NULL,
`surname` varchar(32) CHARACTER SET latin1 NOT NULL,
PRIMARY KEY (`id`)
);
-- Inserting data:
INSERT INTO `fp_products`
(`id`, `name`, `createdAt`, `disabled`)
VALUES
(1, 'shirt', '00:00:00', 0),
(2, 'shoes', '00:00:00', 0),
(3, 'black shirt', '00:00:00', 0);
INSERT INTO `fp_users`
(`id`,
`name`,
`surname`)
VALUES
(1, 'bilbo', 'aaa'),
(2, 'frodo', 'aaa');
INSERT INTO `fp_sellers_commissions`
(`id`, `commission`, `product_id`, `seller_id`)
VALUES
(1, 100, 1, 1),
(2, 500, 1, 2);
Or you can access the SQL Fiddle: http://sqlfiddle.com/#!9/d6559f/5
I'm not sure why the expected result should be with a commission of "250" for the seller "1", but I think I got what you are searching for. If you want to filter the seller's commission and still display the other products with nulls, you could put the filter condition directly on the left join, kinda like the following.
select fpp.name as product_name,
fsc.seller_id,
fsc.commission
from fp_products as fpp
left join fp_sellers_commissions as fsc
on fsc.product_id = fpp.id and fsc.seller_id = 1
left join fp_users as fpu
on fpu.id = fsc.seller_id;
What happens here, is that the filtering condition is applied at the moment you do the left join, so if it does not match, since it is a "left" join, the results will still be returned with nulls. If you put it in the "where" clause, it will be applied after the join is applied, and it will filter out the results that do not match.
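The difference between filtering in ON and filtering in WHERE can be checked with the sample data from the question. This is only a sketch: it assumes an in-memory SQLite database in place of MySQL, trims the tables to the columns that matter, leaves out the fp_users join, and adds IFNULL so that a missing commission shows as 0.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fp_products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE fp_sellers_commissions (
    id INTEGER PRIMARY KEY,
    commission REAL NOT NULL DEFAULT 0,
    product_id INTEGER NOT NULL,
    seller_id INTEGER NOT NULL);
INSERT INTO fp_products VALUES (1, 'shirt'), (2, 'shoes'), (3, 'black shirt');
INSERT INTO fp_sellers_commissions VALUES (1, 100, 1, 1), (2, 500, 1, 2);
""")

# Filter inside ON: products without a matching commission row are kept,
# with NULL seller_id and a commission defaulted to 0.
on_rows = conn.execute("""
SELECT fpp.name, fsc.seller_id, IFNULL(fsc.commission, 0)
FROM fp_products AS fpp
LEFT JOIN fp_sellers_commissions AS fsc
       ON fsc.product_id = fpp.id AND fsc.seller_id = 1
ORDER BY fpp.id
""").fetchall()

# Filter in WHERE: applied after the join, so the NULL rows are dropped.
where_rows = conn.execute("""
SELECT fpp.name, fsc.seller_id, IFNULL(fsc.commission, 0)
FROM fp_products AS fpp
LEFT JOIN fp_sellers_commissions AS fsc
       ON fsc.product_id = fpp.id
WHERE fsc.seller_id = 1
ORDER BY fpp.id
""").fetchall()

print(on_rows)     # three rows, one per product
print(where_rows)  # only the 'shirt' row
```

With the sample data, seller 1's commission for "shirt" comes out as 100, which is the value actually stored in fp_sellers_commissions.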
My suggestion is
select fpp.name as product_name,
fsc.seller_id,
SUM(ifnull(fsc.commission, 0)) as commission
from fp_products as fpp
left join fp_sellers_commissions as fsc
on fpp.id = fsc.product_id and fsc.seller_id = 1
group by fpp.name, fsc.seller_id
order by fsc.seller_id desc;
This should give you the result you need. Note: I added a GROUP BY that sums the commissions; if that is not what you want, just remove the GROUP BY and the SUM function.
Hope this helps.

Max from joined table based on value from first table

I have 2 tables.
The first holds job details; the second holds the history of job runs. The first also contains the job period per customer, which is the minimum time to wait before running the next job for the same customer. The time comparison needs to happen on the started_on field of the second table.
I need to find out the job ids to run next.
Schemas
job_details table
CREATE TABLE `job_details` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`customer_id` varchar(128) NOT NULL,
`period_in_minutes` int(11) unsigned NOT NULL,
`status` enum('ACTIVE','INACTIVE','DELETED') DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
job_run_history table
CREATE TABLE `job_run_history` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`job_id` int(10) unsigned NOT NULL,
`started_on` timestamp NULL DEFAULT NULL,
`status` enum('STREAMING','STREAMED','UPLOADING','UPLOADED','NO_RECORDS','FAILED') DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_job_id` (`job_id`),
CONSTRAINT `fk_job_id` FOREIGN KEY (`job_id`) REFERENCES `job_details` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Sample data for job_details table:
INSERT INTO `job_details` (`id`, `customer_id`, `period_in_minutes`, `status`)
VALUES
(1, 'cust1', 1, 'ACTIVE'),
(2, 'cust2', 1, 'ACTIVE'),
(3, 'cust3', 2, 'ACTIVE');
Sample data for job_run_history table:
INSERT INTO `job_run_history`(`job_id`, `started_on`, `status`)
VALUES
(1, '2021-07-01 14:38:00', 'UPLOADED'),
(2, '2021-07-01 14:37:55', 'UPLOADED');
Expected output (When run at 2021-07-01 14:38:56):
id
2,3
id => 1 did NOT get selected because the last job started within last 1 minute
id => 2 DID get selected because the last job started more than last 1 minute ago
id => 3 DID get selected because it has no run history
I have tried this, but it doesn't compare against the max of started_on, so it doesn't work:
select jd.id, max(jrh.started_on) from job_details jd
left join job_run_history jrh on jrh.job_id=jd.id
where
jd.status='ACTIVE'
and (jrh.status is null or jrh.status not in ('STREAMING','STREAMED','UPLOADING'))
and (jrh.`started_on` is null or jrh.`started_on` < date_sub(now(), interval jd.`period_in_minutes`*60 second))
group by jd.id;
MySql Version: 5.7.34
Any help please? Thanks in advance..
I'd prefer to use UNION ALL (it should be faster than one complex query):
-- the subquery for the rows which have matched ones in 2nd table
SELECT t1.id
FROM job_details t1
JOIN job_run_history t2 ON t1.id = t2.job_id
WHERE t1.status = 'ACTIVE'
AND t2.status not in ('STREAMING','STREAMED','UPLOADING')
AND CURRENT_TIMESTAMP - INTERVAL t1.period_in_minutes MINUTE > t2.started_on
UNION ALL
-- the subquery for the rows which have no matched ones in 2nd table
SELECT id
FROM job_details t1
WHERE NOT EXISTS ( SELECT NULL
FROM job_run_history t2
WHERE t1.id = t2.job_id )
AND status = 'ACTIVE';
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=8dcad95bf43ce711fdf40deda627e879
select jd.id from job_details jd
left join job_run_history jrh on jd.id= jrh.job_id
where jd.status = 'ACTIVE'
group by jd.id
having
max(jrh.started_on) < current_timestamp - interval max(jd.period_in_minutes) minute
or
max(jrh.id) is null
I'm not sure what this filter is for, since you didn't explain it in your question, so I left it out of the query: jrh.status not in ('STREAMING','STREAMED','UPLOADING'). However, I'm sure you can add it to the query I posted.
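The HAVING approach can be tried against the sample data with a fixed "current time", so that the result matches the expected output in the question. This is a rough sketch under a few assumptions: in-memory SQLite replaces MySQL, the interval arithmetic is written with SQLite's datetime() function, and one extra older run for job 1 is added (hypothetical, not in the question) to show that only the latest run per job matters.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 5.7 here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_details (
    id INTEGER PRIMARY KEY,
    customer_id TEXT NOT NULL,
    period_in_minutes INTEGER NOT NULL,
    status TEXT);
CREATE TABLE job_run_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id INTEGER NOT NULL REFERENCES job_details(id),
    started_on TEXT,
    status TEXT);
INSERT INTO job_details VALUES
  (1, 'cust1', 1, 'ACTIVE'),
  (2, 'cust2', 1, 'ACTIVE'),
  (3, 'cust3', 2, 'ACTIVE');
INSERT INTO job_run_history (job_id, started_on, status) VALUES
  (1, '2021-07-01 14:30:00', 'UPLOADED'),  -- hypothetical older run, not in the question
  (1, '2021-07-01 14:38:00', 'UPLOADED'),
  (2, '2021-07-01 14:37:55', 'UPLOADED');
""")

# Fixed "now" taken from the expected output in the question.
NOW = "2021-07-01 14:38:56"

# A job is due if it has no run history at all, or if its latest run
# started more than period_in_minutes before NOW.
due = [row[0] for row in conn.execute("""
SELECT jd.id
FROM job_details jd
LEFT JOIN job_run_history jrh ON jrh.job_id = jd.id
WHERE jd.status = 'ACTIVE'
GROUP BY jd.id
HAVING MAX(jrh.started_on) IS NULL
    OR MAX(jrh.started_on)
       < datetime(:now, '-' || MAX(jd.period_in_minutes) || ' minutes')
ORDER BY jd.id
""", {"now": NOW})]

print(due)  # [2, 3]
```

Because the comparison uses MAX(started_on), the older run of job 1 does not make it eligible; a plain per-row comparison against started_on would match that older row.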

MySQL - GROUP results BY a single column alongside ORDER BY and LIMIT

I want to select a list of tasks from my database. The tasks have a category_id. I want to get a single task per category_id. So if I had, for example, 10 tasks linked to 6 categories, that would give 6 results. The 6 results I want are determined by their id: the lowest id within each group is the correct record for that group. Also, the result set can be no larger than 20 rows ('LIMIT').
SELECT * FROM `task` WHERE `datetime`<NOW() `task_status_id`=1 GROUP BY `category_id` ORDER BY `id` ASC LIMIT 20
What is wrong with the above query? I have no clue, and I'm also at a loss to find anything about this on Google.
ADDED LATER
http://sqlfiddle.com/#!9/fa39cf
CREATE TABLE `category` (
`id` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `category` (`id`) VALUES
(1),
(2),
(3);
CREATE TABLE `task` (
`id` int(10) UNSIGNED NOT NULL,
`category_id` int(10) UNSIGNED NOT NULL,
`task_status_id` int(10) UNSIGNED DEFAULT '1',
`datetime` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `task` (`id`, `category_id`, `task_status_id`, `datetime`) VALUES
(3, 2, 1, '2018-07-24 11:20:26'),
(4, 2, 1, '2018-07-24 11:20:26'),
(5, 3, 1, '2018-07-24 11:21:35'),
(6, 3, 1, '2018-07-24 11:21:35');
You can try first finding the smallest id for each category and then joining it with the task table to get the remaining details.
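SELECT t.* FROM task t
JOIN (SELECT category_id, MIN(id) AS id FROM task
      WHERE `datetime` < NOW() AND task_status_id = 1 -- filters from the question
      GROUP BY category_id) tc
ON (t.id = tc.id)
ORDER BY t.id -- makes the LIMIT deterministic
LIMIT 20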
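Here is a small sketch of that idea run against the sample rows from the question. It assumes an in-memory SQLite database instead of MySQL, and it uses a fixed, hypothetical "current time" parameter in place of NOW() so the result is reproducible.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (
    id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL,
    task_status_id INTEGER DEFAULT 1,
    "datetime" TEXT NOT NULL);
INSERT INTO task VALUES
  (3, 2, 1, '2018-07-24 11:20:26'),
  (4, 2, 1, '2018-07-24 11:20:26'),
  (5, 3, 1, '2018-07-24 11:21:35'),
  (6, 3, 1, '2018-07-24 11:21:35');
""")

# Hypothetical fixed "now", used instead of NOW() to keep the example stable.
NOW = "2018-07-25 00:00:00"

# The derived table finds the lowest id per category; joining back to task
# returns the full row for each of those ids.
rows = conn.execute("""
SELECT t.id, t.category_id
FROM task t
JOIN (SELECT category_id, MIN(id) AS id
      FROM task
      WHERE task_status_id = 1 AND "datetime" < :now
      GROUP BY category_id) tc
  ON t.id = tc.id
ORDER BY t.id
LIMIT 20
""", {"now": NOW}).fetchall()

print(rows)  # [(3, 2), (5, 3)]
```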

SQL group by using distinct column

I am currently collecting lap times in a SQL database and am having some difficulty extracting the drivers with the fastest lap times.
The structure looks like the following!
CREATE TABLE IF NOT EXISTS `leaderboard` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`driver` varchar(50) NOT NULL,
`car` varchar(50) NOT NULL,
`best` double NOT NULL,
`guid` bigint(255) NOT NULL,
`server_name` varchar(255) NOT NULL,
`track` varchar(55) NOT NULL,
PRIMARY KEY (`id`),
KEY `driver` (`driver`),
KEY `server_name` (`server_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1213 ;
Data example
INSERT INTO `leaderboard` (`id`, `driver`, `car`, `best`, `guid`, `server_name`, `track`) VALUES
(1, 'dave.38', 'bmw_m3_e30', 88.379, 76561198084629688, 'A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D+-+%5BMagione%5D+%23SNR', 'magione'),
(2, 'Gabriel Porfírio', 'bmw_m3_e30', 87.318, 76561197987062834, 'A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D+-+%5BMagione%5D+%23SNR', 'magione'),
(3, 'xX_VEGA_Xx', 'bmw_m3_e30', 88.23, 76561198182074333, 'A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D+-+%5BMagione%5D+%23SNR', 'magione'),
(4, 'dave.38', 'bmw_m3_e30', 88.379, 76561198084629688, 'A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D+-+%5BMagione%5D+%23SNR', 'magione'),
(5, 'Gabriel Porfírio', 'bmw_m3_e30', 87.318, 76561197987062834, 'A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D+-+%5BMagione%5D+%23SNR', 'magione');
Now I am trying to pick out the drivers with the best times from the column best using the following SQL, but it appears that some times are discarded; the combination of GROUP BY and ORDER BY does not work.
SELECT DISTINCT guid, car, best, driver FROM `leaderboard` WHERE `server_name` like '%%' AND `track` = 'magione' GROUP BY(driver) ORDER BY `best` * 1 LIMIT 10
Please help this is driving me mad!
Some fields in your data are not very clear, so I made the following assumptions:
guid means driver's guid (because it is the same for the same driver in your data).
car is the same for the same driver.
With these assumptions you can use simple GROUP BY to get the results that you need:
SELECT driver, car, MIN(best) as best_time, guid
FROM leaderboard
WHERE `server_name` like '%%' AND `track` = 'magione'
GROUP BY driver, car, guid
ORDER BY MIN(best)
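To illustrate, here is a sketch that runs the same GROUP BY on the sample rows. The assumptions: in-memory SQLite replaces MySQL, the no-op server_name filter (like '%%') is left out, and one slower lap for dave.38 is added (hypothetical, not in the question) so that MIN() has something to choose between.

```python
import sqlite3

SERVER = ("A++%21+A++%21+------+Saturdaynightracing.tk+-+%5BRACE-SERVER%5D"
          "+-+%5BMagione%5D+%23SNR")

laps = [
    (1, "dave.38", "bmw_m3_e30", 88.379, 76561198084629688, SERVER, "magione"),
    (2, "Gabriel Porfírio", "bmw_m3_e30", 87.318, 76561197987062834, SERVER, "magione"),
    (3, "xX_VEGA_Xx", "bmw_m3_e30", 88.23, 76561198182074333, SERVER, "magione"),
    (4, "dave.38", "bmw_m3_e30", 88.379, 76561198084629688, SERVER, "magione"),
    (5, "Gabriel Porfírio", "bmw_m3_e30", 87.318, 76561197987062834, SERVER, "magione"),
    # Hypothetical extra lap, not in the question's data:
    (6, "dave.38", "bmw_m3_e30", 89.5, 76561198084629688, SERVER, "magione"),
]

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE leaderboard (
    id INTEGER PRIMARY KEY, driver TEXT NOT NULL, car TEXT NOT NULL,
    best REAL NOT NULL, guid INTEGER NOT NULL,
    server_name TEXT NOT NULL, track TEXT NOT NULL)
""")
conn.executemany("INSERT INTO leaderboard VALUES (?, ?, ?, ?, ?, ?, ?)", laps)

# One row per driver/car/guid, carrying that group's fastest lap.
rows = conn.execute("""
SELECT driver, car, MIN(best) AS best_time, guid
FROM leaderboard
WHERE track = 'magione'
GROUP BY driver, car, guid
ORDER BY best_time
LIMIT 10
""").fetchall()

for driver, car, best_time, guid in rows:
    print(driver, car, best_time, guid)
```

Every selected column is either in the GROUP BY or aggregated, so there is no ambiguity about which row's values are returned.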

MySQL Select Records from 2 not-related Tables Ordering by Timestamp

I'd like to collect data from 2 different mysql tables ordering the result by a timestamp but without merging the columns of the 2 tables in a single row.
T_ONE(one_id,one_someinfo,one_ts)
T_TWO(two_id,two_otherinfo,two_ts)
Notice that the field two_otherinfo is not the same as one_someinfo, the only columns in common are id and timestamp.
The result should be a mix of the two tables ordered by the timestamp but each row, depending on the timestamp, should contain only the respective columns of the table.
For example, if the newest record comes from T_TWO that row should have the T_ONE one_someinfo column empty.
I just need to order the latest news from T_ONE and the latest messages posted on T_TWO so the tables are not related. I'd like to avoid using 2 queries and then merging and ordering the results by timestamp with PHP. Does anyone know a solution to this? Thanks in advance
This is the structure of the table
CREATE TABLE `posts` (
`id` int(10) unsigned NOT NULL auto_increment,
`fromid` int(10) NOT NULL,
`toteam` int(10) NOT NULL,
`banned` tinyint(1) NOT NULL default '0',
`replyid` int(15) default NULL,
`cont` mediumtext NOT NULL,
`timestamp` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
CREATE TABLE `stars` (
`id` int(10) unsigned NOT NULL auto_increment,
`daynum` int(10) NOT NULL,
`userid` int(10) NOT NULL,
`vote` tinyint(2) NOT NULL default '3',
`timestamp` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#1', 1222222220);
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#2', 1222222221);
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#3', 1222222223);
INSERT INTO `stars` (`daynum`, `userid`, `vote`, `timestamp`) VALUES(3, 160, 4, 1222222222);
INSERT INTO `stars` (`daynum`, `userid`, `vote`, `timestamp`) VALUES(4, 180, 3, 1222222224);
The result ordering by timestamp DESC should be the second record of table stars with timestamp 1222222224 then the third record of table posts with timestamp 1222222223 and following... Since the tables have got different fields from each other, the first row of the result should contain the columns of the table stars while the columns of table posts should be empty.
Every SELECT in a UNION must return the same number of columns, in the same order and with compatible types. Declare the column aliases in the first SELECT, because the result takes its column names from there and ignores any aliases in the later SELECTs.
If you need the columns from the two subqueries to be different, put in NULL as placeholders. Here's an example, fetching the common columns id and timestamp, and then fetching one custom column from each of the subqueries.
(SELECT p.id, p.timestamp AS ts, p.fromid, NULL AS daynum FROM posts AS p)
UNION
(SELECT s.id, s.timestamp, NULL, s.daynum FROM stars AS s)
ORDER BY ts DESC
Also put the subqueries in parentheses, so the last ORDER BY applies to the whole result of the UNION, not just to the last subquery.
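The NULL-placeholder technique can be tried on the posts and stars rows from the question. This is a sketch with several assumptions: in-memory SQLite replaces MySQL, the tables are trimmed to a few columns, a literal "source" column is added (not in the original) to show which table each row came from, and the parentheses are dropped because SQLite does not accept them around the SELECTs of a compound query. In SQLite the trailing ORDER BY applies to the whole UNION.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    fromid INTEGER NOT NULL,
    cont TEXT NOT NULL,
    timestamp INTEGER NOT NULL);
CREATE TABLE stars (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    userid INTEGER NOT NULL,
    vote INTEGER NOT NULL,
    timestamp INTEGER NOT NULL);
INSERT INTO posts (fromid, cont, timestamp) VALUES
  (5, 'mess posted#1', 1222222220),
  (5, 'mess posted#2', 1222222221),
  (5, 'mess posted#3', 1222222223);
INSERT INTO stars (userid, vote, timestamp) VALUES
  (160, 4, 1222222222),
  (180, 3, 1222222224);
""")

# Column names come from the first SELECT; each side fills the other
# table's column with NULL. UNION ALL keeps every row without deduplication.
rows = conn.execute("""
SELECT 'post' AS source, p.id, p.timestamp AS ts, p.cont AS cont, NULL AS vote
FROM posts AS p
UNION ALL
SELECT 'star', s.id, s.timestamp, NULL, s.vote
FROM stars AS s
ORDER BY ts DESC
""").fetchall()

for row in rows:
    print(row)
# ('star', 2, 1222222224, None, 3)
# ('post', 3, 1222222223, 'mess posted#3', None)
# ('star', 1, 1222222222, None, 4)
# ('post', 2, 1222222221, 'mess posted#2', None)
# ('post', 1, 1222222220, 'mess posted#1', None)
```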
SELECT one_id AS id, one_someinfo AS someinfo, one_ts AS ts FROM T_ONE
UNION
SELECT two_id AS id, two_otherinfo AS someinfo, two_ts AS ts FROM T_TWO
ORDER BY ts
SELECT one_id AS id
, one_someinfo AS one_someinfo
, NULL AS two_otherinfo
, one_ts AS ts
FROM T_ONE
UNION ALL
SELECT two_id
, NULL
, two_otherinfo
, two_ts
FROM T_TWO
ORDER BY ts