MySQL - GROUP results BY a single column alongside ORDER BY and LIMIT

I want to select a list of tasks from my database. Each task has a category_id, and I want to get a single task per category_id. So if I had, for example, 10 tasks linked to 6 categories, that would give 6 results. The task returned for each category should be the one with the lowest id within that GROUP BY group. Also, the result set can be no larger than 20 rows ('LIMIT').
SELECT * FROM `task` WHERE `datetime`<NOW() `task_status_id`=1 GROUP BY `category_id` ORDER BY `id` ASC LIMIT 20
What is wrong with the above query? I have no clue, and I'm also at a loss finding any Google results for this.
ADDED LATER
http://sqlfiddle.com/#!9/fa39cf
CREATE TABLE `category` (
`id` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `category` (`id`) VALUES
(1),
(2),
(3);
CREATE TABLE `task` (
`id` int(10) UNSIGNED NOT NULL,
`category_id` int(10) UNSIGNED NOT NULL,
`task_status_id` int(10) UNSIGNED DEFAULT '1',
`datetime` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `task` (`id`, `category_id`, `task_status_id`, `datetime`) VALUES
(3, 2, 1, '2018-07-24 11:20:26'),
(4, 2, 1, '2018-07-24 11:20:26'),
(5, 3, 1, '2018-07-24 11:21:35'),
(6, 3, 1, '2018-07-24 11:21:35');

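You can first find the smallest id for each category and then join that back to the task table to get the remaining columns. Put the question's filters inside the subquery, so the minimum is taken only among tasks that qualify, and add an ORDER BY so that LIMIT picks a predictable set of rows.
SELECT t.*
FROM task t
JOIN (SELECT category_id, MIN(id) AS id
      FROM task
      WHERE `datetime` < NOW() AND task_status_id = 1
      GROUP BY category_id) tc ON t.id = tc.id
ORDER BY t.id
LIMIT 20;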
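To try the "lowest id per category" join without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. SQLite is only a stand-in here, and the fixed cutoff timestamp passed as a parameter is an assumption that replaces NOW().

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (
    id INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    task_status_id INTEGER DEFAULT 1,
    "datetime" TEXT NOT NULL
);
INSERT INTO task (id, category_id, task_status_id, "datetime") VALUES
    (3, 2, 1, '2018-07-24 11:20:26'),
    (4, 2, 1, '2018-07-24 11:20:26'),
    (5, 3, 1, '2018-07-24 11:21:35'),
    (6, 3, 1, '2018-07-24 11:21:35');
""")

# The subquery finds the lowest qualifying id per category;
# the join fetches the full row for each of those ids.
query = """
SELECT t.id, t.category_id
FROM task t
JOIN (SELECT category_id, MIN(id) AS id
      FROM task
      WHERE "datetime" < ? AND task_status_id = 1
      GROUP BY category_id) tc ON t.id = tc.id
ORDER BY t.id
LIMIT 20
"""
# Hypothetical cutoff used instead of NOW() so the result is reproducible.
lowest_per_category = conn.execute(query, ("2018-07-25 00:00:00",)).fetchall()
print(lowest_per_category)  # [(3, 2), (5, 3)]
```

Each category contributes exactly one row, the one with the smallest id, regardless of how the other columns differ.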

Related

Is it possible to COUNT if one value occured more than the other value in a column using SQL

I have this table called task_status which has the following structure:
CREATE TABLE `task_status` (
`task_status_id` int(11) NOT NULL,
`status_id` int(11) NOT NULL,
`task_id` int(11) NOT NULL,
`date_recorded` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
ALTER TABLE `task_status`
ADD PRIMARY KEY (`task_status_id`);
ALTER TABLE `task_status`
MODIFY `task_status_id` int(11) NOT NULL AUTO_INCREMENT;
COMMIT;
INSERT INTO `task_status` (`task_status_id`, `status_id`, `task_id`, `date_recorded`) VALUES
(1, 1, 16, 'Wednesday 6th of January 2021 09:20:35 AM'),
(2, 2, 17, 'Wednesday 6th of January 2021 09:20:35 AM'),
(3, 3, 18, 'Wednesday 6th of January 2021 09:20:36 AM');
and a status table that holds the possible statuses:
CREATE TABLE `status` (
`statuses_id` int(11) NOT NULL,
`status` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
ALTER TABLE `status`
ADD PRIMARY KEY (`statuses_id`);
ALTER TABLE `status`
MODIFY `statuses_id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=4;
COMMIT;
INSERT INTO `status` (`statuses_id`, `status`) VALUES
(1, 'Yes'),
(2, 'Inprogress'),
(3, 'No');
Now I want to check which value occurs most often in the status_id column: did 1, 2 or 3 occur more?
Is this possible using SQL, and if so, how?
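You can try the OVER and PARTITION BY clauses (MySQL 8.0 or later): you specify the column to partition the aggregated results by. Note that this returns one row per row of task_status, each carrying the count for its status_id, so you still have to pick the largest count yourself.
Example code: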
select status_id,count(*) over (partition by status_id) as Count_1 from task_status
You can count each status first and then filter on the maximum count.
There are many different ways to do this, but I prefer using a CTE.
Here is an example:
with cte as(
select status_id,count(*) cnt from task_status
group by status_id
)
select * from cte
where cnt = (select max(cnt) from cte)
There is also a db<>fiddle to examine it more closely.
I modified some data to make the output easier to understand, but the idea is the same.
I also don't think the status table is needed here, but tell me if I misunderstood what you mean.
If you want exactly one status that occurs more often than the others, then I would recommend group by with order by and limit:
select status_id, count(*) as cnt
from task_status
group by status_id
order by cnt desc
limit 1;
This always returns one row, so if there are ties for the most common, then you only get one of the ties.
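The difference between the CTE approach and the LIMIT 1 approach only shows up when there is a tie. Here is a small sketch using Python's sqlite3 module as a stand-in for MySQL; the five sample rows are hypothetical and were chosen so that statuses 1 and 3 tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_status ("
    "task_status_id INTEGER PRIMARY KEY, "
    "status_id INTEGER NOT NULL, "
    "task_id INTEGER NOT NULL)"
)
# Hypothetical data: status 1 and status 3 both occur twice, status 2 once.
conn.executemany(
    "INSERT INTO task_status (status_id, task_id) VALUES (?, ?)",
    [(1, 16), (1, 17), (2, 18), (3, 19), (3, 20)],
)

# CTE approach: keeps every status whose count equals the maximum count.
most_common = conn.execute("""
WITH cte AS (
    SELECT status_id, COUNT(*) AS cnt
    FROM task_status
    GROUP BY status_id
)
SELECT status_id, cnt
FROM cte
WHERE cnt = (SELECT MAX(cnt) FROM cte)
ORDER BY status_id
""").fetchall()

# ORDER BY ... LIMIT 1 approach: always exactly one row, even on a tie.
top_one = conn.execute("""
SELECT status_id, COUNT(*) AS cnt
FROM task_status
GROUP BY status_id
ORDER BY cnt DESC
LIMIT 1
""").fetchall()

print(most_common)  # [(1, 2), (3, 2)]
print(top_one)      # one of the tied rows; which one is not guaranteed
```

If you need a deterministic winner with LIMIT 1, add a tie-breaker such as `ORDER BY cnt DESC, status_id`.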

Simple SQL query but wrong results returned

I have a simple click tracking system that consists of three tables "tracking" (which holds unique views), "views" (which holds raw views) and "products" (which holds products).
Here's how it works: each time a user clicks on a tracking link, if the hash present in the link does not exist in the database, it is saved in the "tracking" table as a unique view and also in the "views" table as a raw view. If the hash present in the link does exist in the database, then it is saved only in the "views" table. So the number of "raw views" cannot be smaller than the number of "unique views", because each "unique view" also counts as a "raw view".
I wrote a query to create reports based on products, but the number of "raw views" returned is not correct.
I've also created a fiddle, which I hope gives a better overview of my problem.
Here's the table structure:
CREATE TABLE `products` (
`id` int(10) UNSIGNED NOT NULL,
`name` varchar(128) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `products` (`id`, `name`) VALUES
(1, 'Test product');
CREATE TABLE `tracking` (
`id` int(10) UNSIGNED NOT NULL,
`product_id` int(11) NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `tracking` (`id`, `product_id`, `hash`, `created`) VALUES
(1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
(2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
CREATE TABLE `views` (
`id` int(10) UNSIGNED NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `views` (`id`, `hash`, `created`) VALUES
(1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
(4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
(5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
(6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
And here's the query I wrote so far:
SELECT products.name AS `param`,
SUM(IF(tracking.product_id<>24, 1, 0)) AS `uniques`,
IF(SUM(IF(tracking.product_id<>24, 1, 0))=0, 0,
(SELECT COUNT(`hash`)
FROM `views` WHERE tracking.hash = views.hash)) AS `views`
FROM tracking
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name
As you can see I have 2 unique views and 6 raw views (4 for one hash and 2 for the other hash).
I expected the query to return 2 uniques and 6 raw views for this product, but instead I'm getting 2 uniques and 4 raw views, as if it's counting the views only for the first hash.
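In your query, tracking.hash inside the correlated subquery is not aggregated, so after the GROUP BY it refers to the hash of just one row of the group, and only that hash's views are counted. Joining views before grouping avoids this. The following query should give the expected result: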
SELECT
products.name,
COUNT(DISTINCT `tracking`.`hash`) AS `uniques`, -- count unique hashes
COUNT(*) AS `views` -- count total
FROM `tracking`
JOIN `views` ON `views`.hash = tracking.hash
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name;
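To check the join-then-aggregate query against the sample data from the question, here is a minimal sketch using Python's sqlite3 module (an assumption for convenience; the original runs on MySQL). The column types are simplified to what SQLite needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER NOT NULL, name TEXT NOT NULL);
CREATE TABLE tracking (id INTEGER NOT NULL, product_id INTEGER NOT NULL,
                       hash TEXT NOT NULL, created TEXT NOT NULL);
CREATE TABLE views (id INTEGER NOT NULL, hash TEXT NOT NULL, created TEXT NOT NULL);

INSERT INTO products (id, name) VALUES (1, 'Test product');
INSERT INTO tracking (id, product_id, hash, created) VALUES
    (1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
    (2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
INSERT INTO views (id, hash, created) VALUES
    (1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
    (2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
    (3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
    (4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
    (5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
    (6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
""")

# Joining views to tracking yields one row per raw view, so COUNT(*) counts
# raw views and COUNT(DISTINCT hash) counts unique views.
report = conn.execute("""
SELECT products.name,
       COUNT(DISTINCT tracking.hash) AS uniques,
       COUNT(*) AS views
FROM tracking
JOIN views ON views.hash = tracking.hash
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name
""").fetchall()
print(report)  # [('Test product', 2, 6)]
```

Because the join to views is an inner join, a tracking row without any matching raw view would drop out of the report. By the rules described in the question that should not happen, since every unique view is also stored as a raw view.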

Average values from different table on join

CREATE TABLE `reviews` (
`id` int(11) NOT NULL,
`average` decimal(11,2) NOT NULL,
`house_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `reviews` (`id`, `average`, `house_id`) VALUES
(1, '10.00', 1),
(2, '10.00', 1);
ALTER TABLE `reviews`
ADD PRIMARY KEY (`id`);
ALTER TABLE `reviews`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=3;
CREATE TABLE `dummy_reviews` (
`id` int(11) NOT NULL,
`average` decimal(11,2) NOT NULL,
`house_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `dummy_reviews` (`id`, `average`, `house_id`) VALUES
(0, '2.00', 1);
ALTER TABLE `dummy_reviews`
ADD PRIMARY KEY (`id`);
And the query:
SELECT
AVG(r.average) AS avg1,
AVG(dr.average) AS avg2
FROM
reviews r
LEFT JOIN
dummy_reviews dr ON r.house_id = dr.house_id
the result is
avg1 avg2
10.000000 2.000000
So far so good, but (10 + 2) / 2 = 6 is the wrong result.
I need (10 + 10 + 2) / 3 = 7.33. How can I get this result?
SQLFiddle
Because the tables are joined, you won't get 3 rows, you get 2 (each review paired with the matching dummy review). What you need is a UNION ALL, so that all rows from both tables end up in one set and the average is calculated over that. Like this:
select avg(average) from
(select average from reviews
union all
select average from dummy_reviews
) queries
See it here: http://sqlfiddle.com/#!9/e0b75f/3
Jorge's answer is the simplest approach (and I duly upvoted it). In response to your comment, you can do the following:
select ( (coalesce(r.suma, 0) + coalesce(d.suma, 0)) /
(coalesce(r.cnt, 0) + coalesce(d.cnt, 0))
) as overall_average
from (select sum(average) as suma, count(*) as cnt
from reviews
) r cross join
(select sum(average) as suma, count(*) as cnt
from dummy_reviews
) d;
Actually, I suggest this not only because of your comment. Under some circumstances, this could be the better performing code.
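Both answers can be checked side by side against the sample data. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; storing `average` as REAL instead of DECIMAL(11,2) is an assumption made for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (id INTEGER PRIMARY KEY, average REAL NOT NULL, house_id INTEGER NOT NULL);
CREATE TABLE dummy_reviews (id INTEGER PRIMARY KEY, average REAL NOT NULL, house_id INTEGER NOT NULL);
INSERT INTO reviews (id, average, house_id) VALUES (1, 10.0, 1), (2, 10.0, 1);
INSERT INTO dummy_reviews (id, average, house_id) VALUES (0, 2.0, 1);
""")

# Original join: two joined rows, so the two averages are computed separately.
joined = conn.execute("""
SELECT AVG(r.average), AVG(dr.average)
FROM reviews r
LEFT JOIN dummy_reviews dr ON r.house_id = dr.house_id
""").fetchone()

# UNION ALL: stack all three rows, then average once.
union_avg = conn.execute("""
SELECT AVG(average)
FROM (SELECT average FROM reviews
      UNION ALL
      SELECT average FROM dummy_reviews) AS q
""").fetchone()[0]

# CROSS JOIN of two one-row aggregates: combine the sums and the counts.
cross_avg = conn.execute("""
SELECT (COALESCE(r.suma, 0) + COALESCE(d.suma, 0)) /
       (COALESCE(r.cnt, 0) + COALESCE(d.cnt, 0))
FROM (SELECT SUM(average) AS suma, COUNT(*) AS cnt FROM reviews) r
CROSS JOIN (SELECT SUM(average) AS suma, COUNT(*) AS cnt FROM dummy_reviews) d
""").fetchone()[0]

print(joined)                # (10.0, 2.0)
print(round(union_avg, 2))   # 7.33
print(round(cross_avg, 2))   # 7.33
```

Use UNION ALL rather than UNION here: plain UNION removes duplicate rows, which would collapse the two identical 10.00 reviews into one and give (10 + 2) / 2 = 6 again.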

Mysql how to join tables more than two

I have a problem with my query.
I have the tables below:
CREATE TABLE IF NOT EXISTS `klik_zona` (
`kode_zona` int(10) unsigned NOT NULL,
`klik` int(10) unsigned NOT NULL,
PRIMARY KEY (`kode_zona`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `klik_zona` (`kode_zona`, `klik`) VALUES
(1, 45);
CREATE TABLE IF NOT EXISTS `tampil_zona` (
`kode_zona` int(10) unsigned NOT NULL,
`tanggal` date NOT NULL,
`tampil` int(10) unsigned NOT NULL,
PRIMARY KEY (`kode_zona`,`tanggal`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tampil_zona` (`kode_zona`, `tanggal`, `tampil`) VALUES
(1, '2014-03-16', 100),
(1, '2014-03-17', 23);
CREATE TABLE IF NOT EXISTS `zona_iklan` (
`kode_zona` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`kode_zona`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
INSERT INTO `zona_iklan` (`kode_zona`) VALUES
(1),
(2),
(3);
I have this query:
SELECT z.kode_zona, SUM( tz.tampil ) , SUM( kz.klik )
FROM zona_iklan z
LEFT JOIN tampil_zona tz ON tz.kode_zona = z.kode_zona
LEFT JOIN klik_zona kz ON kz.kode_zona = z.kode_zona
GROUP BY z.kode_zona
but it gives this result:
kode_zona SUM(tz.tampil) SUM(kz.klik)
1 123 90
2 NULL NULL
3 NULL NULL
I want to get this result:
kode_zona SUM(tz.tampil) SUM(kz.klik)
1 123 45
2 NULL NULL
3 NULL NULL
Please help me: how should I write the query so that I get the result I expect?
Thanks,
In your example you join two records from tampil_zona on to one record from zona_iklan, which essentially causes that one record to duplicate. Then you are joining one record in klik_zona on to both of those duplicated records, causing the doubling of results that you want to avoid.
Instead, you need to aggregate the records before you join them, to ensure that you are always joining the records 1-to-1.
SELECT
z.kode_zona, tz.tampil, kz.klik
FROM
zona_iklan AS z
LEFT JOIN
(SELECT kode_zona, SUM(tampil) AS tampil FROM tampil_zona GROUP BY kode_zona) AS tz
ON tz.kode_zona = z.kode_zona
LEFT JOIN
(SELECT kode_zona, SUM(klik) AS klik FROM klik_zona GROUP BY kode_zona) AS kz
ON kz.kode_zona = z.kode_zona
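The doubling and the fix can be reproduced with the sample data. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; the ORDER BY clauses are an addition to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zona_iklan (kode_zona INTEGER PRIMARY KEY);
CREATE TABLE tampil_zona (kode_zona INTEGER NOT NULL, tanggal TEXT NOT NULL,
                          tampil INTEGER NOT NULL,
                          PRIMARY KEY (kode_zona, tanggal));
CREATE TABLE klik_zona (kode_zona INTEGER PRIMARY KEY, klik INTEGER NOT NULL);

INSERT INTO zona_iklan (kode_zona) VALUES (1), (2), (3);
INSERT INTO tampil_zona (kode_zona, tanggal, tampil) VALUES
    (1, '2014-03-16', 100), (1, '2014-03-17', 23);
INSERT INTO klik_zona (kode_zona, klik) VALUES (1, 45);
""")

# Join first, aggregate later: the single klik row is repeated for each
# tampil row, so klik is summed twice (45 + 45).
naive = conn.execute("""
SELECT z.kode_zona, SUM(tz.tampil), SUM(kz.klik)
FROM zona_iklan z
LEFT JOIN tampil_zona tz ON tz.kode_zona = z.kode_zona
LEFT JOIN klik_zona kz ON kz.kode_zona = z.kode_zona
GROUP BY z.kode_zona
ORDER BY z.kode_zona
""").fetchall()

# Aggregate first, join later: each subquery has at most one row per zone,
# so the joins are 1-to-1 and nothing is duplicated.
fixed = conn.execute("""
SELECT z.kode_zona, tz.tampil, kz.klik
FROM zona_iklan AS z
LEFT JOIN (SELECT kode_zona, SUM(tampil) AS tampil
           FROM tampil_zona GROUP BY kode_zona) AS tz
       ON tz.kode_zona = z.kode_zona
LEFT JOIN (SELECT kode_zona, SUM(klik) AS klik
           FROM klik_zona GROUP BY kode_zona) AS kz
       ON kz.kode_zona = z.kode_zona
ORDER BY z.kode_zona
""").fetchall()

print(naive)  # [(1, 123, 90), (2, None, None), (3, None, None)]
print(fixed)  # [(1, 123, 45), (2, None, None), (3, None, None)]
```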
Try removing the GROUP BY (and the SUMs) and look at the result. You will see that there are two records with kode_zona = 1. This is because there are two records in tampil_zona matching that id. You could divide by count(*), but that seems futile. You probably want to think about how to modify the join.

MySQL Select Records from 2 not-related Tables Ordering by Timestamp

I'd like to collect data from 2 different mysql tables ordering the result by a timestamp but without merging the columns of the 2 tables in a single row.
T_ONE(one_id,one_someinfo,one_ts)
T_TWO(two_id,two_otherinfo,two_ts)
Notice that the field two_otherinfo is not the same as one_someinfo, the only columns in common are id and timestamp.
The result should be a mix of the two tables ordered by the timestamp but each row, depending on the timestamp, should contain only the respective columns of the table.
For example, if the newest record comes from T_TWO that row should have the T_ONE one_someinfo column empty.
I just need to order the latest news from T_ONE and the latest messages posted on T_TWO so the tables are not related. I'd like to avoid using 2 queries and then merging and ordering the results by timestamp with PHP. Does anyone know a solution to this? Thanks in advance
This is the structure of the tables:
CREATE TABLE `posts` (
`id` int(10) unsigned NOT NULL auto_increment,
`fromid` int(10) NOT NULL,
`toteam` int(10) NOT NULL,
`banned` tinyint(1) NOT NULL default '0',
`replyid` int(15) default NULL,
`cont` mediumtext NOT NULL,
`timestamp` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
CREATE TABLE `stars` (
`id` int(10) unsigned NOT NULL auto_increment,
`daynum` int(10) NOT NULL,
`userid` int(10) NOT NULL,
`vote` tinyint(2) NOT NULL default '3',
`timestamp` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#1', 1222222220);
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#2', 1222222221);
INSERT INTO `posts` (`fromid`, `toteam`, `banned`, `replyid`, `cont`, `timestamp`) VALUES(5, 12, 0, 0, 'mess posted#3', 1222222223);
INSERT INTO `stars` (`daynum`, `userid`, `vote`, `timestamp`) VALUES(3, 160, 4, 1222222222);
INSERT INTO `stars` (`daynum`, `userid`, `vote`, `timestamp`) VALUES(4, 180, 3, 1222222224);
The result ordering by timestamp DESC should be the second record of table stars with timestamp 1222222224 then the third record of table posts with timestamp 1222222223 and following... Since the tables have got different fields from each other, the first row of the result should contain the columns of the table stars while the columns of table posts should be empty.
Every SELECT in a UNION must return the same number of columns, in the same order, with compatible data types. Column names are taken from the first SELECT, so declare your column aliases there; any attempt to rename a column in a later subquery is ignored.
If you need the columns from the two subqueries to be different, put in NULL as placeholders. Here's an example, fetching the common columns id and timestamp, and then fetching one custom column from each of the subqueries.
(SELECT p.id, p.timestamp AS ts, p.fromid, NULL AS daynum FROM posts p)
UNION
(SELECT s.id, s.timestamp, NULL, s.daynum FROM stars s)
ORDER BY ts DESC
Also put the subqueries in parentheses, so the last ORDER BY applies to the whole result of the UNION, not just to the last subquery.
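Here is a minimal sketch of the NULL-placeholder approach, run against the sample posts and stars rows with Python's sqlite3 module as a stand-in for MySQL. Two things are assumptions of this sketch: the extra `source` column, added only to show which table each row came from, and the trimmed column lists. SQLite also does not accept parentheses around the individual SELECTs of a UNION, so they are omitted here; in SQLite the final ORDER BY applies to the whole compound query anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, fromid INTEGER NOT NULL,
                    cont TEXT NOT NULL, "timestamp" INTEGER NOT NULL);
CREATE TABLE stars (id INTEGER PRIMARY KEY, daynum INTEGER NOT NULL,
                    userid INTEGER NOT NULL, "timestamp" INTEGER NOT NULL);

INSERT INTO posts (fromid, cont, "timestamp") VALUES
    (5, 'mess posted#1', 1222222220),
    (5, 'mess posted#2', 1222222221),
    (5, 'mess posted#3', 1222222223);
INSERT INTO stars (daynum, userid, "timestamp") VALUES
    (3, 160, 1222222222),
    (4, 180, 1222222224);
""")

# Each SELECT supplies NULL for the column that only exists in the other table.
# Aliases are declared in the first SELECT only.
feed = conn.execute("""
SELECT 'posts' AS source, p.id, p."timestamp" AS ts, p.fromid, NULL AS daynum
FROM posts p
UNION ALL
SELECT 'stars', s.id, s."timestamp", NULL, s.daynum
FROM stars s
ORDER BY ts DESC
""").fetchall()

for row in feed:
    print(row)
# ('stars', 2, 1222222224, None, 4)
# ('posts', 3, 1222222223, 5, None)
# ('stars', 1, 1222222222, None, 3)
# ('posts', 2, 1222222221, 5, None)
# ('posts', 1, 1222222220, 5, None)
```

The newest row comes from stars, so its posts-only column (fromid) is NULL, which matches the result described in the question.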
SELECT one_id AS id, one_someinfo AS someinfo, one_ts AS ts
FROM T_ONE
UNION
SELECT two_id AS id, two_otherinfo AS someinfo, two_ts AS ts
FROM T_TWO
ORDER BY ts
SELECT one_id AS id
, one_someinfo AS one_someinfo
, NULL AS two_otherinfo
, one_ts AS ts
FROM t_ONE
UNION ALL
SELECT two_id
, NULL
, two_otherinfo
, two_ts
FROM t_TWO
ORDER BY ts