MySQL `HAVING` issue

I have two queries which are returning different results when I would expect them to return the same results.
The first query returns the correct result.
The second returns a result, but it is incorrect.
Why is this and how can I fix the second statement so that it returns the same result? I have to use the HAVING clause in this statement.
1.
SELECT
CAST(CONCAT(DATE(`mytable`.`starttime`),' ',HOUR(`mytable`.`starttime`),':',LPAD(60*(MINUTE(`mytable`.`starttime`) DIV 60),2,'0'),':00') AS DATETIME) AS `date`,
`mytable`.`id`
FROM
`mytable`
WHERE
`mytable`.`starttime`>='2011-07-01 00:00:00'
AND `mytable`.`starttime`<='2011-07-01 23:59:59'
AND `id` BETWEEN 1 AND 100
GROUP BY
`mytable`.`id`
2.
SELECT
CAST(CONCAT(DATE(`mytable`.`starttime`),' ',HOUR(`mytable`.`starttime`),':',LPAD(60*(MINUTE(`mytable`.`starttime`) DIV 60),2,'0'),':00') AS DATETIME) AS `date`,
`mytable`.`id`
FROM
`mytable`
WHERE
`id` BETWEEN 1 AND 100
GROUP BY
`mytable`.`id`
HAVING `date` IN ('2011-07-01 00:00:00', '2011-07-01 01:00:00', '2011-07-01 02:00:00', '2011-07-01 03:00:00', '2011-07-01 04:00:00', '2011-07-01 05:00:00', '2011-07-01 06:00:00', '2011-07-01 07:00:00', '2011-07-01 08:00:00', '2011-07-01 09:00:00', '2011-07-01 10:00:00', '2011-07-01 11:00:00', '2011-07-01 12:00:00', '2011-07-01 13:00:00', '2011-07-01 14:00:00', '2011-07-01 15:00:00', '2011-07-01 16:00:00', '2011-07-01 17:00:00', '2011-07-01 18:00:00', '2011-07-01 19:00:00', '2011-07-01 20:00:00', '2011-07-01 21:00:00', '2011-07-01 22:00:00', '2011-07-01 23:00:00')
Thanks in advance for any help you can offer.

The WHERE clause is applied before grouping, while HAVING is applied after it. Your second query has no WHERE condition on `starttime`, so for each `id` group MySQL returns a single arbitrary (indeterminate) row and only then applies the HAVING clause to that row's `date` value.
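To make the ordering concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The four rows are invented for illustration. It compares the WHERE filter with one possible HAVING-only rewrite that tests an aggregate over the whole group instead of a non-aggregated column; whether that rewrite fits the real table is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, starttime TEXT)")
# Hypothetical sample rows, not taken from the original question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "2011-06-30 10:00:00"),
        (1, "2011-07-01 09:15:00"),
        (2, "2011-06-29 08:00:00"),
        (3, "2011-07-01 13:40:00"),
    ],
)

in_range = (
    "starttime >= '2011-07-01 00:00:00' "
    "AND starttime <= '2011-07-01 23:59:59'"
)

# WHERE discards non-matching rows before the groups are formed.
where_ids = [r[0] for r in conn.execute(
    f"SELECT id FROM mytable WHERE {in_range} GROUP BY id ORDER BY id")]

# HAVING runs after grouping, so it should test an aggregate over the group.
# The comparison yields 1 or 0, so SUM counts the rows that fall in range.
having_ids = [r[0] for r in conn.execute(
    f"SELECT id FROM mytable GROUP BY id "
    f"HAVING SUM({in_range}) > 0 ORDER BY id")]

print(where_ids, having_ids)
```

Both queries select the same ids here because the HAVING condition no longer depends on which row of the group the engine happens to pick.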

GROUP BY is normally used with aggregate functions such as SUM, MIN, MAX and COUNT. Your query doesn't appear to have an aggregate in the SELECT list, so GROUP BY only collapses each id to a single, arbitrarily chosen row.
While your query may be valid SQL, it doesn't make much sense. I'm not sure if this is why your HAVING clause is misbehaving, though.

I assume "id" is a primary key on your table, is that right? I'm guessing that you want to show the times during the day at which various events occurred but I'm not sure why you would group by id then. Could you give an example of how you would like the output to look?

Related

Find users with activities in all the last 6 months

I'm looking for the best way to retrieve the list of user IDs with activities in each of the last 6 months.
Table structure and data, simplified, is the following:
CREATE TABLE activities (
id int,
client_id int,
created_at timestamp
);
insert into activities values
(1, 1, '2019-06-01 00:00:00'),
(2, 2, '2019-06-01 00:00:00'),
(3, 1, '2019-07-01 00:00:00'),
(4, 1, '2019-08-01 00:00:00'),
(5, 1, '2019-09-01 00:00:00'),
(6, 1, '2019-10-01 00:00:00'),
(7, 1, '2019-11-01 00:00:00'),
(8, 2, '2019-11-01 00:00:00'),
(9, 3, '2019-11-01 00:00:00');
I need to retrieve the list of users that have at least one activity in each of the last 6 months. In the example above, that is just client_id 1.
I thought of doing a join, but it seems too expensive. I won't suggest any possible solutions, so as not to steer the answers, and I'm open to whatever you have in mind.
Please consider that I have to manage a really big data source (more than 50 million rows).
Any quick idea?
I make no claims for the supremacy of this solution, partly because I find such requests disingenuous, but it should work, at least...
SELECT a.client_id
FROM activities a
WHERE a.created_at >= LAST_DAY(CURDATE() - INTERVAL 7 MONTH)+INTERVAL 1 DAY
GROUP
BY a.client_id
HAVING COUNT(DISTINCT(DATE_FORMAT(a.created_at,'%Y-%m'))) >= 6;
+-----------+
| client_id |
+-----------+
|         1 |
+-----------+
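The same "count the distinct months per client" idea can be checked with a small sketch that uses Python's built-in sqlite3 as a stand-in for MySQL. The fixed `today` value, the `window_start` helper and the use of SQLite's `strftime` in place of `DATE_FORMAT` are assumptions made for illustration. Unlike the query above, the window here starts exactly on the first day of the month five months before the current one.

```python
import sqlite3
from datetime import date


def window_start(today, months=6):
    """First day of the month that opens a window of `months` calendar months."""
    index = today.year * 12 + (today.month - 1) - (months - 1)
    year, month0 = divmod(index, 12)
    return date(year, month0 + 1, 1)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activities (id INT, client_id INT, created_at TEXT)")
conn.executemany(
    "INSERT INTO activities VALUES (?, ?, ?)",
    [
        (1, 1, "2019-06-01 00:00:00"),
        (2, 2, "2019-06-01 00:00:00"),
        (3, 1, "2019-07-01 00:00:00"),
        (4, 1, "2019-08-01 00:00:00"),
        (5, 1, "2019-09-01 00:00:00"),
        (6, 1, "2019-10-01 00:00:00"),
        (7, 1, "2019-11-01 00:00:00"),
        (8, 2, "2019-11-01 00:00:00"),
        (9, 3, "2019-11-01 00:00:00"),
    ],
)

today = date(2019, 11, 15)  # assumed "current date" so the run is repeatable
start = window_start(today)

# Keep only rows inside the window, then require six distinct year-months.
client_ids = [r[0] for r in conn.execute(
    "SELECT client_id FROM activities "
    "WHERE created_at >= ? "
    "GROUP BY client_id "
    "HAVING COUNT(DISTINCT strftime('%Y-%m', created_at)) >= 6 "
    "ORDER BY client_id",
    (start.isoformat(),),
)]
print(start, client_ids)
```

On a table of 50 million rows, an index covering `created_at` and `client_id` would likely matter more than the exact form of the HAVING expression, though that would need measuring on the real data.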

Inserting a date formatted with date_format(str_to_date())

I'm putting all the Sundays of 2018 into a table. I have the list and need to reformat it to fit into a datetime column.
This query generates a result
select date_format(str_to_date("February 11 2018","%M %d %Y"),"%Y-%m-%d %I:%i");
so, I thought I could just put that in an insert
INSERT INTO `events` (`id`, `title`, `color`, `start`, `end`)
VALUES (NULL, 'Sunday', '#FFE761',
(select date_format(str_to_date("January 14 2018","%M %d %Y"),"%Y-%m-%d %I:%i")),
'0000-00-00 00:00:00';
But, I'm getting errors regardless of how I try to stick the date in. I've tried removing 'select', the parens, the %I:%i.. I'm stuck.
Can you help?
INSERT INTO `events` (`id`, `title`, `color`, `start`, `end`)
VALUES (NULL, 'Sunday', '#FFE761',
date_format(str_to_date("January 14 2018","%M %d %Y"),"%Y-%m-%d %I:%i"),
'0000-00-00 00:00:00');
The subquery wrapper around date_format is not needed, and your VALUES list was missing its closing bracket.
INSERT INTO `events` (`id`, `title`, `color`, `start`, `end`)
VALUES (NULL, 'Sunday', '#FFE761',
(select date_format(str_to_date("January 14 2018","%M %d %Y"),"%Y-%m-%d %I:%i")), '0000-00-00 00:00:00')
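One detail in the format string is worth checking separately. `%I` is the 12-hour clock hour, so midnight is rendered as 12 rather than 00, and the resulting DATETIME may land at noon. Python's `strptime` and `strftime` share the `%I` and `%H` directives, so the sketch below uses them as a stand-in to show the difference. The MySQL counterpart would presumably be `"%Y-%m-%d %H:%i:%s"`, which should be verified against your server.

```python
from datetime import datetime

# Parse the same text the question feeds to str_to_date.
parsed = datetime.strptime("January 14 2018", "%B %d %Y")

# 12-hour clock: midnight is written as 12, which reads as noon without AM/PM.
twelve_hour = parsed.strftime("%Y-%m-%d %I:%M")

# 24-hour clock: midnight stays 00:00:00, matching a DATETIME literal.
twenty_four_hour = parsed.strftime("%Y-%m-%d %H:%M:%S")

print(twelve_hour)
print(twenty_four_hour)
```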

Count number of rows in each day grouped by another field

For the purpose of drawing an activity chart, how can we count the number of rows for each type (distinct field value) on each day?
Consider a table with a date field and a field for each type:
CREATE TABLE TableName
(`PK` int, `type` varchar(1), `timestamp` datetime)
;
INSERT INTO TableName
(`PK`, `type`, `timestamp`)
VALUES
(11, 'Q', '2013-01-04 22:23:56'),
(7, 'A', '2013-01-03 22:23:41'),
(8, 'C', '2013-01-04 22:23:42'),
(10, 'Q', '2013-01-05 22:23:56'),
(5, 'C', '2013-01-03 22:23:25'),
(12, 'Q', '2013-01-05 22:23:57'),
(6, 'Q', '2013-01-07 22:23:40'),
(4, 'Q', '2013-01-02 22:23:23'),
(9, 'A', '2013-01-05 22:23:55'),
(1, 'A', '2013-01-08 21:29:38'),
(2, 'Q', '2013-01-02 21:31:59'),
(3, 'C', '2013-01-04 21:32:22')
;
For example output can be (last field is the count of rows with that type and in that day):
'Q', 2013-01-04, 1
'C', 2013-01-04, 2
'A', 2013-01-03, 1
'C', 2013-01-03, 1
and so on...
You just need a group by.
select `type`, date(`timestamp`), count(*)
from TableName
group by `type`, date(`timestamp`)
select `type`, date(`timestamp`) as the_date, count(*) as counter
from TableName
group by `type`, date(`timestamp`)
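As a quick check of the grouping, the sketch below runs the same query shape against the sample rows using Python's built-in sqlite3 as a stand-in for MySQL. SQLite's `date()` plays the role of MySQL's `DATE()` here, and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName (`PK` INT, `type` TEXT, `timestamp` TEXT)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [
        (11, "Q", "2013-01-04 22:23:56"),
        (7, "A", "2013-01-03 22:23:41"),
        (8, "C", "2013-01-04 22:23:42"),
        (10, "Q", "2013-01-05 22:23:56"),
        (5, "C", "2013-01-03 22:23:25"),
        (12, "Q", "2013-01-05 22:23:57"),
        (6, "Q", "2013-01-07 22:23:40"),
        (4, "Q", "2013-01-02 22:23:23"),
        (9, "A", "2013-01-05 22:23:55"),
        (1, "A", "2013-01-08 21:29:38"),
        (2, "Q", "2013-01-02 21:31:59"),
        (3, "C", "2013-01-04 21:32:22"),
    ],
)

# One output row per (type, calendar day) pair, with the number of rows in it.
counts = conn.execute(
    "SELECT `type`, date(`timestamp`) AS the_date, COUNT(*) AS counter "
    "FROM TableName "
    "GROUP BY `type`, date(`timestamp`) "
    "ORDER BY the_date, `type`"
).fetchall()

for row in counts:
    print(row)
```

Days with no rows for a given type do not appear at all, so a chart that needs explicit zeroes would have to fill them in separately.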

SQL getting shifts outside of availability

I'm trying to put together an SQL query to get employee shifts that fall outside of their availability for a scheduling app. Each availability entry is one contiguous block, and the same employee will never have availability entries that are back-to-back or that overlap.
Basically, I need to get the shift rows where (availabilities.start <= shifts.start AND availabilities.end >= shifts.end) does NOT hold true. Phrased another way, I need to get the rows from the shifts table that are not fully contained by an availability entry.
It needs to account for these possibilities:
Shifts that start before availability
Shifts that end after availability
Shifts that do not have any availability during the shift
I'm ok with using a stored procedure instead of a query if this would be more efficient.
Here's what the tables look like:
CREATE TABLE availabilities (`id` int primary key auto_increment, `employee_id` int, `start` datetime, `end` datetime);
CREATE TABLE shifts (`id` int primary key auto_increment, `employee_id` int, `start` datetime, `end` datetime);
Here is some sample data:
INSERT INTO availabilities
(`employee_id`, `start`, `end`)
VALUES
(1, '2015-01-01 08:00:00', '2015-01-01 09:00:00'),
(1, '2015-01-02 08:00:00', '2015-01-02 10:00:00'),
(2, '2015-01-03 08:00:00', '2015-01-03 14:00:00'),
(2, '2015-01-04 08:00:00', '2015-01-04 18:00:00')
;
INSERT INTO shifts
(`employee_id`, `start`, `end`)
VALUES
(1, '2015-01-01 08:00:00', '2015-01-01 09:00:00'),
(1, '2015-01-02 08:30:00', '2015-01-02 10:00:00'),
(1, '2015-01-02 10:30:00', '2015-01-02 12:00:00'),
(2, '2015-01-03 08:00:00', '2015-01-03 09:00:00'),
(2, '2015-01-03 09:00:00', '2015-01-03 14:30:00'),
(2, '2015-01-04 09:30:00', '2015-01-04 17:30:00'),
(2, '2015-01-05 08:00:00', '2015-01-05 10:00:00')
;
I would expect the 3rd, 5th and 7th shifts to be output as they are outside of availability.
I've tried something like the following (as well as many others); however, all of them either give false positives or leave out shifts.
SELECT s.* FROM `shifts` AS `s`
LEFT JOIN `availabilities` AS `a` ON `s`.`employee_id` = `a`.`employee_id`
WHERE (NOT(a.start <= s.start AND a.end >= s.end) OR a.id IS NULL);
Does this help?
select *
from shifts as s
where not exists (
select 1
from availabilities as a
where a.start <= s.start AND a.end >= s.end and a.employee_id = s.employee_id
)
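The difference between the two approaches can be seen by running both against the sample data. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL, and it assumes the ids are assigned automatically in insertion order. The LEFT JOIN pairs each shift with every availability of the same employee, so a shift is reported as soon as any one availability fails to contain it. NOT EXISTS reports a shift only when no availability contains it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE availabilities (id INTEGER PRIMARY KEY, employee_id INT, `start` TEXT, `end` TEXT);
CREATE TABLE shifts (id INTEGER PRIMARY KEY, employee_id INT, `start` TEXT, `end` TEXT);
INSERT INTO availabilities (employee_id, `start`, `end`) VALUES
  (1, '2015-01-01 08:00:00', '2015-01-01 09:00:00'),
  (1, '2015-01-02 08:00:00', '2015-01-02 10:00:00'),
  (2, '2015-01-03 08:00:00', '2015-01-03 14:00:00'),
  (2, '2015-01-04 08:00:00', '2015-01-04 18:00:00');
INSERT INTO shifts (employee_id, `start`, `end`) VALUES
  (1, '2015-01-01 08:00:00', '2015-01-01 09:00:00'),
  (1, '2015-01-02 08:30:00', '2015-01-02 10:00:00'),
  (1, '2015-01-02 10:30:00', '2015-01-02 12:00:00'),
  (2, '2015-01-03 08:00:00', '2015-01-03 09:00:00'),
  (2, '2015-01-03 09:00:00', '2015-01-03 14:30:00'),
  (2, '2015-01-04 09:30:00', '2015-01-04 17:30:00'),
  (2, '2015-01-05 08:00:00', '2015-01-05 10:00:00');
""")

# The attempt from the question: one output row per non-containing pairing.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT s.id FROM shifts AS s
    LEFT JOIN availabilities AS a ON s.employee_id = a.employee_id
    WHERE (NOT (a.`start` <= s.`start` AND a.`end` >= s.`end`) OR a.id IS NULL)
    ORDER BY s.id
""")]

# The anti-join: keep a shift only if no availability fully contains it.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT s.id FROM shifts AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM availabilities AS a
        WHERE a.employee_id = s.employee_id
          AND a.`start` <= s.`start` AND a.`end` >= s.`end`
    )
    ORDER BY s.id
""")]

print(sorted(set(left_join_ids)))
print(not_exists_ids)
```

With this data the LEFT JOIN version flags every shift, because each employee has two availability entries and at most one of them can contain any given shift.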

MySQL query - average amount of lines per order, on a monthly basis

I have a table which shows previous orders.
Each item bought is added as a separate row in the table; see the dump below.
My aim is to show the average number of lines per order on a monthly basis.
To get the average number of lines, I need to divide the number of items bought by the number of orders placed.
My query currently gives me the monthly totals, and line_count returns the correct number of items bought, but I can't seem to return the number of orders placed (which for the dump below should be 13). I have tried adding various subqueries, but I'm not sure how to go about this. Any ideas?
SELECT
date,
COUNT(orderno) AS line_count
FROM `orders`
WHERE
date BETWEEN '2010-01-21' AND CURDATE()
GROUP BY month(date), year(date)
ORDER BY date
Here is the table schema (simplified for clarity)
CREATE TABLE IF NOT EXISTS `orders` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`orderno` varchar(15) COLLATE utf8_unicode_ci NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=22904 ;
--
-- Dumping data for table `orders`
--
INSERT INTO `orders` (`id`, `orderno`, `date`) VALUES
(1, 'rad10000', '2010-01-21'),
(2, 'rad10000', '2010-01-21'),
(3, 'rad10001', '2010-01-21'),
(4, 'rad10001', '2010-01-21'),
(5, 'rad10002', '2010-01-21'),
(6, 'rad10003', '2010-01-21'),
(8, 'rad10003', '2010-01-21'),
(9, 'rad10003', '2010-01-21'),
(10, 'rad10004', '2010-01-22'),
(11, 'rad10004', '2010-01-22'),
(12, 'rad10005', '2010-01-22'),
(13, 'rad10005', '2010-01-22'),
(14, 'rad10006', '2010-01-22'),
(15, 'rad10007', '2010-01-22'),
(16, 'rad10008', '2010-01-22'),
(17, 'rad10009', '2010-01-22'),
(18, 'rad10010', '2010-01-22'),
(19, 'rad10011', '2010-01-22'),
(20, 'rad10012', '2010-01-22');
Oh, I see...
SELECT YEAR(date)
, MONTH(date)
, COUNT(*) line_count
, COUNT(DISTINCT orderno) orders_placed
FROM orders
WHERE date BETWEEN '2010-01-21' AND CURDATE()
GROUP
BY YEAR(date)
, MONTH(date)
ORDER
BY YEAR(date)
, MONTH(date);
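Dividing the two counts gives the average the question asks for. The sketch below checks this against the dump using Python's built-in sqlite3 as a stand-in for MySQL. SQLite's `strftime` replaces `YEAR()` and `MONTH()`, `date('now')` replaces `CURDATE()`, and the `avg_lines` column name is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, orderno TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "rad10000", "2010-01-21"), (2, "rad10000", "2010-01-21"),
        (3, "rad10001", "2010-01-21"), (4, "rad10001", "2010-01-21"),
        (5, "rad10002", "2010-01-21"), (6, "rad10003", "2010-01-21"),
        (8, "rad10003", "2010-01-21"), (9, "rad10003", "2010-01-21"),
        (10, "rad10004", "2010-01-22"), (11, "rad10004", "2010-01-22"),
        (12, "rad10005", "2010-01-22"), (13, "rad10005", "2010-01-22"),
        (14, "rad10006", "2010-01-22"), (15, "rad10007", "2010-01-22"),
        (16, "rad10008", "2010-01-22"), (17, "rad10009", "2010-01-22"),
        (18, "rad10010", "2010-01-22"), (19, "rad10011", "2010-01-22"),
        (20, "rad10012", "2010-01-22"),
    ],
)

# Lines per month, distinct orders per month, and their ratio.
# Multiplying by 1.0 forces floating-point division in SQLite.
monthly = conn.execute("""
    SELECT strftime('%Y', date) AS yr,
           strftime('%m', date) AS mon,
           COUNT(*) AS line_count,
           COUNT(DISTINCT orderno) AS orders_placed,
           COUNT(*) * 1.0 / COUNT(DISTINCT orderno) AS avg_lines
    FROM orders
    WHERE date BETWEEN '2010-01-21' AND date('now')
    GROUP BY yr, mon
    ORDER BY yr, mon
""").fetchall()

for row in monthly:
    print(row)
```

In MySQL the `/` operator already returns a decimal result, so the `* 1.0` would not be needed there.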