I would like to write a SQL query that lists the maximum tally count for each value of an enumerated column, on a monthly basis. This would be useful for an analytics algorithm that displays the total impressions of a particular data type.
Please check my sample table:
CREATE TABLE IF NOT EXISTS `company_attendance_tally` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start_date` datetime NOT NULL,
`sick_type` enum('VACATION','SICK','MATERNITY') COLLATE utf8_unicode_ci NOT NULL,
`leave_count` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `start_date` (`start_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=32 ;
--
-- Dumping data for table `company_attendance_tally`
--
INSERT INTO `company_attendance_tally` (`id`, `start_date`, `sick_type`, `leave_count`) VALUES
(1, '2013-03-01 16:58:44', 'VACATION', 5),
(2, '2013-03-15 10:44:35', 'SICK', 43),
(3, '2013-03-21 17:03:33', 'MATERNITY', 44),
(4, '2013-03-07 23:01:30', 'MATERNITY', 10),
(5, '2013-03-22 17:07:07', 'MATERNITY', 1),
(6, '2013-03-08 19:33:04', 'VACATION', 40),
(7, '2013-03-17 12:27:00', 'MATERNITY', 15),
(8, '2013-03-03 23:26:48', 'SICK', 11),
(9, '2013-03-05 02:16:37', 'MATERNITY', 41),
(10, '2013-03-20 12:04:28', 'MATERNITY', 18),
(11, '2013-03-18 02:10:00', 'MATERNITY', 1),
(12, '2013-03-03 09:47:02', 'MATERNITY', 19),
(13, '2013-03-22 10:17:52', 'MATERNITY', 25),
(14, '2013-03-03 19:41:52', 'VACATION', 10),
(15, '2013-03-02 19:28:41', 'SICK', 39),
(16, '2013-03-01 20:45:26', 'SICK', 42),
(17, '2013-03-26 23:52:16', 'MATERNITY', 29),
(18, '2013-03-29 14:10:58', 'SICK', 44),
(19, '2013-03-27 03:11:40', 'MATERNITY', 12),
(20, '2013-03-06 18:38:28', 'MATERNITY', 30),
(21, '2013-03-07 20:49:14', 'VACATION', 27),
(22, '2013-03-13 11:38:45', 'VACATION', 14),
(23, '2013-03-02 19:13:31', 'SICK', 2),
(24, '2013-03-01 10:08:18', 'SICK', 27),
(25, '2013-03-20 01:56:38', 'VACATION', 3),
(26, '2013-03-04 21:02:05', 'SICK', 7),
(27, '2013-03-17 00:47:17', 'MATERNITY', 36),
(28, '2013-03-04 08:12:56', 'VACATION', 5),
(29, '2013-03-18 08:50:57', 'SICK', 34),
(30, '2013-03-26 02:20:58', 'VACATION', 20),
(31, '2013-03-27 10:27:00', 'SICK', 21);
http://sqlfiddle.com/#!2/bbd1e3
I would like to display output similar to the following, based on the above scenario:
month | day | sick_type | leave_count
------------------------------------
3 | 08 | VACATION | 40
3 | 29 | SICK | 44
3 | 21 | MATERNITY | 44
and so on...
4| ... | MATERNITY | ..
4| ... | SICK | ..
4| ... | VACATION | ..
5| ... | MATERNITY | ..
5| ... | SICK | ..
5| ... | VACATION | ..
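If I understand correctly what you want, you can do the following by leveraging MySQL's non-standard GROUP BY extension. (This requires the ONLY_FULL_GROUP_BY SQL mode to be disabled, and it relies on MySQL picking the first row of each group, which is not guaranteed.)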
SELECT MONTH(start_date) month,
DAYOFMONTH(start_date) day,
sick_type,
leave_count
FROM
(
SELECT start_date, sick_type, leave_count
FROM company_attendance_tally
WHERE start_date >= '2013-01-01'
AND start_date < '2014-01-01'
ORDER BY MONTH(start_date), sick_type, leave_count DESC
) q
GROUP BY MONTH(start_date), sick_type
Note: Month values alone (without year values) in the result set make sense only if you limit the result set to a single year (see the WHERE clause).
Output:
| MONTH | DAY | SICK_TYPE | LEAVE_COUNT |
|-------|-----|-----------|-------------|
| 3 | 8 | VACATION | 40 |
| 3 | 29 | SICK | 44 |
| 3 | 21 | MATERNITY | 44 |
Here is SQLFiddle demo
Use this:
-- Caveat: DAY(start_date) is not aggregated, so d is not guaranteed to be the day of the maximum
SELECT DAY( start_date ) AS d,
MONTH( start_date ) AS mon,
sick_type,
MAX( leave_count ) AS leave_count
FROM `company_attendance_tally`
GROUP BY mon, sick_type
If you want to do it 'by-the-book', consider the following...
SELECT x.*
FROM company_attendance_tally x
JOIN
( SELECT MONTH(start_date) start_month
, sick_type
, MAX(leave_count) max_leave_count
FROM company_attendance_tally
GROUP
BY MONTH(start_date)
, sick_type
) y
ON y.start_month = MONTH(x.start_date)
AND y.sick_type = x.sick_type
AND y.max_leave_count = x.leave_count;
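To make the 'by-the-book' join easy to try out, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is an illustration under assumptions, not the answer's exact query: SQLite has no MONTH(), so strftime() stands in for it, the inner query groups on year and month together, only six of the sample rows are loaded, and the function name monthly_maxima is made up for this example.

```python
import sqlite3

# A few rows copied from the question's sample data:
# (id, start_date, sick_type, leave_count)
ROWS = [
    (6, "2013-03-08 19:33:04", "VACATION", 40),
    (21, "2013-03-07 20:49:14", "VACATION", 27),
    (2, "2013-03-15 10:44:35", "SICK", 43),
    (18, "2013-03-29 14:10:58", "SICK", 44),
    (3, "2013-03-21 17:03:33", "MATERNITY", 44),
    (9, "2013-03-05 02:16:37", "MATERNITY", 41),
]

# Inner query: the maximum leave_count per (year-month, sick_type).
# Outer join: fetch the full row(s) that carry that maximum.
QUERY = """
SELECT CAST(strftime('%m', x.start_date) AS INTEGER) AS month,
       CAST(strftime('%d', x.start_date) AS INTEGER) AS day,
       x.sick_type,
       x.leave_count
FROM company_attendance_tally x
JOIN (
    SELECT strftime('%Y-%m', start_date) AS ym,
           sick_type,
           MAX(leave_count) AS max_leave_count
    FROM company_attendance_tally
    GROUP BY strftime('%Y-%m', start_date), sick_type
) y
  ON y.ym = strftime('%Y-%m', x.start_date)
 AND y.sick_type = x.sick_type
 AND y.max_leave_count = x.leave_count
ORDER BY month, x.sick_type
"""


def monthly_maxima(rows):
    """Return (month, day, sick_type, leave_count) for each monthly maximum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE company_attendance_tally "
        "(id INTEGER PRIMARY KEY, start_date TEXT, sick_type TEXT, leave_count INTEGER)"
    )
    conn.executemany("INSERT INTO company_attendance_tally VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Calling monthly_maxima(ROWS) yields one row per sick_type for March. If two rows in the same month tie for the maximum, both are returned, which is the usual trade-off of this join pattern.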
Hi, I've got the following MySQL table:
I need to pivot this data to look like the following:
The scadenza values are not always the same and can change, so I believe I need to build the pivot table dynamically. This is the stored procedure I've tried:
BEGIN
SELECT
GROUP_CONCAT(
CONCAT("MAX(IF(scadenza='", scadenza, "',importo ,'')) AS '", scadenza, "'"), "
"
)INTO @answers
FROM (
SELECT DISTINCT scadenza,
importo
FROM ripartizione_rate
) A;
SET @query :=
CONCAT(
'SELECT DISTINCT condomino, anagrafica,', @answers,
'FROM ripartizione_rate
GROUP BY condomino, anagrafica'
);
PREPARE statement FROM @query;
EXECUTE statement;
END
The result I get is very close, but the dates are repeated, as you can see in the image below:
Can anybody please help me fix this problem? Many thanks for your help.
This is the dump for the table:
INSERT INTO `ripartizione_rate` (`id`, `preventivo`, `piano_rateale`, `condomino`, `anagrafica`, `immobile`, `descrizione`, `scadenza`, `stato_pagamento`, `importo`, `importo_pagato`, `importo_residuo`) VALUES
(1, 1, 1, 19, 11, 3, 'Rata Num.1', '2021-01-01', 0, '208.38', '0.00', '0.00'),
(2, 1, 1, 12, 15, 3, 'Rata Num.1', '2021-01-01', 0, '208.38', '0.00', '0.00'),
(3, 1, 1, 10, 15, 5, 'Rata Num.1', '2021-01-01', 0, '500.10', '0.00', '0.00'),
(4, 1, 1, 20, 17, 3, 'Rata Num.1', '2021-01-01', 0, '83.35', '0.00', '0.00'),
(5, 1, 1, 19, 11, 3, 'Rata Num.2', '2021-05-01', 0, '208.31', '0.00', '0.00'),
(6, 1, 1, 12, 15, 3, 'Rata Num.2', '2021-05-01', 0, '208.31', '0.00', '0.00'),
(7, 1, 1, 10, 15, 5, 'Rata Num.2', '2021-05-01', 0, '499.95', '0.00', '0.00'),
(8, 1, 1, 20, 17, 3, 'Rata Num.2', '2021-05-01', 0, '83.33', '0.00', '0.00'),
(9, 1, 1, 19, 11, 3, 'Rata Num.3', '2021-09-01', 0, '208.31', '0.00', '0.00'),
(10, 1, 1, 12, 15, 3, 'Rata Num.3', '2021-09-01', 0, '208.31', '0.00', '0.00'),
(11, 1, 1, 10, 15, 5, 'Rata Num.3', '2021-09-01', 0, '499.95', '0.00', '0.00'),
(12, 1, 1, 20, 17, 3, 'Rata Num.3', '2021-09-01', 0, '83.33', '0.00', '0.00');
The subquery part here:
SELECT
GROUP_CONCAT(
CONCAT("MAX(IF(scadenza='", scadenza, "',importo ,'')) AS '", scadenza, "'"), "
"
)INTO @answers
FROM (
SELECT DISTINCT scadenza, <----
importo <---- this subquery
FROM ripartizione_rate <----
) A
returns the following result:
+------------+---------+
| scadenza | importo |
+------------+---------+
| 2021-01-01 | 208.38 |
| 2021-01-01 | 500.10 |
| 2021-01-01 | 83.35 |
| 2021-05-01 | 208.31 |
| 2021-05-01 | 499.95 |
| 2021-05-01 | 83.33 |
| 2021-09-01 | 208.31 |
| 2021-09-01 | 499.95 |
| 2021-09-01 | 83.33 |
+------------+---------+
Each date returns 3 rows due to the DISTINCT combination of scadenza, importo. If you run SELECT @answers; after the variable has been assigned, you'll get:
SELECT @answers;
+-------------------------------------------------------------+
| @answers |
+-------------------------------------------------------------+
| MAX(IF(scadenza='2021-01-01',importo ,'')) AS '2021-01-01' |
| ,MAX(IF(scadenza='2021-01-01',importo ,'')) AS '2021-01-01' |
| ,MAX(IF(scadenza='2021-01-01',importo ,'')) AS '2021-01-01' |
| ,MAX(IF(scadenza='2021-05-01',importo ,'')) AS '2021-05-01' |
| ,MAX(IF(scadenza='2021-05-01',importo ,'')) AS '2021-05-01' |
| ,MAX(IF(scadenza='2021-05-01',importo ,'')) AS '2021-05-01' |
| ,MAX(IF(scadenza='2021-09-01',importo ,'')) AS '2021-09-01' |
| ,MAX(IF(scadenza='2021-09-01',importo ,'')) AS '2021-09-01' |
| ,MAX(IF(scadenza='2021-09-01',importo ,'')) AS '2021-09-01' |
+-------------------------------------------------------------+
What you really want is just 3 distinct dates instead of 3x3. The fix is therefore quite simple: remove the column importo from the subquery:
SELECT
GROUP_CONCAT(
CONCAT("MAX(IF(scadenza='", scadenza, "',importo ,'')) AS '", scadenza, "'"), "") INTO @answers
FROM (SELECT DISTINCT scadenza
FROM ripartizione_rate
) A;
SELECT @answers;
SET @query :=
CONCAT(
'SELECT DISTINCT condomino, anagrafica,', @answers,
' FROM ripartizione_rate
GROUP BY condomino, anagrafica
ORDER BY condomino, anagrafica'
);
SELECT @query;
PREPARE statement FROM @query;
EXECUTE statement;
Demo fiddle
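The same idea (build one pivot column per distinct date, and nothing else) can be sketched outside a stored procedure too. The following is only an illustration under assumptions: it uses Python with an in-memory SQLite database instead of MySQL, MAX(CASE ...) instead of MAX(IF(...)), a six-row subset of the dump, and a hypothetical function name pivot_by_scadenza.

```python
import sqlite3

# (condomino, anagrafica, scadenza, importo): a subset of the question's dump.
ROWS = [
    (19, 11, "2021-01-01", 208.38),
    (10, 15, "2021-01-01", 500.10),
    (19, 11, "2021-05-01", 208.31),
    (10, 15, "2021-05-01", 499.95),
    (19, 11, "2021-09-01", 208.31),
    (10, 15, "2021-09-01", 499.95),
]


def pivot_by_scadenza(rows):
    """Pivot importo into one column per distinct scadenza date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ripartizione_rate "
        "(condomino INTEGER, anagrafica INTEGER, scadenza TEXT, importo REAL)"
    )
    conn.executemany("INSERT INTO ripartizione_rate VALUES (?, ?, ?, ?)", rows)

    # Only scadenza is selected here, so each date appears exactly once.
    dates = [d for (d,) in conn.execute(
        "SELECT DISTINCT scadenza FROM ripartizione_rate ORDER BY scadenza"
    )]

    # One MAX(CASE ...) column per date. The date value is bound as a
    # parameter; the alias is double-quoted with embedded quotes escaped.
    columns = ", ".join(
        'MAX(CASE WHEN scadenza = ? THEN importo END) AS "{}"'.format(
            d.replace('"', '""')
        )
        for d in dates
    )
    query = (
        f"SELECT condomino, anagrafica, {columns} "
        "FROM ripartizione_rate "
        "GROUP BY condomino, anagrafica "
        "ORDER BY condomino, anagrafica"
    )
    cursor = conn.execute(query, dates)
    header = [col[0] for col in cursor.description]
    body = cursor.fetchall()
    conn.close()
    return header, body
```

With the subset above this produces three date columns rather than nine, because the list of dates comes from SELECT DISTINCT scadenza alone.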
I currently have a room availability query that displays the rooms open for booking in a specified date range.
I need to display the same availability, but instead of showing only FULL availability I need to show partial availability too.
E.g. booking 1 runs from the 22nd to the 25th (in room 4),
booking 2 runs from the 24th to the 28th (in room 3),
and the queried booking runs from the 23rd to the 25th.
22nd 23rd 24th 25th 28th
|-----------------------|
|------------------|
|------| free space
query:
SELECT r.*
, CASE WHEN b.ref IS NULL THEN 'all' ELSE 'partial' END status
FROM roominfo r
LEFT JOIN bookroom br ON br.id = r.id
LEFT JOIN book b ON b.ref = br.ref
AND b.end_date >= '2019-11-23' AND b.start_date <= '2019-11-25'
ORDER BY r.id;
example structure & data:
CREATE SCHEMA TEST;
USE TEST;
CREATE TABLE BOOK( Ref INT NOT NULL AUTO_INCREMENT, Start_Date DATE NOT NULL, End_Date DATE NOT NULL, PRIMARY KEY(Ref));
CREATE TABLE ROOMINFO( ID INT NOT NULL AUTO_INCREMENT, `Type` VARCHAR(10) NOT NULL, Max TINYINT NOT NULL, PRIMARY KEY(ID));
CREATE TABLE BOOKROOM( Ref INT NOT NULL,ID INT NOT NULL, FOREIGN KEY (Ref) REFERENCES BOOK(Ref), FOREIGN KEY (ID) REFERENCES ROOMINFO(ID));
INSERT INTO BOOK(Start_Date, End_Date) VALUES
('2019-11-22', '2019-11-25'),('2019-11-24', '2019-11-28'),('2019-12-01', '2019-12-02'),('2019-12-01', '2019-12-06'),
('2019-12-02', '2019-12-03'),('2019-12-04', '2019-12-10'),('2019-12-04', '2019-12-10'),('2019-12-05', '2019-12-13'),
('2019-12-16', '2019-12-19'),('2019-12-26', '2019-12-28'),('2019-12-26', '2020-01-01'),('2019-12-28', '2020-01-02'),
('2019-12-31', '2020-01-05'),('2020-01-03', '2020-01-08'),('2020-01-05', '2020-01-11'),('2020-01-06', '2020-01-09'),
('2020-01-06', '2020-01-11'),('2020-01-08', '2020-01-18'),('2020-01-11', '2020-01-15'),('2020-01-15', '2020-01-17'),
('2020-01-15', '2020-01-18');
INSERT INTO ROOMINFO (ID, `Type`,Max) VALUES
(1, "Family", 4), (2, "Family", 4), (3, "Family", 4), (4, "Dual", 2),
(5, "Dual", 2), (6, "Dual", 2), (7, "Dual", 2), (8, "Dual", 2),
(9, "Dual", 2), (10, "Dual", 2);
INSERT INTO BOOKROOM( Ref, ID ) VALUES
(1, 4), (2, 3), (3, 4), (4, 5),(5, 6), (6, 7), (7, 3), (8, 2), (9, 1), (10, 8),(11, 3),
(12, 9), (13, 2), (14, 10), (15, 4), (16, 5), (17, 6), (18, 7), (19, 2),(20, 1), (21, 10);
desired output:
id (& some indication of partial availability?)
1 all
2 all
3 partial
4 partial
5 all
6 all
7 all
8 all
9 all
10 all
Ignoring the distressing naming policy...
DROP TABLE IF EXISTS book;
CREATE TABLE BOOK( Ref INT NOT NULL AUTO_INCREMENT, Start_Date DATE NOT NULL, End_Date DATE NOT NULL, PRIMARY KEY(Ref));
DROP TABLE IF EXISTS roominfo;
CREATE TABLE ROOMINFO( ID INT NOT NULL AUTO_INCREMENT, `Type` VARCHAR(10) NOT NULL, capacity TINYINT NOT NULL, PRIMARY KEY(ID));
DROP TABLE IF EXISTS bookroom;
CREATE TABLE BOOKROOM( Ref INT NOT NULL,ID INT NOT NULL);
INSERT INTO BOOK(Start_Date, End_Date) VALUES
("2019-11-03", "2019-11-10"), ("2019-11-05", "2019-11-13");
INSERT INTO ROOMINFO (ID, `Type`,capacity) VALUES
(1, "Family", 4), (2, "Family", 4), (3, "Family", 4), (4, "Dual", 2),
(5, "Dual", 2), (6, "Dual", 2), (7, "Dual", 2), (8, "Dual", 2),
(9, "Dual", 2), (10, "Dual", 2);
INSERT INTO BOOKROOM( Ref, ID ) VALUES (1, 4), (2, 3);
SELECT r.*
, CASE WHEN b.ref IS NULL THEN 'all' ELSE 'partial' END status
FROM roominfo r
LEFT
JOIN bookroom br
ON br.id = r.id
LEFT
JOIN book b
ON b.ref = br.ref
AND b.end_date >= '2019-11-01' AND b.start_date <= '2019-11-13'
ORDER
BY r.id;
+----+--------+----------+---------+
| ID | Type | capacity | status |
+----+--------+----------+---------+
| 1 | Family | 4 | all |
| 2 | Family | 4 | all |
| 3 | Family | 4 | partial |
| 4 | Dual | 2 | partial |
| 5 | Dual | 2 | all |
| 6 | Dual | 2 | all |
| 7 | Dual | 2 | all |
| 8 | Dual | 2 | all |
| 9 | Dual | 2 | all |
| 10 | Dual | 2 | all |
+----+--------+----------+---------+
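One thing to watch with the LEFT JOIN chain: a room that has several bookings produces one output row per booking, so it can show up as both 'partial' and 'all'. Below is a hedged sketch of the same overlap test collapsed to one row per room with GROUP BY. It runs on SQLite from Python purely for illustration; the three bookings are taken from the question's data, and room_status is a name invented for this example.

```python
import sqlite3

# Bookings 1-3 from the question: room 4 is booked twice, room 3 once.
SCHEMA = """
CREATE TABLE book (ref INTEGER PRIMARY KEY, start_date TEXT, end_date TEXT);
CREATE TABLE roominfo (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE bookroom (ref INTEGER, id INTEGER);
INSERT INTO book VALUES (1, '2019-11-22', '2019-11-25'),
                        (2, '2019-11-24', '2019-11-28'),
                        (3, '2019-12-01', '2019-12-02');
INSERT INTO roominfo VALUES (1, 'Family'), (3, 'Family'), (4, 'Dual');
INSERT INTO bookroom VALUES (1, 4), (2, 3), (3, 4);
"""

# Two ranges overlap when each one starts on or before the other ends.
# COUNT(b.ref) counts only the bookings that passed the overlap test.
QUERY = """
SELECT r.id,
       CASE WHEN COUNT(b.ref) = 0 THEN 'all' ELSE 'partial' END AS status
FROM roominfo r
LEFT JOIN bookroom br ON br.id = r.id
LEFT JOIN book b
       ON b.ref = br.ref
      AND b.end_date >= :range_start
      AND b.start_date <= :range_end
GROUP BY r.id
ORDER BY r.id
"""


def room_status(range_start, range_end):
    """Return (room id, 'all' or 'partial') for the requested date range."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = conn.execute(
        QUERY, {"range_start": range_start, "range_end": range_end}
    ).fetchall()
    conn.close()
    return rows
```

For the range 2019-11-23 to 2019-11-25 this reports rooms 3 and 4 as 'partial' and room 1 as 'all', with room 4 listed once even though it has a second, non-overlapping booking in December.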
I am writing a query that finds the youngest student in each major whose average grade score is more than 80, ordered by name, from the following relations. I am using MySQL Server and working with MySQL Workbench.
Student:
snum: integer
name: string
major: string
level: string
age: integer
Class:
cname: string
meets_at: time
room: string
fid: integer
Grade:
snum (foreign key)
name (foreign key)
score
Here is how I tried to implement the query.
select S.major, S.name, S.age
from student S , grades G
group by S.major
Having MIN(S.age) and G.score > (Select avg(G.score)
from grades G1 , student S
where S.snum = G1.snum) ;
However this doesn't work and I am really confused about what the query should look like.
Sample data:
CREATE TABLE students
(`snum` int, `name` varchar(18), `major` varchar(22), `standing` varchar(2),
`age` int)
;
INSERT INTO students
(`snum`, `name`, `major`, `standing`, `age`)
VALUES
(578875478, 'Edward Baker', 'Veterinary Medicine', 'SR', 21),
(574489456, 'Betty Adams', 'Economics', 'JR', 20),
(573284895, 'Steven Green', 'Kinesiology', 'SO', 19),
(567354612, 'Karen Scott', 'Computer Engineering', 'FR', 18),
(556784565, 'Kenneth Hill', 'Civil Engineering', 'SR', 21),
(552455318, 'Ana Lopez', 'Computer Engineering', 'SR', 19),
(550156548, 'George Wright', 'Education', 'SR', 21),
(462156489, 'Donald King', 'Mechanical Engineering', 'SO', 19),
(455798411, 'Luis Hernandez', 'Electrical Engineering', 'FR', 17),
(451519864, 'Mark Young', 'Finance', 'FR', 18),
(351565322, 'Nancy Allen', 'Accounting', 'JR', 19),
(348121549, 'Paul Hall', 'Computer Science', 'JR', 18),
(322654189, 'Lisa Walker', 'Computer Science', 'SO', 17),
(320874981, 'Daniel Lee', 'Electrical Engineering', 'FR', 17),
(318548912, 'Dorthy Lewis', 'Finance', 'FR', 18),
(301221823, 'Juan Rodriguez', 'Psychology', 'JR', 20),
(280158572, 'Margaret Clark', 'Animal Science', 'FR', 18),
(269734834, 'Thomas Robinson', 'Psychology', 'SO', 18),
(132977562, 'Angela Martinez', 'History', 'SR', 20),
(115987938, 'Christopher Garcia', 'Computer Science', 'JR', 20),
(112348546, 'Joseph Thompson', 'Computer Science', 'SO', 19),
(99354543, 'Susan Martin', 'Law', 'JR', 20),
(60839453, 'Charles Harris', 'Architecture', 'SR', 22),
(51135593, 'Maria White', 'English', 'SR', 21);
CREATE TABLE grades
(`snum` int, `cname` varchar(23), `score` int);
INSERT INTO grades
(`snum`, `cname`, `score`)
VALUES
(574489456, 'Urban Economics', 45),
(567354612, 'Operating System Design', 98),
(567354612, 'Data Structures', 100),
(552455318, 'Operating System Design', 98),
(552455318, 'Communication Networks', 87),
(455798411, 'Operating System Design', 100),
(455798411, 'Optical Electronics', 87),
(348121549, 'Database Systems', 90),
(322654189, 'Database Systems', 97),
(322654189, 'Operating System Design', 56),
(301221823, 'Perception', 87),
(301221823, 'Social Cognition', 87),
(115987938, 'Database Systems', 100),
(115987938, 'Operating System Design', 98),
(112348546, 'Database Systems', 80),
(112348546, 'Operating System Design', 35),
(99354543, 'Patent Law', 65)
;
Expected Results:
+------------------------+----------------+----+---------+---+
| Computer Engineering | Karen Scott | 18 | 99.0000 | 1 |
+------------------------+----------------+----+---------+---+
| Computer Science | Paul Hall | 18 | 90.0000 | 1 |
+------------------------+----------------+----+---------+---+
| Electrical Engineering | Luis Hernandez | 17 | 93.5000 | 1 |
+------------------------+----------------+----+---------+---+
| Psychology | Juan Rodriguez | 20 | 87.0000 | 1 |
+------------------------+----------------+----+---------+---+
Here is an approach that might work for your use case. The logic is to combine aggregation and window functions.
First, you can use a simple aggregate query to compute the average score of each student:
SELECT s.major, s.name, s.age, AVG(g.score) avg_score
FROM
students s
INNER JOIN grades g ON g.snum = s.snum
GROUP BY s.snum, s.major, s.name, s.age
HAVING AVG(g.score) > 80
This gives you one record per student whose average score is higher than 80, along with their age, name, major, and average score.
Now all that is left to do is to select the youngest student in each group of students that have the same major. This can be done with the window function ROW_NUMBER() (available in MySQL 8.0 and later):
SELECT major, name, age, avg_score
FROM (
SELECT
x.*,
ROW_NUMBER() OVER(PARTITION BY major ORDER BY age) rn
FROM (
SELECT s.major, s.name, s.age, AVG(g.score) avg_score
FROM
students s
INNER JOIN grades g ON g.snum = s.snum
GROUP BY s.snum, s.major, s.name, s.age
HAVING AVG(g.score) > 80
) x
) z WHERE rn = 1
This DB Fiddle with your sample data returns:
| major | name | age | avg_score |
| ---------------------- | -------------- | --- | --------- |
| Computer Engineering | Karen Scott | 18 | 99 |
| Computer Science | Paul Hall | 18 | 90 |
| Electrical Engineering | Luis Hernandez | 17 | 93.5 |
| Psychology | Juan Rodriguez | 20 | 87 |
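As a variation, here is a sketch that swaps ROW_NUMBER() for a join on MIN(age). This is a different technique from the one above, named plainly: it returns every student tied for youngest in a major, whereas ROW_NUMBER() keeps one of them. The example runs on SQLite from Python for convenience, loads only a handful of the sample rows, and the name youngest_qualified is hypothetical.

```python
import sqlite3

# A subset of the question's sample data (standing column omitted).
SCHEMA = """
CREATE TABLE students (snum INTEGER PRIMARY KEY, name TEXT, major TEXT, age INTEGER);
CREATE TABLE grades (snum INTEGER, cname TEXT, score INTEGER);
INSERT INTO students VALUES
  (348121549, 'Paul Hall', 'Computer Science', 18),
  (322654189, 'Lisa Walker', 'Computer Science', 17),
  (115987938, 'Christopher Garcia', 'Computer Science', 20),
  (301221823, 'Juan Rodriguez', 'Psychology', 20);
INSERT INTO grades VALUES
  (348121549, 'Database Systems', 90),
  (322654189, 'Database Systems', 97),
  (322654189, 'Operating System Design', 56),
  (115987938, 'Database Systems', 100),
  (115987938, 'Operating System Design', 98),
  (301221823, 'Perception', 87),
  (301221823, 'Social Cognition', 87);
"""

# Step 1 (qualified): students whose average score exceeds 80.
# Step 2: keep those whose age equals the minimum age within their major.
QUERY = """
WITH qualified AS (
    SELECT s.snum, s.major, s.name, s.age, AVG(g.score) AS avg_score
    FROM students s
    JOIN grades g ON g.snum = s.snum
    GROUP BY s.snum, s.major, s.name, s.age
    HAVING AVG(g.score) > 80
)
SELECT q.major, q.name, q.age, q.avg_score
FROM qualified q
JOIN (SELECT major, MIN(age) AS min_age FROM qualified GROUP BY major) m
  ON m.major = q.major AND m.min_age = q.age
ORDER BY q.major, q.name
"""


def youngest_qualified():
    """Return (major, name, age, avg_score) for the youngest qualifying students."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

Note that Lisa Walker is the youngest Computer Science student in this subset, but her average of 76.5 is filtered out before the age comparison, so Paul Hall is returned for that major.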
I am trying to put together a report in SQL that will run in MySQL.
I have a companies table:
INSERT INTO companies
(`id`, `name`, `createdDate`)
VALUES
(1, 'company_1', '2016-02-01 04:00:00'),
(2, 'company_2', '2016-01-01 04:00:00'),
(3, 'company_3', '2016-04-01 04:00:00'),
(4, 'company_4', '2016-03-01 04:00:00'),
(5, 'company_5', '2016-02-01 04:00:00')
;
I have a users table where a bunch of users work for a specific company in a one company to many users scenario. Users accept an invite to join the company and we capture the date as follows:
INSERT INTO users
(`userId`, `companyId`, `acceptedInviteDate`)
VALUES
(1, 1, '2017-01-01 04:00:00'),
(2, 1, '2017-01-02 04:00:00'),
(3, 1, '2017-01-03 04:00:00'),
(4, 1, '2017-01-04 04:00:00'),
(5, 2, '2017-01-05 04:00:00'),
(6, 2, '2017-01-09 04:00:00'),
(7, 2, '2017-01-10 04:00:00'),
(8, 2, '2017-01-11 04:00:00'),
(9, 2, '2017-01-12 04:00:00'),
(10, 3, '2017-01-13 04:00:00'),
(11, 3, '2017-01-15 04:00:00'),
(12, 3, '2017-01-02 04:00:00'),
(13, 3, '2017-01-03 04:00:00'),
(14, 3, '2017-01-04 04:00:00'),
(15, 3, '2017-01-05 04:00:00'),
(16, 3, '2017-01-06 04:00:00'),
(17, 3, '2017-01-07 04:00:00'),
(18, 3, '2017-01-08 04:00:00'),
(19, 3, '2017-01-09 04:00:00'),
(20, 3, '2017-01-11 04:00:00'),
(21, 3, '2017-01-13 04:00:00'),
(22, 3, '2017-01-15 04:00:00'),
(23, 3, '2017-01-16 04:00:00'),
(24, 3, '2017-01-17 04:00:00'),
(25, 3, '2017-01-18 04:00:00'),
(26, 3, '2017-01-19 04:00:00'),
(27, 3, '2017-01-20 04:00:00'),
(28, 1, '2018-01-05 04:00:00'),
(29, 1, '2018-01-10 04:00:00'),
(30, 1, '2018-01-15 04:00:00'),
(31, 1, '2018-01-20 04:00:00'),
(32, 1, '2018-01-22 04:00:00')
;
I also have the following data in a table called activities. Some users record activities almost daily, some a few times a week, and others a few times a month, as follows:
INSERT INTO activities
(`userId`, `activityId`, `type`, `activityDate`)
VALUES
(1, 1, 'commit', '2018-01-01 04:00:00'),
(1, 2, 'commit', '2018-01-02 04:00:00'),
(1, 3, 'commit', '2018-01-03 04:00:00'),
(1, 4, 'commit', '2018-01-04 04:00:00'),
(1, 5, 'did', '2018-01-05 04:00:00'),
(1, 6, 'did', '2018-01-12 04:00:00'),
(1, 7, 'did', '2018-01-14 04:00:00'),
(1, 8, 'did', '2018-01-29 04:00:00'),
(1, 9, 'skipped', '2018-01-29 04:00:00'),
(1, 10, 'did', '2018-01-29 04:00:00'),
(1, 11, 'did', '2018-01-29 04:00:00'),
(1, 12, 'did', '2018-01-29 04:00:00'),
(1, 13, 'did', '2018-01-29 04:00:00'),
(2, 14, 'commit', '2018-01-01 04:00:00'),
(2, 15, 'did', '2018-01-02 04:00:00'),
(2, 16, 'commit', '2018-01-03 04:00:00'),
(2, 17, 'commit', '2018-01-04 04:00:00'),
(2, 18, 'did', '2018-01-05 04:00:00'),
(2, 19, 'did', '2018-01-12 04:00:00'),
(2, 20, 'commit', '2018-01-14 04:00:00'),
(2, 21, 'did', '2018-01-29 04:00:00'),
(2, 22, 'skipped', '2018-01-29 04:00:00'),
(2, 23, 'did', '2018-01-29 04:00:00'),
(2, 24, 'did', '2018-01-29 04:00:00'),
(2, 25, 'skipped', '2018-01-29 04:00:00'),
(2, 26, 'did', '2018-01-29 04:00:00')
I'm trying to create a report in MySQL that gives me the following output for each company:
1) The number of users per week who did an activity of type did, where a week is counted from the date the company was created, not the calendar week. So if the company was created on 03/03/17, the first week is 03/03/17 - 03/10/17, the second week starts 7 days later, and so on up to week #x, which reaches the current date.
2) The cumulative number of users whose acceptedInviteDate is not null, i.e. only the ones who accepted. So, for example, the week 3 value = week 1 + week 2 + week 3 for that company.
Here is a sample output:
companyId | week# | users_with_activity_type_did | totalUsersWhoAcceptedAnInvite
1 | 1 | 0 | 0
1 | 48 | 0 | 0
....
1 | 49 | 3 | 28
1 | 50 | 3 | 29
1 | 51 | 0 | 30
Please see the latest fiddle started by user Sentinel --> http://sqlfiddle.com/#!9/4431be/1
The data inserted is correct, but the SQL is wrong and returns wrong data.
Here's a possible solution using the provided sample data.
To make this work a Weeks dimension table is needed. Note, however, that based on the sample data users 1 and 2 started working for Company_1 before company_1 was created, so the Weeks table needs to have some negative week numbers to pick up that data.
See this SQL Fiddle for complete setup and example code.
Additional MySQL 5.6 Schema Setup:
create table ones (num bigint);
insert into ones values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table weeks as
select o.num + t.num * 10 + h.num * 100 week_no
from ones o, ones t, ones h order by 1;
insert into weeks select -num from ones where num > 0;
drop table ones;
Query 1:
select c.id companyid
, n.week_no
, count(distinct case when a.type = 'did' then a.userid end) users_with_activity_type_did
, count(distinct case when a.type = 'commit' then a.userid end) users_with_activity_type_commit
, count(distinct case when a.type = 'skipped' then a.userid end) users_with_activity_type_skip
, count(distinct case when u.acceptedInviteDate < (c.createdDate + interval (7*(n.week_no+1)) day)
then u.userid
end) totalUsersWhoAcceptedAnInvite
from companies c
cross join weeks n
left join users u
on u.companyid = c.id
left join activities a
on a.userid = u.userid
-- and a.type = 'did'
and (c.createdDate + interval (7*n.week_no) day) <= a.activitydate
and a.activitydate < (c.createdDate + interval (7*(n.week_no+1)) day)
group by c.id
, n.week_no with rollup
having max(case when u.acceptedInviteDate < (c.createdDate + interval (7*(n.week_no+1)) day)
and u.acceptedInviteDate >= (c.createdDate + interval (7*(n.week_no)) day)
then 1
when a.activityid is not null then 1
else 0
end) = 1
Results:
| companyid | week_no | users_with_activity_type_did | users_with_activity_type_commit | users_with_activity_type_skip | totalUsersWhoAcceptedAnInvite |
|-----------|---------|------------------------------|---------------------------------|-------------------------------|-------------------------------|
| 1 | 47 | 0 | 0 | 0 | 1 |
| 1 | 48 | 0 | 0 | 0 | 4 |
| 1 | 100 | 2 | 2 | 0 | 5 |
| 1 | 101 | 2 | 1 | 0 | 6 |
| 1 | 102 | 0 | 0 | 0 | 8 |
| 1 | 103 | 0 | 0 | 0 | 9 |
| 1 | 104 | 2 | 0 | 2 | 9 |
| 1 | (null) | 2 | 2 | 2 | 9 |
| 2 | 52 | 0 | 0 | 0 | 1 |
| 2 | 53 | 0 | 0 | 0 | 5 |
| 2 | (null) | 0 | 0 | 0 | 5 |
| 3 | 39 | 0 | 0 | 0 | 4 |
| 3 | 40 | 0 | 0 | 0 | 9 |
| 3 | 41 | 0 | 0 | 0 | 17 |
| 3 | 42 | 0 | 0 | 0 | 18 |
| 3 | (null) | 0 | 0 | 0 | 18 |
| (null) | (null) | 2 | 2 | 2 | 32 |
I've updated this answer based on your updated sample data. I've also added a separate output column for each activity type instead of filtering on the activity type during the join. You can remove the extra columns and add the join filter back in if desired.
Also, since the activity and acceptance data is so sparse, I've added a HAVING clause to report only the weeks in which users accepted an invite or had activity.
The final change is the WITH ROLLUP modifier on the GROUP BY clause, which produces some grand totals.
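To sanity-check which week_no a given row lands in, the bucketing used in the join (week n covers createdDate + 7n days up to, but not including, createdDate + 7(n+1) days) can be reproduced in a few lines of Python. This is just a sketch of the arithmetic; the helper names week_no and week_bounds are made up for illustration.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)


def week_no(created: datetime, event: datetime) -> int:
    """Zero-based index of the 7-day window, counted from `created`, containing `event`.

    Floor division means events before `created` get negative week numbers.
    """
    return (event - created) // WEEK


def week_bounds(created: datetime, n: int):
    """Half-open interval [start, end) covered by week `n`."""
    start = created + n * WEEK
    return start, start + WEEK


# company_1 was created on 2016-02-01 04:00:00 in the sample data.
company_1_created = datetime(2016, 2, 1, 4, 0, 0)
```

For example, week_no(company_1_created, datetime(2018, 1, 5, 4)) gives 100 and the activities on 2018-01-29 fall in week 104, matching the week_no values for company 1 in the results above.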
With the table and data below I am trying to get the highest effective_from values that are less than the current timestamp, per unique brand/model combination - effectively the current price per item.
CREATE TABLE things
(`id` int, `brand` varchar(1), `model` varchar(5), `effective_from` int, `price` int);
INSERT INTO things
(`id`, `brand`, `model`, `effective_from`, `price`)
VALUES
(1, 'a', 'red', 1402351200, 100),
(2, 'b', 'red', 1402351200, 110),
(3, 'a', 'green', 1402391200, 120),
(4, 'b', 'blue', 1402951200, 115),
(5, 'a', 'red', 1409351200, 150),
(6, 'a', 'blue', 1902351200, 140),
(7, 'b', 'green', 1402358200, 135),
(8, 'b', 'blue', 1902358200, 155),
(9, 'b', 'red', 1902751200, 200),
(10, 'a', 'red', 1908351200, 210),
(11, 'a', 'red', 1402264800, 660);
So far I have managed to get the row I'm looking for when I add conditions for a specific brand/model combination, but don't know how to fetch the current prices for all unique row combinations.
SELECT *
FROM things
WHERE effective_from<UNIX_TIMESTAMP()
AND brand='a'
AND model='red'
ORDER BY effective_from DESC
LIMIT 1;
If the current timestamp was 1402404432 the results should be as follows:
(1, 'a', 'red', 1402351200, 100),
(3, 'a', 'green', 1402391200, 120),
(2, 'b', 'red', 1402351200, 110),
(7, 'b', 'green', 1402358200, 135),
I guess you're after this. Advise if otherwise...
SELECT x.*
FROM things x
JOIN
( SELECT brand
, model
, MAX(effective_from) max_effective_from
FROM things
WHERE effective_from <= UNIX_TIMESTAMP()
GROUP
BY brand
, model
) y
ON y.brand = x.brand
AND y.model = x.model
AND y.max_effective_from = x.effective_from;
+------+-------+-------+----------------+-------+
| id | brand | model | effective_from | price |
+------+-------+-------+----------------+-------+
| 1 | a | red | 1402351200 | 100 |
| 2 | b | red | 1402351200 | 110 |
| 3 | a | green | 1402391200 | 120 |
| 7 | b | green | 1402358200 | 135 |
+------+-------+-------+----------------+-------+
SELECT UNIX_TIMESTAMP();
+------------------+
| UNIX_TIMESTAMP() |
+------------------+
| 1402404432 |
+------------------+
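If you want to reproduce the result for a fixed point in time rather than whatever UNIX_TIMESTAMP() returns today, the cutoff can be passed in as a parameter. Here is a minimal sketch of that, assuming an in-memory SQLite database driven from Python instead of MySQL; the function name current_prices is hypothetical, and the query is otherwise the same join on the per-group MAX(effective_from).

```python
import sqlite3

# The question's sample rows: (id, brand, model, effective_from, price)
THINGS = [
    (1, "a", "red", 1402351200, 100),
    (2, "b", "red", 1402351200, 110),
    (3, "a", "green", 1402391200, 120),
    (4, "b", "blue", 1402951200, 115),
    (5, "a", "red", 1409351200, 150),
    (6, "a", "blue", 1902351200, 140),
    (7, "b", "green", 1402358200, 135),
    (8, "b", "blue", 1902358200, 155),
    (9, "b", "red", 1902751200, 200),
    (10, "a", "red", 1908351200, 210),
    (11, "a", "red", 1402264800, 660),
]

# Inner query: latest effective_from at or before :now per brand/model.
# Outer join: the full row that carries that timestamp.
QUERY = """
SELECT x.id, x.brand, x.model, x.effective_from, x.price
FROM things x
JOIN (
    SELECT brand, model, MAX(effective_from) AS max_effective_from
    FROM things
    WHERE effective_from <= :now
    GROUP BY brand, model
) y
  ON y.brand = x.brand
 AND y.model = x.model
 AND y.max_effective_from = x.effective_from
ORDER BY x.id
"""


def current_prices(now):
    """Return the row in effect at Unix time `now` for each brand/model pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE things "
        "(id INTEGER, brand TEXT, model TEXT, effective_from INTEGER, price INTEGER)"
    )
    conn.executemany("INSERT INTO things VALUES (?, ?, ?, ?, ?)", THINGS)
    rows = conn.execute(QUERY, {"now": now}).fetchall()
    conn.close()
    return rows
```

current_prices(1402404432) returns the rows with ids 1, 2, 3 and 7, the same four rows shown above. Brand/model pairs with no price in effect yet (both blue items at that timestamp) are simply absent from the result.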