MySQL GROUP BY with multiple parameters hiding zeros - mysql

I've read similar questions here on Stack Overflow, but the OP's table structure is never quite the same as mine, so the answers don't work for me. The posts I've read only GROUP BY one column, whereas I need two. I'm using MySQL, latest stable release.
Here's my table "reference":
id formatID referenceTime
1 1 2011-6-12 12:40
2 2 2011-6-12 1:04
3 4 2011-6-12 1:03
4 2 2011-6-12 15:20
5 3 2011-6-12 9:30
6 3 2011-6-12 2:55
7 5 2011-6-12 13:15
8 1 2011-6-12 12:32
(etc)
I want to create a query that shows how many of each type of format occurred by hour of day. The point of this is to see which time of day is busiest. I am trying to write a query whose output I can feed to a simple web charting library (Highcharts.js). I want it to look like this:
Timeofday Subgroup Count
12AM 1 2
12AM 2 6
12AM 3 7
12AM 4 2
12AM 5 0
1AM 1 3
1AM 2 3
1AM 3 0
1AM 4 0
1AM 5 1
(etc)
I'm using this query:
SELECT date_format(referenceTime,'%I %p') AS timeofday,
reference.referenceFormatID AS subgroup,
count(*) AS count
FROM reference
GROUP BY timeofday,subgroup ASC
However, the output skips "rows" where the count equals zero and so ends up looking like this:
Timeofday Subgroup Count
12AM 1 2
12AM 2 6
1AM 3 7
1AM 4 2
1AM 5 1
3AM 1 3
6AM 2 3
7AM 3 1
7AM 4 1
9AM 5 1
(etc)
I need those zeros to be able to create a properly formatted data series for my app.
The LEFT JOIN method where you put all the times into a second table isn't working for me because I am grouping by two different columns. Apparently, the LEFT JOIN criteria is satisfied as long as each hour shows up somewhere in the output table, but I need each hour to appear for each format.
Any suggestions?

You have two options: either create a lookup table with the possible hours in it, or use a somewhat unusual query involving the DUAL table and UNION to generate the values you are looking for.
In the first case, you would have a table with a single field for the moment; let's call the table hours and the field timeofday.
In the hours table, you would have the following data:
timeofday
12AM
1AM
2AM
....
Then your query is as simple as
SELECT hours.timeofday,
reference.referenceFormatID AS subgroup,
count(reference.referenceFormatID) AS count
FROM hours
LEFT JOIN reference on date_format(referenceTime,'%I %p') = hours.timeofday
GROUP BY hours.timeofday,subgroup ASC
EDIT
To get all combinations, you would also need a formats table with all the possible formatIDs as was mentioned by rfausak. You could also do this with a distinct, but let's just assume that you have this table, let's call it formats. Again, this table could have a single column.
Part 1 is to get all the combinations:
SELECT hours.timeofday,
formats.ID
from hours
join formats
This is a Cartesian join that would merge all possible hours and format IDs.
Now we add in the LEFT JOIN
SELECT hours.timeofday,
formats.ID AS subgroup,
count(reference.referenceFormatID) AS count
FROM hours
JOIN formats
LEFT JOIN reference ON date_format(reference.referenceTime,'%I %p') = hours.timeofday
AND reference.referenceFormatID = formats.ID
GROUP BY hours.timeofday,formats.ID ASC
If you try to do it using a DUAL table look up, you can use a method similar to generate days from date range
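The cross join followed by a LEFT JOIN can also be sanity-checked outside the database. The following is only a sketch in Python, not part of the original answer: it uses the eight sample rows from the question, stands in `range(24)` for the hours table and `range(1, 6)` for the formats table (both assumptions), and keys hours as 0-23 rather than as "12AM"-style labels.

```python
from collections import Counter
from datetime import datetime
from itertools import product

# The eight sample rows from the question: (formatID, referenceTime).
rows = [
    (1, "2011-06-12 12:40"),
    (2, "2011-06-12 01:04"),
    (4, "2011-06-12 01:03"),
    (2, "2011-06-12 15:20"),
    (3, "2011-06-12 09:30"),
    (3, "2011-06-12 02:55"),
    (5, "2011-06-12 13:15"),
    (1, "2011-06-12 12:32"),
]

hours = range(24)      # assumption: stands in for the "hours" lookup table
formats = range(1, 6)  # assumption: stands in for the "formats" lookup table

# Count only the (hour, formatID) pairs that actually occur in the data.
counts = Counter(
    (datetime.strptime(ts, "%Y-%m-%d %H:%M").hour, fmt) for fmt, ts in rows
)

# Cross join hours x formats, then look each pair up; a missing pair counts as 0.
series = [(h, f, counts[(h, f)]) for h, f in product(hours, formats)]
```

Every hour appears once per format, so `series` always has 24 x 5 entries, which is the shape a charting library expects for a complete data series.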

Related

Average date of visiting website

I'd like to know the average number of days per week on which users visited, counting only users who have visited website 'A'. If a user hasn't visited website 'A', I exclude that user's data (e.g., id = 2). I also need to restrict the date range to one week (e.g., 01-JAN-2018 to 07-JAN-2018).
Sample input (Table:User)
id date website
1 01-JAN-2018 A
1 03-JAN-2018 B
1 04-JAN-2018 C
1 04-JAN-2018 C
2 03-JAN-2018 C
3 03-JAN-2018 A
3 05-JAN-2018 B
4 05-JAN-2018 A
The first step will look like this:
id date website
1 01-JAN-2018 A
1 03-JAN-2018 B
1 04-JAN-2018 C
1 04-JAN-2018 C
3 03-JAN-2018 A
3 05-JAN-2018 B
4 05-JAN-2018 A
The output should return only the average number of days on which those users visited any website (A, B, or C). In this case, user 1 visited on three days in the week (ignoring duplicates) and user 3 visited on two days. The average is sum(days) / number of users.
My first thought:
SELECT COUNT(Date), Date
FROM user
WHERE id IN (
SELECT id FROM user
WHERE web = 'A'
);
Assume that I only want to consider this week range (01-JAN-2018 to 07-JAN-2018). I want to figure out the average of dates of visiting in one week. Any thoughts for this? Thanks!
Link for Demo
If you want to group by hits in a week, you might try something more like this:
select year(STR_TO_DATE(date,'%d-%b-%Y')) year,
weekofyear(STR_TO_DATE(date,'%d-%b-%Y')) week,
count(*) hits
from user
group by year(STR_TO_DATE(date,'%d-%b-%Y')), weekofyear(STR_TO_DATE(date,'%d-%b-%Y'))
The group by is the key: this will group all the hits for a particular week together, whereas group by date will keep each day separate.
If you want an average for multiple weeks, you would need to use this query as a subquery, and do an average on the count column.
And as was stated in comments, this would be a LOT easier (not to mention more efficient) if the date was stored as a date and not as a varchar
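To make the arithmetic the question asks for concrete, here is a small Python sketch (not from the original answer) that keeps only users who visited website A, counts each user's distinct visit days inside the week, and averages those counts. The week boundaries are the ones given in the question; the variable names are illustrative.

```python
from collections import defaultdict
from datetime import date, datetime

# Sample input from the question: (id, date, website).
visits = [
    (1, "01-JAN-2018", "A"),
    (1, "03-JAN-2018", "B"),
    (1, "04-JAN-2018", "C"),
    (1, "04-JAN-2018", "C"),
    (2, "03-JAN-2018", "C"),
    (3, "03-JAN-2018", "A"),
    (3, "05-JAN-2018", "B"),
    (4, "05-JAN-2018", "A"),
]

week_start, week_end = date(2018, 1, 1), date(2018, 1, 7)

# Users who visited website A at least once.
users_a = {uid for uid, _, site in visits if site == "A"}

# Distinct visit days per qualifying user within the week (a set ignores duplicates).
days_by_user = defaultdict(set)
for uid, raw, _ in visits:
    day = datetime.strptime(raw, "%d-%b-%Y").date()
    if uid in users_a and week_start <= day <= week_end:
        days_by_user[uid].add(day)

average_days = sum(len(d) for d in days_by_user.values()) / len(days_by_user)
```

With this sample, users 1, 3 and 4 qualify with 3, 2 and 1 distinct days respectively, so `average_days` is 2.0.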

Mysql get user list of every week interval

I want to write a query that fetches the users who registered a whole number of weeks before today.
For example, if today's date is 2017-08-17, I need the users who registered on 2017-08-10, 2017-08-03, 2017-07-27 and so on. Likewise, if today's date is 2017-08-20, I need the users who registered on 2017-08-13, 2017-08-06 and so on.
id name date
1 ABC 2018-08-16
2 PQR 2018-08-10
3 LMN 2018-07-27
4 AAA 2018-01-01
Output will be
id name date
2 PQR 2018-08-10
3 LMN 2018-07-27
One way to express this problem is to recognize that we want to retain dates whose difference from today is a multiple of 7 days. We can take the difference between the UNIX timestamps of today and of each record, and check whether that number of seconds, divided by the number of seconds in 7 days, leaves a remainder of zero.
SELECT *
FROM yourTable
WHERE
MOD(UNIX_TIMESTAMP(CURDATE()) -
UNIX_TIMESTAMP(DATE(reg_date)), 7*24*60*60) = 0
SELECT * FROM user WHERE WEEKDAY(`date`) = WEEKDAY(NOW());
This will get you all users that registered 0, 7, 14, 21 etc. days ago.
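The "multiple of 7 days" test is easy to check by hand. The sketch below is an illustration in Python, not part of either answer: it assumes today is 2018-08-17 so that it lines up with the sample table, and unlike the queries above it excludes users who registered today (a difference of 0 days).

```python
from datetime import date

# Sample rows from the question: (id, name, registration date).
users = [
    (1, "ABC", date(2018, 8, 16)),
    (2, "PQR", date(2018, 8, 10)),
    (3, "LMN", date(2018, 7, 27)),
    (4, "AAA", date(2018, 1, 1)),
]

def registered_on_week_interval(registered: date, today: date) -> bool:
    """True when `registered` is 7, 14, 21, ... days before `today`."""
    delta = (today - registered).days
    return delta > 0 and delta % 7 == 0

today = date(2018, 8, 17)  # assumption: chosen to match the sample data
matches = [u for u in users if registered_on_week_interval(u[2], today)]
```

Working in whole days rather than seconds also sidesteps daylight-saving shifts, which can make a timestamp difference fall an hour short of an exact multiple of a week.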

Select rows with the lowest M values, out of N values

I have a table that's structured like so:
id value hour
1 4 176475
2 2 176475
3 3 176475
4 2 176475
1 2 184563
2 1 184563
3 4 184563
4 3 184563
... ... ...
1 2 N
2 3 N
3 1 N
4 4 N
The key property is that the data is split into hours which are in ascending order. The 'hours' are timestamps truncated to enforce 24 buckets per day. I want to do several things:
Pull all of the rows for the first hour
Sum values for each ID over 3 hours, 8 hours...N hours.
Is there a simple way to do this? I am aware that I could use NTILE to label the data but that's a very expensive operation in Spark.
EDIT:
Expected Result for aggregating hours 1-3:
id value
1 9
2 7
3 10
4 8
The values are made up, but the idea is to sum the values of the IDs in each of the 3 hours, so that I have one value per ID, instead of three.
This is the query you're looking for:
SELECT id, SUM(value) as `value`
FROM yourTableHere
WHERE hour between (NOW() - INTERVAL X HOUR) AND NOW()
GROUP BY id
Breaking the query down.
Select the ID and sum the value from yourTable.
Where hour is between X hours ago and now.
Group the results by id, so that each ID gets a single total over the selected hours.
Replace X with 1/3/8 or more for the hours you wish.
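Since the question's `hour` column holds integer bucket numbers rather than timestamps, it may help to see the aggregation written against those buckets directly. This Python sketch is an assumption-laden illustration, not the answer's method: it picks the first N distinct hour values in ascending order and sums `value` per `id` over them. The function name `sum_first_hours` is hypothetical.

```python
from collections import defaultdict

# The first two hour buckets from the question: (id, value, hour).
rows = [
    (1, 4, 176475), (2, 2, 176475), (3, 3, 176475), (4, 2, 176475),
    (1, 2, 184563), (2, 1, 184563), (3, 4, 184563), (4, 3, 184563),
]

def sum_first_hours(rows, n):
    """Sum `value` per id over the n smallest distinct hour buckets."""
    first_hours = set(sorted({hour for _, _, hour in rows})[:n])
    totals = defaultdict(int)
    for id_, value, hour in rows:
        if hour in first_hours:
            totals[id_] += value
    return dict(totals)

first_hour = sum_first_hours(rows, 1)  # rows of the first hour, one total per id
two_hours = sum_first_hours(rows, 2)   # totals over the first two hours
```

Grouping by `id` alone is what yields one row per ID; adding `hour` to the grouping would keep the hours separate.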

SQL query for various time periods

I have a table that contains the following entries:
completed_time | BOOK_CNT
*********************************************
2013-07-23 | 2
2013-07-22 | 1
2013-07-19 | 3
2013-07-16 | 5
2013-07-12 | 4
2013-07-11 | 2
2013-07-02 | 9
2013-06-30 | 5
Now, I want to use above entries for data analysis.
Lets say DAYS_FROM, DAYS_TO and PERIOD are three variables.
I need to fire following sort of queries:
"Total book from DAYS_FROM to DAYS_TO in interval of PERIOD."
DAYS_FROM is a date in format YYYY-MM-DD
,DAYS_TO is a date in format YYYY-MM-DD
PERIOD is {1W,2W,1M,2M,1Y}
where W,M,Y represents WEEK,MONTH and YEAR.
Example: The queries DAYS_FROM=2013-07-23 , DAYS_TO=2013-07-03 and PERIOD=1W should return:
ith week - total
1 - 3
2 - 8
3 - 6
4 - 14
Explanation:
1 - 3 means (the total books from 2013-07-21 (Sun) to 2013-07-23 (Tue) is 3)
2 - 8 means (the total books from 2013-07-14 (Sun) to 2013-07-21 (Sun) is 8)
3 - 6 means (the total books from 2013-07-07 (Sun) to 2013-07-14 (Sun) is 6)
4 - 14 means (the total books from 2013-07-03 (Wed) to 2013-07-07 (Sun) is 14)
Please refer the calendar image for better understanding.
How to fire such query?
What I tried?
SELECT DAY(completed_time), COUNT(total) AS Total
FROM my_tab
WHERE completed_time BETWEEN '2013-07-23' - INTERVAL 1 WEEK AND '2013-07-03'
GROUP BY DAY(completed_time);
The above queries subtracted 7 days from 2013-07-23 and thus considered 2013-07-16 to 2013-07-23 as first week, 2013-07-09 to 2013-07-16 as second week and so on.
A simple starting point would be something like below, of course you may want to adjust the ith value to suit your needs;
SET @period='1M';
SELECT CASE WHEN @period='1Y' THEN YEAR(completed_time)
WHEN @period='1M' THEN YEAR(completed_time)*100+MONTH(completed_time)
WHEN @period='2M' THEN FLOOR((YEAR(completed_time)*100+MONTH(completed_time))/2)*2
WHEN @period='1W' THEN YEARWEEK(completed_time)
WHEN @period='2W' THEN FLOOR(YEARWEEK(completed_time)/2)*2
END ith,
SUM(BOOK_CNT) Total
FROM my_tab
GROUP BY ith
ORDER BY ith DESC;
An SQLfiddle to test with.
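The bucketing idea can be checked against the question's sample data with a short Python sketch. This is an illustration under assumptions, not a translation of the query: it covers only the 1Y, 1M and 1W periods, labels each week by the Sunday that starts it (to match the Sunday boundaries in the question) instead of a YEARWEEK number, and does not apply the DAYS_FROM / DAYS_TO filter.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (completed_time, BOOK_CNT).
rows = [
    (date(2013, 7, 23), 2), (date(2013, 7, 22), 1), (date(2013, 7, 19), 3),
    (date(2013, 7, 16), 5), (date(2013, 7, 12), 4), (date(2013, 7, 11), 2),
    (date(2013, 7, 2), 9), (date(2013, 6, 30), 5),
]

def bucket(d: date, period: str):
    """Map a date to the key of the period it falls in."""
    if period == "1Y":
        return d.year
    if period == "1M":
        return d.year * 100 + d.month
    if period == "1W":
        # weekday() is 0 for Monday and 6 for Sunday; step back to the latest Sunday.
        return d - timedelta(days=(d.weekday() + 1) % 7)
    raise ValueError(f"unsupported period: {period}")

def totals(rows, period):
    """Sum BOOK_CNT per bucket, newest bucket first."""
    out = defaultdict(int)
    for d, cnt in rows:
        out[bucket(d, period)] += cnt
    return sorted(out.items(), reverse=True)

weekly = totals(rows, "1W")
monthly = totals(rows, "1M")
```

On the sample data the weekly totals come out as 3, 8, 6 and 14, newest week first.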

MS Access: Using Single form to enter query parameters in MS access

Compliments of the day.
Based on the previous feedback received, and after creating a ticket sales database in MS Access, I want to use a single form to query the price of a particular ticket in a particular month and have the price displayed back in the form in a text field or label.
Below are sample tables and used query
CompanyTable
CompID CompName
A Ann
B Bahn
C Can
KK Seven
- --
TicketTable
TicketCode TicketDes
10 Two people
11 Monthly
12 Weekend
14 Daily
TicketPriceTable
ID TicketCode Price ValidFrom
1 10 $35.50 8/1/2010
2 10 $38.50 8/1/2011
3 11 $20.50 8/1/2010
4 11 $25.00 11/1/2011
5 12 $50.50 12/1/2010
6 12 $60.50 1/1/2011
7 14 $15.50 2/1/2010
8 14 $19.00 3/1/2011
9 10 $40.50 4/1/2012
Used query:
SELECT TicketPriceTable.Price
FROM TicketPriceTable
WHERE (((TicketPriceTable.ValidFrom)=[DATE01]) AND ((TicketPriceTable.TicketCode)=[TCODE01]));
In MS Access, a small box pops up to enter each parameter when running the query. How can I use a single form to enter the parameters for [DATE01] and [TCODE01] and have the price displayed in the same form in a text field (for further calculations)?
For example:
'Month' field is the input for the [DATE01] parameter
'Ticket Code' field is the input for the [TCODE01] parameter
Text field shows the output of the query (ticket price)
If possible, I would like to use only the month and year in the format MM/YYYY. The day is not necessary. How can I achieve this in MS Access?
If you have any questions, please don't hesitate to ask.
Thanks very much for your time and anticipated feedback.
You can refer to the values in the form fields by using expressions like: [Forms]![NameOfTheForm]![NameOfTheField]
Entering up to 300 different types of tickets
(Answer to your comment referring to Accessing data from a ticket database, based on months in MS Access)
You can use Cartesian products to create a lot of records. If you select two tables in a query but do not join them, the result is a Cartesian product, which means that every record from one table is combined with every record from the other.
Let's add a new table called MonthTable
MonthNr MonthName
1 January
2 February
3 March
... ...
Now if you combine this table containing 12 records with your TicketTable containing 4 records, you will get a result containing 48 records
SELECT M.MonthNr, M.MonthName, T.TicketCode, T.TicketDes
FROM MonthTable M, TicketTable T
ORDER BY M.MonthNr, T.TicketCode
You get something like this
MonthNr MonthName TicketCode TicketDes
1 January 10 Two people
1 January 11 Monthly
1 January 12 Weekend
1 January 14 Daily
2 February 10 Two people
2 February 11 Monthly
2 February 12 Weekend
2 February 14 Daily
3 March 10 Two people
3 March 11 Monthly
3 March 12 Weekend
3 March 14 Daily
... ... ... ...
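The same combination can be illustrated in a few lines of Python, purely as a sketch of what a Cartesian product does. The month list is written out by hand as an assumption about the contents of MonthTable.

```python
from itertools import product

# Assumed contents of MonthTable.
months = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# TicketTable from the question: (TicketCode, TicketDes).
tickets = [(10, "Two people"), (11, "Monthly"), (12, "Weekend"), (14, "Daily")]

# Every month combined with every ticket type, ordered by month then ticket code.
combos = [
    (nr, name, code, des)
    for (nr, name), (code, des) in product(enumerate(months, start=1), tickets)
]
```

Twelve months times four ticket types gives the 48 rows mentioned above.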
You can also get the price actually valid for a ticket type like this
SELECT T.TicketCode, T.Price, X.ActualPeriod AS ValidFrom
FROM (SELECT TicketCode, MAX(ValidFrom) AS ActualPeriod
FROM TicketPriceTable
WHERE ValidFrom <= Date()
GROUP BY TicketCode) X
INNER JOIN TicketPriceTable T
ON X.TicketCode = T.TicketCode AND X.ActualPeriod = T.ValidFrom
The WHERE ValidFrom <= Date() condition covers the case where you have already entered future prices.
Here the subquery selects the actually valid period, i.e. the ValidFrom that applies for each TicketCode. If you find sub-selects a bit confusing, you can also store them as query in Access or as view in MySQL and base a subsequent query on them. This has the advantage that you can create them in the query designer.
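The "latest ValidFrom per TicketCode" logic can be traced by hand with a short Python sketch. It is only an illustration: the dates in TicketPriceTable are assumed to be in month/day/year order, the as-of date of 2011-09-01 is chosen arbitrarily, and `current_prices` is a hypothetical helper name.

```python
from datetime import date

# TicketPriceTable from the question: (TicketCode, Price, ValidFrom),
# assuming the dates are written month/day/year.
prices = [
    (10, 35.50, date(2010, 8, 1)),
    (10, 38.50, date(2011, 8, 1)),
    (11, 20.50, date(2010, 8, 1)),
    (11, 25.00, date(2011, 11, 1)),
    (12, 50.50, date(2010, 12, 1)),
    (12, 60.50, date(2011, 1, 1)),
    (14, 15.50, date(2010, 2, 1)),
    (14, 19.00, date(2011, 3, 1)),
    (10, 40.50, date(2012, 4, 1)),
]

def current_prices(prices, as_of):
    """For each TicketCode, keep the row with the latest ValidFrom <= as_of."""
    latest = {}
    for code, price, valid_from in prices:
        if valid_from > as_of:
            continue  # a future price, not yet in effect
        if code not in latest or valid_from > latest[code][1]:
            latest[code] = (price, valid_from)
    return latest

as_of = date(2011, 9, 1)  # assumption: an arbitrary example date
valid = current_prices(prices, as_of)
```

For ticket code 11, the 11/1/2011 price is skipped because it is still in the future on the chosen date, so the 8/1/2010 price applies.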
Consider not creating all your 300 records physically, but just getting them dynamically from a Cartesian product.
I'll let you put all the pieces together now.
In Access Forms you can set the RecordSource to be a query, not only a table. This can be either the name of a stored query or a SQL statement. This allows you to have controls bound to different tables through this query.
You can also place subforms on the main form that are bound to other tables than the main form.
You can also display the result of an expression in a TextBox by setting its ControlSource to an expression that starts with an equal sign. In a ControlSource, refer to other controls by name in square brackets (Me is only available in VBA code):
=DLookUp("Price", "TicketPriceTable", "TicketCode=" & [cboTicketCode])
You can set the Format of a TextBox to MM\/yyyy or use the format function
s = Format$(Now, "MM\/yyyy")