I have a table containing two DATE columns, TS_customer and TS_verified.
I am searching for a way to get a result where in the first column I have dates where either someone created a user (TS_customer) or someone got verified (TS_verified).
In the second column I want count(TS_customer) grouped by the first column.
In the third column I want count(TS_verified) grouped by the first column.
There might be 0 customers verified on a sign-up date, and in another case 0 sign-ups on a date someone got verified.
I guess it should be an easy one, but I've spent so many hours on it now. I would really appreciate some help. I need this for a graph in Excel, so I basically want how many customers signed up and how many got verified on each day, without the hassle of running two selects and combining them manually.
EDIT: link to SQLfiddle http://sqlfiddle.com/#!2/b14fc/1/0
Thanks
First, we need the list of days.
That looks like this http://sqlfiddle.com/#!2/b14fc/14/0:
SELECT DISTINCT days
FROM (
SELECT DISTINCT DATE(TS_customer) days
FROM customer
UNION
SELECT DISTINCT DATE(TS_verified) days
FROM customer
) AS alldays
WHERE days IS NOT NULL
ORDER BY days
Next we need a summary of customer counts by day. That's pretty easy http://sqlfiddle.com/#!2/b14fc/16/0:
SELECT DATE(TS_customer) days, COUNT(TS_customer)
FROM customer
GROUP BY days
The summary of verifications by day is similarly easy.
Next we need to join these three subqueries together http://sqlfiddle.com/#!2/b14fc/29/0.
SELECT alldays.days, custcount, verifycount
FROM (
SELECT DISTINCT DATE(TS_customer) days
FROM customer
UNION
SELECT DISTINCT DATE(TS_verified) days
FROM customer
) AS alldays
LEFT JOIN (
SELECT DATE(TS_customer) days, COUNT(TS_customer) custcount
FROM customer
GROUP BY days
) AS cust ON alldays.days = cust.days
LEFT JOIN (
SELECT DATE(TS_verified) days, COUNT(TS_verified) verifycount
FROM customer
GROUP BY days
) AS verif ON alldays.days = verif.days
WHERE alldays.days IS NOT NULL
ORDER BY alldays.days
Finally, if you want 0 displayed rather than (null) for days when there weren't any customers and/or verifications, change the SELECT line to this http://sqlfiddle.com/#!2/b14fc/30/0.
SELECT alldays.days,
IFNULL(custcount,0) AS custcount,
IFNULL(verifycount,0) AS verifycount
See how that goes? We build up your result set step by step.
I'm a bit confused about why you created a fiddle that cannot hold null values in TS_customer and then mention that the field can hold null values.
Having said that, I've modified the solution to work with null values and still be pretty efficient and simple:
SELECT days, sum(custCount) custCount, sum(verifCount) verifCount FROM (
SELECT DATE(TS_customer) days, count(*) custCount, 0 verifCount
FROM customer
WHERE TS_customer IS NOT NULL
GROUP BY days
UNION ALL
SELECT DATE(TS_verified) days, 0, count(*)
FROM customer
WHERE TS_verified IS NOT NULL
GROUP BY days
) s
GROUP BY days
I've also created a different fiddle containing some null values here.
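For anyone who wants to try the UNION ALL approach without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption standing in for MySQL, and the three sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; the column names follow the question.
rows = [
    ("2013-07-01 09:00:00", "2013-07-01 10:00:00"),
    ("2013-07-01 12:00:00", "2013-07-03 08:00:00"),
    ("2013-07-02 15:00:00", None),  # signed up, never verified
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (TS_customer TEXT, TS_verified TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)

# One branch counts sign-ups per day, the other counts verifications per day;
# the outer query adds the two branches together per day.
query = """
SELECT days, SUM(custCount) AS custCount, SUM(verifCount) AS verifCount
FROM (
    SELECT DATE(TS_customer) AS days, COUNT(*) AS custCount, 0 AS verifCount
    FROM customer
    WHERE TS_customer IS NOT NULL
    GROUP BY DATE(TS_customer)
    UNION ALL
    SELECT DATE(TS_verified), 0, COUNT(*)
    FROM customer
    WHERE TS_verified IS NOT NULL
    GROUP BY DATE(TS_verified)
) AS s
GROUP BY days
ORDER BY days
"""
daily_counts = conn.execute(query).fetchall()
for row in daily_counts:
    print(row)
```

With this sample data, 2013-07-02 shows a sign-up but zero verifications and 2013-07-03 shows the opposite, which is the case the question asks about.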
Related
I have 2 tables, one with hostels (effectively a single-room hotel with lots of beds), and the other with bookings.
Hostel table: unique ID, total_spaces
Bookings table: start_date, end_date, num_guests, hostel_ID
I need a (My)SQL query to generate a list of all hostels that have at least num_guests free spaces between start_date and end_date.
Logical breakdown of what I'm trying to achieve:
For each hostel:
Get all bookings that overlap start_date and end_date
For each day between start_date and end_date, sum the total bookings for that day (taking into account num_guests for each booking) and compare with total_spaces, ensuring that there are at least num_guests spaces free on that day (if there aren't on any day then that hostel can be discounted from the results list)
Any suggestions on a query that would do this please? (I can modify the tables if necessary)
I built an example for you here, with more comments, which you can test out:
http://sqlfiddle.com/#!9/10219/9
What's probably tricky for you is to join ranges of overlapping dates. The way I would approach this problem is with a DATES table. It's kind of like a tally table, but for dates. If you join to the DATES table, you basically break down all the booking ranges into bookings for individual dates, and then you can filter and sum them all back up to the particular date range you care about. Helpful code for populating a DATES table can be found here: Get a list of dates between two dates and that's what I used in my example.
Other than that, the query basically follows the logical steps you've already outlined.
OK, if you are using MySQL 8.0.2 or above, you can use window functions. In that case you can use the solution below. It does not need to compute the number of guests for each day in the query interval; it only looks at the days on which the number of hostel guests changes. Therefore, no helper table of dates is needed.
with query as
(
select * from bookings where end_date > '2017-01-02' and start_date < '2017-01-05'
)
select hostel.*, bookingsSum.intervalMax
from hostel
join
(
select tmax.id, max(tmax.intervalCount) intervalMax
from
(
select hostel.id, t.dat, sum(coalesce(sum(t.gn),0)) over (partition by t.id order by t.dat) intervalCount
from hostel
left join
(
select id, start_date dat, guest_num as gn from query
union all
select id, end_date dat, -1 * guest_num as gn from query
) t on hostel.id = t.id
group by hostel.id, t.dat
) tmax
group by tmax.id
) bookingsSum on hostel.id = bookingsSum.id and hostel.total_spaces >= bookingsSum.intervalMax + <num_of_people_you_want_accomodate>
demo
It uses a simple trick: each start_date adds +guest_num to the running number of guests and each end_date adds -guest_num. We then do the necessary summations to find the peak number of guests (intervalMax) in the query interval.
If you change '2017-01-05' in my query to '2017-01-06', only two hostels are in the result, and if you use '2017-01-07', only hostel id 3 is in the result, since it does not have any bookings yet.
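The +guest_num / -guest_num trick may be easier to see outside SQL. Below is a rough Python sketch of the same idea for a single hostel. The function name peak_guests, the sample bookings, and the total_spaces and wanted values are all hypothetical, and the end date is assumed to be the checkout day (guests no longer occupy a bed on it).

```python
from collections import defaultdict
from datetime import date

def peak_guests(bookings, query_start, query_end):
    """Peak number of guests among bookings that overlap [query_start, query_end)."""
    deltas = defaultdict(int)
    for start, end, guests in bookings:
        # Same overlap filter as the SQL: end_date > query start, start_date < query end.
        if end > query_start and start < query_end:
            deltas[start] += guests  # guests arrive
            deltas[end] -= guests    # guests leave (end date treated as checkout day)
    running = peak = 0
    for day in sorted(deltas):
        running += deltas[day]       # running total, like SUM(...) OVER (ORDER BY dat)
        peak = max(peak, running)
    return peak

# Hypothetical bookings for one hostel: (start_date, end_date, guests)
bookings = [
    (date(2017, 1, 1), date(2017, 1, 3), 4),
    (date(2017, 1, 2), date(2017, 1, 4), 3),
    (date(2017, 1, 4), date(2017, 1, 6), 5),
]
total_spaces = 10
wanted = 3
peak = peak_guests(bookings, date(2017, 1, 2), date(2017, 1, 5))
print(peak, total_spaces >= peak + wanted)  # prints: 7 True
```

Only the dates on which something changes are visited, so the cost depends on the number of bookings rather than on the length of the query interval.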
This site has answered many a SQL question for me. I finally signed up to ask one of my own and get active here.
Anyway, I'm working in a table that has Date_Effective and Date_Lapse. A client can have multiple rows, so what I'm trying to get to is the number of days between a Date_Lapse and the next Date_Effective for the same client. The date values in this table are ints that I'll convert to dates later.
The below code doesn't work. It doesn't like the second value I'm joining on. Why can't I get it to give me min date_effective that's greater than each date_effective? If I run the below I just get no results because it's seeing it as there are no effective dates greater than the min effective date.
SELECT ClientID, c1.Date_Lapse, c2.Date_Effective
FROM Fact_Episodes c1
LEFT JOIN (
SELECT ClientID, min(Date_Effective) as Date_Effective
FROM Fact_Episodes
GROUP BY ClientID
) c2
ON c1.ClientID = c2.ClientID
AND c2.Date_Effective > c1.Date_Effective
If you wanted to stick with the left join, this would work.
select c1.ClientID, c1.Date_Lapse, min(c2.Date_Effective) as Date_Effective
from Fact_Episodes c1
left join Fact_Episodes c2
on c1.ClientID = c2.ClientID
and c2.Date_Effective > c1.Date_Effective
group by c1.ClientId, c1.Date_Lapse
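Here is a small runnable sketch of that self-join, again using Python's sqlite3 as a stand-in database. The sample rows are invented, and the ints are assumed to be in yyyymmdd form (the question does not say which format they use), so the to_date helper is hypothetical too.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Fact_Episodes "
    "(ClientID INTEGER, Date_Effective INTEGER, Date_Lapse INTEGER)"
)
# Hypothetical rows; dates are assumed to be stored as yyyymmdd integers.
conn.executemany(
    "INSERT INTO Fact_Episodes VALUES (?, ?, ?)",
    [
        (1, 20140101, 20140131),
        (1, 20140210, 20140228),
        (1, 20140315, 20140331),
        (2, 20140105, 20140120),
    ],
)

# For each episode, find the smallest later Date_Effective of the same client.
query = """
SELECT c1.ClientID, c1.Date_Lapse, MIN(c2.Date_Effective) AS Next_Effective
FROM Fact_Episodes c1
LEFT JOIN Fact_Episodes c2
  ON c1.ClientID = c2.ClientID
 AND c2.Date_Effective > c1.Date_Effective
GROUP BY c1.ClientID, c1.Date_Lapse
ORDER BY c1.ClientID, c1.Date_Lapse
"""

def to_date(n):
    # Assumed yyyymmdd encoding, e.g. 20140131 -> 2014-01-31
    return datetime.strptime(str(n), "%Y%m%d").date()

# Gap in days between a lapse and the next effective date (None if there is no next episode).
gaps = [
    (client, (to_date(nxt) - to_date(lapse)).days if nxt is not None else None)
    for client, lapse, nxt in conn.execute(query)
]
print(gaps)
```

The last episode of each client has no later row to join to, so the LEFT JOIN leaves its next effective date as NULL instead of dropping the row.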
I have a query that shows me the number of calls per day for the last 14 days within my app.
The query:
SELECT count(id) as count, DATE(FROM_UNIXTIME(timestamp)) as date FROM calls GROUP BY DATE(FROM_UNIXTIME(timestamp)) DESC LIMIT 14
On days where there were 0 calls, this query does not show those days. Rather than skip those days, I'd like to have a 0 or NULL in that spot.
Any ideas for how I can achieve this? If you have any questions as to what I'm asking please let me know.
Thanks
I don't believe your query is "skipping over NULL values", as your title suggests. Rather, your data probably looks something like this:
id | timestamp
----+------------
1 | 2014-01-01
2 | 2014-01-02
3 | 2014-01-04
As a result, there are no rows that contain the missing date, so there are no rows to be counted. The answer is that you need to generate a list of all the dates you want and then do a LEFT or RIGHT JOIN to it.
Unfortunately, MySQL doesn't make this as easy as other databases. There doesn't seem to be an effective way of generating a list of anything inline. So you'll need some sort of table.
I think I would create a static table containing a set of integers to be subtracted from the current date. Then you can use this table to generate your list of dates inline and JOIN to it.
CREATE TABLE days_ago_list (days_ago INTEGER);
INSERT INTO days_ago_list VALUES
(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13)
;
Then:
SELECT COUNT(id), list_date
FROM (SELECT SUBDATE(CURDATE(), days_ago) AS list_date FROM days_ago_list) dates_to_list
LEFT JOIN (SELECT id, DATE(FROM_UNIXTIME(timestamp)) call_date FROM calls) calls_with_date
ON calls_with_date.call_date = dates_to_list.list_date
GROUP BY list_date
It is very important that you group by list_date; call_date will be NULL for any days without calls. It is also important to COUNT on id since NULL ids will not be counted. (That ensures you get a correct count of 0 for days with no calls.) If you need to change the dates listed, you simply update the table containing the integer list.
Here is a SQL Fiddle demonstrating this.
Alternatively, if this is for a web application, you could generate the list of dates code side and match up the counts with the dates after the query is done. This would make your web app logic somewhat more complicated, but it would also simplify the query and eliminate the need for the extra table.
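A minimal sketch of that code-side alternative, assuming a Python application; the function name fill_missing_days and the sample rows are hypothetical. The query result is turned into a dictionary and any day without an entry falls back to 0.

```python
from datetime import date, timedelta

def fill_missing_days(counts_by_day, last_day, num_days=14):
    """Return (day, count) for each of the num_days ending at last_day, newest first."""
    days = [last_day - timedelta(days=n) for n in range(num_days)]
    # Days that are absent from the query result get a count of 0.
    return [(d, counts_by_day.get(d, 0)) for d in days]

# Hypothetical query result: only days that had calls come back from the database.
query_rows = [(date(2014, 1, 1), 3), (date(2014, 1, 2), 1), (date(2014, 1, 4), 2)]

filled = fill_missing_days(dict(query_rows), last_day=date(2014, 1, 4), num_days=4)
for day, count in filled:
    print(day, count)
```

In a real app last_day would probably be date.today(), and the original GROUP BY query can stay as it is.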
Create a table that contains a row for each date you want to ensure is in the results, then LEFT OUTER JOIN it with the results of your current query. Use the date from that table and the count from your query, substituting 0 where the count is NULL.
I have two tables in my database, and I'm trying to use one query to get data from both for a specific report.
Table one is "Movies" and it has these fields:
Movies_ID
Name
Season
Table two is "Boxoffice" sales income for each movie:
Boxoffice_ID
Movies_ID
Date
Amount
I want to run a query to compare the opening weekends for each movie in a given season and return them as one dataset with the amounts collected added together. So I want to take each movie and get the first three days of box office for each film and add them up so that I get back a query like this
Movie A, 49.1 Million
Movie B, 42.2 Million
Movie C, 29.5 Million
Please note the amount collected only needs to output the number and I'll take care of the formatting. I'm just having trouble figuring out how to only query the first three days of box office for each movie and adding them together.
I know I could run one query and get the movies with box office and then loop over that and re-query the database but I know that with a lot of movies that isn't the most efficient way of doing things. I'm not sure if there is a way to do all of this (first three days of each movie added together) in one query but I wanted to see if someone with more advanced knowledge could help me out.
SELECT a.Name, SUM(COALESCE(b.Amount,0)) totalAmount
FROM Movies a
LEFT JOIN BoxOffice b
ON a.Movies_ID = b.Movies_ID
WHERE b.date BETWEEN DATE_ADD(CURDATE(),INTERVAL -3 DAY) AND CURDATE()
GROUP BY a.Name
If the value of CURDATE() is 2012-11-06 (which is today), it will calculate from 2012-11-03 until 2012-11-06.
Follow-up question: how do you calculate the date? By day? By week? Or something else?
UPDATE 1
SELECT a.Name, SUM(COALESCE(b.Amount,0)) totalAmount
FROM Movies a
LEFT JOIN BoxOffice b
ON a.Movies_ID = b.Movies_ID
LEFT JOIN
(
SELECT movies_ID, MIN(date) minDate
FROM BoxOffice
GROUP BY Movies_ID
) c ON a.Movies_ID = c.Movies_ID
WHERE DATE(b.date) BETWEEN DATE(c.minDate) AND
DATE(DATE_ADD(c.minDate,INTERVAL 2 DAY)) -- BETWEEN is inclusive, so +2 days covers the first three days
GROUP BY a.Name
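To check the opening-window logic on a small data set, here is a sketch using Python's sqlite3 (an assumption standing in for MySQL, so DATE(x, '+2 days') replaces DATE_ADD). The sample movies and amounts are invented. "First three days" is taken to mean the first day with box office plus the two following calendar days; since BETWEEN is inclusive, that is minDate through minDate + 2 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movies (Movies_ID INTEGER, Name TEXT, Season TEXT);
CREATE TABLE BoxOffice (Boxoffice_ID INTEGER, Movies_ID INTEGER, Date TEXT, Amount REAL);
-- Hypothetical sample data
INSERT INTO Movies VALUES (1, 'Movie A', 'Summer'), (2, 'Movie B', 'Summer');
INSERT INTO BoxOffice VALUES
    (1, 1, '2012-06-01', 20.0), (2, 1, '2012-06-02', 15.0),
    (3, 1, '2012-06-03', 10.0), (4, 1, '2012-06-04', 5.0),
    (5, 2, '2012-06-08', 12.0), (6, 2, '2012-06-09', 8.0);
""")

# Subquery o finds each movie's first box-office date; the join to BoxOffice
# then keeps only rows within that date and the two days after it.
query = """
SELECT m.Name, SUM(b.Amount) AS totalAmount
FROM Movies m
JOIN (
    SELECT Movies_ID, MIN(Date) AS minDate
    FROM BoxOffice
    GROUP BY Movies_ID
) o ON o.Movies_ID = m.Movies_ID
JOIN BoxOffice b
  ON b.Movies_ID = m.Movies_ID
 AND b.Date BETWEEN o.minDate AND DATE(o.minDate, '+2 days')
WHERE m.Season = ?
GROUP BY m.Name
ORDER BY totalAmount DESC
"""
opening = conn.execute(query, ("Summer",)).fetchall()
print(opening)
```

Movie A's fourth day (2012-06-04) is left out of its total, which is the behaviour the question asks for.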
Just join the tables on Movies_ID and add a WHERE clause that uses TIMEDIFF to keep only rows whose Date is within 3 days of the issue date.
Assume you have a table with a stock time series on a daily basis.
Now you need to filter one data point per week, because you need weekly data for some analysis. You don't want weekly averages, since this would leave much of the variation out.
This would be my initial approach, but it's not clear which of the data points falling in a given week is selected.
SELECT date, price from stock_series
GROUP BY WEEK(date)
1 How do I make sure it's always the first data point existing for a given week that gets picked?
EDIT:
2 If the above query stayed the way it is, which data point gets chosen every week? What's the MySQL logic in this case? Or is it just unpredictable?
If you want better control over it, you could try using a subquery:
SELECT date,price
FROM stock_series
WHERE date IN
(
SELECT MIN(wk.date)
FROM stock_series wk
GROUP BY WEEK(wk.date)
) GROUP BY date
I've added GROUP BY date in the main query because you probably have more than one entry per day; otherwise it could be omitted.
EDIT:
or try joining with it:
SELECT date,price
FROM stock_series
JOIN
(
SELECT MIN(date) AS innerdate
FROM stock_series
GROUP BY WEEK(date)
) firsts ON date=innerdate;
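A runnable sketch of the join variant, using Python's sqlite3 as a stand-in for MySQL. The sample prices are made up. SQLite has no WEEK() function, so strftime('%Y-%W', date) is used as an assumed equivalent; it also includes the year, which keeps week 2 of one year from being merged with week 2 of another (in MySQL, YEARWEEK() should serve the same purpose).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock_series (date TEXT, price REAL)")
# Hypothetical daily series with gaps; 2024-01-01 was a Monday.
conn.executemany(
    "INSERT INTO stock_series VALUES (?, ?)",
    [
        ("2024-01-02", 10.0), ("2024-01-04", 11.0),  # week of Jan 1
        ("2024-01-08", 12.0), ("2024-01-10", 13.0),  # week of Jan 8
        ("2024-01-17", 14.0),                        # week of Jan 15
    ],
)

# The subquery picks the earliest date present in each year-week;
# joining back returns the full row for exactly those dates.
query = """
SELECT s.date, s.price
FROM stock_series s
JOIN (
    SELECT MIN(date) AS first_date
    FROM stock_series
    GROUP BY strftime('%Y-%W', date)
) w ON s.date = w.first_date
ORDER BY s.date
"""
weekly = conn.execute(query).fetchall()
for row in weekly:
    print(row)
```

Because the row is selected by an explicit MIN(date) rather than by whatever the GROUP BY happens to return, the chosen data point is always the first one that exists in each week.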
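You can order by date ascending, which should give you just the first result of the WEEK() group. Be aware, though, that ORDER BY is applied after the grouping, so MySQL does not guarantee which row of each week supplies date and price.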
SELECT date,price from stock_series
GROUP BY WEEK(date)
ORDER BY date