I am having difficulty writing a MySQL query to categorize my customers. I am categorizing customers based on the number of hits on my website, like this:
New customer with one hits.
New customer with multiple hits.
Old customer
My log table schema is as follows:
Unique customer ID, Current Date, Subscribed, Hits Count
To categorize a customer, how can I compare the current date's customer logs with all the previous dates' logs in a single query?
It's not clear from your description: is customer_id unique?
Or is it the tuple (customer_id,current_date,subscribed,hits_count) that is unique?
If customer_id is unique, then something like this will return the specified result:
SELECT t.customer_id
, CASE
WHEN t.hits_count = 1 AND t.current_date = DATE(NOW())
THEN 'New customer with one hits.'
WHEN t.hits_count > 1 AND t.current_date = DATE(NOW())
THEN 'New customer with multiple hits.'
ELSE 'Old customer'
END AS category
FROM mytable t
If customer_id is not unique, then one way (but not the most efficient way) to get the specified result:
SELECT t.customer_id
, CASE
WHEN t.total_hits_count = 1 AND t.min_current_date = DATE(NOW())
THEN 'New customer with one hits.'
WHEN t.total_hits_count > 1 AND t.min_current_date = DATE(NOW())
THEN 'New customer with multiple hits.'
ELSE 'Old customer'
END AS category
FROM ( SELECT h.customer_id
, MIN(h.current_date) AS min_current_date
, SUM(h.hits_count) AS total_hits_count
FROM mytable h
GROUP BY h.customer_id
) t
The inline view aliased as t gets us unique values for customer_id, along with the earliest current_date, and the total of the hits_count. (You can run just the query inside the parens to verify it's returning the desired result.) The outer query is identical to the first query, with just some renamed columns.
The inline view isn't necessary; you could get an equivalent result (more efficiently) with something like this:
SELECT t.customer_id
, CASE
WHEN SUM(t.hits_count) = 1 AND MIN(t.current_date) = DATE(NOW())
THEN 'New customer with one hits.'
WHEN SUM(t.hits_count) > 1 AND MIN(t.current_date) = DATE(NOW())
THEN 'New customer with multiple hits.'
ELSE 'Old customer'
END AS category
FROM mytable t
GROUP BY t.customer_id
NOTE: There are some corner cases that will cause a customer_id to be categorized as 'Old customer', such as SUM(t.hits_count) < 1, or t.current_date IS NULL, etc.
To specifically test for a row with a current_date before today's date, make a specific test for that in the CASE expression:
SELECT t.customer_id
, CASE
WHEN SUM(t.hits_count) = 1 AND MIN(t.current_date) = DATE(NOW())
THEN 'New customer with one hits.'
WHEN SUM(t.hits_count) > 1 AND MIN(t.current_date) = DATE(NOW())
THEN 'New customer with multiple hits.'
WHEN MIN(t.current_date) < DATE(NOW())
THEN 'Old customer'
ELSE 'Some other category'
END AS category
FROM mytable t
GROUP BY t.customer_id
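As a rough way to try the grouped CASE expression above, here is a minimal sketch that runs it against an in-memory SQLite database from Python instead of MySQL. Everything about the harness is an assumption made for illustration: the sample rows, the column name log_date (standing in for current_date), and a fixed :today parameter used in place of DATE(NOW()) so the result is repeatable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (customer_id INTEGER, log_date TEXT, hits_count INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2024-05-10", 1),  # first seen today, one hit
        (2, "2024-05-10", 3),  # first seen today, several hits
        (3, "2024-05-01", 2),  # first seen on an earlier date
        (3, "2024-05-10", 1),
        (4, "2024-05-10", 0),  # first seen today, zero hits (corner case)
    ],
)

TODAY = "2024-05-10"  # hypothetical fixed date replacing DATE(NOW())

QUERY = """
SELECT t.customer_id,
       CASE
         WHEN SUM(t.hits_count) = 1 AND MIN(t.log_date) = :today
           THEN 'New customer with one hits.'
         WHEN SUM(t.hits_count) > 1 AND MIN(t.log_date) = :today
           THEN 'New customer with multiple hits.'
         WHEN MIN(t.log_date) < :today
           THEN 'Old customer'
         ELSE 'Some other category'
       END AS category
FROM mytable t
GROUP BY t.customer_id
ORDER BY t.customer_id
"""

# Map each customer_id to its category.
categories = dict(conn.execute(QUERY, {"today": TODAY}).fetchall())
for customer_id, category in categories.items():
    print(customer_id, category)
```

Customer 4 shows why the explicit 'Old customer' test matters: with zero hits today it matches none of the three WHEN branches and falls through to the ELSE.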
NOTE
I assumed that the current_date column is of type DATE, not DATETIME or TIMESTAMP. If the column also includes a time component, the equality comparison to DATE(NOW()) returns TRUE only for rows whose time component is exactly midnight (00:00:00).
In that case, we'd prefer to check a range of datetime values, replacing
... AND t.current_date = DATE(NOW())
with something like this:
... AND t.current_date >= DATE(NOW()) AND t.current_date < DATE(NOW()) + INTERVAL 1 DAY
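The difference between the equality test and the half-open range can be seen in a small sketch. It uses SQLite through Python rather than MySQL, so datetime(:today) and datetime(:today, '+1 day') stand in for DATE(NOW()) and DATE(NOW()) + INTERVAL 1 DAY; the column name logged_at, the sample rows, and the fixed date are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer_id INTEGER, logged_at TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "2024-05-10 00:00:00"),  # today, exactly midnight
        (2, "2024-05-10 14:30:00"),  # today, with a time component
        (3, "2024-05-11 00:00:00"),  # tomorrow at midnight
        (4, "2024-05-09 23:59:59"),  # yesterday
    ],
)

TODAY = "2024-05-10"  # hypothetical fixed date replacing DATE(NOW())


def ids(where_clause):
    """Return the customer_ids matching the given WHERE clause."""
    sql = f"SELECT customer_id FROM mytable WHERE {where_clause} ORDER BY customer_id"
    return [row[0] for row in conn.execute(sql, {"today": TODAY})]


# Equality against midnight only matches rows logged at exactly 00:00:00.
equality_ids = ids("logged_at = datetime(:today)")

# The half-open range [today, tomorrow) matches every row logged today.
range_ids = ids(
    "logged_at >= datetime(:today) AND logged_at < datetime(:today, '+1 day')"
)

print(equality_ids, range_ids)
```

The upper bound is exclusive, so a row logged at midnight tomorrow is not counted as today.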
I have a query that looks like this
SELECT customer, totalvolume
FROM orders
WHERE deliverydate BETWEEN '2020-01-01' AND CURDATE()
Is there any way to select totalvolume for a specific date range and make it a separate column?
So for example, I already have totalvolume. I'd like to also add totalvolume for the previous month as a separate column (totalvolume where deliverydate BETWEEN '2020-08-01' AND '2020-08-31'). Is there a function for that?
Simply use 2 table copies:
SELECT t1.customer, t1.totalvolume, t2.totalvolume previousvolume
FROM orders t1
LEFT JOIN orders t2 ON t1.customer = t2.customer
AND t1.deliverydate = t2.deliverydate + INTERVAL 1 MONTH
WHERE t1.deliverydate BETWEEN '2020-08-01' AND '2020-08-31';
You can do it with a CASE/WHEN construct in your columns and just expand your WHERE clause. Sometimes I use secondary @variables to simplify my clauses. Something like
SELECT
o.customer,
sum( case when o.deliveryDate < @beginOfMonth
then o.TotalVolume else 0 end ) PriorMonthVolume,
sum( case when o.deliveryDate >= @beginOfMonth
then o.TotalVolume else 0 end ) ThisMonthVolume,
sum( o.totalvolume ) TwoMonthsVolume
FROM
( select @myToday := date(curdate()),
@beginOfMonth := date_sub( @myToday, interval dayOfMonth( @myToday ) -1 day ),
@beginLastMonth := date_sub( @beginOfMonth, interval 1 month ) ) SqlVars,
orders o
WHERE
o.deliverydate >= @beginLastMonth
group by
o.customer
To start, the subquery in the FROM clause, aliased "SqlVars", dynamically creates 3 variables and returns a single row for that set. With no JOIN condition, that single row is paired 1:1 with every row in the orders table. The nice thing is that you don't have to pre-declare the variables, and the @variables are available to the rest of the query.
By querying for all records on or after the beginning of the LAST month, you get all records for both months in question. The sum( case/when ) can now use those variables as the demarcation point for the respective volume totals.
I know you mentioned this was a simplified query, so this might not be a perfect answer to what you need, but it may help you look at the problem from a different querying perspective.
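The conditional-aggregation idea above can be tried without MySQL user variables. The following is a minimal sketch using SQLite through Python, where the two month boundaries are passed in as named parameters (begin_last_month and begin_of_month, hypothetical names) instead of being computed inside the query; the sample orders are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer TEXT, deliverydate TEXT, totalvolume INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("acme", "2020-07-15", 100),  # before last month, filtered out
        ("acme", "2020-08-05", 10),   # last month
        ("acme", "2020-08-20", 5),    # last month
        ("acme", "2020-09-03", 7),    # this month
        ("bolt", "2020-09-10", 4),    # this month only
    ],
)

# Hypothetical boundaries; in MySQL these would be derived from CURDATE().
params = {"begin_last_month": "2020-08-01", "begin_of_month": "2020-09-01"}

QUERY = """
SELECT o.customer,
       SUM(CASE WHEN o.deliverydate <  :begin_of_month
                THEN o.totalvolume ELSE 0 END) AS PriorMonthVolume,
       SUM(CASE WHEN o.deliverydate >= :begin_of_month
                THEN o.totalvolume ELSE 0 END) AS ThisMonthVolume,
       SUM(o.totalvolume)                      AS TwoMonthsVolume
FROM orders o
WHERE o.deliverydate >= :begin_last_month
GROUP BY o.customer
ORDER BY o.customer
"""

volumes = conn.execute(QUERY, params).fetchall()
for row in volumes:
    print(row)
```

Each order is read once and routed into the matching column by its CASE expression, so no self-join is needed.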
I want to develop a SQL query to check whether a given date falls within at least one document of each document group.
The following is the table:
DocID  UserID  StartDate  EndDate   OfficialName
1      1       10/1/18    10/3/18   A
2      1       10/5/18    10/10/18  A
3      1       10/1/18    10/9/18   B
4      1       10/1/18    10/9/18   C
5      1       10/1/18    10/5/18   D
6      1       10/7/18    10/20/18  D
There are 4 document groups, namely A, B, C and D. I need to check whether a given date falls within at least one of the documents in each group.
e.g. date 10/2/18 is in the first record of A, in B, in C, and in the first record of D, so it passes.
e.g. date 10/4/18 is not in either of the documents in A, hence it fails.
e.g. date 10/8/18 is in the second document of A, in B, in C, and in the second document of D, hence it passes.
e.g. date 10/6/18 is in A but not in D, hence it fails.
Since I have to write this for a given user and date, I have to use an "IN" clause for "OfficialName", but how could I add an "OR" to check that the date is in any of the files in each "OfficialName" group, for all documents of the given user?
Any help is appreciated.
One thing to add that may not be clear: the number of documents per OfficialName is not fixed. It could be one or many.
Aggregate and get the distinct count of groups. If you get 4, you have a match; otherwise you don't.
SELECT count(DISTINCT t.officialname)
FROM elbat t
WHERE t.userid = <given user>
AND t.startdate <= <given date>
AND t.enddate >= <given date>;
You can also add a HAVING count(DISTINCT t.officialname) = 4 to get an empty set if and only if there's no match.
I think you want:
select (case when count(distinct t.officialname) = 4 then 'passed' else 'failed' end) as flag_4groups
from t
where @date >= t.startdate and
@date <= t.enddate and
t.user_id = @user;
If you want this for all users (but a given date):
select t.user_id,
(case when count(distinct t.officialname) = 4 then 'passed' else 'failed' end) as flag_4groups
from t
where @date >= t.startdate and
@date <= t.enddate
group by t.user_id
You can use this:
SELECT count(DISTINCT t.officialname)
FROM elbat t
WHERE @date between t.startDate AND t.enddate and
t.userid = @userId;
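The distinct-count check from the answers above can be tried against the sample table from the question. This is a minimal sketch using SQLite through Python, with the dates rewritten in ISO format so they compare correctly as text. One deliberate variation, which is an assumption and not part of the answers: instead of hard-coding 4, the query compares against the number of distinct groups the user has. The table name docs and the helper date_passes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (docid INTEGER, userid INTEGER, "
    "startdate TEXT, enddate TEXT, officialname TEXT)"
)
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2018-10-01", "2018-10-03", "A"),
        (2, 1, "2018-10-05", "2018-10-10", "A"),
        (3, 1, "2018-10-01", "2018-10-09", "B"),
        (4, 1, "2018-10-01", "2018-10-09", "C"),
        (5, 1, "2018-10-01", "2018-10-05", "D"),
        (6, 1, "2018-10-07", "2018-10-20", "D"),
    ],
)

# Groups covering the date are counted and compared with all of the user's groups.
QUERY = """
SELECT COUNT(DISTINCT t.officialname) =
       (SELECT COUNT(DISTINCT officialname) FROM docs WHERE userid = :user)
FROM docs t
WHERE t.userid = :user
  AND t.startdate <= :day
  AND t.enddate   >= :day
"""


def date_passes(user_id, day):
    """True if `day` falls inside at least one document of every group."""
    row = conn.execute(QUERY, {"user": user_id, "day": day}).fetchone()
    return bool(row[0])


for day in ("2018-10-02", "2018-10-04", "2018-10-08", "2018-10-06"):
    print(day, "passed" if date_passes(1, day) else "failed")
```

The four dates are the examples given in the question, and the outcomes match the ones described there.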
The first CASE expression gives me the correct result, but the second one returns NULL, even though there is a row where counter = 2. Why?
This is the query result: note2 is NULL when I group by date.
SELECT DISTINCT date,log,
CASE
WHEN note = 'HOLIDAY' AND counter = 1
THEN 'HOLIDAY'
END note1,
CASE
WHEN note = 'HOLIDAY' AND counter = 2
THEN 'HOLIDAY'
END note2
FROM timesheet
WHERE timesheet.empid='40' AND date <= CURDATE() AND YEAR(date)= YEAR(CURDATE())
AND MONTH(date) = MONTH(CURDATE())
GROUP BY date
ORDER BY date DESC;
You're using GROUP BY wrong. The rule is that each column in your SELECT clause is either also in your GROUP BY clause or an aggregate function (like count, min, max, avg) must be applied to it.
When you don't follow this rule, an arbitrary row from each group is displayed. In your case, when you really have data with note = 'HOLIDAY' AND counter = 2, the rows for the group might look like this
NULL
HOLIDAY
NULL
NULL
but after the group is collapsed into a single output row, just the first row is displayed, hence the NULL value.
Try it like this:
SELECT date,
MIN(log), /*or maybe you want to group by this column, too? */
MAX(CASE
WHEN note = 'HOLIDAY' AND counter = 1
THEN 'HOLIDAY'
END) note1,
MAX(CASE
WHEN note = 'HOLIDAY' AND counter = 2
THEN 'HOLIDAY'
END) note2
FROM timesheet
WHERE timesheet.empid='40' AND date <= CURDATE() AND YEAR(date)= YEAR(CURDATE())
AND MONTH(date) = MONTH(CURDATE())
GROUP BY date
ORDER BY date DESC;
Also note that I removed the DISTINCT. Your GROUP BY already does that.
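To see why wrapping each CASE in MAX() helps, here is a minimal sketch using SQLite through Python with a made-up timesheet. MAX() ignores NULLs, so a single 'HOLIDAY' anywhere in the group survives the collapse into one row per date. The sample rows are assumptions, and the date filters on CURDATE() are left out to keep the result repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timesheet (empid TEXT, date TEXT, log TEXT, note TEXT, counter INTEGER)"
)
conn.executemany(
    "INSERT INTO timesheet VALUES (?, ?, ?, ?, ?)",
    [
        ("40", "2024-05-01", "08:00", "HOLIDAY", 1),
        ("40", "2024-05-01", "17:00", "HOLIDAY", 2),
        ("40", "2024-05-02", "08:00", None, 1),       # no note on the first log
        ("40", "2024-05-02", "17:00", "HOLIDAY", 2),  # holiday only on counter 2
    ],
)

QUERY = """
SELECT date,
       MIN(log) AS first_log,
       MAX(CASE WHEN note = 'HOLIDAY' AND counter = 1 THEN 'HOLIDAY' END) AS note1,
       MAX(CASE WHEN note = 'HOLIDAY' AND counter = 2 THEN 'HOLIDAY' END) AS note2
FROM timesheet
WHERE empid = '40'
GROUP BY date
ORDER BY date DESC
"""

pivoted = conn.execute(QUERY).fetchall()
for row in pivoted:
    print(row)
```

For 2024-05-02 the counter = 1 row contributes NULL to note2, but the counter = 2 row contributes 'HOLIDAY', and the aggregate keeps the non-NULL value.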
I'm using MySQL. I have a table time_record with four columns: id, time_in, time_out and students_id. I only want to query the student's latest time_in record, but if the student's latest record has a time_out, the query should return nothing. How do I query the student's latest time_in record, given that the student has not yet timed out?
Thanks
Here is my sample code (though it returns both records from timein and timeout)
select concat (
st.student_fname,
' ',
st.student_lname
) as 'Name',
t.students_id,
t.time_in,
t.time_out,
case
when t.time_in > t.time_out
then t.time_in
else t.time_out
end as MostRecentDate
from classes c
join student_classes s on c.id = s.classes_id
join timerecords t on t.students_id = s.students_id
join students st on s.students_id = st.student_id
where c.employees_id = 'sessionvalue2'
and date (t.time_in) between date (now()) and date (now())
From what I understand, you want to query all results in which time_in is the latest entry and exclude results where time_out is the latest entry.
Try this:
SELECT DISTINCT(tr.id), tr.time_in
FROM time_record tr
WHERE tr.time_in > tr.time_out
ORDER BY tr.time_in DESC
I think your time_out column may contain NULLs. In that case the comparison t.time_in > t.time_out does not evaluate to true, so the CASE falls through to the ELSE branch and returns time_out.
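The NULL behaviour described above can be reproduced with a small sketch, using SQLite through Python as a stand-in for MySQL. The sample rows are made up, and the null_safe column is only one possible way to handle a missing time_out, shown as an assumption and not as the answer's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE time_record (id INTEGER, time_in TEXT, time_out TEXT)")
conn.executemany(
    "INSERT INTO time_record VALUES (?, ?, ?)",
    [
        (1, "2024-05-10 08:00:00", "2024-05-10 09:00:00"),  # timed out
        (2, "2024-05-10 10:00:00", None),                   # still timed in
    ],
)

QUERY = """
SELECT id,
       -- NULL comparison is not true, so this falls through to ELSE (time_out).
       CASE WHEN time_in > time_out THEN time_in ELSE time_out END AS naive,
       -- Treat a missing time_out explicitly (hypothetical fix).
       CASE WHEN time_out IS NULL OR time_in > time_out
            THEN time_in ELSE time_out END AS null_safe
FROM time_record
ORDER BY id
"""

most_recent = conn.execute(QUERY).fetchall()
for row in most_recent:
    print(row)
```

For the second row the naive expression yields NULL, because the ELSE branch returns the NULL time_out, while the null_safe expression yields the time_in value.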
Also,
date (t.time_in) between date (now()) and date (now())
doesn't make any sense. If you want to check the time_in for today's date, do:
date(t.time_in) = curdate();
Use curdate() instead of date(now()).
Or, to keep the condition sargable so that an index on time_in (if any) can be used:
t.time_in >= curdate()
and t.time_in < date_add(curdate(), interval 1 day)
Try flipping the condition like this:
select concat (
st.student_fname,
' ',
st.student_lname
) as 'Name',
t.students_id,
t.time_in,
t.time_out,
case
when t.time_in < t.time_out
then t.time_out
else t.time_in
end as MostRecentDate
from classes c
join student_classes s on c.id = s.classes_id
join timerecords t on t.students_id = s.students_id
join students st on s.students_id = st.student_id
where c.employees_id = 'sessionvalue2'
and t.time_in >= curdate()
and t.time_in < date_add(curdate(), interval 1 day)
I currently have a query that finds all rows (with status=0) that have occurred before now:
SELECT id, COUNT(1) FROM tbl WHERE status = 0 AND date < UNIX_TIMESTAMP() GROUP BY id;
However, now I'd also like to be able to retrieve the values on the other side of this--i.e., I want to get all dates available after and before now, as two distinct values.
Is there any way to optimize this besides simply running two separate queries?
SELECT id
, SUM(date < UNIX_TIMESTAMP()) AS BeforeNow
, SUM(date > UNIX_TIMESTAMP()) AS AfterNow
FROM tbl
WHERE status = 0
GROUP BY id;
date < UNIX_TIMESTAMP() is a boolean expression, which evaluates to 1 or 0. The SUM of the expression is equal to the number of times it was true, i.e. its count.
You can do a conditional count.
SELECT id,
COUNT(CASE WHEN date < UNIX_TIMESTAMP() THEN 1 ELSE null END ) AS BeforeNow,
COUNT(CASE WHEN date > UNIX_TIMESTAMP() THEN 1 ELSE null END ) AS AfterNow
FROM tbl
WHERE status = 0
GROUP BY id
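Both forms, SUM over a boolean expression and COUNT over a CASE that yields NULL, can be compared side by side. The sketch below uses SQLite through Python, where comparisons also evaluate to 1 or 0; the sample rows and the fixed :now value (standing in for UNIX_TIMESTAMP()) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, status INTEGER, date INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 0, 100),  # before now
        (1, 0, 200),  # before now
        (1, 0, 300),  # after now
        (1, 1, 50),   # status <> 0, filtered out
        (2, 0, 400),  # after now
    ],
)

NOW = 250  # hypothetical fixed timestamp replacing UNIX_TIMESTAMP()

QUERY = """
SELECT id,
       SUM(date < :now)                         AS before_sum,
       SUM(date > :now)                         AS after_sum,
       COUNT(CASE WHEN date < :now THEN 1 END)  AS before_count,
       COUNT(CASE WHEN date > :now THEN 1 END)  AS after_count
FROM tbl
WHERE status = 0
GROUP BY id
ORDER BY id
"""

counts = conn.execute(QUERY, {"now": NOW}).fetchall()
for row in counts:
    print(row)
```

The two techniques return the same numbers; a row whose date equals the current timestamp exactly is counted in neither column.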