MySQL group by calculated field - mysql

I have members that have to pay for their membership. And I store: payment date, and membership length (they can pay for 1 month or several).
Now I'd like to know which payments are overdue, or soon to be.
My logic was: get an expiration date for each membership (last payment date + membership length) and then just look at the highest value of that for each member.
Here's my query, but I did want to explain my reasoning, as you may want to question that or even the format of the DB (but please don't ask me to store the expiration date).
SELECT tbl.company AS company,
MAX(ADDDATE( paydate, INTERVAL paylength MONTH )) AS expiration,
tbl.id
FROM tblPayments
JOIN tbl ON tblPayments.comp_id = tbl.id
GROUP BY expiration
ORDER BY expiration ASC
I've read that grouping by calculated fields may not be possible, but my knowledge of MySQL is not strong enough to understand the workarounds. I'd appreciate any help you can provide! Thanks!

You are grouping by the aggregated result of your own grouping, which is not possible and not what you intended. Based on your explanation, I think you are trying to do this:
SELECT
tbl.company AS company,
MAX(ADDDATE( paydate, INTERVAL paylength MONTH )) AS expiration,
tbl.id
FROM tblPayments
JOIN tbl
ON tblPayments.comp_id = tbl.id
GROUP BY tbl.company , tbl.id
HAVING expiration < NOW()
ORDER BY expiration ASC
And @dleiftah brings up a good point: you cannot always use the alias, although MySQL seems to let you use it in a GROUP BY, whereas MSSQL never does. I forget exactly when...
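To see the "group by the member, aggregate the expiration" idea run end to end, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, so `date(paydate, '+N months')` replaces `ADDDATE(... INTERVAL ... MONTH)`, and a fixed cutoff date replaces `NOW()`. The sample rows and company names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER PRIMARY KEY, company TEXT);
CREATE TABLE tblPayments (comp_id INTEGER, paydate TEXT, paylength INTEGER);
INSERT INTO tbl VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO tblPayments VALUES
  (1, '2023-01-15', 1),   -- expires 2023-02-15
  (1, '2023-02-15', 3),   -- expires 2023-05-15
  (2, '2023-01-01', 12);  -- expires 2024-01-01
""")

# Group by the member, take the latest expiration, keep only expired members.
# The aggregate is repeated in HAVING instead of relying on the alias.
rows = conn.execute("""
SELECT tbl.company,
       MAX(date(paydate, '+' || paylength || ' months')) AS expiration,
       tbl.id
FROM tblPayments
JOIN tbl ON tblPayments.comp_id = tbl.id
GROUP BY tbl.company, tbl.id
HAVING MAX(date(paydate, '+' || paylength || ' months')) < ?
ORDER BY expiration ASC
""", ("2023-06-01",)).fetchall()

print(rows)
```

Only the member whose latest expiration falls before the cutoff comes back; earlier payments of the same member do not produce extra rows because the grouping is on the member, not on the computed date.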

Related

MySQL - Hard Query with Max, Datediff, Subquery, Distinct/Limit

In short: in MySQL, I need to fetch the companies that have been inactive for a while (365 days in the fiddle example).
How do I check this? Each company has at least one contact, each contact is related to an event, and each event has (many) subevents. That last table holds the date of the last activity. The number of days after which a company counts as inactive is chosen by the user; I have no problem doing this calculation:
sql.Append("where DATEDIFF(CURDATE(),DATE(lastdate)) > " +days.ToString()+ "
The problem is that this checks ALL the subevents, so it tests every date rather than only the last one, which gives bad output.
I was thinking of using subqueries to get either the max date of the subevents of a contact or the max date of the subevents of an event.
Then, with a friend, we got close with something like this, but the query never finishes.
select * from subevent se
where DATEDIFF(CURDATE(),DATE(
(select se2.dates from subevent se2
where se2.dates in
(select max(se3.dates)
from subevent se3
where se.idev = se3.idev)
group by se2.dates)));
I'm stuck and I would appreciate the help...
I tried GROUP BY, subqueries and MAX (obviously MAX is necessary, but I don't know where to apply it...)
https://www.db-fiddle.com/f/wgSQGn7Z26tHnwm6nAaNSA/8
(On the Fiddle link, should only bring the companyname2 and companyname4)
You can use aggregation to get the last subevent date for each company. Then filter using a having clause:
select c.idcomp
from contact c join
events e
on e.idcont = c.idcont join
subevent se
on se.idev = e.idev
group by c.idcomp
having max(se.date) < current_date - interval 365 day;
Here is a db-fiddle.
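A minimal sketch of that aggregate-then-filter pattern, run with Python's built-in sqlite3 instead of MySQL. The table layout is guessed from the queries in the question (the date column is called `dates` here, as in the asker's attempt), the rows are invented, and "today" is pinned to a fixed date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (idcont INTEGER, idcomp INTEGER);
CREATE TABLE events (idev INTEGER, idcont INTEGER);
CREATE TABLE subevent (idsub INTEGER, idev INTEGER, dates TEXT);
INSERT INTO contact VALUES (1, 10), (2, 20);
INSERT INTO events VALUES (100, 1), (200, 2);
INSERT INTO subevent VALUES
  (1, 100, '2020-01-01'), (2, 100, '2023-05-01'),  -- company 10: recent activity
  (3, 200, '2020-03-01'), (4, 200, '2021-01-01');  -- company 20: old activity only
""")

today = "2023-06-01"  # fixed stand-in for CURRENT_DATE

# One row per company; HAVING looks only at the latest subevent date.
inactive = conn.execute("""
SELECT c.idcomp
FROM contact c
JOIN events e ON e.idcont = c.idcont
JOIN subevent se ON se.idev = e.idev
GROUP BY c.idcomp
HAVING MAX(se.dates) < date(?, '-365 days')
""", (today,)).fetchall()

print(inactive)
```

Company 10 has old subevents too, but it is not reported because its most recent one is within the last 365 days. That is the difference from filtering each subevent row in WHERE.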

COUNT() domain names in emails based on the current month returning all records

I have a query as such
SELECT right(accounts.username, length(accounts.username)-
INSTR(accounts.username, '@')) domain,
COUNT(*) email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE (tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE())))
GROUP BY domain
ORDER BY email_count DESC
I have a ticket table that I LEFT JOIN to associate the user accounts of that ticket to get the email(username) of that user.
I am trying to count the users email and how many tickets appear with a particular domain name of that user for the current MONTH. Problem is that it is ignoring the MONTH and returning all records that match.
For instance
yahoo.com 3,356
gmail.com 1,345
If I do a search for all records I get these numbers, but it should be much lower if it is just for the month. I am using UNIX timestamps for this.
Can anyone help me?
If you consider the UNIX_TIMESTAMP(MONTH(CURRENT_DATE())) expression:
MONTH(CURRENT_DATE()) => 1
UNIX_TIMESTAMP(1) => this should result either in an error (1292 incorrect datetime value) or warning of the same and 0 as a result, depending on whether strict sql mode is enabled.
Since you wrote the query returns all records, strict sql mode must be turned off, which can cause issues like this. It would have been easier to get a straight error message.
If you want to return records from the current month, then you can use the following expression, where I used year() and month() functions to get current year and month and concatenated 1 to it to get the 1st day of the month:
tickets.timestamp >= UNIX_TIMESTAMP(CONCAT(YEAR(CURRENT_DATE()),'-',MONTH(CURRENT_DATE()),'-','1'))
WHERE tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE()))
This expression probably does not do what you think. MONTH() returns the number of the month (1 to 12), while you want the beginning of the current month.
You can use the following expression to compute the beginning of the month:
date_format(current_date(), '%Y-%m-01')
In your condition:
where tickets.timestamp >= unix_timestamp(date_format(current_date(), '%Y-%m-01'))
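To make the difference concrete, here is a small Python sketch of the same comparison. The helper name `month_start_timestamp`, the pinned "now", the UTC time zone, and the three ticket timestamps are all assumptions made for the illustration; MySQL itself would use the session time zone.

```python
from datetime import datetime, timezone

def month_start_timestamp(now):
    """Return the Unix timestamp of 00:00 on the first day of now's month."""
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp())

# Fixed "current" moment so the example is reproducible (assumed, UTC).
now = datetime(2024, 3, 17, 14, 30, tzinfo=timezone.utc)
cutoff = month_start_timestamp(now)

# Hypothetical ticket timestamps: one from February, two from March 2024.
tickets = [
    int(datetime(2024, 2, 28, 12, 0, tzinfo=timezone.utc).timestamp()),
    int(datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc).timestamp()),
    int(datetime(2024, 3, 10, 9, 0, tzinfo=timezone.utc).timestamp()),
]

# Comparing against a tiny number such as the month number keeps every row.
kept_by_month_number = [t for t in tickets if t >= now.month]
# Comparing against the start-of-month timestamp keeps only the March rows.
kept_by_cutoff = [t for t in tickets if t >= cutoff]

print(len(kept_by_month_number), len(kept_by_cutoff))
```

Any real Unix timestamp is far larger than 12, so a condition against the month number can never exclude anything.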
Modified for only current month:
SELECT
RIGHT(accounts.username, length(accounts.username)-INSTR(accounts.username, '@')) AS domain, COUNT(1) AS email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE
YEAR(FROM_UNIXTIME(tickets.timestamp)) = YEAR(NOW())
AND MONTH(FROM_UNIXTIME(tickets.timestamp)) = MONTH(NOW())
GROUP BY domain
ORDER BY email_count DESC

COUNT number distinct when a row hasn't existed before the time period

I have kind of an interesting situation that I will try my best to explain.
I have a table called appointments that holds the many appointments a salesperson can have with a potential customer. The relationship between appointments and salespeople is many to one, and it is the same for potential customers.
I need to count how many appointments a salesperson has set with a lead when that salesperson has never set an appointment with that lead before.
Here is how far I have gotten in the code (I'm trying to see how many appointments a salesperson set yesterday, hence the date scrub):
SELECT COUNT(DISTINCT lead)
FROM appointments
WHERE status = 3
and DATE(appointment_created_at) = CURDATE() - interval 1 day
AND creator = 'xxx';
(the column creator represents the individual sales person and the column lead represents the individual potential customer)
The problem with this SQL query is that if a salesperson is resetting an appointment with a lead they have already set an appointment with, it still counts it as a "set appointment".
How can I count the number of rows in my appointments table without counting leads who have already been set before?
You can utilize NOT EXISTS() to check if an appointment already exists earlier or not.
SELECT COUNT(DISTINCT a1.lead)
FROM appointments a1
WHERE a1.status = 3
and a1.appointment_created_at >= CURRENT_DATE() - INTERVAL 1 DAY
AND a1.appointment_created_at < CURRENT_DATE()
AND a1.creator = 'xxx'
AND NOT EXISTS (SELECT 1
FROM appointments a2
WHERE a2.creator = 'xxx'
AND a2.lead = a1.lead
AND a2.appointment_created_at < a1.appointment_created_at)
For good performance, for the Correlated subquery in the NOT EXISTS() portion, you can use the following composite index: (creator, lead, appointment_created_at)
And, for the main select query, you can add the following composite index: (creator, status, appointment_created_at)
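Here is a small runnable sketch of the NOT EXISTS check, using Python's built-in sqlite3 rather than MySQL. The column is named `lead_id` instead of `lead` for the demo, the date bounds are passed in as fixed parameters instead of being derived from CURRENT_DATE(), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE appointments (
  id INTEGER PRIMARY KEY, creator TEXT, lead_id INTEGER,
  status INTEGER, appointment_created_at TEXT);
INSERT INTO appointments (creator, lead_id, status, appointment_created_at) VALUES
  ('xxx', 1, 3, '2023-05-01 09:00:00'),  -- earlier appointment with lead 1
  ('xxx', 1, 3, '2023-05-31 10:00:00'),  -- yesterday, but a repeat
  ('xxx', 2, 3, '2023-05-31 11:00:00'),  -- yesterday, first time
  ('yyy', 3, 3, '2023-05-31 12:00:00');  -- another salesperson
""")

# "Yesterday" is the half-open range [2023-05-31, 2023-06-01) in this demo.
count = conn.execute("""
SELECT COUNT(DISTINCT a1.lead_id)
FROM appointments a1
WHERE a1.status = 3
  AND a1.appointment_created_at >= ?
  AND a1.appointment_created_at < ?
  AND a1.creator = ?
  AND NOT EXISTS (SELECT 1
                  FROM appointments a2
                  WHERE a2.creator = a1.creator
                    AND a2.lead_id = a1.lead_id
                    AND a2.appointment_created_at < a1.appointment_created_at)
""", ("2023-05-31 00:00:00", "2023-06-01 00:00:00", "xxx")).fetchone()[0]

print(count)
```

Lead 1 is excluded because the same salesperson already had an appointment with it on 2023-05-01, so only lead 2 is counted.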
If you want the number of "first-time" appointments, you can use row_number() or a correlated subquery:
SELECT COUNT(*)
FROM appointments a
WHERE a.status = 3 AND
a.appointment_created_at >= CURDATE() - interval 1 day AND
a.appointment_created_at < CURDATE() AND
a.creator = 'xxx' AND
a.appointment_created_at = (SELECT MIN(a2.appointment_created_at)
FROM appointments a2
WHERE a2.creator = a.creator AND
a2.lead = a.lead
);
Notice that I changed the date comparisons so an index can be used for the WHERE clause. If you care about performance, you want indexes on:
appointments(creator, status, appointment_created_at, lead)
appointments(creator, lead, appointment_created_at).
If the sales people can reschedule appointments then you are going to need an additional field to store original appointment date, at least. There are other more complex solutions, but this is probably the easiest approach.

MySQL - get users who placed 25th order during period

I have users and orders tables with this structure (simplified for question):
USERS
userid
registered(date)
ORDERS
id
date (order placed date)
user_id
I need to get an array of users (array of userid) who placed their 25th order during a specified period (for example in May 2019), the date of the 25th order for each user, and the number of days it took to place the 25th order (difference between the user's registration date and the date the 25th order was placed).
For example, if a user registered in April 2018, then placed 20 orders in 2018, and then placed orders 21-30 in Jan-May 2019, this user should be in the array if he placed his 25th order (overall for his account) in May 2019.
How can I do this with a MySQL request?
Sample data and structure: http://www.sqlfiddle.com/#!9/998358 (for testing you can get 3rd order as ex., not 25th, to not add a lot of sample data records).
A single request is not required: if this can't be done in one request, a few are fine.
You can use a correlated subquery to get the count of orders placed before the current one by a user. If that's 24 the current order is the 25th. Then check if the date is in the desired range.
SELECT o1.user_id,
o1.date,
datediff(o1.date, u1.registered)
FROM orders o1
INNER JOIN users u1
ON u1.userid = o1.user_id
WHERE (SELECT count(*)
FROM orders o2
WHERE o2.user_id = o1.user_id
AND (o2.date < o1.date
OR (o2.date = o1.date
AND o2.id < o1.id))) = 24
AND o1.date >= '2019-01-01'
AND o1.date < '2019-06-01';
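A scaled-down sketch of the same counting idea, looking for the 3rd order instead of the 25th, as the question suggests for testing. It runs on Python's built-in sqlite3, so `julianday()` stands in for MySQL's `datediff()`; the date column is called `date_` and the rows are made up. Note the parentheses around the date/id tie-break: AND binds tighter than OR, so without them the tie-break branch would match orders from other users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, registered TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, date_ TEXT, user_id INTEGER);
INSERT INTO users VALUES (1, '2018-04-01'), (2, '2019-01-01');
INSERT INTO orders (id, date_, user_id) VALUES
  (1, '2018-05-01', 1), (2, '2018-06-01', 1), (3, '2019-05-10', 1),
  (4, '2019-05-10', 2), (5, '2019-05-11', 2);
""")

N = 3  # use 25 for the real question

# An order is the Nth one when exactly N - 1 orders of the same user precede it.
rows = conn.execute("""
SELECT o1.user_id,
       o1.date_,
       CAST(julianday(o1.date_) - julianday(u1.registered) AS INTEGER) AS days
FROM orders o1
JOIN users u1 ON u1.userid = o1.user_id
WHERE (SELECT COUNT(*)
       FROM orders o2
       WHERE o2.user_id = o1.user_id
         AND (o2.date_ < o1.date_
              OR (o2.date_ = o1.date_ AND o2.id < o1.id))) = ?
  AND o1.date_ >= '2019-05-01'
  AND o1.date_ < '2019-06-01'
""", (N - 1,)).fetchall()

print(rows)
```

User 2 has only two orders, so only user 1 is returned, together with the date of the 3rd order and the number of days since registration.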
The basic inefficient way of doing this would be to get the user_id for every row in ORDERS where the date is in your target range AND the count of rows in ORDERS with the same user_id and a lower date is exactly 24.
This can get very ugly, very quickly, though.
If you're calling this from code you control, can't you do it from the code?
If not, there should be a way to assign to each row an index describing its rank among orders for its specific user_id, and select from this all user_id from rows with an index of 25 and a correct date. This will give you a select from select from select, but it should be much faster. The difficulty here is to control the order of the rows, so here are the selects I envision:
Select all rows, order by user_id asc, date asc, union-ed to nothing from a table made of two vars you'll initialize at 0.
from this, select all while updating a var to know if a row's user_id is the same as the last, and adding a field that will report so (so for each user_id the first line in order will have a specific value like 0 while the other rows for the same user_id will have a 1)
from this, select all plus a field that equals itself plus one in case the first added field is 1, else 0
from this, select the user_id from the rows where the second added field is 25 and the date is in range.
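If the ranking is done in application code instead, the steps above can be sketched roughly as follows. This is plain Python over a made-up list of order rows; the function name `nth_order_per_user` and the `prev_usr` / `rank_usr` variables are only illustrative, chosen to mirror the two variables described in the steps.

```python
def nth_order_per_user(orders, n):
    """Return (user_id, date_) for each user's n-th order, oldest first."""
    # Step 1: order the rows by user, then date, then id as a tie-break.
    ordered = sorted(orders, key=lambda o: (o["user_id"], o["date_"], o["id"]))
    prev_usr = None   # user_id of the previous row
    rank_usr = 0      # position of the current row among its user's orders
    hits = []
    for order in ordered:
        # Step 2: is this row for the same user as the previous one?
        same_usr = order["user_id"] == prev_usr
        # Step 3: continue the count for the same user, restart otherwise.
        rank_usr = rank_usr + 1 if same_usr else 1
        prev_usr = order["user_id"]
        # Step 4: keep the row whose rank is the one we want.
        if rank_usr == n:
            hits.append((order["user_id"], order["date_"]))
    return hits

# Hypothetical sample rows; n = 3 stands in for 25 to keep the data small.
orders = [
    {"id": 1, "user_id": 1, "date_": "2018-05-01"},
    {"id": 2, "user_id": 1, "date_": "2018-06-01"},
    {"id": 3, "user_id": 1, "date_": "2019-05-10"},
    {"id": 4, "user_id": 2, "date_": "2019-05-10"},
    {"id": 5, "user_id": 2, "date_": "2019-05-11"},
]

third_in_may = [(u, d) for u, d in nth_order_per_user(orders, 3)
                if "2019-05-01" <= d < "2019-06-01"]
print(third_in_may)
```

This makes a single pass over the sorted rows, which is the point of the ranking approach compared with counting earlier orders once per row.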
The union thingy is only necessary if you need to do it all in one request (you have to initialize them in a lower select than the one they're used in).
Edit: Well if you need the date too you can just select it along with the user_id, but calculating the number of days in sql will be a pain. Just join the result table to the users table and get both the date of 25th order and their date of registration, you'll surely be able to do the difference in code.
I'll try building an actual request, however if you want to truly understand what you need to make this you gotta read up on mysql variables, unions, and conditional statements.
"Looks too complicated. I am sure that this can be done with current DB structure and 1-2 requests." Well, yeah. Use the COUNT request, it will be easy, and slow as hell.
For the complex answer, see http://www.sqlfiddle.com/#!9/998358/21
Since you can use multiple requests, you can just initialize the vars first.
It isn't actually THAT complicated, you just have to understand how to concretely express what you mean by "a user's 25th order" to a SQL engine.
See http://www.sqlfiddle.com/#!9/998358/24 for the difference in days, turns out there's a method for that.
Edit 5: seems you're going with the COUNT method. I'll pray your DB is small.
Edit 6: For posterity:
The count method will take years on very large databases. Since OP didn't come back, I'm assuming his is small enough to overlook query speed. If that's not your case and let's say it's 10 years from now and the sqlfiddle links are dead; here's the two-queries solution:
SET @PREV_USR:=0;
SELECT user_id, date_ FROM (
SELECT user_id, date_, SAME_USR AS IGNORE_SMUSR,
@RANK_USR:=(CASE SAME_USR WHEN 0 THEN 1 ELSE @RANK_USR+1 END) AS RANK FROM (
SELECT orders.*, CASE WHEN @PREV_USR = user_id THEN 1 ELSE 0 END AS SAME_USR,
@PREV_USR:=user_id AS IGNORE_USR FROM
orders
ORDER BY user_id ASC, date_ ASC, id ASC
) AS DERIVED_1
) AS DERIVED_2
WHERE RANK = 25 AND YEAR(date_) = 2019 AND MONTH(date_) = 4 ;
Just change RANK = ? and the conditions to fit your needs. If you want to fully understand it, start by the innermost SELECT then work your way high; this version fuses the points 1 & 2 of my explanation.
Now sometimes you will have to use an API or something that won't let you keep variable values in memory unless you commit, or that has some other restriction, and you'll need to do it in one query. To do that, you put the initialization one step lower and make it so it does not affect the higher statements. IMO the best way to do this is in a UNION with a fake table where the only row is excluded. You'll avoid the hassle of a JOIN and it's just better overall.
SELECT user_id, date_ FROM (
SELECT user_id, date_, SAME_USR AS IGNORE_SMUSR,
@RANK_USR:=(CASE SAME_USR WHEN 0 THEN 1 ELSE @RANK_USR+1 END) AS RANK FROM (
SELECT DERIVED_4.*, CASE WHEN @PREV_USR = user_id THEN 1 ELSE 0 END AS SAME_USR,
@PREV_USR:=user_id AS IGNORE_USR FROM
(SELECT * FROM orders
UNION
SELECT * FROM (
SELECT (@PREV_USR:=0) AS INIT_PREV_USR, 0 AS COL_2, 0 AS COL_3
) AS DERIVED_3
WHERE INIT_PREV_USR <> 0
) AS DERIVED_4
ORDER BY user_id ASC, date_ ASC, id ASC
) AS DERIVED_1
) AS DERIVED_2
WHERE RANK = 25 AND YEAR(date_) = 2019 AND MONTH(date_) = 4 ;
With that method, the thing to watch for is the amount and the type of columns in your basic table. Here orders' first field is an int, so I put INIT_PREV_USR in first then there are two more fields so I just add two zeroes with names and call it a day. Most types work, since the union doesn't actually do anything, but I wouldn't try this when your first field is a blob (worst comes to worst you can use a JOIN).
You'll note this is derived from a method of pagination in mysql. If you want to apply this to other engines, just check out their best pagination calls and you should be able to work things out.
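On engines with window functions (MySQL 8.0 and later has ROW_NUMBER(), and so does SQLite 3.25+), the per-user rank can be written without session variables. The sketch below is a swapped-in alternative, not the variable method above; it runs on Python's built-in sqlite3 with invented rows, and uses the 3rd order instead of the 25th to keep the sample small. The alias `order_rank` is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, date_ TEXT, user_id INTEGER);
INSERT INTO orders (id, date_, user_id) VALUES
  (1, '2018-05-01', 1), (2, '2018-06-01', 1), (3, '2019-05-10', 1),
  (4, '2019-05-10', 2), (5, '2019-05-11', 2),
  (6, '2019-04-01', 3), (7, '2019-04-02', 3), (8, '2019-06-03', 3);
""")

# Number each user's orders from oldest to newest, then pick rank N in range.
rows = conn.execute("""
SELECT user_id, date_
FROM (
    SELECT user_id, date_,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY date_, id) AS order_rank
    FROM orders
) AS ranked
WHERE order_rank = ?
  AND date_ >= '2019-05-01'
  AND date_ < '2019-06-01'
""", (3,)).fetchall()

print(rows)
```

User 3 also has a 3rd order, but it falls in June, so only user 1 is returned for May 2019.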

Query with three tables, no common column

I've just started a job and my boss wants me to learn MySQL, so please bear with me; I've been learning for only 2 days and I'm not that good at it yet.
So I've been given 3 tables and several tasks to do.
The tables are:
mobile_log_messages_sms
mobile_providers
service_instances
And with them I've got to:
Find out how many messages there were in the last 25 days and how much income they made.
Then I need to group them by day (so per day, excluding hours) and provider name.
Also I need to ignore all the messages that have an empty string under the service column.
Also I need to ignore the messages that made 0 income and count only those that have the column service_enabled = 1.
And then I need to sort it descending, by date.
in the tables
mobile_log_messages_sms:
message_id - used to count the messages
price - used for the price, obviously; exclude those with 0
time - date in yyyy/mm/dd hh:mm:ss format
service - exclude all those that have an empty string (or null)
mobile_providers
provider_name - to use to group with
service_instances
enabled - only use if value is 1
I've started with:
SELECT message_id, price, time
FROM mobile_log_messages_sms
WHERE time BETWEEN '2017-02-26 00:00:00'
AND time AND '2017-03-22 00:00:00'
But I need to change the date format and then use the JOIN commands, and I don't know how. I know I need to add more to it, but I'm stumped even at the start. Also, this only lists the messages, but I need to count the total sum of the income (price) per day.
Can anyone point me in the right direction at least, since I'm still a noob? Many thanks in advance, and sorry if I worded something badly; English is not my first language.
Find out how many messages there were in the last 25 days and how much income did they make
1.
SELECT COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE;
2.
SELECT COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE
GROUP BY CAST(time AS DATE);
3.
SELECT COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE AND service IS NOT NULL AND service <> ''
GROUP BY CAST(time AS DATE);
The rest requires joins, which can't be written unless the tables share at least one common column, so check for one first.
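For the single-table part, the filters from the task list can be combined into one per-day query. The sketch below runs on Python's built-in sqlite3 as a stand-in for MySQL, so `date(time)` replaces `CAST(time AS DATE)` and a fixed "today" replaces CURRENT_DATE. The rows are invented, timestamps use dashes instead of slashes, and the provider and service_enabled parts are left out because the join columns are not given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mobile_log_messages_sms (
  message_id INTEGER, price REAL, time TEXT, service TEXT);
INSERT INTO mobile_log_messages_sms VALUES
  (1, 1.5, '2017-03-20 08:00:00', 'news'),
  (2, 2.0, '2017-03-20 21:30:00', 'quiz'),
  (3, 0.0, '2017-03-20 10:00:00', 'news'),  -- zero income, ignored
  (4, 1.0, '2017-03-21 09:00:00', ''),      -- empty service, ignored
  (5, 3.0, '2017-03-21 12:00:00', NULL),    -- missing service, ignored
  (6, 4.0, '2017-03-21 13:00:00', 'quiz'),
  (7, 5.0, '2017-01-01 13:00:00', 'quiz');  -- older than 25 days, ignored
""")

today = "2017-03-22"  # fixed stand-in for CURRENT_DATE

# Messages and income per day for the last 25 days, newest day first.
rows = conn.execute("""
SELECT date(time) AS day, COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE date(time) BETWEEN date(?, '-25 days') AND ?
  AND service IS NOT NULL AND service <> ''
  AND price > 0
GROUP BY date(time)
ORDER BY day DESC
""", (today, today)).fetchall()

print(rows)
```

Selecting the day alongside the aggregates makes each output row identifiable, and ORDER BY day DESC covers the "sort descending by date" task.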