MySQL - Getting age and number of days between two dates - mysql

I am trying to query a huge database (approximately 20 million records) to get some data. This is the query I am working on right now.
SELECT a.user_id, b.last_name, b.first_name, c.birth_date FROM users a
INNER JOIN users_signup b ON a.user_id = b.user_id
INNER JOIN users_personal c ON a.user_id = c.user_id
INNER JOIN
(
SELECT DISTINCT d.user_id FROM users_signup d
WHERE d.join_date >= '2013-01-01' AND d.join_date < '2014-01-01'
)
AS t ON a.user_id = t.user_id
I have some problems retrieving additional data from the database. I would like to add two fields to the results table:
I am able to get the birth date, but I would like to get the age of the members in the results table. The data is stored as 'yyyy-mm-dd' in the users_personal table.
I would like to get the total number of days from when a member joined until the day they left (if any), using join_date and left_date (format: yyyy-mm-dd) from the users_signup table.

Or you can do just this ...
SELECT
TIMESTAMPDIFF(YEAR, birthday, CURDATE()) AS age_in_years,
TIMESTAMPDIFF(MONTH, birthday, CURDATE()) AS age_in_month,
TIMESTAMPDIFF(DAY, birthday, CURDATE()) AS age_in_days,
TIMESTAMPDIFF(MINUTE, birthday, NOW()) AS age_in_minutes,
TIMESTAMPDIFF(SECOND, birthday, NOW()) AS age_in_seconds
FROM
table_name
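
For readers who want to check what those units mean, here is a rough Python sketch that mimics the YEAR, MONTH and DAY cases for plain dates. It counts only complete units, which is how I understand TIMESTAMPDIFF to behave. The function names are made up for illustration and are not part of MySQL, and the sketch assumes the end date is not before the start date.

```python
from datetime import date


def full_months_between(start: date, end: date) -> int:
    """Complete months from start to end (assumes end >= start)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # The last month only counts once the day-of-month has been reached.
    if end.day < start.day:
        months -= 1
    return months


def full_years_between(start: date, end: date) -> int:
    """Complete years from start to end, i.e. the usual notion of age."""
    return full_months_between(start, end) // 12


def days_between(start: date, end: date) -> int:
    """Whole days from start to end."""
    return (end - start).days


birthday = date(1990, 5, 15)
day_before = date(2024, 5, 14)   # one day before the 34th birthday
on_birthday = date(2024, 5, 15)

print(full_years_between(birthday, day_before))
print(full_years_between(birthday, on_birthday))
print(full_months_between(birthday, day_before))
```

The age only increases on the birthday itself, which is the behaviour most people expect from an "age" column.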

Try this:
SELECT a.user_id, b.last_name, b.first_name, c.birth_date,
FLOOR(DATEDIFF(CURRENT_DATE(), c.birth_date) / 365) age,
DATEDIFF(b.left_date, b.join_date) workDays
FROM users a
INNER JOIN users_signup b ON a.user_id = b.user_id
INNER JOIN users_personal c ON a.user_id = c.user_id
WHERE b.join_date >= '2013-01-01' AND b.join_date < '2014-01-01'
GROUP BY a.user_id
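
One caveat on dividing the day count by 365: leap days accumulate, so the result can be one year too high shortly before a birthday. The Python sketch below (helper names are hypothetical) compares the day-count formula with a calendar-based age, which is what TIMESTAMPDIFF(YEAR, ...) from the other answer should give.

```python
from datetime import date


def age_by_day_count(birth: date, today: date) -> int:
    # Mirrors FLOOR(DATEDIFF(today, birth) / 365).
    return (today - birth).days // 365


def age_by_calendar(birth: date, today: date) -> int:
    # Subtract one year if this year's birthday has not happened yet.
    years = today.year - birth.year
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


birth = date(2000, 3, 1)
today = date(2024, 2, 29)  # the day before the 24th birthday

print(age_by_day_count(birth, today))  # counts the extra leap days as a year
print(age_by_calendar(birth, today))
```

For a table of 20 million users the difference affects only people within a few days of their birthday, but it is worth knowing if the age feeds into anything strict (age limits, for example).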

You can use the datediff function to find the number of days between two dates, like
select datediff(date1,date2) from table
select datediff(curdate(),date2) from table
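
Two details are easy to get wrong here: DATEDIFF returns the first argument minus the second, and it returns NULL when either argument is NULL (for example a member who has not left yet). A small Python sketch of the same behaviour follows; `membership_days` is a hypothetical helper that falls back to today's date, roughly what DATEDIFF(COALESCE(left_date, CURDATE()), join_date) would do.

```python
from datetime import date
from typing import Optional


def datediff(d1: Optional[date], d2: Optional[date]) -> Optional[int]:
    """d1 minus d2 in days; None if either side is missing (like SQL NULL)."""
    if d1 is None or d2 is None:
        return None
    return (d1 - d2).days


def membership_days(join_date: date, left_date: Optional[date], today: date) -> int:
    """Days as a member; members who have not left are counted up to today."""
    end = left_date if left_date is not None else today
    return (end - join_date).days


print(datediff(date(2013, 3, 1), date(2013, 1, 1)))   # later date first: positive
print(datediff(date(2013, 1, 1), date(2013, 3, 1)))   # reversed: negative
print(datediff(None, date(2013, 1, 1)))               # missing left_date
print(membership_days(date(2013, 1, 1), None, date(2013, 1, 31)))
```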

Getting the current age in years:
SELECT DATE_FORMAT(FROM_DAYS(DATEDIFF(DATE(NOW()), birthday)), '%Y') * 1 AS age FROM table_name;
How this works:
datediff(date1, date2) gives the difference between two dates in days. Note that the date format of 'birthday' here is date: YYYY-MM-DD.
from_days converts days into a date format
date_format function extracts with '%Y' only the four-digit year. Don't use '%y', because you only get a two-digit year and some people are older than 99 years.
multiply the string with 1. This is a 'hack'. MySQL will convert a string like 'YYYY' into an integer.
Getting the current age in months (unlikely, but someone may need this):
SELECT (DATE_FORMAT(FROM_DAYS(DATEDIFF(DATE(NOW()), birthday)), '%Y') * 1 * 12)
+ (DATE_FORMAT(FROM_DAYS(DATEDIFF(DATE(NOW()), birthday)), '%m') * 1) AS age_in_months
FROM table_name;
How this works:
Mostly the same as age in years above.
The years get multiplied by 12. An (earth) year has 12 months.
In the next step the months are extracted the same way as the years, but instead the flag '%Y' must be changed to '%m'.
At the end the two values are added together.
Getting the current age in days is as simple as this:
SELECT DATEDIFF(DATE(NOW()), birthday) AS age_in_days FROM table_name;
Alternative code:
SELECT
DATE_FORMAT(age_date, '%Y') * 1 AS age_in_years,
(DATE_FORMAT(age_date, '%Y') * 1 * 12) + (DATE_FORMAT(age_date, '%m') * 1) AS age_in_months,
age_in_days
FROM
(SELECT
FROM_DAYS(DATEDIFF(DATE(NOW()), birthday)) AS age_date,
DATEDIFF(DATE(NOW()), birthday) AS age_in_days
FROM table_name) AS age_date;

Related

Get the last 3 months from the time of the last order date

My task is to get the total commission in the last 5 months. This is my code. I am using mysql.
SELECT CONCAT(a.first_name, " ", a.last_name) AS sales_reps,
YEAR(c.order_date),
ROUND(SUM((d.quantity_ordered*d.price_each)*.01), 2) AS commission_last_6mos
FROM employees a
LEFT JOIN customers b ON b.sales_rep_employee_no=a.employee_no
LEFT JOIN orders c on b.customer_no = c.customer_no
LEFT JOIN order_details d ON c.order_no = d.order_no
WHERE job_title='Sales Rep' AND c.order_date >= CURDATE() - INTERVAL 5 MONTH
GROUP BY CONCAT(a.first_name, " ", a.last_name)
ORDER BY commission_last_6mos DESC
LIMIT 1;
I have also used now(). They do not show any results.
It looks to me like the table containing job_title is not specified. If it is in the table employees, then you should have a.job_title, for example. For the time range, try:
AND c.order_date >= DATE_SUB(now(), INTERVAL 6 MONTH)
For more information about the DATE_SUB function, check out https://www.w3schools.com/sql/func_mysql_date_sub.asp
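
One possible reason for the empty result, and this is only a guess since the data is not shown, is that all orders are older than the window counted back from today. The title asks for months counted from the last order date, so the cutoff can be anchored on MAX(order_date) instead. The sketch below uses Python's sqlite3 with a made-up `orders` table to show the difference; SQLite's date modifiers stand in for MySQL's INTERVAL arithmetic, where the cutoff would be something like DATE_SUB((SELECT MAX(order_date) FROM orders), INTERVAL 5 MONTH).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER PRIMARY KEY, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2004-11-01"), (2, "2005-01-10"), (3, "2005-05-15")],  # sample data, all old
)

# Window counted back from today: empty when the data set is historical.
vs_today = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE order_date >= date('now', '-5 months')"
).fetchone()[0]

# Window counted back from the most recent order in the table.
vs_last_order = conn.execute(
    """
    SELECT COUNT(*) FROM orders
    WHERE order_date >= (SELECT date(MAX(order_date), '-5 months') FROM orders)
    """
).fetchone()[0]

print(vs_today, vs_last_order)
```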

Can I combine separate month and year column for this query?

I currently am trying to track the number of messages sent by month as well as the volume's percent change in comparison to one year prior.
Here is my current query:
Select
a.mo,
a.ye,
a.Messages,
((a.Messages - b.Messages) / b.Messages) as "% Change"
from(
select
MONTH(post_date) as mo,
count(*) as "Messages",
YEAR(post_date) as ye
from
pm_messages
WHERE
post_date > "2018-01-01 00:00:00"
group by
year(post_date),
month(post_date)
) a
left join (
select
MONTH(post_date) as mo,
YEAR(post_date) as ye,
count(*) as "Messages"
from
pm_messages
group by
year(post_date),
month(post_date)
) b on a.mo = b.mo
and a.ye -1 = b.ye
This works great. However, it places month and year in separate columns, which has been messing up the graphs I am working with. When I try to pull month and year into one column, as I've done in other queries from the same table, i.e. using:
SELECT DATE_FORMAT(`post_date`,'%M %Y')
My query does not work.
Does anyone know how I can change my current query so that it still calculates the change from a year prior but has month and year come up as one column, as opposed to (Month | Year | Messages | % Change)?
Thanks!!
You can use EXTRACT instead of separate YEAR() and MONTH() functions:
EXTRACT(YEAR_MONTH from post_date)
Of course, you have to group by this expression instead of year and month. For example:
select
EXTRACT(YEAR_MONTH from post_date) yearmonth,
count(*) as "Messages"
from
pm_messages
group by
EXTRACT(YEAR_MONTH from post_date)
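
A combined key still allows the year-over-year join: EXTRACT(YEAR_MONTH ...) yields a number like 201901, so the same month one year earlier is the key minus 100. The Python sketch below illustrates the idea with invented sample dates; `year_month` and `pct_change` are hypothetical helpers, not MySQL functions.

```python
from collections import Counter
from datetime import date


def year_month(d: date) -> int:
    """Same shape as EXTRACT(YEAR_MONTH FROM d), e.g. 201901."""
    return d.year * 100 + d.month


def pct_change(counts):
    """Fractional change per month versus the same month one year earlier."""
    out = {}
    for ym, n in sorted(counts.items()):
        prev = counts.get(ym - 100)  # same month, previous year
        out[ym] = (n - prev) / prev if prev else None
    return out


posts = [
    date(2018, 1, 5), date(2018, 1, 20),
    date(2019, 1, 3), date(2019, 1, 9), date(2019, 1, 30),
]
counts = Counter(year_month(d) for d in posts)
changes = pct_change(counts)
print(changes)
```

In the original query this would mean joining the two subqueries on a.yearmonth - 100 = b.yearmonth instead of on separate month and year columns.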
If you have data for every month, you can use lag() (window functions require MySQL 8.0 or later):
select year(post_date) as ye, month(post_date) as mo,
count(*) as Messages,
lag(count(*)) over (partition by month(post_date) order by year(post_date)) as prev_year
from pm_messages
where post_date >= '2018-01-01'
group by year(post_date), month(post_date)

mysql how to multiply some values in the same column but not others if it meets a condition

Is it possible to multiply some values in the same column but not others if the value meets a certain condition? I don't want to create another column.
Query I am working with:
SELECT
name ,
ROUND(SUM(orderline_sales.amount * orderline_sales.price) * orders_sales.discount * customers.annual_discount) AS total_revenue
FROM
orderline_sales
JOIN
orders_sales ON orders_sales.id = orderline_sales.orders_sales_id
JOIN
employee ON orders_sales.empoyee_id = employee.id
JOIN
customers ON orders_sales.customer_id = customers.id
WHERE
date BETWEEN DATE_SUB(CURRENT_DATE, INTERVAL 365 DAY) AND CURRENT_DATE
GROUP BY employee.name
ORDER BY total_revenue DESC
LIMIT 1;
The orders_sales table contains a date attribute, and orders_sales has a 1:n relationship with orderline_sales. I only want to multiply the SUM result by customers.annual_discount if the YEAR of the order is higher than 2017. How would I go about doing this?
You can use CASE. Put it inside the SUM so that it is evaluated per order line rather than once per group:
SELECT
employee.name,
ROUND(SUM(orderline_sales.amount * orderline_sales.price * orders_sales.discount *
-- apply the annual discount only to orders placed after 2017
CASE WHEN YEAR(orders_sales.date) > 2017 THEN customers.annual_discount ELSE 1 END
)) AS total_revenue
FROM orderline_sales
JOIN
orders_sales ON orders_sales.id = orderline_sales.orders_sales_id
JOIN
employee ON orders_sales.empoyee_id = employee.id
JOIN
customers ON orders_sales.customer_id = customers.id
WHERE
orders_sales.date BETWEEN DATE_SUB(CURRENT_DATE, INTERVAL 365 DAY) AND CURRENT_DATE
GROUP BY employee.name
ORDER BY total_revenue DESC
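
To check the idea of a conditional multiplier without a MySQL server, here is a small sketch using Python's sqlite3. The schema is deliberately simplified and partly invented (annual_discount is stored on the order, and the date column is called order_date), and strftime('%Y', ...) stands in for MySQL's YEAR(). The point is only that CASE ... ELSE 1 leaves the older rows unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders_sales (
        id INTEGER PRIMARY KEY, order_date TEXT, discount REAL, annual_discount REAL
    );
    CREATE TABLE orderline_sales (orders_sales_id INTEGER, amount INTEGER, price REAL);
    INSERT INTO orders_sales VALUES (1, '2017-06-01', 1.0, 0.5), (2, '2018-06-01', 1.0, 0.5);
    INSERT INTO orderline_sales VALUES (1, 2, 10.0), (2, 1, 10.0);
    """
)

total = conn.execute(
    """
    SELECT SUM(l.amount * l.price * o.discount *
               CASE WHEN CAST(strftime('%Y', o.order_date) AS INTEGER) > 2017
                    THEN o.annual_discount  -- only orders after 2017
                    ELSE 1                  -- older orders are left as they are
               END)
    FROM orderline_sales l
    JOIN orders_sales o ON o.id = l.orders_sales_id
    """
).fetchone()[0]

print(total)  # the 2017 order counts in full, the 2018 order is halved
```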

SQL selecting average score over range of dates

I have 3 tables:
doctors (id, name) -> has_many:
patients (id, doctor_id, name) -> has_many:
health_conditions (id, patient_id, note, created_at)
Every day, each patient gets a new health condition record with a note from 1 to 10, where 10 means good health (full recovery, if you will).
What I want to extract is the following 3 statistics for the last 30 days (month):
- how many patients got better
- how many patients got worse
- how many patients remained the same
These statistics are global so I don't care right now of statistics per doctor which I could extract given the right query.
The trick is that the query needs to compare the current health_condition note with the average of the past days (this month without today). So it has to extract today's note and an average of the other days, excluding today.
I don't think the query needs to decide who went up, down or stayed the same, since I can loop over the rows and decide that myself. Just today vs. the rest of the month will be sufficient, I guess.
Here's what I have so far, which obviously doesn't work because it only returns one result due to the limit applied:
SELECT
p.id,
p.name,
hc.latest,
hcc.average
FROM
pacients p
INNER JOIN (
SELECT
id,
pacient_id,
note as LATEST
FROM
health_conditions
GROUP BY pacient_id, id
ORDER BY created_at DESC
LIMIT 1
) hc ON(hc.pacient_id=p.id)
INNER JOIN (
SELECT
id,
pacient_id,
avg(note) AS average
FROM
health_conditions
GROUP BY pacient_id, id
) hcc ON(hcc.pacient_id=p.id AND hcc.id!=hc.id)
WHERE
date_part('epoch',date_trunc('day', hcc.created_at))
BETWEEN
(date_part('epoch',date_trunc('day', hc.created_at)) - (30 * 86400))
AND
date_part('epoch',date_trunc('day', hc.created_at))
The query has all the logic it needs to distinguish between what is latest and average but that limit kills everything. I need that limit to extract the latest result which is used to compare with past results.
Something like this, assuming created_at is of type date:
select p.name,
hc.note as current_note,
t.avg_note
from patients p
join health_conditions hc on hc.patient_id = p.id
join (
select patient_id,
avg(note) as avg_note
from health_conditions hc2
where created_at between current_date - 30 and current_date - 1
group by patient_id
) t on t.patient_id = hc.patient_id
where hc.created_at = current_date;
This is PostgreSQL syntax. I'm not sure if MySQL supports date arithmetic the same way.
Edit:
This should get you the most recent note for each patient, plus the average for the last 30 days:
select p.name,
hc.created_at as last_note_date,
hc.note as current_note,
t.avg_note
from patients p
join health_conditions hc
on hc.patient_id = p.id
and hc.created_at = (select max(created_at)
from health_conditions hc2
where hc2.patient_id = hc.patient_id)
join (
select patient_id,
avg(note) as avg_note
from health_conditions hc3
where created_at between current_date - 30 and current_date - 1
group by patient_id
) t on t.patient_id = hc.patient_id
SELECT SUM(delta < 0) AS worsened,
SUM(delta = 0) AS no_change,
SUM(delta > 0) AS improved
FROM (
SELECT patient_id,
SUM(IF(DATE(created_at) = CURDATE(),note,NULL))
- AVG(IF(DATE(created_at) < CURDATE(),note,NULL)) AS delta
FROM health_conditions
WHERE DATE(created_at) BETWEEN CURDATE() - INTERVAL 1 MONTH AND CURDATE()
GROUP BY patient_id
) t
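
The logic of that last query (today's note minus the average of the earlier notes, then counting the sign of the difference) can be sketched in plain Python to check it against a few hand-made rows. This assumes one note per patient per day and a 30-day window; `classify` and the sample rows are invented for illustration. Patients without a note today, or without earlier notes, are skipped, which matches how a NULL delta drops out of the SUMs.

```python
from datetime import date, timedelta
from statistics import mean


def classify(rows, today, window_days=30):
    """rows: iterable of (patient_id, created_at, note) tuples."""
    todays, earlier = {}, {}
    oldest = today - timedelta(days=window_days)
    for patient_id, created_at, note in rows:
        if created_at == today:
            todays[patient_id] = note
        elif oldest <= created_at < today:
            earlier.setdefault(patient_id, []).append(note)

    counts = {"worsened": 0, "no_change": 0, "improved": 0}
    for patient_id, note in todays.items():
        if patient_id not in earlier:
            continue  # nothing to compare against
        delta = note - mean(earlier[patient_id])
        if delta < 0:
            counts["worsened"] += 1
        elif delta > 0:
            counts["improved"] += 1
        else:
            counts["no_change"] += 1
    return counts


today = date(2024, 1, 31)
rows = [
    (1, date(2024, 1, 29), 5), (1, date(2024, 1, 30), 5), (1, today, 7),
    (2, date(2024, 1, 30), 6), (2, today, 6),
    (3, date(2024, 1, 29), 8), (3, date(2024, 1, 30), 8), (3, today, 4),
    (4, date(2024, 1, 30), 9),  # no note today, so not counted
]
summary = classify(rows, today)
print(summary)
```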

Trying to correct a mysql query

I currently have the following query:
SELECT a.schedID,
a.start AS eventDate, b.div_id AS divisionID, b.div_name AS divisionName
FROM schedules a
INNER JOIN divisions b ON b.div_id = a.div_id
WHERE date_format(a.start, '%Y-%m-%d') >= '2010-01-01'
AND DATE_ADD(a.start, INTERVAL 5 DAY) <= CURDATE()
AND NOT EXISTS (SELECT results_id FROM results e WHERE e.schedID = a.schedID)
ORDER BY eventDate ASC;
I'm basically trying to find any schedules that do not have any results 5 days after the schedule date. My current query has major performance issues. It also times out inconsistently. Is there a different way to write the query? I'm at a mental roadblock. Any help is appreciated.
Without anticipating too much about the outcome, I would suggest the following leads:
* Try to remove the date_format, as it generates one function call per record and keeps an index on a.start from being used. I don't know the type of your column a.start, but if it is a DATE or DATETIME you can compare it directly.
* Same for DATE_ADD: you could move the arithmetic to the other side of the comparison, like:
a.start <= DATE_SUB(CURDATE(), INTERVAL 5 DAY)
That way the right-hand side is a constant that can be computed once rather than for each row; you could even define it as a parameter up front.
* The NOT EXISTS can be expensive; it seems to me you could replace it with a left join like:
schedules a LEFT JOIN results e ON a.schedId = e.schedId WHERE e.schedId IS NULL
Double-check that all join fields are well indexed.
Good luck
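
The suggestions above can be tried out on a toy data set. The sketch below uses Python's sqlite3 rather than MySQL, with invented rows and a fixed "today" so the result is reproducible. The cutoff is computed once and passed as a parameter, the column is compared directly (no function wrapped around it), and the LEFT JOIN ... IS NULL form is checked against NOT EXISTS to confirm that both return the same schedules. Which form is faster depends on the server version and the indexes, so this only demonstrates equivalence, not performance.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE schedules (schedID INTEGER PRIMARY KEY, start TEXT);
    CREATE TABLE results (results_id INTEGER PRIMARY KEY, schedID INTEGER);
    CREATE INDEX idx_results_sched ON results (schedID);
    INSERT INTO schedules VALUES
        (1, '2024-01-10'),  -- old enough, no result: expected in the output
        (2, '2024-01-10'),  -- has a result
        (3, '2024-01-18'),  -- less than 5 days old
        (4, '2009-12-31');  -- before 2010
    INSERT INTO results VALUES (1, 2);
    """
)

today = date(2024, 1, 20)                         # fixed for the example
cutoff = (today - timedelta(days=5)).isoformat()  # computed once, up front

left_join_sql = """
    SELECT a.schedID FROM schedules a
    LEFT JOIN results e ON e.schedID = a.schedID
    WHERE a.start >= '2010-01-01' AND a.start <= ? AND e.schedID IS NULL
    ORDER BY a.start, a.schedID
"""
not_exists_sql = """
    SELECT a.schedID FROM schedules a
    WHERE a.start >= '2010-01-01' AND a.start <= ?
      AND NOT EXISTS (SELECT 1 FROM results e WHERE e.schedID = a.schedID)
    ORDER BY a.start, a.schedID
"""

via_left_join = [row[0] for row in conn.execute(left_join_sql, (cutoff,))]
via_not_exists = [row[0] for row in conn.execute(not_exists_sql, (cutoff,))]
print(via_left_join, via_not_exists)
```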
Maybe something like:
SELECT
a.schedID, a.start AS eventDate, b.div_id AS divisionID, b.div_name AS divisionName
FROM
schedules a
INNER JOIN divisions b ON b.div_id = a.div_id
WHERE
date_format(a.start, '%Y-%m-%d') >= '2010-01-01'
AND NOT EXISTS (
SELECT
*
FROM
results e
INNER JOIN schedules a2 ON e.schedID = a2.schedID
WHERE
DATE_ADD(a2.start, INTERVAL 5 DAY) <= CURDATE()
AND a2.id = a.id
)
ORDER BY eventDate ASC;
I don't know if MySQL is the same as Oracle, but are you converting a date to a string here and then comparing it with the string '2010-01-01'? Can you convert '2010-01-01' to a date instead, so that an index on a.start (if there is one) can be used?
Also, does this query definitely return the right answer?
You mention you want schedules without results 5 days after the schedule date, but it looks like you are asking for anything in the last 5 days?
a.start >= 1-Jan-10 and start date + 5 days is before today
Try this query:
SELECT a.schedID,
a.start AS eventDate,
b.div_id AS divisionID,
b.div_name AS divisionName
FROM (SELECT * FROM schedules s WHERE DATE(s.start) >= '2010-01-01' AND DATE_ADD(s.start, INTERVAL 5 DAY) <= CURDATE()) a
INNER JOIN divisions b
ON b.div_id = a.div_id
LEFT JOIN results e
ON e.schedID = a.schedID
WHERE e.results_id IS NULL
ORDER BY eventDate ASC;