Left join tables on nearest datetime without duplicating values - mysql

I am attempting to join table_a and table_b on their nearest/closest datetime fields (date_a and date_b), but I want to ensure that no row from table_b is joined more than once. If the available date_b rows from table_b have been used up on closer table_a values, then the joined row should just remain blank.
Another way of putting it: the datetime values from table_b can only be used once, and they should only be used on the absolute closest/nearest value from table_a.
Here's an example of table_a:
| entry_a | date_a |
|---------|---------------------|
| 1 | 2019-02-20 01:05:00 |
| 2 | 2019-02-20 01:10:00 |
| 3 | 2019-02-21 01:15:00 |
| 4 | 2019-02-22 01:20:00 |
| 5 | 2019-02-23 01:25:00 |
| 6 | 2019-02-24 01:30:00 |
| 7 | 2019-02-25 01:35:00 |
| 8 | 2019-02-26 01:40:00 |
| 9 | 2019-02-27 01:45:00 |
| 10 | 2019-02-28 01:50:00 |
Here's table_b:
| entry_b | date_b | filename |
|---------|---------------------|----------------|
| 1 | 2019-02-20 01:03:00 | 20190220010300 |
| 2 | 2019-02-20 01:07:00 | 20190220010700 |
| 3 | 2019-02-23 01:23:00 | 20190223012300 |
| 4 | 2019-02-24 01:26:00 | 20190224012600 |
| 5 | 2019-02-25 01:30:00 | 20190225013000 |
| 6 | 2019-02-26 01:34:00 | 20190226013400 |
| 7 | 2019-02-27 01:40:00 | 20190227014000 |
| 8 | 2019-02-28 01:50:00 | 20190228015000 |
| 9 | 2019-02-28 01:51:00 | 20190228015100 |
And here's the desired result:
| entry_a | date_a | entry_b | date_b | filename |
|---------|---------------------|---------|---------------------|----------------|
| 1 | 2019-02-20 01:05:00 | 1 | 2019-02-20 01:03:00 | 20190220010300 |
| 2 | 2019-02-20 01:10:00 | 2 | 2019-02-20 01:07:00 | 20190220010700 |
| 3 | 2019-02-21 01:15:00 | (null) | (null) | (null) |
| 4 | 2019-02-22 01:20:00 | 3 | 2019-02-23 01:23:00 | 20190223012300 |
| 5 | 2019-02-23 01:25:00 | 4 | 2019-02-24 01:26:00 | 20190224012600 |
| 6 | 2019-02-24 01:30:00 | 5 | 2019-02-25 01:30:00 | 20190225013000 |
| 7 | 2019-02-25 01:35:00 | 6 | 2019-02-26 01:34:00 | 20190226013400 |
| 8 | 2019-02-26 01:40:00 | 7 | 2019-02-27 01:40:00 | 20190227014000 |
| 9 | 2019-02-27 01:45:00 | 8 | 2019-02-28 01:50:00 | 20190228015000 |
| 10 | 2019-02-28 01:50:00 | 9 | 2019-02-28 01:51:00 | 20190228015100 |
One thing to particularly note in the desired result: the last two rows show that date_b.8 and date_a.10 match exactly ... but if date_b.8 and date_a.9 are allowed to match, then date_b.9 and date_a.10 can match on something fairly close, also. (If this is an impossible complication, I understand. It's not critical. What's more important is the situation illustrated in rows 2-4 of the result_table.)
I am using MySQL 5.6. I've built a SQL fiddle here with the tables loaded up: DEMO
Thank you all very kindly for your help and for the many answers you've provided to guide me over the years.
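One possible reading of the rule above is a greedy one-to-one match: sort every (table_a, table_b) pair by absolute time difference and assign the closest pairs first, so each table_b row is used at most once. MySQL 5.6 has neither window functions nor recursive CTEs, so the sketch below does the matching in application code (Python) instead of SQL. The function name `match_nearest_once` and the three-row sample are made up for illustration; this is not the tables from the question, and other tie-breaking rules are equally defensible.

```python
from datetime import datetime

def match_nearest_once(rows_a, rows_b):
    """Greedy one-to-one matching: the closest (a, b) pairs are assigned first."""
    # Every candidate pair with its absolute time difference in seconds.
    # Ties on the difference fall back to the lower entry ids.
    pairs = sorted(
        (abs((date_a - date_b).total_seconds()), entry_a, entry_b)
        for entry_a, date_a in rows_a
        for entry_b, date_b in rows_b
    )
    used_a, used_b, match = set(), set(), {}
    for _, entry_a, entry_b in pairs:
        if entry_a not in used_a and entry_b not in used_b:
            match[entry_a] = entry_b
            used_a.add(entry_a)
            used_b.add(entry_b)
    # Left-join semantics: every table_a row appears, unmatched ones get None.
    return [(entry_a, match.get(entry_a)) for entry_a, _ in rows_a]

def ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

# Hypothetical sample data (assumption), shaped like table_a and table_b.
rows_a = [(1, ts("2019-02-20 01:05:00")),
          (2, ts("2019-02-20 01:10:00")),
          (3, ts("2019-02-21 01:15:00"))]
rows_b = [(1, ts("2019-02-20 01:03:00")),
          (2, ts("2019-02-20 01:07:00"))]

result = match_nearest_once(rows_a, rows_b)
print(result)
```

The pair list grows with the product of the two table sizes, so for large tables it would probably make sense to limit candidates to a time window first.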

Related

SQL INNER JOIN Multiple Tables and columns

The Problem:
Construct the SQL statement to find all of the people that have meetings only before Dec. 25, 2016 at noon using INNER JOINs. Display the following columns:
 Person’s first name
 Person’s last name
 Meeting ID
 Meeting start date and time
 Meeting end date and time
The Tables:
There are 5 tables in this database (person, building, room, meeting, person_meeting):
+-----------+------------+------------+
| person_id | first_name | last_name |
+-----------+------------+------------+
| 1 | Tom | Hanks |
| 2 | Anne | Hathaway |
| 3 | Tom | Cruise |
| 4 | Meryl | Streep |
| 5 | Chris | Pratt |
| 6 | Halle | Berry |
| 7 | Robert | De Niro |
| 8 | Julia | Roberts |
| 9 | Denzel | Washington |
| 10 | Melissa | McCarthy |
+-----------+------------+------------+
+-------------+----------------------+
| building_id | building_name |
+-------------+----------------------+
| 1 | Headquarters |
| 2 | Main Street Buidling |
+-------------+----------------------+
+---------+-------------+-------------+----------+
| room_id | room_number | building_id | capacity |
+---------+-------------+-------------+----------+
| 1 | 100 | 1 | 5 |
| 2 | 200 | 1 | 4 |
| 3 | 300 | 1 | 10 |
| 4 | 10 | 2 | 4 |
| 5 | 20 | 2 | 4 |
+---------+-------------+-------------+----------+
+------------+---------+---------------------+---------------------+
| meeting_id | room_id | meeting_start | meeting_end |
+------------+---------+---------------------+---------------------+
| 1 | 1 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 2 | 1 | 2016-12-25 10:00:00 | 2016-12-25 12:00:00 |
| 3 | 1 | 2016-12-25 11:00:00 | 2016-12-25 12:00:00 |
| 4 | 2 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 5 | 4 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 6 | 5 | 2016-12-25 14:00:00 | 2016-12-25 16:00:00 |
+------------+---------+---------------------+---------------------+
+-----------+------------+
| person_id | meeting_id |
+-----------+------------+
| 1 | 1 |
| 10 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
| 7 | 2 |
| 8 | 2 |
| 9 | 3 |
| 10 | 3 |
| 1 | 4 |
| 2 | 4 |
| 8 | 5 |
| 9 | 5 |
| 1 | 6 |
| 2 | 6 |
| 3 | 6 |
+-----------+------------+
My Solution so Far:
SELECT first_name,last_name ,building_name,meeting_start,meeting_end
FROM person P
INNER JOIN building B
ON P.person_id=PM.person_id
INNER JOIN person_meeting PM
ON M.room_id
I'm having trouble completing the SQL statement, please help if possible.
This might do the trick for you. I gave each table in the join an alias and used those aliases in the SELECT list for the columns you need. Then I joined the required tables, and the WHERE clause keeps only the meetings whose meeting_end is before Dec. 25 at noon. (I assumed you meant the end time; if you want the start time, just switch it to meeting_start.)
select p.first_name,p.last_name,pm.meeting_id,m.meeting_start,m.meeting_end from person p
inner join person_meeting pm on pm.person_id = p.person_id
inner join meeting m on m.meeting_id = pm.meeting_id
where m.meeting_end < '2016-12-25 12:00:00'
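To check the join and the "before noon" filter (`meeting_end <` the cutoff), here is a small sketch that runs the query against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for MySQL here (an assumption for easy testing), and the rows are a subset of the tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT);
CREATE TABLE meeting (meeting_id INTEGER, room_id INTEGER,
                      meeting_start TEXT, meeting_end TEXT);
CREATE TABLE person_meeting (person_id INTEGER, meeting_id INTEGER);
-- Subset of the question's rows, enough to exercise the filter.
INSERT INTO person VALUES (1,'Tom','Hanks'), (2,'Anne','Hathaway'),
  (8,'Julia','Roberts'), (9,'Denzel','Washington'), (10,'Melissa','McCarthy');
INSERT INTO meeting VALUES
  (1,1,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
  (2,1,'2016-12-25 10:00:00','2016-12-25 12:00:00'),
  (4,2,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
  (5,4,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
  (6,5,'2016-12-25 14:00:00','2016-12-25 16:00:00');
INSERT INTO person_meeting VALUES (1,1),(10,1),(1,2),(2,2),(8,2),
  (1,4),(2,4),(8,5),(9,5),(1,6),(2,6);
""")

rows = conn.execute("""
SELECT p.first_name, p.last_name, pm.meeting_id, m.meeting_start, m.meeting_end
FROM person p
INNER JOIN person_meeting pm ON pm.person_id = p.person_id
INNER JOIN meeting m ON m.meeting_id = pm.meeting_id
WHERE m.meeting_end < '2016-12-25 12:00:00'   -- strictly before noon
ORDER BY pm.meeting_id, p.person_id
""").fetchall()

for row in rows:
    print(row)
```

Note that this keeps every person who has at least one meeting ending before the cutoff. If "only" is meant strictly, an extra NOT EXISTS condition on later meetings would be needed; that variation is not shown here.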

MySQL: Get everyday incremental data

I want to fetch the data from Table based on date but in an incremental way.
Suppose I have data like this which is grouped by date
| DATE | Count |
| 2015-06-23 | 10 |
| 2015-06-24 | 8 |
| 2015-06-25 | 6 |
| 2015-06-26 | 3 |
| 2015-06-27 | 2 |
| 2015-06-29 | 2 |
| 2015-06-30 | 3 |
| 2015-07-01 | 1 |
| 2015-07-02 | 3 |
| 2015-07-03 | 4 |
So the result should come like this
| DATE | Count| Sum|
| 2015-06-23 | 10 | 10 |
| 2015-06-24 | 8 | 18 |
| 2015-06-25 | 6 | 24 |
| 2015-06-26 | 3 | 27 |
| 2015-06-27 | 2 | 29 |
| 2015-06-29 | 2 | 31 |
| 2015-06-30 | 3 | 34 |
| 2015-07-01 | 1 | 35 |
| 2015-07-02 | 3 | 38 |
| 2015-07-03 | 4 | 42 |
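You would join each date to every date on or before it, and then sum the counts of those joined rows.
If you give me your table structure, I can make it run.
id, name, date_joined
-- d: one row per date with its count; prev: the same per-date counts,
-- joined so that each date sees itself and all earlier dates.
SELECT d.date_joined, d.theCount, SUM(prev.theCount) AS runningTotal
FROM
(SELECT date_joined, COUNT(*) AS theCount
FROM yourTable
GROUP BY date_joined
) AS d
INNER JOIN
(SELECT date_joined, COUNT(*) AS theCount
FROM yourTable
GROUP BY date_joined
) AS prev
ON prev.date_joined <= d.date_joined
GROUP BY d.date_joined, d.theCount
ORDER BY d.date_joined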
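As a quick check of the self-join idea, the sketch below loads the per-date counts from the question into an in-memory SQLite database (via Python's `sqlite3`, used here only as a convenient stand-in for MySQL) and joins each date to all dates on or before it. The table name `daily` and the columns `day` and `cnt` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the already-grouped counts from the question.
conn.execute("CREATE TABLE daily (day TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [
    ("2015-06-23", 10), ("2015-06-24", 8), ("2015-06-25", 6),
    ("2015-06-26", 3), ("2015-06-27", 2), ("2015-06-29", 2),
    ("2015-06-30", 3), ("2015-07-01", 1), ("2015-07-02", 3),
    ("2015-07-03", 4),
])

rows = conn.execute("""
SELECT d.day, d.cnt, SUM(prev.cnt) AS running_total
FROM daily d
JOIN daily prev ON prev.day <= d.day   -- the date itself plus all earlier dates
GROUP BY d.day, d.cnt
ORDER BY d.day
""").fetchall()

totals = [r[2] for r in rows]
print(totals)
```

The join compares every date with every earlier date, so the work grows quadratically with the number of dates; for a few hundred days that is usually fine.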

SQL joining three tables and split into columns

I have three tables, mess_stock, mess_voucher, add_grocery.
Mess_stock table is below,
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
| sno | voucher_id | particular_name | opening_balance | inward | outward | balance | pay_amount | pay_type |
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
| 49 | 5 | 4 | 100 | 10 | 100 | 10 | 10.00 | 1 |
| 50 | 17 | 5 | 111 | 10 | 20 | 101 | 60.00 | 1 |
| 51 | 7 | 3 | 123 | 2 | 1 | 124 | 300.00 | 1 |
| 52 | 7 | 1 | 123 | 20 | 20 | 123 | 500.00 | 2 |
| 53 | 14 | 8 | 100 | 5 | 95 | 10 | 60.00 | 2 |
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
Mess_voucher table is below
+------------+--------------+--------------+
| voucher_id | voucher_name | voucher_date |
+------------+--------------+--------------+
| 5 | VG1001 | 2015-02-19 |
| 6 | VG1001 | 2015-02-20 |
| 7 | VG1002 | 2015-02-20 |
| 8 | VG1002 | 2015-02-19 |
| 9 | MS1001 | 2015-02-20 |
| 10 | VG10012 | 2015-02-19 |
| 11 | 0 | 2015-02-23 |
| 12 | 1 | 2015-02-24 |
| 13 | MS1001 | 2015-02-25 |
| 14 | MS1001 | 2015-02-28 |
| 15 | VG1003 | 2015-02-28 |
| 16 | MS1001 | 2015-02-19 |
| 17 | MS1001 | 2015-02-21 |
+------------+--------------+--------------+
Add_grocery table is below
+-----+-----------------+------------------+
| sno | particular_name | particular_price |
+-----+-----------------+------------------+
| 1 | Rice | 25.00 |
| 3 | Mango | 150.00 |
| 4 | Coconut | 22.00 |
| 5 | Banana | 6.00 |
| 6 | Raddish | 12.00 |
| 7 | Apple | 150.00 |
| 8 | Pumkin | 12.00 |
+-----+-----------------+------------------+
I want to group the sum of pay_amount of mess_stock table. I have used the below query
SELECT opening_balance AS ope_stock,
balance AS clo_stock,
SUM(IF(pay_type = 1, pay_amount, 0)) mess_pay,
SUM(IF(pay_type=2, pay_amount, 0)) est_pay
FROM mess_stock;
That works fine. The particular_name column in mess_stock holds the auto-increment id (sno) of the add_grocery table. I need the totals for the inward and outward amounts. For example, for an inward value of 10 the query has to look up the particular_price in add_grocery using the particular_name stored in mess_stock, and likewise for every other row. I also want to sort the result by date. The date of each entry is stored in the mess_voucher table, which is connected to the mess_stock table through voucher_id.
Try this, using inner joins:
SELECT t2.`particular_name`, t1.`inward`, t1.`outward`, t2.`particular_price`, t3.`voucher_date`
FROM mess_stock t1
JOIN add_grocery t2 ON t1.`particular_name` = t2.`sno`
JOIN mess_voucher t3 ON t3.`voucher_id` = t1.`voucher_id`
ORDER BY t3.`voucher_date` DESC
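The join above returns the raw columns. If the inward and outward values are quantities that should be multiplied by particular_price (my reading of the question, so treat it as an assumption), the amounts can be computed in the same query. The sketch runs it on the question's rows in an in-memory SQLite database through Python's `sqlite3`, as a stand-in for MySQL; only the columns needed for the calculation are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mess_stock (sno INTEGER, voucher_id INTEGER,
                         particular_name INTEGER, inward INTEGER, outward INTEGER);
CREATE TABLE mess_voucher (voucher_id INTEGER, voucher_date TEXT);
CREATE TABLE add_grocery (sno INTEGER, particular_name TEXT, particular_price REAL);
INSERT INTO mess_stock VALUES (49,5,4,10,100), (50,17,5,10,20),
  (51,7,3,2,1), (52,7,1,20,20), (53,14,8,5,95);
INSERT INTO mess_voucher VALUES (5,'2015-02-19'), (7,'2015-02-20'),
  (14,'2015-02-28'), (17,'2015-02-21');
INSERT INTO add_grocery VALUES (1,'Rice',25.0), (3,'Mango',150.0),
  (4,'Coconut',22.0), (5,'Banana',6.0), (8,'Pumkin',12.0);
""")

rows = conn.execute("""
SELECT v.voucher_date, g.particular_name,
       s.inward  * g.particular_price AS inward_amount,   -- assumed: quantity x price
       s.outward * g.particular_price AS outward_amount
FROM mess_stock s
JOIN add_grocery g ON g.sno = s.particular_name
JOIN mess_voucher v ON v.voucher_id = s.voucher_id
ORDER BY v.voucher_date, s.sno
""").fetchall()

for row in rows:
    print(row)
```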

MYSQL Average datetime

I have the following SQL statement, and it generates the relevant output correctly (I want to group the values into 3-minute intervals):
SELECT date_time date, UNIX_TIMESTAMP(date_time) AS time_value,
FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3) AS minute_value, ph1_active_power AS p1
FROM powerpro1
GROUP BY date_time
Generated output :
+-----------+---------------------+------------+--------------+---------+
| record_no | date | time_value | minute_value | p1 |
+-----------+---------------------+------------+--------------+---------+
| 1 | 2014-12-01 00:00:00 | 1417372200 | 0 | 73.0767 |
| 2 | 2014-12-01 00:01:00 | 1417372260 | 0 | 73.0293 |
| 3 | 2014-12-01 00:02:00 | 1417372320 | 0 | 72.9818 |
| 4 | 2014-12-01 00:03:00 | 1417372380 | 1 | 72.9343 |
| 5 | 2014-12-01 00:04:00 | 1417372440 | 1 | 72.8868 |
| 6 | 2014-12-01 00:05:00 | 1417372500 | 1 | 72.8392 |
| 7 | 2014-12-01 00:06:00 | 1417372560 | 2 | 72.7916 |
| 8 | 2014-12-01 00:07:00 | 1417372620 | 2 | 72.744 |
| 9 | 2014-12-01 00:08:00 | 1417372680 | 2 | 72.6963 |
| 10 | 2014-12-01 00:09:00 | 1417372740 | 3 | 72.6486 |
| 11 | 2014-12-01 00:10:00 | 1417372800 | 3 | 72.6009 |
| 12 | 2014-12-01 00:11:00 | 1417372860 | 3 | 72.5531 |
| 13 | 2014-12-01 00:12:00 | 1417372920 | 4 | 72.5053 |
| 14 | 2014-12-01 00:13:00 | 1417372980 | 4 | 72.4575 |
| 15 | 2014-12-01 00:14:00 | 1417373040 | 4 | 72.4096 |
| 16 | 2014-12-01 00:15:00 | 1417373100 | 5 | 72.3617 |
| 17 | 2014-12-01 00:16:00 | 1417373160 | 5 | 72.3137 |
| 18 | 2014-12-01 00:17:00 | 1417373220 | 5 | 72.2657 |
| 19 | 2014-12-01 00:18:00 | 1417373280 | 6 | 72.2177 |
| 20 | 2014-12-01 00:19:00 | 1417373340 | 6 | 72.1697 |
| 21 | 2014-12-01 00:20:00 | 1417373400 | 6 | 72.1216 |
| 22 | 2014-12-01 00:21:00 | 1417373460 | 7 | 72.0734 |
| 23 | 2014-12-01 00:22:00 | 1417373520 | 7 | 72.0253 |
| 24 | 2014-12-01 00:23:00 | 1417373580 | 7 | 71.9771 |
+-----------+---------------------+------------+--------------+---------+
But I want to get the average of time_value and the average of p1, and then GROUP BY minute_value. I changed the above query accordingly, as follows:
SELECT date_time date, AVG(UNIX_TIMESTAMP(date_time)) AS time_value, FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3) AS minute_value, ROUND(AVG(ph1_active_power),4) AS p1
FROM powerpro1
GROUP BY minute_value
I got the incorrect output shown below.
+-----------+---------------------+-----------------+--------------+--------+
| record_no | date | time_value | minute_value | p1 |
+-----------+---------------------+-----------------+--------------+--------+
| 1 | 2014-12-01 00:00:00 | 1418754688.6364 | 0 | 2.2622 |
| 4 | 2014-12-01 00:03:00 | 1418754868.6364 | 1 | 2.2541 |
| 7 | 2014-12-01 00:06:00 | 1418755048.6364 | 2 | 2.246 |
| 10 | 2014-12-01 00:09:00 | 1418755228.6364 | 3 | 2.2378 |
| 13 | 2014-12-01 00:12:00 | 1418755408.6364 | 4 | 2.2297 |
| 16 | 2014-12-01 00:15:00 | 1418755588.6364 | 5 | 2.2216 |
| 19 | 2014-12-01 00:18:00 | 1418755768.6364 | 6 | 2.2134 |
| 22 | 2014-12-01 00:21:00 | 1418755948.6364 | 7 | 2.2052 |
+-----------+---------------------+-----------------+--------------+--------+
Required Output :
+-----------+---------------------+--------------+------------+---------+
| record_no | date | minute_value | time_value | p1 |
+-----------+---------------------+--------------+------------+---------+
| 2 | 2014-12-01 00:01:00 | 0 | 1417372260 | 73.0293 |
| 5 | 2014-12-01 00:04:00 | 1 | 1417372440 | 72.8868 |
| 8 | 2014-12-01 00:07:00 | 2 | 1417372620 | 72.744 |
| 11 | 2014-12-01 00:10:00 | 3 | 1417372800 | 72.6009 |
| 14 | 2014-12-01 00:13:00 | 4 | 1417372980 | 72.4575 |
+-----------+---------------------+--------------+------------+---------+
What could be wrong?
Can anyone spare the time and knowledge to help?
Can you try this?
SELECT date_time date, SUM(UNIX_TIMESTAMP(date_time))/COUNT(record_no) AS time_value, FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3)*3 AS minute_value, ROUND((SUM(ph1_active_power)/COUNT(record_no)),4) AS p1
FROM powerpro1
GROUP BY minute_value
I have done it by the following query :
SELECT record_no, date_time,
ROUND(AVG(UNIX_TIMESTAMP(date_time))) AS time_value,
ROUND(AVG(ph1_active_power),4) AS p1
FROM powerpro1
WHERE date_time <= '2014-12-20 00:00:00'
GROUP BY date_time DIV 300
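A likely reason for the incorrect output in the question is that minute_value only encodes the minute of the day, so rows from different days land in the same group. Bucketing on the full timestamp avoids that: integer-divide the Unix timestamp by 180 seconds. In MySQL that idea would look like `GROUP BY UNIX_TIMESTAMP(date_time) DIV 180`; the sketch below shows the same thing in SQLite (via Python's `sqlite3`, a stand-in chosen for easy testing), using the first six rows of the generated output. SQLite treats the strings as UTC, so the epoch values differ from the ones in the question, but the buckets line up the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE powerpro1 "
             "(record_no INTEGER, date_time TEXT, ph1_active_power REAL)")
conn.executemany("INSERT INTO powerpro1 VALUES (?, ?, ?)", [
    (1, "2014-12-01 00:00:00", 73.0767),
    (2, "2014-12-01 00:01:00", 73.0293),
    (3, "2014-12-01 00:02:00", 72.9818),
    (4, "2014-12-01 00:03:00", 72.9343),
    (5, "2014-12-01 00:04:00", 72.8868),
    (6, "2014-12-01 00:05:00", 72.8392),
])

rows = conn.execute("""
SELECT datetime(CAST(AVG(CAST(strftime('%s', date_time) AS INTEGER)) AS INTEGER),
                'unixepoch') AS avg_time,
       ROUND(AVG(ph1_active_power), 4) AS p1
FROM powerpro1
-- Integer division: every 180-second span shares one bucket number.
GROUP BY CAST(strftime('%s', date_time) AS INTEGER) / 180
ORDER BY avg_time
""").fetchall()

for row in rows:
    print(row)
```

With evenly spaced one-minute readings, the average timestamp of each bucket is its middle reading, which matches the rows listed under "Required Output".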

Selecting and grouping with subqueries in MySQL - Returning latest record

I'm new here. I have a problem I could not solve with a subquery; does anyone have an idea how to solve it?
Basically, I need every patient (pa_name) together with the most recent exam of each type, where "most recent" is decided by the field pe_d2, as shown under "Expected Result:".
I tried to sketch the result, which might help in understanding the problem.
The patient_exams table has a very large number of records, so the query needs to be very fast.
Thanks in advance for any solutions!
patient_exams
+-------+----------+----------+------------+------------+
| pe_id | pe_pa_id | pe_ex_id | pe_d1 | pe_d2 |
+-------+----------+----------+------------+------------+
| 1 | 1 | 1 | 2014-05-19 | 2016-05-19 |
| 2 | 1 | 2 | 2014-05-19 | 2015-05-19 |
| 3 | 1 | 3 | 2014-05-26 | 2014-11-26 |
| 4 | 1 | 3 | 2014-05-19 | 2014-11-19 |
| 5 | 1 | 4 | 2013-05-19 | 2013-11-19 |
| 6 | 1 | 4 | 2014-05-19 | 2014-11-19 |
| 7 | 3 | 1 | 2013-08-19 | 2014-08-19 |
| 8 | 3 | 1 | 2014-05-01 | 2017-05-01 |
| 9 | 4 | 2 | 2013-05-02 | 2014-05-02 |
| 10 | 4 | 2 | 2013-11-01 | 2014-05-01 |
| 11 | 4 | 4 | 2013-05-02 | 2014-05-02 |
| 12 | 4 | 4 | 2013-11-01 | 2014-05-01 |
+-------+----------+----------+------------+------------+
patient exams
+-------+---------+ +-------+---------+
| pa_id | pa_name | | ex_id | ex_name |
+-------+---------+ +-------+---------+
| 1 | John M. | | 1 | Exam 1 |
| 2 | Slater | | 2 | Exam 2 |
| 3 | Jonny | | 3 | Exam 3 |
| 4 | Jessy | | 4 | Exam 4 |
| ... | ... | | ... | ... |
+-------+---------+ +-------+---------+
Expected Result:
+-------+---------+---------+------------+------------+
| pe_id | pa_name | ex_name | pe_d1 | pe_d2 |
+-------+---------+---------+------------+------------+
| 9 | Jessy | Exam 2 | 2013-05-02 | 2014-05-02 |
| 11 | Jessy | Exam 4 | 2013-05-02 | 2014-05-02 |
| 1 | John M. | Exam 1 | 2014-05-19 | 2016-05-19 |
| 2 | John M. | Exam 2 | 2014-05-19 | 2015-05-19 |
| 3 | John M. | Exam 3 | 2014-05-26 | 2014-11-26 |
| 6 | John M. | Exam 4 | 2014-05-19 | 2014-11-19 |
| 8 | Jonny | Exam 1 | 2014-05-01 | 2017-05-01 |
+-------+---------+---------+------------+------------+
You need to first get the latest records from the patient_exams table and then join all the 3 tables with the filtered results, like this:
SELECT pe_id, pa_name, ex_name, pe_d1, pe_d2
FROM patient_exams pe
JOIN patient p
ON pe.pe_pa_id = p.pa_id
JOIN exams e
ON pe.pe_ex_id = e.ex_id
JOIN (
SELECT pe_pa_id, pe_ex_id, MAX(pe_d2) AS max_pe_d2
FROM patient_exams
GROUP BY pe_pa_id, pe_ex_id
) AS t
ON pe.pe_pa_id = t.pe_pa_id
AND pe.pe_ex_id = t.pe_ex_id
AND pe.pe_d2 = t.max_pe_d2
ORDER BY pa_name, ex_name
Demo/Solution
Thanks to everyone, works fine!
You can use joins among your tables. For the latest exam date you need an additional self-join to patient_exams, using a subquery that gets the maximum exam date, i.e. max(pe_d2), per patient and exam:
select
pe.pe_id,
p.pa_name ,
e.ex_name ,
pe.pe_d1 ,
pe.pe_d2
from exams e
join patient_exams pe on(e.ex_id = pe.pe_ex_id)
join patient p on(p.pa_id= pe.pe_pa_id)
join (select `pe_pa_id`, `pe_ex_id` ,max(pe_d2) pe_d2
from patient_exams
group by `pe_pa_id`, `pe_ex_id`) pee
on (pe.`pe_pa_id`= pee.`pe_pa_id` and
pe.`pe_ex_id` = pee.`pe_ex_id` and
pe.pe_d2 = pee.pe_d2
)
order by p.pa_name ,pee.pe_d2 desc
Demo
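Both answers use the same pattern: compute MAX(pe_d2) per (pe_pa_id, pe_ex_id) in a derived table, then join back to patient_exams to pick up the full row. The sketch below runs that pattern on the question's rows in an in-memory SQLite database through Python's `sqlite3` (a stand-in for MySQL, chosen only so the example is easy to run). The composite index and its name `idx_pe_latest` are my own addition, on the assumption that it helps the grouped subquery on a large table; measure before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (pa_id INTEGER, pa_name TEXT);
CREATE TABLE exams (ex_id INTEGER, ex_name TEXT);
CREATE TABLE patient_exams (pe_id INTEGER, pe_pa_id INTEGER, pe_ex_id INTEGER,
                            pe_d1 TEXT, pe_d2 TEXT);
-- Hypothetical index (assumption): covers the GROUP BY columns plus pe_d2.
CREATE INDEX idx_pe_latest ON patient_exams (pe_pa_id, pe_ex_id, pe_d2);
INSERT INTO patient VALUES (1,'John M.'), (2,'Slater'), (3,'Jonny'), (4,'Jessy');
INSERT INTO exams VALUES (1,'Exam 1'), (2,'Exam 2'), (3,'Exam 3'), (4,'Exam 4');
INSERT INTO patient_exams VALUES
  (1,1,1,'2014-05-19','2016-05-19'), (2,1,2,'2014-05-19','2015-05-19'),
  (3,1,3,'2014-05-26','2014-11-26'), (4,1,3,'2014-05-19','2014-11-19'),
  (5,1,4,'2013-05-19','2013-11-19'), (6,1,4,'2014-05-19','2014-11-19'),
  (7,3,1,'2013-08-19','2014-08-19'), (8,3,1,'2014-05-01','2017-05-01'),
  (9,4,2,'2013-05-02','2014-05-02'), (10,4,2,'2013-11-01','2014-05-01'),
  (11,4,4,'2013-05-02','2014-05-02'), (12,4,4,'2013-11-01','2014-05-01');
""")

rows = conn.execute("""
SELECT pe.pe_id, p.pa_name, e.ex_name, pe.pe_d1, pe.pe_d2
FROM patient_exams pe
JOIN patient p ON pe.pe_pa_id = p.pa_id
JOIN exams e ON pe.pe_ex_id = e.ex_id
JOIN (
    -- Latest pe_d2 for each patient/exam combination.
    SELECT pe_pa_id, pe_ex_id, MAX(pe_d2) AS max_pe_d2
    FROM patient_exams
    GROUP BY pe_pa_id, pe_ex_id
) AS t
  ON pe.pe_pa_id = t.pe_pa_id
 AND pe.pe_ex_id = t.pe_ex_id
 AND pe.pe_d2 = t.max_pe_d2
ORDER BY p.pa_name, e.ex_name
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

If two rows for the same patient and exam share the same maximum pe_d2, the join returns both of them, so a tie-breaker (for example the higher pe_id) would be needed where that can happen.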