I am attempting to join table_a and table_b together on their nearest/closest datetime fields (date_a and date_b), but I am wanting to ensure that I do not receive duplicate values from table_b for each joined row. If the available date_b rows from table_b are used up on closer table_a values, then the joined row should just remain blank.
Another way of putting it: the datetime values from table_b can only be used once, and they should only be used on the absolute closest/nearest value from table_a.
Here's an example of table_a:
| entry_a | date_a |
|---------|---------------------|
| 1 | 2019-02-20 01:05:00 |
| 2 | 2019-02-20 01:10:00 |
| 3 | 2019-02-21 01:15:00 |
| 4 | 2019-02-22 01:20:00 |
| 5 | 2019-02-23 01:25:00 |
| 6 | 2019-02-24 01:30:00 |
| 7 | 2019-02-25 01:35:00 |
| 8 | 2019-02-26 01:40:00 |
| 9 | 2019-02-27 01:45:00 |
| 10 | 2019-02-28 01:50:00 |
Here's table_b:
| entry_b | date_b | filename |
|---------|---------------------|----------------|
| 1 | 2019-02-20 01:03:00 | 20190220010300 |
| 2 | 2019-02-20 01:07:00 | 20190220010700 |
| 3 | 2019-02-23 01:23:00 | 20190223012300 |
| 4 | 2019-02-24 01:26:00 | 20190224012600 |
| 5 | 2019-02-25 01:30:00 | 20190225013000 |
| 6 | 2019-02-26 01:34:00 | 20190226013400 |
| 7 | 2019-02-27 01:40:00 | 20190227014000 |
| 8 | 2019-02-28 01:50:00 | 20190228015000 |
| 9 | 2019-02-28 01:51:00 | 20190228015100 |
And here's the desired result:
| entry_a | date_a | entry_b | date_b | filename |
|---------|---------------------|---------|---------------------|----------------|
| 1 | 2019-02-20 01:05:00 | 1 | 2019-02-20 01:03:00 | 20190220010300 |
| 2 | 2019-02-20 01:10:00 | 2 | 2019-02-20 01:07:00 | 20190220010700 |
| 3 | 2019-02-21 01:15:00 | (null) | (null) | (null) |
| 4 | 2019-02-22 01:20:00 | 3 | 2019-02-23 01:23:00 | 20190223012300 |
| 5 | 2019-02-23 01:25:00 | 4 | 2019-02-24 01:26:00 | 20190224012600 |
| 6 | 2019-02-24 01:30:00 | 5 | 2019-02-25 01:30:00 | 20190225013000 |
| 7 | 2019-02-25 01:35:00 | 6 | 2019-02-26 01:34:00 | 20190226013400 |
| 8 | 2019-02-26 01:40:00 | 7 | 2019-02-27 01:40:00 | 20190227014000 |
| 9 | 2019-02-27 01:45:00 | 8 | 2019-02-28 01:50:00 | 20190228015000 |
| 10 | 2019-02-28 01:50:00 | 9 | 2019-02-28 01:51:00 | 20190228015100 |
One thing to particularly note in the desired result: the last two rows show that date_b.8 and date_a.10 match exactly ... but if date_b.8 and date_a.9 are allowed to match, then date_b.9 and date_a.10 can match on something fairly close, also. (If this is an impossible complication, I understand. It's not critical. What's more important is the situation illustrated in rows 2-4 of the result_table.)
I am using MySQL 5.6. I've built a SQL fiddle here with the tables loaded up: DEMO
Thank you all very kindly for your help and for the many answers you've provided to guide me over the years.
Related
The Problem:
Construct the SQL statement to find all of the people that have meetings only before Dec. 25, 2016 at noon using INNER JOINs. Display the following columns:
Person’s first name
Person’s last name
Meeting ID
Meeting start date and time
Meeting end date and time
The Tables:
There are 5 tables in this database (person, building, room, meeting, person_meeting):
+-----------+------------+------------+
| person_id | first_name | last_name |
+-----------+------------+------------+
| 1 | Tom | Hanks |
| 2 | Anne | Hathaway |
| 3 | Tom | Cruise |
| 4 | Meryl | Streep |
| 5 | Chris | Pratt |
| 6 | Halle | Berry |
| 7 | Robert | De Niro |
| 8 | Julia | Roberts |
| 9 | Denzel | Washington |
| 10 | Melissa | McCarthy |
+-----------+------------+------------+
+-------------+----------------------+
| building_id | building_name |
+-------------+----------------------+
| 1 | Headquarters |
| 2 | Main Street Buidling |
+-------------+----------------------+
+---------+-------------+-------------+----------+
| room_id | room_number | building_id | capacity |
+---------+-------------+-------------+----------+
| 1 | 100 | 1 | 5 |
| 2 | 200 | 1 | 4 |
| 3 | 300 | 1 | 10 |
| 4 | 10 | 2 | 4 |
| 5 | 20 | 2 | 4 |
+---------+-------------+-------------+----------+
+------------+---------+---------------------+---------------------+
| meeting_id | room_id | meeting_start | meeting_end |
+------------+---------+---------------------+---------------------+
| 1 | 1 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 2 | 1 | 2016-12-25 10:00:00 | 2016-12-25 12:00:00 |
| 3 | 1 | 2016-12-25 11:00:00 | 2016-12-25 12:00:00 |
| 4 | 2 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 5 | 4 | 2016-12-25 09:00:00 | 2016-12-25 10:00:00 |
| 6 | 5 | 2016-12-25 14:00:00 | 2016-12-25 16:00:00 |
+------------+---------+---------------------+---------------------+
+-----------+------------+
| person_id | meeting_id |
+-----------+------------+
| 1 | 1 |
| 10 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
| 7 | 2 |
| 8 | 2 |
| 9 | 3 |
| 10 | 3 |
| 1 | 4 |
| 2 | 4 |
| 8 | 5 |
| 9 | 5 |
| 1 | 6 |
| 2 | 6 |
| 3 | 6 |
+-----------+------------+
My Solution so Far:
SELECT first_name,last_name ,building_name,meeting_start,meeting_end
FROM person P
INNER JOIN building B
ON P.person_id=PM.person_id
INNER JOIN person_meeting PM
ON M.room_id
I'm having trouble completing the SQL statement, please help if possible.
This might do the trick for you. I used aliases for each of the tables in the join and used them in the select list for the columns you needed. Then I joined the needed tables, and the WHERE clause keeps the meetings with a meeting_end before Dec. 25 at noon. (I assumed the end time; if you want the start time, just switch it to meeting_start.)
select p.first_name,p.last_name,pm.meeting_id,m.meeting_start,m.meeting_end from person p
inner join person_meeting pm on pm.person_id = p.person_id
inner join meeting m on m.meeting_id = pm.meeting_id
where m.meeting_end < '2016-12-25 12:00:00'
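If "only before" is read strictly, a person who also has a meeting at or after noon should be dropped entirely, which a plain filter on the meeting rows does not do. A minimal sketch of that stricter reading follows, run through Python's sqlite3 so it can be tried without a MySQL server. The function name and the choice of meeting_start as the comparison column are assumptions, and the NOT EXISTS filter is my reading of the task, not something the exercise confirms.

```python
import sqlite3

NOON = "2016-12-25 12:00:00"

def people_with_meetings_only_before(cutoff=NOON):
    """Rows for people whose every meeting starts before `cutoff`."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE meeting (meeting_id INTEGER PRIMARY KEY, room_id INTEGER,
                              meeting_start TEXT, meeting_end TEXT);
        CREATE TABLE person_meeting (person_id INTEGER, meeting_id INTEGER);
        INSERT INTO person VALUES
            (1,'Tom','Hanks'),(2,'Anne','Hathaway'),(3,'Tom','Cruise'),
            (4,'Meryl','Streep'),(5,'Chris','Pratt'),(6,'Halle','Berry'),
            (7,'Robert','De Niro'),(8,'Julia','Roberts'),
            (9,'Denzel','Washington'),(10,'Melissa','McCarthy');
        INSERT INTO meeting VALUES
            (1,1,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
            (2,1,'2016-12-25 10:00:00','2016-12-25 12:00:00'),
            (3,1,'2016-12-25 11:00:00','2016-12-25 12:00:00'),
            (4,2,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
            (5,4,'2016-12-25 09:00:00','2016-12-25 10:00:00'),
            (6,5,'2016-12-25 14:00:00','2016-12-25 16:00:00');
        INSERT INTO person_meeting VALUES
            (1,1),(10,1),(1,2),(2,2),(3,2),(4,2),(5,2),(6,2),(7,2),(8,2),
            (9,3),(10,3),(1,4),(2,4),(8,5),(9,5),(1,6),(2,6),(3,6);
    """)
    rows = conn.execute("""
        SELECT p.first_name, p.last_name, m.meeting_id, m.meeting_start, m.meeting_end
        FROM person p
        INNER JOIN person_meeting pm ON pm.person_id = p.person_id
        INNER JOIN meeting m ON m.meeting_id = pm.meeting_id
        -- Drop anyone who has at least one meeting starting at or after the cutoff.
        WHERE NOT EXISTS (
            SELECT 1
            FROM person_meeting pm2
            INNER JOIN meeting m2 ON m2.meeting_id = pm2.meeting_id
            WHERE pm2.person_id = p.person_id
              AND m2.meeting_start >= ?
        )
        ORDER BY p.person_id, m.meeting_id
    """, (cutoff,)).fetchall()
    conn.close()
    return rows

for row in people_with_meetings_only_before():
    print(row)
```

With the sample data, the three people booked into meeting 6 (which starts at 14:00) disappear from the result, including their morning meetings. The SQL inside the string uses only standard syntax, so it should carry over to MySQL with a literal in place of the `?` placeholder.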
I want to fetch the data from Table based on date but in an incremental way.
Suppose I have data like this which is grouped by date
| DATE | Count |
| 2015-06-23 | 10 |
| 2015-06-24 | 8 |
| 2015-06-25 | 6 |
| 2015-06-26 | 3 |
| 2015-06-27 | 2 |
| 2015-06-29 | 2 |
| 2015-06-30 | 3 |
| 2015-07-01 | 1 |
| 2015-07-02 | 3 |
| 2015-07-03 | 4 |
So the result should come like this
| DATE | Count| Sum|
| 2015-06-23 | 10 | 10 |
| 2015-06-24 | 8 | 18 |
| 2015-06-25 | 6 | 24 |
| 2015-06-26 | 3 | 27 |
| 2015-06-27 | 2 | 29 |
| 2015-06-29 | 2 | 31 |
| 2015-06-30 | 3 | 34 |
| 2015-07-01 | 1 | 35 |
| 2015-07-02 | 3 | 38 |
| 2015-07-03 | 4 | 42 |
You would join each date to itself and to every earlier date, and then sum the counts of the joined rows to get the running total.
If you give me your table structure, I can make it run.
id, name, date_joined
SELECT t.date_joined, t.theCount, SUM(counts.theCount) AS theSum
FROM
(SELECT date_joined, COUNT(*) AS theCount
FROM yourTable
GROUP BY date_joined
) AS t
INNER JOIN
(SELECT date_joined, COUNT(*) AS theCount
FROM yourTable
GROUP BY date_joined
) AS counts
ON
counts.date_joined <= t.date_joined
GROUP BY t.date_joined, t.theCount
ORDER BY t.date_joined
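To check the self-join idea against the numbers in the question, here is a small sketch using Python's sqlite3. It assumes the per-date counts already sit in a table; the table name `daily_counts`, its columns `d` and `cnt`, and the function name are made up for illustration.

```python
import sqlite3

# The already-grouped rows from the question (date, count).
DAILY_COUNTS = [
    ("2015-06-23", 10), ("2015-06-24", 8), ("2015-06-25", 6),
    ("2015-06-26", 3), ("2015-06-27", 2), ("2015-06-29", 2),
    ("2015-06-30", 3), ("2015-07-01", 1), ("2015-07-02", 3),
    ("2015-07-03", 4),
]

def running_totals(rows):
    """Return (date, count, running_sum) using a self-join on earlier dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_counts (d TEXT PRIMARY KEY, cnt INTEGER)")
    conn.executemany("INSERT INTO daily_counts VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT cur.d, cur.cnt, SUM(prev.cnt) AS running_sum
        FROM daily_counts AS cur
        -- "<=" so that each date also includes its own count.
        JOIN daily_counts AS prev ON prev.d <= cur.d
        GROUP BY cur.d, cur.cnt
        ORDER BY cur.d
    """).fetchall()
    conn.close()
    return result

for row in running_totals(DAILY_COUNTS):
    print(row)
```

The join compares every date with every earlier one, so the work grows quadratically with the number of dates. That is usually fine for a few thousand rows; for much larger tables a different approach may be worth considering.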
I have three tables, mess_stock, mess_voucher, add_grocery.
Mess_stock table is below,
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
| sno | voucher_id | particular_name | opening_balance | inward | outward | balance | pay_amount | pay_type |
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
| 49 | 5 | 4 | 100 | 10 | 100 | 10 | 10.00 | 1 |
| 50 | 17 | 5 | 111 | 10 | 20 | 101 | 60.00 | 1 |
| 51 | 7 | 3 | 123 | 2 | 1 | 124 | 300.00 | 1 |
| 52 | 7 | 1 | 123 | 20 | 20 | 123 | 500.00 | 2 |
| 53 | 14 | 8 | 100 | 5 | 95 | 10 | 60.00 | 2 |
+-----+------------+-----------------+-----------------+--------+---------+---------+------------+----------+
Mess_voucher table is below
+------------+--------------+--------------+
| voucher_id | voucher_name | voucher_date |
+------------+--------------+--------------+
| 5 | VG1001 | 2015-02-19 |
| 6 | VG1001 | 2015-02-20 |
| 7 | VG1002 | 2015-02-20 |
| 8 | VG1002 | 2015-02-19 |
| 9 | MS1001 | 2015-02-20 |
| 10 | VG10012 | 2015-02-19 |
| 11 | 0 | 2015-02-23 |
| 12 | 1 | 2015-02-24 |
| 13 | MS1001 | 2015-02-25 |
| 14 | MS1001 | 2015-02-28 |
| 15 | VG1003 | 2015-02-28 |
| 16 | MS1001 | 2015-02-19 |
| 17 | MS1001 | 2015-02-21 |
+------------+--------------+--------------+
Add_grocery table is below
+-----+-----------------+------------------+
| sno | particular_name | particular_price |
+-----+-----------------+------------------+
| 1 | Rice | 25.00 |
| 3 | Mango | 150.00 |
| 4 | Coconut | 22.00 |
| 5 | Banana | 6.00 |
| 6 | Raddish | 12.00 |
| 7 | Apple | 150.00 |
| 8 | Pumkin | 12.00 |
+-----+-----------------+------------------+
I want to group the sum of pay_amount of mess_stock table. I have used the below query
SELECT opening_balance AS ope_stock,
balance AS clo_stock,
SUM(IF(pay_type = 1, pay_amount, 0)) mess_pay,
SUM(IF(pay_type=2, pay_amount, 0)) est_pay
FROM mess_stock;
That works fine. The particular_name column holds the auto-increment id (sno) of the add_grocery table. I also need the total inward and outward amounts. For example, an inward quantity of 10 has to be valued using the particular_price fetched from add_grocery through the particular_name stored in mess_stock, and similarly for the other columns. I also want the result sorted by date. The date of each entry is stored in the mess_voucher table, which is connected to mess_stock through voucher_id.
Try this, it should work.
Use an inner join:
SELECT t2.`particular_name`, t1.`inward`, t1.`outward`, t2.`particular_price`, t3.`voucher_date`
FROM mess_stock t1
JOIN add_grocery t2 ON t1.`particular_name` = t2.`sno`
JOIN mess_voucher t3 ON t3.`voucher_id` = t1.`voucher_id`
ORDER BY t3.`voucher_date` DESC
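The question also asks for inward and outward totals per date. One way to read that is quantity times particular_price, summed per voucher_date. The sketch below tries that reading with Python's sqlite3 on the sample rows. The multiplication itself is an assumption on my part, as are the function name and the output column names; only the vouchers that mess_stock actually references are loaded.

```python
import sqlite3

def stock_value_by_date():
    """Sum inward/outward quantity * price per voucher date (assumed meaning)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mess_stock (sno INTEGER, voucher_id INTEGER, particular_name INTEGER,
                                 opening_balance INTEGER, inward INTEGER, outward INTEGER,
                                 balance INTEGER, pay_amount REAL, pay_type INTEGER);
        CREATE TABLE mess_voucher (voucher_id INTEGER, voucher_name TEXT, voucher_date TEXT);
        CREATE TABLE add_grocery (sno INTEGER, particular_name TEXT, particular_price REAL);
        INSERT INTO mess_stock VALUES
            (49, 5, 4, 100, 10, 100, 10, 10.00, 1),
            (50, 17, 5, 111, 10, 20, 101, 60.00, 1),
            (51, 7, 3, 123, 2, 1, 124, 300.00, 1),
            (52, 7, 1, 123, 20, 20, 123, 500.00, 2),
            (53, 14, 8, 100, 5, 95, 10, 60.00, 2);
        -- Subset of mess_voucher: only the vouchers used by mess_stock.
        INSERT INTO mess_voucher VALUES
            (5, 'VG1001', '2015-02-19'), (7, 'VG1002', '2015-02-20'),
            (14, 'MS1001', '2015-02-28'), (17, 'MS1001', '2015-02-21');
        INSERT INTO add_grocery VALUES
            (1, 'Rice', 25.00), (3, 'Mango', 150.00), (4, 'Coconut', 22.00),
            (5, 'Banana', 6.00), (6, 'Raddish', 12.00), (7, 'Apple', 150.00),
            (8, 'Pumkin', 12.00);
    """)
    rows = conn.execute("""
        SELECT v.voucher_date,
               SUM(s.inward  * g.particular_price) AS inward_total,
               SUM(s.outward * g.particular_price) AS outward_total
        FROM mess_stock s
        JOIN add_grocery g ON g.sno = s.particular_name
        JOIN mess_voucher v ON v.voucher_id = s.voucher_id
        GROUP BY v.voucher_date
        ORDER BY v.voucher_date
    """).fetchall()
    conn.close()
    return rows

for row in stock_value_by_date():
    print(row)
```

For most of the sample rows, inward times price matches the stored pay_amount (for example 2 mangoes at 150.00 gives 300.00), which is weak evidence that this is the intended calculation, but the row with sno 49 does not fit, so treat it as a guess.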
I have the following SQL statement, and it generates the relevant output correctly (I want to group the values into 3-minute intervals):
SELECT date_time date, UNIX_TIMESTAMP(date_time) AS time_value,
FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3) AS minute_value, ph1_active_power AS p1
FROM powerpro1
GROUP BY date_time
Generated output :
+-----------+---------------------+------------+--------------+---------+
| record_no | date | time_value | minute_value | p1 |
+-----------+---------------------+------------+--------------+---------+
| 1 | 2014-12-01 00:00:00 | 1417372200 | 0 | 73.0767 |
| 2 | 2014-12-01 00:01:00 | 1417372260 | 0 | 73.0293 |
| 3 | 2014-12-01 00:02:00 | 1417372320 | 0 | 72.9818 |
| 4 | 2014-12-01 00:03:00 | 1417372380 | 1 | 72.9343 |
| 5 | 2014-12-01 00:04:00 | 1417372440 | 1 | 72.8868 |
| 6 | 2014-12-01 00:05:00 | 1417372500 | 1 | 72.8392 |
| 7 | 2014-12-01 00:06:00 | 1417372560 | 2 | 72.7916 |
| 8 | 2014-12-01 00:07:00 | 1417372620 | 2 | 72.744 |
| 9 | 2014-12-01 00:08:00 | 1417372680 | 2 | 72.6963 |
| 10 | 2014-12-01 00:09:00 | 1417372740 | 3 | 72.6486 |
| 11 | 2014-12-01 00:10:00 | 1417372800 | 3 | 72.6009 |
| 12 | 2014-12-01 00:11:00 | 1417372860 | 3 | 72.5531 |
| 13 | 2014-12-01 00:12:00 | 1417372920 | 4 | 72.5053 |
| 14 | 2014-12-01 00:13:00 | 1417372980 | 4 | 72.4575 |
| 15 | 2014-12-01 00:14:00 | 1417373040 | 4 | 72.4096 |
| 16 | 2014-12-01 00:15:00 | 1417373100 | 5 | 72.3617 |
| 17 | 2014-12-01 00:16:00 | 1417373160 | 5 | 72.3137 |
| 18 | 2014-12-01 00:17:00 | 1417373220 | 5 | 72.2657 |
| 19 | 2014-12-01 00:18:00 | 1417373280 | 6 | 72.2177 |
| 20 | 2014-12-01 00:19:00 | 1417373340 | 6 | 72.1697 |
| 21 | 2014-12-01 00:20:00 | 1417373400 | 6 | 72.1216 |
| 22 | 2014-12-01 00:21:00 | 1417373460 | 7 | 72.0734 |
| 23 | 2014-12-01 00:22:00 | 1417373520 | 7 | 72.0253 |
| 24 | 2014-12-01 00:23:00 | 1417373580 | 7 | 71.9771 |
+-----------+---------------------+------------+--------------+---------+
But I want the average of time_value and the average of p1, grouped by minute_value. When I changed the query above accordingly, as follows,
SELECT date_time date, AVG(UNIX_TIMESTAMP(date_time)) AS time_value, FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3) AS minute_value, ROUND(AVG(ph1_active_power),4) AS p1
FROM powerpro1
GROUP BY minute_value
I got incorrect output, as shown below.
+-----------+---------------------+-----------------+--------------+--------+
| record_no | date | time_value | minute_value | p1 |
+-----------+---------------------+-----------------+--------------+--------+
| 1 | 2014-12-01 00:00:00 | 1418754688.6364 | 0 | 2.2622 |
| 4 | 2014-12-01 00:03:00 | 1418754868.6364 | 1 | 2.2541 |
| 7 | 2014-12-01 00:06:00 | 1418755048.6364 | 2 | 2.246 |
| 10 | 2014-12-01 00:09:00 | 1418755228.6364 | 3 | 2.2378 |
| 13 | 2014-12-01 00:12:00 | 1418755408.6364 | 4 | 2.2297 |
| 16 | 2014-12-01 00:15:00 | 1418755588.6364 | 5 | 2.2216 |
| 19 | 2014-12-01 00:18:00 | 1418755768.6364 | 6 | 2.2134 |
| 22 | 2014-12-01 00:21:00 | 1418755948.6364 | 7 | 2.2052 |
+-----------+---------------------+-----------------+--------------+--------+
Required Output :
+-----------+---------------------+--------------+------------+---------+
| record_no | date | minute_value | time_value | p1 |
+-----------+---------------------+--------------+------------+---------+
| 2 | 2014-12-01 00:01:00 | 0 | 1417372260 | 73.0293 |
| 5 | 2014-12-01 00:04:00 | 1 | 1417372440 | 72.8868 |
| 8 | 2014-12-01 00:07:00 | 2 | 1417372620 | 72.744 |
| 11 | 2014-12-01 00:10:00 | 3 | 1417372800 | 72.6009 |
| 14 | 2014-12-01 00:13:00 | 4 | 1417372980 | 72.4575 |
+-----------+---------------------+--------------+------------+---------+
What might be wrong?
Can anyone help? Thank you for your time and knowledge.
Can you try this?
SELECT date_time date,
SUM(UNIX_TIMESTAMP(date_time))/COUNT(record_no) AS time_value,
FLOOR((MINUTE(date_time) + (HOUR(date_time)*60))/3)*3 AS minute_value,
ROUND((SUM(ph1_active_power)/COUNT(record_no)),4) AS p1
FROM powerpro1
GROUP BY minute_value
I have done it by the following query :
SELECT record_no, date_time,
ROUND(AVG(UNIX_TIMESTAMP(date_time))) AS time_value,
ROUND(AVG(ph1_active_power),4) AS p1
FROM powerpro1
WHERE date_time <= '2014-12-20 00:00:00'
GROUP BY date_time DIV 300
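A likely cause of the wrong averages in the question is that minute_value is a minute-of-day bucket, so it repeats every day and grouping on it alone merges rows from different dates. One alternative is to bucket on the epoch seconds divided by 180, which is unique across days. The sketch below tries this with Python's sqlite3; in MySQL the bucket expression would presumably be `FLOOR(UNIX_TIMESTAMP(date_time) / 180)`. The function name is hypothetical, and the second-day readings are made-up values added only to show that days stay separate.

```python
import sqlite3

READINGS = [
    # First nine readings from the question's "Generated output" table.
    ("2014-12-01 00:00:00", 73.0767),
    ("2014-12-01 00:01:00", 73.0293),
    ("2014-12-01 00:02:00", 72.9818),
    ("2014-12-01 00:03:00", 72.9343),
    ("2014-12-01 00:04:00", 72.8868),
    ("2014-12-01 00:05:00", 72.8392),
    ("2014-12-01 00:06:00", 72.7916),
    ("2014-12-01 00:07:00", 72.744),
    ("2014-12-01 00:08:00", 72.6963),
    # Made-up readings for a second day (not from the question).
    ("2014-12-02 00:00:00", 1.0),
    ("2014-12-02 00:01:00", 1.0),
    ("2014-12-02 00:02:00", 1.0),
]

def three_minute_averages(readings):
    """Average timestamp and power per 180-second bucket of epoch time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE powerpro1 (date_time TEXT, ph1_active_power REAL)")
    conn.executemany("INSERT INTO powerpro1 VALUES (?, ?)", readings)
    rows = conn.execute("""
        SELECT datetime(
                   CAST(AVG(CAST(strftime('%s', date_time) AS INTEGER)) AS INTEGER),
                   'unixepoch') AS mid_time,
               ROUND(AVG(ph1_active_power), 4) AS p1
        FROM powerpro1
        -- Integer division: every 180 seconds of epoch time is one bucket.
        GROUP BY CAST(strftime('%s', date_time) AS INTEGER) / 180
        ORDER BY mid_time
    """).fetchall()
    conn.close()
    return rows

for row in three_minute_averages(READINGS):
    print(row)
```

SQLite treats the stored strings as UTC here, so the absolute epoch values differ from the time_value column in the question, which appears to come from a server in another time zone. The averaged date and p1 for the first three buckets agree with the "Required Output" table.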
I'm new here. Does anyone have a solution to a problem I could not solve with a subquery?
Basically, I need every patient (pa_name) with the most current record of each exam, where "most current" is determined by the field pe_d2, as shown under "Expected Result:".
I tried to sketch the result, which might help explain the problem.
The patient_exams table has a great many records, so the query needs to be very fast.
Thanks in advance for possible solutions!
patient_exams
+-------+----------+----------+------------+------------+
| pe_id | pe_pa_id | pe_ex_id | pe_d1 | pe_d2 |
+-------+----------+----------+------------+------------+
| 1 | 1 | 1 | 2014-05-19 | 2016-05-19 |
| 2 | 1 | 2 | 2014-05-19 | 2015-05-19 |
| 3 | 1 | 3 | 2014-05-26 | 2014-11-26 |
| 4 | 1 | 3 | 2014-05-19 | 2014-11-19 |
| 5 | 1 | 4 | 2013-05-19 | 2013-11-19 |
| 6 | 1 | 4 | 2014-05-19 | 2014-11-19 |
| 7 | 3 | 1 | 2013-08-19 | 2014-08-19 |
| 8 | 3 | 1 | 2014-05-01 | 2017-05-01 |
| 9 | 4 | 2 | 2013-05-02 | 2014-05-02 |
| 10 | 4 | 2 | 2013-11-01 | 2014-05-01 |
| 11 | 4 | 4 | 2013-05-02 | 2014-05-02 |
| 12 | 4 | 4 | 2013-11-01 | 2014-05-01 |
+-------+----------+----------+------------+------------+
patient exams
+-------+---------+ +-------+---------+
| pa_id | pa_name | | ex_id | ex_name |
+-------+---------+ +-------+---------+
| 1 | John M. | | 1 | Exam 1 |
| 2 | Slater | | 2 | Exam 2 |
| 3 | Jonny | | 3 | Exam 3 |
| 4 | Jessy | | 4 | Exam 4 |
| ... | ... | | ... | ... |
+-------+---------+ +-------+---------+
Expected Result:
+-------+---------+---------+------------+------------+
| pe_id | pa_name | ex_name | pe_d1 | pe_d2 |
+-------+---------+---------+------------+------------+
| 9 | Jessy | Exam 2 | 2013-05-02 | 2014-05-02 |
| 11 | Jessy | Exam 4 | 2013-05-02 | 2014-05-02 |
| 1 | John M. | Exam 1 | 2014-05-19 | 2016-05-19 |
| 2 | John M. | Exam 2 | 2014-05-19 | 2015-05-19 |
| 3 | John M. | Exam 3 | 2014-05-26 | 2014-11-26 |
| 6 | John M. | Exam 4 | 2014-05-19 | 2014-11-19 |
| 8 | Jonny | Exam 1 | 2014-05-01 | 2017-05-01 |
+-------+---------+---------+------------+------------+
You need to first get the latest records from the patient_exams table and then join all the 3 tables with the filtered results, like this:
SELECT pe_id, pa_name, ex_name, pe_d1, pe_d2
FROM patient_exams pe
JOIN patient p
ON pe.pe_pa_id = p.pa_id
JOIN exams e
ON pe.pe_ex_id = e.ex_id
JOIN (
SELECT pe_pa_id, pe_ex_id, MAX(pe_d2) AS max_pe_d2
FROM patient_exams
GROUP BY pe_pa_id, pe_ex_id
) AS t
ON pe.pe_pa_id = t.pe_pa_id
AND pe.pe_ex_id = t.pe_ex_id
AND pe.pe_d2 = t.max_pe_d2
ORDER BY pa_name, ex_name
Demo/Solution
Thanks to everyone, works fine!
You can use joins among your tables. For the max exam date you need an additional self join to patient_exams with a subquery that gets the maximum exam date, i.e. max(pe_d2), for each patient and exam.
select
pe.pe_id,
p.pa_name ,
e.ex_name ,
pe.pe_d1 ,
pe.pe_d2
from exams e
join patient_exams pe on(e.ex_id = pe.pe_ex_id)
join patient p on(p.pa_id= pe.pe_pa_id)
join (select `pe_pa_id`, `pe_ex_id` ,max(pe_d2) pe_d2
from patient_exams
group by `pe_pa_id`, `pe_ex_id`) pee
on (pe.`pe_pa_id`= pee.`pe_pa_id` and
pe.`pe_ex_id` = pee.`pe_ex_id` and
pe.pe_d2 = pee.pe_d2
)
order by p.pa_name ,pee.pe_d2 desc
Demo
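To try the "latest row per patient and exam" join on the sample data without a MySQL server, here is a sketch using Python's sqlite3. The function name is hypothetical, and the composite index is an assumption prompted by the speed requirement in the question: an index on (pe_pa_id, pe_ex_id, pe_d2) may let the engine resolve the MAX per group from the index alone, but that should be confirmed with EXPLAIN on the real table.

```python
import sqlite3

def latest_exams():
    """Latest patient_exams row (by pe_d2) for every patient/exam pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patient (pa_id INTEGER PRIMARY KEY, pa_name TEXT);
        CREATE TABLE exams (ex_id INTEGER PRIMARY KEY, ex_name TEXT);
        CREATE TABLE patient_exams (pe_id INTEGER PRIMARY KEY, pe_pa_id INTEGER,
                                    pe_ex_id INTEGER, pe_d1 TEXT, pe_d2 TEXT);
        -- Assumed index, not part of the original schema.
        CREATE INDEX idx_pe_patient_exam_d2
            ON patient_exams (pe_pa_id, pe_ex_id, pe_d2);
        INSERT INTO patient VALUES (1,'John M.'),(2,'Slater'),(3,'Jonny'),(4,'Jessy');
        INSERT INTO exams VALUES (1,'Exam 1'),(2,'Exam 2'),(3,'Exam 3'),(4,'Exam 4');
        INSERT INTO patient_exams VALUES
            (1, 1, 1, '2014-05-19', '2016-05-19'),
            (2, 1, 2, '2014-05-19', '2015-05-19'),
            (3, 1, 3, '2014-05-26', '2014-11-26'),
            (4, 1, 3, '2014-05-19', '2014-11-19'),
            (5, 1, 4, '2013-05-19', '2013-11-19'),
            (6, 1, 4, '2014-05-19', '2014-11-19'),
            (7, 3, 1, '2013-08-19', '2014-08-19'),
            (8, 3, 1, '2014-05-01', '2017-05-01'),
            (9, 4, 2, '2013-05-02', '2014-05-02'),
            (10, 4, 2, '2013-11-01', '2014-05-01'),
            (11, 4, 4, '2013-05-02', '2014-05-02'),
            (12, 4, 4, '2013-11-01', '2014-05-01');
    """)
    rows = conn.execute("""
        SELECT pe.pe_id, p.pa_name, e.ex_name, pe.pe_d1, pe.pe_d2
        FROM patient_exams pe
        JOIN patient p ON pe.pe_pa_id = p.pa_id
        JOIN exams e ON pe.pe_ex_id = e.ex_id
        JOIN (
            -- One row per patient/exam pair holding the latest pe_d2.
            SELECT pe_pa_id, pe_ex_id, MAX(pe_d2) AS max_pe_d2
            FROM patient_exams
            GROUP BY pe_pa_id, pe_ex_id
        ) AS t
          ON pe.pe_pa_id = t.pe_pa_id
         AND pe.pe_ex_id = t.pe_ex_id
         AND pe.pe_d2 = t.max_pe_d2
        ORDER BY p.pa_name, e.ex_name
    """).fetchall()
    conn.close()
    return rows

for row in latest_exams():
    print(row)
```

If two rows for the same patient and exam share the same maximum pe_d2, both are returned. Whether that is acceptable depends on the data; the sample rows contain no such ties.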