SQL SELECT QUERY - from two tables - mysql

I'm stuck on an SQL query and would appreciate some help.
I have two tables like the ones below.
TABLE_A
Date      Value_A
20180201  52
20180202  50
20180205  48
20180206  57
20180207  60

TABLE_B
Date      Value_B
20180131  120
20180201  114
20180203  127
20180204  140
20180206  129
And I want to get this result.
Date Value_A PreValue_B
20180201 52 120
20180202 50 114
20180205 48 140
20180206 57 140
20180207 60 129
Date and Value_A are the same as in TABLE_A.
PreValue_B comes from Value_B: it is the value on the closest date in TABLE_B that is strictly earlier than the TABLE_A date.
For example, the closest previous date to 20180201 in TABLE_B is "20180131", so PreValue_B is 120.
...
The closest previous date to 20180205 is "20180204", so PreValue_B is 140.
For 20180206 it is also "20180204", so PreValue_B is 140 again.
And so on...
How do I write this SQL query?
Thanks to everyone!

A typical approach uses correlated subqueries:
select a.*,
(select b.value
from b
where b.date < a.date
order by b.date desc
fetch first 1 row only
) as prevalue_b
from a;
This uses the ANSI standard method for limiting to one row. MySQL does not accept fetch first, so write limit 1 there instead; SQL Server uses select top 1.
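As a quick check of the correlated-subquery approach, here is a minimal sketch that runs it against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained), and the lowercase table and column names are illustrative. The one-row limit is written as LIMIT 1, which both SQLite and MySQL accept.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (date TEXT, value_a INTEGER);
    CREATE TABLE table_b (date TEXT, value_b INTEGER);
    INSERT INTO table_a VALUES
        ('20180201', 52), ('20180202', 50), ('20180205', 48),
        ('20180206', 57), ('20180207', 60);
    INSERT INTO table_b VALUES
        ('20180131', 120), ('20180201', 114), ('20180203', 127),
        ('20180204', 140), ('20180206', 129);
""")

# For each row of table_a, pick value_b from the latest strictly earlier date.
prev_value_rows = conn.execute("""
    SELECT a.date, a.value_a,
           (SELECT b.value_b
              FROM table_b b
             WHERE b.date < a.date
             ORDER BY b.date DESC
             LIMIT 1) AS prevalue_b
      FROM table_a a
     ORDER BY a.date
""").fetchall()

for row in prev_value_rows:
    print(row)
```

If no earlier date exists in table_b, the subquery yields NULL for that row instead of dropping it.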

Try this:
SELECT sub.date, sub.a, b.b
FROM
(SELECT a.date, a.a, MAX(b.date) AS b_date
FROM a
INNER JOIN b
ON (a.date > b.date)
GROUP BY a.date, a.a) sub
INNER JOIN b
ON sub.b_date = b.date
ORDER BY sub.date
In the sub-query, find the date that should be selected in b for each date in a. Then join the results back to b, in order to show the b value.
Tested here: http://rextester.com/ERP28040

Related

How to add subquery to get last record in group by

I would like help adding a subquery to the query below. As I understand it, this is the method I need in order to get scan_type from the last record in each group rather than the first, because the MySQL server is running 5.7.
I have tried this, but I don't understand how to fit the subquery into the current query; my attempts cause the query to error.
Currently I am able to get the date/time stamp by using MAX which gives me the last record for the person's attendance, but I am having trouble getting the related "scan_type". Apart from this, the remainder of the query returns all of the expected results.
Below is the current query:
SELECT A.attendance_sessions_id, A.person_id, A.scan_type, A.absence_type, MAX(A.date_time), B.name, B.student_level
FROM `attendance_record` A
LEFT JOIN `person` B ON A.person_id = B.student_no
WHERE A.scan_type IS NULL
OR A.scan_type <> 'evac_scan'
OR A.scan_type NOT LIKE 'evac_%'
GROUP BY A.attendance_sessions_id, A.person_id
Below is the current output of the above query:
attendance_sessions_id  person_id  scan_type  absence_type  MAX(A.date_time)     name   student_level
1                       65         scan_in    NULL          2022-02-06 12:59:48  Chris  Year 1
Expecting scan_type = "scan_out"
attendance_record table:
attendance_record_id  attendance_sessions_id  person_id  scan_type  absence_type  date_time
4                     1                       65         scan_in    NULL          2022-02-05 20:13:17
5                     1                       65         scan_out   NULL          2022-02-05 20:14:39
6                     1                       65         scan_in    NULL          2022-02-06 12:06:45
7                     1                       65         evac_scan  NULL          2022-02-06 12:53:01
8                     1                       65         scan_out   NULL          2022-02-06 12:59:48
person table:
person_id  student_no  name   student_level
9          65          Chris  Year 1
attendance_sessions table:
attenance_sessions_id  session_name        session_date_time
1                      February Weekend 1  2022-02-05 00:01:00
ONLY_FULL_GROUP_BY has been part of the default sql_mode since MySQL 5.7.5 (and remains so in MySQL 8+), so it is worth rewriting this query so that it is handled correctly, now and in the future.
SELECT
x.attendance_sessions_id,
x.person_id,
A.scan_type,
A.absence_type,
x.max_date_time,
B.name,
B.student_level
FROM (
SELECT
A.attendance_sessions_id,
A.person_id,
-- A.scan_type,
-- A.absence_type,
MAX(A.date_time) as max_date_time
-- B.name,
-- B.student_level
FROM `attendance_record` A
-- LEFT JOIN `person` B ON A.person_id = B.student_no
WHERE A.scan_type IS NULL
OR A.scan_type <> 'evac_scan'
OR A.scan_type NOT LIKE 'evac_%'
GROUP BY
A.attendance_sessions_id,
A.person_id
) x
INNER JOIN `attendance_record` A ON A.attendance_sessions_id = x.attendance_sessions_id
AND A.person_id = x.person_id
AND A.date_time = x.max_date_time
LEFT JOIN `person` B ON B.student_no = A.person_id
Some columns are commented out (--) in the sub-query because of the only_full_group_by setting, and the LEFT JOIN is removed there because the sub-query no longer uses the person table.
The original query became a sub-query, and the removed fields were added back in the outer query, which joins attendance_record again to fetch the row matching the MAX date_time.
NOTE: When there are multiple records with the same date_time for one attendance_sessions_id, person_id pair, this query returns one row per tied record instead of a single row.
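To see the group-then-join-back pattern at work, here is a minimal sketch that loads the sample rows from the question and runs the same shape of query. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption so the example runs anywhere); the WHERE clause is copied from the question unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance_record (
        attendance_record_id INTEGER, attendance_sessions_id INTEGER,
        person_id INTEGER, scan_type TEXT, absence_type TEXT, date_time TEXT);
    CREATE TABLE person (
        person_id INTEGER, student_no INTEGER, name TEXT, student_level TEXT);
    INSERT INTO attendance_record VALUES
        (4, 1, 65, 'scan_in',   NULL, '2022-02-05 20:13:17'),
        (5, 1, 65, 'scan_out',  NULL, '2022-02-05 20:14:39'),
        (6, 1, 65, 'scan_in',   NULL, '2022-02-06 12:06:45'),
        (7, 1, 65, 'evac_scan', NULL, '2022-02-06 12:53:01'),
        (8, 1, 65, 'scan_out',  NULL, '2022-02-06 12:59:48');
    INSERT INTO person VALUES (9, 65, 'Chris', 'Year 1');
""")

# Step 1 (sub-query x): latest date_time per session and person.
# Step 2 (outer query): join back to fetch the columns of that latest row.
latest_scans = conn.execute("""
    SELECT x.attendance_sessions_id, x.person_id, A.scan_type,
           A.absence_type, x.max_date_time, B.name, B.student_level
      FROM (SELECT attendance_sessions_id, person_id,
                   MAX(date_time) AS max_date_time
              FROM attendance_record
             WHERE scan_type IS NULL
                OR scan_type <> 'evac_scan'
                OR scan_type NOT LIKE 'evac_%'
             GROUP BY attendance_sessions_id, person_id) x
      JOIN attendance_record A
        ON A.attendance_sessions_id = x.attendance_sessions_id
       AND A.person_id = x.person_id
       AND A.date_time = x.max_date_time
      LEFT JOIN person B ON B.student_no = A.person_id
""").fetchall()

print(latest_scans)
```

With the sample data this returns a single row whose scan_type is scan_out, which is what the question expects.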

MySQL Query trying to (CROSS?) JOIN on one table

Been trying to figure this out for a couple hours and hoping for some expert assistance:
I have a single Mysql table with data such as:
Date version amount
2021-03-01 A 100
2021-03-02 A 35
2021-03-02 B 80
2021-03-03 A 7
2021-03-03 B 90
2021-03-03 C 3
2021-03-03 A 8
2021-03-04 B 15
2021-03-04 C 90
2021-03-04 B 10
I'm trying to get a row for each version on every day, with amount shown as 0 where there is no data:
Result:
Date version SUM(amount)
2021-03-01 A 100
2021-03-01 B 0
2021-03-01 C 0
2021-03-02 A 35
2021-03-02 B 80
2021-03-02 C 0
2021-03-03 A 15
2021-03-03 B 90
2021-03-03 C 3
2021-03-04 A 0
2021-03-04 B 25
2021-03-04 C 90
I tried various 'JOIN', 'LEFT JOIN' and 'CROSS JOIN' permutations without success.
SELECT distinct c1.date, c2.version
FROM crash_log c1
LEFT OUTER JOIN crash_log c2 ON c1.date = c2.date
GROUP BY c1.date, c2.version
(not even messing with the SUM, just trying to get all the rows with this one)
For now, I have a script that does this by brute force: gets DISTINCT date, then get DISTINCT version, then do a nested loop and build an array for each combination. One trouble is it's not scalable and seems the web connection is timing out before the process finishes on a large set.
I'm thinking there's one (semi-?) efficient query that can do this, but I haven't been able to figure it out.
Write subqueries to get all the dates and versions. Cross join these to get every combination.
Then left join that with the table to get either the actual value or default to 0 when NULL.
SELECT d.date, v.version, IFNULL(c.sum, 0) AS sum
FROM (
SELECT DISTINCT date
FROM crash_log) AS d
CROSS JOIN (
SELECT DISTINCT version
FROM crash_log) AS v
LEFT JOIN (
SELECT date, version, SUM(amount) AS sum
FROM crash_log
GROUP BY date, version) AS c ON d.date = c.date AND v.version = c.version
ORDER BY d.date, v.version
Just like your script, but in SQL.
Cross join the distinct dates to the distinct versions, left join the result to the table, and finally aggregate:
SELECT d.Date, v.version, COALESCE(SUM(t.amount), 0) sum_amount
FROM (SELECT DISTINCT Date FROM tablename) d
CROSS JOIN (SELECT DISTINCT version FROM tablename) v
LEFT JOIN tablename t
ON t.Date = d.Date AND t.version = v.version
GROUP BY d.Date, v.version
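The cross-join-then-left-join idea can be checked against the question's sample rows. The sketch below is a minimal illustration using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability); the table name crash_log comes from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crash_log (date TEXT, version TEXT, amount INTEGER);
    INSERT INTO crash_log VALUES
        ('2021-03-01', 'A', 100), ('2021-03-02', 'A', 35),
        ('2021-03-02', 'B', 80),  ('2021-03-03', 'A', 7),
        ('2021-03-03', 'B', 90),  ('2021-03-03', 'C', 3),
        ('2021-03-03', 'A', 8),   ('2021-03-04', 'B', 15),
        ('2021-03-04', 'C', 90),  ('2021-03-04', 'B', 10);
""")

# d x v enumerates every (date, version) pair; the LEFT JOIN attaches the
# matching rows, and COALESCE turns the NULL sum of an empty pair into 0.
daily_totals = conn.execute("""
    SELECT d.date, v.version, COALESCE(SUM(t.amount), 0) AS sum_amount
      FROM (SELECT DISTINCT date FROM crash_log) d
     CROSS JOIN (SELECT DISTINCT version FROM crash_log) v
      LEFT JOIN crash_log t
        ON t.date = d.date AND t.version = v.version
     GROUP BY d.date, v.version
     ORDER BY d.date, v.version
""").fetchall()

for row in daily_totals:
    print(row)
```

Four dates times three versions gives twelve rows, including the zero rows for pairs that never occur in the table.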

Join and Group by Clause gives wrong output

I have two tables: L with columns (Code, Qtr, Fy, Limit) and R with (Code, Qtr, Fy, Limit). I want the sum of Limit from the left and right tables, grouped by Code, Qtr and Fy.
The following query runs without error but gives the wrong output. Can anyone help me get the right output? If I use only one table it works fine, so I guess the problem is with the join.
select L.Code, L. Qtr, L.FY, sum(L.limit),sum(R.Limit)
from tbl L,tbl R Where
L.Code=R.Code AND
L.Qtr=R.Qtr AND
L.FY=R.FY
group by L.Code,L.Qtr,L.FY
Sample data (the tables contain other columns as well, but I'm showing only the relevant ones):

Table L
Code  Qtr  Fy  Limit
001   1    70  200
001   1    70  700
001   2    70  500
001   2    70  300

Table R
Code  Qtr  Fy  Limit
001   1    70  1000
001   1    70  200
001   2    70  50
001   2    70  125

Result
Code  Qtr  Fy  Sum(L.Limit)  Sum(R.Limit)
001   1    70  900           1200
001   2    70  800           175

I'm using MySQL.
Try this query:
select code, qtr, fy, sum(lsum), sum(rsum)
from (
select L.Code, L.Qtr, L.FY, L.limit as lsum, 0 as rsum
from L
union all
select R.Code, R.Qtr, R.FY, 0 as lsum, R.limit as rsum
from R) as combined
group by code, qtr, fy
Using a join here is the wrong tool: it produces one row for each matching pair of L and R rows, so each value is counted several times and the sums come out too high.
The problem is indeed the join - specifically, you're running into problems because you are using a GROUP BY after the join, when the join criteria results in non-unique rows. Usually, the way to solve this is to group before the join:
SELECT L.code, L.qtr, L.fy, L.lim as L_lim, R.lim as R_lim
FROM (SELECT code, qtr, fy, SUM(lim) as lim
FROM L
GROUP BY code, qtr, fy) L
JOIN (SELECT code, qtr, fy, SUM(lim) as lim
FROM R
GROUP BY code, qtr, fy) R
ON R.code = L.code
AND R.qtr = L.qtr
AND R.fy = L.fy
(have a working SQL Fiddle example)
Note that this will only show results for rows that are in both tables. Also, LIMIT is a reserved word (in MySQL and some other RDBMSs), so you're better off avoiding that for a column name.
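The row multiplication is easy to see on the question's sample data. The sketch below is a minimal illustration using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability). The column is named lim rather than Limit, following the second answer, and the aliases ls and rs are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE L (code TEXT, qtr INTEGER, fy INTEGER, lim INTEGER);
    CREATE TABLE R (code TEXT, qtr INTEGER, fy INTEGER, lim INTEGER);
    INSERT INTO L VALUES ('001', 1, 70, 200), ('001', 1, 70, 700),
                         ('001', 2, 70, 500), ('001', 2, 70, 300);
    INSERT INTO R VALUES ('001', 1, 70, 1000), ('001', 1, 70, 200),
                         ('001', 2, 70, 50),   ('001', 2, 70, 125);
""")

# Join first, then group: each group has 2 x 2 = 4 joined rows,
# so every value is summed twice.
naive_sums = conn.execute("""
    SELECT L.code, L.qtr, L.fy, SUM(L.lim), SUM(R.lim)
      FROM L
      JOIN R ON L.code = R.code AND L.qtr = R.qtr AND L.fy = R.fy
     GROUP BY L.code, L.qtr, L.fy
     ORDER BY L.qtr
""").fetchall()

# Group each table first, then join the one-row-per-group results.
grouped_sums = conn.execute("""
    SELECT ls.code, ls.qtr, ls.fy, ls.lim, rs.lim
      FROM (SELECT code, qtr, fy, SUM(lim) AS lim
              FROM L GROUP BY code, qtr, fy) ls
      JOIN (SELECT code, qtr, fy, SUM(lim) AS lim
              FROM R GROUP BY code, qtr, fy) rs
        ON rs.code = ls.code AND rs.qtr = ls.qtr AND rs.fy = ls.fy
     ORDER BY ls.qtr
""").fetchall()

print(naive_sums)    # doubled totals
print(grouped_sums)  # totals matching the expected result
```

The naive query returns 1800 and 2400 for quarter 1, exactly double the expected 900 and 1200, because each side has two rows per group.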

Write SQL query to find rows that are near min() value

I have about 5000 rows of data as follow:
id date temperature room
--------------------------------------
0 2013-07-15 76 A
1 2013-08-15 72 A
2 2013-09-15 74 B
3 2013-02-15 71 B
4 2013-03-15 72 B
5 2013-04-15 70 A
...
...
...
5000 2013-08-01 68 A
I can use the query from below to find min temperature in each room.
select room, min(temperature) from table_record group by room.
Now I need to find all rows that are close to the min temperature for each room.
I have tried using a join on the same table as below, but it does not run.
select t1.room, min(t1.temperature) from table_record t1
join on table_record t2 on
t2.room = t1.room and
t2.temperature * 0.95 (less_or_equal) min(t1.temperature)
group by room
You need to do this in two steps.
SELECT
*
FROM
(
SELECT room, MIN(temperature) AS min_temp FROM TABLE_RECORD GROUP BY room
)
AS ROOM_TEMP
INNER JOIN
TABLE_RECORD
ON TABLE_RECORD.room = ROOM_TEMP.room
AND TABLE_RECORD.temperature <= ROOM_TEMP.min_temp / 0.95
Provided that you have an index on (room, temperature) this should be pretty quick.
Also, note that I use x <= y / 0.95 rather than x * 0.95 <= y. This is to make the lookup faster (manipulate the search criteria once, rather than the searched field on every row).
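The two-step query can be tried on the rows visible in the question. The sketch below is a minimal illustration using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability), loaded with only the seven sample rows shown rather than the full 5000.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_record (
        id INTEGER, date TEXT, temperature INTEGER, room TEXT);
    INSERT INTO table_record VALUES
        (0,    '2013-07-15', 76, 'A'), (1, '2013-08-15', 72, 'A'),
        (2,    '2013-09-15', 74, 'B'), (3, '2013-02-15', 71, 'B'),
        (4,    '2013-03-15', 72, 'B'), (5, '2013-04-15', 70, 'A'),
        (5000, '2013-08-01', 68, 'A');
""")

# Step 1 (room_temp): minimum temperature per room.
# Step 2: join back and keep rows within the threshold min_temp / 0.95.
near_min_rows = conn.execute("""
    SELECT tr.id, tr.room, tr.temperature
      FROM (SELECT room, MIN(temperature) AS min_temp
              FROM table_record
             GROUP BY room) AS room_temp
      JOIN table_record tr
        ON tr.room = room_temp.room
       AND tr.temperature <= room_temp.min_temp / 0.95
     ORDER BY tr.room, tr.temperature
""").fetchall()

for row in near_min_rows:
    print(row)
```

For room A the minimum is 68, so the cutoff is about 71.6 and the rows at 72 and 76 are excluded; for room B the minimum is 71, so the cutoff is about 74.7 and all three rows qualify.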

Retrieving data from joined MySQL tables using an AVG in the WHERE clause?

I am trying to select data from multiple tables which uses an AVG in the WHERE clause.
SELECT company_metrics.*, companies.company_name, companies.permalink
FROM company_metrics LEFT JOIN companies
ON companies.company_id = company_metrics.company_id
WHERE MONTH(date) = '04' AND YEAR(date) = '2011'
HAVING (SELECT avg(company_unique_visitors)
FROM (SELECT company_metrics.company_unique_visitors
FROM company_metrics
ORDER BY company_metrics.date DESC LIMIT 3)
average ) >'2000'
ORDER BY date DESC
Example Data:
company_metrics table
company_id company_unique_visitors date
----------- ----------------------- ----
604 2054 2011-04-01
604 3444 2011-03-01
604 2122 2011-02-01
604 2144 2011-01-01
604 2001 2010-12-01
602 2011 2011-04-01
602 11 2011-03-01
602 411 2011-02-01
602 611 2011-01-01
602 111 2010-12-01
EDIT
I would like only the 3 latest numbers from company_unique_visitors AVG'ed
/EDIT
So the query would select company_id 604 but it wouldn't select company_id 602 because 602 doesn't have an AVG greater than 2000.
I need help writing the correct query to do as I have described. I can clarify if needed.
Thanks for your help!
There are several problems with your query as written. I'm not completely clear as to the structure of all the tables, but I believe I understand the gist based on the query you posted. Your first problem with the posted query is that you're not grouping by or using any aggregates in the query where you're using the HAVING clause. You use aggregates in one of the subqueries, but the HAVING where it is right now doesn't make much sense.
I believe you wanted to group by the company_id before you did an aggregate of the averages, so I made that the primary group by on the outer query. You were also using too many nested queries to accomplish what was a seemingly simple task of only selecting the 3 most recent measurements. I moved that subquery into the primary join so that the data was only selected once and in a logical way.
And, without further ceremony, here's the fixed query:
SELECT limited_metrics.*, companies.company_name, companies.permalink,
avg(limited_metrics.company_unique_visitors) AS avg_visitors
FROM
(SELECT *
FROM company_metrics
ORDER BY company_metrics.date DESC LIMIT 3) AS limited_metrics
LEFT JOIN companies
ON companies.company_id = limited_metrics.company_id
WHERE MONTH(limited_metrics.date) = '04' AND YEAR(limited_metrics.date) = '2011'
GROUP BY companies.company_id
HAVING avg_visitors > 2000
Ok based off of Jared Harding's answer and this post: Moving average - MySQL
I was able to figure out the query.
SELECT metrics.*,companies.company_name,companies.permalink
FROM (SELECT company_id,AVG(company_unique_visitors) AS met_avg
FROM company_metrics
WHERE `date` BETWEEN DATE_SUB(NOW(), INTERVAL 4 MONTH) AND NOW()
GROUP BY company_id HAVING met_avg>2000) AS metrics
LEFT JOIN companies ON companies.company_id=metrics.company_id
Thanks Jared for all your help!
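The question's EDIT asks for the average of each company's three latest rows. One way to express "three latest per company" is a correlated count of newer rows, which is a different technique from the answers above. The sketch below is a minimal illustration using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability); it assumes one row per company per date, and the alias names m and newer are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company_metrics (
        company_id INTEGER, company_unique_visitors INTEGER, date TEXT);
    INSERT INTO company_metrics VALUES
        (604, 2054, '2011-04-01'), (604, 3444, '2011-03-01'),
        (604, 2122, '2011-02-01'), (604, 2144, '2011-01-01'),
        (604, 2001, '2010-12-01'), (602, 2011, '2011-04-01'),
        (602, 11,   '2011-03-01'), (602, 411,  '2011-02-01'),
        (602, 611,  '2011-01-01'), (602, 111,  '2010-12-01');
""")

# A row is among a company's three latest when fewer than three rows of the
# same company have a later date. Average those rows, then filter with HAVING.
high_traffic = conn.execute("""
    SELECT m.company_id, AVG(m.company_unique_visitors) AS avg_visitors
      FROM company_metrics m
     WHERE (SELECT COUNT(*)
              FROM company_metrics newer
             WHERE newer.company_id = m.company_id
               AND newer.date > m.date) < 3
     GROUP BY m.company_id
    HAVING AVG(m.company_unique_visitors) > 2000
""").fetchall()

print(high_traffic)
```

On the sample data, company 604 averages 2540 over its three latest rows and is returned, while company 602 averages 811 and is filtered out. The result can be joined to companies to pick up company_name and permalink.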