Join to another table only matching specific records - mysql

I have a table of ports:
drop table if exists ports;
create table ports(id int, name char(20));
insert into ports (id, name ) values
(1, 'Port hedland'),
(2, 'Kwinana');
And a table of tariffs connected to those ports:
drop table if exists tariffs;
create table tariffs(id int, portId int, price decimal(12,2), expiry bigint(11));
insert into tariffs (id, portId, price, expiry ) values
(1, 2, 11.00, 1648408400),
(2, 2, 12.00, 1648508400),
(3, 2, 13.00, 1648594800),
(4, 2, 14.00, 1651273200),
(5, 2, 15.00, 2250000000 );
insert into tariffs (id, portId, price, expiry ) values
(1, 1, 21.00, 1648408400),
(2, 1, 22.00, 1648508400),
(3, 1, 23.00, 1648594800),
(4, 1, 24.00, 1651273200),
(5, 1, 25.00, 2250000000 );
Each tariff has an expiry.
I can easily write a query to find the right tariff for a specific date for each port. For example, at timestamp 1648594700 the right tariff is:
SELECT * FROM tariffs
WHERE 1648594700 < expiry AND portId = 2
ORDER BY expiry
LIMIT 1
Result:
id portId price expiry
3 2 13.00 1648594800
However, in my application I want to be able to pull in the right tariff starting from the ports record.
For one record, I can do this:
SELECT * FROM ports
LEFT JOIN tariffs on tariffs.portId = ports.id
WHERE 1648594700 < tariffs.expiry AND ports.id = 2
LIMIT 1
Result:
id name id portId price expiry
2 Kwinana 3 2 13.00 1648594800
This feels a little 'dirty', especially because I am doing a lookup on a record, and then forcing only one result using LIMIT. But, OK.
What I cannot work out is a query that returns a list of ports, each with a price field that satisfies the constraint above (that is, for each port, the tariff with the lowest expiry greater than 1648594700).
This obviously won't work:
SELECT * FROM ports
left join tariffs on tariffs.portId = ports.id
where 1648594700 < tariffs.expiry
Since the result of the query, testing with timestamp 1648594700, would be:
id name id portId price expiry
2 Kwinana 3 2 13.00 1648594800
2 Kwinana 4 2 14.00 1651273200
2 Kwinana 5 2 15.00 2250000000
1 Port he 3 1 23.00 1648594800
1 Port he 4 1 24.00 1651273200
1 Port he 5 1 25.00 2250000000
Instead, the result for all ports (before further filtering) should be:
id name id portId price expiry
2 Kwinana 3 2 13.00 1648594800
1 Port he 3 1 23.00 1648594800
Is there a clean, non-hacky way to have such a result?
As an added constraint, is it possible to do this in ONE query, without temp tables etc.?

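For each port, you can find the lowest expiry after the timestamp, do your join, and keep only the rows that have this minimum expiry. The subquery has to be correlated on the port, otherwise it returns a single minimum across all ports:
SELECT p.id, p.name, t.id, t.portId, t.price, t.expiry
FROM ports p
LEFT JOIN tariffs t ON p.id = t.portId
WHERE t.expiry = (SELECT MIN(t2.expiry)
FROM tariffs t2
WHERE t2.portId = p.id AND t2.expiry > 1648594700)
ORDER BY p.id;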
This will get your desired result, please see here: db<>fiddle
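On MySQL 8.0.14 or later, the single-port ORDER BY ... LIMIT 1 lookup from the question can also be run once per port with a lateral derived table. This is only a sketch against the tables from the question; the aliases tf and t are illustrative:

```sql
-- For each port, pick the one tariff with the nearest expiry after the timestamp.
SELECT p.id, p.name, t.price, t.expiry
FROM ports p
LEFT JOIN LATERAL (
    SELECT tf.price, tf.expiry
    FROM tariffs tf
    WHERE tf.portId = p.id          -- correlation with the outer port row
      AND tf.expiry > 1648594700
    ORDER BY tf.expiry
    LIMIT 1
) AS t ON TRUE
ORDER BY p.id;
```

With the sample data this should give 23.00 for Port hedland and 13.00 for Kwinana. A port without any unexpired tariff would still be listed, with NULL price and expiry.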

On MySQL 8+, ROW_NUMBER should work here:
WITH cte AS (
SELECT p.id, p.name, t.price, t.expiry,
ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY t.expiry) rn
FROM ports p
LEFT JOIN tariffs t ON t.portId = p.id
WHERE t.expiry > 1648594700
)
SELECT id, name, price, expiry
FROM cte
WHERE rn = 1
ORDER BY id;
This returns one record per port: the tariff with the nearest expiry after the given timestamp.
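One thing to be aware of: the WHERE t.expiry > ... filter discards the NULL rows that the LEFT JOIN produces, so a port with no unexpired tariff does not appear at all. If such ports should still be listed, a possible variant (assuming the same tables) moves the filter into the ON clause:

```sql
WITH cte AS (
    SELECT p.id, p.name, t.price, t.expiry,
           ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY t.expiry) AS rn
    FROM ports p
    LEFT JOIN tariffs t
           ON t.portId = p.id
          AND t.expiry > 1648594700   -- filtering in ON keeps unmatched ports
)
SELECT id, name, price, expiry
FROM cte
WHERE rn = 1
ORDER BY id;
```

For the sample data the result is the same two rows; the difference only shows for a port whose tariffs have all expired, which then comes back with NULL price and expiry.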

Related

Finding the 3rd merchant with the highest lifetime transaction amount

I am trying to write a query to get the 3rd merchant with the highest lifetime
transaction amount. Also, I have to provide the total transactions to date for this
merchant.
This is the create table statement.
CREATE TABLE transaction(
transaction_id int , user_id int , merchant_name varchar(255), transaction_date date , amount int
);
INSERT INTO transaction(transaction_id, user_id, merchant_name, transaction_date, amount)
VALUES (1, 1 ,'abc', '2015-01-17', 100),(2, 2, 'ced', '2015-2-17', 100),(3, 1, 'def', '2015-2-16', 120),
(4, 1 ,'ced', '2015-3-17', 110),(5, 1, 'ced', '2015-3-17', 150),(6, 2 ,'abc', '2015-4-17', 130),
(7, 3 ,'ced', '2015-12-17', 10),(8, 3 ,'abc', '2015-8-17', 100),(9, 2 ,'abc', '2015-12-17', 140),(10, 1,'abc', '2015-7-17', 100),
(11, 1 ,'abc', '2015-01-17', 120),(12, 2 ,'ced', '2015-12-23', 130);
I am not sure what the output should look like. I am stuck here:
SELECT distinct(merchant_name), max(amount) from transaction
For MySQL, we could do this:
CREATE TABLE transaction (
transaction_id int , user_id int , merchant_name varchar(255), transaction_date date , amount int
);
INSERT INTO transaction(transaction_id, user_id, merchant_name, transaction_date, amount)
VALUES (1, 1 ,'abc', '2015-01-17', 100),(2, 2, 'ced', '2015-2-17', 100),(3, 1, 'def', '2015-2-16', 120),
(4, 1 ,'ced', '2015-3-17', 110),(5, 1, 'ced', '2015-3-17', 150),(6, 2 ,'abc', '2015-4-17', 130),
(7, 3 ,'ced', '2015-12-17', 10),(8, 3 ,'abc', '2015-8-17', 100),(9, 2 ,'abc', '2015-12-17', 140),(10, 1,'abc', '2015-7-17', 100),
(11, 1 ,'abc', '2015-01-17', 120),(12, 2 ,'ced', '2015-12-23', 130)
;
SELECT merchant_name
, SUM(amount) AS sum_amount
FROM transaction
GROUP BY merchant_name
ORDER BY sum_amount DESC
LIMIT 2, 1
;
Result:
+---------------+------------+
| merchant_name | sum_amount |
+---------------+------------+
| def | 120 |
+---------------+------------+
The full result without limiting, for comparison, is:
+---------------+------------+
| merchant_name | sum_amount |
+---------------+------------+
| abc | 690 |
| ced | 500 |
| def | 120 |
+---------------+------------+
Taking "lifetime transaction amount" to mean the sum of all of a merchant's transactions, you can get what you want with ordering, LIMIT, and two levels of aggregation:
select t.merchant_name, count(*)
from transaction t
join (
select merchant_name, sum(amount) amount
from transaction
group by merchant_name
order by 2 desc
limit 3
) top3
on t.merchant_name = top3.merchant_name
group by t.merchant_name, top3.amount
order by top3.amount asc
limit 1
You can test on this db<>fiddle
If you had a "merchant" dimension table, you could take the merchant_name from there and compute the sum and the count in the same subquery, which should improve query performance. Something like this:
select m.merchant_name, top3.amount, top3.nbtrans
from merchants m
join (
select merchant_name, sum(amount) amount, count(1) as nbtrans
from transaction
group by merchant_name
order by 2 desc
limit 3
) top3
on m.merchant_name = top3.merchant_name
order by top3.amount asc
limit 1
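On MySQL 8+, ranking the per-merchant totals with a window function is another option. The sketch below assumes the transaction table from the question; the aliases total_amount, txn_count, and rnk are made up for illustration. Because DENSE_RANK gives merchants with equal totals the same rank, it can return more than one row on ties, which may or may not be what the requirement intends:

```sql
SELECT merchant_name, total_amount, txn_count
FROM (
    SELECT merchant_name,
           SUM(amount) AS total_amount,   -- lifetime transaction amount
           COUNT(*)    AS txn_count,      -- number of transactions to date
           DENSE_RANK() OVER (ORDER BY SUM(amount) DESC) AS rnk
    FROM transaction
    GROUP BY merchant_name
) ranked
WHERE rnk = 3;
```

With the sample rows this should return def with a total of 120 from 1 transaction. Unlike the LIMIT-based queries, it returns no row at all when there are fewer than three distinct totals.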

grouping records by a field and submit a query on each group in sql

I have a table like this:
create table product_company (
id int PRIMARY KEY,
productName varchar(100),
companyName varchar(100),
price int
);
I want to know the name of the product that ranks second by price in each company.
For example, if company1 has three products with prices product1=30, product2=50 and product3=15, then product1 ranks second by price in company1. I want to write a query that returns something like this:
company1 product1
company2 ...
...
I mean for every company I want to know the product that has the second rank in price within that company.
I don't know how to use the GROUP BY clause here, because GROUP BY works well with aggregate functions, but I don't want the maximum price.
I want to write this query with standard SQL queries and clauses, without special functions that may not work in some DBMSs.
If you are running MySQL 8.0, you can use window function dense_rank():
select *
from (
select
pc.*,
dense_rank() over(partition by companyName order by price desc) rn
from product_company pc
) t
where rn = 2
In earlier versions, one solution is to filter with a correlated subquery. But you have to be careful to properly handle possible top ties. This should do it:
select pc.*
from product_company pc
where (
select count(distinct pc1.price)
from product_company pc1
where pc1.companyName = pc.companyName and pc1.price > pc.price
) = 1
An EXISTS with a COUNT can also be used for this
For example:
create table product_company (
id int PRIMARY KEY AUTO_INCREMENT,
productName varchar(100),
companyName varchar(100),
price decimal(16,2)
);
insert into product_company
(productName, companyName, price) values
('product 1', 'odd org', 9)
,('product 2', 'odd org', 15)
,('product 3', 'odd org', 11)
,('product 4', 'odd org', 17)
,('product 5', 'even inc.', 18)
,('product 6', 'even inc.', 12)
,('product 7', 'even inc.', 16)
,('product 8', 'even inc.', 14)
;
select *
from product_company t
where exists
(
select 1
from product_company t2
where t2.companyName = t.companyName
and t2.price >= t.price
having count(distinct t2.price) = 2
)
id | productName | companyName | price
-: | :---------- | :---------- | ----:
2 | product 2 | odd org | 15.00
7 | product 7 | even inc. | 16.00
db<>fiddle here
And if you want to have the top 2 per company?
Then change the HAVING clause
...
having count(distinct t2.price) <= 2
...
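Since the question asks for something portable, another way to phrase "second rank" is "the highest price that is below the company's maximum". A minimal sketch using only MAX and correlated subqueries, assuming the product_company table and sample rows above (the aliases p2 and p3 are illustrative):

```sql
SELECT pc.companyName, pc.productName, pc.price
FROM product_company pc
WHERE pc.price = (
    -- highest price in the company that is strictly below the company's top price
    SELECT MAX(p2.price)
    FROM product_company p2
    WHERE p2.companyName = pc.companyName
      AND p2.price < (
          SELECT MAX(p3.price)
          FROM product_company p3
          WHERE p3.companyName = pc.companyName
      )
);
```

On the sample data this should return product 2 for odd org (15.00) and product 7 for even inc. (16.00). Ties behave like dense ranking: several products sharing the second-highest price are all returned, and a company with only one distinct price returns nothing.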

Joining table to union of two tables?

I have two tables: orders and oldorders. Both are structured the same way. I want to union these two tables and then join them to another table: users. Previously I only had orders and users, I am trying to shoehorn oldorders into my current code.
SELECT u.username, COUNT(user) AS cnt
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND total != 0
GROUP BY user
This finds the number of shipped orders with a nonzero total that each user has made in table orders, but I want to do this on the union of orders and oldorders. How can I accomplish this?
create table orders (
user int,
shipped int,
total decimal(4,2)
);
insert into orders values
(5, 1, 28.21),
(5, 1, 24.12),
(5, 1, 19.99),
(5, 1, 59.22);
create table users (
username varchar(100),
userident int
);
insert into users values
("Bob", 5);
Output for this is:
+----------+-----+
| username | cnt |
+----------+-----+
| Bob | 4 |
+----------+-----+
After creating the oldorders table:
create table oldorders (
user int,
shipped int,
total decimal(4,2)
);
insert into oldorders values
(5, 1, 62.94),
(5, 1, 53.21);
The expected output when run on the union of the two tables is:
+----------+-----+
| username | cnt |
+----------+-----+
| Bob | 6 |
+----------+-----+
Just not sure where or how to shoehorn a union into there. Instead of running the query on orders, it needs to be on orders union oldorders. It can be assumed there is no intersect between the two tables.
You just need to combine the two tables in a derived table:
SELECT u.username, COUNT(user) AS cnt
FROM
(
SELECT * FROM orders
UNION ALL
SELECT * FROM oldorders
) o
LEFT JOIN users u ON u.userident = o.user
WHERE shipped = 1
AND total != 0
GROUP BY user;
First combine the orders and oldorders tables with UNION ALL. Plain UNION removes duplicate rows, so two identical orders (same user, shipped, and total) would be counted only once.
The rest of the query is exactly the same as what you already had.
SEE DEMO
Note:
A LEFT JOIN from orders to users doesn't make much sense in this case: orders whose user does not exist produce a row with a NULL username, which doesn't hold any value.
If you want <user, total orders> for all users, including users who have not ordered yet, you need to reverse the order of the LEFT JOIN.
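A sketch of that reversed join, assuming the three tables from the question. The order filters go into the ON clause, since putting them in WHERE would throw away the users that have no matching orders:

```sql
SELECT u.username, COUNT(o.user) AS cnt
FROM users u
LEFT JOIN (
    SELECT user, shipped, total FROM orders
    UNION ALL
    SELECT user, shipped, total FROM oldorders
) o
       ON o.user = u.userident
      AND o.shipped = 1      -- filters in ON, so users without orders are kept
      AND o.total != 0
GROUP BY u.userident, u.username;
```

With the sample data this should still give Bob a count of 6. A user with no qualifying orders gets 0, because COUNT(o.user) ignores the NULLs produced by the outer join.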

SQL query to compare row value to group values, with condition

I wish to port some R code to Hadoop to be used with Impala or Hive with a SQL-like query. The code I have is based on this question:
R data table: compare row value to group values, with condition
I wish to find, for each row, the number of rows with the same id in subgroup 1 with cheaper price.
Let's say I have the following data:
CREATE TABLE project
(
id int,
price int,
subgroup int
);
INSERT INTO project(id,price,subgroup)
VALUES
(1, 10, 1),
(1, 10, 1),
(1, 12, 1),
(1, 15, 1),
(1, 8, 2),
(1, 11, 2),
(2, 9, 1),
(2, 12, 1),
(2, 14, 2),
(2, 18, 2);
Here is the output I would like to have (with the new column cheaper):
id price subgroup cheaper
1 10 1 0 ( because no row is cheaper in id 1 subgroup 1)
1 10 1 0 ( because no row is cheaper in id 1 subgroup 1)
1 12 1 2 ( rows 1 and 2 are cheaper)
1 15 1 3
1 8 2 0 (nobody is cheaper in id 1 and subgroup 1)
1 11 2 2
2 9 1 0
2 12 1 1
2 14 2 2
2 18 2 2
Note that I always want to compare rows to the ones in subgroup 1, even when the rows are themselves in subgroup 2.
You can join the table with itself, using a LEFT JOIN:
SELECT
p.id,
p.price,
p.subgroup,
COUNT(p2.id)
FROM
project p LEFT JOIN project p2
ON p.id=p2.id AND p2.subgroup=1 AND p.price>p2.price
GROUP BY
p.id,
p.price,
p.subgroup
ORDER BY
p.id, p.subgroup
count(p2.id) will count all rows where the join does succeed (and it succeeds where there are cheaper prices for the same id and for the subgroup 1).
The only problem is that you are expecting those two rows:
1 10 1 0
1 10 1 0
but my query will only return one, because I'm grouping by id, price, and subgroup. If you have another unique ID in your project table you could also group by that ID. Please see a fiddle here.
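If the table has no unique ID, one possible workaround is to number the rows first and group by that number. This is a sketch only, not tried on Hive or Impala: it assumes the engine supports CTEs and window functions and accepts the same join condition as the query above. The names numbered and rid are made up for illustration:

```sql
WITH numbered AS (
    -- give every row, including exact duplicates, its own number
    SELECT id, price, subgroup,
           ROW_NUMBER() OVER (ORDER BY id, subgroup, price) AS rid
    FROM project
)
SELECT n.id, n.price, n.subgroup, COUNT(p2.id) AS cheaper
FROM numbered n
LEFT JOIN project p2
       ON n.id = p2.id AND p2.subgroup = 1 AND n.price > p2.price
GROUP BY n.rid, n.id, n.price, n.subgroup
ORDER BY n.id, n.subgroup, n.price;
```

Grouping by rid keeps the two (1, 10, 1) rows apart, so the result should have ten rows, matching the expected output in the question.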
Or you could use an inline query:
SELECT
p.id,
p.price,
p.subgroup,
(SELECT COUNT(*)
FROM project p2
WHERE p2.id=p.id AND p2.subgroup=1 AND p2.price<p.price) AS n
FROM
project p

Create Trial Balance via Sql in MySql

I am new to Sql, and need some guidance to create a Trial Balance via Sql query in MySql.
Consider the following scenario:
Two Tables:
Accounts
Transactions
Accounts Table fields details:
AccNo (PK)(varchar) (5)
AccName (varchar)(50)
AccOpBal (double)
Transactions Table fields details:
TransID (int) (Auto Increment) (PK)
AccNo (varchar) (5)
TransDt (DateTime)
TransDebit (Double)
TransCredit (Double)
Now I need a SQL query based on a TransDt date range (e.g. 1st Jan 2014 to 31st Jan 2014) which will return:
AccNo
AccOpBal
TransDebit (Sum of monthly transaction i.e Jan-2014)
TransCredit (Sum of monthly transaction i.e Jan-2014)
TransDebit (Sum of Yearly transaction i.e from 01st July-2013 to 31st Jan 2014 or YTD)
TransCredit (Sum of Yearly transaction i.e from 01st July-2013 to 31st Jan 2014 or YTD)
It is not necessary that every AccNo has opening balance (AccOpBal), likewise, it is also not necessary that every AccNo has transactions (TransDebit or TransCredit). But if an AccNo has any, it should be in query.
UPDATE Picture of sample trial added
You could achieve that result with a select over a union of two queries, one for the month to date and one for the year to date figures.
select accno, accopbal, sum(mtd_d), sum(mtd_c), sum(ytd_d),sum(ytd_c)
from
( select ao.accno
, ao.accOpBal
, 0 as mtd_d
, 0 as mtd_c
, 0 as ytd_d
, 0 as ytd_c
from accounts ao
left outer join transactions tn on tn.accno = ao.accno
where tn.accno is null
union
select tm.accno
, a.accOpBal
, sum(tm.transdebit) as mtd_d
, sum(tm.transcredit) as mtd_c
, 0 as ytd_d
, 0 as ytd_c
from accounts a
right outer join transactions tm on tm.accno = a.accno
where tm.transdt between '2014-01-01' and '2014-01-31'
group by tm.accno, a.accopbal
union
select ty.accno
, a.accOpBal
, 0
, 0
, sum(ty.transdebit)
, sum(ty.transcredit)
from accounts a
right outer join transactions ty on ty.accno = a.accno
where ty.transdt between '2013-07-01' and '2014-01-31'
group by ty.accno, a.accopbal
) alltxn
group by accno, accopbal
Here is a sqlfiddle with a small test set
and here is the testset:
-- january
insert into transactions values (1, 'alfki', '2014-01-01', 1,3);
insert into transactions values (1, 'alfki', '2014-01-02', 1,3);
insert into transactions values (1, 'alfki', '2014-01-03', 1,3);
-- last year
insert into transactions values (1, 'alfki', '2013-09-01', 5,2);
-- txn without acc
insert into transactions values (1, 'noexi', '2014-01-03', 4,2);
-- acc with txn
INSERT INTO Accounts values ( 'alfki', 'alfred', 4);
-- acc without txn
INSERT INTO Accounts values ( 'lefto', 'lefto', 6);
with the following query result:
ACCNO | ACCOPBAL |SUM(MTD_D)|SUM(MTD_C)|SUM(YTD_D)|SUM(YTD_C)
------+----------+----------+----------+----------+-----------
alfki | 4 | 3 | 9 | 8 | 11
lefto | 6 | 0 | 0 | 0 | 0
noexi | (null) | 4 | 2 | 4 | 2
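If transactions without a matching account (like noexi) do not need to be reported, the same figures can probably be obtained in a single pass with conditional aggregation. The following is a sketch under that assumption, using the table and column names from the question. It uses half-open date ranges because TransDt is a DateTime, and BETWEEN ... AND '2014-01-31' would miss transactions that happen later in the day on 31 January:

```sql
SELECT a.AccNo,
       a.AccOpBal,
       -- month to date: only January rows contribute
       COALESCE(SUM(CASE WHEN t.TransDt >= '2014-01-01' THEN t.TransDebit  END), 0) AS mtd_d,
       COALESCE(SUM(CASE WHEN t.TransDt >= '2014-01-01' THEN t.TransCredit END), 0) AS mtd_c,
       -- year to date: every row that survived the join's date filter
       COALESCE(SUM(t.TransDebit),  0) AS ytd_d,
       COALESCE(SUM(t.TransCredit), 0) AS ytd_c
FROM accounts a
LEFT JOIN transactions t
       ON t.AccNo = a.AccNo
      AND t.TransDt >= '2013-07-01'
      AND t.TransDt <  '2014-02-01'
GROUP BY a.AccNo, a.AccOpBal;
```

On the test set above this should return the alfki and lefto rows with the same numbers as the union query, and no row for noexi.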