MySQL join 2 tables and result should be IN third_table - mysql

I have three tables: requests, d_requests (delivery requests) and s_requests (send requests).
Part of "d_requests" and "s_requests" is always the same (userID, ticket_creation_date and some other data), so those shared columns were split out of the two tables and are written to "requests" on each insert into the DB.
Now I need to do the following: JOIN requests and d_requests selecting some data, and then make sure that the selected departure date is IN the s_requests column "send_before":
SELECT r.type, r.request_from, r.request_to, d.departure_date
FROM requests as r
JOIN d_requests as d ON r.request_id = d.requests_id
WHERE r.type='d' AND r.request_from='Beijing'
AND r.request_to='Tokyo' AND d.departure_date
IN (SELECT s.s_before from s_requests s where s.s_before<='user_defined_date')
ORDER BY d.departure_date
I get a result, but it's partial. From what I see in the DB, the query should return several rows, but it only returns one. Even if I set "user_defined_date" to something like 2025-12-12, the output is still one row (while all tickets are in 2017 and early 2018).

I think you might need something like this
SELECT r.type, r.request_from, r.request_to, d.departure_date
FROM requests as r
INNER JOIN d_requests as d ON r.request_id = d.requests_id
INNER JOIN s_requests as s ON r.request_id = s.requests_id
WHERE r.type='d' AND r.request_from='Beijing'
AND r.request_to='Tokyo' AND s.s_before<='user_defined_date'
ORDER BY d.departure_date
But it's quite difficult to make suggestions when I don't know the full schema of those tables or what you're trying to achieve.
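To illustrate why the IN version can return so few rows, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, and the range comparison in the second query is only one possible reading of the intent; the original schema is not known.

```python
import sqlite3

# Hypothetical, simplified tables; only the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d_requests (requests_id INTEGER, departure_date TEXT);
CREATE TABLE s_requests (requests_id INTEGER, s_before TEXT);
INSERT INTO d_requests VALUES (1, '2017-11-01'), (2, '2017-11-15'), (3, '2018-01-10');
INSERT INTO s_requests VALUES (10, '2017-11-15'), (11, '2017-12-31');
""")

# IN keeps a departure only when exactly the same date appears in s_before.
in_rows = conn.execute("""
SELECT d.departure_date
FROM d_requests d
WHERE d.departure_date IN (SELECT s.s_before FROM s_requests s WHERE s.s_before <= ?)
ORDER BY d.departure_date
""", ("2025-12-12",)).fetchall()

# Assumed alternative: keep departures on or before at least one send-before date.
range_rows = conn.execute("""
SELECT DISTINCT d.departure_date
FROM d_requests d
JOIN s_requests s ON d.departure_date <= s.s_before
WHERE s.s_before <= ?
ORDER BY d.departure_date
""", ("2025-12-12",)).fetchall()

print(in_rows)     # only the exact date match
print(range_rows)  # every departure covered by some s_before
```

With this data the IN query returns a single row no matter how far in the future the user-defined date is, because only one departure date coincides exactly with an s_before value.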

SQL Join is changing values of my existing column

I'm attempting to use SQL to pull data from a database into a Jupyter (python) notebook and work with it there. I have a query that pulls the yearweek of flight's upload date, and counts the number of flights in that yearweek. Finally, it groups the results by the yearweek of upload date:
SELECT YEARWEEK(d.upload_date), COUNT(f.id)
FROM apps_flight f
LEFT JOIN apps_enginedatafile d ON d.id=f.import_file_id
WHERE f.global_duplicate = 0
GROUP BY YEARWEEK(d.upload_date)
I want to count the number of subscribers (located in another table) for each yearweek to compare them with the count of flights. So I'm trying to join that table by adding:
LEFT JOIN apps_subscription s ON s.basesubscription_ptr_id = f.id
But, when I do this, the counts of my flight values change!
The first few counts for the original query look like:
[327, 605, 78, 5768, 9716, 9686, 7902, 3699, 3323, 6081, 4966, 3456, 3181, 2749, 4577, 3157, 1792, 1806, ...]
After joining the table, it becomes:
[327, 738, 78, 8854, 17418, 16156, 13921, 7536, 5380, 10040, 7559, 5461, 6323, 6412, 6702, 5433, 2924, ...]
I'm not sure what's happening here. Perhaps the join is creating duplicate rows? The data set is very large, and takes about 30 minutes to run the query. Adding a LIMIT doesn't seem to speed it up, so as you can imagine, testing takes a little while. (If I'm oblivious to another way to speed up the query aside from a LIMIT, feel free to make me aware)!
Thanks for any info.
Simply join two aggregate count queries. The query below assumes the same structure, including column names. (Adjust upload_date to the actual date/time column in apps_subscription.)
WITH agg_flights AS (
SELECT YEARWEEK(d.upload_date) AS year_week,
COUNT(f.id) AS flight_counts
FROM apps_flight f
LEFT JOIN apps_enginedatafile d
ON d.id = f.import_file_id
WHERE f.global_duplicate = 0
GROUP BY YEARWEEK(d.upload_date)
), agg_subs AS (
SELECT YEARWEEK(s.upload_date) AS year_week, -- ADJUST date/time variable
COUNT(f.id) AS subscriber_counts
FROM apps_flight f
LEFT JOIN apps_subscription s
ON s.basesubscription_ptr_id = f.id
WHERE f.global_duplicate = 0
GROUP BY YEARWEEK(s.upload_date) -- ADJUST date/time variable
)
SELECT f.year_week,
f.flight_counts,
s.subscriber_counts
FROM agg_flights f
INNER JOIN agg_subs s
ON f.year_week = s.year_week
Joins create combined rows from all the joined tables. So your join between f and d will produce multiple rows (before the GROUP BY) for a single flight if more than one apps_enginedatafile row matches it, and the join on s will add multiple rows if a flight has more than one subscription. COUNT operates on the result of the joins, not on the f table before the join.
In this case, the easy fix is to just use COUNT(DISTINCT f.id) instead of COUNT(f.id), so each flight is only counted once per yearweek.
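The row multiplication is easy to reproduce on a toy data set. The sketch below uses Python's sqlite3 module with invented tables (flight, subscription) and a precomputed week column in place of YEARWEEK(), so the names are assumptions rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flight (id INTEGER PRIMARY KEY, week TEXT);
CREATE TABLE subscription (id INTEGER PRIMARY KEY, flight_id INTEGER);
INSERT INTO flight VALUES (1, '201801'), (2, '201801'), (3, '201802');
-- Flight 1 matches three subscription rows, flight 3 matches one, flight 2 none.
INSERT INTO subscription (flight_id) VALUES (1), (1), (1), (3);
""")

# COUNT(f.id) counts joined rows; COUNT(DISTINCT f.id) counts flights.
counts = conn.execute("""
SELECT f.week, COUNT(f.id), COUNT(DISTINCT f.id)
FROM flight f
LEFT JOIN subscription s ON s.flight_id = f.id
GROUP BY f.week
ORDER BY f.week
""").fetchall()

for week, joined_rows, flights in counts:
    print(week, joined_rows, flights)
```

In the first week the plain count is inflated by the three matching subscription rows, while the DISTINCT count still reports two flights.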

Joining the same table twice in MYSQL

I am trying to create a MYSQL query that pulls in data from a range of tables. I have a master bookings table and an invoice table where I am recording invoice id's etc from Stripe.
I am storing two invoices per booking; one a deposit, the second the final balance.
In my admin backend I then display to the admin info on whether the invoice is paid etc so need to pull in data from SQL to show this.
I'm following some previous guidance from "What's the best way to join on the same table twice?".
My query returns data, but when the invoices table is included twice (to give me the deposit and balance invoices), the column names are identical.
Could someone point me in the right direction? I think I need to somehow rename the columns of the second invoice join. Sorry, I'm new to anything but basic SQL queries.
This is my SQL
SELECT * FROM bookings
INNER JOIN voyages ON bookings.booking_voyageID = voyages.voyage_id
LEFT JOIN emailautomations ON bookings.booking_reference = emailautomations.automation_bookingRef AND emailautomations.automation_sent != 1
LEFT JOIN invoices ON bookings.booking_stripeDepositInvoice = invoices.invoice_id
LEFT JOIN invoices inv2 ON bookings.booking_stripeBalanceInvoice = inv2.invoice_id
Thanks to @Algef Almocera's answer I have amended my SQL (and stopped being lazy with SELECT *, which let me trim the column list down considerably):
SELECT
bookings.booking_status,
bookings.booking_reference,
bookings.booking_stripeCustomerReference,
bookings.booking_stripeDepositInvoice,
bookings.booking_stripeBalanceInvoice,
bookings.booking_totalPaid,
bookings.booking_voyageID,
bookings.booking_firstName,
bookings.booking_lastName,
bookings.booking_contractName,
bookings.booking_contractEmail,
voyages.voyage_id,
voyages.voyage_name,
voyages.voyage_startDate,
depositInvoice.invoice_id AS depositInvoice_id,
depositInvoice.invoice_status AS depositInvoice_status,
balanceInvoice.invoice_id AS balanceInvoice_id,
balanceInvoice.invoice_status AS balanceInvoice_status
FROM bookings
INNER JOIN voyages ON bookings.booking_voyageID = voyages.voyage_id
LEFT JOIN emailautomations ON bookings.booking_reference = emailautomations.automation_bookingRef AND emailautomations.automation_sent != 1
LEFT JOIN invoices depositInvoice ON bookings.booking_stripeDepositInvoice = depositInvoice.invoice_id
LEFT JOIN invoices balanceInvoice ON bookings.booking_stripeBalanceInvoice = balanceInvoice.invoice_id
Sometimes this can't be avoided, because the same column name may be used in different tables for different purposes. To help with that, you can use aliases. For example:
SELECT
invoices.column_name AS invoices_column_name,
transactions.column_name AS transactions_column_name
FROM invoices ...
LEFT JOIN transactions ...
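As a small runnable sketch of the same idea, the following uses Python's sqlite3 module with a cut-down, hypothetical version of the bookings and invoices tables. Each join of invoices gets its own table alias, and each selected column gets its own column alias, so the result keys no longer collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by their alias
conn.executescript("""
CREATE TABLE invoices (invoice_id TEXT PRIMARY KEY, invoice_status TEXT);
CREATE TABLE bookings (booking_reference TEXT, deposit_invoice TEXT, balance_invoice TEXT);
INSERT INTO invoices VALUES ('in_1', 'paid'), ('in_2', 'open');
INSERT INTO bookings VALUES ('B001', 'in_1', 'in_2'), ('B002', 'in_1', NULL);
""")

rows = conn.execute("""
SELECT b.booking_reference,
       dep.invoice_status AS deposit_status,
       bal.invoice_status AS balance_status
FROM bookings b
LEFT JOIN invoices dep ON b.deposit_invoice = dep.invoice_id
LEFT JOIN invoices bal ON b.balance_invoice = bal.invoice_id
ORDER BY b.booking_reference
""").fetchall()

result = [dict(r) for r in rows]
for r in result:
    print(r)
```

A booking without a balance invoice still appears, with balance_status set to NULL (None in Python), because both joins are LEFT JOINs.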

MYSQL many to many 3 tables query

EDIT: I missed the main issue I was having. I want to display all the unique 'device_MAC' rows, so I want this query to output 3 rows (as per the original query). The issue I am having is connecting the data table to the remotenodes table via dt_short = rn_short, using the row with the maximum timestamp for each dt_short in the data table.
I am having trouble running a query on 3 tables (2 have many to many relations).
What I am trying to do:
Get each distinct rn_IEEE from the remotenodes table with the maximum timestamp (in the example this will get 3 rows with 3 distinct short addresses rn_short)
Join with the devicenames table on device_IEEE
Get each distinct dt_short from the data table with the maximum timestamp
Join dt_short with rn_short from the query above
Now the problem I am running into is that I can do the queries for the above individually, I have even gotten the first 3 of them together into a query but I cannot seem to properly join the last bit of data to get the result that I want.
I have been going in circles trying to solve this. Here is a link to a SQL Fiddle that contains all the test data and the query as far as I got it. It does what I want for the first row, but the columns from the 'data' table are NULL for every row after the first:
See this SQL fiddle
After going through your requirements and the data, it looks like you just need to change your query to include an INNER JOIN on the data table instead of a LEFT JOIN
See SQL Fiddle with Demo
select rn.*, dn.*, d.*
from remotenodes rn
inner join devicenames dn
on rn.rn_IEEE = dn.device_IEEE
and rn.rn_timestamp = (SELECT MAX(rn_timestamp) FROM remotenodes
WHERE rn.rn_IEEE = rn_IEEE
GROUP BY rn_IEEE)
inner join data d
on rn.rn_short = d.dt_short
AND d.dt_timestamp = (SELECT MAX(d2.dt_timestamp) AS ts
FROM data d2
WHERE d.dt_short = d2.dt_short
GROUP BY d2.dt_short)
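The "latest row per group" pattern used in that query can be tried in isolation. This is a minimal sketch with Python's sqlite3 module; the sample rows and the dt_value column are invented for illustration, and the devicenames join is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE remotenodes (rn_IEEE TEXT, rn_short TEXT, rn_timestamp INTEGER);
CREATE TABLE data (dt_short TEXT, dt_value INTEGER, dt_timestamp INTEGER);
-- Node A changed its short address over time; only the newest row should count.
INSERT INTO remotenodes VALUES ('A', '0x01', 1), ('A', '0x02', 2), ('B', '0x03', 1);
INSERT INTO data VALUES ('0x02', 10, 1), ('0x02', 20, 5), ('0x03', 30, 2);
""")

# Each correlated subquery picks the maximum timestamp within the current group.
latest = conn.execute("""
SELECT rn.rn_IEEE, rn.rn_short, d.dt_value
FROM remotenodes rn
JOIN data d ON d.dt_short = rn.rn_short
WHERE rn.rn_timestamp = (SELECT MAX(r2.rn_timestamp)
                         FROM remotenodes r2
                         WHERE r2.rn_IEEE = rn.rn_IEEE)
  AND d.dt_timestamp = (SELECT MAX(d2.dt_timestamp)
                        FROM data d2
                        WHERE d2.dt_short = d.dt_short)
ORDER BY rn.rn_IEEE
""").fetchall()

print(latest)
```

The subqueries here have no GROUP BY: because each one is already correlated on the group key, MAX alone returns a single value per outer row.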
The query in your SQL Fiddle is right. Instead of using a LEFT JOIN, use an INNER JOIN so that it gives you the first row.
cheers.
Thanks for all your answers everyone. I managed to solve the problem by using views.
It's not the most efficient way but I think it will do for now.
Here is the SQL Fiddle link:
http://sqlfiddle.com/#!2/4076e/8
Try this query; for me it's returning one row:
SELECT rn_short, rn_IEEE, device_name
FROM
(SELECT DISTINCTROW dt_short FROM (SELECT * FROM `data` ORDER BY `dt_timestamp` DESC) as data ) as a
JOIN
(SELECT rn_IEEE, rn_short, device_name FROM devicenames dn JOIN (SELECT DISTINCTROW rn_IEEE, rn_short FROM (SELECT * FROM `remotenodes` ORDER BY `rn_timestamp` DESC) as remotenodes GROUP BY rn_IEEE) as rn ON dn.device_IEEE = rn.rn_IEEE) as b
ON a.dt_short = b.rn_short

Missing record in a complex SELECT FULL JOIN statement

I created a SQL statement that should return the number of appointments received by all salesmen. I work with 3 tables, Contract, Salesmen and Appointment, and I need to show how many appointments were received by each salesman.
My problem is that although I use a FULL JOIN, the result doesn't show people who didn't receive any appointments. I found that there is a problem about constraint.
I took a look at the EXCEPT, INTERSECT and UNION options, but none of those solved my problem. What other way could I use to get the full list of reps, whether or not they have received any appointments?
Here is an example of the statement I used:
SELECT C.RepID, COUNT(A.AppID) AS AppAttrib, C.AppointmentPurchased, S.Name, S.FirstName
FROM Repartition.dbo.Contract C
FULL JOIN Repartition.DBO.Appointment A
ON C.RepID = A.RepID
LEFT JOIN Repartition.DBO.Salesmen S
ON S.RepID = C.RepID
GROUP BY C.RepID, V.Nom, S.Name, S.FirstName
Thanks for your help,
Antenor
Not knowing your table structure in detail, I'm just guessing here - but I think your query starts at the wrong place - you should start with the Salesmen table, and go from there. So basically, select those columns from the Salesmen table that you need, and then join in the other tables as needed.
Something like this:
SELECT
s.RepID, S.Name, S.FirstName,
COUNT(A.AppID) AS AppAttrib,
C.AppointmentPurchased
FROM
Repartition.dbo.Salesmen s
LEFT OUTER JOIN
Repartition.dbo.Contract c ON s.RepID = c.RepID
LEFT OUTER JOIN
Repartition.dbo.Appointment a ON s.RepID = a.RepID
GROUP BY
s.RepID, s.Name, s.FirstName, c.AppointmentPurchased
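To see why starting from Salesmen keeps reps with no appointments, here is a small sketch in Python's sqlite3 module (a stand-in for SQL Server). The sample names and rows are invented, and the Contract table is omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Salesmen (RepID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Appointment (AppID INTEGER PRIMARY KEY, RepID INTEGER);
INSERT INTO Salesmen VALUES (1, 'Durand'), (2, 'Martin');
-- Only rep 1 has appointments; rep 2 has none.
INSERT INTO Appointment VALUES (100, 1), (101, 1);
""")

# LEFT JOIN keeps every salesman; COUNT(a.AppID) ignores the NULLs
# produced for salesmen without a matching appointment.
per_rep = conn.execute("""
SELECT s.RepID, s.Name, COUNT(a.AppID) AS AppAttrib
FROM Salesmen s
LEFT JOIN Appointment a ON a.RepID = s.RepID
GROUP BY s.RepID, s.Name
ORDER BY s.RepID
""").fetchall()

print(per_rep)
```

Counting a.AppID rather than * matters here: COUNT(*) would report 1 for a rep with no appointments, because the LEFT JOIN still produces one row for that rep.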

MySQL - Join a table twice with a main table

I'm not sure if this can be done. But I just wanted to check with the experts out here.
My case is:
I have a table tbl_campaign which stores campaigns and has a one-to-many relation with a table called tbl_campaign_user, where the users selected during a campaign are stored along with the campaign id (tbl_campaign_user.cu_campaign_id = tbl_campaign.campaign_id).
The second table (tbl_campaign_user) has a status field which is either 0 or 1, denoting unsent/sent. I want to write a single SQL query that reads the campaign data as well as the number of sent and unsent campaign users (which is why I'm joining twice on the second table).
I tried the query below, but I get the same count for sent and unsent.
SELECT `tbl_campaign`.*,
COUNT(sent.cu_id) as numsent,
COUNT(unsent.cu_id) as num_unsent FROM (`tbl_campaign`)
LEFT JOIN tbl_campaign_user as sent on (sent.cu_campaign_id = tbl_campaign.campaign_id and sent.cu_status='1')
LEFT JOIN tbl_campaign_user as unsent on (unsent.cu_campaign_id = tbl_campaign.campaign_id and unsent.cu_status='0')
WHERE `tbl_campaign`.`campaign_id` = '19'
I tried debugging by breaking the query into two parts:
=>
SELECT `tbl_campaign`.*,
COUNT(unsent.cu_id) as num_unsent FROM (`tbl_campaign`)
Left join tbl_campaign_user as unsent on (unsent.cu_campaign_id = tbl_campaign.campaign_id and unsent.cu_status='0')
WHERE `tbl_campaign`.`campaign_id` = '19'
The above works exactly as wanted. And so does the one below:
=>
SELECT `tbl_campaign`.*,
COUNT(sent.cu_id) as numsent FROM (`tbl_campaign`)
Left join tbl_campaign_user as sent on (sent.cu_campaign_id = tbl_campaign.campaign_id and sent.cu_status='1')
WHERE `tbl_campaign`.`campaign_id` = '19'
I am not sure what I've been doing wrong while merging the two. I know I don't know much about joins so possibly a conceptual error? Please could anyone help me?
Thx in advance!
You only need to join tbl_campaign_user once and
count (sum, whatever) how many times cu_status was zero/one.
SELECT c.campaign_id,
count(u.cu_id) as num_all_campaign_users,
sum(u.cu_status) as num_sent_campaign_users,
count(u.cu_id) - sum(u.cu_status) as num_unsent_campaign_users
FROM `tbl_campaign` c
LEFT JOIN tbl_campaign_user as u on (u.cu_campaign_id = c.campaign_id)
WHERE c.campaign_id = '19'
group by c.campaign_id
Note that this is sort of pseudo code, you may have to elaborate
the sum/count in the select clause and the group by clause as well.
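The identical counts in the question come from the two joins multiplying each other: every sent row is paired with every unsent row. A minimal sketch with Python's sqlite3 module shows both behaviours side by side; the sample rows are invented, and cu_status is stored as an integer here as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_campaign (campaign_id INTEGER PRIMARY KEY);
CREATE TABLE tbl_campaign_user (
    cu_id INTEGER PRIMARY KEY, cu_campaign_id INTEGER, cu_status INTEGER);
INSERT INTO tbl_campaign VALUES (19);
-- Three sent users and two unsent users for campaign 19.
INSERT INTO tbl_campaign_user (cu_campaign_id, cu_status)
VALUES (19, 1), (19, 1), (19, 1), (19, 0), (19, 0);
""")

# Joining twice yields 3 x 2 = 6 combined rows, so both counts come out as 6.
double_join = conn.execute("""
SELECT COUNT(sent.cu_id), COUNT(unsent.cu_id)
FROM tbl_campaign c
LEFT JOIN tbl_campaign_user sent
       ON sent.cu_campaign_id = c.campaign_id AND sent.cu_status = 1
LEFT JOIN tbl_campaign_user unsent
       ON unsent.cu_campaign_id = c.campaign_id AND unsent.cu_status = 0
WHERE c.campaign_id = 19
""").fetchone()

# Joining once and summing per status counts each user exactly once.
single_join = conn.execute("""
SELECT SUM(CASE WHEN u.cu_status = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN u.cu_status = 0 THEN 1 ELSE 0 END)
FROM tbl_campaign c
LEFT JOIN tbl_campaign_user u ON u.cu_campaign_id = c.campaign_id
WHERE c.campaign_id = 19
GROUP BY c.campaign_id
""").fetchone()

print(double_join, single_join)
```

This also explains why each half of the query worked on its own: with a single join there is nothing to multiply against.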