Data base One To Many Relationship Query - mysql

I have 3 tables in my DB; Transactions, transaction_details, and accounts - basically as below.
transactions :
id
details
by_user
created_at
trans_details :
id
trans_id (foreign key)
account_id
account_type (Enum -[c,d])
amount
Accounts :
id
sub_name
In each transaction each account may be creditor or debtor. What I'm trying to get is an account statement (ex : bank account movements) so I need to query each movement when the account is type = c (creditor) or the account type is = d (debtor)
trans_id, amount, created_at, creditor_account, debtor_account
Update : I tried the following query but i get the debtor column values all Null!
SELECT transactions.created_at,trans_details.amount,(case WHEN trans_details.type = 'c' THEN sub_account.sub_name END) as creditor,
(case WHEN trans_details.type = 'd' THEN sub_account.sub_name END) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN sub_account on trans_details.account_id = sub_account.id
GROUP by transactions.id
After the help of #Jalos I had to convert the query to Laravel which also toke me 2 more hours to convert and get the correct result :) below is the Laravel code in case some one needs to perform such query
I also added between 2 dates functionality
public function accountStatement($from_date,$to_date)
{
$statemnt = DB::table('transactions')
->Join('trans_details as credit_d',function($join) {
$join->on('credit_d.trans_id','=','transactions.id');
$join->where('credit_d.type','c');
})
->Join('sub_account as credit_a','credit_a.id','=','credit_d.account_id')
->Join('trans_details as debt_d',function($join) {
$join->on('debt_d.trans_id','=','transactions.id');
$join->where('debt_d.type','d');
})
->Join('sub_account as debt_a','debt_a.id','=','debt_d.account_id')
->whereBetween('transactions.created_at',[$from_date,$to_date])
->select('transactions.id','credit_d.amount','transactions.created_at','credit_a.sub_name as creditor','debt_a.sub_name as debtor')
->get();
return response()->json(['status_code'=>2000,'data'=>$statemnt , 'message'=>''],200);
}

Your transactions table denotes transaction records, while your accounts table denotes account records. Your trans_details table denotes links between transactions and accounts. So, since in a transaction there is a creditor and a debtor, I assume that trans_details has exactly two records for each transaction:
select transactions.id, creditor_details.amount, transactions.created_at, creditor.sub_name, debtor.sub_name
from transactions
join trans_details creditor_details
on transactions.id = creditor_details.trans_id and creditor_details.account_type = 'c'
join accounts creditor
on creditor_details.account_id = creditor.id
join trans_details debtor_details
on transactions.id = debtor_details.trans_id and debtor_details.account_type = 'd'
join accounts debtor
on debtor_details.account_id = debtor.id;
EDIT
As promised, I am looking into the query you have written. It looks like this:
SELECT transactions.id,trans_details.amount,(case WHEN trans_details.type = 'c' THEN account.name END) as creditor,
(case WHEN trans_details.type = 'd' THEN account.name END) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN account on trans_details.account_id = account.id
GROUP by transactions.id
and it is almost correct. The problem is that due to the group-by MySQL can only show a single value for each record for creditor and debtor. However, we know that there are exactly two values for both: there is a null value for creditor when you match with debtor and a proper creditor value when you match with creditor. The case for debtor is similar. My expectation for this query would have been that MySQL would throw an error because you did not group by these computed case-when fields, yet, there are several values, but it seems MySQL can surprise me after so many years :)
From the result we see that MySQL probably found the first value and used that both for creditor and debtor. Since it met with a creditor match as a first match, it had a proper creditor value and a null debtor value. However, if you write bullet-proof code, you will never meet these strange behavior. In our case, doing some minimalistic improvements on your code transforms it into a bullet-proof version of it and provides correct results:
SELECT transactions.id,trans_details.amount,max((case WHEN trans_details.type = 'c' THEN account.name END)) as creditor,
max((case WHEN trans_details.type = 'd' THEN account.name END)) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN account on trans_details.account_id = account.id
group by transactions.id
Note, that the only change I did with your code is to wrap a max() function call around the case-when definitions, so we avoid the null values, so your approach was VERY close to a bullet-proof solution.
Fiddle: http://sqlfiddle.com/#!9/d468dc/10/0
However, even though your thought process was theoretically correct (theoretically there is no difference between theory and practice, but in practice they are usually different) and some slight changes are transforming it into a well-working code, I still prefer my query, because it avoids group by clauses, which can be useful, if necessary, but here it's unnecessary to do group by, which is probably better in terms of performance, memory usage, it's easier to read and keeps more options open for you for your future customisations. Yet, your try was very close to a solution.
As about my query, the trick I used was to do several joins with the same tables, aliasing them and from that point differentiating them as if they were different tables. This is a very useful trick that you will need a lot in the future.

Related

How to Join to a table where the result can sometimes lead with a - sign?

Hopefully i can explain this well enough. I have a bit of a unique issue where the customer system we use can change a ID in the database in the background based on the products status.
What this means is when i want to report old products we don't use anymore along side active products there ID differs between the two key tables depending on there status. This means Active products in the product table match that of the stock item table with both showing as 647107376 but when the product is no long active the StockItem table will present as 647107376 but the table that holds the product information the id presents as -647107376
This is proving problematic for me when i comes to joining the tables together to get the information needed. Originally i had my query set up like this:
SELECT
Company_0.CoaCompanyName
,SopProduct_0.SopStiStockItemCode AS hbpref
,SopProduct_0.SopStiCustomerStockCode AS itemref
,SopProduct_0.SopDescription AS ldesc
,StockMovement_0.StmOriginatingEntityID AS Goodsin
FROM
SBS.PUB.StockItem StockItem_0
LEFT JOIN SBS.PUB.SopProduct SopProduct_0 ON StockItem_0.StockItemID = SopProduct_0.StockItemID
LEFT JOIN SBS.PUB.Company Company_0 ON SopProduct_0.CompanyID = Company_0.CompanyID
LEFT JOIN SBS.PUB.StockMovement StockMovement_0 ON StockItem_0.StockItemID = StockMovement_0.StockItemID
WHERE
Company_0.CoaCompanyName = ?
AND StockMovement_0.MovementTypeID = '173355'
AND StockMovement_0.StmMovementDate >= ? AND StockMovement_0.StmMovementDate <= ?
AND StockMovement_0.StmQty <> 0
AND StockMovement_0.StockTypeID ='12049886'
Unfortunately though what this means is any of the old product will not show because there is no matching id due to the SopProduct table presenting the StockItemID with a leading -
So from this i thought best to use a case when statement with a nested concat and left in it to bring through the results but this doesn't appear to work either sample of the join below:
LEFT JOIN SBS.PUB.SopProduct SopProduct_0 ON (CASE WHEN LEFT(SopProduct_0.StockItemID,1) = "-" THEN CONCAT("-",StockItem_0.StockItemID) ELSE StockItem_0.StockItemID END) = SopProduct_0.StockItemID
Can anyone else think of a way around this issue? I am working with a Progress OpenEdge ODBC.
Numbers look like numbers. If they are, you can use abs():
ON StockItem_0.StockItemID = ABS(SopProduct_0.StockItemID)
Otherwise a relatively simple method is:
ON StockItem_0.StockItemID IN (SopProduct_0.StockItemID, CONCAT('-', SopProduct_0.StockItemID))
Note that non-equality conditions often slow down JOIN operations.
Using an or in the join should work:
LEFT JOIN SBS.PUB.SopProduct SopProduct_0
ON SopProduct_0.StockItemID = StockItem_0.StockItemID
OR
SopProduct_0.StockItemID = CONCAT("-", StockItem_0.StockItemID)
You might need to cast the result of the concat to a number (if the ids are stored as numbers).
Or you could use the abs function too (assuming the ids are numbers):
LEFT JOIN SBS.PUB.SopProduct SopProduct_0
ON SopProduct_0.StockItemID = abs(StockItem_0.StockItemID)

MySQL Ignoring Specific Rows?

I am running a SQL query to obtain results with the column applicationstatus that aren't S to tell which position is still open. There are 3 application status U=Unsuccessful, O = ongoing and S = successful. This works fine. Below is the code I am running.
SELECT DISTINCT position.position_ID, Title, EmployerName, Industry
FROM position
JOIN application ON position.position_id = application.position_id
JOIN employer ON position.employer_id = employer.employer_id
WHERE applicationstatus != 'S'
The problem I am facing is with the above query code, let's assume position_ID 3212 has received 3 applications, with one successful application; I get 2 results for the mentioned position(Excluding the successful one) like this.
Is there a way to filter it so that if a position already has a successful application, then the rows with the same position ID will be ignored?
Add a NOT EXISTS condition to exclude positions with successful applications:
SELECT DISTINCT
p.position_ID,
Title,
EmployerName,
Industry
FROM position p
JOIN application a
ON p.position_id = a.position_id
JOIN employer e
ON p.employer_id = e.employer_id
WHERE
applicationstatus != 'S'
AND NOT EXISTS(
SELECT 1
FROM application
WHERE
position_id = a.position_id
AND applicationstatus = 'S'
)
Note that I've rewritten your query to use meaningful aliases. You should also do this to improve readability and maintainability of your code.

How to filter results

I have a query which is used to obtain information about the owner of a vehicle at a particular point in time, when it is sighted (from the vehicle_sightings table). I have attached a snippet of part of the query below:
SELECT
sighting_id
FROM
vehicle_sightings
INNER JOIN
vehicle_vrn ON vehicle_sightings.plate = vehicle_vrn.vrnno
INNER JOIN
vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
WHERE
vehicle_sightings.seenDate >= vehicle_ownership.ownership_start_date
AND (vehicle_sightings.seenDate <= vehicle_ownership.ownership_end_date
OR vehicle_ownership.ownership_end_date IS NULL
OR vehicle_ownership.ownership_end_date = '0001-01-01 00:00:00')
This works well for most scenarios where a vehicle only has one owner in its history. However, there are some instances where the ownership_end_date field is not filled in (it is filled in for most cases, as it indicates that a vehicle would have changed hands, and is from that stage onwards passed on to a new owner). In these instances where it is not filled in (or left default), all entries for that ownership history are returned, such as the below case:
In that case above, the query returns both of those records as the seenDate fits in both of them since the end date is not filled in (and has the default value in that case). I therefore need to modify my query to return the record with the highest ownership_start_date in those cases.
I tried to do this by adding the following at the end:
GROUP BY sighting_id HAVING seenDate >= MAX(ownership_start_date)
This however did not work, as many less records were returned. Is there a clean way this can be achieved, maybe without the GROUP BY?
To start with, using a default date as you have is an extremely bad idea. You should already have an idea of that, because now you're stuck coding against a hard-coded date in some scenarios. When another developer is coding against the database (or even you at a later date) they now have to both know and remember that they have to code for exceptions based on some hard-coded date.
Further, your ownership_end_date should always be after the ownership_start_date, which this "default" date is now going to violate. If you don't know the date or the date doesn't exist yet then it should be NULL - that's exactly what NULL is for - unknown.
For your specific issue, you can do this with a LEFT JOIN that checks for other owners that fit the criteria and excludes the row if a better one exists. You didn't provide all of your table structures and it was a little unclear on whether or not you just wanted the latest owner before the sighted date (what I've done) or all owners who owned the car after the sighted date, so I don't know if this works, but something like this:
SELECT
VS.sighting_id -- ALWAYS use table aliases or prefixes for clarity
FROM
vehicle_sightings VS
INNER JOIN vehicle_vrn VRN ON VRN.vrnno = VS.plate
INNER JOIN vehicle_ownership VO ON VO.fk_sysno = VRN.fk_sysno
LEFT OUTER JOIN vehicle_ownership VO2 ON
VO2.fk_sysno = VRN.fk_sysno AND
VO2.ownership_start_date <= VS.seenDate AND
(
VO2.ownership_end_date >= VS.seenDate OR
VO2.ownership_end_date IS NULL OR
VO2.ownership_end_date = '0001-01-01 00:00:00'
) AND
VO2.ownership_start_date > VO.ownership_start_date
WHERE
VS.seenDate >= VO.ownership_start_date AND
(
VS.seenDate <= VO.ownership_end_date OR
VO.ownership_end_date IS NULL OR
VO.ownership_end_date = '0001-01-01 00:00:00'
) AND
VO2.id IS NULL -- Or some other non-nullable column
One last caveat: decide on a naming convention and stick to it (seenDate vs ownership_end_date for example) and use names that make sense (what is an fk_sysno??)
Here is a solution which doesn't require a subquery. It ensures that there exists no ownership records greater than the record which is returned.
SELECT
sighting_id
FROM
vehicle_sightings
INNER JOIN
vehicle_vrn ON vehicle_sightings.plate = vehicle_vrn.vrnno
INNER JOIN
vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
LEFT OUTER JOIN
vehicle_ownership vo2 ON vo2.fk_sysno = vehicle_ownership.fk_sysno
AND vo2.ownership_start_date > vehicle_ownership.ownership_start_date
WHERE
vehicle_sightings.seenDate >= vehicle_ownership.ownership_start_date
AND (vehicle_sightings.seenDate <= vehicle_ownership.ownership_end_date
OR vehicle_ownership.ownership_end_date IS NULL
OR vehicle_ownership.ownership_end_date = '0001-01-01 00:00:00')
AND vo2.fk_sysno IS NULL

Having trouble with an IFNULL in a mySQL WHERE clause

Before anyone says, I have searched through for a suitable answer for my issue but cannot find anything specific enough so I thought I'd ask it.
Basically I am trying to select a bunch of data for a report of people who have made loan applications to a website, but there are two different types: unsecured and guarantee. I need to place an IFNULL statement in the WHERE clause so that I ONLY use that clause if a certain other field isn't null.
Here is my statement:
SELECT
la.`lms_loan_application_id`,
la.`created`,
la.`updated`,
la.`loan_amount`,
la.`loan_term`,
la.`loan_document_fee`,
la.`broker_reference`,
la.`broker_sub_reference`,
laa.`first_name`,
laa.`surname`,
laa.`dob`,
laa.`email`,
laa.`mobile_number`,
laaAd.`address_postcode`,
lag.`first_name`,
lag.`surname`,
lag.`dob`,
lag.`email`,
lag.`mobile_number`,
lagAd.`address_postcode`,
lagAd.`housing_status`
FROM
loan_application AS la
JOIN
loan_application_applicant AS laa ON la.`id` = laa.`loan_application`
LEFT JOIN
loan_application_guarantor AS lag ON la.`id` = lag.`loan_application`
JOIN
loan_application_address AS laaAd ON laaAd.`loan_application_applicant` = laa.`id`
LEFT JOIN
loan_application_address AS lagAd ON lagAd.`loan_application_guarantor` = lag.`id`
WHERE
la.`status` = 'signature_given'
AND ! IFNULL(lag.`first_name`,
lag.`status` = 'signature_given')
AND laa.`status` = 'signature_given'
AND ! IFNULL(lag.`first_name`,
lagAd.`current_address` = 1)
AND laaAd.`current_address` = 1
ORDER BY la.`updated` DESC
LIMIT 10000
As you can see, I have attempted to use the IFNULLs (although in a negated way, which I assume works?) but all I get is duplicate row results and not the result set I really want.
Basically, I need to use the where clause "lag.status = 'signature_given" and "lagAd.current_address = 1" ONLY if the lag.first_name field is NOT null (i.e. there is a guarantor name) otherwise the status won't exist, and therefore the results of unsecured loans will not show. Hope I'm explaining this well enough!
In summary, I need to show all loan information, unsecured and guaranteed, and use a negated IFNULL in order to determine when the WHERE clause is to be taken into consideration.
Any help appreciated!
Thank you in advance
Michael
From this MySQLTutorial article:
Notice that you should avoid using the IFNULL function in the WHERE clause, because it degrades the performance of the query. If you want to check if a value is NULL or not, you can use IS NULL or IS NOT NULL in the WHERE clause.
Here is a WHERE clause which implements your logic correctly using IS NULL and IS NOT NULL instead of IFNULL:
WHERE la.`status` = 'signature_given' AND
(lag.`first_name` IS NULL OR
(lag.`first_name` IS NOT NULL AND lag.`status` = 'signature_given')) AND
laa.`status` = 'signature_given' AND
(lag.`first_name` IS NULL OR
(lag.`first_name` IS NOT NULL AND lagAd.`current_address` = 1)) AND
laaAd.`current_address` = 1

SQL Query returns fake duplicate results?

I've been trying to write a few little plugins for personal use with WHMCS. Essentially what I'm trying to do here is grab a bunch of information about a certain order(s), and return it as an array in Perl.
The Perl bit I'm fine with, it's the MySQL query I've formed that's giving me stress..
I know it's big and messy, but what I have is:
SELECT tblhosting.id, tblhosting.userid, tblhosting.orderid, tblhosting.packageid, tblhosting.server, tblhosting.domain, tblhosting.username, tblorders.invoiceid, tblproducts.gid, tblservers.ipaddress, tblinvoices.status
FROM tblhosting, tblproducts, tblorders, tblinvoices, tblservers
WHERE tblorders.status = 'Pending'
AND tblproducts.gid = '2'
AND tblservers.id = tblhosting.server
AND tblorders.id = tblhosting.orderid
AND tblinvoices.id = tblorders.invoiceid
AND tblinvoices.status = 'Paid'
I don't know if this /should/ work, but I assume I'm on the right track as it does return what I'm looking for, however it returns everything twice.
For example, I created a new account with the domain 'sunshineee.info', and then in PHPMyAdmin ran the above query.
id userid orderid packageid server domain username invoiceid gid ipaddress status
13 7 17 6 1 sunshineee.info sunshine 293 2 184.22.145.196 Paid
13 7 17 6 1 sunshineee.info sunshine 293 2 184.22.145.196 Paid
Could anyone give me a heads up on where I've gone wrong with this one.. Obvioiusly (maybe not obviously enough) I want this as only one row returned per match.. I've tried it with >1 domain in the database and it returned duplicates for each of the matches..
Any help would be much appreciated
:)
SELECT distinct tblhosting.id, tblhosting.userid, tblhosting.orderid, tblhosting.packageid, tblhosting.server, tblhosting.domain, tblhosting.username, tblorders.invoiceid, tblproducts.gid, tblservers.ipaddress, tblinvoices.status
FROM tblhosting, tblproducts, tblorders, tblinvoices, tblservers
WHERE tblorders.status = 'Pending'
AND tblproducts.gid = '2'
AND tblservers.id = tblhosting.server
AND tblorders.id = tblhosting.orderid
AND tblinvoices.id = tblorders.invoiceid
AND tblinvoices.status = 'Paid'
Well, its near impossible without any table definitions, but you are doing a lot of joins there. You are starting with tblhosting.id and working your way 'up' from there. If any of the connected tables has a double entry, you'll get more hits
You could add a DISTINCT to your query, but that would not fix the underlying issue. It could be a problem with your data: do you have 2 invoices? Maybe you should select everything (SELECT * FROM) and check what is returned, maybe check your tables for double content.
Using DISTINCT is most of the time not a good choice: it means either your query or your data is incorrect (or you don't understand them thoroughly). It might get you the right result for now, but can get you in trouble later.
A guess about the reason this happens:
You do not connect the products table to the chain of id's. So you are basically adding a '2' to your result as far as I can see. You join on products, and the only thing that limits that table is that "gid" should be 2. So if you add a product with gid 2 you get another result. Either join it (maybe tblproduct.orderid = tblorders.id ? just guessing here) or just remove it, as it does nothing as far as I can see.
If you want to make your query a bit clearer, try not implicitly joining, but do it like this. So you can actually see what's happening
SELECT tblhosting.id, tblhosting.userid, tblhosting.orderid, tblhosting.packageid, tblhosting.server, tblhosting.domain, tblhosting.username, tblorders.invoiceid, tblproducts.gid, tblservers.ipaddress, tblinvoices.status
FROM tblhosting
JOIN tblproducts ON /*you're missing something here!*/
JOIN tblorders ON tblorders.id = tblhosting.orderid
JOIN tblinvoices ON tblinvoices.id = tblorders.invoiceid
JOIN tblservers ON tblservers.id = tblhosting.server
WHERE
tblorders.status = 'Pending'
AND tblproducts.gid = '2'
AND tblinvoices.status = 'Paid'
I don't see in your query JOIN to tblproducts, it seems to be a reason.