Reduce MySQL Code down or combine SELECT Statements - mysql

I have made a few relations to do with a banking database system.
This is my current code. The table has
SELECT COUNT(AccountType) AS Student_Total FROM Account
WHERE AccountType ='Student'
and SortCode = 00000001;
SELECT COUNT(AccountType) AS Student_Total FROM Account
WHERE AccountType ='Student'
and SortCode = 00000002;
SELECT COUNT(AccountType) AS Student_Total FROM Account
WHERE AccountType ='Student'
and SortCode = 00000003;
The rest of the code duplicates this part for the next account type, looping through sort codes 1-3 again.
I was wondering if there is a more elegant way of producing this. I need to count the number of student, current and saver accounts for each bank.
Or is there a way to combine many SELECTs into one neat table?

That's what GROUP BY is for!
SELECT SortCode,COUNT(AccountType) AS Student_Total FROM Account
WHERE AccountType ='Student'
GROUP BY SortCode;
UPDATE:
You can also GROUP BY multiple columns to count every account type for every sort code in one query:
SELECT SortCode,AccountType,COUNT(AccountType) AS Account_Total FROM Account
GROUP BY SortCode,AccountType;
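To try the multi-column GROUP BY without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the count_accounts helper are made up for illustration, and SQLite stands in for MySQL, so treat it as a demonstration of the idea rather than of your actual schema.

```python
import sqlite3

def count_accounts(rows):
    """Return (SortCode, AccountType, count) for every combination present."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Account (SortCode INTEGER, AccountType TEXT)")
    conn.executemany("INSERT INTO Account VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT SortCode, AccountType, COUNT(*) AS Account_Total
        FROM Account
        GROUP BY SortCode, AccountType
        ORDER BY SortCode, AccountType
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (SortCode, AccountType)
sample = [(1, "Student"), (1, "Student"), (1, "Saver"), (2, "Current"), (3, "Student")]
counts = count_accounts(sample)
for row in counts:
    print(row)
```

One query replaces the nine separate SELECTs; combinations with no accounts simply do not appear in the result.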

You could also apply a pivot approach to this query so that it always returns a single row with a fixed, known set of columns. However, GROUP BY is more flexible about the rows returned, especially if you have a large number of distinct values to tally.
select
A.AccountType,
SUM( IF( A.SortCode = 1, 1, 0 )) as SortCode1Cnt,
SUM( IF( A.SortCode = 2, 1, 0 )) as SortCode2Cnt,
SUM( IF( A.SortCode = 3, 1, 0 )) as SortCode3Cnt
from
Account A
where
A.AccountType = 'Student'
AND A.SortCode IN ( 1, 2, 3 )
group by
A.AccountType
Note: your sort code appears to be numeric, since there are no quotes around it to indicate a character string, so the leading zeros are irrelevant. And if you are only counting a single account type, you don't need the leading AccountType column and can remove the GROUP BY too.
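The same pivot idea can be extended to all three account types at once. The sketch below is an assumption-laden illustration: it runs on SQLite through Python's sqlite3 module, uses CASE WHEN instead of MySQL's IF() (CASE works in both), and the sample rows and the pivot_counts helper are invented.

```python
import sqlite3

PIVOT_SQL = """
SELECT AccountType,
       SUM(CASE WHEN SortCode = 1 THEN 1 ELSE 0 END) AS SortCode1Cnt,
       SUM(CASE WHEN SortCode = 2 THEN 1 ELSE 0 END) AS SortCode2Cnt,
       SUM(CASE WHEN SortCode = 3 THEN 1 ELSE 0 END) AS SortCode3Cnt
FROM Account
WHERE SortCode IN (1, 2, 3)
GROUP BY AccountType
ORDER BY AccountType
"""

def pivot_counts(rows):
    """One result row per account type, one count column per sort code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Account (SortCode INTEGER, AccountType TEXT)")
    conn.executemany("INSERT INTO Account VALUES (?, ?)", rows)
    result = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (SortCode, AccountType)
sample = [(1, "Student"), (1, "Student"), (2, "Student"), (3, "Saver"), (2, "Current")]
pivot = pivot_counts(sample)
for row in pivot:
    print(row)
```

The trade-off is the one described above: the columns are fixed in the query text, so a new sort code means editing the SQL.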

Related

Data base One To Many Relationship Query

I have 3 tables in my DB: transactions, trans_details, and accounts, basically as below.
transactions :
id
details
by_user
created_at
trans_details :
id
trans_id (foreign key)
account_id
account_type (Enum -[c,d])
amount
Accounts :
id
sub_name
In each transaction, an account may be the creditor or the debtor. What I'm trying to get is an account statement (e.g. bank account movements), so I need to query each movement where the account type is c (creditor) or d (debtor), with these columns:
trans_id, amount, created_at, creditor_account, debtor_account
Update: I tried the following query, but the debtor column values are all NULL!
SELECT transactions.created_at,trans_details.amount,(case WHEN trans_details.type = 'c' THEN sub_account.sub_name END) as creditor,
(case WHEN trans_details.type = 'd' THEN sub_account.sub_name END) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN sub_account on trans_details.account_id = sub_account.id
GROUP by transactions.id
After the help of @Jalos I had to convert the query to Laravel, which took me 2 more hours to get the correct result :) Below is the Laravel code in case someone needs to perform such a query.
I also added filtering between 2 dates.
public function accountStatement($from_date,$to_date)
{
$statemnt = DB::table('transactions')
->Join('trans_details as credit_d',function($join) {
$join->on('credit_d.trans_id','=','transactions.id');
$join->where('credit_d.type','c');
})
->Join('sub_account as credit_a','credit_a.id','=','credit_d.account_id')
->Join('trans_details as debt_d',function($join) {
$join->on('debt_d.trans_id','=','transactions.id');
$join->where('debt_d.type','d');
})
->Join('sub_account as debt_a','debt_a.id','=','debt_d.account_id')
->whereBetween('transactions.created_at',[$from_date,$to_date])
->select('transactions.id','credit_d.amount','transactions.created_at','credit_a.sub_name as creditor','debt_a.sub_name as debtor')
->get();
return response()->json(['status_code'=>2000,'data'=>$statemnt , 'message'=>''],200);
}
Your transactions table denotes transaction records, while your accounts table denotes account records. Your trans_details table denotes links between transactions and accounts. So, since in a transaction there is a creditor and a debtor, I assume that trans_details has exactly two records for each transaction:
select transactions.id, creditor_details.amount, transactions.created_at, creditor.sub_name as creditor_account, debtor.sub_name as debtor_account
from transactions
join trans_details creditor_details
on transactions.id = creditor_details.trans_id and creditor_details.account_type = 'c'
join accounts creditor
on creditor_details.account_id = creditor.id
join trans_details debtor_details
on transactions.id = debtor_details.trans_id and debtor_details.account_type = 'd'
join accounts debtor
on debtor_details.account_id = debtor.id;
EDIT
As promised, I am looking into the query you have written. It looks like this:
SELECT transactions.id,trans_details.amount,(case WHEN trans_details.type = 'c' THEN account.name END) as creditor,
(case WHEN trans_details.type = 'd' THEN account.name END) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN account on trans_details.account_id = account.id
GROUP by transactions.id
and it is almost correct. The problem is that, because of the GROUP BY, MySQL can only show a single value per transaction for creditor and for debtor. However, we know that each has exactly two candidate values: creditor is NULL on the row that matches the debtor and holds the proper name on the row that matches the creditor, and the case for debtor is similar. I would have expected MySQL to throw an error, because these computed CASE WHEN fields are neither grouped by nor aggregated even though they have several values per group. MySQL only raises that error when the ONLY_FULL_GROUP_BY SQL mode is enabled; with it disabled, the query runs and each such column takes its value from an arbitrary row of the group. It seems MySQL can still surprise me after so many years :)
From the result we see that MySQL probably took the first row it found and used it for both creditor and debtor. Since the first match was the creditor row, it had a proper creditor value and a NULL debtor value. If you write bullet-proof code, you will never run into this strange behavior. In our case, a minimal improvement turns your code into a bullet-proof version that provides correct results:
SELECT transactions.id,trans_details.amount,max((case WHEN trans_details.type = 'c' THEN account.name END)) as creditor,
max((case WHEN trans_details.type = 'd' THEN account.name END)) as debtor from transactions
JOIN trans_details on transactions.id = trans_details.trans_id
JOIN account on trans_details.account_id = account.id
group by transactions.id
Note that the only change I made to your code is wrapping a max() call around the CASE WHEN expressions. Since max() ignores NULL values, each group keeps the non-NULL name, so your approach was VERY close to a bullet-proof solution.
Fiddle: http://sqlfiddle.com/#!9/d468dc/10/0
However, even though your thought process was theoretically correct (theoretically there is no difference between theory and practice, but in practice they are usually different) and a slight change turns it into working code, I still prefer my query because it avoids GROUP BY. Grouping can be useful when necessary, but here it is not needed; avoiding it is probably better for performance and memory usage, is easier to read, and keeps more options open for future customisations. Still, your attempt was very close to a solution.
As for my query, the trick I used was to join the same tables several times, aliasing them and from that point treating them as if they were different tables. This is a very useful trick that you will need a lot in the future.
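To see that the aliased double join and the GROUP BY with max() agree, here is a small sketch on SQLite via Python's sqlite3 module. The table layout follows the question, but the sample accounts, amounts and dates are invented, and max(amount) is added in the grouped variant on the assumption that both detail rows of a transaction carry the same amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, sub_name TEXT);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, created_at TEXT);
CREATE TABLE trans_details (
    id INTEGER PRIMARY KEY, trans_id INTEGER, account_id INTEGER,
    account_type TEXT, amount INTEGER
);
-- Hypothetical sample data: two transactions, each with one 'c' and one 'd' row
INSERT INTO accounts VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO transactions VALUES (1, '2020-01-01'), (2, '2020-01-02');
INSERT INTO trans_details VALUES
    (1, 1, 1, 'c', 50), (2, 1, 2, 'd', 50),
    (3, 2, 2, 'c', 20), (4, 2, 1, 'd', 20);
""")

# Variant 1: join trans_details and accounts twice under different aliases.
self_join = conn.execute("""
SELECT t.id, cd.amount, ca.sub_name AS creditor, da.sub_name AS debtor
FROM transactions t
JOIN trans_details cd ON cd.trans_id = t.id AND cd.account_type = 'c'
JOIN accounts ca ON ca.id = cd.account_id
JOIN trans_details dd ON dd.trans_id = t.id AND dd.account_type = 'd'
JOIN accounts da ON da.id = dd.account_id
ORDER BY t.id
""").fetchall()

# Variant 2: one join, then collapse the two detail rows per transaction.
grouped = conn.execute("""
SELECT t.id, MAX(td.amount),
       MAX(CASE WHEN td.account_type = 'c' THEN a.sub_name END) AS creditor,
       MAX(CASE WHEN td.account_type = 'd' THEN a.sub_name END) AS debtor
FROM transactions t
JOIN trans_details td ON td.trans_id = t.id
JOIN accounts a ON a.id = td.account_id
GROUP BY t.id
ORDER BY t.id
""").fetchall()

print(self_join)
print(grouped)
```

Both variants return one row per transaction here. They would diverge if a transaction had more than one creditor or debtor row: the join would multiply rows, while the grouped query would silently keep only one name.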

Graded Assignment [Zaption] Database

Most specifically, I'm having trouble returning from LibreOffice Base [HSQLdb] a list of grades organized by (1) class, (2) assignment, (3) student's last name.
I want this output so I can run a script to copy the grades from the database to an online gradebook (which doesn't have an API [sadface])
I suspect several possible causes for this problem:
My relational structure may need tweaking.
I somehow need to implement a "student ID." On Zaption, Students make their submissions under whatever "ZaptionName" they choose to use. I then manually match ZaptionName to RosterFullName in the second table shown.
Zaption allows multiple submissions by the same "student" for the same assignment. Because multiple submissions are allowed, I run a FilterLowMultiples query to select the highest grade for that assignment for that student.
FilterLowMultiples:
SELECT MAX( "Grade" ) "Grade", "RosterFullName",
"Assignment", MAX( "ZaptionName" ) "ZapName"
FROM "SelectAssignment"
GROUP BY "RosterFullName", "Assignment"
SelectAssignment is below for reference:
SELECT "GradedAssignments"."Assignment", "Roster"."RosterFullName",
"GradedAssignments"."Grade", "ZaptionNames"."ZaptionName"
FROM "Roster", "ClassIndex", "GradedAssignments", "ZaptionNames"
WHERE "Roster"."Class" = "ClassIndex"."Class"
AND "GradedAssignments"."ZaptionName" = "ZaptionNames"."ZaptionName"
AND "ZaptionNames"."RosterFullName" = "Roster"."RosterFullName"
AND ( "GradedAssignments"."Assignment" = 'YouKnowWhatever')
My PullAssignmentGrades query is as follows, but sorting by assignment fails: a student has no assignment row unless they submitted one, so the assignment is blank and that student falls to the bottom of the sort, which breaks the transfer-to-online script I run.
SELECT "Roster"."RosterFirstName", "ClassIndex"."Class",
"Roster"."RosterFullName", "ClassIndex"."ClassLevel",
"FilterLowMultiples"."Grade", "FilterLowMultiples"."ZapName",
"FilterLowMultiples"."Assignment", "FilterLowMultiples"."Grade",
"FilterLowMultiples"."Assignment", "ClassIndex"."ClassDisplayOrder",
"Roster"."RosterLastName"
FROM "ClassIndex", "FilterLowMultiples", "Roster"
ORDER BY "Roster"."RosterFirstName" ASC,
"FilterLowMultiples"."Grade" DESC,
"FilterLowMultiples"."Assignment" ASC,
"ClassIndex"."ClassDisplayOrder" ASC,
"Roster"."RosterLastName" ASC
Use a LEFT JOIN in your SelectAssignment query so you don't drop students who didn't do a particular assignment. Optionally, you can use COALESCE on the potentially NULL values from the "GradedAssignments" table to assign a grade of 0 or I, like so:
SELECT 'YouKnowWhatever' AS "Assignment", "Roster"."RosterFullName",
COALESCE("GradedAssignments"."Grade",0) AS "Grade", "ZaptionNames"."ZaptionName"
FROM "Roster"
INNER JOIN "ClassIndex" ON "Roster"."Class" = "ClassIndex"."Class"
INNER JOIN "ZaptionNames" ON "ZaptionNames"."RosterFullName" = "Roster"."RosterFullName"
LEFT JOIN "GradedAssignments" ON ("GradedAssignments"."ZaptionName" = "ZaptionNames"."ZaptionName"
AND "GradedAssignments"."Assignment" = 'YouKnowWhatever')
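A stripped-down sketch of the LEFT JOIN plus COALESCE idea, run on SQLite through Python's sqlite3 module. As a simplifying assumption, the ZaptionNames mapping is skipped and grades are keyed directly by RosterFullName; the students, grades and the grades_for helper are made up. It also folds in the "keep the highest submission" step with MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Roster (RosterFullName TEXT);
CREATE TABLE GradedAssignments (RosterFullName TEXT, Assignment TEXT, Grade INTEGER);
-- Hypothetical data: Ann submitted HW1 twice, Ben only HW2, Cy nothing
INSERT INTO Roster VALUES ('Ann'), ('Ben'), ('Cy');
INSERT INTO GradedAssignments VALUES
    ('Ann', 'HW1', 80), ('Ann', 'HW1', 95), ('Ben', 'HW2', 70);
""")

def grades_for(assignment):
    """Every student on the roster gets a row; missing work becomes 0."""
    return conn.execute(
        """
        SELECT r.RosterFullName, COALESCE(MAX(g.Grade), 0) AS Grade
        FROM Roster r
        LEFT JOIN GradedAssignments g
               ON g.RosterFullName = r.RosterFullName
              AND g.Assignment = ?   -- filter in ON, not WHERE, to keep unmatched students
        GROUP BY r.RosterFullName
        ORDER BY r.RosterFullName
        """,
        (assignment,),
    ).fetchall()

hw1 = grades_for("HW1")
print(hw1)
```

Putting the assignment filter in the ON clause is the important part: the same condition in WHERE would discard the NULL rows and turn the LEFT JOIN back into an inner join.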

MySQL: Remove JOIN for Matched Row if 2nd Round of Criteria Not Met

CONDENSED VERSION
I'm trying to join a new list with my existing database with no unique identifier -- but I'm trying to figure out a way to do it in one query that's more specific than matching by first name/last name but less specific than by all the fields available (first name/middle name/last name/address/phone).
So my idea was to match solely on first/last name and then try to assign each possible matching field with points to see if anyone who matched had 'zero points' and thus have the first name/last name match stripped from them. Here's what I came up with:
SELECT *,
@MidMatch := IF(LEFT(l.middle,1)=LEFT(d.middle,1),"TRUE","FALSE") MidMatch,
@AddressMatch := IF(left(l.address,5)=left(d.address,5),"TRUE","FALSE") AddressMatch,
@PhoneMatch := IF(right(l.phone,4)=right(d.phone,4),"TRUE","FALSE") PhoneMatch,
@Points := IF(@MidMatch = "TRUE",4,0) + IF(@AddressMatch = "TRUE",3,0) + IF(@PhoneMatch = "TRUE",1,0) Points
FROM list l
LEFT JOIN database d on IF(@Points <> 0,(l.first = d.first AND l.last = d.last),(l.first = d.first AND l.last = d.last AND l.address = d.vaddress));
The query runs fine, but it still matches people whose first/last names are identical even if their points are zero (and their addresses don't match).
Is there a way to do what I'm looking for with this roundabout points system? I've found that it helps me a lot when trying to identify which duplicate to choose, so I'm trying to expand it to the initial match. Or should I do something different?
SPECIFIC VERSION
This is kind of a roundabout idea -- so if somebody has something more straightforward, I'd definitely be willing to bail on this completely and try something else. But basically I have a 93k person table (from a database) that I'm matching against a 92k person table (from a new list). I expect many of them to be the same but certainly not all -- and I'm trying to avoid creating duplicates. Unfortunately, there are no unique identifiers that can be matched, so I'm generally stuck with matching based on some variation of first name, middle name, last name, address, and/or phone number.
The schemas for the two tables (list and database) are nearly identical, with the fields you see above (first name, middle name, last name, address, phone) -- the only difference is that the database table also has a unique numerical ID that I would use to upload back into the database after this match. Unfortunately the list table has no such ID. Records with the ID would get matched and loaded in on top of the old record and any record without that ID would get loaded as a new record.
What I'm trying to avoid with this question is creating a bunch of different tables and queries that start with a really specific JOIN statement and then eventually get down to just first and last name -- since there's likely some folks who should match but have moved and/or gotten a new phone number since this last list.
I could write a very simple query as a JOIN and do it numerous times, each time taking out another qualifier:
SELECT *
FROM list l
JOIN database d
ON d.first = l.first AND d.last = l.last AND d.middle = l.middle AND d.address = l.address AND d.phone = l.phone;
And I'd certainly feel confident that those people from the new list matched with the existing people in my database, but it'd only return a very small amount of people, then I'd have to go back and loosen the criteria (e.g. drop the middle name restriction, etc.) and continually create tables then merge them all back together at the end along with all the ones that didn't match at all, which I would assume would be the new people.
But is there a way to write the query solely using a first/last name match, then evaluating the other criteria and wiping the match from people who have zero 'points' (below)? Here's what I attempted to do assigning [arbitrary] points to each match:
SELECT *,
@MidMatch := IF(LEFT(l.middle,1)=LEFT(d.middle,1),"TRUE","FALSE") MidMatch,
@AddressMatch := IF(left(l.address,5)=left(d.address,5),"TRUE","FALSE") AddressMatch,
@PhoneMatch := IF(right(l.phone,4)=right(d.phone,4),"TRUE","FALSE") PhoneMatch,
@Points := IF(@MidMatch = "TRUE",4,0) + IF(@AddressMatch = "TRUE",3,0) + IF(@PhoneMatch = "TRUE",1,0) Points
FROM list l
LEFT JOIN database d on IF(@Points <> 0,(l.first = d.first AND l.last = d.last),(l.first = d.first AND l.last = d.last AND l.address = d.vaddress));
The LEFT and RIGHT formulas within the IF statements are just attempting to control for unstandardized data that gets sent. I also would've done something with a WHERE statement, but I still need the NULL values to return so I know who matched and who didn't. So I ended up attempting to use an IF statement in the LEFT JOIN to say that if the Points cell was equal to zero, that the JOIN statement would get really specific and what I thought would hopefully still return the row but it wouldn't be matched to the database even if their first and last name did.
The query doesn't produce any errors, though unfortunately I'm still getting people back who have zeros in their Points column but matched with the database because their first and last names matched (which is what I was hoping the IF/Points stuff would stop).
Is this potentially a way to avoid bad matches, or am I going down the wrong path? If this isn't the right way to go, is there any other way to write one query that will return a full LEFT JOIN along with NULLs that don't match but have it be more specific than just first/last name but less work than doing a million queries based on a new table each time?
Thanks and hopefully that made some sense!
Your first query:
SELECT *,
@MidMatch := IF(LEFT(l.middle,1)=LEFT(d.middle,1),"TRUE","FALSE") MidMatch,
@AddressMatch := IF(left(l.address,5)=left(d.address,5),"TRUE","FALSE") AddressMatch,
@PhoneMatch := IF(right(l.phone,4)=right(d.phone,4),"TRUE","FALSE") PhoneMatch,
@Points := IF(@MidMatch = "TRUE",4,0) + IF(@AddressMatch = "TRUE",3,0) + IF(@PhoneMatch = "TRUE",1,0) Points
FROM list l LEFT JOIN
database d
on IF(@Points <> 0,(l.first = d.first AND l.last = d.last),(l.first = d.first AND l.last = d.last AND l.address = d.vaddress));
This makes a serious mistake with regard to variables. The simplest problem is in the SELECT: SQL does not guarantee the order in which the SELECT expressions are evaluated, so they could be calculated in any order, and the logic is wrong if @Points is calculated first. This problem is compounded by referring to the variables in different clauses. A SQL statement is a logical description of the result set, not a programmatic statement of how the query is run.
Let me assume that you have a unique identifier for each row in the database (just to identify the row). Then you can get the match by using a correlated subquery:
select l.*,
(select d.databaseid
from database d
where l.first = d.first and l.last = d.last
order by (4 * (LEFT(l.middle, 1) = LEFT(d.middle, 1) ) +
3 * (left(l.address, 5) = left(d.address, 5)) +
1 * (right(l.phone, 4) = right(d.phone, 4))
) desc
limit 1
) as did
from list l;
You can join back to the database table to get more information if you need it.
EDIT:
Your comment made it clear. You don't just want the first and last name but something else as well.
select l.*,
(select d.databaseid
from database d
where l.first = d.first and l.last = d.last and
(LEFT(l.middle, 1) = LEFT(d.middle, 1) or
left(l.address, 5) = left(d.address, 5) or
right(l.phone, 4) = right(d.phone, 4)
)
order by (4 * (LEFT(l.middle, 1) = LEFT(d.middle, 1) ) +
3 * (left(l.address, 5) = left(d.address, 5)) +
1 * (right(l.phone, 4) = right(d.phone, 4))
) desc
limit 1
) as did
from list l;

How can I sanitize my DB from these duplicates

I have a table with the following fields:
id | domainname | domain_certificate_no | keyvalue
An example for the output of a select statement can be as:
'57092', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_1', '55525772666'
'57093', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_2', '22225554186'
'57094', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_3', '22444356259'
'97168', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_1', '55525772666'
'97169', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_2', '22225554186'
'97170', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_3', '22444356259'
I need to sanitize my DB as follows: I want to remove the domain names whose first domain_certificate_no has a repeated keyvalue. In this example, I look at the field domain_certificate_no: 02aa6aa.netsolstores.com_1. Since it is number 1 and has a repeated value for the key, I want to remove the whole chain, including 02aa6aa.netsolstores.com_2 and 02aa6aa.netsolstores.com_3, by deleting the domain name that this chain belongs to, which is 02aa6aa.netsolstores.com.
How can I automate the checking process for the whole DB? So, I have a query that checks any domain name matching the pattern ('%.%.%') (EDIT: among those that share a domain name, netsolstores.com in this example); if certificate no. 1 of that domain name has a repeated key value, then delete it, otherwise don't. Please note that it is OK for a domain_certificate_no to have a repeated value if it is not number 1.
EDIT: I only compare repeated values within the same second-level domain name. E.g., in this question, I compare the values that share the domain name .netsolstores.com. If I have another domain name with subdomains, I do the same. The point is that I don't need to compare across the whole DB, only the values with a shared domain name (but different subdomains).
I'm not sure what happens with '02aa6aa.netsolstores.com_1' in your example.
The following keeps only the minimum id for any repeated key:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
)
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
For the data you provide, this seems to do the trick. This will not return the "_1" version of the repeats. If that is important, the query can be pretty easily modified.
Although I prefer to be more positive (thinking about the rows to keep rather than delete), the following should delete what you want:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
),
tokeep as (
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
)
delete from t
where t.id not in (select id from tokeep)
There are other ways to express this that are possibly more efficient (depending on the database). This, though, keeps the structure of the original query.
By the way, when trying new DELETE code, be sure that you stash a copy of the table. It is easy to make a mistake with DELETE (and UPDATE). For instance, if you leave out the WHERE clause, all the rows will disappear, after the long painful process of logging all of them. You might find it faster to simply select the desired results into a new table, validate them, then truncate the old table and re-insert them.

Return zero when records not found

I'm making a table generator as a school project.
In MySQL I have 3 tables: process, operation and score. Everything looked fine until I tested the "ADD column" button in the web app.
Previously saved data should be read properly but also include the new column in the format. The problem is that the previously saved data has no values for the new column, so I intended the query to return a score of 0 if no records were found. I tried IFNULL and COALESCE but nothing happens (maybe I'm just using them wrong).
process - processID, processName
operation - operationID, operationName
score - scoreID, score, processID, operationID, scoreType (score types are SELF,GL,FINAL)
ps = (PreparedStatement)dbconn.prepareStatement("SELECT score FROM score WHERE processID=? and operationID=? and type=? ORDER BY processid");
here's a pic of a small sample http://i50.tinypic.com/2yv3rf9.jpg
The reason that IFNULL doesn't work is that it only has an effect on values. A result set with no rows has no values, so it does nothing.
First, it's probably better to do this on the client than on the server. But if you have to do it on the server, there's a couple of approaches I can think of.
Try this:
SELECT IFNULL(SUM(score), 0) AS score
FROM score
WHERE processID=? and operationID=? and type=?
An aggregate such as SUM without a GROUP BY always returns exactly one row, so no ORDER BY is needed; when no rows match, SUM yields NULL and IFNULL turns it into 0.
If you need to return multiple rows when the table contains multiple matching rows then you can use this (omitting the ORDER BY for simplicity):
SELECT score
FROM score
WHERE processID = ? and operationID = ? and type = ?
UNION ALL
SELECT 0
FROM (SELECT 0) T1
WHERE NOT EXISTS
(
SELECT *
FROM score
WHERE processID = ? and operationID = ? and type = ?
)
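Both approaches can be checked quickly with a sketch on SQLite via Python's sqlite3 module. The sample scores and the helper names total_or_zero and scores_or_zero are invented; the column is called type here, as in the question's query. IFNULL exists in SQLite as well as MySQL, so the SQL is close to what is shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE score (scoreID INTEGER PRIMARY KEY, score INTEGER,
                    processID INTEGER, operationID INTEGER, type TEXT);
-- Hypothetical sample scores
INSERT INTO score VALUES
    (1, 5, 1, 1, 'SELF'), (2, 7, 1, 1, 'SELF'), (3, 9, 1, 2, 'GL');
""")

def total_or_zero(process_id, operation_id, score_type):
    """Aggregate variant: always exactly one row, 0 when nothing matches."""
    row = conn.execute(
        "SELECT IFNULL(SUM(score), 0) FROM score "
        "WHERE processID = ? AND operationID = ? AND type = ?",
        (process_id, operation_id, score_type),
    ).fetchone()
    return row[0]

def scores_or_zero(process_id, operation_id, score_type):
    """UNION ALL variant: all matching scores, or a single 0 if there are none."""
    params = (process_id, operation_id, score_type)
    rows = conn.execute(
        """
        SELECT score FROM score
        WHERE processID = ? AND operationID = ? AND type = ?
        UNION ALL
        SELECT 0 FROM (SELECT 0) AS T1
        WHERE NOT EXISTS (
            SELECT * FROM score
            WHERE processID = ? AND operationID = ? AND type = ?
        )
        """,
        params + params,  # the same three values are bound twice
    ).fetchall()
    return sorted(r[0] for r in rows)

print(total_or_zero(1, 1, "SELF"), total_or_zero(2, 2, "FINAL"))
print(scores_or_zero(1, 1, "SELF"), scores_or_zero(1, 3, "SELF"))
```

Note that the UNION ALL variant needs each parameter bound twice, which is easy to get wrong in a PreparedStatement; that is one more reason handling the empty result on the client may be simpler.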