Specific where clause in Mysql query - mysql

So I have a MySQL table with over 9 million call records. Each record represents one individual call. The columns are as follows:
CUSTOMER
RAW_SECS
TERM_TRUNK
CALL_DATE
There are others but these are the ones I will be using.
So I need to count the total number of calls for a certain week in a certain Term Trunk. I then need to sum up the number of seconds for those calls. Then I need to count the total number of calls that were below 7 seconds. I always do this in 2 queries and combine them, but I was wondering if there is a way to do it in one? I'm new to MySQL so I'm sure my syntax is horrific, but here is what I do...
Query 1:
SELECT CUSTOMER, SUM(RAW_SECS), COUNT(*)
FROM Mytable
WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2')
GROUP BY CUSTOMER;
Query 2:
SELECT CUSTOMER, COUNT(*)
FROM Mytable
WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2') AND RAW_SECS < 7
GROUP BY CUSTOMER;
Is there any way to combine these two queries into one? Or maybe just a better way of doing it? I appreciate all the help!

There are 2 ways of achieving the expected outcome in a single query:
1. Conditional counting: use a CASE expression or the IF() function inside COUNT() (or SUM()) to count only specific records.
2. Self join: left join the table to itself on the table's id field, and in the join condition restrict the right-hand alias to calls shorter than 7 seconds.
The advantage of the 2nd approach is that it may be able to use an index on raw_secs to speed up the short-call filter, while the conditional expression itself cannot use an index (the WHERE clause on TERM_TRUNK can use one in either approach). The query below assumes the table has a unique id column:
SELECT m1.CUSTOMER, SUM(m1.RAW_SECS), COUNT(m1.CUSTOMER), COUNT(m2.CUSTOMER)
FROM Mytable m1
LEFT JOIN Mytable m2 ON m1.id = m2.id AND m2.RAW_SECS < 7
WHERE m1.TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2')
GROUP BY m1.CUSTOMER;
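The answer describes conditional counting but only shows the self join. Here is a minimal sketch of the first approach, run against an in-memory SQLite database as a stand-in for MySQL; the sample rows are invented, and the week filter on CALL_DATE is left out because the original queries do not include it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Mytable (CUSTOMER TEXT, RAW_SECS INTEGER, TERM_TRUNK TEXT)"
)
conn.executemany(
    "INSERT INTO Mytable VALUES (?, ?, ?)",
    [
        ("acme", 3, "Mytrunk1"),
        ("acme", 60, "Mytrunk2"),
        ("acme", 5, "Othertrunk"),  # filtered out by the WHERE clause
        ("globex", 10, "Mytrunk1"),
        ("globex", 6, "Mytrunk1"),
    ],
)

# One pass over the filtered rows: total seconds, total calls, and the
# number of calls shorter than 7 seconds (the CASE yields 1 or 0 per row).
short_call_rows = conn.execute(
    """
    SELECT CUSTOMER,
           SUM(RAW_SECS) AS total_secs,
           COUNT(*) AS total_calls,
           SUM(CASE WHEN RAW_SECS < 7 THEN 1 ELSE 0 END) AS short_calls
    FROM Mytable
    WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2')
    GROUP BY CUSTOMER
    ORDER BY CUSTOMER
    """
).fetchall()
print(short_call_rows)
```

In MySQL itself the conditional part can also be written as SUM(RAW_SECS < 7) or COUNT(IF(RAW_SECS < 7, 1, NULL)); the CASE form is used here because it is portable.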

Related

Join Performances When Searching For NULL Value

I need to find a value that exists in LoyaltyTransactionBasketItemStores table but not in DimProductConsolidate table. I need the item code and its corresponding company. This is my query
SELECT
A.ProductReference, A.CompanyCode
FROM
(SELECT ProductReference, CompanyCode FROM dwhdb.LoyaltyTransactionsBasketItemsStores GROUP BY ProductReference) A
LEFT JOIN
(SELECT LoyaltyVariantArticleCode FROM dwhdb.DimProductConsolidate) B ON B.LoyaltyVariantArticleCode = A.ProductReference
WHERE
B.LoyaltyVariantArticleCode IS NULL
It is a pretty straightforward query. But when I run it, it takes over an hour and still does not finish. Then I used EXPLAIN and this is the result
But when I remove CompanyCode from my query, its performance improves a lot. This is the EXPLAIN result
I want to know why this is happening, and is there any way to get ProductReference and its company with much better performance?
Your current query is rife with syntax and structural errors. I would use exists logic here:
SELECT a.ProductReference, a.CompanyCode
FROM dwhdb.LoyaltyTransactionsBasketItemsStores a
WHERE NOT EXISTS (SELECT 1 FROM dwhdb.DimProductConsolidate b
WHERE b.LoyaltyVariantArticleCode = a.ProductReference);
Your current query is doing a GROUP BY in the first subquery, but you never select aggregates; instead you select another, non-aggregated column (CompanyCode). On most other databases, and on MySQL with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5), this syntax is not allowed. Also, there is no need to have 2 subqueries here. Rather, just select from the basket table and then assert that matching records do not exist in the other table.
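As a rough illustration of the NOT EXISTS pattern, the sketch below runs it on a few invented rows in an in-memory SQLite database (standing in for MySQL, so the dwhdb schema prefix is dropped). The index name idx_variant is hypothetical, and the DISTINCT is an addition that returns each product/company pair once, which the original GROUP BY seemed to be aiming for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE LoyaltyTransactionsBasketItemsStores (
        ProductReference TEXT, CompanyCode TEXT);
    CREATE TABLE DimProductConsolidate (LoyaltyVariantArticleCode TEXT);
    -- Hypothetical index: lets each NOT EXISTS probe be an index lookup.
    CREATE INDEX idx_variant
        ON DimProductConsolidate (LoyaltyVariantArticleCode);

    -- Invented sample data: only P1 exists in the product dimension.
    INSERT INTO LoyaltyTransactionsBasketItemsStores VALUES
        ('P1', 'C1'), ('P2', 'C1'), ('P2', 'C1'), ('P3', 'C2');
    INSERT INTO DimProductConsolidate VALUES ('P1');
    """
)

# Keep basket rows for which no matching product row exists.
missing_products = conn.execute(
    """
    SELECT DISTINCT a.ProductReference, a.CompanyCode
    FROM LoyaltyTransactionsBasketItemsStores a
    WHERE NOT EXISTS (SELECT 1
                      FROM DimProductConsolidate b
                      WHERE b.LoyaltyVariantArticleCode = a.ProductReference)
    ORDER BY a.ProductReference
    """
).fetchall()
print(missing_products)
```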

Mysql DISTINCT with more than one column (remove duplicates)

My database is called: (training_session)
I am trying to print out some information from my data, but I do not want to have any duplicates. I don't quite get it; can someone tell me what I am doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.
I think you misunderstood the use of DISTINCT.
There is a big difference between using DISTINCT and GROUP BY.
Both remove repetition, but they serve different purposes.
Use DISTINCT if you want to show a set of columns without repeated rows. That means you don't care about calculations or aggregate functions. DISTINCT will return different results as you add more columns to your SELECT, because it compares the whole row.
Use GROUP BY if you want one row per value of certain selected columns and you use aggregate (group) functions to calculate data for each group. In other words, use GROUP BY when you need group functions.
The available group functions are listed here:
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems like you are trying to get the "latest" record for each athlete; I'll assume that scenario, with no ID column available.
Here is my alternate solution:
SELECT a.athlete_id ,
( SELECT b.duration
FROM training_session as b
WHERE b.athlete_id = a.athlete_id -- connect
ORDER BY [latest column to sort] DESC
LIMIT 1
) last_duration
FROM training_session as a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a scalar subquery in the SELECT list (a correlated subquery). With the help of LIMIT 1, it returns only the topmost record. A subquery used this way must return at most one row, or else MySQL raises an error; if it returns no row, the value is NULL.
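To make the placeholder concrete, here is a small sketch of the same query with a hypothetical session_date column standing in for [latest column to sort]. It runs against in-memory SQLite rather than MySQL, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- session_date is a hypothetical column standing in for
    -- [latest column to sort]; the rows are invented.
    CREATE TABLE training_session (
        athlete_id INTEGER, duration INTEGER, session_date TEXT);
    INSERT INTO training_session VALUES
        (1, 30, '2020-01-01'),
        (1, 45, '2020-01-05'),
        (2, 60, '2020-01-02'),
        (2, 20, '2020-01-03');
    """
)

# For each athlete, the correlated subquery picks the duration of the
# row with the most recent session_date.
latest_durations = conn.execute(
    """
    SELECT a.athlete_id,
           (SELECT b.duration
            FROM training_session AS b
            WHERE b.athlete_id = a.athlete_id
            ORDER BY b.session_date DESC
            LIMIT 1) AS last_duration
    FROM training_session AS a
    GROUP BY a.athlete_id
    ORDER BY a.athlete_id
    """
).fetchall()
print(latest_durations)
```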
MySQL's DISTINCT clause is used to filter out duplicate rows.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT compares the combination of both columns, so every (athlete_id, duration) pair that differs from the others is returned, hence the results you're getting. In other words, the query is working correctly.
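A quick way to see the difference is to run both forms on a few invented rows. The sketch below uses in-memory SQLite as a stand-in for MySQL, and adds a GROUP BY variant (using MAX as an arbitrary example aggregate) for the case where one row per athlete is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER);
    -- Invented rows: athlete 1 has one exact duplicate and one other value.
    INSERT INTO training_session VALUES (1, 30), (1, 30), (1, 45), (2, 30);
    """
)

# DISTINCT on one column: one row per athlete.
one_column = conn.execute(
    "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
).fetchall()

# DISTINCT on two columns: one row per distinct (athlete_id, duration) pair.
two_columns = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

# GROUP BY with an aggregate: one row per athlete plus a computed value.
per_athlete = conn.execute(
    "SELECT athlete_id, MAX(duration) FROM training_session "
    "GROUP BY athlete_id ORDER BY athlete_id"
).fetchall()

print(one_column, two_columns, per_athlete)
```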

Subquery in SELECT or Subquery in JOIN?

I have a MYSQL query of this form:
SELECT
employee.name,
totalpayments.totalpaid
FROM
employee
JOIN (
SELECT
paychecks.employee_id,
SUM(paychecks.amount) totalpaid
FROM
paychecks
GROUP BY
paychecks.employee_id
) totalpayments on totalpayments.employee_id = employee.id
I've recently found that this returns MUCH faster in this form:
SELECT
employee.name,
(
SELECT
SUM(paychecks.amount)
FROM
paychecks
WHERE
paychecks.employee_id = employee.id
) totalpaid
FROM
employee
It surprises me that there would be a difference in speed, and that the lower query would be faster. I prefer the upper form for development, because I can run the subquery independently.
Is there a way to get the "best of both worlds": speedy results return AND being able to run the subquery in isolation?
Likely, the correlated subquery is able to make effective use of an index, which is why it's fast, even though that subquery has to be executed multiple times.
For the first query with the inline view, that causes MySQL to create a derived table, and for large sets, that's effectively a MyISAM table.
In MySQL 5.6.x and later, the optimizer may choose to add an index on the derived table, if that would allow a ref operation and the estimated cost of the ref operation is lower than the nested loops scan.
I recommend you try using EXPLAIN to see the access plan. (Based on your report of performance, I suspect you are running on MySQL version 5.5 or earlier.)
The two statements are not entirely equivalent in the case where there are rows in employee for which there are no matching rows in paychecks.
An equivalent result could be obtained entirely avoiding a subquery:
SELECT e.name
, SUM(p.amount) AS total_paid
FROM employee e
JOIN paychecks p
ON p.employee_id = e.id
GROUP BY e.id
(Use an inner join to get a result equivalent to the first query, use a LEFT outer join to be equivalent to the second query. Wrap the SUM() aggregate in an IFNULL function if you want to return a zero rather than a NULL value when no matching row with a non-null value of amount is found in paychecks.)
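The difference between the inner and left join variants is easy to check on a tiny data set. The following sketch uses in-memory SQLite in place of MySQL, with invented rows and a hypothetical index name; the employee Cy has no paychecks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE paychecks (employee_id INTEGER, amount INTEGER);
    -- Hypothetical index supporting the join on employee_id.
    CREATE INDEX idx_paychecks_employee ON paychecks (employee_id);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO paychecks VALUES (1, 100), (1, 150), (2, 200);
    """
)

# Inner join: employees without paychecks drop out (like the first query).
inner_totals = conn.execute(
    """
    SELECT e.name, SUM(p.amount) AS total_paid
    FROM employee e
    JOIN paychecks p ON p.employee_id = e.id
    GROUP BY e.id
    ORDER BY e.id
    """
).fetchall()

# Left join plus IFNULL: every employee appears, with 0 instead of NULL.
left_totals = conn.execute(
    """
    SELECT e.name, IFNULL(SUM(p.amount), 0) AS total_paid
    FROM employee e
    LEFT JOIN paychecks p ON p.employee_id = e.id
    GROUP BY e.id
    ORDER BY e.id
    """
).fetchall()

print(inner_totals, left_totals)
```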
Conceptually, a join is a Cartesian product: all the records of table A are combined with all the records of table B, giving
number of records of table A * number of records of table B = rows in the combined result
10 * 10 = 100
and out of those 100 combinations, only the ones that match the join condition and filters are returned by the query.
In a nested query, the inner query runs on a smaller input and only its result is handed to the outer query, which is one way a nested query can end up faster than a join. In practice the optimizer does not usually build the full product, so which form is faster depends on the indexes and the execution plan; check with EXPLAIN.

SQL:in clause with query taking too long compared to in clause with actual data

I have 3 SQL queries as given:
select student_id from user where user_id = 4; -- returns 35
select * from student where student_id in (35);
select * from student where student_id in (select student_id from user where user_id =4);
The first 2 queries take less than 0.5 seconds, but the third, which is the same as the 2nd with the 1st as a subquery, takes around 8 seconds.
I indexed the tables according to my needs, but the time is not going down.
Can someone please give me a solution or provide some explanation for this behaviour?
Thanks!
Actually, MySQL executes the inner query last, after scanning the outer table. MySQL rewrites the subquery so that the inner query becomes fully dependent on the outer one.
For example, it runs select * from student (depending on your database, this could return many rows), then applies the inner query with user_id = 4 to each row of that result.
The dev team was working on this problem and it was expected to be "solved" in 6.0 (the semi-join and materialization optimizations that address it shipped in MySQL 5.6): http://dev.mysql.com/doc/refman/5.5/en/optimizing-subqueries.html
EDIT:
In your case, you should use a JOIN method.
Not with a subquery but why don't you use a join here?
select
s.*
from
student s
inner join
user u
on s.student_id = u.student_id
where
u.user_id = 4
;
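The sketch below checks that the join form returns the same row as the IN subquery. It runs on in-memory SQLite with invented data (the name column is hypothetical), so it only demonstrates that the results are equivalent, not the timing difference seen on older MySQL versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Invented sample data; the name column is hypothetical.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, student_id INTEGER);
    INSERT INTO student VALUES (34, 'Kim'), (35, 'Lee'), (36, 'Max');
    INSERT INTO user VALUES (3, 34), (4, 35);
    """
)

# Original form: IN with a subquery.
via_subquery = conn.execute(
    """
    SELECT * FROM student
    WHERE student_id IN (SELECT student_id FROM user WHERE user_id = 4)
    """
).fetchall()

# Rewritten form: inner join on student_id, filtered on the user row.
via_join = conn.execute(
    """
    SELECT s.*
    FROM student s
    INNER JOIN user u ON s.student_id = u.student_id
    WHERE u.user_id = 4
    """
).fetchall()

print(via_subquery, via_join)
```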

How to merge this two query statement into one query statement

1. SELECT * FROM instalmentsdetails WHERE instalmentName='Third Installment' AND studentFeeId='1'
2. select max(`receiptNo`)as `receiptNo` FROM instalmentsdetails
Table instalmentsdetails
instalmentsDetailsId
studentFeeId
receiptNo
instalmentName
amount
dueDate
fineAmt
waivedAmt
scholarShip
grandTotal
status
I'm a little confused. How do I merge these two query statements into one?
P.S.: One statement checks for the condition and the other gets the max of receiptNo in that table.
I want both values in one query.
Is this what you want?
SELECT max(`receiptNo`) as `receiptNo`
FROM instalmentsdetails
WHERE instalmentName='Third Installment' AND studentFeeId='1'
Update: how about this:
SELECT *
FROM instalmentsdetails as inds
INNER JOIN (
SELECT max(`receiptNo`) as `maxreceiptNo`
FROM instalmentsdetails
) as maxt
WHERE inds.instalmentName='Third Installment' AND inds.studentFeeId='1'
This applies the filter to the table, then adds an extra column (the maximum receiptNo)
Assuming the goal is to get:
a list of instalmentsdetails rows with a specific instalmentName and studentFeeId
the global maximum receiptNo
SELECT *, 0 AS receiptNo FROM instalmentsdetails WHERE instalmentName='Third Installment' AND studentFeeId='1'
UNION
select *, max(`receiptNo`) as `receiptNo` FROM instalmentsdetails
Update
Apparently the OP simply wants to consolidate separate query results into a single row. In that case:
SELECT
*,
(SELECT max(`receiptNo`) FROM instalmentsdetails) AS maxReceiptNo
FROM instalmentsdetails WHERE instalmentName='Third Installment' AND studentFeeId='1'
From all the reading, Matt is correct, but maybe I can help explain what he's doing...
The first part of the query (FROM instalmentsdetails WHERE YourCondition) will get all records that qualify for the condition.
THEN, the JOIN to the second query (select max(`receiptNo`) from ... with no where clause) will ALWAYS return a single row with a single column holding the maximum receipt number, without regard to ANY criteria.
This creates an implied Cartesian result: every row from the first table is joined with every row from the second. Since there is no explicit join condition, every row will get the same "max()" value as a returned column. And since the select max() call returns only one record, you never have to worry about duplicates.
Now, if you wanted the maximum receipt within the same criteria, you would just copy that same criteria to the select max() query portion.
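Both variants described above (global maximum versus maximum within the same criteria) can be sketched as follows. This uses in-memory SQLite in place of MySQL, a cut-down version of the table with invented rows, and an explicit CROSS JOIN to make the missing join condition visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Cut-down table with invented rows; the real table has more columns.
    CREATE TABLE instalmentsdetails (
        instalmentsDetailsId INTEGER PRIMARY KEY,
        studentFeeId TEXT, receiptNo INTEGER, instalmentName TEXT);
    INSERT INTO instalmentsdetails VALUES
        (1, '1', 101, 'First Installment'),
        (2, '1', 103, 'Third Installment'),
        (3, '2', 107, 'Third Installment');
    """
)

# Global maximum: the one-row subquery is attached to every filtered row.
global_max = conn.execute(
    """
    SELECT inds.instalmentsDetailsId, inds.receiptNo, maxt.maxreceiptNo
    FROM instalmentsdetails AS inds
    CROSS JOIN (SELECT MAX(receiptNo) AS maxreceiptNo
                FROM instalmentsdetails) AS maxt
    WHERE inds.instalmentName = 'Third Installment'
      AND inds.studentFeeId = '1'
    """
).fetchall()

# Maximum within the same criteria: the filter is copied into the subquery.
filtered_max = conn.execute(
    """
    SELECT inds.instalmentsDetailsId, inds.receiptNo, maxt.maxreceiptNo
    FROM instalmentsdetails AS inds
    CROSS JOIN (SELECT MAX(receiptNo) AS maxreceiptNo
                FROM instalmentsdetails
                WHERE instalmentName = 'Third Installment'
                  AND studentFeeId = '1') AS maxt
    WHERE inds.instalmentName = 'Third Installment'
      AND inds.studentFeeId = '1'
    """
).fetchall()

print(global_max, filtered_max)
```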