Problem with comparing two sql queries with EXCEPT - mysql

I have two queries. The first query returns 4554 results from the database and the second returns 3830. I need to fetch and list the 724 results that make up the difference: rows that exist in the first query's result but not in the second's. I tried the EXCEPT operator, but I get this error in my console:
FOR, GROUP, HAVING, INTO expected got 'EXCEPT'
Any help is appreciated. Here are my queries.
select * from billing_trans B,members M where (sign>='2021-03-01 00:00:00' && sign<='2021-03-31 23:59:59') AND M.id=B.mid
EXCEPT
select * from billing_trans B,members M where (sign>='2021-03-01 00:00:00' && sign<='2021-03-31 23:59:59') AND M.id=B.mid
AND (trans LIKE 'BH%' OR bank IN ('SM', 'TO', 'II'));

You can use not exists. Presumably you intend:
select *
from dt_billing_trans B join
dt_members M
on M.id = B.mid
where signD >= '2021-03-01' and signD < '2021-04-01' and
not exists (select 1
from dt_billing_trans b2
where b2.mid = B.mid and
b2.signD >= '2021-03-01' and
b2.signD < '2021-04-01' and
(b2.transId LIKE 'WH%' OR b2.bank IN ('WT', 'MO', 'SL'))
);
This returns all transactions for members who do not have transactions that match the second set of conditions.
This is not exactly equivalent to your code (which does not require except in any database), but I suspect it is closer to what you intend.
Note:
JOIN. JOIN. JOIN.
Note the improved date comparisons that do not miss fractions of a second before midnight. And the coding is much simpler.
The standard SQL boolean operator is AND, not &&.
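The note about date comparisons can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table `trans` and its rows are made up for the illustration. SQLite compares these values as text rather than as datetimes, so treat it only as a demonstration of the boundary behaviour, not of MySQL internals.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (id INTEGER PRIMARY KEY, signD TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO trans (id, signD) VALUES (?, ?)",
    [
        (1, "2021-03-01 00:00:00"),
        (2, "2021-03-31 23:59:59"),
        (3, "2021-03-31 23:59:59.500"),  # fraction of a second before midnight
        (4, "2021-04-01 00:00:00"),      # first instant of April
    ],
)

# Closed range ending at 23:59:59: row 3 falls through the gap.
closed = [r[0] for r in conn.execute(
    "SELECT id FROM trans WHERE signD >= '2021-03-01 00:00:00' "
    "AND signD <= '2021-03-31 23:59:59' ORDER BY id")]

# Half-open range [March 1, April 1): row 3 is kept, row 4 is excluded.
half_open = [r[0] for r in conn.execute(
    "SELECT id FROM trans WHERE signD >= '2021-03-01' "
    "AND signD < '2021-04-01' ORDER BY id")]

print(closed)
print(half_open)
```

The half-open form also does not need to know how many days the month has.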

You don't need 2 queries.
Use 1 WHERE clause:
SELECT *
FROM dt_billing_trans B INNER JOIN dt_members M
ON M.id = B.mid
WHERE (signD >= '2021-03-01 00:00:00' AND signD <= '2021-03-31 23:59:59')
AND NOT (transId LIKE 'WH%' OR bank IN ('WT', 'MO', 'SL'));
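As a rough check of the single-WHERE idea, the sketch below compares a set difference with a negated condition. It runs on Python's built-in sqlite3 (which, unlike the asker's MySQL, accepts EXCEPT), and the table `t` and its rows are hypothetical. It also assumes the columns are NOT NULL: if transId or bank could be NULL, the negated condition evaluates to NULL for such rows and drops them, whereas a set difference would keep them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, "
    "transId TEXT NOT NULL, bank TEXT NOT NULL)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "WH100", "AA"),  # matches the LIKE condition
    (2, "XX200", "WT"),  # matches the IN condition
    (3, "XX300", "AA"),  # matches neither
    (4, "YY400", "BB"),  # matches neither
])

cond = "(transId LIKE 'WH%' OR bank IN ('WT', 'MO', 'SL'))"

# "All rows" minus "rows matching the extra condition".
via_except = sorted(r[0] for r in conn.execute(
    f"SELECT id FROM t EXCEPT SELECT id FROM t WHERE {cond}"))

# The same rows, obtained with one query and a negated condition.
via_not = sorted(r[0] for r in conn.execute(
    f"SELECT id FROM t WHERE NOT {cond}"))

print(via_except, via_not)
```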

Related

Is it possible to LEFT JOIN a #variable table in SQL

Summary
I am attempting to LEFT JOIN on a filtered TABLE in SQL.
To do so I am first declaring a temporary table in an # variable, then attempting to join on it later.
Unfortunately I am not having much luck doing this, even though limiting the join to such a resultset would massively speed up my query.
Other Routes Tried
I initially was trying to conduct this on the WHERE, however this was preventing rows from occurring where there were no matching events (I am counting the events by intervals_days so I need to return regardless of whether there's matching in the other).
After I realised my mistake, I moved to the ON. I have not seen examples of this done, and I have a feeling doing a FIND_IN_SET here on a SELECT would not be performant?
QUERIES
The Initial Filter
DECLARE #filtered_events TABLE (id INT, eventable_id INT, eventable_type VARCHAR(255), occurred_at TIMESTAMP, finished_at TIMESTAMP)
INSERT INTO #filtered_events
SELECT
e.id, e.eventable_id, e.eventable_type, e.occurred_at, e.finished_at
FROM
units_events as ue
LEFT JOIN
events as e
ON
e.id = ue.event_id
WHERE
ue.unit_id
IN
(1,2,3);
Ignore the (1,2,3) here, these values are added to the query dynamically.
My Attempted Use Of It
SELECT
i.starts_at as starts_at,
i.ends_at as ends_at,
count(fe.id) as count
FROM
intervals_days as i
LEFT JOIN
#filtered_events as fe
ON
( fe.occurred_at >= starts_at AND fe.occurred_at < ends_at )
WHERE
( starts_at >= '15-01-2019' AND ends_at < NOW() )
GROUP BY
starts_at
ORDER BY
starts_at
DESC;
These queries are within one SQL document, one above the other with the terminating ; semicolon on each.
I expect this to output what my lower query outputs (a grouped resultset by the intervals_days rows)- however with the benefit of my LEFT JOIN query being conducted on a much smaller sample.
Sorry, no such syntax.
MySQL has no concept of arrays, either.
Nor can you use DECLARE outside of a Stored Routine.
As Shadow said in the comments, with MySQL I would use a subquery like this:
SELECT
i.starts_at as starts_at,
i.ends_at as ends_at,
count(fe.id) as count
FROM
intervals_days as i
LEFT JOIN
(
SELECT
e.id,
e.eventable_id,
e.eventable_type,
e.occurred_at,
e.finished_at
FROM units_events as ue
LEFT JOIN events as e
ON e.id = ue.event_id
WHERE ue.unit_id IN (1,2,3)
) fe
ON ( fe.occurred_at >= starts_at AND fe.occurred_at < ends_at )
WHERE ( starts_at >= '2019-01-15' AND ends_at < NOW() )
GROUP BY starts_at
ORDER BY starts_at
DESC;
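The difference between filtering inside the joined subquery and filtering in the outer WHERE, which the question ran into, can be seen in a small sketch. It uses Python's built-in sqlite3 instead of MySQL, and the schema is a simplified assumption: `events` carries a `unit_id` column directly rather than going through `units_events`, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE intervals_days (starts_at TEXT, ends_at TEXT);
CREATE TABLE events (id INTEGER PRIMARY KEY, unit_id INTEGER, occurred_at TEXT);
INSERT INTO intervals_days VALUES
  ('2019-01-15', '2019-01-16'),
  ('2019-01-16', '2019-01-17');
INSERT INTO events VALUES
  (1, 1, '2019-01-15 08:00:00'),
  (2, 2, '2019-01-15 09:30:00'),
  (3, 9, '2019-01-16 10:00:00');  -- belongs to a unit we do not want
""")

# Filter inside the derived table: every interval survives, with a count of 0
# where no wanted event falls into it.
filtered_in_join = conn.execute("""
    SELECT i.starts_at, COUNT(fe.id)
    FROM intervals_days AS i
    LEFT JOIN (SELECT id, occurred_at FROM events WHERE unit_id IN (1, 2, 3)) AS fe
      ON fe.occurred_at >= i.starts_at AND fe.occurred_at < i.ends_at
    GROUP BY i.starts_at
    ORDER BY i.starts_at DESC
""").fetchall()

# Filter in the outer WHERE: intervals without a wanted event disappear.
filtered_in_where = conn.execute("""
    SELECT i.starts_at, COUNT(e.id)
    FROM intervals_days AS i
    LEFT JOIN events AS e
      ON e.occurred_at >= i.starts_at AND e.occurred_at < i.ends_at
    WHERE e.unit_id IN (1, 2, 3)
    GROUP BY i.starts_at
    ORDER BY i.starts_at DESC
""").fetchall()

print(filtered_in_join)
print(filtered_in_where)
```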

SQL Count on JOIN query is taking forever to execute?

I'm trying to run a count query on a two-table join. The e_amazing_client table has a million rows and m_user has just 50 rows, BUT the count query is taking forever!
SELECT COUNT(`e`.`id`) AS `count`
FROM `e_amazing_client` AS `e`
LEFT JOIN `user` AS `u` ON `e`.`cx_hc_user_id` = `u`.`id`
WHERE ((`e`.`date_created` >= '2018-11-11') AND (`e`.`date_created` >= '2018-11-18')) AND (`e`.`id` >= 1)
I don't know what is wrong with this query?
First, I'm guessing that this is sufficient:
SELECT COUNT(*) AS `count`
FROM e_amazing_client e
WHERE e.date_created >= '2018-11-18' AND e.id >= 1;
If user has only 50 rows, I doubt it is creating duplicates. The two comparisons on date_created are redundant: both use >=, so only the later date, '2018-11-18', has any effect.
For this query, try creating an index on e_amazing_client(date_created, id).
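A small sanity check of the simplification, sketched with Python's built-in sqlite3 rather than MySQL. The rows are invented, and the sketch assumes that the user table's id is a unique key, which is what makes the LEFT JOIN unable to add or remove rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m_user (id INTEGER PRIMARY KEY);
CREATE TABLE e_amazing_client (
  id INTEGER PRIMARY KEY, cx_hc_user_id INTEGER, date_created TEXT);
INSERT INTO m_user VALUES (1), (2);
INSERT INTO e_amazing_client VALUES
  (1, 1,    '2018-11-10'),
  (2, 1,    '2018-11-12'),
  (3, 2,    '2018-11-18'),
  (4, NULL, '2018-11-20');  -- no matching user
""")

# The shape of the original query: LEFT JOIN plus two >= comparisons.
with_join = conn.execute("""
    SELECT COUNT(e.id)
    FROM e_amazing_client AS e
    LEFT JOIN m_user AS u ON e.cx_hc_user_id = u.id
    WHERE e.date_created >= '2018-11-11'
      AND e.date_created >= '2018-11-18'
      AND e.id >= 1
""").fetchone()[0]

# Without the join, keeping only the stricter of the two date conditions.
simplified = conn.execute("""
    SELECT COUNT(*) FROM e_amazing_client AS e
    WHERE e.date_created >= '2018-11-18' AND e.id >= 1
""").fetchone()[0]

# Keeping only the looser condition gives a different answer.
looser = conn.execute("""
    SELECT COUNT(*) FROM e_amazing_client AS e
    WHERE e.date_created >= '2018-11-11' AND e.id >= 1
""").fetchone()[0]

print(with_join, simplified, looser)
```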
Maybe you wanted this:
SELECT COUNT(`e`.`id`) AS `count`
FROM `e_amazing_client` AS `e`
LEFT JOIN `user` AS `u` ON `e`.`cx_hc_user_id` = `u`.`id`
WHERE ((`e`.`date_created` >= '2018-11-11') AND (`e`.`date_created` <= '2018-11-18')) AND (`e`.`id` >= 1)
to check between dates?
Also, do you really need
AND (`e`.`id` >= 1)
If id is what an id is usually in a table, is there a case to be <1?
Your query is pulling ALL records on/after the given date because the only other condition in your WHERE clause is ID >= 1; there is no clause for a specific user. Your original query also has a second condition of date >= 2018-11-18. You MAY have meant that you only wanted the count WITHIN the week 11/11 to 11/18, in which case the comparisons SHOULD have been >= 11-11 and <= 11-18.
As for the count, you are getting ALL people (assuming no entry has an ID less than 1) and thus a count within that date range. If you want it per user as you indicated you need to group by the cx_hc_user_id (user) column to see who has the most, or make the user part of the WHERE clause to get one person.
SELECT
e.cx_hc_user_id,
count(*) countPerUser
from
e_amazing_client e
WHERE
e.date_created >= '2018-11-11'
AND e.date_created <= '2018-11-18'
group by
e.cx_hc_user_id
You can order by the count descending to get the user with the highest count, but I am still not sure what you are asking.

Rewrite MySQL code to retrieve average time

I don't know if that's a question for SO, if not please delete it. I am using the query below to calculate an average time it takes for a ticket on our service desk to get closed.
I don't have write permissions on the database, so I can't create functions, variables etc.
I strongly believe that there must be a better/nicer, more robust way to calculate that, than my query below, any thoughts?
What I want to avoid, if possible, is to recalculate the count value, which especially with all the where clauses makes the query a bit slow.
SELECT Count(hd_ticket.id) AS 'Tickets #',
ROUND(( Timestampdiff(hour, hd_ticket.created, hd_ticket.time_closed) /
(SELECT Count(hd_ticket.id)
FROM
hd_ticket
LEFT JOIN hd_status
ON hd_status_id = hd_status.id
WHERE
Month(
hd_ticket.time_closed) = 12
AND
Year
(hd_ticket.time_closed) = 2017
AND
hd_status.state LIKE '%close%'
AND
hd_ticket.hd_queue_id IN ( 8 )) )) AS
'AVG Closure Time'
FROM hd_ticket
LEFT JOIN hd_status
ON hd_status_id = hd_status.id
WHERE Month(hd_ticket.time_closed) = 12
AND Year(hd_ticket.time_closed) = 2017
AND hd_status.state LIKE '%close%'
AND hd_ticket.hd_queue_id IN ( 8 )
In a nutshell what the above query does is
SELECT COUNT(TICKETS) as 'Tickets #',
ROUND(TOTAL_TIME_TAKES_TO_CLOSE_TICKETS/COUNT(TICKETS + FILTERS)) as 'AVG Closure Time'
FROM HD_TICKET
SOME FILTERS
I would recommend:
SELECT Count(*) as Num_Tickets_Closed,
AVG(Timestampdiff(hour, t.created, t.time_closed)) as AVG_Closure_Time
FROM hd_ticket t LEFT JOIN
hd_status s
ON t.hd_status_id = s.id
WHERE t.time_closed >= '2017-12-01' AND
t.time_closed < '2018-01-01' AND
s.state LIKE '%close%' AND
t.hd_queue_id IN ( 8 ) ;
Notes:
First, you can just use AVG(). That greatly simplifies the query.
The date comparisons are made without functions. Although this likely has little impact in your case, it allows the use of indexes.
The names of the columns no longer have special characters, so they don't need to be escaped.
Table aliases make the query easier to write and to read.
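To see that AVG() gives the same number as dividing the total by a separately computed count, here is a minimal sketch with Python's built-in sqlite3. SQLite has no TIMESTAMPDIFF, so the hour difference is computed from strftime('%s', ...) instead; that substitution, the trimmed-down `hd_ticket` table and its rows are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hd_ticket (id INTEGER PRIMARY KEY, created TEXT, time_closed TEXT);
INSERT INTO hd_ticket VALUES
  (1, '2017-12-05 08:00:00', '2017-12-05 10:00:00'),  -- 2 hours
  (2, '2017-12-10 09:00:00', '2017-12-10 13:00:00'),  -- 4 hours
  (3, '2017-12-20 00:00:00', '2017-12-20 09:00:00'),  -- 9 hours
  (4, '2018-01-02 08:00:00', '2018-01-02 09:00:00');  -- closed outside December
""")

num_closed, avg_hours, manual_avg = conn.execute("""
    SELECT COUNT(*), AVG(h), SUM(h) / COUNT(*)
    FROM (
        SELECT (CAST(strftime('%s', time_closed) AS INTEGER)
              - CAST(strftime('%s', created) AS INTEGER)) / 3600.0 AS h
        FROM hd_ticket
        WHERE time_closed >= '2017-12-01' AND time_closed < '2018-01-01'
    )
""").fetchone()

print(num_closed, avg_hours, manual_avg)
```

The filters are written once, so the count does not have to be recomputed in a second subquery.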

How to fix SQL query with Left Join and subquery?

I have SQL query with LEFT JOIN:
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON
(stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
AND stn.stocksEndDate >= UNIX_TIMESTAMP() AND stn.stocksStartDate <= UNIX_TIMESTAMP())
With this query I want to select one row from the stocks table that matches the conditions and whose field equals a.MedicalFacilitiesIdUser.
I always get count_stocks = 0 in the result, but I need to get 1.
The count(...) aggregate doesn't count null, so its argument matters:
COUNT(stn.stocksId)
Since stn is your right hand table, this will not count anything if the left join misses. You could use:
COUNT(*)
which counts every row, even if all its columns are null. Or a column from the left hand table (a) that is never null:
COUNT(a.ID)
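The three variants can be compared side by side in a small sketch. It uses Python's built-in sqlite3 instead of MySQL, with a cut-down, hypothetical version of the two tables: one facility and no stocks at all, so the left join misses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MedicalFacilities (id INTEGER PRIMARY KEY, MedicalFacilitiesIdUser INTEGER);
CREATE TABLE stocks (stocksId INTEGER PRIMARY KEY, stocksIdMF INTEGER);
INSERT INTO MedicalFacilities VALUES (1, 42);
-- stocks stays empty, so the LEFT JOIN finds no match
""")

count_right_col, count_star, count_left_col = conn.execute("""
    SELECT COUNT(stn.stocksId), COUNT(*), COUNT(a.id)
    FROM MedicalFacilities AS a
    LEFT JOIN stocks AS stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
""").fetchone()

# COUNT(stn.stocksId) skips the NULL produced by the missed join;
# COUNT(*) and COUNT(a.id) count the facility row itself.
print(count_right_col, count_star, count_left_col)
```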
Your subquery in the on looks very strange to me:
on stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
This is comparing MedicalFacilitiesIdUser to stocksIdMF. Admittedly, you have no sample data or data layouts, but the naming of the columns suggests that these are not the same thing. Perhaps you intend:
on stn.stocksIdMF = ( SELECT b.stocksId
-----------------------------^
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY b.stocksId DESC
LIMIT 1)
Also, ordering by stn.stocksid wouldn't do anything useful, because that would be coming from outside the subquery.
Your subquery seems redundant, and the main query is hard to read because much of the join condition could be placed in the WHERE clause. Additionally, the original query might have a performance issue.
Recall that WHERE is an implicit join and JOIN is an explicit join. For inner joins, query optimizers make no distinction between the two if they use the same expressions, but readability and maintainability are another matter. With a LEFT JOIN, however, conditions on the right-hand table in WHERE discard the unmatched rows, so the join then behaves like an inner join.
Consider the revised version (notice I added a GROUP BY):
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
WHERE stn.stocksEndDate >= UNIX_TIMESTAMP()
AND stn.stocksStartDate <= UNIX_TIMESTAMP()
GROUP BY stn.stocksId
ORDER BY stn.stocksId DESC
LIMIT 1

Speed up MySql query time with multiple conditional joins

There are 3 tables: persontbl1 and persontbl2 (7500 rows each) and schedule (~3000 active schedules, i.e. schedule.status = 0). The person tables contain data for the same persons in a one-to-one relationship, and an INNER JOIN between the two takes less than a second. The schedule table contains data about persons to be interviewed; not all persons have schedules in it. With the LEFT JOIN added, the query takes around 45 seconds, which is causing all sorts of issues.
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
(The EXPLAIN output, the schedule table structure, and the schedule table indexes are not reproduced here.)
Please let me know if any further information is required.
Thanks.
Edit: Added fully qualified table names and their columns.
You should just replace this line:
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
with this one:
AND SCHEDULE.call_datetime <= '2015-04-18 00:00:00'
so MySQL does not have to apply DATE() to every record and can compare the column against the static constant '2015-04-18 00:00:00'.
You can check whether performance improves with this query:
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (SCHEDULE.call_datetime <= '2015-02-01 00:00:00')
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
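When swapping DATE(call_datetime) <= CURDATE() for a constant, the choice of boundary decides which rows still match. The sketch below, using Python's built-in sqlite3 as a stand-in for MySQL and a made-up `schedule` table, treats '2015-04-18' as "today" (an assumption for the example) and compares the function-wrapped filter with two constant-based ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schedule (id INTEGER PRIMARY KEY, call_datetime TEXT);
INSERT INTO schedule VALUES
  (1, '2015-04-17 09:00:00'),
  (2, '2015-04-18 00:00:00'),
  (3, '2015-04-18 15:30:00'),  -- later on the same day
  (4, '2015-04-19 00:00:00');
""")

def ids(where):
    """Return the ids matching a WHERE fragment, in id order."""
    sql = f"SELECT id FROM schedule WHERE {where} ORDER BY id"
    return [r[0] for r in conn.execute(sql)]

# Function applied to the column on every row.
wrapped = ids("date(call_datetime) <= '2015-04-18'")

# Same rows, column left bare: strictly before the start of the next day.
before_next_day = ids("call_datetime < '2015-04-19'")

# Midnight of the same day as an inclusive upper bound drops row 3.
up_to_midnight = ids("call_datetime <= '2015-04-18 00:00:00'")

print(wrapped, before_next_day, up_to_midnight)
```

If calls scheduled later on the current day should still be included, the "strictly before the next day" form is the one that keeps the original meaning.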
EDIT 1: You said that without the LEFT JOIN part it was fast enough, so you can then try:
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id,
s.enum_change, s.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN
(SELECT *
FROM SCHEDULE
WHERE status=0
AND call_datetime <= '2015-02-01 00:00:00'
) s
ON s.survey_id = persontbl1._URI
ORDER BY s.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
I'm guessing that AGR_CONTACT comes from p1. This is the query you want to optimize:
SELECT p1._CREATION_DATE, _TOP_LEVEL_AURI, RESP_CNIC, RESP_CNIC_NAME,
MOB_NUMBER1, MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id, s.enum_change, s.status
FROM persontbl1 p1 INNER JOIN
persontbl2 p2
ON (p2._TOP_LEVEL_AURI = p1._URI) AND (p1.AGR_CONTACT = 1) LEFT JOIN
SCHEDULE s
ON (s.survey_id = p1._URI) AND
(s.status = 0) AND
(DATE(s.call_datetime) <= CURDATE())
ORDER BY s.call_datetime IS NULL DESC, p1._CREATION_DATE ASC;
The best indexes for this query are: persontbl1(AGR_CONTACT, _URI), persontbl2(_TOP_LEVEL_AURI), and schedule(survey_id, status, call_datetime).
The use of date() around the date time is not recommended. In general, that precludes the use of indexes. However, in this case, you have a left join, so it doesn't make a difference. That column is not being used for filtering anyway. The index on schedule is only for covering the on clause.