number of rows return different in mysql - mysql

For query first i make the aggregated table,pre-processing table like using below query
query no. 1
DROP VIEW IF EXISTS aggregated;
CREATE VIEW aggregated as
SELECT
`date`,
district,
town,
locality,
vendor,
freq_kind,
bandwidth,
circle_id,
SUM(data_total),
SUM(capacity),
CASE WHEN (SUM(IFNULL(data_total,0))/SUM(IFNULL(capacity,0))) * 100 < 50 THEN 1 ELSE 0 END AS productivity
FROM (
SELECT
circle_id,
`date`,
district,town,locality,vendor,freq_kind,bandwidth,
site_id,
`download`+`upload` AS data_total,
CASE WHEN freq_kind = "2300" AND bandwidth = "20 MHz" THEN 105
WHEN freq_kind = "2300" AND bandwidth = "15 MHz" THEN 80
WHEN freq_kind = "2300" AND bandwidth = "10 MHz" THEN 45
ELSE 0 END AS capacity
FROM websites WHERE `date` = "2020-03-28" AND upload IS NOT NULL AND download IS NOT NULL AND LOWER(site_id) != 'na' AND site_id IS NOT NULL) AS aa GROUP BY site_id
HAVING productivity =1
if i make the query
select count(*) from aggregated where circle_id = '102' and district = 'East'
It return 98 rows;
If i add this condition also with query no.1 then where clause so final looks like
WHERE `date` = "2020-03-28" and `circle_id = '102' and district = 'East' AND download IS NOT NULL AND upload IS NOT NULL AND LOWER(site_id) != 'na' AND site_id IS NOT NULL) AS aa GROUP BY site_id
HAVING productivity =1
then it
return 101 ; which is correct ; where i am wrong.
For more if run the query like as below
SELECT
`date`,
district,
town,
locality,
vendor,
freq_kind,
bandwidth,
circle_id,
SUM(data_total),
SUM(capacity),
CASE WHEN (SUM(IFNULL(data_total,0))/SUM(IFNULL(capacity,0))) * 100 < 50 THEN 1 ELSE 0 END AS productivity
FROM (
SELECT
circle_id,
`date`,
district,town,locality,vendor,freq_kind,bandwidth,
site_id,
`download`+`upload` AS data_total,
CASE WHEN freq_kind = "2300" AND bandwidth = "20 MHz" THEN 105
WHEN freq_kind = "2300" AND bandwidth = "15 MHz" THEN 80
WHEN freq_kind = "2300" AND bandwidth = "10 MHz" THEN 45
ELSE 0 END AS capacity
FROM websites WHERE `date` = "2020-03-28" and `circle_id = '102' and district = 'East' AND download IS NOT NULL AND upload IS NOT NULL AND LOWER(site_id) != 'na' AND site_id IS NOT NULL) AS aa GROUP BY site_id
HAVING productivity =1
The above query will return 101 rows;

Related

Can someone optimize this SQL query?

I am currently working on a project that has 2 very large sql tables Users and UserDocuments having around million and 2-3 millions records respectively. I have a query that will return the count of all the documents that each indvidual user has uploaded provided the document is not rejected.
A user can have multiple documents against his/her id.
My current query:-
SELECT
u.user_id,
u.name,
u.date_registered,
u.phone_no,
t1.docs_count,
t1.last_uploaded_on
FROM
Users u
JOIN(
SELECT user_id,
MAX(updated_at) AS last_uploaded_on,
SUM(CASE WHEN STATUS != 2 THEN 1 ELSE 0 END) AS docs_count
FROM
UserDocuments
WHERE
user_id IN(
SELECT
user_id
FROM
Users
WHERE
region_id = 1 AND city_id = 8 AND user_type = 1 AND user_suspended = 0 AND is_enabled = 1 AND verification_status = -1
) AND document_id IN('1', '2', '3', '4', '10', '11')
GROUP BY
user_id
ORDER BY
user_id ASC
) t1
ON
u.user_id = t1.user_id
WHERE
docs_count < 6 AND region_id = 1 AND city_id = 8 AND user_type = 1 AND user_suspended = 0 AND is_enabled = 1 AND verification_status = -1
LIMIT 1000, 100
Currently the query is taking very long around 20 secs to return data with indexes. can someone suggest some tweaks in the follwing query to gain some more preformance out of it.
SELECT
u.user_id,
max( u.name ) name,
max( u.date_registered ) date_registered,
max( phone_no ) phone_no,
MAX(d.updated_at) last_uploaded_on,
SUM(CASE WHEN d.STATUS != 2
THEN 1 ELSE 0 END) docs_count
FROM
Users u
JOIN UserDocuments d
ON u.user_id = d.user_id
AND d.document_id IN ('1', '2', '3', '4', '10', '11')
WHERE
u.region_id = 1
AND u.city_id = 8
AND u.user_type = 1
AND u.user_suspended = 0
AND u.is_enabled = 1
AND u.verification_status = -1
GROUP BY
u.user_id
HAVING
SUM(CASE WHEN d.STATUS != 2
THEN 1 ELSE 0 END) < 6
ORDER BY
u.user_id ASC
LIMIT
1000, 100
Have indexes on your tables as
user ( region_id, city_id, user_type, user_suspended, is_enabled, verification_status )
UserDocuments ( user_id, document_id, status, updated_at )
You are adding extra querying from the user table to both the inner and outer joins which might be killing it. Having an index on your critical "WHERE" components by user will pre-filter that set out. Only from that will it join to the UserDocuments table. By having the outer query get the counts() at the top level query.
Since the users name, registered and phone dont change per user, applying max() to each respectively prevents the need of adding those columns to the group by clause.
The index on the documents table on only the columns needed to confirm status and document_id and when last updated. This prevents the engine from having to go to the raw data pages as it can get the qualifying details directly from the index parts saving you time too.
LIMIT without ORDER BY does not make sense.
An ORDER BY in a 'derived table' is ignored.
Will you really have thousands of result rows? (I see the "offset of 1000".)
Use JOIN instead of IN ( SELECT ... )
What indexes do you have? I suggest INDEX(region_id, city_id, user_id)
CASE WHEN d.STATUS != 2 THEN 1 ELSE 0 END can be shortened to d.status != 2.
How many different values of status are there? If only two, then flip the test to d.status = 1`.

Getting distinct values from table

Im having and issue where in my table FarmerGroups I have multiple records by BSI_Code and I am getting double results for GallonsIssued due to this inner join. Is there a way to get the unique value of GallonsIssued or a way to just get results by individual BSI_CODE
With Summary as (
Select B_NAME as Branch, LOC as Location
,SUM(payment) as Gallons
,SUM(case when printed = 1 THEN Fee ELSE NULL END) as FeeCollected
,SUM(case when printed = 0 THEN Fee ELSE NULL END) as FeeNotCollected
,SUM(case when printed = 1 THEN Payment ELSE NULL END) as GallonsIssued
,SUM(case when printed = 0 THEN Payment ELSE NULL END) as GallonsNotIssued
From SicbWeeklyDeliveriesFuel F Inner Join FarmerGroups G ON G.BSI_CODE = F.BSI_CODE AND G.CROP_SEASON = F.CROP_SEASON AND F.B_NAME = G.BRANCH
Where F.CROP_SEASON = #cropseason
Group By B_NAME, LOC
)
SELECT Branch
,Location
,Gallons
,GallonsIssued
,GallonsNotIssued
,FeeCollected
,FeeNotCollected
,((GallonsIssued/Gallons) * 100) as pct_GallonsCollected
FROM Summary
Order by Location, Branch
For SicbWeeklyDeliveriesFuel
BSI_CODE
Payment
LOC
CROP_SEASON
Fee
B_NAME
FNAME
66
125
CZ
5
12.5
DOUGLAS
John K
55
147
OW
5
14.7
CALEDONIA
Tim H
66
95
CZ
5
9.5
DOUGLAS
John K
For Farmer Groups
BSI_CODE
Farmer
CROP_SEASON
BRANCH
TEST_GROUP
66
John K
5
DOUGLAS
1A
55
Tim H
5
CALEDONIA
1B
66
John K
5
DOUGLAS
2A
Your selection for the JOIN of G.BSI_CODE = F.BSI_CODE AND G.CROP_SEASON = F.CROP_SEASON AND F.B_NAME = G.BRANCH does not uniquely define the rows.
You will need to also include F..FNAME = G.Farmer otherwise the first row of SicbWeeklyDeliveriesFuel (BSI_CODE = 66, CROP_SEASON = 5 and B_NAME = DOUGLAS) matches both the first and last rows of FarmerGroups. Likewise the third row also matches the same two rows in FarmerGroups.
The reason for the duplication is the field TEST_GROUP in FarmerGroups Table.
But you don't need this field in the Join.
First,a CTE to get the info you need in the join without duplicates.
then your old join to the new CTE.
Try this:
WITH FarmersGroup AS
(
SELECT DISTINCT
BSI_CODE
, CROP_SEASON
, BRANCH
FROM FarmerGroups
)
, Summary AS
(
SELECT
Branch = B_NAME
, Location = LOC
, Gallons = SUM(payment)
, FeeCollected = SUM(case when printed = 1 THEN Fee ELSE NULL END)
, FeeNotCollected = SUM(case when printed = 0 THEN Fee ELSE NULL END)
, GallonsIssued = SUM(case when printed = 1 THEN Payment ELSE NULL END)
, GallonsNotIssued = SUM(case when printed = 0 THEN Payment ELSE NULL END)
FROM SicbWeeklyDeliveriesFuel F
JOIN FarmerGroup G ON G.BSI_CODE = F.BSI_CODE
AND G.CROP_SEASON = F.CROP_SEASON
AND G.BRANCH = F.B_NAME
WHERE F.CROP_SEASON = #cropseason
GROUP BY
B_NAME, LOC
)
SELECT
Branch
, Location
, Gallons
, GallonsIssued
, GallonsNotIssued
, FeeCollected
, FeeNotCollected
, pct_GallonsCollected = ((GallonsIssued/Gallons) * 100)
FROM Summary
ORDER BY
Location
, Branch
You can use Andy's code above and it should do the job or you can just replace the table join in your current query
Change the following
Inner Join FarmerGroups G ON G.BSI_CODE = F.BSI_CODE AND G.CROP_SEASON = F.CROP_SEASON AND F.B_NAME = G.BRANCH
to
Inner join (select SELECT DISTINCT
BSI_CODE
, CROP_SEASON
, BRANCH
FROM FarmerGroups ) G on
ON G.BSI_CODE = F.BSI_CODE AND G.CROP_SEASON = F.CROP_SEASON AND F.B_NAME = G.BRANCH

Sum on same column with two different conditions mysql

I have a table named Order with schema as
user_id, state amount
11 success 100
11 FAILED 10
11 FAILED 10
11 success 17
state can have two values (Success/Failed).
I want to fetch sum(amount) when state = "SUCCESS" - sum(amount) when state = "FAILED"
means difference total amount when success - total amount when failed.
I can solve this problem in 2 queries.
A = select id, sum(amount) when state = "SUCCESS"
B = select id, sum(amount) when state = "FAILED"
And solution will be A-B.
Is there any way I can achieve this in single sql query?
use case when
select user_id,sum(case when state = 'SUCCESS' then amount else 0 end)-sum(case when state = 'FAILED' then amount else 0 end)
from table group by user_id
Use conditional aggregation:
select id,
sum(case when state = 'SUCCESS' then amount else - amount end) as total
from t
where state in ('SUCCESS', 'FAILED')
group by id;
I assume that you want this sum per id and not overall in the table.
select sum(case when state = "SUCCESS" then amount else o end) -
sum(case when state = "FAILED" then amount else o end)
from tbl
group by userid

sql server 2008 running totals between 2 dates

I need to get running totals between 2 dates in my sql server table and update the records simultaneoulsy. My data is as below and ordered by date,voucher_no
DATE VOUCHER_NO OPEN_BAL DEBITS CREDITS CLOS_BAL
-------------------------------------------------------------------
10/10/2017 1 100 10 110
12/10/2017 2 110 5 105
13/10/2017 3 105 20 125
Now if i insert a record with voucher_no 4 on 12/10/2017 the output should be like
DATE VOUCHER_NO OPEN_BAL DEBITS CREDITS CLOS_BAL
------------------------------------------------------------------
10/10/2017 1 100 10 110
12/10/2017 2 110 5 105
12/10/2017 4 105 4 109
13/10/2017 3 109 20 129
I have seen several examples which find running totals upto a certain date but not between 2 dates or from a particular date to end of file
You should consider changing your database structure. I think it will be better to keep DATE, VOUCHER_NO, DEBITS, CREDITS in one table. And create view to calculate balances. In that case you will not have to update table after each insert. In this case your table will look like
create table myTable (
DATE date
, VOUCHER_NO int
, DEBITS int
, CREDITS int
)
insert into myTable values
('20171010', 1, 10, null),( '20171012', 2, null, 5)
, ('20171013', 3, 20, null), ('20171012', 4, 4, null)
And view will be
;with cte as (
select
DATE, VOUCHER_NO, DEBITS, CREDITS, bal = isnull(DEBITS, CREDITS) * case when DEBITS is null then -1 else 1 end
, rn = row_number() over (order by DATE, VOUCHER_NO)
from
myTable
)
select
a.DATE, a.VOUCHER_NO, a.DEBITS, a.CREDITS
, OPEN_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end) - a.bal
, CLOS_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end)
from
cte a
join cte b on a.rn >= b.rn
group by a.DATE, a.VOUCHER_NO, a.rn, a.bal, a.DEBITS, a.CREDITS
Here's another solution if you can not change your db structure. In this case you must run update statement each time after inserts. In both cases I assume that initial balance is 100 while recalculation
create table myTable (
DATE date
, VOUCHER_NO int
, OPEN_BAL int
, DEBITS int
, CREDITS int
, CLOS_BAL int
)
insert into myTable values
('20171010', 1, 100, 10, null, 110)
,( '20171012', 2, 110, null, 5, 105)
, ('20171013', 3, 105, 20, null, 125)
, ('20171012', 4, null, 4, null, null)
;with cte as (
select
DATE, VOUCHER_NO, DEBITS, CREDITS, bal = isnull(DEBITS, CREDITS) * case when DEBITS is null then -1 else 1 end
, rn = row_number() over (order by DATE, VOUCHER_NO)
from
myTable
)
, cte2 as (
select
a.DATE, a.VOUCHER_NO
, OPEN_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end) - a.bal
, CLOS_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end)
from
cte a
join cte b on a.rn >= b.rn
group by a.DATE, a.VOUCHER_NO, a.rn, a.bal
)
update a
set a.OPEN_BAL = b.OPEN_BAL, a.CLOS_BAL = b.CLOS_BAL
from
myTable a
join cte2 b on a.DATE = b.DATE and a.VOUCHER_NO = b.VOUCHER_NO

Combine results from 2 SQL Server queries into 2 columns

I currently have 2 SQL queries:
select
SUM(CASE T1.DOCTYPE
WHEN '1' THEN T1.CURTRXAM *1
WHEN '4' THEN T1.CURTRXAM *-1
WHEN '5' THEN T1.CURTRXAM *-1
WHEN '6' THEN T1.CURTRXAM *-1
END) as [Payables TB]
from PM20000 T1
select
sum(PERDBLNC) as [GL Balance]
from GL10110
where ACTINDX = '130'
which return 2 results like this:
Payables TB
1520512.30
GL Balance
-1520512.30
I would like to combine the results into 2 columns and have a variance column like below -
Payables TB GL Balance Variance
1520512.30 -1520512.30 0.00
Thank you
simply
select
(select
SUM(CASE T1.DOCTYPE
WHEN '1' THEN T1.CURTRXAM *1
WHEN '4' THEN T1.CURTRXAM *-1
WHEN '5' THEN T1.CURTRXAM *-1
WHEN '6' THEN T1.CURTRXAM *-1
END) as [Payables TB]
from PM20000 T1) as Payables TB,
(select
sum(PERDBLNC) as [GL Balance]
from GL10110
where ACTINDX = '130') as GL Balance,
0.00 as Variance
You can wrap these into CTE's to reuse the values to compute the difference. With no join condition you will just need to CROSS JOIN, as long as these return just one row each :
WITH Payables AS
(
SELECT
SUM(
CASE
WHEN T1.DOCTYPE IN ('1') THEN T1.CURTRXAM *1
WHEN T1.DOCTYPE IN ('4','5','6') THEN T1.CURTRXAM *-1
-- ? ELSE
END) as [Payables TB]
FR PM20000 T1
),
Balance AS
(
SELECT
SUM(PERDBLNC) as [GL Balance]
FROM GL10110
WHERE ACTINDX = '130'
)
SELECT
Payables.[Payables TB],
Balance.[GL Balance],
Payables.[Payables TB] + Balance.[GL Balance] AS Variance
FROM
Payables, Balance; -- OR Payables CROSS JOIN Balance
Since you seem to be doing the same projection for T1.DOCTYPE 4, 5 and 6 in the first query, you can replace it with a CASE WHEN x IN (...)