I am using PIVOT in MS SQL Server to generate a cross-tabulation report that summarizes data. However, my query does not produce the result I expect.
This is the SQL pivot query:
select Station,
[1] as Good,
[3] as Bad,
[5] as Deactivated,
[6] as Deleted
--StateCount
from
(
select m.MetaData1 as Station,
t.ID,
--m.StateID,
count(t.[State]) as StateCount
from MasterTags m inner join TagStates t on t.ID = m.StateID
where (m.MetaData1 is not null and m.MetaData1 != '') and (m.PIServerID = 1)
group by m.MetaData1, t.ID, m.StateID, t.[State]
) as result
PIVOT (count(result.StateCount) for ID in ([1], [3], [5], [6])) pvt
Result:
Station | Good | Bad | Deactivated | Deleted | StateCount
------- +-------+------+-------------+---------+-----------
ABY | 0 | 0 | 0 | 1 | 4
ABY | 0 | 1 | 0 | 0 | 18
ABY | 1 | 0 | 0 | 0 | 40
FTB | 0 | 1 | 0 | 0 | 10
FTB | 1 | 0 | 0 | 0 | 121
KIK | 0 | 1 | 0 | 0 | 1
KIK | 1 | 0 | 0 | 0 | 45
I have included the StateCount column above to show the actual count(t.[State]) values. Instead of those values, I get 1 in each of the Good, Bad, Deactivated, and Deleted columns of the final result set. I expected the StateCount values to fill these columns, as shown below.
Expected Output:
Station | Good | Bad | Deactivated | Deleted
------- +-------+------+-------------+--------
ABY | 40 | 18 | 0 | 4
FTB | 121 | 10 | 0 | 0
KIK | 45 | 1 | 0 | 0
This is my first time using the PIVOT relational operator in a table-valued expression, so perhaps I don't understand how to use it correctly. Is my pivot query wrong?
Any help is greatly appreciated.
I found that my SQL query was not structured correctly for the result I wanted. To arrive at my desired output, this is how I wrote the statement using the PIVOT relational operator with a table-valued expression.
select Station,
[1] as Good,
[3] as Bad,
[5] as Deactivated,
[6] as Deleted
from
(
select m.MetaData1,
m.MetaData1 as Station,
t.ID
from MasterTags m inner join TagStates t on t.ID = m.StateID
where (m.MetaData1 is not null and m.MetaData1 != '') and (m.PIServerID = 1)
-- no GROUP BY here: PIVOT needs one row per tag so that COUNT can tally them
) as result
PIVOT (count(result.MetaData1) for ID in ([1], [3], [5], [6])) pvt
That gives me the correct result.
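Why the first attempt returned 1 everywhere can be reproduced outside SQL Server. The following is only a minimal sketch, assuming SQLite through Python's sqlite3 module and a made-up `tags` table with invented rows. SQLite has no PIVOT, so conditional aggregation stands in for it here. It compares counting the raw rows with counting rows that were already grouped by station and state:

```python
import sqlite3

# Hypothetical miniature of the joined MasterTags/TagStates rows: one row per tag.
rows = [("ABY", 1)] * 3 + [("ABY", 3)] * 2 + [("FTB", 1)] * 4

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (station TEXT, state_id INTEGER)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", rows)

# Counting raw rows: every tag adds 1 to the column of its state.
raw_counts = conn.execute("""
    SELECT station,
           SUM(CASE WHEN state_id = 1 THEN 1 ELSE 0 END) AS good,
           SUM(CASE WHEN state_id = 3 THEN 1 ELSE 0 END) AS bad
    FROM tags
    GROUP BY station
    ORDER BY station
""").fetchall()

# Counting pre-grouped rows: each (station, state) pair is now a single row,
# so counting those rows can only ever yield 0 or 1.
pre_grouped = conn.execute("""
    SELECT station,
           SUM(CASE WHEN state_id = 1 THEN 1 ELSE 0 END) AS good,
           SUM(CASE WHEN state_id = 3 THEN 1 ELSE 0 END) AS bad
    FROM (SELECT station, state_id, COUNT(*) AS state_count
          FROM tags
          GROUP BY station, state_id)
    GROUP BY station
    ORDER BY station
""").fetchall()

print(raw_counts)   # [('ABY', 3, 2), ('FTB', 4, 0)]
print(pre_grouped)  # [('ABY', 1, 1), ('FTB', 1, 0)]
```

In other words, the aggregate inside the pivot has to see either the ungrouped rows (and count them) or the pre-computed counts (and sum them), but not count rows that are already counts.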
Data
+----+-----------+--------------+------------+--------+------------+
| id | action | question_id | answer_id | q_num | timestamp |
+----+-----------+--------------+------------+--------+------------+
| 5 | "show" | 285 | null | 1 | 123 |
| 5 | "answer" | 285 | 124124 | 1 | 124 |
| 5 | "show" | 369 | null | 2 | 125 |
| 5 | "skip" | 369 | null | 2 | 126 |
+----+-----------+--------------+------------+--------+------------+
MYSQL
select question_id as survey_log
from
(
SELECT sum(CASE WHEN action='answer' THEN 1 ELSE 0 END) as num,
question_id,
count(distinct id) as den
from
survey_log
group by question_id
) b
order by (num/den) desc
limit 1
Output
285
MSSQL
select top 1 question_id as survey_log
from
(
SELECT sum(CASE WHEN action='answer' THEN 1 ELSE 0 END) as num,
question_id,
count(distinct id) as den
from
survey_log
group by question_id
) b
order by (num/den) desc
Output
369
For most scenarios, I have used top 1 and limit 1 interchangeably and got the same results, until this question. In this query, the two return different results. Where am I going wrong? Is the order of execution different for the TOP clause in MSSQL, or am I confusing the use cases of the two?
Original Question from Leetcode
In SQL Server, the division of two integers is an integer. So, 1/2 = 0, not 0.5.
To fix this, use:
order by num * 1.0 / den
In addition, if several rows tie on the order by key, an arbitrary one of the tied rows is returned.
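The effect of integer division on the sort key can be seen in a small sketch. It assumes SQLite through Python's sqlite3 module as a stand-in, because SQLite also truncates when both operands of `/` are integers, as SQL Server does. The table `b` and its num/den values are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer operands truncate; multiplying by 1.0 first forces real division.
int_div, real_div = conn.execute("SELECT 1 / 2, 1 * 1.0 / 2").fetchone()

# Hypothetical per-question aggregates (num answers out of den appearances).
conn.execute("CREATE TABLE b (question_id INTEGER, num INTEGER, den INTEGER)")
conn.executemany("INSERT INTO b VALUES (?, ?, ?)", [(285, 1, 2), (369, 2, 3)])

# With integer division both ratios collapse to 0, so the sort key is a tie.
int_ratios = conn.execute(
    "SELECT question_id, num / den FROM b ORDER BY question_id"
).fetchall()

# With real division the ratios differ (0.5 vs about 0.667), so the order is defined.
best = conn.execute(
    "SELECT question_id FROM b ORDER BY num * 1.0 / den DESC LIMIT 1"
).fetchone()[0]

print(int_div, real_div)  # 0 0.5
print(int_ratios)         # [(285, 0), (369, 0)]
print(best)               # 369
```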
I have two predefined tolerance limits in another table (A). Using the data in the current table (B), I want to count how many values fall below the first tolerance limit and how many fall above the second, storing the two counts in two different variables with a single query. Is that possible? The same query also needs an important where clause on a non-unique int column called referenceNo.
Example:
Tolerance 1 from Table A : 4
Tolerance 2 from Table A : 6
referenceNo and Data Value from Table B:
+-------------+------------+
| referenceNo | Data Value |
+-------------+------------+
| 227 | 7 |
| 227 | 2 |
| 227 | 4 |
| 227 | 5 |
| 227 | 9 |
| 228 | 5 |
| 228 | 1 |
| 228 | 0 |
| 228 | 8 |
| 228 | 6 |
+-------------+------------+
I am expecting COUNT(*) for values below Tolerance 1 and COUNT(*) for values above Tolerance 2 to go INTO @BelowTolerance1Count and @AboveTolerance2Count.
Like:
Output: (For referenceNo = 227)
+-----------------------+-----------------------+
| @BelowTolerance1Count | @AboveTolerance2Count |
+-----------------------+-----------------------+
| 1 | 2 |
+-----------------------+-----------------------+
Output: (For referenceNo = 228)
+-----------------------+-----------------------+
| @BelowTolerance1Count | @AboveTolerance2Count |
+-----------------------+-----------------------+
| 2 | 1 |
+-----------------------+-----------------------+
Thanks in Advance.
I think the following code example will help you understand how to do this:
SELECT
referenceNo,
SUM(CASE WHEN VALUE < 4 THEN 1 ELSE 0 END) AS BELOW_4,
SUM(CASE WHEN VALUE > 6 THEN 1 ELSE 0 END) AS ABOVE_6
FROM TABLE_NAME
GROUP BY referenceNo
Note: This solution computes the counts for all reference numbers, not just a specific one. This is typical of SQL, since it is set-based.
You could make a view
CREATE VIEW SOLVE_PROBLEM AS
SELECT
referenceNo,
SUM(CASE WHEN VALUE < 4 THEN 1 ELSE 0 END) AS BELOW_4,
SUM(CASE WHEN VALUE > 6 THEN 1 ELSE 0 END) AS ABOVE_6
FROM TABLE_NAME
GROUP BY referenceNo
And then use it
SELECT * FROM SOLVE_PROBLEM WHERE referenceNo = 227
SELECT * FROM SOLVE_PROBLEM WHERE referenceNo = 228
or even
SELECT
@BelowTolerance1Count = BELOW_4,
@AboveTolerance2Count = ABOVE_6
FROM SOLVE_PROBLEM
WHERE referenceNo = 228
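The queries above hard-code the limits 4 and 6. If the limits should be read from table A instead, one possibility is to join the single tolerance row onto table B. This is only a sketch, assuming SQLite through Python's sqlite3 module; the names `table_a`, `table_b`, `tolerance1`, `tolerance2`, `data_value` and the helper `tolerance_counts` are hypothetical, and table A is assumed to hold exactly one row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (tolerance1 INTEGER, tolerance2 INTEGER);
    CREATE TABLE table_b (referenceNo INTEGER, data_value INTEGER);
    INSERT INTO table_a VALUES (4, 6);
    INSERT INTO table_b VALUES
        (227, 7), (227, 2), (227, 4), (227, 5), (227, 9),
        (228, 5), (228, 1), (228, 0), (228, 8), (228, 6);
""")


def tolerance_counts(reference_no):
    """Return (below tolerance1, above tolerance2) for one referenceNo."""
    return conn.execute("""
        SELECT COALESCE(SUM(CASE WHEN b.data_value < a.tolerance1 THEN 1 ELSE 0 END), 0),
               COALESCE(SUM(CASE WHEN b.data_value > a.tolerance2 THEN 1 ELSE 0 END), 0)
        FROM table_b AS b
        CROSS JOIN table_a AS a      -- assumes table_a has exactly one row
        WHERE b.referenceNo = ?
    """, (reference_no,)).fetchone()


print(tolerance_counts(227))  # (1, 2)
print(tolerance_counts(228))  # (2, 1)
```

The COALESCE wrappers only matter when no row matches the referenceNo, in which case SUM would otherwise return NULL.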
I have some data (~70,000 rows) that is in a similar format to the below.
+-----------+-----+-----+----+-----------+
| ID | A | B | C | Whatever |
+-----------+-----+-----+----+-----------+
| 1banana | 42 | 0 | 2 | Um |
| fhqwhgads | 514 | 6 | 9 | Nevermind |
| 2banana | 69 | 42 | 0 | NULL |
| pears | 18 | 96 | 2 | 8.8 |
| zubat2 | 96 | 2 | 14 | "NULL" |
+-----------+-----+-----+----+-----------+
I want to make an output table that counts how many times each number occurs in any of the three columns, such as:
+--------+---------+---------+---------+-----+
| Number | A count | B count | C count | sum |
+--------+---------+---------+---------+-----+
| 0 | 0 | 1 | 1 | 2 |
| 2 | 0 | 1 | 2 | 3 |
| 6 | 0 | 1 | 0 | 1 |
| 9 | 0 | 0 | 1 | 1 |
| 14 | 0 | 0 | 1 | 1 |
| 18 | 1 | 0 | 0 | 1 |
| 42 | 1 | 1 | 0 | 2 |
| 69 | 1 | 0 | 0 | 1 |
| 96 | 1 | 1 | 0 | 2 |
| 514 | 1 | 0 | 0 | 1 |
+--------+---------+---------+---------+-----+
(In my real-world use, the input table would have at least 10 times as many rows as the query result.)
It isn't important whether the query returns a row of zeros for numbers that appear in none of those 3 columns, nor whether there is a distinct sum column (though I would prefer to have the sum column and to exclude numbers that appear in no column).
Currently, I am using the following query to get ungrouped data:
SELECT * #Number, COUNT(DISTINCT A), COUNT(DISTINCT B), COUNT(DISTINCT C)
FROM
( # Generate a list of numbers to try
SELECT @ROW := @ROW + 1 AS `Number`
FROM DataTable t
join (SELECT @ROW := -9) t2
LIMIT 777 # None of the numbers I am interested in should be greater than this
) AS NumberList
INNER JOIN DataTable ON
Number = A
OR Number = B
OR Number = C
#WHERE <filters on DataTable columns to speed things up>
#WHERE NUMBER = 10 # speed things up
#GROUP BY Number
The above query, with the commented-out parts left as they are, returns a table similar to the data table but sorted by the Number each entry matches. I would like to group together all rows with the same Number, and have the values in the "data" columns of the query result be the count of how many times the Number occurred in the corresponding column of DataTable.
When I uncomment the grouping statements (and delete the * from the SELECT statement), I can get the count of how many rows each Number appeared in (useful for the sum column of the desired output). However, it does not give me the actual totals of how many times the Number matched each data column: I just get three copies of the number of rows where Number was found. How do I get the groupings to be by each actual column instead of the total number of matching rows?
Additionally, you may have noticed that I have some lines with comments regarding speeding things up. This query is slow, so I added a couple filters so testing it runs faster. I would very much like some way to make it run fast so that sending the results of the query from the complete set to a new table is not the only reasonable way to re-use this data, since I would like to have the ability to play around with the filters on DataTable for non-performance reasons. Is there a better way to structure the overall query so that it runs faster?
I think you want to unpivot using union all and then an aggregation:
select number, sum(a) as a, sum(b) as b, sum(c) as c, count(*) as `sum`
from ((select a as number, 1 as a, 0 as b, 0 as c from t
) union all
(select b, 0 as a, 1 as b, 0 as c from t
) union all
(select c, 0 as a, 0 as b, 1 as c from t
)
) abc
group by number
order by number;
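As a rough check, the same unpivot-then-aggregate idea can be run against the five sample rows from the question. This sketch assumes SQLite through Python's sqlite3 module; the flag column names `in_a`, `in_b`, `in_c` are made up here to keep them apart from the data columns, and SQLite does not accept parentheses around the individual selects of a UNION ALL, so they are omitted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("1banana", 42, 0, 2),
    ("fhqwhgads", 514, 6, 9),
    ("2banana", 69, 42, 0),
    ("pears", 18, 96, 2),
    ("zubat2", 96, 2, 14),
])

# Unpivot: one output row per (value, source column), flagged by column,
# then aggregate the flags per value.
counts = conn.execute("""
    SELECT number, SUM(in_a), SUM(in_b), SUM(in_c), COUNT(*)
    FROM (SELECT a AS number, 1 AS in_a, 0 AS in_b, 0 AS in_c FROM t
          UNION ALL
          SELECT b, 0, 1, 0 FROM t
          UNION ALL
          SELECT c, 0, 0, 1 FROM t)
    GROUP BY number
    ORDER BY number
""").fetchall()

for row in counts:
    print(row)  # first row printed: (0, 0, 1, 1, 2)
```

Each source row is scanned three times but never joined against a number list, which avoids the OR-join of the original query. Filters on the data table would go into each of the three inner selects.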
I need to show data from the DB in a table in a report file.
my_table looks like:
+----+-------+------+------+-------------------+-----------+-------+----+-------------------+
| id |entryID|userID|active| dateCreated |affiliateId|premium|free| endDate |
| 1 | 69856 | 1 | N |2014-03-22 13:54:49| 1 | N | N |2014-03-22 13:54:49|
| 2 | 63254 | 2 | Y |2014-03-21 13:35:15| 2 | Y | N | |
| 3 | 56324 | 3 | N |2014-03-21 11:11:22| 2 | Y | N |2014-02-22 16:44:46|
| 4 | 41256 | 4 | Y |2014-03-21 08:10:46| 1 | N | Y | |
| .. | ... | ... | ... | ... | ... | ... | .. | ... |
+----+-------+------+------+-------------------+-----------+-------+----+-------------------+
I need to create a table from the data in my_table:
| Date | № of Entries (in that date) | Total № of Entries | Premium | Free | Affiliate |
The final table in the file should look like:
Report 17-07-2013:
+----------+--------------+-------+---------+------+-----------+
| Date | № of Entries | Total | Premium | Free | Affiliate |
|2013-07-17| 2 | 99845 | 2 | 0 | 0 |
|2013-07-18| 1 | 99843 | 0 | 1 | 0 |
|2013-07-22| 1 | 99842 | 1 | 0 | 1 |
|2013-07-23| 3 | 99841 | 2 | 1 | 2 |
|2013-07-24| 298 | 99838 | 32 | 273 | 25 |
|2013-07-25| 5526 | 99540 | 474 | 5058 | 126 |
|2013-07-26| 1686 | 94014 | 157 | 1532 | 56 |
|2013-07-27| 1673 | 92328 | 156 | 1517 | 97 |
|2013-07-28| 1461 | 90655 | 155 | 1310 | 83 |
| ... | ... | ... | ... | ... | ... |
+----------+--------------+-------+---------+------+-----------+
Should I do a separate SELECT for each column, or only one SELECT?
If it is possible with one SELECT, how do I do it?
It should be analogous to this report:
report
Some fields differ (like 'Number of Entries in that date').
Total number of Entries means: all entries from the beginning up to that specific date.
Number of Entries in that date means: all entries on that date.
In the final table, the dates in the Date column will not repeat, which is why the 'Number of Entries (in that date)' column counts all entries for that date.
It is not clear from your expected result whether Total is a count or a sum, and likewise for Affiliate.
Assuming Total is a count and Affiliate is a sum,
here is a query (using MS SQL) that you might use:
select DateCreated,count(EntryId) as Total,
sum(case when Premium='Y' then 1 else 0 end) as Premium,
sum(case when Premium='N' then 1 else 0 end) as Free,
sum(AffiliateId) as Affiliate
from sample
group by DateCreated
Here is a working demo.
If I didn't understand you correctly, kindly let me know.
I hope it helps.
SQLFiddle Demo: http://sqlfiddle.com/#!9/20cc0/5
The added column entryID does not matter for us.
I don't really understand what you want for Total, or the criteria for affiliateID. This query should get you started.
SELECT
DATE(dateCreated) as "Date",
count(dateCreated) as "No of Entries",
99845 as Total,
sum( case when premium='Y' then 1 else 0 end ) as Premium,
sum( case when premium='N' then 1 else 0 end ) as Free,
sum( case when affiliateID IS NOT NULL then 1 else 0 end) as Affiliate
FROM MyTable
GROUP BY DATE(dateCreated)
ORDER BY Date ASC
The final table in the file should look like:
... This new table can be in a file or in the web page. But it is not a new table in DB.
It sounds like you may be new to this area so I just wanted to inform you that spitting out a report into a file for a website is highly unusual and typically only done when your data is completely separate from the website. Putting data from a database onto a website (like the query we made here) is very common and it's very likely you don't need to mess with any files.
select date(DateCreated),count(entryId) as Total,
sum(case when Premium='Y' then 1 else 0 end) as Premium,
sum(case when Premium='N' then 1 else 0 end) as Free,
sum( case when affiliateID IS NOT NULL then 1 else 0 end) as Affiliate
INTO OUTFILE '/tmp/myfile.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
from my_table
group by date(DateCreated) order by date(DateCreated);
Please help me fix this MySQL query so that it returns correct results.
Please see the dataset for the tables below.
students
| id | name | batch | discount | open_bal | inactive |
+----+-------+-------+----------+----------+----------+
| 1 | Ash | 19 | 0 | -5000 | 0 |
+----+-------+-------+----------+----------+----------+
| 2 | Tuh | 15 | 0 | 0 | 0 |
+----+-------+-------+----------+----------+----------+
invoices
| id | invoice_num | student_id | reg_fee | tut_fee | other_fee | discount |
+------+-------------+------------+---------+---------+-----------+----------+
| 1 | 2011/1 | 1 | 5000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
| 137 | 2011/137 | 1 | 15000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
| 169 | 2011/169 | 2 | 15000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
recipts
| id | recipt_num | student_id | reg_fee | tut_fee | other_fee | status |
+------+-------------+------------+---------+---------+-----------+------------+
| 264 | 2011/264 | 1 | 0 | 15000 | 0 | confirmed |
+------+-------------+------------+---------+---------+-----------+------------+
| 18 | 2011/18 | 2 | 0 | 5250 | 0 | confirmed |
+------+-------------+------------+---------+---------+-----------+------------+
| 251 | 2011/251 | 2 | 4650 | 0 | 0 | pending |
+------+-------------+------------+---------+---------+-----------+------------+
batches
| id | name |
+-----+----------+
| 19 | S.T-11 |
+-----+----------+
| 15 | Mc/11-13 |
+-----+----------+
I want to produce a report by batch:
Batch id - batch id from batches table
Batch Name - batch name from batches table
Total Students - count(s.id) from students table group by batch
Opening Bal - sum(s.open_bal) from students table
Gross Fee - sum(reg_fee+tut_fee+other_fee) from invoices table
Discount - sum(i.discount) from invoices table
Net Payable - (openbal + grossfee) - discount
Net Received - sum(reg_fee+tut_fee+other_fee) from recipts table where r.status = 'confirmed'
Due Balance - Net Payable - Net Received
expected report
| batch_id | batch_name | total_students | opening_bal | gross_fee | discount | net_payable | net_recieved | due_balance |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
| 15 | Mc/11-13 | 1 | 0 | 15000 | 0 | 15000 | 5250 | 9750 |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
| 19 | S.T-11 | 1 | -5000 | 20000 | 0 | 15000 | 15000 | 0 |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
I have tried the following query, but it is giving wrong results.
SELECT b.name AS batch_name,
b.id AS batch_id,
COUNT( s.id ) AS total_students,
COALESCE( s.open_bal, 0 ) AS open_balance,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) AS gross_fee,
COALESCE( s.discount, 0 ) ,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -
COALESCE( s.discount, 0 ) AS net_payable,
COALESCE( sum( r.reg_fee + r.tut_fee + r.other_fee ) , 0 ) AS net_recieved,
COALESCE( s.discount, 0 ) ,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -
COALESCE( s.discount, 0 ) -
COALESCE( sum( r.reg_fee + r.tut_fee + r.other_fee ) , 0 )
AS due_balance
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN invoices i ON i.student_id = s.id
LEFT JOIN recipts r ON r.student_id = s.id
WHERE s.inactive =0 and r.status = 'confirmed'
GROUP BY b.name;
please help me to rewrite this query...
As far as SQL goes, this line is almost certainly wrong:
GROUP BY b.name;
The GROUP BY should contain every element of the select which is not an aggregate expression.
Try the query using:
GROUP BY b.name,b.id,COALESCE(s.open_bal,0), COALESCE(s.discount,0);
When the GROUP BY expression is incomplete, MySQL (unless ONLY_FULL_GROUP_BY is enabled) applies its own permissive, simplified grouping and picks an arbitrary value per group for the ungrouped columns. This avoids rejecting the query but produces highly unpredictable results, especially if your query is complex.
If you do not need a distinct result row for each s.open_bal and s.discount, then maybe you do not need these (duplicated) values in the select.
Beyond that, I did not take the time to analyze the complete query, but your needs seem quite complex. I would say divide and conquer, and KISS (Keep It Simple, Stupid): write several queries you fully understand instead of one huge query, especially if the requirements behind some of the results differ (some working on details, some on aggregates, and some on different aggregates). Otherwise you might need window functions (the "partition by" keyword), which MySQL versions before 8.0 do not have.
Maybe you should also try to fix your sums as in this example:
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -- bad
sum( COALESCE(i.reg_fee,0) + COALESCE(i.tut_fee,0) + COALESCE(i.other_fee,0) ) -- good
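One way to apply the divide-and-conquer idea is to total the invoices and the confirmed receipts per student in their own subqueries before joining them to students. Joining both detail tables directly multiplies rows (a student with two invoices and one receipt has that receipt counted twice). This is only a sketch, assuming SQLite through Python's sqlite3 module and the sample rows from the question; the aliases `gross`, `received` and `per_batch` are made up, and this is not the asker's final query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE batches  (id INTEGER, name TEXT);
    CREATE TABLE students (id INTEGER, name TEXT, batch INTEGER,
                           discount INTEGER, open_bal INTEGER, inactive INTEGER);
    CREATE TABLE invoices (id INTEGER, student_id INTEGER, reg_fee INTEGER,
                           tut_fee INTEGER, other_fee INTEGER, discount INTEGER);
    CREATE TABLE recipts  (id INTEGER, student_id INTEGER, reg_fee INTEGER,
                           tut_fee INTEGER, other_fee INTEGER, status TEXT);
    INSERT INTO batches  VALUES (19, 'S.T-11'), (15, 'Mc/11-13');
    INSERT INTO students VALUES (1, 'Ash', 19, 0, -5000, 0), (2, 'Tuh', 15, 0, 0, 0);
    INSERT INTO invoices VALUES (1, 1, 5000, 0, 0, 0), (137, 1, 15000, 0, 0, 0),
                                (169, 2, 15000, 0, 0, 0);
    INSERT INTO recipts  VALUES (264, 1, 0, 15000, 0, 'confirmed'),
                                (18, 2, 0, 5250, 0, 'confirmed'),
                                (251, 2, 4650, 0, 0, 'pending');
""")

report = conn.execute("""
    SELECT batch_id, batch_name, total_students, opening_bal, gross_fee, discount,
           opening_bal + gross_fee - discount                AS net_payable,
           net_received,
           opening_bal + gross_fee - discount - net_received AS due_balance
    FROM (
        SELECT b.id AS batch_id, b.name AS batch_name,
               COUNT(s.id)                  AS total_students,
               COALESCE(SUM(s.open_bal), 0) AS opening_bal,
               COALESCE(SUM(i.gross), 0)    AS gross_fee,
               COALESCE(SUM(i.discount), 0) AS discount,
               COALESCE(SUM(r.received), 0) AS net_received
        FROM batches b
        LEFT JOIN students s ON s.batch = b.id AND s.inactive = 0
        -- one row per student: invoice totals
        LEFT JOIN (SELECT student_id,
                          SUM(reg_fee + tut_fee + other_fee) AS gross,
                          SUM(discount) AS discount
                   FROM invoices GROUP BY student_id) i ON i.student_id = s.id
        -- one row per student: confirmed receipts only
        LEFT JOIN (SELECT student_id,
                          SUM(reg_fee + tut_fee + other_fee) AS received
                   FROM recipts WHERE status = 'confirmed'
                   GROUP BY student_id) r ON r.student_id = s.id
        GROUP BY b.id, b.name
    ) AS per_batch
    ORDER BY batch_id
""").fetchall()

for row in report:
    print(row)
```

Because each subquery returns at most one row per student, the joins no longer multiply rows, and the filters on inactive and status sit in the join conditions, so batches without matching students or receipts are still listed.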