Compare rows SQL counting discrepancies in value - mysql

I have a table:
id firstval secondval
1 4 5
2 5 4
3 3 3
4 6 6
5 7 8
6 9 8
7 3 3
8 3 3
The first thing I need to do is count the number of times secondval > firstval. This is obviously no problem.
However, the thing I'm struggling with is how to then count, for each instance of secondval > firstval, how many times the next row satisfies the condition secondval < firstval.
So in this example there are two rows that satisfy the first rule (ids 1 and 5) and two that satisfy the second rule (the rows that follow them, ids 2 and 6).

SELECT id, @prevGreater AND secondval < firstval AS discrepancy,
       @prevGreater := secondval > firstval AS secondGreater
FROM (SELECT * FROM YourTable ORDER BY id) AS x
CROSS JOIN (SELECT @prevGreater := false) AS init

SELECT * FROM YourTable t1
INNER JOIN YourTable t2 ON t1.id + 1 = t2.id -- join each row to the next one (t2.id is t1.id + 1)
WHERE t1.secondval > t1.firstval AND t2.secondval < t2.firstval
Now you can wrap this in a COUNT as you want :)
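As a minimal sketch of the self-join idea above, the following runs both counts against an in-memory SQLite database from Python. The table name vals and the use of SQLite are assumptions made for illustration, and the join only works if the ids are consecutive with no gaps.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# The table name "vals" is a placeholder, not taken from the original post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vals (id INTEGER PRIMARY KEY, firstval INTEGER, secondval INTEGER)"
)
conn.executemany(
    "INSERT INTO vals VALUES (?, ?, ?)",
    [(1, 4, 5), (2, 5, 4), (3, 3, 3), (4, 6, 6),
     (5, 7, 8), (6, 9, 8), (7, 3, 3), (8, 3, 3)],
)

# First rule: rows where secondval > firstval.
greater_count = conn.execute(
    "SELECT COUNT(*) FROM vals WHERE secondval > firstval"
).fetchone()[0]

# Second rule: of those rows, the ones whose immediate successor (id + 1)
# has secondval < firstval. Assumes ids are consecutive.
discrepancy_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM vals t1
    JOIN vals t2 ON t2.id = t1.id + 1
    WHERE t1.secondval > t1.firstval
      AND t2.secondval < t2.firstval
    """
).fetchone()[0]

print(greater_count, discrepancy_count)
```

If the ids can have gaps, the join condition would need to pick the next existing id instead of id + 1.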

DECLARE @YourTable TABLE
(id int, firstval int, secondval int)
INSERT INTO @YourTable
SELECT 1, 4, 5
UNION ALL
SELECT 2, 5, 4
UNION ALL
SELECT 3, 3, 3
UNION ALL
SELECT 4, 6, 6
UNION ALL
SELECT 5, 7, 8
UNION ALL
SELECT 6, 9, 8
UNION ALL
SELECT 7, 3, 3
UNION ALL
SELECT 8, 3, 3
SELECT ID
,CASE
    WHEN SECONDVAL>FIRSTVAL THEN 0
    WHEN FIRSTVAL>SECONDVAL THEN 1
    ELSE 0
END AS DISCREPANCY
,CASE
    WHEN SECONDVAL>FIRSTVAL THEN 1
    WHEN FIRSTVAL>SECONDVAL THEN 0
    ELSE 0
END AS SECONDGREATER
FROM @YourTable
You could try this one.


MySQL include empty WEEKS data as 0

Is there a way to include empty weeks in the result? Or how can I union in the missing weeks?
Here is part of my query:
SELECT
o.user_id, WEEK(FROM_UNIXTIME(o.cdate, '%Y-%m-%d'), 7) AS week_number
FROM
(_orders AS `o`)
WHERE
o.cdate BETWEEN '1505409460' AND '1540815218'
GROUP BY
week_number
Result
1
2
4
6
8
requested result
1
2
3
4
5
6
7
8
This is just an example; there are numerous ways to achieve this. The first step is to have, or generate, a set of integers. Having a table of these is actually very handy. Here I use 2 cross-joined subqueries to generate 100 rows (with n = 0 to 99), then left join the original query to them so that weeks without data still appear.
select
ns.n, sq.*
from (
select
d1.digit + (d10.digit*10) as n
from (
SELECT 0 AS digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9
) d1
cross join (
SELECT 0 AS digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9
) d10
) ns
left join (
your query goes here
) sq on ns.n = sq.week_number
where n between 1 and 52
order by n
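A small sketch of the same numbers-table technique, run against in-memory SQLite from Python. The weekly table and its order_count values are made-up stand-ins for the result of the original query, and the range is cut to weeks 1 to 8 to match the example output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the result of the original weekly query:
# weeks 3, 5 and 7 have no rows. The counts are invented sample data.
conn.execute("CREATE TABLE weekly (week_number INTEGER, order_count INTEGER)")
conn.executemany(
    "INSERT INTO weekly VALUES (?, ?)",
    [(1, 3), (2, 1), (4, 5), (6, 2), (8, 4)],
)

# "SELECT 0 AS digit UNION ALL SELECT 1 AS digit ..." for the digits 0..9.
digits = " UNION ALL ".join(f"SELECT {d} AS digit" for d in range(10))

rows = conn.execute(
    f"""
    SELECT ns.n AS week_number, COALESCE(w.order_count, 0) AS order_count
    FROM (
        -- two digit sets cross joined give n = 0..99
        SELECT d1.digit + d10.digit * 10 AS n
        FROM ({digits}) AS d1
        CROSS JOIN ({digits}) AS d10
    ) AS ns
    LEFT JOIN weekly AS w ON w.week_number = ns.n
    WHERE ns.n BETWEEN 1 AND 8
    ORDER BY ns.n
    """
).fetchall()

for week_number, order_count in rows:
    print(week_number, order_count)
```

COALESCE turns the NULL produced by the left join into 0 for the weeks that had no rows.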

Count up on a positive change, and down on a negative change

I have a column that changes values.
I want to count by adding at each change up and subtracting at each change down. Assuming x[] are my values, Delta is the sign of change in x's elements, and y[] is my targeted results or counts.
We count up until the next delta of -1, at which point we start counting down; we resume counting up when the delta changes back to +1. A delta of 0 keeps the current direction.
x: 1, 3, 4, 4, 4, 5, 5, 3, 3, 4, 5, 5, 6, 5, 4, 4, 4, 3, 4, 5, 6, 7, 8
Delta: 0, 1, 1, 0, 0, 1, 0, -1, 0, 1, 1, 0, 1, -1, -1, 0, 0, -1, 1, 1, 1, 1, 1
y: 1, 2, 3, 4, 5, 6, 7, 6, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4, 5, 6, 7, 8, 9
The length of my array is in the millions of rows, and efficiency is important. Not sure if such operation should be done in SQL or whether I would be better off retrieving the data from the database and performing such calculation outside.
You could use this query in SQL-Server, presuming a PK-column for the ordering:
WITH CTE AS
(
SELECT t.ID, t.Value,
LastValue = Prev.Value,
Delta = CASE WHEN Prev.Value IS NULL
OR t.Value > Prev.Value THEN 1
WHEN t.Value = Prev.Value THEN 0
WHEN t.Value < Prev.Value THEN -1 END
FROM dbo.TableName t
OUTER APPLY (SELECT TOP 1 t2.ID, t2.Value
FROM dbo.TableName t2
WHERE t2.ID < t.ID
ORDER BY t2.ID DESC) Prev
)
, Changes AS
(
SELECT CTE.ID, CTE.Value, CTE.LastValue, CTE.Delta,
Change = CASE WHEN CTE.Delta <> 0 THEN CTE.Delta
ELSE (SELECT TOP 1 CTE2.Delta
FROM CTE CTE2
WHERE CTE2.ID < CTE.ID
AND CTE2.Delta <> 0
ORDER BY CTE2.ID DESC) END
FROM CTE
)
SELECT SUM(Change) FROM Changes c
The result is 9, as expected.
The OUTER APPLY links the current record with the previous one; the previous record is the one with the highest ID < current.ID. It works similarly to a LEFT OUTER JOIN.
The main challenge was the sub-query in the last CTE. It is necessary to find the last delta that is <> 0, which determines whether the current row counts as positive or negative.
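For comparison, here is a minimal sketch of the same carry-forward logic done outside the database, in plain Python. The function name running_count is made up for this example, and it assumes the values arrive already ordered and that the first row counts up.

```python
def running_count(values):
    """Running count that steps +1 or -1 per row, keeping the last
    non-zero direction when the value does not change."""
    counts = []
    total = 0
    direction = 1      # assumption: the first row counts up
    previous = None
    for value in values:
        if previous is not None:
            if value > previous:
                direction = 1
            elif value < previous:
                direction = -1
            # unchanged value: keep the previous direction
        total += direction
        counts.append(total)
        previous = value
    return counts


x = [1, 3, 4, 4, 4, 5, 5, 3, 3, 4, 5, 5, 6, 5, 4, 4, 4, 3, 4, 5, 6, 7, 8]
y = running_count(x)
print(y)
```

This is a single pass over the data, so whether it beats the SQL version mostly depends on the cost of moving millions of rows out of the database.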
You can also use LAG and SUM with OVER (Assuming you have SQL Server 2012 or above) like this.
Sample Data
DECLARE @Table1 TABLE (ID int identity(1,1), [x] int);
INSERT INTO @Table1([x])
VALUES (1),(3),(4),(4),(4),(5),(5),(3),(3),(4),(5),(5),(6),(5),(4),(4),(4),(3),(4),(5),(6),(7),(8);
Query
;WITH T1 as
(
SELECT ID,x,ISNULL(LAG(x) OVER(ORDER BY ID ASC),x - 1) as PrevVal
FROM @Table1
), T2 as
(
SELECT ID,x,PrevVal,CASE WHEN x > PrevVal THEN 1 WHEN x < PrevVal THEN -1 ELSE 0 END as delta
FROM T1
)
SELECT ID,x,SUM(COALESCE(NULLIF(T2.delta,0),TI.delta,0))OVER(ORDER BY ID) as Ordered
FROM T2 OUTER APPLY (SELECT TOP 1 delta from T2 TI WHERE TI.ID < T2.ID AND TI.x = T2.x AND TI.delta <> 0 ORDER BY ID DESC) as TI
ORDER BY ID
Output
ID x Ordered
1 1 1
2 3 2
3 4 3
4 4 4
5 4 5
6 5 6
7 5 7
8 3 6
9 3 5
10 4 6
11 5 7
12 5 8
13 6 9
14 5 8
15 4 7
16 4 6
17 4 5
18 3 4
19 4 5
20 5 6
21 6 7
22 7 8
23 8 9
You use both the sql-server and mysql tags. If this can be done within SQL Server, you should have a look at the OVER clause: https://msdn.microsoft.com/en-us/library/ms189461.aspx
Assuming there's an ordering criterion, it is possible to specify a ROWS clause and use the value of a preceding row. Many SQL functions allow the usage of OVER.
You could define a computed column which does the calculation on insert...
Good luck!

Filling empty records through SELECT..INSERT statement

Before anything else, here is the simplified schema (with dummy records) of the database:
ItemList
ItemID  ItemName   DateAcquired  Cost    MonthlyDep  CurrentValue
=================================================================
1       Stuff Toy  2011-12-25    100.00  10.00       100.00
2       Mouse      2011-12-23    250.00  50.00       200.00
3       Keyboard   2011-12-17    250.00  30.00       190.00
4       Umbrella   2011-12-28    150.00  20.00       110.00
5       Aircon     2011-12-29    950.00  25.00       925.00
DepreciationTransaction
ItemID  DateOfDep   MonthlyDep
==============================
2       2012-01-31  250.00
3       2012-01-31  30.00
4       2012-01-31  20.00
5       2012-01-31  25.00
3       2012-02-29  30.00
4       2012-02-29  20.00
I need your suggestions to help me solve this problem. Basically I am creating a depreciation monitoring system of a certain LGU. The problem of the current database is that it lacks some records for a specific date of depreciation, for instance:
Lacking Records (this is not a table from the database)
ItemID  LackingDate
===================
1       2012-01-31
1       2012-02-29
2       2012-02-29
5       2012-02-29
Because of the lacking records, I cannot generate the depreciation report for the month of March. Any idea how I can insert the missing records into DepreciationTransaction?
What have I done so far? Nothing, apart from a simple query that calculates the newly depreciated value (which produces an incorrect value because of the missing records).
The problem here is that you will have to generate data. MySQL is not intended to generate data; you should do that at the application level and just tell MySQL to store it. In this case, the application should check whether there are missing records and create them if needed.
Leaving that aside, you can (awfully) create dynamic data with MySQL like this:
select il.itemId, endOfMonths.aDate from ((
  select aDate from (
    select @maxDate - interval (a.a+(10*b.a)+(100*c.a)+(1000*d.a)) day aDate from
    (select 0 as a union all select 1 union all select 2 union all select 3
     union all select 4 union all select 5 union all select 6 union all
     select 7 union all select 8 union all select 9) a, /*10 day range*/
    (select 0 as a union all select 1 union all select 2 union all select 3
     union all select 4 union all select 5 union all select 6 union all
     select 7 union all select 8 union all select 9) b, /*100 day range*/
    (select 0 as a union all select 1 union all select 2 union all select 3
     union all select 4 union all select 5 union all select 6 union all
     select 7 union all select 8 union all select 9) c, /*1000 day range*/
    (select 0 as a union all select 1 union all select 2 union all select 3
     union all select 4 union all select 5 union all select 6 union all
     select 7 union all select 8 union all select 9) d, /*10000 day range*/
    (select @minDate := (select min(dateAcquired) from il),
            @maxDate := '2012-03-01') e
  ) f
  where aDate between @minDate and @maxDate and aDate = last_day(aDate)
) endOfMonths, il)
left join dt
on il.itemId = dt.itemId and endOfMonths.aDate = dt.dateOfDep
where dt.itemId is null and last_day(il.dateAcquired) < endOfMonths.aDate
Depending on the length of the date range, you can reduce the number of dynamically generated rows (10000 days means over 27 years of records, each representing one day) by removing derived tables (d, c, b and a) and dropping them from the interval formula. Setting the @minDate and @maxDate variables lets you specify the dates between which you want to filter the results. The min date should be the earliest date on which you acquired an item, and the max date should be in March, in your case.
In plain English: if select min(dateAcquired) from il returns a date before '2012-03-01' - 10000 days, then you'll have to add another derived table of digits.
Finally, just add the insert statement (if you really need to insert those records).
You may build a temporary table that contains the dates needed, and LEFT OUTER JOIN it to the "DepreciationTransaction" table.
SELECT tmp_date.date_value, dt.ItemID, COALESCE(SUM(dt.MonthlyDep), 0)
FROM tmp_date
LEFT OUTER JOIN
DepreciationTransaction AS dt
ON tmp_date.date_value = dt.DateOfDep
GROUP BY tmp_date.date_value, dt.ItemID
Of course, if you want all of the items to appear on the report, you should make a cartesian product of tmp_date and the item ids.
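The cartesian-product idea can be sketched as follows, using in-memory SQLite from Python. The tmp_date table and its two month-end dates are assumptions for illustration, and only the columns needed to find the gaps are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ItemList (ItemID INTEGER PRIMARY KEY);
    CREATE TABLE DepreciationTransaction (ItemID INTEGER, DateOfDep TEXT);
    -- tmp_date is the hypothetical helper table holding the expected dates
    CREATE TABLE tmp_date (date_value TEXT);

    INSERT INTO ItemList VALUES (1), (2), (3), (4), (5);
    INSERT INTO tmp_date VALUES ('2012-01-31'), ('2012-02-29');
    INSERT INTO DepreciationTransaction VALUES
        (2, '2012-01-31'), (3, '2012-01-31'), (4, '2012-01-31'),
        (5, '2012-01-31'), (3, '2012-02-29'), (4, '2012-02-29');
""")

# Every (item, date) combination that should exist, minus the ones that do.
missing = conn.execute("""
    SELECT il.ItemID, d.date_value
    FROM ItemList AS il
    CROSS JOIN tmp_date AS d
    LEFT JOIN DepreciationTransaction AS dt
           ON dt.ItemID = il.ItemID
          AND dt.DateOfDep = d.date_value
    WHERE dt.ItemID IS NULL
    ORDER BY il.ItemID, d.date_value
""").fetchall()

for item_id, date_value in missing:
    print(item_id, date_value)
```

The same SELECT could feed an INSERT ... SELECT once the depreciation amount for each missing row is added to the column list. All sample items were acquired before the first date, so no filter on DateAcquired is applied here.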

Get the column order of a query

I have a table with two columns [id, value] both numeric.
In this example:
[ id, value ]
[ 1, 6 ]
[ 2, 4 ]
[ 3, 10 ]
[ 4, 2 ]
[ 5, 7 ]
[ 6, 3 ]
For a given id I'd like to retrieve the top 3 id's (those with highest value), their top position and if the given id is not in the top 3, also get its position, id and value:
Example 1: ask_id = 5 Return:
[ position, id, value ]
[ 1, 3, 10 ]
[ 2, 5, 7 ]
[ 3, 1, 6 ]
Example 2: ask_id = 4. Return:
[ position, id, value ]
[ 1, 3, 10 ]
[ 2, 5, 7 ]
[ 3, 1, 6 ]
[ 6, 4, 2 ]
So the important points are:
How to get the position column?
How to get the additional row if possible (anyway there's no problem if I need two queries)?
select t2.pos, t1.id, t1.value
from test as t1
inner join
(select id, value, @pos:=if(@pos is null, 0, @pos)+1 as pos
 from test order by value desc) as t2
on t1.id=t2.id
where t2.pos<=3 or t2.id={$ask_id}
order by t2.pos;
Basically, the idea is like this:
Rank the rows by value.
Retrieve rows where at least one of the following is true:
position BETWEEN 1 AND 3
id = @given_id
These posts give examples of how you could substitute ranking functions (at least the most fundamental of them, ROW_NUMBER()) in MySQL:
ROW_NUMBER() in MySQL
MSSQL Row_Number() over(order by) in MySql
This method should be used with caution, though, as this article explains.
That said, one possible implementation of the above steps might look like this:
SET @pos = 0;
SELECT
  position,
  id,
  value
FROM (
  SELECT
    id,
    value,
    @pos := @pos + 1 AS position
  FROM atable
  ORDER BY value DESC
) s
WHERE position BETWEEN 1 AND 3
   OR id = @given_id
ORDER BY position
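The two steps (rank, then filter on position or id) can also be sketched without user variables. The version below is only an illustration: it uses in-memory SQLite from Python, swaps the variable-based numbering for a correlated COUNT, and the function name top_with_extra is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 6), (2, 4), (3, 10), (4, 2), (5, 7), (6, 3)],
)


def top_with_extra(ask_id, top_n=3):
    """Return (position, id, value) for the top_n rows plus the row ask_id."""
    return conn.execute(
        """
        SELECT position, id, value
        FROM (
            SELECT t.id,
                   t.value,
                   -- position = number of rows with a strictly higher value, plus 1
                   (SELECT COUNT(*) FROM test AS h WHERE h.value > t.value) + 1
                       AS position
            FROM test AS t
        ) AS ranked
        WHERE position <= ? OR id = ?
        ORDER BY position
        """,
        (top_n, ask_id),
    ).fetchall()


print(top_with_extra(5))
print(top_with_extra(4))
```

With this counting rule, rows that tie on value share a position, which differs from the strictly increasing numbering produced by the variable-based query.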
Tested in MySQL. To retrieve the top 3 ids (those with the highest value) with position in ascending order:
set @num = 0;
SELECT @num := @num + 1 as position_sequence,id,value FROM tablename
ORDER BY value desc
limit 3;
I've not (yet) tested the selected answer in MySQL on the interesting cases where there are ties in the top three places, but I have tested this code in Informix on those cases, and it produces the answer I think should be produced.
Assuming that the table is called leader_board:
CREATE TABLE leader_board(id INTEGER NOT NULL PRIMARY KEY, value INTEGER NOT NULL);
INSERT INTO leader_board(id, value) VALUES(1, 6);
INSERT INTO leader_board(id, value) VALUES(2, 4);
INSERT INTO leader_board(id, value) VALUES(3, 10);
INSERT INTO leader_board(id, value) VALUES(4, 2);
INSERT INTO leader_board(id, value) VALUES(5, 7);
INSERT INTO leader_board(id, value) VALUES(6, 3);
This query works on the data shown, assuming that the special ID is 4:
SELECT b.position - c.tied + 1 AS standing, a.id, a.value
FROM leader_board AS a
JOIN (SELECT COUNT(*) AS position, d.id
FROM leader_board AS d
JOIN leader_board AS e ON (d.value <= e.value)
GROUP BY d.id
) AS b
ON a.id = b.id
JOIN (SELECT COUNT(*) AS tied, f.id
FROM leader_board AS f
JOIN leader_board AS g ON (f.value = g.value)
GROUP BY f.id
) AS c
ON a.id = c.id
WHERE (a.id = 4 OR (b.position - c.tied + 1) <= 3) -- Special ID = 4; Top N = 3
ORDER BY position, a.id;
Output on original data:
standing id value
1 3 10
2 5 7
3 1 6
6 4 2
Explanation
The two sub-queries are closely related, but they produce different answers. At one time, I used two temporary tables to hold those results. In particular, the first sub-query (AS b) produces a position, but when there are ties, the position is the lowest rather than the highest of the tied positions. That is, given:
ID Value
1 10
2 7
3 7
4 7
The outputs will be:
Position ID
1 1
4 2
4 3
4 4
However, we would like to count them as:
Position ID
1 1
2 2
2 3
2 4
So, the corrected position is the original position minus the number of tied values (3 for ID ∈ { 2, 3, 4 }, 1 for ID 1) plus 1. The second sub-query returns the number of tied values for each ID. There might be a neater way to do that calculation, but I'm not sure what it is at the moment.
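The arithmetic of the correction can be checked with a short Python sketch; the function name standings and the dictionary layout are assumptions made for this example only.

```python
def standings(scores):
    """scores maps id -> value; returns id -> corrected standing."""
    result = {}
    values = list(scores.values())
    for item_id, value in scores.items():
        # What the first sub-query counts: rows with value >= this one.
        position = sum(1 for other in values if other >= value)
        # What the second sub-query counts: rows with the same value.
        tied = sum(1 for other in values if other == value)
        result[item_id] = position - tied + 1
    return result


example = {1: 10, 2: 7, 3: 7, 4: 7}
print(standings(example))
```

Since position counts values that are greater or equal and tied counts the equal ones, position - tied is the number of strictly greater values, so the standing is that count plus 1.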
Special cases
However, the code should demonstrate that it handles the cases where:
There are 2 or more ID values with the same top value.
There are 2 or more ID values with the same second highest top score (but the top one is unique).
There are 2 or more ID values with the same third highest top score (but the top two are unique).
To save rewriting the query each time, I converted it into an Informix-style stored procedure that takes both the Special ID and the Top N (defaulting to 3) as parameters. (Yes, the notation in the RETURNING clause is weird.)
CREATE PROCEDURE leader_board_standings(extra_id INTEGER, top_n INTEGER DEFAULT 3)
RETURNING INTEGER AS standing, INTEGER AS id, INTEGER AS value;
DEFINE standing, id, value INTEGER;
FOREACH SELECT b.position - c.tied + 1 AS standing, a.id, a.value
INTO standing, id, value
FROM leader_board AS a
JOIN (SELECT COUNT(*) AS position, d.id
FROM leader_board AS d
JOIN leader_board AS e ON (d.value <= e.value)
GROUP BY d.id
) AS b
ON a.id = b.id
JOIN (SELECT COUNT(*) AS tied, f.id
FROM leader_board AS f
JOIN leader_board AS g ON (f.value = g.value)
GROUP BY f.id
) AS c
ON a.id = c.id
WHERE (a.id = extra_id OR (b.position - c.tied + 1) <= top_n)
ORDER BY position, a.id
RETURN standing, id, value WITH RESUME;
END FOREACH;
END PROCEDURE;
This can be invoked to produce the same result as before:
EXECUTE PROCEDURE leader_board_standings(4);
To illustrate the various cases outlined above, add and remove extra rows:
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
3 1 6
6 4 2
INSERT INTO leader_board(id, value) VALUES(10, 10);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
1 10 10
3 5 7
7 4 2
INSERT INTO leader_board(id, value) VALUES(11, 10);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
1 10 10
1 11 10
8 4 2
INSERT INTO leader_board(id, value) VALUES(12, 10);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
1 10 10
1 11 10
1 12 10
9 4 2
DELETE FROM leader_board WHERE id IN (10, 11, 12);
EXECUTE PROCEDURE leader_board_standings(6, 4); -- Special ID 6; Top 4
1 3 10
2 5 7
3 1 6
4 2 4
5 6 3
INSERT INTO leader_board(id, value) VALUES(7, 7);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
2 7 7
7 4 2
INSERT INTO leader_board(id, value) VALUES(13, 7);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
2 7 7
2 13 7
8 4 2
INSERT INTO leader_board(id, value) VALUES(14, 7);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
2 7 7
2 13 7
2 14 7
9 4 2
DELETE FROM leader_board WHERE id IN(7, 13, 14);
INSERT INTO leader_board(id, value) VALUES(8, 6);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
3 1 6
3 8 6
7 4 2
INSERT INTO leader_board(id, value) VALUES(9, 6);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
3 1 6
3 8 6
3 9 6
8 4 2
INSERT INTO leader_board(id, value) VALUES(15, 6);
EXECUTE PROCEDURE leader_board_standings(4);
1 3 10
2 5 7
3 1 6
3 8 6
3 9 6
3 15 6
9 4 2
EXECUTE PROCEDURE leader_board_standings(3); -- Special ID 3 appears in top 3
1 3 10
2 5 7
3 1 6
That all looks correct to me.

How to query rows that are not in a table?

First of all, I don't think my title is good, but I couldn't think of a better one. Please feel free to change it.
I have a table that keeps record of a pair of rows.
The following is a sample table structure.
table History
user_id row_1 row_2
2 1 2
2 1 3
table Rows
row_id
1
2
3
4
5
6
I would like to query for only the pairs of rows that are not in the 'History' table.
So I would like to get the following result.
row pairs:
1,4
1,5
1,6
2,3
2,4
2,5
2,6
and so on
Can I do it with one query?
Just added:
I just made a query that works, but I am not sure about the performance.
select r1.row_id, r2.row_id from rows as r1 cross join rows as r2
where r1.row_id!=r2.row_id and ( r1.row_id + r2.row_id) not in (select row_1 + row_2 from history)
order by r1.row_id desc
Would it be super slow?
Something like this. You haven't made the correlations clear between the fields but this should be easy to adapt.
select h.row_id r1, r.row_id r2
from rows h
cross join rows r
left join history h2 on h2.row_1=h.row_id and h2.row_2=r.row_id
where h2.row_1 is null
The CROSS JOIN produces all the possible combinations of row_id x row_id.
The LEFT JOIN attempts to find each combination in the history table.
The WHERE clause picks out the combinations that were not found.
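A runnable sketch of the same anti-join, using in-memory SQLite from Python. It is a variant, not the query above: it assumes the pairs are unordered, so it joins on a.row_id < b.row_id and checks history in both orientations. The table name item_rows is a placeholder chosen to avoid any clash with the ROWS keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- item_rows stands in for the Rows table from the question
    CREATE TABLE item_rows (row_id INTEGER PRIMARY KEY);
    CREATE TABLE history (user_id INTEGER, row_1 INTEGER, row_2 INTEGER);

    INSERT INTO item_rows VALUES (1), (2), (3), (4), (5), (6);
    INSERT INTO history VALUES (2, 1, 2), (2, 1, 3);
""")

pairs = conn.execute("""
    SELECT a.row_id, b.row_id
    FROM item_rows AS a
    JOIN item_rows AS b ON a.row_id < b.row_id        -- each unordered pair once
    LEFT JOIN history AS h
           ON (h.row_1 = a.row_id AND h.row_2 = b.row_id)
           OR (h.row_1 = b.row_id AND h.row_2 = a.row_id)
    WHERE h.row_1 IS NULL                              -- keep pairs with no match
    ORDER BY a.row_id, b.row_id
""").fetchall()

for first, second in pairs:
    print(first, second)
```

If the direction of a pair matters, the second half of the ON condition and the a.row_id < b.row_id restriction would both need to change.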
I think this might work:
SELECT DISTINCT
CASE WHEN r1.row_id < r2.row_id THEN r1.row_id ELSE r2.row_id END AS row_id_1,
CASE WHEN r1.row_id < r2.row_id THEN r2.row_id ELSE r1.row_id END AS row_id_2
FROM Rows AS r1
INNER JOIN Rows AS r2 /* ON r1.row_id <> r2.row_id */
WHERE (r1.row_id, r2.row_id) NOT IN (
SELECT row_1, row_2
FROM history
UNION
SELECT row_2, row_1
FROM history
)
ORDER BY 1, 2
Returns:
1, 1
1, 4
1, 5
1, 6
2, 2
2, 3
2, 4
2, 5
2, 6
3, 3
3, 4
3, 5
3, 6
4, 4
4, 5
4, 6
5, 5
5, 6
6, 6
This query will be super slow.