ROW_NUMBER restart on duplicate value - duplicates

I have a result set as shown below and am using the ROW_NUMBER() function to determine when there is a change in value.
Date | Value | RowNumber
2/13/17 | 10 | 1
2/13/17 | 10 | 2
2/13/17 | 10 | 3
2/13/17 | 11 | 1
2/13/17 | 11 | 2
2/13/17 | 10 | 4
2/13/17 | 10 | 5
However, here's my problem.
The last 2 rows which have a value of 10 once again, receive a ROW_NUMBER of 4 and 5, continuing from where the previous ROW_NUMBER of 10 left off.
For my purposes, I need the 2nd set of 10s to restart the ROW_NUMBER with 1 and 2 again.
How can I achieve this goal?

Try using LAG() to detect a change from the previous row, then number the rows within each run of unchanged values.
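One way to apply the LAG() idea is sketched below, run through Python's sqlite3 module so it is self-contained (window functions need SQLite 3.25 or newer; the same SQL shape should carry over to other engines with window functions). The `readings` table name and its `id` ordering column are assumptions: the question shows no column that defines row order, and some such column is required. LAG() flags each row whose value differs from the previous row, a running SUM of the flags labels each run, and ROW_NUMBER() is then partitioned by that label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; `id` is an assumed column that defines the row order.
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, dt TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO readings (id, dt, value) VALUES (?, ?, ?)",
    [(i, "2017-02-13", v) for i, v in enumerate([10, 10, 10, 11, 11, 10, 10], start=1)],
)

query = """
WITH flagged AS (
    -- 1 when the value differs from the previous row (or there is no previous row)
    SELECT id, dt, value,
           CASE WHEN value = LAG(value) OVER (ORDER BY id) THEN 0 ELSE 1 END AS is_change
    FROM readings
),
runs AS (
    -- running total of the change flags gives each consecutive run its own label
    SELECT id, dt, value,
           SUM(is_change) OVER (ORDER BY id) AS run_id
    FROM flagged
)
SELECT dt, value,
       ROW_NUMBER() OVER (PARTITION BY run_id ORDER BY id) AS row_num
FROM runs
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data, the second set of 10s is numbered 1 and 2 instead of 4 and 5.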

How to do an IF statement in mysql as part of the SELECT statement

I'm comparing search engine rankings for URLs and I have 3 columns: URL, Rank, Previous rank. What I'd like to do is add a fourth column which says whether the rank has gone up or down.
For example
URL | Rank | Previous_Rank
example.com/page1 | 2 | 16
example.com/page2 | 2 | 11
example.com/page3 | 1 | 14
example.com/page4 | 1 | 4
example.com/page5 | 101| 7
example.com/page6 | 101| 14
example.com/page7 | 101| 7
example.com/page8 | 6 | 17
example.com/page9 | 10| 17
example.com/page10| 19| 1
I'd like another column to return:
URL | Rank | Previous_Rank | Movement
example.com/page1 | 2 | 16 | Up
example.com/page2 | 2 | 11 | Up
example.com/page3 | 1 | 14 | Up
example.com/page4 | 1 | 4 | Up
example.com/page5 | 101| 7 | Down
example.com/page6 | 101| 14 | Down
example.com/page7 | 101| 7 | Down
example.com/page8 | 6 | 17 | Up
example.com/page9 | 10| 17 | Up
example.com/page10| 19| 1 | Down
I'm using HeidiSQL with MySQL. The data here is part of a larger table and is pulled together with this SELECT statement:
select
URL,
Rank,
Previous_Rank
from URL_Changes
where
date = "2017-06-14"
group by URL
order by 2
;
So my question is, how do I edit that select statement to bring back that extra column?
Thanks.
You use a CASE statement for this:
select
    URL,
    Rank,
    Previous_Rank,
    CASE
        WHEN Rank < Previous_Rank THEN 'Up'
        WHEN Rank > Previous_Rank THEN 'Down'
        WHEN Rank = Previous_Rank THEN 'No Change'
    END AS Movement
from URL_Changes
where
    date = "2017-06-14"
group by URL
order by 2
;
MySQL also has an IF() function, but CASE is supported by nearly every RDBMS, and IF() has to be nested to test more than two outcomes, so CASE is generally the better choice.
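To see the CASE expression at work, here is a small sketch that runs it against a few rows through Python's sqlite3 module. Two rows come from the question; the `page11` row is hypothetical and only exists to exercise the 'No Change' branch. The date filter and GROUP BY are left out because the sample has no date column. The column name is quoted as a precaution, since RANK is a reserved word in MySQL 8.0 and later (MySQL itself would use backticks).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE URL_Changes (URL TEXT, "Rank" INTEGER, Previous_Rank INTEGER)')
conn.executemany(
    "INSERT INTO URL_Changes VALUES (?, ?, ?)",
    [
        ("example.com/page1", 2, 16),
        ("example.com/page5", 101, 7),
        ("example.com/page11", 4, 4),  # hypothetical row, not in the question
    ],
)

query = """
SELECT URL, "Rank", Previous_Rank,
       CASE
           WHEN "Rank" < Previous_Rank THEN 'Up'
           WHEN "Rank" > Previous_Rank THEN 'Down'
           WHEN "Rank" = Previous_Rank THEN 'No Change'
       END AS Movement
FROM URL_Changes
ORDER BY 2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A lower rank number means a better position, which is why `Rank < Previous_Rank` maps to 'Up'.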

Median of multiple columns in MySql

How do you get the median of a row in MySQL?
I have a table which gives monthly stock for a series of categories:
cat_id | mar_stk | feb_stk | jan_stk
1 | 5 | 7 | 9
2 | 2 | 1 | 3
3 | 6 | 8 | 10
I need the median, maximum and minimum stock for each category.
Currently have minimum and maximum using:
SELECT
cat_id,
GREATEST(mar_stk, feb_stk, jan_stk) AS max_stk,
LEAST(mar_stk, feb_stk, jan_stk) AS min_stk
FROM example_table
Which leaves me with:
cat_id | max_stk | min_stk
1 | 9 | 5
2 | 3 | 1
3 | 10 | 6
But I can't find any straightforward way to find the median.
In statistics, the median is the middle value of a sorted distribution. For instance, if the cat_id column holds the values 1, 2 and 3, the median is 2, since it is the value in the middle. Query the middle value and you are done. Give me a shout if you still need further guidance. ..Sectona
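For the row-wise case in the question, where each row has exactly three values, the middle value is whatever remains after subtracting the greatest and the least from the sum. The sketch below checks that on the question's data through Python's sqlite3 module. SQLite's multi-argument MAX() and MIN() stand in for MySQL's GREATEST() and LEAST() here; that substitution is an assumption of this example, not something from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example_table "
    "(cat_id INTEGER, mar_stk INTEGER, feb_stk INTEGER, jan_stk INTEGER)"
)
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?, ?, ?)",
    [(1, 5, 7, 9), (2, 2, 1, 3), (3, 6, 8, 10)],
)

# In MySQL, replace MAX(a, b, c) / MIN(a, b, c) with GREATEST(a, b, c) / LEAST(a, b, c).
query = """
SELECT cat_id,
       MAX(mar_stk, feb_stk, jan_stk) AS max_stk,
       MIN(mar_stk, feb_stk, jan_stk) AS min_stk,
       mar_stk + feb_stk + jan_stk
           - MAX(mar_stk, feb_stk, jan_stk)
           - MIN(mar_stk, feb_stk, jan_stk) AS median_stk
FROM example_table
ORDER BY cat_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This only works for exactly three non-NULL columns; with more months, unpivoting the columns into rows and taking the median there is the more general route.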

Grouping query by similar items in a row

How can I group all adjacent items returned from a query?
It's difficult to explain, so it's best if I just provide an example.
I have a table called UserActions with three columns and the following data:
ID | User | Action
1 | Mark | Jump
2 | Mark | Jump
3 | Mark | Jump
4 | Mark | Run
5 | Mark | Run
6 | John | Run
7 | John | Run
8 | Mark | Run
9 | Mark | Run
10 | Mark | Jump
11 | Mark | Jump
12 | John | Jump
13 | John | Jump
The output I want is this:
Last ID | User | Action | Count
12 | John | Jump | 2
10 | Mark | Jump | 2
8 | Mark | Run | 2
6 | John | Run | 2
4 | Mark | Run | 2
1 | Mark | Jump | 3
Basically it groups all items by the user and action and outputs the total count before the next row is either a different action or user. If I do regular group by using "annotate" it will just group all items.
Is there a way to do this using a Django Query or raw SQL?
Thanks,
Mark
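SELECT Max(ID) AS LastID, [User], [Action], Count([Action]) AS [Count]
FROM #Table1
GROUP BY [User], [Action]
The above query groups by user and action across the whole table, so it returns one row per user/action pair rather than one row per consecutive run.
The output generated is:
LastID | User | Action | Count
13 | John | Jump | 2
11 | Mark | Jump | 5
7 | John | Run | 2
9 | Mark | Run | 4
Hope it helps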
In the Django ORM:
(Model.objects.values('user', 'action')
    .order_by()
    .annotate(max_id=models.Max('id'),
              count=models.Count('action')))
Please note the empty .order_by(). It's needed to override any default ordering declared in the model's Meta, because Django adds the default ordering fields to the GROUP BY clause.
SELECT x.*
FROM my_table x
LEFT JOIN my_table y
    ON y.user = x.user
    AND y.action = x.action
    AND y.id = x.id - 1
WHERE y.id IS NULL;
This assumes contiguous incremental ids, as per the example, but it's trivial to rewrite it if that's not the case.
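If window functions are available, the run lengths can be computed as well. The sketch below is one possible approach, run through Python's sqlite3 module (SQLite 3.25 or newer): the difference between a global ROW_NUMBER() and a per-user/action ROW_NUMBER() stays constant within a consecutive run, so it can serve as a grouping key. The names `run_id`, `first_id` and `run_length` are made up for this example. Note that the "Last ID" column in the question's desired output actually holds the first id of each run, so the query returns MIN(id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE UserActions (id INTEGER PRIMARY KEY, "user" TEXT, action TEXT)')
data = (
    [("Mark", "Jump")] * 3 + [("Mark", "Run")] * 2 + [("John", "Run")] * 2
    + [("Mark", "Run")] * 2 + [("Mark", "Jump")] * 2 + [("John", "Jump")] * 2
)
conn.executemany(
    'INSERT INTO UserActions (id, "user", action) VALUES (?, ?, ?)',
    [(i, u, a) for i, (u, a) in enumerate(data, start=1)],
)

query = """
WITH numbered AS (
    -- the difference of the two row numbers is constant within a consecutive run
    SELECT id, "user", action,
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY "user", action ORDER BY id) AS run_id
    FROM UserActions
)
SELECT MIN(id) AS first_id, "user", action, COUNT(*) AS run_length
FROM numbered
GROUP BY "user", action, run_id
ORDER BY first_id DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the key is built from row numbers rather than from the ids themselves, gaps in the id sequence do not break the grouping. From Django, a query like this could be passed to `Model.objects.raw()` or a database cursor.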

Creating Temporary Column Order By Slow

What I'm trying to do is compute gained experience, order by it, and display only the top 5 or 50. I'm not an SQL expert, but I do know about indexes and filesorts. My query is most likely filesorting because "gained_xp" is not an index, or even a real column, since it is only computed temporarily. I haven't found a clear explanation of how to fix this while keeping everything in one query. I'm sorting nearly 13k rows, and that number will only grow. The number of rows returned and the starting time also need to be dynamic. Any help would be appreciated. Thank you.
Explain Output: Using where; Using temporary; Using filesort
Indexes include: time userid override overallXp overallLevel overallRank
The closest I've gotten to ordering all rows (the query never completes and ends in a MySQL reboot) is:
SELECT FROM_UNIXTIME(time, '%Y-%m-%d'), t_u.userid as uid, MAX(t_u.OverallXP)-(SELECT overallXP FROM track_updates WHERE `userid` = t_u.userid AND `time`>'1394323200' ORDER BY id ASC LIMIT 1) as gained_xp
FROM track_updates t_u
WHERE t_u.time>'1394323200'
GROUP BY t_u.userid
A query that selects only one user and works correctly is:
SELECT FROM_UNIXTIME(time, '%Y-%m-%d'), (t_u.overallXP)-(SELECT overallXP FROM track_updates WHERE `userid`='1' ORDER BY `id` ASC LIMIT 1) as gained_xp, t_u.userid
FROM track_updates t_u
WHERE t_u.userid='1' AND t_u.time>'1393632000'
ORDER BY t_u.time DESC
LIMIT 1
Sample Data per request:
____________________________________________________________________
| id | userid | time | overallLevel | overallXP | overallRank |
| 1 | 1 | 1394388114 | 1 | 1 | 1 |
| 2 | 1 | 1394389114 | 2 | 10 | 1 |
| 3 | 2 | 1394388114 | 1 | 1 | 2 |
| 4 | 2 | 1394389114 | 1 | 5 | 2 |
| 5 | 2 | 1394390114 | 2 | 7 | 2 |
Desired output (most recent time; gained xp = current - initial; ordered by gained_xp):
____________________________________________
| id | time | userid | gained_xp |
| 1 | March 9th 2014 | 1 | 9 |
| 2 | March 9th 2014 | 2 | 6 |
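One possible way to restructure the query is sketched below, using Python's sqlite3 module and the sample data above. Instead of a correlated subquery per user, it finds each user's first and last row id in the time window once, then joins back to read the XP values. It assumes that `id` increases with `time` for each user (the original subquery already relies on ordering by id); the `:since` and `:top_n` parameters are hypothetical names for the dynamic cutoff and row count. Whether this removes the filesort on a real MySQL table is not something the sample can show, so check it with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track_updates (id INTEGER PRIMARY KEY, userid INTEGER, time INTEGER, "
    "overallLevel INTEGER, overallXP INTEGER, overallRank INTEGER)"
)
conn.executemany(
    "INSERT INTO track_updates VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1394388114, 1, 1, 1),
        (2, 1, 1394389114, 2, 10, 1),
        (3, 2, 1394388114, 1, 1, 2),
        (4, 2, 1394389114, 1, 5, 2),
        (5, 2, 1394390114, 2, 7, 2),
    ],
)

query = """
SELECT b.userid,
       last_row.time AS last_time,
       last_row.overallXP - first_row.overallXP AS gained_xp
FROM (
    -- one pass over the window: first and last row id per user
    SELECT userid, MIN(id) AS first_id, MAX(id) AS last_id
    FROM track_updates
    WHERE time > :since
    GROUP BY userid
) AS b
JOIN track_updates AS first_row ON first_row.id = b.first_id
JOIN track_updates AS last_row ON last_row.id = b.last_id
ORDER BY gained_xp DESC
LIMIT :top_n
"""
rows = conn.execute(query, {"since": 1394323200, "top_n": 5}).fetchall()
for row in rows:
    print(row)
```

The final sort now only touches one row per user, and both joins are primary-key lookups.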

Complicated MySQL Data Structure/Manipulation Problem

First off, I apologize for the length. This is kind of complicated (at least for me).
Background on the database:
I have a products, variables, and prices table. "Products" are the main information regarding a product (description, title, etc). "Prices" have information about each price (price, cost, minimum qty required, shipping cost, etc), as some products can have more than one price (a 10" widget is a different price than a 12" widget, for instance). "Variables" are variations to the product that do not change the price, such as color, size, etc.
Initially (when I built this database about 7 years ago) I had the variable information stored in the first price in a list of prices for the same product in a pipe-delimited format (yes, I know, badbadbad). This worked in general, but we've always had a problem where sometimes a variable wouldn't be consistent among all the prices.
For instance, a Widget (product) may be 10" or 12" and sell for $10 and $20 (prices) respectively. However, while the 10" widget may be available in blue and red (variables), the 12" widget is only available in blue. We ameliorated this problem by adding a little parenthetical statement in the incongruent variable like "Red (10" ONLY)". This sort of works, but customers are not always that smart and a lot of time is devoted to fixing mistakes when a customer selects a 12" widget in red.
I have since been tasked with modernizing the database and have decided to put the variables in their own table, making them more dynamic and easier to match with certain prices, as well as to keep a more dummy-proof inventory (you can't imagine the nightmares).
My first step was to write a stored procedure on my test db (for when I do the conversion) to process all the existing variables into a new variable table (and label table, but that's not really important, I don't think). I effectively parsed out the variables and listed them with the correct product id and the product id they were initially associated with in the variable table. However, I realized this is only a part of the problem, since I (at least for the initial transformation of the database) want each variable to be listed as being connected to each price for a given product.
To do this, I created another table, like so:
tblvariablesprices
variablepriceid | variableid | priceid | productid
which is a many-to-many with the variable table.
Problems:
My problem now is that I don't know how to create the rows. I can create a left join on my prices and variables tables to get (I think) all the necessary data; I just don't know how to go through it. My SQL is (MySQL 5.0):
SELECT p.priceid, p.productid, variableid, labelid
FROM tblprices p
LEFT JOIN tblvariables v ON p.priceid = v.priceid
ORDER BY productid, priceid
This will get me every priceid and productid and any matching variable and label ids. This is good in certain instances, such as when I have something like:
priceid | productid | variableid | labelid
2 | 7 | 10 | 4
2 | 7 | 11 | 4
2 | 7 | 12 | 4
3 | 7 | (null) | (null) --- another price for product
because now I know that I need to create a record for priceid 2 and variableids 10, 11, 12, and then also for priceid 3 for that product. However, I also get results from this dataset for products with no variables, products with one price and multiple variables, and products with multiple prices and no variables, for instance:
priceid | productid | variableid | labelid
2 | 7 | 10 | 4
2 | 7 | 11 | 4
2 | 7 | 12 | 4
3 | 7 | (null) | (null)
4 | 8 | (null) | (null) --- 1 price no variables
5 | 9 | 13 | 5 --- mult vars, 1 price
5 | 9 | 14 | 5
5 | 9 | 15 | 6
5 | 9 | 16 | 6
6 | 10 | (null) | (null) --- mult price, no vars
7 | 10 | (null) | (null)
8 | 10 | (null) | (null)
Taking the above dataset, I want to add entries into my tblvariablesprices table like so:
variablepriceid | variableid | priceid | productid
1 | 10 | 2 | 7
2 | 11 | 2 | 7
3 | 12 | 2 | 7
4 | 10 | 3 | 7
5 | 11 | 3 | 7
6 | 12 | 3 | 7
7 | 13 | 5 | 9
8 | 14 | 5 | 9
9 | 15 | 5 | 9
10 | 16 | 5 | 9
I have thousands of records to process, so obviously doing this manually is not the answer. Can anyone at least point me in the correct direction, if not come up with a sproc that could handle this type of operation? I also would welcome any comments on how to better organize and/or structure this data.
Thank you so much for reading all this and helping me out.
How about:
SELECT DISTINCT v.variableid, a.priceid, a.productid
FROM tblprices AS a
JOIN tblprices AS b ON a.productid = b.productid
JOIN tblvariables AS v ON v.priceid = b.priceid
ORDER BY a.priceid;
+------------+---------+-----------+
| variableid | priceid | productid |
+------------+---------+-----------+
| 10 | 2 | 7 |
| 11 | 2 | 7 |
| 12 | 2 | 7 |
| 10 | 3 | 7 |
| 11 | 3 | 7 |
| 12 | 3 | 7 |
| 13 | 5 | 9 |
| 14 | 5 | 9 |
| 15 | 5 | 9 |
| 16 | 5 | 9 |
+------------+---------+-----------+
INSERTing into tblvariablesprices is left as an exercise for the reader ;)
I think this should work:
SELECT v.variableid, p.productid, p.priceid
FROM tblvariables v, tblprices p
WHERE v.priceid IN (SELECT s.priceid
FROM tblprices s
WHERE s.productid = p.productid);
Next time, can you throw in create and insert statements to replicate your setup? Thanks.
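For completeness, here is a sketch of the INSERT ... SELECT step, run end to end through Python's sqlite3 module with the sample rows from the question. The column lists of `tblprices` and `tblvariables` are reduced to what the question shows, and the auto-increment key on `tblvariablesprices` is an assumption. Every variable attached to any price of a product is paired with every price of that same product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblprices (priceid INTEGER PRIMARY KEY, productid INTEGER);
CREATE TABLE tblvariables (variableid INTEGER PRIMARY KEY, priceid INTEGER, labelid INTEGER);
CREATE TABLE tblvariablesprices (
    variablepriceid INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumed surrogate key
    variableid INTEGER, priceid INTEGER, productid INTEGER
);
""")
conn.executemany(
    "INSERT INTO tblprices VALUES (?, ?)",
    [(2, 7), (3, 7), (4, 8), (5, 9), (6, 10), (7, 10), (8, 10)],
)
conn.executemany(
    "INSERT INTO tblvariables VALUES (?, ?, ?)",
    [(10, 2, 4), (11, 2, 4), (12, 2, 4), (13, 5, 5), (14, 5, 5), (15, 5, 6), (16, 5, 6)],
)

# p: the price receiving the variables; src: any price of the same product
# that already has variables attached.
conn.execute("""
INSERT INTO tblvariablesprices (variableid, priceid, productid)
SELECT DISTINCT v.variableid, p.priceid, p.productid
FROM tblprices AS p
JOIN tblprices AS src ON src.productid = p.productid
JOIN tblvariables AS v ON v.priceid = src.priceid
ORDER BY p.priceid, v.variableid
""")

rows = conn.execute(
    "SELECT variableid, priceid, productid FROM tblvariablesprices "
    "ORDER BY priceid, variableid"
).fetchall()
for row in rows:
    print(row)
```

Products without any variables (8 and 10 in the sample) produce no rows, which matches the desired output in the question.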