How to add over partition by column as a new column - mysql

I have a PARTITION BY query that numbers each account's entries in order of the date they were entered, as follows:
SELECT
Account,
Entry_Date,
COUNT(Account) OVER (PARTITION BY Account ORDER BY Entry_Date) AS Entry_No
FROM accounts_entries;
The new column produces the following output.
Account | Entry_Date | Entry_No
1000 2022-02-17 1
1000 2022-03-01 2
1000 2022-08-14 3
1000 2022-12-11 4
2000 2022-01-02 1
2000 2022-04-01 2
2000 2022-05-04 3
2000 2022-06-05 4
However, I cannot figure out how to get this added as an actual column using ALTER TABLE.
Attempting it results in Error Code: 3593. I've tried reading up on this error and on window functions, but can't find a solution that applies to PARTITION BY. I did consider a join, but Entry_No really needs to be its own column that updates incrementally as new Account entries are added.
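One possible direction, offered as a sketch rather than a definitive fix: since the window function is rejected when used in the ALTER TABLE column definition, a view can expose Entry_No as a column that is recomputed every time it is read, so it stays current as new entries arrive. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL 8 so that it runs anywhere; the view name accounts_entries_numbered is hypothetical, and the CREATE VIEW statement is assumed to carry over to MySQL 8 unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts_entries (Account INTEGER, Entry_Date TEXT)")
conn.executemany(
    "INSERT INTO accounts_entries VALUES (?, ?)",
    [(1000, "2022-02-17"), (1000, "2022-03-01"), (2000, "2022-01-02")],
)

# Hypothetical view name; the SELECT is the query from the question.
conn.execute("""
    CREATE VIEW accounts_entries_numbered AS
    SELECT Account,
           Entry_Date,
           COUNT(Account) OVER (PARTITION BY Account ORDER BY Entry_Date) AS Entry_No
    FROM accounts_entries
""")

def entry_numbers(connection):
    """Read the numbered entries back from the view."""
    return connection.execute(
        "SELECT Account, Entry_Date, Entry_No FROM accounts_entries_numbered "
        "ORDER BY Account, Entry_Date"
    ).fetchall()

before = entry_numbers(conn)
# A new entry for account 1000 is numbered automatically on the next read.
conn.execute("INSERT INTO accounts_entries VALUES (1000, '2022-08-14')")
after = entry_numbers(conn)
```

Nothing is stored for Entry_No here, so there is nothing to keep in sync; the trade-off is that the numbering is recalculated on each query.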

Related

Delete the duplicate values in the SUM with MySQL or SQL

Hi, I am summing a column of a table, but the table has duplicate rows, so I wonder how I can do the sum without the duplicated rows:
The main table is this one:
folio  cashier_id  amount  date
0001   1           2500    2022-06-01 00:00:00
0002   2           10000   2022-06-01 00:00:00
0001   1           2500    2022-06-01 00:00:00
0003   1           1000    2022-06-01 00:00:00
As you can see, the first and third rows are duplicates, so the sum comes out wrong. The result is:
cashier_id  cash_amount
1           6000
2           10000
but it should be:
cashier_id  cash_amount
1           3500
2           10000
The query that I use to make the sum is this one:
SELECT `jysparki_jis`.`api_transactions`.`cashier_id` AS `cashier_id`,
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`) AS `cash_amount`,
COUNT(0) AS `ticket_number`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`) AS `date`
FROM `jysparki_jis`.`api_transactions`
WHERE DATE(`jysparki_jis`.`api_transactions`.`created_at`) >= '2022-01-01'
AND (`jysparki_jis`.`api_transactions`.`dte_type_id` = 39
OR `jysparki_jis`.`api_transactions`.`dte_type_id` = 61)
AND `jysparki_jis`.`api_transactions`.`cashier_id` <> 0
GROUP BY `jysparki_jis`.`api_transactions`.`cashier_id`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`)
As you can see, the sum is this:
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`).
How can I do the sum without counting the same folio twice for the same cashier_id?
I know that filtering on cashier_id and folio would avoid the duplicate rows, but I do not know how to do that. Can you help me?
Thanks
Given your provided input tables, you can use the DISTINCT clause inside the SUM aggregation function to solve your problem:
SELECT cashier_id, SUM(DISTINCT amount)
FROM tab
GROUP BY cashier_id,
folio,
date
You can then add your conditions to the WHERE clause of this query and aggregate on the "created_at" field (which, I guess, corresponds to the "date" field of your sample table). This solution should give you the general idea.
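As a sketch of that general idea, assuming a duplicate means a row whose folio, cashier_id, amount and date all repeat: removing the duplicates in a derived table first and then summing per cashier yields one row per cashier. This is a variation on the query above, not the same query. It is run here through Python's built-in sqlite3 module as a stand-in for MySQL, and the table name tab and the alias deduped are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (folio TEXT, cashier_id INTEGER, amount INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        ("0001", 1, 2500, "2022-06-01 00:00:00"),
        ("0002", 2, 10000, "2022-06-01 00:00:00"),
        ("0001", 1, 2500, "2022-06-01 00:00:00"),  # exact duplicate of the first row
        ("0003", 1, 1000, "2022-06-01 00:00:00"),
    ],
)

# Step 1 (inner query): keep one copy of each identical row.
# Step 2 (outer query): sum the remaining amounts per cashier.
totals = conn.execute("""
    SELECT cashier_id, SUM(amount) AS cash_amount
    FROM (SELECT DISTINCT folio, cashier_id, amount, date FROM tab) AS deduped
    GROUP BY cashier_id
    ORDER BY cashier_id
""").fetchall()
```

Deduplicating whole rows, rather than distinct amounts, keeps two different folios that happen to share the same amount from being merged.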

Nested sort in SELECT followed by Conditional INSERT based upon results of SELECT inquiry

I have been struggling with the following for some time.
The server I am using has MySQL ver 5.7 installed.
The issue:
I wish to take recorded tank level readings from one table, find the difference between the last two records for a particular tank, and multiply this by a factor to get a quantity used.
The extracted quantity, if it is +ve (otherwise 0), is then to be inserted into another table for further use.
The Quant value extracted may be +ve or -ve as tanks fill and empty. I only require the used quantity, i.e. a falling level.
The two following tables are used:
Table 'tf_rdgs' sample (value1 is content height):
id  location  value1  reading_time
1   18        1500
2   18        1340
3   9         1600
4   18        1200
5   9         1400
6   18        1765    yyyy
7   18        1642    xxxx
Table 'flow' example:
id  location  Quant  reading_time
1   18        5634   dd-mm: HH-mm
2   18        0      dd-mm: HH-mm
3   18        123    current time
I do not require to go back over history and am only interested in the latest level readings as a new level reading is inserted.
I can get the following to work with a table of only one location.
INSERT INTO flow (location, Quant)
SELECT t1.location, (t2.value1 - t1.value1) AS Quant
FROM tf_rdgs t1 cross join tf_rdgs t2 on t1.reading_time > t2.reading_time
ORDER BY t2.reading_time DESC limit 1
It is not particularly efficient but works and gives the following return from the above table.
location  Quant
18        123
For a table with mixed locations, adding a WHERE t1.location = ... clause does not work.
The problems I am struggling with are:
How to nest the initial sorting by location ahead of the subsequent query for the difference between the last two tank level readings.
A search for a single location is fine; it does not need to cover all tanks.
A conditional INSERT that inserts the 'Quant' value only if it is +ve, or else inserts 0 if it is -ve (i.e. filling).
I have tried many permutations on these without success.
Once the above has been achieved, it needs to run from a conditional trigger on the tf_rdgs table, based on the location of the inserted data, fired on each new reading inserted from the sensors on a particular tank.
I could achieve the above, with the exception of the conditional insert, if each tank had a dedicated table, but unfortunately I can't go there due to the existing data structure and usage.
Any direction or assistance on part or all of this is much appreciated.
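A minimal sketch of the first two parts (latest two readings for one location, then a clamped insert), under stated assumptions: the id column is used as a stand-in for reading order because most sample rows have no reading_time, the helper record_usage and the factor argument are hypothetical, and Python's built-in sqlite3 module stands in for MySQL 5.7. SQLite's two-argument MAX(0, x) plays the role that GREATEST(0, x) would play in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tf_rdgs (id INTEGER PRIMARY KEY, location INTEGER, value1 INTEGER)"
)
conn.execute(
    "CREATE TABLE flow (id INTEGER PRIMARY KEY, location INTEGER, Quant INTEGER)"
)
conn.executemany(
    "INSERT INTO tf_rdgs (location, value1) VALUES (?, ?)",
    [(18, 1500), (18, 1340), (9, 1600), (18, 1200), (9, 1400), (18, 1765), (18, 1642)],
)

# cur  = latest reading for the location, prev = the one before it.
# A falling level gives prev - cur > 0; a rising level is clamped to 0.
# In MySQL, GREATEST(0, ...) would replace SQLite's scalar MAX(0, ...).
INSERT_USAGE = """
    INSERT INTO flow (location, Quant)
    SELECT :loc, MAX(0, (prev.value1 - cur.value1) * :factor)
    FROM (SELECT value1 FROM tf_rdgs WHERE location = :loc
          ORDER BY id DESC LIMIT 1) AS cur,
         (SELECT value1 FROM tf_rdgs WHERE location = :loc
          ORDER BY id DESC LIMIT 1 OFFSET 1) AS prev
"""

def record_usage(connection, location, factor=1):
    """Hypothetical helper: insert the used quantity for one location."""
    connection.execute(INSERT_USAGE, {"loc": location, "factor": factor})

record_usage(conn, 18)   # 1765 -> 1642, level fell by 123
record_usage(conn, 9)    # 1600 -> 1400, level fell by 200
conn.execute("INSERT INTO tf_rdgs (location, value1) VALUES (18, 1800)")
record_usage(conn, 18)   # 1642 -> 1800, tank filling, so 0 is stored
quantities = conn.execute("SELECT location, Quant FROM flow ORDER BY id").fetchall()
```

If a location has only one reading, the prev subquery returns no row and nothing is inserted. Filtering by location inside each subquery, rather than in an outer WHERE, is what keeps readings from other tanks out of the comparison.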

How to Query the Same Data within a Table but the Output Row Positions are Different

I have a table in my database like the sample below, and I would like to query the same data but with the values in Column 2 shifted down by one row, so that each row also shows the value from the previous row.
P.S. I'm building a system for electric meter readings and I need the current (Column 1) and previous (Column 2) readings so that I can compute the total consumption of the electric meter. I am having a hard time doing it, so any suggestions would be deeply appreciated. Thank you. :)
Example data:
Desired Query Output:
Keep in mind that SQL table rows have no inherent order. They're just bags of records.
You must order them based on some column value or other criterion. In your case I guess you want the most recent and the second most recent meter reading for each account. Presumably your reading table has columns something like this:
reading_id customer_id datestamp value
1 1122 2009-02-11 112
2 1234 2009-02-13 18
3 1122 2009-03-08 125
4 1234 2009-03-10 40
5 1122 2009-04-12 160
6 1234 2009-04-11 62
I guess you need this sort of result set
customer_id datestamp value previous
1122 2009-03-08 125 112
1122 2009-04-12 160 125
1234 ...etcetera.
How can you get this? For each row in the table, you need a way to find the previous reading for the same customer: that is, the row with
the same customer id
the latest datestamp that occurs before the current datestamp.
This is a job for a so-called correlated subquery. Here's the query, with its subquery. (https://www.db-fiddle.com/f/hWGAbq4uAbA5f15j7oZY9o/0)
SELECT aft.customer_id,
aft.datestamp,
( SELECT bef.value
FROM r bef /* row from table.... */
WHERE bef.datestamp < aft.datestamp /* with datestamp < present datestamp */
AND bef.customer_id = aft.customer_id /* and same customer id */
ORDER BY bef.datestamp DESC /* most recent first */
LIMIT 1 /* only most recent */
) prev,
aft.value
FROM r aft
ORDER BY aft.customer_id, aft.datestamp
Notice that dealing with the first reading for each customer takes some thought in your business process.
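To make the first-reading case concrete, here is a small sketch that runs the same correlated subquery against the sample rows and subtracts the previous value to get consumption. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the consumption column is an addition for illustration, and the table name r is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE r (reading_id INTEGER, customer_id INTEGER, datestamp TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO r VALUES (?, ?, ?, ?)",
    [
        (1, 1122, "2009-02-11", 112),
        (2, 1234, "2009-02-13", 18),
        (3, 1122, "2009-03-08", 125),
        (4, 1234, "2009-03-10", 40),
        (5, 1122, "2009-04-12", 160),
        (6, 1234, "2009-04-11", 62),
    ],
)

# For each reading, look up the same customer's most recent earlier reading
# and subtract it. The first reading per customer has no earlier row, so the
# subquery yields NULL and the consumption is NULL as well.
rows = conn.execute("""
    SELECT aft.customer_id,
           aft.datestamp,
           aft.value,
           aft.value - (SELECT bef.value
                        FROM r bef
                        WHERE bef.customer_id = aft.customer_id
                          AND bef.datestamp < aft.datestamp
                        ORDER BY bef.datestamp DESC
                        LIMIT 1) AS consumption
    FROM r aft
    ORDER BY aft.customer_id, aft.datestamp
""").fetchall()
```

The NULL consumption on each customer's first row is the case that needs a business decision, for example skipping that row or treating the opening reading as a baseline.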

How to sum specific rows and columns in SQL?

pnr mnd pris
1 1 600
1 7 900
2 1 600
2 7 600
3 1 40
3 7 40
I am having trouble summing specific rows of a column. The table above is called travel and it has 3 columns:
pnr - Personal Number
mnd - Month
pris - Price
What I want is the total price for a specific month: for month 1 it should be 1240 USD, and for month 7 it should be 1540 USD.
I am having trouble getting the query right. So far I have tried this:
SELECT t.pnr, t.mnd, SUM(t.pris)
FROM travel AS t
WHERE t.mnd = 1
The result I get is 3720 USD, and I have no idea how SQL arrived at that number.
I'd appreciate it if someone could help me out!
For this you need to drop the pnr column from the output (it is not relevant and will cause your data to split) and add a GROUP BY:
SELECT t.mnd, SUM(t.pris)
FROM travel AS t
WHERE t.mnd = 1
GROUP BY t.mnd
Live demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=b34ec2bb9c077c2d74ffc66748c5c142
(Selecting a non-aggregated column next to an aggregate function without a GROUP BY, as you have now, is not standard SQL. MySQL only permits it when the ONLY_FULL_GROUP_BY SQL mode is disabled, and even then you might not get the result you expected or intended.)
Just group your result by the mnd column:
SELECT t.mnd, SUM(t.pris)
FROM travel AS t
group by t.mnd
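As a quick check of the grouped query against the sample rows, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the alias total and the ORDER BY are additions for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE travel (pnr INTEGER, mnd INTEGER, pris INTEGER)")
conn.executemany(
    "INSERT INTO travel VALUES (?, ?, ?)",
    [(1, 1, 600), (1, 7, 900), (2, 1, 600), (2, 7, 600), (3, 1, 40), (3, 7, 40)],
)

# One output row per month, each with the sum of that month's prices.
totals = conn.execute("""
    SELECT t.mnd, SUM(t.pris) AS total
    FROM travel AS t
    GROUP BY t.mnd
    ORDER BY t.mnd
""").fetchall()
```

Adding WHERE t.mnd = 1 before the GROUP BY would restrict the output to the single row for month 1.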

Spotfire intersect first 'n' periods

Is there a way to use an Over and Intersect function to get the average sales for the first 3 periods (not always consecutive months, sometimes a month is skipped) for each Employee?
For example:
EmpID 1 is 71.67 ((80 + 60 + 75)/3) despite skipping "3/1/2007"
EmpID 3 is 250 ((350 + 250 + 150)/3).
I'm not sure how EmpID 2 would work because there are just two data points.
I've used a workaround: a calculated column using DenseRank over Date, "asc", EmpID, then a second Boolean calculated column that tests whether the DenseRank column is <= 3, and then Over functions restricted to the rows where the Boolean column is TRUE. But I would like to find the correct way to do this.
There are Last 'n' Period functions but I haven't seen anything resembling a First 'n' Period function.
EmpID Date Sales
1 1/1/2007 80
1 2/1/2007 60
1 4/1/2007 75
1 5/1/2007 30
1 9/1/2007 100
2 2/1/2007 200
2 3/1/2007 100
3 12/1/2006 350
3 1/1/2007 250
3 3/1/2007 150
3 4/1/2007 275
3 8/1/2007 375
3 9/1/2007 475
3 10/1/2007 300
3 12/1/2007 200
I suppose the solution depends on where you want this data represented, but here is one example:
If((Rank([Date],"asc",[EmpID])<=3) and (Max(Rank([Date],"asc",[EmpID])) OVER ([EmpID])>=3),Avg([Sales]) over ([EmpID]))
You can insert this as a calculated column and it will give you what you want (assuming your data is sorted by date when imported).
You may want to see the row numbering, and in that case insert this as a calculated column as well and name it RN
Rank([Date],"asc",[EmpID])
Explanation
Rank([Date],"asc",[EmpID])
This part of the function is basically applying a row number (labeled as RN in the results below) to each EmpID grouping.
Rank([Date],"asc",[EmpID])<=3
This is how we take the top 3 rows regardless of whether months are skipped. If your data isn't sorted, we'd have to create one additional calculated column, but the same logic applies.
(Max(Rank([Date],"asc",[EmpID])) OVER ([EmpID])>=3)
This is where we are basically ignoring EmpID = 2, or any EmpID who doesn't have at least 3 rows. Removing this would give you the average (dynamically) for each EmpID based on their first 1, 2, or 3 months respectively.
Avg([Sales]) over ([EmpID])
Now that our data is limited to the rows we care about, just take the average for each EmpID.
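To cross-check the numbers the question expects, independently of Spotfire, the same logic (rank by date within each EmpID, keep ranks 1 to 3, skip employees with fewer than 3 rows, then average) can be sketched in plain Python. The function name first_n_average is hypothetical and this is only a reference calculation, not a Spotfire expression.

```python
from collections import defaultdict
from datetime import datetime

rows = [
    (1, "1/1/2007", 80), (1, "2/1/2007", 60), (1, "4/1/2007", 75),
    (1, "5/1/2007", 30), (1, "9/1/2007", 100),
    (2, "2/1/2007", 200), (2, "3/1/2007", 100),
    (3, "12/1/2006", 350), (3, "1/1/2007", 250), (3, "3/1/2007", 150),
    (3, "4/1/2007", 275), (3, "8/1/2007", 375), (3, "9/1/2007", 475),
    (3, "10/1/2007", 300), (3, "12/1/2007", 200),
]

def first_n_average(records, n=3):
    """Average Sales over each employee's first n periods by date.

    Employees with fewer than n periods map to None, mirroring the
    'at least 3 rows' condition in the expression above.
    """
    by_emp = defaultdict(list)
    for emp_id, date_text, sales in records:
        by_emp[emp_id].append((datetime.strptime(date_text, "%m/%d/%Y"), sales))

    result = {}
    for emp_id, entries in by_emp.items():
        entries.sort()  # earliest date first, whatever the input order
        if len(entries) >= n:
            result[emp_id] = sum(sales for _, sales in entries[:n]) / n
        else:
            result[emp_id] = None
    return result

averages = first_n_average(rows)
```

Sorting inside the function means the result does not depend on the import order of the data, and skipped months have no effect because only the rank of each date matters.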
@Chris - Here is the solution I came up with
Step 1: Inserted a calculated column 'rank' with the expression below
DenseRank([Date],"asc",[EmpID])
Step 2: Created a cross table visualization from the data table and limited data with the expression below