I have the following results (temp table):

Product | Side (buy/sell) | TotalQuantity | AverageWeightedPrice | Cost
Prod1   | 1               | 100           | 120                  | 12,000
Prod1   | 2               | -50           | 130                  | -6,500
Prod2   |
Prod2   |

So on and so forth for multiple products.
I consolidated it (with a GROUP BY) to:

Product | Side (buy/sell) | TotalQuantity | AverageWeightedPrice | Cost
Prod1   | 1               | 50            | 110                  | 5,500
Prod2   |
I want the consolidated results, except under certain conditions:

When Side 1 and Side 2 have the same quantity, the consolidated quantity would be 0 and I would no longer be able to calculate the AverageWeightedPrice.

Besides consolidated quantity = 0, the other condition is when TotalQuantity and Cost have opposite signs, i.e., when Quantity is positive and Cost is negative, or Quantity is negative and Cost is positive.

Under those conditions, I want to return the UNconsolidated data instead.

I am having trouble choosing between consolidated and unconsolidated data at the same time.
This is not something that SQL is really good at.

It is entirely possible to produce this result in pure SQL, but the solution is very complex, and you should avoid it if you have any other option: SQL does not provide any optimization for complex tasks like this, and the syntax can be hard to master. Maintaining the query could also prove to be a hard task in the future.

In cases like this, you should use SQL only to fetch the data, and then format (consolidate) it in whichever programming language or application is executing the query.
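Following that advice, here is a minimal application-side sketch (Python, with hypothetical field names based on the table above): group the rows per product, and fall back to the raw rows whenever the consolidated quantity is zero or quantity and cost have opposite signs.

```python
from collections import defaultdict

def consolidate(rows):
    """Group rows by product; return the consolidated row unless the
    exception conditions apply, in which case return the raw rows."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row["product"]].append(row)

    result = []
    for product, group in by_product.items():
        qty = sum(r["quantity"] for r in group)
        cost = sum(r["cost"] for r in group)
        # Exception 1: net quantity is 0, so the weighted price is undefined.
        # Exception 2: quantity and cost have opposite signs.
        if qty == 0 or (qty > 0) != (cost > 0):
            result.extend(group)          # keep the unconsolidated rows
        else:
            result.append({"product": product,
                           "quantity": qty,
                           "avg_weighted_price": cost / qty,
                           "cost": cost})
    return result

rows = [
    {"product": "Prod1", "side": 1, "quantity": 100, "cost": 12000},
    {"product": "Prod1", "side": 2, "quantity": -50, "cost": -6500},
    {"product": "Prod2", "side": 1, "quantity": 30,  "cost": 3000},
    {"product": "Prod2", "side": 2, "quantity": -30, "cost": -2900},
]
print(consolidate(rows))
```

Here Prod1 consolidates to quantity 50, price 110, cost 5,500 (matching the consolidated table above), while Prod2 nets to 0 and so comes back as its two original rows.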
Related
I have a table of "outcomes" of scores for a few hundred people on specific days. E.g.
Person | Date     | Score
1      | 1/1/2021 | 10
2      | 1/1/2021 | 15
1      | 2/2/2022 | 20
3      | 2/2/2022 | 17
I will need to repeatedly compare each player's average score for a specific date range, e.g. get each player's average score between 1/1/2021 and 12/31/2021.
I know that I could query their averages using the AVG(Score) aggregate function, like SELECT Person, AVG(Score) FROM outcomes WHERE Date < ? GROUP BY Person;
However, since I have hundreds of players with possibly hundreds of outcomes each, I am worried that repeatedly running this query will produce a lot of row reads. I am considering creating an "averages" table or view with an entry for each player on each unique date, where Score is the average score of the outcomes before that date.
Something like:
Person | EndDate  | AVG(Score)
1      | 1/2/2021 | 10
2      | 1/2/2021 | 15
3      | 1/2/2021 | 0
1      | 2/3/2022 | 15
2      | 2/3/2022 | 15
3      | 2/3/2022 | 17
I realize that this is essentially at least doubling the amount of storage required, because each outcome will also have the associated "average" entry.
How is this kind of problem often addressed in practice? At what point does creating an "averages" table make sense? When is using the AVG(x) function most appropriate? Should I just add an "average" column to my "outcomes" table?
I was able to implement my query using the aggregate AVG(x) function, but I am concerned about the number of row reads that my database quickly started requiring.
What you are describing is a form of denormalization: storing the result of an aggregation instead of running the query every time you need it.
When to implement this? When running the query cannot be done fast enough to meet your performance goals.
Be cautious about adopting denormalization too soon. It comes with a cost.
The risk is that if your underlying data changes, but your denormalized copy is not updated, then the stored averages will be outdated. You have to decide whether it's acceptable to query outdated aggregate results from the denormalized table, and how often you want to update those stored results. There isn't one answer to this — it's up to your project requirements and your judgment.
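Before denormalizing, it is worth measuring how far the plain aggregate gets you with a suitable index. A minimal sketch (Python with SQLite, table and column names assumed from the question; a covering index lets the query be answered from the index alone):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outcomes (Person INTEGER, Date TEXT, Score REAL);
    -- Covering index: the query below can be satisfied without touching the table.
    CREATE INDEX idx_outcomes_person_date ON outcomes (Person, Date, Score);
    INSERT INTO outcomes VALUES
        (1, '2021-01-01', 10), (2, '2021-01-01', 15),
        (1, '2022-02-02', 20), (3, '2022-02-02', 17);
""")

# Average per player within a date range (ISO dates compare correctly as text).
rows = conn.execute("""
    SELECT Person, AVG(Score)
    FROM outcomes
    WHERE Date BETWEEN ? AND ?
    GROUP BY Person
    ORDER BY Person
""", ("2021-01-01", "2021-12-31")).fetchall()
print(rows)
```

If that is still too slow at your real data volume, the precomputed "averages" table becomes worth its storage and maintenance cost.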
I have a report with a grouping of Category which each present their own totals lines. I would like to subtract the total of one group from the total of a separate group.
Group Clients Revenue ATC
Called 1000 50000 50.00
Control 100 1000 10.00
Here is what I want to do:
Variance 900 49000 40.00
Keep in mind that Called and Control are already set up as groupings, and there is underlying data that can be expanded to show each store's data.
Any suggestions would be helpful.
Thanks,
Scott
If you want to get the variance, I assume you only have two groups.
Try this for Clients:
=Sum(IIF(Fields!Group.Value="Called",Fields!Clients.Value,0))
-
Sum(IIF(Fields!Group.Value="Control",Fields!Clients.Value,0))
This for Revenue:
=Sum(IIF(Fields!Group.Value="Called",Fields!Revenue.Value,0))
-
Sum(IIF(Fields!Group.Value="Control",Fields!Revenue.Value,0))
And this for ATC:
=Sum(IIF(Fields!Group.Value="Called",Fields!ATC.Value,0))
-
Sum(IIF(Fields!Group.Value="Control",Fields!ATC.Value,0))
Let me know if this helps.
I'm doing a school assignment that has to do with gifts and their production time. We have the table "gift" which contains the name, gift_number and gift_production_time, and the table "wishes", which contains gift_number, wish_number and person_number. With these tables we have to calculate the amount of time it takes for all the gifts to be made in minutes and days rounded up.
Seeing as this is an introductory course to databases, I've hit a roadblock on this task. The closest I can get is to have a row for each gift, showing the production time of each of them individually, but not the total amount of time it takes.
Here is my closest attempt:
SELECT w.gift_number, count(w.gift_number)*production_time as total_minutes
FROM wishes as w, gifts as g
WHERE g.gift_number = w.gift_number
GROUP BY w.gift_number
I don't think I can get the correct answer with the GROUP BY statement, but the math isn't correct if I omit it. Any help would be much appreciated. :-)
EDIT
Gift table

gift_number | gift_name | production_time
1           | gift1     | 130
2           | gift2     | 140
3           | gift3     | 200
4           | gift4     | 100

Wishes table

wish_number | person_number | gift_number
1           | 1             | 2
2           | 1             | 4
3           | 2             | 2
4           | 3             | 1
First, if you are learning SQL, you should learn proper join syntax. As a simple rule: Never use commas in the from clause.
Second, the query you have written is invalid standard SQL. MySQL accepts it, but the problem is production_time: it is not inside an aggregation function (such as sum() or min()) and it is not in the GROUP BY clause, so you are relying on a MySQL extension. You should only use this extension when you really, really understand what you are doing.
I suspect that the query you want is more like this:
SELECT sum(production_time) as total_minutes
FROM wishes w JOIN
gifts g
ON g.gift_number = w.gift_number;
Depending on how time is represented (an integer number of minutes? as a time? etc.) and how you want to see the results, you may need to use date/time functions. I would suggest you review them here and play with them to get the results you want.
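The assignment also asks for the total in days rounded up; assuming production_time is an integer number of minutes, that is just a ceiling division on the same sum. A sketch (Python with SQLite standing in for MySQL, using the sample data from the question):

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gifts (gift_number INTEGER, gift_name TEXT, production_time INTEGER);
    CREATE TABLE wishes (wish_number INTEGER, person_number INTEGER, gift_number INTEGER);
    INSERT INTO gifts VALUES (1,'gift1',130),(2,'gift2',140),(3,'gift3',200),(4,'gift4',100);
    INSERT INTO wishes VALUES (1,1,2),(2,1,4),(3,2,2),(4,3,1);
""")

# Total minutes across all wished-for gifts (a gift wished twice is made twice).
total_minutes = conn.execute("""
    SELECT SUM(production_time)
    FROM wishes w
    JOIN gifts g ON g.gift_number = w.gift_number
""").fetchone()[0]

total_days = math.ceil(total_minutes / (24 * 60))   # 1440 minutes per day
print(total_minutes, total_days)
```

In MySQL itself you could keep it all in SQL with something like CEIL(SUM(production_time) / 1440) AS total_days.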
I am developing an SSRS report with the following dataset. There is a filter for 'Period'. It is a multi-select filter. Data is grouped by 'Account' field. I need to display Total Expense for each group (which was easy). I also need to display 'Budget' on the same group level. The problem is the budget data is redundant - see below.
Say for the first group (Account=100 AND Period=201301), Sum([Budget]) would generate 200, which is not true. I can use the Average function which helps if user selects only one Period from the filter. If they select multiple values (e.g. 201301,201302) then the average will be (100+100+150+150)/4=125, which would be wrong because it has to be 100+150=250. I don't want to average among all rows in the returned dataset.
ID Account Period Expense Budget
1 100 201301 20 100
2 100 201301 30 100
3 100 201302 10 150
4 100 201302 40 150
5 200 ...................
So, how do I write an expression to make this happen?
A dirty workaround would be to eliminate redundant values in the Budget column so I can safely use Sum([Budget]) without worrying about duplication. The updated dataset would look like this:
ID Account Period Expense Budget
1 100 201301 20 100
2 100 201301 30 NULL
3 100 201302 10 150
4 100 201302 40 NULL
5 200 ...................
Please advise on either approach. Thank you.
The most elegant way is to use the FIRST() aggregate function.
=FIRST(Fields!Budget.Value, "MyAccountGroupName")
There are some situations where this won't work. Then you need to move the logic to your query as you describe or you can get fancy with embedded code in your report.
I would follow your "dirty workaround" approach. You might possibly be able to achieve the result just inside SSRS with some fancy calculations, but it will be totally obscure.
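If you go the query route, the "dirty workaround" can be produced with a window function (ROW_NUMBER() works the same way in SQL Server, which typically sits behind SSRS): keep Budget only on the first row of each (Account, Period) pair and NULL it elsewhere, so Sum([Budget]) in the report counts each period's budget exactly once. A sketch, exercised here against SQLite, which shares the syntax:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expense_data (ID INTEGER, Account INTEGER, Period TEXT,
                               Expense REAL, Budget REAL);
    INSERT INTO expense_data VALUES
        (1, 100, '201301', 20, 100),
        (2, 100, '201301', 30, 100),
        (3, 100, '201302', 10, 150),
        (4, 100, '201302', 40, 150);
""")

# Budget survives only on the first row per (Account, Period);
# the duplicates become NULL, which SUM() ignores.
rows = conn.execute("""
    SELECT ID, Account, Period, Expense,
           CASE WHEN ROW_NUMBER() OVER (
                    PARTITION BY Account, Period ORDER BY ID) = 1
                THEN Budget END AS Budget
    FROM expense_data
    ORDER BY ID
""").fetchall()
print(rows)
```

Summing the resulting Budget column for Account 100 then gives 100 + 150 = 250, as required.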
I am maintaining record of expenses an expenses table looks like this
Expenses(id,name)
Expenses_data(id,amount,expense_id)
Expenses span years, let's say 10 years, and I am saving them as months, so that would be 120 months.
If I had 10 expenses, then expenses_data would have 120*10 = 1200 rows.
I want to reduce it from 1200 rows to 120 rows, with the data laid out the way I enter it in Excel:
id | month   | marketing | electricity | bank charges
1  | month-1 | 100       | 200         | 300
2  | month-2 | 95.5      | 5000        | 100
Please suggest whether this is possible, and how.
I think you probably want to stick with the database structure you already have, but use a query to display the data in the format you wish.
If you think about the number of data-points you're storing, there's not much difference between your sought schema and what you already have -- it's still 1200 data-points of expenses. Having to upgrade your schema each time you add an expense column would be pretty invasive.
Sticking with a query for your excel export would allow the database to understand the concept of expense categories, and updating your export query to include the new category would be much easier than modifying the schema. The necessary JOINs could even be calculated programmatically by iterating an initial query of "What Expense Categories are known?"
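A conditional-aggregation query along these lines can do the pivot. Note this assumes expenses_data also carries a month column, which the question implies but the schema as written omits; the category names come from the example table. Sketched against SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expenses (id INTEGER, name TEXT);
    CREATE TABLE expenses_data (id INTEGER, amount REAL, expense_id INTEGER,
                                month TEXT);  -- month column assumed
    INSERT INTO expenses VALUES (1,'marketing'),(2,'electricity'),(3,'bank charges');
    INSERT INTO expenses_data VALUES
        (1, 100,  1, 'month-1'), (2, 200,  2, 'month-1'), (3, 300, 3, 'month-1'),
        (4, 95.5, 1, 'month-2'), (5, 5000, 2, 'month-2'), (6, 100, 3, 'month-2');
""")

# One output row per month; each expense category becomes a column via SUM(CASE ...).
rows = conn.execute("""
    SELECT d.month,
           SUM(CASE WHEN e.name = 'marketing'    THEN d.amount ELSE 0 END) AS marketing,
           SUM(CASE WHEN e.name = 'electricity'  THEN d.amount ELSE 0 END) AS electricity,
           SUM(CASE WHEN e.name = 'bank charges' THEN d.amount ELSE 0 END) AS bank_charges
    FROM expenses_data d
    JOIN expenses e ON e.id = d.expense_id
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()
print(rows)
```

Adding a new expense category then means updating this query (or generating the CASE branches from SELECT name FROM expenses), rather than altering the table schema.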