I have a table with just 21.5 million rows, representing properties sold across the UK since 1995. For each entry I've calculated a new price based on that year's inflation, and I now want to normalize this inflated price to a value between 1 and 100.
The average price in the table is 240000. The data is skewed such that 3/4 of it is below the average. Max is 150 million, min is 1000.
Normalizing the data using the SQL query below results in 20 million properties assigned the normalized price of 1.
UPDATE properties p
SET inflatedNorm = round(
1 + (
(p.inflatedPrice - MIN_PRICE) * (100 - 1) / (MAX_PRICE- MIN_PRICE)
)
);
What have I done wrong? Surely 20 million 1s is wrong and there should be a more varied spread of values, with most of them being around the average price.
Don't round the result! Let the database store decimal points. So:
UPDATE properties p
SET inflatedNorm = 1 + (p.inflatedPrice - MIN_PRICE) * (100.0 - 1) / (MAX_PRICE - MIN_PRICE);
The other issue is what the prices look like. I would start with:
select max(price), min(price)
from properties p;
If the maximum is more than 100 times a typical price, then you'll see exactly the phenomenon you are seeing. Only the range matters for your calculation, not the actual distribution within the range.
That is, if you consider the net worth of Americans and include Bill Gates in your data, then 99+% of Americans will have a net worth less than 1% of Bill Gates'.
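To make the range argument concrete, here is a minimal Python sketch of the same formula using the minimum (1000), maximum (150 million), and average (240000) quoted in the question. The function name `normalize` and the variable names are only illustrative.

```python
# Min and max as stated in the question.
MIN_PRICE = 1_000
MAX_PRICE = 150_000_000

def normalize(price):
    """Linear min-max scaling onto [1, 100], same formula as the UPDATE."""
    return 1 + (price - MIN_PRICE) * (100.0 - 1) / (MAX_PRICE - MIN_PRICE)

# The average price lands barely above 1, so rounding gives 1.
average = normalize(240_000)
print(average, round(average))

# Smallest price whose unrounded score reaches 1.5 (where rounding leaves 1).
threshold = MIN_PRICE + 0.5 * (MAX_PRICE - MIN_PRICE) / 99
print(threshold)
```

With these figures, every price below roughly 758,571 rounds to 1, which is consistent with about 20 million rows collapsing to the same value.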
Essentially the query works like this:
Select Sum(Ceiling((Exp_Cart)*Exp_Qty) from Table_X where Item in ('3')
I am receiving an error in my query and wondering if there is a better way to approach this.
The situation is I need to count whole cartons, and a lot of our data has cartons in fractions / sometimes we sell at quantities that result in partial cartons. For capacity purposes, we always round up.
Select Sum(Ceiling((Exp_Cart)*Exp_Qty) from Table_X where Item in ('3')
I expect it to round up, i.e.:
Carton = 0.25
Quantity = 4
Multiply gives me 1 carton
Sometimes we have orders of odd quantities which leads to partial cartons
Carton = 0.25
Quantity = 5
Multiply now gives me 1.25, which I want to round up to 2.
We then need to sum up all of the quantities to provide the correct carton-level information.
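As a rough check of the intended arithmetic, the sketch below rounds each line up to whole cartons and then sums, using the two examples above. It is written in Python rather than SQL; `total_cartons` and `order_lines` are hypothetical names. Note that the query as posted opens three parentheses before `Exp_Qty` but closes only two, so `Sum(` is never closed, which may be the source of the error. A balanced form would be `Sum(Ceiling(Exp_Cart * Exp_Qty))`.

```python
import math
from decimal import Decimal

def total_cartons(order_lines):
    """Round each line up to whole cartons, then add the lines together.

    order_lines: iterable of (carton_fraction, quantity) pairs.
    Decimal avoids binary floating-point noise pushing a product such as
    7.000000000000001 up to 8.
    """
    return sum(math.ceil(Decimal(carton) * qty) for carton, qty in order_lines)

# The two examples from the question: 0.25 * 4 = 1 and 0.25 * 5 = 1.25 -> 2.
order_lines = [("0.25", 4), ("0.25", 5)]
total = total_cartons(order_lines)
print(total)
```

The rounding has to happen per line, before the sum; rounding the summed total instead would give 3 here as well, but undercounts in general (two lines of 1.25 need 4 cartons, not 3).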
I have two tables stored in a MySQL database. The first contains pricing information about the SKUs I want to sell tomorrow. The second contains historical sales and visit data for SKUs. For each SKU, I want to calculate the conversion rate (orders divided by visits) at the price specified in tomorrow's pricing table, based on prior sales data at that price. This is conditional on the SKU having at least 100 visits, OR 50 visits and 1 order, at that price; if it doesn't, I want to add the visits and orders from the next highest price point.
I've created a table containing example data in an SQL fiddle
http://www.sqlfiddle.com/#!9/bf4273
Using the example data in the sqlfiddle, what I want to return then would be
ABC1, 100.0, 100, 0
ABC2, 75.0, 24, 0
ABC3, 35.0, 1312, 57
ABC4, 190, 55, 1
ABC5, 250, 72, 1
ABC6, 250.0, 80, 1
If you examine the data in the sql fiddle, you can see that ABC1 doesn't meet the count requirements at the specified $100 price point, so it rolls up the visits from the next higher price point, in that case 110, but keeps the reported price at 100.
ABC2 doesn't have enough visits at the specified price, but there aren't any visits at any price point higher than the specified one, so it just returns the 24 visits and 0 orders.
ABC3 meets the requirements, so it just returns the specified value.
ABC4 meets the requirement of at least 50 visits and at least 1 order, so it's returned.
ABC5 doesn't have enough visits and orders at the specified price, so it sums up the visits and orders to include the next highest price point. This logic is re-applied again if rolling up to the next highest price point still doesn't meet the requirements for 100 visits or 50 visits and 1 order.
ABC6 is similar to ABC5 except that now the orders at the specified price point for ABC6 for tomorrow is zero, but we still take the orders count from the next highest price point too (e.g. we sum up visits and orders from the next highest price point until we meet the necessary visits and/or orders requirements).
At the moment I've applied this logic inside Python, but it's a bit cumbersome and if possible I'd like to have it all in SQL. Can anyone help? Even telling me it's not possible to do this in MySQL would be valued.
Thanks,
Brad
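For anyone reading along, here is a minimal Python sketch of the roll-up rule as I read it from the description above. It is not the asker's actual script. The function name `rolled_up_conversion` and the sample rows are made up for illustration and are not the data in the SQL fiddle.

```python
def rolled_up_conversion(target_price, history):
    """Accumulate visits and orders from the target price upwards.

    history: list of (price, visits, orders) rows for a single SKU.
    Stops as soon as there are at least 100 visits, or at least 50 visits
    and 1 order; if no higher price point is left, returns what it has.
    """
    visits = orders = 0
    for price, v, o in sorted(history):
        if price < target_price:
            continue  # lower price points are never rolled in
        visits += v
        orders += o
        if visits >= 100 or (visits >= 50 and orders >= 1):
            break
    rate = orders / visits if visits else None
    return target_price, visits, orders, rate

# Hypothetical rows: not enough at 100.0, so the 110.0 row is added.
first = rolled_up_conversion(100.0, [(100.0, 60, 0), (110.0, 40, 0), (120.0, 500, 10)])
# Hypothetical rows: zero orders at 250.0, the order comes from 260.0.
second = rolled_up_conversion(250.0, [(250.0, 30, 0), (260.0, 50, 1)])
print(first, second)
```

The reported price stays at the target price even when higher price points contribute, matching the behaviour described for ABC1.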
I have an SQL script that calculates the discount for items. The calculation itself works fine, but I'm running into a problem with the cents in the price.
I have an item which is $9.95 and the discount is 50%, so its new price is $4.975. Since prices in the store should have a precision of 2, the value would be $4.97 (according to policy, discounted prices must take the lower value).
The problem is that this value is wrong: prices on the website have to match what's in store, and since in Australia the smallest change is 5 cents, a price has to be a multiple of that. A cashier can't give a customer 1 - 4 cents change, and people will complain if they buy an item online that was discounted to $4.97 and in store it's $4.95 (yes, people are that picky).
Is there a way to round to the nearest 5 or 0?
Will this work for you?
SELECT floor(price_after_discount * 20) / 20 AS price FROM table
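The idea behind multiplying by 20 is that there are twenty 5-cent steps in a dollar, so flooring the scaled value and dividing back always lands on a multiple of 0.05. A small Python sketch of the same arithmetic, using the $4.975 example from the question (`floor_to_five_cents` is an illustrative name, and `Decimal` is used here only to keep the cents exact):

```python
from decimal import Decimal, ROUND_FLOOR

def floor_to_five_cents(price):
    """Round a Decimal price down to the nearest multiple of 0.05."""
    # 20 five-cent steps per dollar: scale, drop the fraction, scale back.
    twentieths = (price * 20).to_integral_value(rounding=ROUND_FLOOR)
    return twentieths / 20

discounted = floor_to_five_cents(Decimal("4.975"))
print(discounted)
```

A price that is already a multiple of 5 cents, such as 9.95, comes back unchanged.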
In SQL Server I can do:
select
Val,
FLOOR(val * 20) / 20 Rounded_Val
from #Test
I have a table with transactions. All transactions are stored as positive numbers; whether it's a deposit or a withdrawal, only the action changes. How do I write a query that sums up the numbers based on the action?
Actions: 1 = Buy, 2 = Sell, 5 = Dividend
ID ACTION SYMBOL PRICE SHARES
1 1 AGNC 27.50 150
2 2 AGNC 30.00 50
3 5 AGNC 1.25 100
So the query should show AGNC has a total of 100 shares.
SELECT
symbol,sum(shares) AS shares,
ROUND(abs(sum((price * shares))),2) AS cost
FROM bf_transactions
WHERE (action_id <> 5)
GROUP BY symbol
HAVING sum(shares) > 0
I was originally using that query when I had positive/negative numbers and that worked great, but I don't know how to do it now with just positive numbers.
This ought to do it:
SELECT symbol, sum(case action
when 1 then shares
when 2 then -shares
end) as shares
FROM bf_transactions
GROUP BY symbol
SQL Fiddle here
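To check the CASE approach against the three sample rows from the question, here is a small sketch that runs the same query in an in-memory SQLite database from Python. SQLite stands in for whatever database the asker uses, and the column types are assumptions. The `action` column is quoted only as a precaution, since it is a keyword in some dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types are assumed; the rows are the sample data from the question.
    CREATE TABLE bf_transactions (id INTEGER PRIMARY KEY, "action" INTEGER,
                                  symbol TEXT, price REAL, shares INTEGER);
    INSERT INTO bf_transactions VALUES
        (1, 1, 'AGNC', 27.50, 150),
        (2, 2, 'AGNC', 30.00, 50),
        (3, 5, 'AGNC', 1.25, 100);
""")

# Buys add shares, sells subtract them; dividends (5) match no branch,
# yield NULL, and are ignored by SUM.
holdings = conn.execute("""
    SELECT symbol,
           SUM(CASE "action" WHEN 1 THEN shares WHEN 2 THEN -shares END) AS shares
    FROM bf_transactions
    GROUP BY symbol
""").fetchall()
print(holdings)
```

The dividend row needs no WHERE clause here because it contributes NULL to the sum.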
It is however good practice to denormalize this kind of data - what you appear to have now is a correctly normalized database with no duplicate data, but it's rather impractical to use as you can see in cases like this. You should keep a separate table with current stock portfolio that you update when a transaction is executed.
Also, including a HAVING-clause to 'hide' corrupted data (someone has sold more than they have purchased) seems rather bad practice to me - when a situation like that is detected you should definitely throw some kind of error, or an internal alert.
So this is for my dissertation and it is coming along pretty well. Almost finished it now xD
Anyway, I'm making a pub EPOS system in Access and it's all OK, except now I have reached the stock control.
To get the query (Stock = Stock - Sales) I need to do a count query, which is easy enough, though the problem with pubs is they often serve half pints...
Is there any way to get the SQL count function to count certain ProductIDs as 0.5?
This is a part of the table; Product IDs 2, 4, 6, 8 and 10 all relate to half pints, so count needs to recognise them as 0.5 instead of 1.
Image: http://img688.imageshack.us/img688/64/2121212e.png
Thanks
Sam
It seems that you need to separate products purchased from products sold. Let's say you call the products sold "servings":
product (productId, supplierId, orderQuantity, reorderTrigger)
serving (servingId, productId, servingDesc, volumeOfServe)
This way you can have two servings of the one product (e.g. Guinness):
servingDesc - "Guinness 1/2 pint" volumeOfServe - 0.5
servingDesc - "Guinness pint" volumeOfServe - 1
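With a serving table like that, the stock deduction becomes a SUM over volumeOfServe rather than a COUNT. A minimal sketch, run against in-memory SQLite from Python rather than Access, and assuming a hypothetical `sale` table with one row per serving sold (that table and its columns are not from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE serving (servingId INTEGER PRIMARY KEY, productId INTEGER,
                          servingDesc TEXT, volumeOfServe REAL);
    -- Hypothetical sales table: one row per serving sold.
    CREATE TABLE sale (saleId INTEGER PRIMARY KEY, servingId INTEGER);
    INSERT INTO serving VALUES
        (1, 1, 'Guinness pint', 1.0),
        (2, 1, 'Guinness 1/2 pint', 0.5);
    -- One pint and two half pints sold.
    INSERT INTO sale (servingId) VALUES (1), (2), (2);
""")

# Sum the served volume per product instead of counting rows.
pints_sold = conn.execute("""
    SELECT sv.productId, SUM(sv.volumeOfServe)
    FROM sale AS sa
    JOIN serving AS sv ON sv.servingId = sa.servingId
    GROUP BY sv.productId
""").fetchall()
print(pints_sold)
```

Three sales rows then deduct 2 pints from stock rather than 3, and the join syntax may need adjusting for Access.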