Write a MySQL query to get the required result

I am working with a MySQL database.
There are four risk factors: Critical, High, Moderate, and Low.
The table contains data like this:
id | uaid | attribute | value | time | risk factor
1 | 1234 | Edge | Exist | 16123 | NONE
2 | 1234 | Edge | Not Exist | 16124 | CRITICAL
3 | 1234 | Edge | Exist | 16125 | NONE
4 | 1237 | Chrome | Exist | 124745 | NONE
5 | 1237 | Chrome | Not Exist | 124759 | HIGH
The required result should look like this:
Attribute | Risk Factor | UAID | Failed Value | Present Value
Edge | CRITICAL | 1234 | Not Exist | Exist
Chrome | HIGH | 1237 | Not Exist | Not Exist
Explanation:
We need to show the rows whose risk factor is Critical, Moderate, High, or Low.
Failed Value = the value of the attribute at the latest time its risk factor was Critical.
Present Value = the current value of that attribute in the database.
I have tried a solution with two SQL queries: one to get the rows whose risk factor is Critical, and a second to get the current value of each unique attribute, followed by some formatting of the data from both queries.
I am looking for a solution that removes the extra overhead of formatting the data to fit the requirement.
Schema: table(id, uaid, attribute, value, time, risk_factor)

If I understand correctly, you want the last value that is one of the four that you specify (i.e. not 'NONE'). Window functions are probably the simplest solution:
select t.*
from (select t.*,
             first_value(value) over (partition by uaid order by id desc) as current_value
      from t
     ) t
where risk_factor <> 'NONE';
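As a hedged sketch of how the query above might be extended to produce the exact layout asked for in the question: the innermost query attaches the current value to every row, the middle query keeps only the rows whose risk factor is not 'NONE' and ranks them newest first, and the outer query keeps the newest one. SQLite (through Python's sqlite3 module) stands in for MySQL 8 here so the example is runnable; the table name t, partitioning by uaid and attribute, and ordering by time are assumptions, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, uaid INTEGER, attribute TEXT, value TEXT,
                time INTEGER, risk_factor TEXT);
INSERT INTO t VALUES
  (1, 1234, 'Edge',   'Exist',     16123,  'NONE'),
  (2, 1234, 'Edge',   'Not Exist', 16124,  'CRITICAL'),
  (3, 1234, 'Edge',   'Exist',     16125,  'NONE'),
  (4, 1237, 'Chrome', 'Exist',     124745, 'NONE'),
  (5, 1237, 'Chrome', 'Not Exist', 124759, 'HIGH');
""")

query = """
SELECT attribute, risk_factor, uaid, value AS failed_value, current_value
FROM (
    -- rank only the non-NONE rows, newest first (WHERE runs before the window)
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY uaid, attribute
                              ORDER BY time DESC) AS rn
    FROM (
        -- attach the latest value of each attribute to every row
        SELECT t.*,
               FIRST_VALUE(value) OVER (PARTITION BY uaid, attribute
                                        ORDER BY time DESC) AS current_value
        FROM t
    ) s
    WHERE risk_factor <> 'NONE'
) r
WHERE rn = 1
ORDER BY uaid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data from the question this yields one row for Edge and one for Chrome, matching the required result table.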

Related

MYSQL Alternative to UNION for same table reusing same columns selected as new name

I'm trying to generate a result set from a table with effectively a unique/primary key as billyear, billmonth and type along with cost and consumption. So there could be 3 bill year and bill month identical entries but the type could be one of three values: E, W or NG.
I need to create a result set that has just one row per billyear and billmonth entry.
(
select month as billmonth, year as billyear, cost_estimate as eleccost, consumption_estimate as eleccons from tblbillforecast where buildingid=19 and type='E'
)
UNION (
select month as billmonth, year as billyear, cost_estimate as gascost, consumption_estimate as gascons from tblbillforecast where buildingid=19 and type='NG'
)
UNION (
select month as billmonth, year as billyear, cost_estimate as watercost, consumption_estimate as watercons from tblbillforecast where buildingid=19 and type='W'
)
This generates a result set with only billmonth, billyear, eleccost and eleccons columns. I've tried all kinds of solutions but the above example is the simplest to show where it's going wrong.
Additionally it still has 3 rows per billmonth/billyear unique combination instead of merging to one.
UPDATE:
Sample data
SELECT month AS billmonth,
       year AS billyear,
       SUM(CASE type WHEN 'E' THEN cost_estimate END) AS eleccost,
       SUM(CASE type WHEN 'NG' THEN cost_estimate END) AS gascost,
       SUM(CASE type WHEN 'W' THEN cost_estimate END) AS watercost
FROM tblbillforecast
WHERE buildingid=19
GROUP BY billmonth, billyear;
Result:
Expected result, eg:
year | month | eleccost | gascost | watercost
2018 | 1 | 32800 | 4460 | 4750
This is behaving correctly. An SQL query result set has one name per column, and this name applies to all the rows. So if you try to rename the column in the second or subsequent queries of the UNION, those new names are ignored. The name of the column is determined only by the first query of the UNION.
"Additionally it still has 3 rows per billmonth/billyear unique combination instead of merging to one."
That's also correct behavior, according to the query you tried. UNION does not merge multiple rows into one, it only appends sets of rows.
As Akina hinted in the comments above, you may use multiple columns:
SELECT month AS billmonth,
       year AS billyear,
       SUM(CASE type WHEN 'E' THEN cost_estimate END) AS eleccost,
       SUM(CASE type WHEN 'NG' THEN cost_estimate END) AS gascost,
       SUM(CASE type WHEN 'W' THEN cost_estimate END) AS watercost
FROM tblbillforecast
WHERE buildingid=19
GROUP BY billmonth, billyear;
This uses GROUP BY to "merge" rows together, so you get one row in the result per month/year.
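To make the effect of the conditional aggregation concrete, here is a minimal runnable sketch. SQLite via Python's sqlite3 module stands in for MySQL; the January figures are taken from the expected result in the question, while the February rows (with no gas bill) are made up purely to show what happens when one type is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblbillforecast (buildingid INTEGER, year INTEGER, month INTEGER,
                              type TEXT, cost_estimate INTEGER);
INSERT INTO tblbillforecast VALUES
  (19, 2018, 1, 'E',  32800),
  (19, 2018, 1, 'NG', 4460),
  (19, 2018, 1, 'W',  4750),
  (19, 2018, 2, 'E',  30100),   -- hypothetical February rows, no 'NG' bill
  (19, 2018, 2, 'W',  4600);
""")

pivot = conn.execute("""
SELECT year  AS billyear,
       month AS billmonth,
       SUM(CASE type WHEN 'E'  THEN cost_estimate END) AS eleccost,
       SUM(CASE type WHEN 'NG' THEN cost_estimate END) AS gascost,
       SUM(CASE type WHEN 'W'  THEN cost_estimate END) AS watercost
FROM tblbillforecast
WHERE buildingid = 19
GROUP BY year, month
ORDER BY year, month
""").fetchall()

for row in pivot:
    print(row)
```

Each year/month pair collapses to a single row, and a type with no bill for that month comes back as NULL (None in Python) rather than dropping the row.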
A quick bit of guidance on various data shaping operations in SQL:
JOIN - makes resultsets wider (more columns) by bringing together tables/resultsets in a side-by-side fashion generating output rows that have all the columns of the two input column sets
SELECT - typically makes resultsets narrower by allowing you to specify which columns you're interested in and which you are not; by not mentioning an available column it disappears meaning you output fewer columns
UNION - makes resultsets taller (more rows) by bringing together resultsets and outputting one on top of the other. Because columns always have a fixed data type and one name, you must have the same number of and type of, and order of columns
WHERE - makes resultsets shorter (fewer rows) by allowing you to specify truth based filters that exclude rows
It's not hard and fast; you can use select to create more columns too, but just in a very rudimentary sense these concepts hold true - JOIN to widen, UNION for taller, SELECT for narrower and WHERE for shorter. All the work you do with SQL is a data shaping exercise; you're either paring a rectangular block of data down or extending it, and in either a vertical or horizontal direction (or a mix).
I'm not going to get into grouping because that mixes rows up, and isn't something you tried in the question. The reason for me writing this out was purely because you'd attempted to use a UNION (height-increasing) operation when you actually wanted a widening operation which, regardless of how it is done (JOIN or, as per Bill's answer, a SELECT+GROUP, which is valid but relies on the "mixes rows up" aspect of grouping), specifically isn't done with a UNION. UNION only makes stuff taller.
To give an example of how it might be done in an alternative way to Bill's approach, this task of yours has one huge table that is "too tall" - it uses 3 rows where 1 would do, if only it were a bit wider. That is to say if only there were 3 columns for electric/gas/water then we wouldn't need 3 rows with 1 utility in each.
Of course, we have this "one utility per row" because it is very flexible. Database tables don't have varying numbers of columns but they DO have varying numbers of rows. If a new bill type came along tomorrow (internet, say), no table changes are needed to accommodate it; add a new type I, and away you go, adding another row. We now store 4 rows of 1 utility where 1 row with 4 columns would do, but crucially we didn't have to change the table structure. We could have infinite different kinds of bills, and not need infinite columns because we can already have infinite rows.
So you want to reshape your data from 4-rows-by-1-column to 1-row-by-4-columns. It could be solved as follows:
narrow the table to just year,month,building,type,cost AND shorten it to just electricity
separately narrow the table to just year,month,building,type,cost AND shorten it to just gas
separately narrow the table to just year,month,building,type,cost AND shorten it to just water
join (widening) all these newly created result sets , then narrow to remove the repeated year,month,building,type columns
That would look like:
SELECT e.year, e.month, e.building, e.cost, g.cost, w.cost
FROM
    (SELECT year,month,building,cost FROM t WHERE type = 'E') e
JOIN
    (SELECT year,month,building,cost FROM t WHERE type = 'NG') g
ON
    e.year = g.year AND e.month = g.month AND e.building = g.building
JOIN
    (SELECT year,month,building,cost FROM t WHERE type = 'W') w
ON
    e.year = w.year AND e.month = w.month AND e.building = w.building
WHERE
    e.building = 19
You can see clearly the 3 narrowing-and-shortening operations that pick out "just the gas", "just the electric", and "just the water": they're the subqueries such as (SELECT year,month,building,cost FROM t WHERE type = 'NG'), and that's what reduces the height of the original table, making it three times shorter than it was in each case. If we had 999 rows x 5 columns in the big table, it goes to 3 sets of 333 rows x 5 columns each.
You can see that we then JOIN these together to widen the results: our 3 sets of 333 rows x 5 columns each widen to 333 x 15 when JOINed.
Then it went from 333 x 15 down to 333 x 7 when SELECTed to ditch the repeated columns.
It's likely not perfect (I'd perhaps left join all 3 onto a 4th set of numbers that are just the common columns in case some utilities aren't present for a particular month), and perhaps some people will come along complaining that it's less performant because it hits the table 3 times. All that is accessory to the point I'm making about SQL being an exercise in reshaping data: tables are the starting blocks of data and you cut them up narrower and shorter, then stick them together side by side, or on top of each other, and that becomes your new data block that's maybe wider, higher, or both. In any case it's definitely a different shape to what you started with. And then you can cut and shape again, and again.
Go with Bill's conditional agg (though this way would be fine if there is one row per building/year/month) but take away a stronger notion about in what direction these common operations (SELECT/JOIN/WHERE/UNION) reshape your data.
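The "left join all 3 onto a 4th set" variation mentioned above could look roughly like the following sketch. It runs on SQLite via Python's sqlite3 module rather than MySQL, uses the simplified table t from the example query, and the rows are illustrative (February deliberately has no gas bill); none of this is taken from the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (year INTEGER, month INTEGER, building INTEGER,
                type TEXT, cost INTEGER);
INSERT INTO t VALUES
  (2018, 1, 19, 'E', 32800), (2018, 1, 19, 'NG', 4460), (2018, 1, 19, 'W', 4750),
  (2018, 2, 19, 'E', 30100), (2018, 2, 19, 'W', 4600);
""")

wide = conn.execute("""
SELECT k.year, k.month,
       e.cost AS eleccost, g.cost AS gascost, w.cost AS watercost
FROM
    -- the "4th set": just the common key columns, one row per month
    (SELECT DISTINCT year, month, building FROM t WHERE building = 19) k
LEFT JOIN
    (SELECT year, month, building, cost FROM t WHERE type = 'E') e
    ON e.year = k.year AND e.month = k.month AND e.building = k.building
LEFT JOIN
    (SELECT year, month, building, cost FROM t WHERE type = 'NG') g
    ON g.year = k.year AND g.month = k.month AND g.building = k.building
LEFT JOIN
    (SELECT year, month, building, cost FROM t WHERE type = 'W') w
    ON w.year = k.year AND w.month = k.month AND w.building = k.building
ORDER BY k.year, k.month
""").fetchall()

for row in wide:
    print(row)
```

With plain inner joins the February row would vanish because it has no gas entry; anchoring on the key set keeps it and leaves gascost as NULL.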
Footnote about Bill's conditional aggregation (I know I said I wouldn't talk about it but it might make more sense to now). If you have:
Type, Cost
E, 123
NG, 456
W, 789
And you do a
SELECT
CASE WHEN Type = 'E' THEN Cost END as CostE,
CASE WHEN Type = 'NG' THEN Cost END as CostG,
CASE WHEN Type = 'W' THEN Cost END as CostW
...
It spreads the data out over more columns - the data has "gone from vertical to diagonal"
CostE, CostNG, CostW
123, NULL, NULL
NULL, 456, NULL
NULL, NULL, 789
But it's still too tall. If you then run a GROUP BY, which mixes rows up and ask for e.g. just the MAX from each column, then all the NULLs will disappear (because there is a non null somewhere in the column, and NULL is lost if there is a non null, no matter what you're doing) and the rows collapse, mixing together, into one:
CostE, CostNG, CostW
123, 456, 789
The data has pivoted round from being vertical, to being horizontal - another data shaping. It was pulled wider, and squashed flatter

SQL - Add To Existing Average

I'm trying to build a reporting table to track server traffic and popularity overall. Each SID is a unique game server hosting a particular game, and each UCID is a unique player key connecting to that server.
Say I have a table like so:
SID UCID AvgTime NumConnects
-----------------------------------------
1 AIE9348ietjg 300.55 5
1 Po328gieijge 500.66 7
2 AIE9348ietjg 234.55 3
3 Po328gieijge 1049.88 18
We can see that there are 2 unique players, and 3 unique servers, with SID 1 having 2 players that have connected to it at some point in the past. The AvgTime is the average amount of time those players spent on that server (in seconds), and the NumConnects is the size of the average (ie. 300.55 is averaged out of 5 elements).
Now I run a job in the background where I process a raw connection table and pull out player connections like so:
SID UCID ConnectTime DisconnectTime
-----------------------------------------
1 AIE9348ietjg 90.35 458.32
2 Po328gieijge 30.12 87.15
2 AIE9348ietjg 173.12 345.35
This table has no ID or other fluff to help condense my example. There may be multiple connect/disconnect records for multiple players in this table. What I want to do is add to my existing AvgTime for each SID these new values.
There is a formula from here I am trying to use (taken from this math stackexchange: https://math.stackexchange.com/questions/1153794/adding-to-an-average-without-unknown-total-sum/1153800#1153800)
Average = (Average * Size + NewValue) / (Size + 1)
How can I write an update query to update each ServerIDs traffic table above, and add to the average using the above formula for each pair of records. I tried something like the following but it didn't work (returned back null):
UPDATE server_traffic st
LEFT JOIN connect_log l
ON st.SID = l.SID AND st.UCID = l.UCID
SET AvgTime = (AvgTime * NumConnects + SUM(l.DisconnectTime - l.ConnectTime) / NumConnects + COUNT(l.UCID)
I would prefer an answer in MySql, but I'll accept MS SQL as well.
EDIT
I understand that statistics and calculations are generally not to be stored in tables and that you can run reports that would crunch the numbers for you. My requirement is that users can go to a website and view the popularity of various servers. This needs to be done in a way that
A: running a complex query per user doesn't crash or slow down the system
B: the page returns the data within a few seconds at most
See this example here: https://bf4stats.com/pc/shinku555555
This is a web page for battlefield 4 stats - notice that the load is almost near instant for this player, and I get back a load of statistics without waiting for some complex report query to return the data. I'm assuming they store these calculations in preprocessed tables where the webpage just needs to do a simple select to return back the values. That's the same approach I want to take with my Database and Web Application design.
Sorry if this is off topic to the original question - but hopefully this adds additional context that helps people understand my needs.
Aggregate functions like SUM and COUNT cannot be used directly in the SET clause of a row-level UPDATE; they must be contained in an aggregate query. Consider joining to an aggregate subquery in the UPDATE...LEFT JOIN. Also, adjust the parentheses in SET to match the formula above.
Also, note that since you use LEFT JOIN, rows without a matching ID will have NULL for the aggregate fields, and any arithmetic involving NULL returns NULL. You can convert NULL to zero with IFNULL(), but the formula's division may still fail.
UPDATE server_traffic s
LEFT JOIN
    (SELECT SID, UCID, COUNT(UCID) AS GrpCount,
            SUM(DisconnectTime - ConnectTime) AS SumTimeDiff
     FROM connect_log
     GROUP BY SID, UCID) l
    ON s.SID = l.SID AND s.UCID = l.UCID
-- divide by the combined count (existing size + new records), as the formula requires
SET s.AvgTime = (s.AvgTime * s.NumConnects + l.SumTimeDiff) / (s.NumConnects + l.GrpCount)
Aside - reconsider saving calculations/statistics within tables as they can always be run by queries even by timestamps. Ideally, database tables should store raw values.
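As a worked check of the arithmetic, here is a minimal sketch that folds the sample connect_log rows into the sample server_traffic rows. It uses SQLite through Python's sqlite3 module instead of MySQL, aggregates the log per (SID, UCID) first, and then applies one UPDATE per batch. Updating NumConnects alongside AvgTime is an addition not present in the query above; in SQLite both SET expressions see the old row values, whereas a MySQL single-table UPDATE evaluates assignments left to right, so there AvgTime would need to be assigned first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE server_traffic (SID INTEGER, UCID TEXT, AvgTime REAL, NumConnects INTEGER);
CREATE TABLE connect_log (SID INTEGER, UCID TEXT, ConnectTime REAL, DisconnectTime REAL);
INSERT INTO server_traffic VALUES
  (1, 'AIE9348ietjg', 300.55, 5),
  (1, 'Po328gieijge', 500.66, 7),
  (2, 'AIE9348ietjg', 234.55, 3),
  (3, 'Po328gieijge', 1049.88, 18);
INSERT INTO connect_log VALUES
  (1, 'AIE9348ietjg', 90.35, 458.32),
  (2, 'Po328gieijge', 30.12, 87.15),
  (2, 'AIE9348ietjg', 173.12, 345.35);
""")

# One batch per (SID, UCID): number of new sessions and their total duration.
batches = conn.execute("""
SELECT SID, UCID, COUNT(*), SUM(DisconnectTime - ConnectTime)
FROM connect_log
GROUP BY SID, UCID
""").fetchall()

for sid, ucid, new_count, new_total in batches:
    # new average = (old average * old size + sum of new values) / (old size + new count)
    conn.execute(
        """UPDATE server_traffic
           SET AvgTime = (AvgTime * NumConnects + ?) / (NumConnects + ?),
               NumConnects = NumConnects + ?
           WHERE SID = ? AND UCID = ?""",
        (new_total, new_count, new_count, sid, ucid),
    )

result = {
    (sid, ucid): (avg, n)
    for sid, ucid, avg, n in conn.execute(
        "SELECT SID, UCID, AvgTime, NumConnects FROM server_traffic")
}
for key, value in sorted(result.items()):
    print(key, value)
```

Rows with no new log entries are left untouched, and the log entry for SID 2 / Po328gieijge updates nothing because no matching server_traffic row exists yet; inserting such rows would be a separate step.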

How to balance out row mode and column mode in cygnus?

I have a weather-station that transmits data each hour. During that hour it makes four recordings (one every 15 minutes). In my current situation I am using attr_persistence=row to store data in my MySql database.
With row mode I get the default generated columns:
recvTimeTs | recvTime | entityId | entityType | attrName | attrType | attrValue | attrMd
But my weather station sends me the following data:
| attrName | attrValue
timeRecorded 14:30:0,22.5.2015
measurement1 18.799
measurement2 94.0
measurement3 1.19
These attrValue are represented in the database as string.
Is there a way to leave the three measurements in row mode and switch the timeRecorded to column mode? And if not, then what is my alternative?
The point of all this is to query the time recorded value, but I cannot query date as long as it is string.
As a side note: having the weather station send the data as soon as it is recorded (every 15 minutes) is out of the question, firstly because I need to conserve battery power and more importantly because in the case of a problem with the post, it will send all of the recordings at once.
So if an entire day went without sending any data, the weather station will send all 24*4 readings at once...
The proposed solution is to use the STR_TO_DATE function of MySQL in order to translate the stored string-based "timeRecorded" attribute into a real MySQL Timestamp type.
Nevertheless, the "timeRecorded" attribute appears only once every 4 rows in the table, due to the "row" attribute persistence mode of OrionMySQLSink. MySQL has no ROWNUM keyword, so rather than picking every fourth row you can filter on the attribute name, with a format string that matches the sample value above (14:30:0,22.5.2015); something like this (not an expert on MySQL):
SELECT STR_TO_DATE(attrValue, '%H:%i:%s,%d.%m.%Y') FROM def_servpath_0004_weatherstation WHERE attrName = 'timeRecorded';
The alternative is to move to "column" mode (in this case you have to provision the table yourself). In this mode you get a single row holding all 4 attributes, one of which is "timeRecorded". You can then provision the table with the "timeRecorded" column typed as Timestamp instead of Text, which avoids the STR_TO_DATE step.
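To illustrate what the column-style layout buys you, the following sketch turns the four row-mode records from the question into one typed record. It is plain Python, not Cygnus code; the function name to_column_record is hypothetical, and the format string assumes the value is time followed by day.month.year, which is inferred from the single sample value.

```python
from datetime import datetime

# Row-mode records for one reading: (attrName, attrValue), all stored as strings.
rows = [
    ("timeRecorded", "14:30:0,22.5.2015"),
    ("measurement1", "18.799"),
    ("measurement2", "94.0"),
    ("measurement3", "1.19"),
]

def to_column_record(rows):
    """Pivot name/value string pairs into one record with real types."""
    record = {}
    for name, value in rows:
        if name == "timeRecorded":
            # assumed layout: hour:minute:second,day.month.year
            record[name] = datetime.strptime(value, "%H:%M:%S,%d.%m.%Y")
        else:
            record[name] = float(value)
    return record

record = to_column_record(rows)
print(record)
```

Once the time is a real timestamp instead of a string, range comparisons and sorting by date behave as expected.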

MySQL- Counting rows VS Setting up a counter

I have 2 tables posts<id, user_id, text, votes_counter, created> and votes<id, post_id, user_id, vote>. Here the vote column can be either 1 (upvote) or -1 (downvote). Now if I need to fetch the total votes (upvotes - downvotes) on a post, I can do it in 2 ways.
Use count(*) to count the number of upvotes and downvotes on that post from votes table and then do the maths.
Set up a counter column votes_counter and increment or decrement it everytime a user upvotes or downvotes. Then simply extract that votes_counter.
My question is which one is better and under what conditions. By conditions, I mean factors like scalability, peak time, et cetera.
As far as I know, if I use method 1 on a table with millions of rows, count(*) could be a heavy operation. If I use a counter to avoid that, then during peak time the votes_counter column might get deadlocked, with too many users trying to update the counter!
Is there a third way better than both and as simple to implement?
The two approaches represent a common tradeoff between complexity of implementation and speed.
The first approach is very simple to implement, because it does not require you to do any additional coding.
The second approach is potentially a lot faster, especially when you need to count a small percentage of items in a large table.
The first approach can be sped up by well-designed indexes. Rather than searching through the whole table, your RDBMS could retrieve a few records from the index and do the counts using them.
The second approach can become very complex very quickly:
You need to consider what happens to the counts when a user gets deleted
You should consider what happens when the table of votes is manipulated by tools outside your program. For example, merging records from two databases may prove a lot more complex when the current counts are stored along with the individual ones.
I would start with the first approach, and see how it performs. Then I would try optimizing it with indexing. Finally, I would consider going with the second approach, possibly writing triggers to update counts automatically.
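The progression described above (count first, add an index, then consider a trigger-maintained counter) can be sketched side by side. This runs on SQLite via Python's sqlite3 module; the index and trigger names are made up, and MySQL's trigger syntax differs slightly (it requires FOR EACH ROW), so treat it as an illustration rather than a drop-in script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, votes_counter INTEGER NOT NULL DEFAULT 0);
CREATE TABLE votes (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER, vote INTEGER);

-- Approach 1 helper: an index that lets the per-post sum read only index entries.
CREATE INDEX idx_votes_post ON votes (post_id, vote);

-- Approach 2: keep the counter in step with every inserted vote.
CREATE TRIGGER votes_after_insert AFTER INSERT ON votes
BEGIN
    UPDATE posts SET votes_counter = votes_counter + NEW.vote WHERE id = NEW.post_id;
END;

INSERT INTO posts (id) VALUES (1), (2);
INSERT INTO votes (post_id, user_id, vote) VALUES
  (1, 10, 1), (1, 11, 1), (1, 12, -1), (2, 10, -1);
""")

# Approach 1: compute the score on demand from the votes table.
counted = conn.execute("""
SELECT p.id, COALESCE(SUM(v.vote), 0)
FROM posts p LEFT JOIN votes v ON v.post_id = p.id
GROUP BY p.id ORDER BY p.id
""").fetchall()

# Approach 2: read the stored counter.
stored = conn.execute("SELECT id, votes_counter FROM posts ORDER BY id").fetchall()

print(counted, stored)
```

Both approaches agree here, but only inserts are covered; deleted or changed votes would need their own triggers, which is part of the extra complexity mentioned above.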
As this sounds a lot like StackExchange, I'll refer you to this answer on the meta about the database schema used on the site. The votes table looks like this:
Votes table:
Id
PostId
VoteTypeId, one of the following values:
1 - AcceptedByOriginator
2 - UpMod
3 - DownMod
4 - Offensive
5 - Favorite (if VoteTypeId = 5, UserId will be populated)
6 - Close
7 - Reopen
8 - BountyStart (if VoteTypeId = 8, UserId will be populated)
9 - BountyClose
10 - Deletion
11 - Undeletion
12 - Spam
15 - ModeratorReview
16 - ApproveEditSuggestion
UserId (only present if VoteTypeId is 5 or 8)
CreationDate
BountyAmount (only present if VoteTypeId is 8 or 9)
And so based on that it sounds like the way it would be run is:
SELECT VoteTypeId FROM Votes WHERE VoteTypeId = 2 OR VoteTypeId = 3
And then based on the value, do the maths:
int score = 0;
for each vote in voteQueryResults
if(vote == 2) score++;
if(vote == 3) score--;
Even with millions of results, this is probably going to be a very fast operation as it's so simple.

Adding or subtracting values based on another field's contents

I have a table with transactions. All transactions are stored as positive numbers; whether it's a deposit or a withdrawal, only the action changes. How do I write a query that sums up the numbers based on the action?
-actions-
1 Buy 2 Sell 5 Dividend
ID ACTION SYMBOL PRICE SHARES
1 1 AGNC 27.50 150
2 2 AGNC 30.00 50
3 5 AGNC 1.25 100
So the query should show AGNC has a total of 100 shares.
SELECT
    symbol, sum(shares) AS shares,
    ROUND(abs(sum((price * shares))),2) AS cost
FROM bf_transactions
WHERE (action_id <> 5)
GROUP BY symbol
HAVING sum(shares) > 0
I was originally using that query when I had positive/negative numbers, and that worked great, but I don't know how to do it now with just positive numbers.
This ought to do it:
SELECT symbol,
       sum(case action
               when 1 then shares
               when 2 then -shares
           end) as shares
FROM bf_transactions
GROUP BY symbol
SQL Fiddle here
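For a quick check against the sample rows in the question, here is a small runnable sketch of the same query. SQLite via Python's sqlite3 module stands in for MySQL, and the column is named action as in the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bf_transactions (id INTEGER, action INTEGER, symbol TEXT,
                              price REAL, shares INTEGER);
INSERT INTO bf_transactions VALUES
  (1, 1, 'AGNC', 27.50, 150),   -- buy
  (2, 2, 'AGNC', 30.00, 50),    -- sell
  (3, 5, 'AGNC', 1.25, 100);    -- dividend: no CASE branch, so it contributes NULL
""")

totals = conn.execute("""
SELECT symbol,
       SUM(CASE action
               WHEN 1 THEN shares
               WHEN 2 THEN -shares
           END) AS shares
FROM bf_transactions
GROUP BY symbol
""").fetchall()

print(totals)
```

The dividend row falls through the CASE and yields NULL, which SUM ignores, so AGNC comes out at 150 - 50 = 100 shares.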
It is however good practice to denormalize this kind of data - what you appear to have now is a correctly normalized database with no duplicate data, but it's rather impractical to use as you can see in cases like this. You should keep a separate table with current stock portfolio that you update when a transaction is executed.
Also, including a HAVING-clause to 'hide' corrupted data (someone has sold more than they have purchased) seems rather bad practice to me - when a situation like that is detected you should definitely throw some kind of error, or an internal alert.