How to balance out row mode and column mode in cygnus? - fiware

I have a weather station that transmits data once an hour. During that hour it makes four recordings (one every 15 minutes). I currently use attr_persistence=row to store the data in my MySQL database.
With row mode I get the default generated columns:
recvTimeTs | recvTime | entityId | entityType | attrName | attrType | attrValue | attrMd
But my weather station sends me the following data:
attrName     | attrValue
timeRecorded | 14:30:0,22.5.2015
measurement1 | 18.799
measurement2 | 94.0
measurement3 | 1.19
These attrValue entries are stored in the database as strings.
Is there a way to leave the three measurements in row mode and switch timeRecorded to column mode? And if not, what is my alternative?
The point of all this is to query on the time recorded value, but I cannot query by date as long as it is stored as a string.
As a side note: having the weather station send the data as soon as it is recorded (every 15 minutes) is out of the question, firstly because I need to conserve battery power and more importantly because in the case of a problem with the post, it will send all of the recordings at once.
So if an entire day went without sending any data, the weather station will send all 24*4 readings at once...

The proposed solution is to use the STR_TO_DATE function of MySQL in order to translate the stored string-based "timeRecorded" attribute into a real MySQL Timestamp type.
Nevertheless, the "timeRecorded" attribute appears only once every 4 rows in the table, due to the "row" attribute persistence mode of OrionMySQLSink. MySQL has no ROWNUM keyword (that is Oracle), so rather than counting rows you can filter on the attrName column, and the format string has to match the stored value (e.g. 14:30:0,22.5.2015). Something like (not an expert on MySQL):
SELECT STR_TO_DATE(attrValue, '%H:%i:%s,%d.%m.%Y') FROM def_servpath_0004_weatherstation WHERE attrName = 'timeRecorded';
The alternative is to move to "column" mode (in this case you have to provision the table yourself). In this mode you get a single row holding all 4 attributes, one of which is "timeRecorded". You can then provision the table by directly specifying the type of the "timeRecorded" column as Timestamp instead of Text. That way, you avoid the STR_TO_DATE part.
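To illustrate the string-to-timestamp step outside the database, here is a minimal Python sketch that parses the sample value from the question. The helper name parse_time_recorded and the constant are made up for this example; the format string is an assumption derived from the single sample value "14:30:0,22.5.2015" (hour:minute:second, then day.month.year). The MySQL equivalent of this format would presumably be '%H:%i:%s,%d.%m.%Y'.

```python
from datetime import datetime

# Assumed layout of the station's value, inferred from the one sample
# in the question: "14:30:0,22.5.2015" (no zero padding).
TIME_RECORDED_FORMAT = "%H:%M:%S,%d.%m.%Y"


def parse_time_recorded(raw: str) -> datetime:
    """Turn the string stored in attrValue into a real datetime."""
    return datetime.strptime(raw, TIME_RECORDED_FORMAT)


parsed = parse_time_recorded("14:30:0,22.5.2015")
print(parsed.isoformat())
```

Once the value is a real timestamp, range queries by date become ordinary comparisons, which is the point of converting the column.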

Related

Write a MySQL query to get required result

I am working with MySQL database.
There are four types of risk factor: Critical, High, Moderate, Low.
Table contains data like:
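id | uaid | attribute | value     | time   | risk factor
---------------------------------------------------------
1  | 1234 | Edge      | Exist     | 16123  | NONE
2  | 1234 | Edge      | Not Exist | 16124  | CRITICAL
3  | 1234 | Edge      | Exist     | 16125  | NONE
4  | 1237 | Chrome    | Exist     | 124745 | NONE
5  | 1237 | Chrome    | Not Exist | 124759 | HIGH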
the required result should be like below:
Attribute | Risk Factor | UAID | Failed Value | Present Value
--------------------------------------------------------------
Edge      | CRITICAL    | 1234 | Not Exist    | Exist
Chrome    | HIGH        | 1237 | Not Exist    | Not Exist
Explanation:
We need to show the rows whose risk factor is Critical, Moderate, High, or Low.
Failed Value = the value of the attribute at the (latest) time when its risk factor was critical.
Present Value = the current value of that attribute in the database.
I have tried a solution with two SQL queries: one to get the rows whose risk factor is critical, and a second to get the current value of each unique attribute, followed by some formatting of the data from both queries.
I am looking for a solution that removes the extra overhead of formatting the data to match the requirement.
Schema table(id,uaid,attribute,value,time,risk_factor)
If I understand correctly, you want the last value next to each row whose risk factor is one of the four that you specify (i.e. not 'NONE'). Window functions are probably the simplest solution:
select t.*
from (select t.*,
             first_value(value) over (partition by uaid order by id desc) as current_value
      from t
     ) t
where risk_factor <> 'NONE';
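The window-function query can be tried against the sample data without a MySQL server. The sketch below runs it in SQLite through Python's sqlite3 module, which is an assumption of this example (it needs SQLite 3.25 or newer for window functions; MySQL needs 8.0 or newer). The column aliases failed_value and present_value are names chosen here to mirror the required result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer primary key, uaid integer, attribute text,"
    " value text, time integer, risk_factor text)"
)
# Sample rows copied from the question.
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?, ?)",
    [
        (1, 1234, "Edge", "Exist", 16123, "NONE"),
        (2, 1234, "Edge", "Not Exist", 16124, "CRITICAL"),
        (3, 1234, "Edge", "Exist", 16125, "NONE"),
        (4, 1237, "Chrome", "Exist", 124745, "NONE"),
        (5, 1237, "Chrome", "Not Exist", 124759, "HIGH"),
    ],
)

query = """
select attribute, risk_factor, uaid,
       value as failed_value,
       current_value as present_value
from (select t.*,
             first_value(value) over (partition by uaid order by id desc) as current_value
      from t) t
where risk_factor <> 'NONE'
order by id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that the query returns every row with a risk factor other than 'NONE'; if an attribute failed several times, an extra filter for the latest failure per uaid would still be needed.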

SQL - Add To Existing Average

I'm trying to build a reporting table to track server traffic and popularity overall. Each SID is a unique game server hosting a particular game, and each UCID is a unique player key connecting to that server.
Say I have a table like so:
SID UCID AvgTime NumConnects
-----------------------------------------
1 AIE9348ietjg 300.55 5
1 Po328gieijge 500.66 7
2 AIE9348ietjg 234.55 3
3 Po328gieijge 1049.88 18
We can see that there are 2 unique players, and 3 unique servers, with SID 1 having 2 players that have connected to it at some point in the past. The AvgTime is the average amount of time those players spent on that server (in seconds), and the NumConnects is the size of the average (ie. 300.55 is averaged out of 5 elements).
Now I run a job in the background where I process a raw connection table and pull out player connections like so:
SID UCID ConnectTime DisconnectTime
-----------------------------------------
1 AIE9348ietjg 90.35 458.32
2 Po328gieijge 30.12 87.15
2 AIE9348ietjg 173.12 345.35
This table has no ID or other fluff to help condense my example. There may be multiple connect/disconnect records for multiple players in this table. What I want to do is add to my existing AvgTime for each SID these new values.
There is a formula from here I am trying to use (taken from this math stackexchange: https://math.stackexchange.com/questions/1153794/adding-to-an-average-without-unknown-total-sum/1153800#1153800)
Average = (Average * Size + NewValue) / (Size + 1)
How can I write an update query that updates the traffic table above for each server and adds to the average using the above formula for each pair of records? I tried something like the following, but it didn't work (it returned null):
UPDATE server_traffic st
LEFT JOIN connect_log l
ON st.SID = l.SID AND st.UCID = l.UCID
SET AvgTime = (AvgTime * NumConnects + SUM(l.DisconnectTime - l.ConnectTime) / NumConnects + COUNT(l.UCID)
I would prefer an answer in MySQL, but I'll accept MS SQL as well.
EDIT
I understand that statistics and calculations are generally not to be stored in tables and that you can run reports that would crunch the numbers for you. My requirement is that users can go to a website and view the popularity of various servers. This needs to be done in a way that
A: running a complex query per user doesn't crash or slow down the system
B: the page returns the data within a few seconds at most
See this example here: https://bf4stats.com/pc/shinku555555
This is a web page for Battlefield 4 stats. Notice that the load is near instant for this player, and I get back a load of statistics without waiting for some complex report query to return the data. I'm assuming they store these calculations in preprocessed tables, so the web page only needs a simple select to return the values. That's the same approach I want to take with my database and web application design.
Sorry if this is off topic to the original question - but hopefully this adds additional context that helps people understand my needs.
Since aggregate functions like SUM and COUNT cannot be used directly at the row level but only inside an aggregate query, consider joining to an aggregate subquery in the UPDATE...LEFT JOIN. Also, adjust the parentheses in SET so that the whole sum is divided by the new size, as the formula intends.
Also, note that because you use LEFT JOIN, rows without a match in the subquery get NULL for the aggregate fields, and any arithmetic involving NULL returns NULL. You can convert NULL to zero with IFNULL(), but the formula's division may still fail.
UPDATE server_traffic s
LEFT JOIN
    (SELECT SID, UCID, COUNT(UCID) AS GrpCount,
            SUM(DisconnectTime - ConnectTime) AS SumTimeDiff
     FROM connect_log
     GROUP BY SID, UCID) l
  ON s.SID = l.SID AND s.UCID = l.UCID
-- parenthesize the denominator so the new total time is divided by the new count
SET s.AvgTime = (s.AvgTime * s.NumConnects + l.SumTimeDiff) / (s.NumConnects + l.GrpCount)
Aside - reconsider saving calculations/statistics within tables as they can always be run by queries even by timestamps. Ideally, database tables should store raw values.
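The arithmetic behind the update can be checked on its own. Below is a minimal Python sketch of folding a batch of new values into a stored average; the function name merge_average is hypothetical and not part of any of the tables above. It generalizes the single-value formula to k new values: new average = (Average * Size + sum of new values) / (Size + k).

```python
from typing import Sequence, Tuple


def merge_average(avg: float, size: int, new_values: Sequence[float]) -> Tuple[float, int]:
    """Fold new values into a stored average; return (new average, new size)."""
    if not new_values:
        # Nothing to add: keep the stored average (also avoids dividing by zero).
        return avg, size
    new_size = size + len(new_values)
    new_avg = (avg * size + sum(new_values)) / new_size
    return new_avg, new_size


# SID 1 / UCID AIE9348ietjg from the question: stored average 300.55 over
# 5 connects, plus one new session lasting 458.32 - 90.35 seconds.
new_avg, new_size = merge_average(300.55, 5, [458.32 - 90.35])
print(round(new_avg, 2), new_size)
```

The helper returns the new size as well, because the stored count (NumConnects in the question's table) has to grow together with the average for the next update to stay correct.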

MS Access order table by order in which records were entered

I have a table in Access that I use as a progress tracker/to do list, with one field containing the date (short text) and what I did that day (long text). An example would be like this
date | progress
----------------
6/20 | did item1
     | tomorrow do item2 and item3
6/21 | long text I continue in the next line for visibilty
     | continued
     | to do tomorrow
6/22 | item6 completed
And so on. I enter these things manually. For the past 3 weeks or so that I have been updating this table, it has opened in the same order every time: the order in which I created the records. Recently, the table opened in a completely random order, and it continues to open in that new order.
I know now that it would have been good to create an autonumber field and order by that, or to give the date field the default value =Now(). I have far too many records to make a new ID field and manually number each record in the order I created them.
Is it at all possible to force the table to order the records in either the order I created the records, or at least the previous configuration (which was just ordered by the time created)?
Also, is there a better way to be doing this? I just want to have a record for other people who will work with this in the future. In addition, I am new to SQL, and Access SQL has weird/unique ways of doing things, so for some queries I know I may need in the future I keep a table with the query name, some documentation of what exactly it does, and some notes about the syntax (the SQL editor does not allow -- comments). Is there a better way to do this, too?
Thanks for any help!
"one field containing the date (short text)"
Dates should never be stored as anything other than date values. No exceptions.
So change the data type of the field to Date, sort on this, and your troubles are gone.
To order Null dates last:
Order By Abs([DateField] Is Null), [DateField]

Representing percentages up to and including 100% in MySQL

I have a table in my database that represents the similarity between two things. Something like:
+------------+------+
| Field | Type |
+------------+------+
| id_a | int |
| id_b | int |
| similarity | ??? |
+------------+------+
similarity will hold the degree of similarity between id_a and id_b in percent, and can range from 100% similar (identical things) down to but not including 0%. I won't be storing links for things which are 0% similar (i.e. completely different). In other words I need to store the range [100, 0). The amount of decimal places isn't terribly important, but 1 or 2 would be nice.
The solution I've typically seen suggested is to use something like decimal(4,2). The problem with that for my use case is that it stores (100,0].
I've come up with two possible solutions, both using decimal(4,2), but they both seem like hacks:
option 1
Store similarity - 0.01 and add the 0.01 back when retrieving it. Something like:
INSERT INTO similarities (id_a, id_b, similarity) VALUES (1, 2, ? - 0.01);
And then:
SELECT id_a, id_b, similarity + 0.01 FROM similarities;
option 2
Store percent differences from 0%-99.99%, and then convert to similarity when retrieving:
SELECT id_a, id_b, 100 - difference AS similarity FROM similarities;
In both cases I would probably create a view using MERGE, rather than leaving the addition and subtraction in the queries.
Are there any better options than these? If there aren't, which would you choose and why?
note: I don't mind using some other representation, like [1,0), as long as it represents the range well.
Edit to clarify: Inserts are done rarely, and are only done by me, not users, and are done in large batches. I know that the data I'm inserting will always be in [100,0), so it's not a question of enforcement, but rather of what the most efficient/natural representation is
In a dbms that complies with SQL standards, you would declare the column to be of type decimal(5, 2) (or use the equivalent decimal fraction), and use a CHECK constraint to limit the range.
create table data (
id integer primary key,
pct decimal(5, 2) not null check (pct > 0 and pct <= 100)
);
But MySQL doesn't comply with SQL standards here, at least in older versions: before 8.0.16 it parses CHECK constraints but doesn't enforce them. So on those versions I think you have two choices.
1. Write a trigger to check the range, and roll back inserts and updates that fall outside your chosen range.
2. Use a foreign key reference to a table of valid values. In your case, that table would only have 10,000 rows, right?
If I needed to use the percentage in further calculations, I'd much prefer values in the range of .0001 to 1.0000, so they could be used directly. It doesn't look like that's a concern in your application, though.
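The effect of an enforced CHECK constraint can be seen with a small experiment. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption of this example and not MySQL; the helper try_insert is a made-up name. The table definition is the one from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table data (
        id integer primary key,
        pct decimal(5, 2) not null check (pct > 0 and pct <= 100)
    )
    """
)


def try_insert(pct: float) -> bool:
    """Insert one percentage; return False if the CHECK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("insert into data (pct) values (?)", (pct,))
        return True
    except sqlite3.IntegrityError:
        return False


for candidate in (100, 0.01, 0, 100.01):
    print(candidate, try_insert(candidate))
```

Values of exactly 100 and 0.01 are accepted, while 0 and 100.01 are rejected, which matches the range the question asks for.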
Instead of storing the similarity as a percentage per se, give the pairs similarity scores in the range [1,10000] (or (0,10000] if you like). That gives you 100 points per percentage point (effectively two decimals if you need them).
Storage: INT (32-bit; the range 1 to 10000 would also fit in a SMALLINT UNSIGNED)
View: SELECT id_a, id_b, similarity/100 FROM similarities;
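The integer-score idea amounts to a pair of conversions at the application boundary. Here is a minimal Python sketch under that assumption; the function names to_score and to_percent are hypothetical, and Decimal is used so that two-decimal percentages convert exactly.

```python
from decimal import Decimal


def to_score(percent: Decimal) -> int:
    """Convert a percentage in (0, 100] to an integer score in [1, 10000]."""
    score = int(percent * 100)  # truncates anything beyond two decimals
    if not 1 <= score <= 10000:
        raise ValueError(f"percentage out of range: {percent}")
    return score


def to_percent(score: int) -> Decimal:
    """Convert a stored score back to a percentage with two decimals."""
    return Decimal(score) / 100


print(to_score(Decimal("99.99")), to_percent(10000))
```

With this representation 0% has no valid score at all, so the excluded end of the range falls out of the encoding instead of needing a separate check.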

MySQL, how to repeat same line x times

I have a query that outputs address order data:
SELECT ordernumber
, article_description
, article_size_description
, concat(NumberPerBox,' pieces') as contents
, NumberOrdered
FROM customerorder
WHERE customerorder.id = 1;
I would like the above line to be output NumberOrdered (e.g. 50,000) divided by NumberPerBox (e.g. 2,000) = 25 times.
Is there a SQL query that can do this? I'm not against using temporary tables to join against if that's what it takes.
I checked out the previous questions; however, the nearest one, "is to be posible in mysql repeat the same result", only gave answers that produce a fixed number of rows, and I need it to be dynamic depending on the value of (NumberOrdered div NumberPerBox).
The result I want is:
Boxnr Ordernr as_description contents NumberOrdered
------+--------------+----------------+-----------+---------------
1 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
2 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
....
25 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
First, let me say that I am more familiar with SQL Server so my answer has a bit of a bias.
Second, I did not test my code sample and it should probably be used as a reference point to start from.
It would appear to me that this situation is a prime candidate for a numbers table. Simply put, it is a table (usually called "Numbers") that is nothing more than a single PK column of integers from 1 to n. Once you've used a Numbers table and are aware of how it works, you'll start finding many uses for it, such as querying for time intervals, string splitting, etc.
That said, here is my untested response to your question:
SELECT
    IV.number as Boxnr
    ,customerorder.ordernumber
    ,article_description
    ,article_size_description
    ,concat(customerorder.NumberPerBox,' pieces') as contents
    ,NumberOrdered
FROM
    customerorder
    INNER JOIN (
        -- one row per box: numbers 1..(NumberOrdered / NumberPerBox)
        SELECT
            Numbers.number
            ,customerorder.ordernumber
            ,customerorder.NumberPerBox
        FROM
            Numbers
            INNER JOIN customerorder
                ON Numbers.number BETWEEN 1 AND customerorder.NumberOrdered / customerorder.NumberPerBox
        WHERE
            customerorder.id = 1
    ) AS IV
        ON customerorder.ordernumber = IV.ordernumber
As I said, most of my experience is in SQL Server. I reference http://www.sqlservercentral.com/articles/Advanced+Querying/2547/ (registration required). However, there appear to be quite a few resources available when I search for "SQL numbers table".
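The numbers-table join can be tried end to end with a small script. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption of this example (so it uses || where MySQL would use concat()), and it joins directly instead of going through a derived table. The Numbers table is filled with 1 to 1000, and the article_description value 'Bags' is invented; the other values come from the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Numbers table: a single primary-key column holding 1..1000.
conn.execute("create table Numbers (number integer primary key)")
conn.executemany("insert into Numbers values (?)", [(n,) for n in range(1, 1001)])

conn.execute(
    "create table customerorder (id integer primary key, ordernumber text,"
    " article_description text, article_size_description text,"
    " NumberPerBox integer, NumberOrdered integer)"
)
conn.execute(
    "insert into customerorder values"
    " (1, 'CORDO1245', 'Bags', 'Carrying bags', 2000, 50000)"
)

query = """
select n.number as Boxnr,
       o.ordernumber,
       o.article_size_description,
       o.NumberPerBox || ' pieces' as contents,
       o.NumberOrdered
from customerorder o
join Numbers n
  on n.number between 1 and o.NumberOrdered / o.NumberPerBox
where o.id = 1
order by n.number
"""
boxes = conn.execute(query).fetchall()
print(len(boxes), boxes[0], boxes[-1])
```

The Numbers table has to contain at least as many rows as the largest box count you expect, otherwise the result is silently cut short.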