How to get the time that a value remains above a limit - mysql

I have a thermometer which is storing all the reading data.
Example:
http://sqlfiddle.com/#!9/c0aab3/9
The idea is to obtain the amount of time the temperature remained above 85 degrees Fahrenheit.
I have tried everything I could think of and have not been able to find a solution.
Currently, what I'm doing is getting the time when the reading goes above 85 and then the next time it drops below 85, and calculating the difference between the two.
If the temperature stays at 85 or above for 5 consecutive minutes, this approach can fail.
What would be the right way to calculate this?
In the sqlfiddle example, the readings shown are greater than or equal to 85, but in some cases the temperature did not stay there and dropped back down.
Each such stretch, from the point the reading rises above the limit to the point it drops below it again, must be taken and its duration calculated in seconds, successively until the end of the data.
Then all the seconds are added up to get the total time.

Base answer (no modification of the table)
I found a workaround using user variables and some IF functions that manipulate them. See if this works for you:
SET @currIndex = 1;
SET @indicator = FALSE;
SET @prevIndex = 0;

SELECT Q2.sen,
       MIN(Q2.subTime) AS 'From',
       MAX(Q2.subTime) AS 'To',
       TIMEDIFF(MAX(Q2.subTime), MIN(Q2.subTime)) AS diff
FROM (SELECT IF(Q1.temp < 85,
                IF(@prevIndex = @currIndex,
                   (@currIndex := @currIndex + 1) - 1,
                   @currIndex),
                @prevIndex := @currIndex) AS 'Ind',
             Q1.subTime,
             Q1.temp,
             Q1.sen
      FROM (SELECT IF(sen_temp.temp < 85,
                      (@indicator XOR (@indicator := FALSE)),
                      @indicator := TRUE) AS ind,
                   s_time AS subTime,
                   temp,
                   sen
            FROM sen_temp
           ) AS Q1
      WHERE Q1.ind
     ) AS Q2
GROUP BY Q2.`Ind`
ORDER BY Q2.subTime;
Here's an SQL fiddle based on your own.
The main problem with this query is its performance. Since there is no ID on the table, the data has to be carried through both sub-queries.
Performance answer (table modification required)
After a lot of optimization work, I ended up adding an ID to the table. It allows the query to use a single sub-query instead of two and to carry less data through it.
SET @indicator = FALSE;
SET @currentIndex = 0;

SELECT T1.sen,
       MIN(T1.s_time) AS 'From',
       MAX(T1.s_time) AS 'To',
       TIMEDIFF(MAX(T1.s_time), MIN(T1.s_time)) AS diff
FROM (SELECT id,
             (CASE WHEN (temp >= 85) THEN
                       @currentIndex + (@indicator := TRUE)
                   WHEN (@indicator) THEN
                       @currentIndex := @currentIndex + (@indicator := FALSE) + 1
                   ELSE
                       @indicator := FALSE
              END) AS ind
      FROM sen_temp
      ORDER BY id, s_date, s_time) AS Q1
INNER JOIN sen_temp AS T1 ON Q1.id = T1.id
WHERE Q1.ind > 0
GROUP BY T1.sen, Q1.ind;
Please check this fiddle for this more efficient version.
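If your real table does not have an ID column yet, adding one is a one-off change. A minimal sketch, assuming the sen_temp table from the fiddle and that it has no primary key so far:
ALTER TABLE sen_temp
    ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;  -- assumes no existing primary key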
Performance difference explanation
When creating a MySQL query, performance is always key. If the query is simple, it will execute efficiently and you should not have any problems unless you run into a syntax error or other optimization pitfalls, such as filtering or ordering on a column that is not indexed.
When we create sub-queries, it is harder for the database to handle the query. The reason is quite simple: it potentially uses more RAM. When a query containing sub-queries is executed, the sub-queries are executed first ("obviously!" you might say). When a sub-query is executed, the server needs to keep its results around for the outer query, so it effectively creates a temporary table in RAM, allowing the outer query to consult that data when it needs to. Even though RAM is quite fast, this can slow things down considerably when handling very large amounts of data. It is even worse when the query forces the server to handle so much data that it no longer fits in RAM, pushing the work onto the system's much slower storage.
If we keep the amount of data generated in the sub-queries to a minimum and fetch only the wanted data in the main query, the RAM the server uses for the sub-queries becomes negligible and the query runs faster.
Therefore, the less data the sub-queries produce, the faster the whole query will execute. This holds for pretty much every SQL-like database.
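If you want to check what the server actually does with those derived tables, you can prefix the statement with EXPLAIN. A minimal sketch against the sen_temp table from the question (on older MySQL versions the sub-query shows up with select_type DERIVED; newer versions may merge it away):
EXPLAIN
SELECT Q1.sen, Q1.temp, Q1.subTime
FROM (SELECT sen, temp, s_time AS subTime
      FROM sen_temp
      WHERE temp >= 85) AS Q1;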
Here's a nice reading explaining how queries are optimized

Related

What database for 3D distance calculations?

All-
I think I've finally outgrown MySQL for one of my solutions. Right now I have 70 million rows that simply store the x, y, z coordinates of objects in 3D space. Unfortunately, I no longer know how else to optimize my database to handle the inserts/queries. I need to query based on distance (get objects within a given distance).
Does anyone have suggestions for a good replacement? I don't know if I should be looking at something like HBase or other non-relational databases, as I may run into a similar problem there. I generally insert about 100 rows per minute, and my query looks like:
// get objects within 500 yards
SELECT DISTINCT `object_positions`.`entry` FROM `object_positions` WHERE `object_positions`.`type` = 3 AND `object_positions`.`continent` = '$p->continent' AND SQRT(POW((`object_positions`.`x` - $p->x), 2) + POW((`object_positions`.`y` - $p->y), 2) + POW((`object_positions`.`z` - $p->z), 2)) < 500;
Nothing crazy complicated, but I think the math involved is what is causing MySQL to explode and I'm wondering if I should be looking at a cloud based database solution? It could easily have to handle 10-100 queries per second.
It's not MySQL that's giving you trouble, it's the need to apply indexing to your problem. You have a problem that no amount of NoSQL or cloud computing is going to solve by magic.
Here's your query simplified just a bit for clarity.
SELECT DISTINCT entry
FROM object_positions
WHERE type = 3
  AND continent = '$p->continent'
  AND DIST(x, $p->x, y, $p->y, z, $p->z) < 500
DIST() is shorthand for your Cartesian distance function.
You need to put separate indexes on x, y, and z in your table, then you need to do this:
SELECT DISTINCT entry
FROM object_positions
WHERE type = 3
  AND continent = '$p->continent'
  AND x BETWEEN ($p->x - 500) AND ($p->x + 500)
  AND y BETWEEN ($p->y - 500) AND ($p->y + 500)
  AND z BETWEEN ($p->z - 500) AND ($p->z + 500)
  AND DIST(x, $p->x, y, $p->y, z, $p->z) < 500
The three BETWEEN clauses of the WHERE statement will allow indexes to be used to avoid a full table scan of your table for each query. They'll select all your points in a 1000x1000x1000 cube surrounding your candidate point. Then the DIST computation will toss out the ones that are outside the radius you want. You'll get the same batch of points but much more efficiently.
You don't have to actually create a DIST function; the formula you have in your question is fine.
You do have an index on (type, continent), don't you? If not you need it too.
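For reference, a minimal sketch of those index definitions (the index names are only illustrative):
CREATE INDEX idx_positions_x ON object_positions (x);
CREATE INDEX idx_positions_y ON object_positions (y);
CREATE INDEX idx_positions_z ON object_positions (z);
CREATE INDEX idx_positions_type_continent ON object_positions (type, continent);
MySQL will typically pick only one of these per query, so EXPLAIN may show a single index driving the bounding-box filter; that is still far better than a full table scan.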

How could I know how much time it takes to query items in a table of MYSQL?

Our website has a problem: one page takes too long to load. We have found that the page contains an n*n matrix, and for each item in the matrix it queries three tables in the MySQL database. Every item in the matrix runs almost the same query.
So I suspect the large number of MySQL queries is causing the problem, and I want to try to fix it. Here is one of the things I am unsure about:
1.
m = store.execute('SELECT X FROM TABLE1 WHERE I=1')
result = store.execute('SELECT Y FROM TABLE2 WHERE X IN m')
2.
r = store.execute('SELECT X, Y FROM TABLE2')
result = []
for each in r:
    i = store.execute('SELECT I FROM TABLE1 WHERE X=%s', each[0])
    if i[0][0] == 1:
        result.append(each)
There are about 200 rows in TABLE1 and more than 400 rows in TABLE2. I don't know which part takes the most time, so I can't make a good decision about how to write my SQL statements.
How can I find out how much time an operation takes in MySQL? Thank you!
Rather than installing a bunch of special tools, you could take a dead-simple approach like this (pardon my Ruby):
start = Time.new
# DB query here
puts "Query XYZ took #{Time.now - start} sec"
Hopefully you can translate that to Python. OR... pardon my Ruby again...
QUERY_TIMES = {}

def query(sql)
  start = Time.now
  result = connection.execute(sql)
  elapsed = Time.now - start
  QUERY_TIMES[sql] ||= []
  QUERY_TIMES[sql] << elapsed
  result
end
Then run all your queries through this custom method. After doing a test run, you can make it print out the number of times each query was run, and the average/total execution times.
For the future, plan to spend some time learning about "profilers" (if you haven't already). Get a good one for your chosen platform, and spend a little time learning how to use it well.
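If you would rather measure on the server side than in application code, MySQL itself can report per-query timings. A minimal sketch, assuming a MySQL 5.x server where the (since-deprecated) session profiling feature is available:
SET profiling = 1;
SELECT X FROM TABLE1 WHERE I = 1;   -- run the query you want to time
SHOW PROFILES;                      -- lists recent queries with their durations
SHOW PROFILE FOR QUERY 1;           -- per-stage breakdown of the first profiled query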
I use MySQL Workbench for SQL development. It gives response times and can connect remotely to MySQL servers, provided you have permission (which in this case will give you a more accurate reading).
http://www.mysql.com/products/workbench/
Also, as you've realized, you appear to have a SQL statement inside a for loop. That can drastically affect performance. You'll want to take a different route to retrieve that data.

When using skip and take to page data, how can I get the total record count without a separate query?

I don't see how this is possible, but I really, really hate to run my query an extra time just to get the record count so I can build a pager. When I say a "pager" I simply mean the common gizmo with a link for every 10 records, for example.
Assuming you are building the query in the Selecting event, the best you can do is construct the full query, get and save the count, then Skip/Take into e.Result.
And by "best" I mean the easiest-to-read code built from a single query definition, rather than two. You'll still be running two separate evaluations against the database, though. Use Query Analyzer to see whether the statements are a SELECT COUNT followed by a paged SELECT, or one big SELECT pared down by LINQ after retrieval. I believe LINQ does the former.
As far as I know it is not possible to return the total count and the items retrieved by skip and take at the same time.
I wrote a custom data source control and view, which caches the count for a short duration. I invalidate the cache whenever the criteria changes that would affect the number of results, but not when the data is paged, or when the data is sorted for instance.
I was concerned about this same question. Here are the results of my experimenting in LINQPad; the actual behavior of LINQ is not to issue a full second copy of the main query:
This simple test, in one of my development databases:
var query = from p in HrPersons select p;
var x = query.Skip(20).Take(10).Dump();
var t = query.Count().Dump();
generates the following actual SQL queries:
SELECT [t1].[company] AS [Company], [t1].[processLevel] AS [ProcessLevel], [t1].[emplId] AS [EmplId], [t1].[sn] AS [Sn], [t1].[givenName] AS [GivenName], [t1].[middleInitial] AS [MiddleInitial], [t1].[nickName] AS [NickName], [t1].[formerName] AS [FormerName], [t1].[ssn] AS [Ssn], [t1].[cn] AS [Cn], [t1].[costCenter] AS [CostCenter], [t1].[title] AS [Title], [t1].[status] AS [Status], [t1].[batch] AS [Batch], [t1].[rowversion] AS [Rowversion], [t1].[id] AS [Id], [t1].[source] AS [Source]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[company], [t0].[processLevel], [t0].[emplId], [t0].[sn], [t0].[givenName], [t0].[middleInitial], [t0].[nickName], [t0].[formerName], [t0].[ssn], [t0].[cn], [t0].[costCenter], [t0].[title], [t0].[status], [t0].[batch], [t0].[rowversion], [t0].[id], [t0].[source]) AS [ROW_NUMBER], [t0].[company], [t0].[processLevel], [t0].[emplId], [t0].[sn], [t0].[givenName], [t0].[middleInitial], [t0].[nickName], [t0].[formerName], [t0].[ssn], [t0].[cn], [t0].[costCenter], [t0].[title], [t0].[status], [t0].[batch], [t0].[rowversion], [t0].[id], [t0].[source]
FROM [HrPerson] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]
GO
SELECT COUNT(*) AS [value]
FROM [HrPerson] AS [t0]
So while there is a second SQL query, it is a trivial one that only requests the total count. I believe this is reasonable and acceptable as a pattern.
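If you are willing to drop down to raw SQL, SQL Server (2005 and later) can also return the total alongside each page of rows via a window function. A hedged sketch of that pattern, which is not what LINQ to SQL generates, using the HrPerson table from above:
SELECT *
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (ORDER BY p.[company]) AS rowNum,
           COUNT(*)     OVER ()                     AS totalCount
    FROM [HrPerson] AS p
) AS paged
WHERE paged.rowNum BETWEEN 21 AND 30;  -- third page of 10 rows
Every row carries the same totalCount value, so the count comes back with the page itself, at the cost of a slightly heavier query plan.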

Can we control LINQ expression order with Skip(), Take() and OrderBy()

I'm using LINQ to Entities to display paged results. But I'm having issues with the combination of Skip(), Take() and OrderBy() calls.
Everything works fine, except that OrderBy() is applied too late. It's executed after the result set has been cut down by Skip() and Take().
So each page of results has its items in order, but the ordering is done on a single page's worth of data instead of ordering the whole set first and then limiting it with Skip() and Take().
How do I set precedence with these statements?
My example (simplified)
var query = ctx.EntitySet.Where(/* filter */).OrderByDescending(e => e.ChangedDate);
int total = query.Count();
var result = query.Skip(n).Take(x).ToList();
One possible (but bad) solution
One possible solution would be to apply a clustered index to the order-by column, but that column changes frequently, which would slow database performance on inserts and updates. And I really don't want to do that.
EDIT
I ran ToTraceString() on my query, where we can actually see when ORDER BY is applied to the result set. Unfortunately, it is at the very end. :(
SELECT
-- columns
FROM (SELECT
-- columns
FROM (SELECT -- columns
FROM ( SELECT
-- columns
FROM table1 AS Extent1
WHERE EXISTS (SELECT
-- single constant column
FROM table2 AS Extent2
WHERE (Extent1.ID = Extent2.ID) AND (Extent2.userId = :p__linq__4)
)
) AS Project2
limit 0,10 ) AS Limit1
LEFT OUTER JOIN (SELECT
-- columns
FROM table2 AS Extent3 ) AS Project3 ON Limit1.ID = Project3.ID
UNION ALL
SELECT
-- columns
FROM (SELECT -- columns
FROM ( SELECT
-- columns
FROM table1 AS Extent4
WHERE EXISTS (SELECT
-- single constant column
FROM table2 AS Extent5
WHERE (Extent4.ID = Extent5.ID) AND (Extent5.userId = :p__linq__4)
)
) AS Project6
limit 0,10 ) AS Limit2
INNER JOIN table3 AS Extent6 ON Limit2.ID = Extent6.ID) AS UnionAll1
ORDER BY UnionAll1.ChangedDate DESC, UnionAll1.ID ASC, UnionAll1.C1 ASC
My workaround solution
I've managed to work around this problem. Don't get me wrong: I haven't solved the precedence issue yet, but I've mitigated it.
What did I do?
This is the code I'm using until I get an answer from Devart. If they aren't able to overcome this issue, I'll have to stick with this code.
// get ordered list of IDs
List<int> ids = ctx.MyEntitySet
.Include(/* Related entity set that is needed in where clause */)
.Where(/* filter */)
.OrderByDescending(e => e.ChangedDate)
.Select(e => e.Id)
.ToList();
// get total count
int total = ids.Count;
if (total > 0)
{
// get a single page of results
List<MyEntity> result = ctx.MyEntitySet
.Include(/* related entity set (as described above) */)
.Include(/* additional entity set that's neede in end results */)
.Where(string.Format("it.Id in {{{0}}}", string.Join(",", ids.ConvertAll(id => id.ToString()).Skip(pageSize * currentPageIndex).Take(pageSize).ToArray())))
.OrderByDescending(e => e.ChangedOn)
.ToList();
}
First of all, I get the ordered IDs of my entities. Getting only IDs performs well even with a larger data set. The MySQL query is quite simple and performs really well. In the second part I take a page of these IDs and use them to fetch the actual entity instances.
Thinking about it, this should perform even better than what I was doing at the beginning (as described in my question), because getting the total count is much quicker due to the simplified query. The second part is practically the same, except that entities are fetched by their IDs instead of being partitioned with Skip and Take...
Hopefully someone may find this solution helpful.
I haven't worked directly with LINQ to Entities, but it should have a way to hook specific stored procedures into certain locations when needed. (LINQ to SQL did.) If so, you could turn this query into a stored procedure, doing exactly what is required, and doing it efficiently.
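For illustration, a rough sketch of what such a paging procedure could look like in MySQL, with table and column names borrowed loosely from the trace above (assumes MySQL 5.5.6 or later, where LIMIT accepts routine parameters):
DELIMITER //
CREATE PROCEDURE page_entities(IN p_user_id INT, IN p_offset INT, IN p_page_size INT)
BEGIN
    SELECT t1.*
    FROM table1 AS t1
    WHERE EXISTS (SELECT 1
                  FROM table2 AS t2
                  WHERE t2.ID = t1.ID
                    AND t2.userId = p_user_id)
    ORDER BY t1.ChangedDate DESC
    LIMIT p_offset, p_page_size;   -- ordering happens before the page is cut
END //
DELIMITER ;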
Assuming from your comment that persisting the values in a List is not acceptable:
There's no way to completely minimize the iterations, as you intended (and as I would have tried too, living in hope). Cutting the iterations down by one would be nice. Is it possible to just get the Count once and cache/session it? Then you could:
int total = ctx.EntitySet.Count(); // Hopefully you do not have to repeat this every time.
var result = ctx.EntitySet.Where(/* filter */).OrderBy(/* expression */).Skip(n).Take(x).ToList();
Hopefully you can cache the Count somehow, or avoid needing it every time. Even if you can't, this is the best you can do.
Could you please create a sample illustrating the problem and send it to us (support * devart * com, subject "EF: Skip, Take, OrderBy")?
Hope we will be able to help you.
You can also contact us using our forums or contact form.
Are you absolutely certain the ordering is off? What does the SQL look like?
Can you reorder your code as follows and post the output?
// Redefine your queries.
var query = ctx.EntitySet.Where(/* filter */).OrderBy(e => e.ChangedDate);
var skipped = query.Skip(n).Take(x);
// let's look at the SQL, shall we?
var querySQL = query.ToTraceString();
var skippedSQL = skipped.ToTraceString();
// actual execution of the queries...
int total = query.Count();
var result = skipped.ToList();
Edit:
I'm absolutely certain. You can check the "EDIT" in my question to see the trace result of my query; the trace of the skipped query is what matters here. The Count is not really important.
Yeah, I see it. Wow, that's a stumper. Might even be an outright bug. I note you're not using SQL Server... what DB are you using? Looks like it might be MySQL.
One way:
var query = ctx.EntitySet.Where(/* filter */).OrderBy(/* expression */).ToList();
int total = query.Count;
var result = query.Skip(n).Take(x).ToList();
Convert it to a List before skipping. It's not too efficient, mind you...

Practical limit to length of SQL query (specifically MySQL)

Is it particularly bad to have a very, very large SQL query with lots of (potentially redundant) WHERE clauses?
For example, here's a query I've generated from my web application with everything turned off, which should be the largest possible query for this program to generate:
SELECT *
FROM 4e_magic_items
INNER JOIN 4e_magic_item_levels
ON 4e_magic_items.id = 4e_magic_item_levels.itemid
INNER JOIN 4e_monster_sources
ON 4e_magic_items.source = 4e_monster_sources.id
WHERE (itemlevel BETWEEN 1 AND 30)
AND source!=16 AND source!=2 AND source!=5
AND source!=13 AND source!=15 AND source!=3
AND source!=4 AND source!=12 AND source!=7
AND source!=14 AND source!=11 AND source!=10
AND source!=8 AND source!=1 AND source!=6
AND source!=9 AND type!='Arms' AND type!='Feet'
AND type!='Hands' AND type!='Head'
AND type!='Neck' AND type!='Orb'
AND type!='Potion' AND type!='Ring'
AND type!='Rod' AND type!='Staff'
AND type!='Symbol' AND type!='Waist'
AND type!='Wand' AND type!='Wondrous Item'
AND type!='Alchemical Item' AND type!='Elixir'
AND type!='Reagent' AND type!='Whetstone'
AND type!='Other Consumable' AND type!='Companion'
AND type!='Mount' AND (type!='Armor' OR (false ))
AND (type!='Weapon' OR (false ))
ORDER BY type ASC, itemlevel ASC, name ASC
It seems to work well enough, but it's also not particularly high traffic (a few hundred hits a day or so), and I wonder if it would be worth the effort to try and optimize the queries to remove redundancies and such.
Reading your query makes me want to play an RPG.
This is definitely not too long. As long as they are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
I've worked with some queries that are 1000+ lines, and that's hard to debug.
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
select *
from
    4e_magic_items mi
    ,4e_magic_item_levels mil
    ,4e_monster_sources ms
where mi.id = mil.itemid
  and mi.source = ms.id
  and itemlevel between 1 and 30
  and source not in (16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
  and type not in (
      'Arms', 'Feet', 'Hands', 'Head', 'Neck', 'Orb',
      'Potion', 'Ring', 'Rod', 'Staff', 'Symbol', 'Waist',
      'Wand', 'Wondrous Item', 'Alchemical Item', 'Elixir',
      'Reagent', 'Whetstone', 'Other Consumable', 'Companion',
      'Mount'
  )
  and ((type != 'Armor') or (false))
  and ((type != 'Weapon') or (false))
order by
    type asc
    ,itemlevel asc
    ,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/
Default MySQL 5.0 server limitation is "1MB", configurable up to 1GB.
This is configured via the max_allowed_packet setting on both client and server, and the effective limitation is the lesser of the two.
Caveats:
It's likely that this "packet" limitation does not map directly to characters in a SQL statement. Surely you want to take into account character encoding within the client, some packet metadata, etc.
SELECT @@global.max_allowed_packet;
This is the only real limit, and since it's adjustable on the server, there is no straight answer.
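For completeness, a small sketch of raising the limit, either at runtime (requires the SUPER privilege; existing connections keep the old value) or permanently in the server configuration file:
SET GLOBAL max_allowed_packet = 67108864;  -- 64 MB, in effect until the server restarts
-- or in my.cnf / my.ini, under the [mysqld] section:
-- max_allowed_packet = 64M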
From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
Example (mildly improper SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
Just remember, the longer and more complex the query, the harder it's going to be to maintain.
Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
An alternative is to use prepared statements, so you pay the parsing cost only once per client connection and then pass in only the parameters for each call.
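For example, a minimal sketch of a server-side prepared statement in MySQL, using a pared-down version of the query above; the statement text is parsed once per session and then re-executed with different parameter values:
PREPARE item_search FROM
    'SELECT *
       FROM 4e_magic_items
      INNER JOIN 4e_magic_item_levels
              ON 4e_magic_items.id = 4e_magic_item_levels.itemid
      WHERE itemlevel BETWEEN ? AND ?';
SET @lo = 1, @hi = 30;
EXECUTE item_search USING @lo, @hi;
DEALLOCATE PREPARE item_search;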
I'm assuming that by "turned off" you mean a field doesn't have a value?
Instead of checking whether something is not this, and also not that, and so on, can't you just check whether the field is NULL? Or set the field to 'off' and check whether type (or whatever) equals 'off'.