I want to select rows with a specific filter, but the filter should not limit the result while I have fewer than 40 rows.
Here is a pseudo-example:
SELECT * FROM `table`
WHERE
SUM(1) < 40 OR
`age` > 18
It's similar to LIMIT, but LIMIT always applies after the WHERE filter. I want to ignore the filter until I have at least 40 rows (but still accept the first rows).
How do I do that?
Edit: a lot of people were unsure what I really wanted.
This is an example:
ID AGE
1 10
2 20
3 30
4 10
5 20
6 30
7 10
I want to get the first 2 rows ALWAYS, and only after those two rows, get additional rows that match the given conditions (WHERE).
For example: I want the first 2 rows plus any further rows whose age is 30. The result would be equivalent to:
ID AGE
1 10 <first>
2 20 <second>
3 30 <conditional>
6 30 <conditional>
You can use an increasing user variable @rownum to simulate the same functionality. However, this is much less efficient than LIMIT, because the server still reads the filtered-out rows and performs the @rownum := @rownum + 1 calculation for every row it scans.
SELECT t.*
FROM `table` t, (SELECT @rownum := 0) r
WHERE
    (@rownum := @rownum + 1) <= 40 OR
    `age` > 18
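On MySQL 8.0+ you can get the same effect without session variables by using a window function. A minimal sketch, assuming the id column defines the row order:
SELECT id, age
FROM (
    SELECT t.*, ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM `table` t
) numbered
WHERE rn <= 40 OR age > 18
ORDER BY id;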
Related
I have a table,
Name Seconds Status_measure
a 0 10
a 10 13
a 20 -1
a 30 15
a 40 20
a 50 12
a 60 -1
Here, for a particular name, I want a new column calculated as "the number of times the value goes above -1, but only after the first -1 is met". So in this particular data I want a new column for the name "a" with the value 3, because once the first -1 is reached in Status_measure, there are 3 values (15, 20 and 12) greater than -1.
Required data frame:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
I tried doing
count(status_measure>-1) over (partition by name order by seconds)
But this is not giving the desired result.
You can do it in 2 steps: group the data, then count the entries where grp = 1.
select *, sum(Status_measure > -1 and grp = 1) over(partition by name) n
from (
    select *
        -- grp increases by 1 at every -1, so grp = 1 marks the stretch
        -- from the first -1 up to (but not including) the second -1
        , row_number() over(partition by name order by Seconds) - sum(Status_measure > -1) over(partition by name order by Seconds) grp
    from tbl
) t
An option is using a variable update, which:
starts from 0
increases its value when it reaches a -1
decreases its value when it reaches a second -1
Once you have this column, you can run a sum over your values.
SET @change = 0;
SELECT *, SUM(CASE WHEN Status_measure = -1
                   THEN IF(@change = 0, @change := @change + 1, @change := @change - 1)
                   ELSE @change END) OVER() - 1 AS Value_
FROM tab
Limitations: this solution assumes you have only one range of interesting values between -1s.
Note: there is a -1 adjustment to the sum because the first update of the variable leaves a 1 in the same row as the -1, which you don't want. For a better understanding, comment out the application of SUM() OVER and inspect the intermediate output.
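If you are on MySQL 8.0+ and want to avoid session variables altogether, a window-only sketch of the same idea (table name tbl assumed, as above) could look like:
SELECT Name, Seconds, Status_measure,
       SUM(Status_measure > -1 AND seen_neg) OVER (PARTITION BY Name) AS Value
FROM (
    SELECT t.*,
           -- cumulative maximum of a boolean: becomes 1 from the first -1 onward
           MAX(Status_measure = -1) OVER (PARTITION BY Name ORDER BY Seconds) AS seen_neg
    FROM tbl t
) x;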
More of a clarification to your question first: I want to expand your original data to include another row, for the sake of distinguishing 2 vs 3 entries. Also, is there some auto-increment ID in your data, so that sequential ordering applies, such as:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
If sequential, IDs 1 & 2 sit above -1 before the -1 at ID #3, which would indicate two entries. But then IDs 4-6 above -1 give a count of three entries before ID #7.
So, what "VALUE" do you want to have in your result. The max count of 3 for all rows, or would it be a value of 2 for ID#s 1, 2 and 3? And value of 3 for Ids 4-7? Or, do you want ALL entries to recognize the greatest count before -1 measure to show 3 for all entries.
Please EDIT your question (you can copy/paste this into your original question if need be) and provide the additional clarification requested, including whether there is an auto-increment ID, as that affects the final output and how the breaks are determined.
I have a table with rows arriving at a rate of 400-500 per minute (I know this isn't THAT many), and I need to do some sort of 'trend' analysis on the data collected over the last minute.
Instead of pulling all the records that have been entered and then processing each of them, I would really like to select, say, 10 records that fall at a somewhat even distribution across the specified timeframe.
ID DEVICE_ID LA LO CREATED
-------------------------------------------------------------------
1 1 23.4 948.7 2018-12-13 00:00:01
2 2 22.4 948.2 2018-12-13 00:01:01
3 2 28.4 948.3 2018-12-13 00:02:22
4 1 26.4 948.6 2018-12-13 00:02:33
5 1 21.4 948.1 2018-12-13 00:02:42
6 1 22.4 948.3 2018-12-13 00:03:02
7 1 28.4 948.0 2018-12-13 00:03:11
8 2 23.4 948.8 2018-12-13 00:03:12
...
492 2 21.4 948.4 2018-12-13 00:03:25
493 1 22.4 948.2 2018-12-13 00:04:01
494 1 24.4 948.7 2018-12-13 00:04:02
495 2 27.4 948.1 2018-12-13 00:05:04
Considering this data set, instead of pulling all those rows, I would like to maybe pull a row from the set every 50 records (10 rows for roughly ~500 rows returned).
This does not need to be exact; I just need a sample on which to perform some sort of linear regression.
Is this even possible? I can do it in my application code if need be, but I wanted to see if there was a function or something in MySQL that would handle this.
Edit
Here is the query I have tried, which works for now - but I would like the results more evenly distributed, not by RAND().
SELECT * FROM (
SELECT * FROM (
SELECT t.*, DATE_SUB(NOW(), INTERVAL 30 HOUR) as offsetdate
from tracking t
HAVING created > offsetdate) as parp
ORDER BY RAND()
LIMIT 10) as mastr
ORDER BY id ASC;
Do not use ORDER BY RAND(): RAND() is calculated for every row, the whole set is reordered, and only then are the few records selected.
You can try something like this:
SELECT
    *
FROM
    (
        SELECT
            tracking.*
            , @rownum := @rownum + 1 AS rownum
        FROM
            tracking
            , (SELECT @rownum := 0) AS dummy
        WHERE
            created > DATE_SUB(NOW(), INTERVAL 30 HOUR)
    ) AS s
WHERE
    (rownum % 50) = 0 -- every 50th row gives ~10 rows out of ~500
An index on created is a must.
Also, you might consider using something like AND (UNIX_TIMESTAMP(created) % 60 = 0), which is slightly different from what you asked for but might be OK (it depends on your insert distribution).
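On MySQL 8.0+, window functions give a more deterministic spread than a modulus on a row counter. A sketch that keeps the first row of each of 10 equal-sized buckets (table and column names taken from the question):
SELECT id, device_id, la, lo, created
FROM (
    SELECT b.*, ROW_NUMBER() OVER (PARTITION BY bucket ORDER BY created) AS rn
    FROM (
        SELECT t.*, NTILE(10) OVER (ORDER BY created) AS bucket
        FROM tracking t
        WHERE created > DATE_SUB(NOW(), INTERVAL 1 MINUTE)
    ) b
) s
WHERE rn = 1
ORDER BY created;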
I have 1 table from which I return search results and display them in a specific order. This example is an exact, simplified version of my db structure: http://www.java2s.com/Code/SQL/Select-Clause/Orderbyvaluefromsubquery.htm
and here is my current code, which works but heavily impacts performance to a large extent because of the subquery used:
SELECT * FROM `table` AS p1
WHERE CONCAT(title,artist,creator,version) LIKE '%searchInput%'
ORDER BY
(SELECT
MAX(`rating`) FROM `table` AS p2 WHERE p1.setId=p2.setId
) DESC
The above code searches and sorts the result sets by the highest rating in each set, so that all rows from the same set are kept together. For example:
id setId rating title,artist,etc...
1 1 5
2 1 5
3 2 7
4 1 6
5 2 1
6 3 3
would sort to:
id setId rating title,artist,etc...
3 2 7
5 2 1
4 1 6
1 1 5
2 1 5
6 3 3
Currently it takes around 8.5 seconds to query 1000 rows and over half a minute for a large number of rows. Is there any way to improve the performance, or would it be better to fetch all the results and sort them in PHP memory?
Help is much appreciated
You can probably speed things up a bit by separating the LIKEs:
SELECT p1.* FROM `table` AS p1
WHERE (title LIKE '%searchInput%')
OR (artist LIKE '%searchInput%')
OR (creator LIKE '%searchInput%')
OR (version LIKE '%searchInput%')
ORDER BY
(SELECT MAX(`rating`) FROM `table` AS p2 WHERE p1.setId=p2.setId) DESC
You could also try to
CREATE INDEX tbl_ndx ON table(setId, rating)
to improve sorting performance.
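It may also be worth computing each set's maximum rating once in a derived table rather than once per outer row; a sketch against the same schema:
SELECT p1.*
FROM `table` AS p1
JOIN (
    SELECT setId, MAX(rating) AS max_rating
    FROM `table`
    GROUP BY setId
) m ON m.setId = p1.setId
WHERE (title LIKE '%searchInput%')
   OR (artist LIKE '%searchInput%')
   OR (creator LIKE '%searchInput%')
   OR (version LIKE '%searchInput%')
ORDER BY m.max_rating DESC, p1.setId;
With the (setId, rating) index above, the GROUP BY can often be resolved from the index instead of re-scanning the table for every row.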
I have a table with around 1 128 910 rows, and now my SQL statement is starting to run very slowly.
My table looks like this:
SU_Id SU_User SU_Skill SU_Value SU_Date
int(10) int(10) int(10) float int(10)
1 1 23 45.34 1300978612
2 1 23 48.51 1300978865
3 1 23 47.21 1300979203
4 3 23 61.01 1300979245
5 2 23 38.93 1300979412
6 1 17 12.76 1300979712
7 2 23 65.30 1300979998
As seen in SU_Skill, a user can have more than one entry with the same skill number. SU_Value holds the value of a skill; it can go up and down. SU_Date holds the date when a value was added.
I want an SQL statement that selects the 20 currently highest values of a skill. The following SQL statement is what I use today, but it is slow and I think there is a better way of doing it.
SELECT DISTINCT SU_User AS Player,
(SELECT SU_Value FROM WOU__SkillUploads WHERE SU_User = Player AND SU_Skill = 23 ORDER BY SU_Date DESC LIMIT 1) AS Value
FROM WOU__SkillUploads
WHERE SU_Skill = 23
ORDER BY Value DESC LIMIT 20
Is there a faster way? Thanks for reading my question!
Sub-selects are very slow, especially in the way you're using one here.
Rewrite it as a JOIN. In this case, because you want all records from SkillUploads where SU_Skill = 23, this should probably be a RIGHT JOIN.
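A sketch of that rewrite, joining each user's latest upload date for skill 23 back to the table (an inner JOIN shown here for simplicity; table and column names taken from the question):
SELECT s.SU_User AS Player, s.SU_Value AS Value
FROM WOU__SkillUploads s
JOIN (
    -- latest upload date per user for skill 23
    -- assumes at most one upload per user per date
    SELECT SU_User, MAX(SU_Date) AS last_date
    FROM WOU__SkillUploads
    WHERE SU_Skill = 23
    GROUP BY SU_User
) latest ON latest.SU_User = s.SU_User
        AND latest.last_date = s.SU_Date
WHERE s.SU_Skill = 23
ORDER BY s.SU_Value DESC
LIMIT 20;
An index on (SU_Skill, SU_User, SU_Date) should make the derived table cheap to build.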
Can the same user be in the results multiple times? This may work for you.
SELECT SU_USER, MAX(SU_VALUE)
FROM WOU_SKILLUploads
WHERE SU_SKILL=23
GROUP BY SU_USER
ORDER BY MAX(SU_VALUE) DESC
LIMIT 20
I have the following table (points):
recno uid uname points
============================
1 a abc 10
2 b bac 8
3 c cvb 12
4 d aty 13
5 f cyu 9
-------------------------
--------------------------
What I need is to show only the top ten records by points (desc), with five records on each page. I have the following SQL statement:
select * from points where uid in ('a', 'c') order by uid LIMIT 1, 5
Thanks
for the first page:
SELECT * FROM points p ORDER BY points DESC LIMIT 0, 5
for the second page:
SELECT * FROM points p ORDER BY points DESC LIMIT 5, 5
You can't execute a single SQL query that returns a set number of pages; you'll have to implement some kind of pagination module (or whatever equivalent exists for your scenario) and fetch LIMIT 0, 5 for one page, then LIMIT 5, 5 for the other.
With so few records it wouldn't be an issue, but in a production-scale environment selecting all records and then breaking those results down into pages would be a lot of unnecessary overhead; it's good practice to select only the data you need.
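As a general pattern (a sketch, assuming 1-based page numbers and five rows per page), the offset is (page - 1) * rows_per_page:
-- page 2 of the top ten: OFFSET = (2 - 1) * 5 = 5
SELECT recno, uid, uname, points
FROM points
ORDER BY points DESC, recno ASC -- tie-breaker keeps paging stable
LIMIT 5 OFFSET 5;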