I am trying to write an SQL query (run through mysqli) that selects rows with a higher priority more often. I have a DB where all posts are sorted by priority, but I want the selection to look like this (10 is the highest priority):
**Priority**
10
3
10
9
7
10
9
1
10
How can I do this? I have tried several approaches, but none of them worked. Thank you.
If you want to sample your data with preference to higher priorities, you could do something like this:
SELECT *
FROM (
SELECT OrderDetailID
,mod(OrderDetailID, 10) + 1 AS priority
,rand() * 10 AS rand_priority
FROM OrderDetails
) A
WHERE rand_priority < priority
ORDER BY OrderDetailID
This query runs in MySQL Tryit from W3Schools.
mod(OrderDetailID, 10) + 1 simulates a 1-10 priority - your table just has this value in it already
rand() * 10 gives you a random number between 0 and 10
Then by filtering to only ones where the random number is less than the priority, you get a result set where the higher priorities are more likely.
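The filter above can also be sketched outside the database. This is only an illustration in Python, assuming hypothetical (row_id, priority) pairs with priorities from 1 to 10; the function name and the data are made up for the example and are not part of the query.

```python
import random

def sample_by_priority(rows, max_priority=10, rng=random):
    """Keep each (row_id, priority) pair with probability priority / max_priority."""
    kept = []
    for row_id, priority in rows:
        # Same test as WHERE rand_priority < priority in the query above.
        if rng.random() * max_priority < priority:
            kept.append((row_id, priority))
    return kept

# Hypothetical data: 1000 rows for each priority level from 1 to 10.
rows = [(i, i % 10 + 1) for i in range(10_000)]
kept = sample_by_priority(rows, rng=random.Random(42))

# Number of surviving rows per priority level.
counts = {p: sum(1 for _, q in kept if q == p) for p in range(1, 11)}
```

Rows with priority 10 always pass the filter, because the random value is always below 10, while rows with priority 1 pass roughly one time in ten.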
You may use the RANK() window function if your MySQL version supports it (8.0 or later). It orders your data by priority in descending order and ranks each row. If two rows have the same priority, both get the same rank. You can then filter on the first rank, which always gives you the highest-priority rows.
SELECT *
FROM
(
SELECT
col1,
col2,
priority,
RANK() OVER w AS priority_rank
FROM MyTable
WINDOW w AS (ORDER BY priority DESC)
) MyQuery
WHERE priority_rank = 1
Note: The syntax might be incorrect, please feel free to edit the query.
This post might help you with ranking if your MySQL version doesn't support RANK().
I have a large MySQL table with about 25000 rows. There are 5 fields: ID, NAME, SCORE, AGE, SEX
I need to select 5 random MALES from the top of the list ordered by SCORE DESC
For instance, if there are 100 men who score 60 each and another 100 who score 45 each, the script should return 5 random men from the first 200 men in the list of 25000
ORDER BY RAND()
is very slow
The real issue is that the 5 men should be a random selection from within the first 200 records. Thanks for the help
To get something like this I would use a subquery. That way you only apply RAND() in the outer query, which is much less taxing.
From what I understood from your question, you want the 200 males with the highest scores, so that would be something like this:
SELECT *
FROM table_name
WHERE sex = 'male'
ORDER BY score DESC
LIMIT 200
Now, to pick 5 random rows from those 200, it would be something like this:
SELECT id, score, name, age, sex
FROM
( SELECT *
FROM table_name
WHERE sex = 'male'
ORDER BY score DESC
LIMIT 200
) t -- could also be written `AS t` or anything else you would call it
ORDER BY RAND()
LIMIT 5
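The same two-step idea (narrow down to the top 200 first, then randomize only that small set) can be sketched in Python. This is an illustration only, assuming rows are dicts with hypothetical "id", "sex" and "score" keys; the function name is made up.

```python
import random

def random_from_top(rows, top_n=200, pick=5, rng=random):
    """Pick `pick` random males out of the `top_n` highest-scoring males."""
    males = [r for r in rows if r["sex"] == "male"]
    # Inner query: ORDER BY score DESC LIMIT top_n
    top = sorted(males, key=lambda r: r["score"], reverse=True)[:top_n]
    # Outer query: ORDER BY RAND() LIMIT pick, applied to at most top_n rows
    return rng.sample(top, min(pick, len(top)))

# Hypothetical data matching the example in the question.
rows = (
    [{"id": i, "sex": "male", "score": 60} for i in range(100)]
    + [{"id": 100 + i, "sex": "male", "score": 45} for i in range(100)]
    + [{"id": 200 + i, "sex": "male", "score": 10} for i in range(300)]
    + [{"id": 500 + i, "sex": "female", "score": 99} for i in range(50)]
)
chosen = random_from_top(rows, rng=random.Random(1))
```

Only the 200 pre-selected rows are shuffled, which is why the random step stays cheap regardless of the table size.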
I don't think that sorting by random can be "optimised out" in any way, as sorting is an N*log(N) operation. The query analyzer normally avoids sorting by using indexes, which is not possible for a random order.
ORDER BY RAND() generates a random value for every row it is applied to and then sorts all of those rows before LIMIT is applied. This takes a large amount of processing time for tables of more than about 500 rows, and since your table contains approximately 25000 rows it will definitely take a good amount of time.
I'm looking for a MySQL SELECT that lets me fetch 8 records (LIMIT 8) after skipping a variable number of initial matches:
select id
from customers
where name LIKE "John%"
Limit 8
So if I have a table with 1000 Johns with various last names,
I want to be able to select records 500-508
You can pass an offset to the LIMIT clause, like this:
SELECT id
FROM customers
WHERE name LIKE "John%"
LIMIT 8 OFFSET 500
Notice the OFFSET 500 on the LIMIT. It sets the start point past the first 500 entries (at entry #501).
Therefore, entries #501, #502, #503, #504, #505, #506, #507 and #508 will be selected.
This can also be written:
LIMIT 500, 8
Personally, I don't like that form as much, because the order (offset first, then row count) is less obvious.
Pedantic point: 500-508 is 9 entries, so I had to adjust.
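To check the offset arithmetic, here is a small sketch that runs both forms against an in-memory SQLite database from Python. SQLite is used only because it needs no server; it happens to accept both the OFFSET form and the comma form. The table contents are hypothetical, and an ORDER BY is added as an assumption so that "the first 500" is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical data: 1000 customers whose names all start with "John".
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(i, f"John Doe{i}") for i in range(1, 1001)],
)

# LIMIT <count> OFFSET <skip>
ids_offset = [row[0] for row in conn.execute(
    "SELECT id FROM customers WHERE name LIKE 'John%' "
    "ORDER BY id LIMIT 8 OFFSET 500"
)]

# LIMIT <skip>, <count> (offset comes first in the comma form)
ids_comma = [row[0] for row in conn.execute(
    "SELECT id FROM customers WHERE name LIKE 'John%' "
    "ORDER BY id LIMIT 500, 8"
)]
conn.close()
```

Both lists contain ids 501 through 508. Without an ORDER BY the database is free to return the matches in any order, so the rows that are skipped would not be guaranteed to be the same on every run.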
As a solution please try executing the following sql query
select id from customers where name LIKE "John%" Limit 500,8
I have this table,
person_id int(10) pk
points int(6) index
other columns not very important
I have this random function which is very fast on a table with 10M rows:
SELECT person_id
FROM persons AS r1 JOIN
(SELECT (RAND() *
(SELECT MAX(person_id)
FROM persons)) AS id)
AS r2
WHERE r1.person_id >= r2.id
ORDER BY r1.person_id ASC
LIMIT 1
This is all great but now I wish to show only people with points > 0. Example table:
PERSON_ID POINTS
1 4
2 6
3 0
4 3
When I append AND points > 0 to the WHERE clause, person_id 3 can no longer be selected, so a gap is created: whenever the random value lands on person_id 3, person_id 4 is selected instead. This gives person 4 a bigger chance of being chosen. Does anyone have suggestions for how I can adjust the query so that it works with the WHERE clause and gives all rows the same chance of being selected?
Info about the table: the table is uniform, with no gaps in the person_id values. About 90% of the rows will have 0 points. I want to write the query both for points = 0 and for points > 0.
Before someone says "use ORDER BY RAND()": that is not a solution for tables with more than a few 100k rows.
Bonus question: is it possible to select x random rows in 1 query, so I do not have to call this query several times when I want more random rows?
Important note: performance is key. With 10M+ rows the query must not take much longer than the current query, which takes 0.0005 seconds; I would prefer to stay under 0.05 seconds.
Last note: If you think the query can never be this fast with the above requirements, but another solution is possible (like fetching 100 rows and showing x random ones that have more than 0 points), please tell :)
Really appreciate your help and all help is welcome :)
You could generate in-line, gap-free ids for only the records that you really want to work with, and then generate the random selector from the total number of such records.
Try with this (props to the chosen answer here for the row_number generator):
SELECT r1.*
FROM
(SELECT person_id,
@curRow := @curRow + 1 AS row_num  -- gap-free number for the filtered rows
FROM persons AS p,
(SELECT @curRow := 0) r0
WHERE points > 0) r1
, (SELECT COUNT(1) * RAND() AS id
FROM persons
WHERE points > 0) r2
WHERE r1.row_num >= r2.id  -- compare against the gap-free number, not person_id
ORDER BY r1.row_num ASC
LIMIT 1;
You can mess with it in this sqlfiddle.
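Why gap-free numbering removes the bias can be seen in a short Python sketch. It is only an illustration using the four-row example table from the question; the function name is hypothetical.

```python
import random
from collections import Counter

def pick_uniform(persons, rng=random):
    """persons: list of (person_id, points). Return one person_id with points > 0."""
    # Positions 0..n-1 in this list play the role of the gap-free row numbers.
    eligible = [pid for pid, points in persons if points > 0]
    if not eligible:
        return None
    # Every position is equally likely, so every eligible person is too.
    return eligible[rng.randrange(len(eligible))]

# Example table from the question: person 3 has 0 points.
persons = [(1, 4), (2, 6), (3, 0), (4, 3)]
rng = random.Random(0)
counts = Counter(pick_uniform(persons, rng) for _ in range(30_000))
```

Each of persons 1, 2 and 4 ends up with roughly a third of the draws, whereas seeking on the original person_id would send every draw that lands on person 3 to person 4.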
I have hundreds of thousands of price points spanning 40 years plus. I would like to construct a query that will only return 3000 total data points, with the last 500 being the most recent data points, and the other 2500 being just a sample of the rest of the data, evenly distributed.
Is it possible to do this in one query? How would I select just a sample of the large amount of data? This is a small example of what I mean for getting just a sample of the other 2500 data points:
1
2
3
4
5
6
7
8
9
10
And I want to return something like this:
1
5
10
Here's the query for the last 500:
SELECT * FROM price ORDER BY time_for DESC LIMIT 500
I'm not sure how to go about getting the sample data from the other data points.
Try this:
(SELECT * FROM price ORDER BY time_for DESC LIMIT 500)
UNION ALL
(SELECT * FROM price WHERE time_for <= (SELECT time_for FROM price ORDER BY time_for DESC LIMIT 500, 1) ORDER BY rand() LIMIT 2500)
ORDER BY time_for
Note: It's probably going to be slow. How big is this table?
It might be faster to only get the primary ID from all these rows, then join it to the original in a secondary query once it's narrowed down. This is because ORDER BY rand() LIMIT has to sort the entire table. If the table is large this can take a LONG time, and a lot of disk space. Retrieving only the ID reduces the necessary disk space.
The previous answer is good, but you did specify that you want the results to be evenly distributed so I'll add this possibility too. By iterating a counter over the rows you can use a MOD operator to sample an even distribution. I don't have a MYSQL install right now to test this so apologies if the syntax isn't 100% spot on. But it should be close enough and may give you some ideas.
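( SELECT p1.*, 0 AS row_num          -- extra column so both branches have the same shape
  FROM price p1
  ORDER BY p1.time_for DESC
  LIMIT 500 )
UNION ALL
( SELECT numbered.*
  FROM ( SELECT p2.*,
                @i := @i + 1 AS row_num
         FROM price p2,
              (SELECT @i := 0) init
         -- Caution: MySQL does not guarantee that user variables are assigned
         -- in ORDER BY order; on 8.0+ ROW_NUMBER() OVER (ORDER BY time_for DESC)
         -- is the more reliable way to number the rows.
         ORDER BY p2.time_for DESC ) numbered
  WHERE numbered.row_num > 500
    AND (numbered.row_num % 500) = 0 )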
The first query gives the 500 latest rows. The second query gives every 500th row after that, thus returning an even distribution from the rest of the data. Obviously you can tune this parameter to achieve the desired sample spacing. Or base it on the total number of rows in the table to calculate the necessary spacing to give exactly 2500 records.
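The last suggestion, deriving the spacing from the total row count so that exactly 2500 older rows come back, might look like this in Python. This is a sketch under the assumption that the rows are already sorted from oldest to newest; the function name and the test data are hypothetical.

```python
def recent_plus_even_sample(rows, recent=500, sample=2500):
    """Return `sample` evenly spaced older rows followed by the `recent` newest rows.

    `rows` must be sorted from oldest to newest and `recent` must be positive.
    """
    if len(rows) <= recent + sample:
        # Not enough data to thin out: return everything.
        return list(rows)
    older, latest = rows[:-recent], rows[-recent:]
    # Spacing between sampled positions, computed from the table size.
    step = len(older) / sample
    picked = [older[int(i * step)] for i in range(sample)]
    return picked + latest

# Hypothetical data: 100000 price points represented by their position in time.
rows = list(range(100_000))
result = recent_plus_even_sample(rows)
```

With 100000 rows the spacing works out to 39.8, so roughly every 40th of the older rows is kept and the result has exactly 3000 entries.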
I want to search for records where a particular field either STARTS WITH some string (let's say "ar") OR that field CONTAINS the string, "ar".
However, I consider the two conditions different, because I'm limiting the number of results returned to 10 and I want the STARTS WITH condition to be weighted more heavily than the CONTAINS condition.
Example:
SELECT *
FROM Employees
WHERE Name LIKE 'ar%' OR Name LIKE '%ar%'
LIMIT 10
The catch is that if there are names that START with "ar", they should be favored. The only way I should get back a name that merely CONTAINS "ar" is if there are FEWER than 10 names that START with "ar".
How can I do this against a MySQL database?
You need to select them in 2 parts, and add a Preferred tag to the results. Take 10 from each segment, then merge them and again take the best 10. If segment 1 produces 8 entries, then segment 2 of the UNION ALL will provide the remaining 2.
SELECT *
FROM
(
  -- each branch is parenthesized so that its LIMIT applies to that branch only
  (SELECT *, 1 AS Preferred
   FROM Employees
   WHERE Name LIKE 'ar%'
   LIMIT 10)
  UNION ALL
  (SELECT *, 2
   FROM Employees
   WHERE Name NOT LIKE 'ar%' AND Name LIKE '%ar%'
   LIMIT 10)
) Y
ORDER BY Preferred
LIMIT 10
Assign a code value to results, and sort by the code value:
select
*,
(case when name like 'ar%' then 1 else 2 end) as priority
from
employees
where
name like 'ar%' or name like '%ar%'
order by
priority
limit 10
Edit:
See Richard aka cyberkiwi's answer for a more efficient solution if there are potentially lots of matches.
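The code-value approach can be tried out with an in-memory SQLite database from Python. SQLite stands in for MySQL here only because it needs no server, and CASE and LIKE behave the same way for this purpose. The employee names are made up, the single '%ar%' filter is used because it already covers the starts-with case, and the extra sort on name is an assumption added to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT)")
# Hypothetical names: two start with "ar", three merely contain it, one has neither.
conn.executemany(
    "INSERT INTO employees (name) VALUES (?)",
    [("Mark",), ("Arnold",), ("Carl",), ("Ariel",), ("Bob",), ("Gary",)],
)

query = """
    SELECT name,
           CASE WHEN name LIKE 'ar%' THEN 1 ELSE 2 END AS priority
    FROM employees
    WHERE name LIKE '%ar%'
    ORDER BY priority, name
    LIMIT 3
"""
top_names = [row[0] for row in conn.execute(query)]
conn.close()
```

With a limit of 3, both names starting with "ar" are returned first, and only the one remaining slot is filled by a name that merely contains it.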
My solution is:
SELECT *
FROM Employees
WHERE Name LIKE '%ar%'
ORDER BY instr(name, 'ar'), name
LIMIT 10
instr() returns the position of the first occurrence of the pattern in question, so AR% will come before xxAR.
This has two advantages:
It should only need one table scan. Unions and derived tables need three: the first two to filter the column on each pattern, and a third over the combined subset to find matching rows, since UNION filters out duplicates.
It gives a true sort based on the position of the pattern: Wx before xW before xxW, and so on.
Try this (I don't have a MySQL instance immediately available to test with):
SELECT * FROM
(SELECT * FROM Employees WHERE Name LIKE 'ar%'
UNION
SELECT * FROM Employees WHERE Name LIKE '%ar%'
) AS matches
LIMIT 10
There are probably better ways to do it, but that immediately sprang to mind. Note that without an ORDER BY, MySQL does not guarantee that the rows from the first SELECT come first.
SELECT *
FROM Employees
WHERE Name LIKE 'ar%' OR Name LIKE '%ar%'
ORDER BY Name LIKE 'ar%' DESC
LIMIT 10
This should work: it orders by the boolean (1/0) result of the LIKE comparison, so names that start with "ar" come first. If the column is indexed, the query should benefit from the index.