Distinct on column random values

I need to fetch 12 questions from my table. I have a field called "bucket" which might have duplicate values.
While fetching, I need only unique bucket values to be fetched and the number of rows has to be 12.
This is my query:
select *
from (
select
DISTINCT ON (q.bucket) bucket,
row_number() over (partition by dl.value order by random()) as rn,
row_number() over (partition by dl.value, LOWER(qc.value) = LOWER('general') order by random()) as rnc,
dl.value, qc.value as question_category,
q.question_text, q.option_a, q.option_b, q.option_c, q.option_d,
q.correct_answer, q.image_link, q.question_type
from
questions_bank q
inner join
question_category qc on qc.id = q.question_category_id
inner join
sports_type st on st.id = q.sports_type_id
inner join
difficulty_level dl on dl.id = q.difficulty_level_id
where st.game_type = lower('cricket') and dl.value in ('E','M','H')
) s
where
(value = 'E' and rnc <= 3 and LOWER(question_category) != LOWER('general')) or
(value = 'E' and rnc <= 3 and LOWER(question_category) = LOWER('general')) or
value = 'M' and rn <= 4 or
value = 'H' and rn <= 2;
Can anyone please tell me what I am doing wrong here? The query sometimes doesn't return 12 rows. This happens whenever the same bucket value occurs more than once. I think that whenever DISTINCT is applied, the duplicate row is removed. Hence, when I filter on rn <= 4, a row number such as 3 can be missing, so only 3 rows are returned instead of 4.
So, I need to apply DISTINCT on bucket first and then compute the row numbers. How should I do that?
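
One way to read "apply distinct on bucket first" is to pick a single random row per bucket in an inner subquery and only then number the survivors per difficulty level. The following is a minimal sketch of that idea, not the poster's schema: the `questions` table, its three columns, the sample data and the simplified per-level quotas (6, 4 and 2) are all assumptions, and it runs on SQLite (3.25 or later, for window functions) through Python's `sqlite3` so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table: one difficulty level per bucket.
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, bucket TEXT, level TEXT)")
rows = []
for level, n_buckets in (("E", 7), ("M", 5), ("H", 3)):
    for b in range(n_buckets):
        # Two questions share each bucket, so duplicate bucket values exist.
        rows += [(f"{level}{b}", level)] * 2
conn.executemany("INSERT INTO questions (bucket, level) VALUES (?, ?)", rows)

QUERY = """
SELECT id, bucket, level
FROM (
    -- Step 2: number the de-duplicated rows per level.
    SELECT id, bucket, level,
           row_number() OVER (PARTITION BY level ORDER BY random()) AS rn
    FROM (
        -- Step 1: keep one random row per bucket.
        SELECT id, bucket, level,
               row_number() OVER (PARTITION BY bucket ORDER BY random()) AS pick
        FROM questions
    ) AS per_bucket
    WHERE pick = 1
) AS numbered
WHERE (level = 'E' AND rn <= 6)
   OR (level = 'M' AND rn <= 4)
   OR (level = 'H' AND rn <= 2)
"""
picked = conn.execute(QUERY).fetchall()
print(len(picked), "rows,", len({bucket for _, bucket, _ in picked}), "distinct buckets")
```

Because the row numbers are assigned after the duplicates are gone, the `rn` filter no longer has gaps, as long as each level has at least as many distinct buckets as its quota.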

Related

MySQL: Add a WHERE in a LEFT JOIN

I have the following table:
The sequenceNumber values are always multiples of 10. I would like to get, for a specific cpId, the smallest free sequence number (still a multiple of 10). For example, for cpId = 1, the smallest available should be 20. For cpId = 2, it should be 10.
I have the following statement to get the smallest available sequenceNumber across all cpId values, and I don't know how to add a WHERE cpId = x inside the statement:
SELECT MIN(t1.sequenceNumber + 10) AS nextID
FROM LogicalConnection t1
LEFT JOIN LogicalConnection t2
ON t1.sequenceNumber + 10 = t2.sequenceNumber
WHERE t2.sequenceNumber IS NULL;
DB fiddle: https://www.db-fiddle.com/f/ag67AkFzfwPZEva8bTN7Q3/2#&togetherjs=L9nHb3Uu7O
Thank you for your help!

You can use lead() to get the next number and then some simple logic:
select cpid,
(case when min_sn > 10 then 10
else min(sequenceNumber) + 10
end)
from (select t.*,
min(sequenceNumber) over (partition by cpid) as min_sn,
lead(sequenceNumber) over (partition by cpid order by sequenceNumber) as next_sn
from t
) t
where next_sn is null or next_sn <> sequenceNumber + 10
group by cpid, min_sn;
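
As a rough check of the lead() approach, here is a small sketch run on SQLite through Python's `sqlite3`. The sample rows are an assumption chosen to match the expected results stated in the question (20 for cpId 1, 10 for cpId 2); the real table contents are not shown in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LogicalConnection (cpId INTEGER, sequenceNumber INTEGER)")
# Assumed sample data: cpId 1 has a gap at 20, cpId 2 starts above 10.
conn.executemany(
    "INSERT INTO LogicalConnection VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 40), (2, 20), (2, 30)],
)

QUERY = """
SELECT cpId,
       CASE WHEN min_sn > 10 THEN 10            -- nothing stored at 10 yet
            ELSE MIN(sequenceNumber) + 10       -- first row followed by a gap
       END AS nextID
FROM (
    SELECT cpId, sequenceNumber,
           MIN(sequenceNumber) OVER (PARTITION BY cpId) AS min_sn,
           LEAD(sequenceNumber) OVER (PARTITION BY cpId ORDER BY sequenceNumber) AS next_sn
    FROM LogicalConnection
) AS t
WHERE next_sn IS NULL OR next_sn <> sequenceNumber + 10
GROUP BY cpId, min_sn
ORDER BY cpId
"""
next_ids = dict(conn.execute(QUERY).fetchall())
print(next_ids)  # {1: 20, 2: 10}
```

To restrict the result to one cpId, a `WHERE cpId = ?` can be added to the inner select.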

You have to join both tables on the cpId column and group the rows by cpId.
A WHERE clause can be used to filter rows.
The query below gives each cpId and its corresponding next available minimum sequence number.
SELECT t1.cpId, MIN(t1.sequenceNumber + 10) AS nextID
FROM LogicalConnection t1
LEFT JOIN LogicalConnection t2
ON t1.sequenceNumber + 10 = t2.sequenceNumber
and t1.cpId = t2.cpId
where t2.sequenceNumber IS NULL
group by (t1.cpId)
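
For a single cpId, the self-join can be filtered in the WHERE clause alongside the `IS NULL` test from the question's query. The sketch below is run on SQLite through Python's `sqlite3`; the helper name `next_free` and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LogicalConnection (cpId INTEGER, sequenceNumber INTEGER)")
# Assumed sample data; the real table contents are not shown in the post.
conn.executemany(
    "INSERT INTO LogicalConnection VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 40), (2, 20), (2, 30)],
)

def next_free(conn, cp_id):
    """Smallest sequenceNumber + 10 that is not taken for the given cpId."""
    row = conn.execute(
        """
        SELECT MIN(t1.sequenceNumber + 10)
        FROM LogicalConnection t1
        LEFT JOIN LogicalConnection t2
               ON t1.sequenceNumber + 10 = t2.sequenceNumber
              AND t1.cpId = t2.cpId
        WHERE t2.sequenceNumber IS NULL   -- no row occupies the next slot
          AND t1.cpId = ?                 -- restrict to one cpId
        """,
        (cp_id,),
    ).fetchone()
    return row[0]

print(next_free(conn, 1))  # 20
print(next_free(conn, 2))  # 40
```

Note that this form only looks at slots after an existing row, so for cpId 2 in the sample data it reports 40 even though 10 is free; a free slot below the smallest existing number needs separate handling.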

update row if count(*) > n

my DB has this structure:
ID | text | time | valid
This is my current code. I'm trying to find a way to do this as one query.
rows = select * from table where ID=x order by time desc;
n=0;
foreach rows{
if(n > 3){
update table set valid = -1 where rows[n];
}
n++
}
I'm checking how many rows exist for a given ID. Then I need to set valid=-1 for all rows where n >3;
Is there a way to do this with one query?

You can use a subquery in the WHERE clause, like this:
UPDATE table
SET valid=-1
WHERE (
SELECT COUNT(*)
FROM table tt
WHERE tt.time > table.time
AND tt.ID = table.ID
) > 3
The subquery counts the rows with the same ID and a later time. This count will be three or less for the four latest rows; the remaining ones have a greater count, so their valid field is updated.
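
To see which rows the correlated count selects, here is a small sketch run on SQLite through Python's `sqlite3`. The table name `entries` and the sample rows are assumptions. SQLite accepts this statement as written; MySQL may reject a subquery that reads the table being updated, so treat this as an illustration of the logic rather than a drop-in MySQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with the column layout from the question.
conn.execute(
    "CREATE TABLE entries (ID INTEGER, text TEXT, time INTEGER, valid INTEGER DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO entries (ID, text, time) VALUES (?, ?, ?)",
    [(1, f"row {t}", t) for t in range(1, 7)] + [(2, "other", 1)],
)

# Invalidate every row that has more than three later rows with the same ID.
conn.execute(
    """
    UPDATE entries
    SET valid = -1
    WHERE (
        SELECT COUNT(*)
        FROM entries AS tt
        WHERE tt.time > entries.time
          AND tt.ID = entries.ID
    ) > 3
    """
)

invalid_times = [
    t for (t,) in conn.execute(
        "SELECT time FROM entries WHERE ID = 1 AND valid = -1 ORDER BY time"
    )
]
print(invalid_times)  # [1, 2]
```

With six rows for ID 1, only the two oldest are invalidated, which matches the loop in the question (n starts at 0 and rows are updated once n > 3).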

Assuming that (id,time) has a UNIQUE constraint, i.e. no two rows have the same id and same time:
UPDATE
tableX AS tu
JOIN
( SELECT time
FROM tableX
WHERE id = @X -- the given ID
ORDER BY time DESC
LIMIT 1 OFFSET 2
) AS t3
ON tu.id = @X -- given ID again
AND tu.time < t3.time
SET
tu.valid = -1 ;

update table
set valid = -1
where id in (select id
from table
where id = GIVEN_ID
group by id
having count(1) >3)
Update: I really like dasblinkenlight's solution because it is very neat, but I also wanted to try doing it my own way, which is quite verbose:
update Table1
set valid = -1
where (id, time) in (select id,
time
from (select id,time
from table1
where id in (select id
from table1
group by id
having count(1) >3)
-- and id = GIVEN_ID
order by time
limit 3, 10000000)
t);
Also in SQLFiddle

To do it for all IDs, or for only one if you add a WHERE to the innermost subquery:
UPDATE TABLE
LEFT JOIN (
SELECT *
FROM (
SELECT @rn:=if(@prv=id, @rn+1, 1) AS rId,
@prv:=id AS id,
TABLE.*
FROM TABLE
JOIN ( SELECT @prv:=0, @rn:=0 ) tmp
ORDER BY id, TIMESTAMP
) a
WHERE rid > 3
) ordered ON ordered.id = TABLE.id
AND ordered.TIMESTAMP = TABLE.TIMESTAMP
AND ordered.text = TABLE.text
SET VALID = -1
WHERE rid IS NOT NULL

Return Rows That Share A Common Value But Another Column Must Match Multiple Criteria

I have a table that is sorted by id and value in descending order. I want to return all id's that match a group of keys in a specific order. So given (a5, a3) I want to return a and b but not d.
id value key
a 3 a5
a 2 a3
a 1 a4
b 4 a5
b 2 a3
c 6 a1
c 2 a2
d 4 a3
d 2 a5
The expected output would be
id
a
b
So far I've managed to match (a5, a3) but in any order. Here I'm returning all rows and fields that match in any order; not just the id.
SELECT tablename.*
FROM tablename, (SELECT * FROM tablename a
WHERE key IN ('a5', 'a3')
GROUP BY id
HAVING COUNT(*) >= 1) AS result
WHERE tablename.id = result.id

This is an example of a set-within-sets query, although it is a bit more complicated than most.
select id
from tablename t
group by id
having (max(case when `key` = 'a5' then value end) >
max(case when `key` = 'a3' then value end)
);
What this is doing is finding the value for "a5" and "a3" and directly comparing them. If either key is missing, then the max(case . . .) will return NULL and the comparison will fail. If there is more than one value for either (or both), then it uses the largest value.
This should be pretty easy to generalize to additional keys in a particular order, by adding more similar clauses. This is why I like the aggregation with having approach to this sort of query -- it works in a lot of cases.
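
The aggregation can be tried against the sample rows from the question. This is a small sketch run on SQLite through Python's `sqlite3`; SQLite accepts the double-quoted identifier "key", which is an assumption about the engine rather than about MySQL's default settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tablename (id TEXT, value INTEGER, "key" TEXT)')
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("a", 3, "a5"), ("a", 2, "a3"), ("a", 1, "a4"),
        ("b", 4, "a5"), ("b", 2, "a3"),
        ("c", 6, "a1"), ("c", 2, "a2"),
        ("d", 4, "a3"), ("d", 2, "a5"),
    ],
)

QUERY = """
SELECT id
FROM tablename
GROUP BY id
-- a5 must appear with a larger value than a3, i.e. earlier in descending order
HAVING MAX(CASE WHEN "key" = 'a5' THEN value END) >
       MAX(CASE WHEN "key" = 'a3' THEN value END)
ORDER BY id
"""
ids = [row[0] for row in conn.execute(QUERY)]
print(ids)  # ['a', 'b']
```

Group c has neither key, so both aggregates are NULL and the group is dropped; group d has both keys but in the wrong order.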
For the "nothing-in-between" case that you mention, I think this will work:
select id
from (select t.*, @rn := @rn + 1 as seqnum
from tablename t cross join (select @rn := 0) const
order by `key`, value
) t
group by id
having (max(case when `key` = 'a5' then seqnum end) = max(case when `key` = 'a3' then seqnum end) + 1
);
This appends a sequence number and then checks that the numbers are consecutive for your two values.

For this you can use the following query:
select distinct t_1.id from tablename t_1, tablename t_2
where t_1.id = t_2.id
and t_1.`key` = 'a5' and t_2.`key` = 'a3'
and t_1.value > t_2.value

MySQL SELECT best-fit record based on optional NULL values

Please refer to this SQLFiddle: http://sqlfiddle.com/#!2/9db4f
Table 'rate' stores hourly chargeout rates for job roles, with the option to store varying rates for a role based on the client company, group (a 'group' is just a division of a company) and client contact.
Rates can also vary over time.
I'd like to select the single most recent, best-fit rate for a given combination of role, company, group and contact. It should try to match, in this order:
client_contact, client_group, client_company and role
client_group, client_company and role
client_company and role
just role
For example: I'm looking for a rate matching role ID 3, company ID 3 and client ID 4.
There isn't a record matching all of those, so it should look for one matching just role ID 3 and company ID 3. (The other fields - client_contact and client_group - must be NULL). There are two of those: row ID's 2 and 3. It should select row ID 3, as it has the most recent 'date_from' date.
Another example: I'm looking for a rate matching role ID 3, and company ID 25.
There isn't one of those either so it should look for one matching just role ID 3, and NULLs for all the other values. There's only one matching row: number 1.
The query on the current SQLFiddle does the 'fetch the most recent' bit, but I'm stuck as to getting it to match optional columns if they're present.
Halp :(
Edit: oops, it looks like SQLFiddle only saves the schema, not the query. This is what I've got:
SELECT
rate.*
FROM
rate
LEFT JOIN rate AS newest ON (
rate.role = newest.role
AND COALESCE(rate.client_company, 1) = COALESCE(newest.client_company, 1)
AND COALESCE(rate.client_group, 1) = COALESCE(newest.client_group, 1)
AND COALESCE(rate.client_contact, 1) = COALESCE(newest.client_contact, 1)
AND newest.date_from > rate.date_from
)
WHERE newest.id IS NULL

I would approach it like this.
Assuming you are looking for:
client_contact = 5
client_group= 3
client_company= 3
role = 3
Query:
select *
from rate
where ifnull(client_contact, 5) = 5
and ifnull(client_group, 3) = 3
and ifnull(client_company, 3) = 3
and ifnull(role, 3) = 3
order by case
when client_contact = 5 and client_group = 3 and client_company = 3 and role = 3
then 1
when client_contact is null and client_group = 3 and client_company = 3 and role = 3
then 2
when client_contact is null and client_group is null and client_company = 3 and role = 3
then 3
when client_contact is null and client_group is null and client_company is null and role = 3
then 4
end, date_from desc
limit 1
SQL Fiddle Example
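
The same filter-then-rank idea can be sketched with bound parameters instead of hardcoded values. The example runs on SQLite through Python's `sqlite3`; the helper name `best_rate`, the three sample rows (modelled on the rows described in the question) and the shortened CASE are assumptions. The shortened CASE relies on the data being hierarchical, meaning a row with a contact also has a group and a company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE rate (
        id INTEGER PRIMARY KEY, role INTEGER, client_company INTEGER,
        client_group INTEGER, client_contact INTEGER,
        date_from TEXT, hourly_rate REAL)"""
)
# Assumed rows: 1 = role only, 2 and 3 = role + company (3 is more recent).
conn.executemany(
    "INSERT INTO rate VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 3, None, None, None, "2012-01-01", 50.0),
        (2, 3, 3, None, None, "2012-01-01", 60.0),
        (3, 3, 3, None, None, "2012-06-01", 65.0),
    ],
)

def best_rate(conn, role, company, group_id, contact):
    """Return the id of the most specific, most recent matching rate row."""
    params = {"role": role, "company": company, "grp": group_id, "contact": contact}
    row = conn.execute(
        """
        SELECT id FROM rate
        WHERE role = :role
          -- a NULL column acts as a wildcard for that level
          AND ifnull(client_company, :company) = :company
          AND ifnull(client_group, :grp) = :grp
          AND ifnull(client_contact, :contact) = :contact
        ORDER BY CASE
                   WHEN client_contact IS NOT NULL THEN 1
                   WHEN client_group IS NOT NULL THEN 2
                   WHEN client_company IS NOT NULL THEN 3
                   ELSE 4
                 END,
                 date_from DESC
        LIMIT 1
        """,
        params,
    ).fetchone()
    return row[0] if row else None

print(best_rate(conn, 3, 3, 2, 4))   # 3
print(best_rate(conn, 3, 25, 2, 4))  # 1
```

All four arguments need concrete values here: passing None for one of them makes the corresponding `ifnull(...) = NULL` comparison fail for every row.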

I took this to mean that for each rate record, you want the most recent rate that matches. The following works on the given data, but the real answer comes after this.
select rate.*,
rmax.hourly_rate,
rmax.maxdate
from rate join
(select rmax.*, forhr.hourly_rate
from (select role, client_company, client_group, client_contact, MAX(date_from) as maxdate
from rate
group by role, client_contact, client_group, client_company, role with rollup
) rmax join
rate forhr
on coalesce(rmax.role, '') = coalesce(forhr.role, '') and
coalesce(rmax.client_company , '') = coalesce(forhr.client_company, '') and
coalesce(rmax.client_contact, '') = coalesce(forhr.client_contact, '') and
coalesce(rmax.client_group, '') = coalesce(forhr.client_group, '') and
rmax.maxdate = forhr.date_from
) rmax
on coalesce(rmax.role, '') = coalesce(rate.role, '') and
coalesce(rmax.client_company , '') = coalesce(rate.client_company, '') and
coalesce(rmax.client_contact, '') = coalesce(rate.client_contact, '') and
coalesce(rmax.client_group, '') = coalesce(rate.client_group, '')
The inner query is just to get the most recent rate associated with the date. The outer query does the match to each record in the rate table.
I'm not sure what I was thinking above; I think I was interrupted in my thought process. I think the best way to approach this is with a correlated subquery. The following gets the hourly rate of the best match. You can use the same approach to get the id and additional information:
select r.*,
(select hourly_rate
from rate r2
where coalesce(r.role, '') = coalesce(r2.role, '') and
coalesce(r.client_company , '') = coalesce(r2.client_company, '') and
coalesce(r.client_contact, '') = coalesce(r2.client_contact, '') and
coalesce(r.client_group, '') = coalesce(r2.client_group, '')
order by (case when r2.client_contact is not null then 1
when r2.client_group is not null then 2
when r2.client_company is not null then 3
when r2.role is not null then 4
else 5
end),
date_from desc
limit 1
) as most_recent_hourly_rate
from rate r
This uses a correlated subquery to get the row that has the most matches. The key is the order by clause, which orders from the "most matched" field to the "least matched", and then by date. In this case, it pulls the hourly rate. In practice, I would pull the id and join back to rate to get the rate and other information (such as the date).
In the given form, it assumes the data in rate is structured as you say and that you don't have rows with, say, client_group and role, but not the other fields.

On your returned set, I would assign a rank based on the optional columns: the higher the rank, the closer the match. Then take the record with the maximum rank (tested on SQL Fiddle).
Expanding your query:
SELECT T.*
FROM
(
SELECT
rate.*,
((CASE WHEN COALESCE(rate.client_company, 1) = COALESCE(newest.client_company, 1)
AND rate.client_company = 25 THEN 1
ELSE 0 END) +
(CASE WHEN COALESCE(rate.client_group, 1) = COALESCE(newest.client_group, 1)
AND rate.client_group = NULL THEN 1
ELSE 0 END) +
(CASE WHEN COALESCE(rate.client_contact, 1) = COALESCE(newest.client_contact, 1)
AND rate.client_contact = NULL THEN 1
ELSE 0 END)) AS RANK
FROM
rate
LEFT JOIN rate AS newest
ON rate.role = newest.role
AND COALESCE(rate.client_company, 1) = COALESCE(newest.client_company, 1)
AND COALESCE(rate.client_group, 1) = COALESCE(newest.client_group, 1)
AND COALESCE(rate.client_contact, 1) = COALESCE(newest.client_contact, 1)
AND newest.date_from > rate.date_from
WHERE newest.id IS NULL
AND rate.role = 3
) T
HAVING T.Rank = MAX(T.Rank)
Please note:
In the code where we have rate.client_company = 25, 25 should be replaced with the input parameter for the company ID.
In the code where we have rate.client_group = NULL, NULL should be replaced with the input parameter for the client group ID.
In the code where we have rate.client_contact = NULL, NULL should be replaced with the input parameter for the client contact ID.
In the code where we have rate.role = 3, 3 should be replaced with the input parameter for the role ID.
I tried to declare variables in SQL Fiddle but was unable to do so, so I hardcoded the above values.

Don't return any result if the precedent query has at least 1 result?

Basically I have this query:
( SELECT * FROM tbl WHERE type = 'word1' )
UNION
( SELECT * FROM tbl WHERE type = 'word2' ) -- Run this query only if there are no results with type = 1
I would like to run the second query only if the first returns no results. Is it possible?

The first "preCheck" query counts how many records are of type = 1. If the count is greater than 0, it returns 1; otherwise it returns 2.
That answer can then be used in the join (which is always a single row because of COUNT(*)) and will have a value of either 1 or 2. That value becomes the second operand of the equality condition. So, if there is an entry of type 1, the result will be as if
WHERE t1.Type = 1
Thus never allowing any 2 in the test. HOWEVER, if NO entries are found, it will have a value of 2 and thus create a WHERE clause of
WHERE t1.type = 2
select t1.*
from
( select if( count(*) > 0, 1, 2 ) IncludeType
from tbl t2
where t2.type = 1 ) preCheck,
tbl t1
where
t1.type = preCheck.IncludeType
If there is an index on the "type" column, the first query should be almost instantaneous.

You could write
select * from tbl
where type = 1
union
select * from tbl
where type = 2
and not exists( select * from tbl where type = 1 )
but this probably won't perform as well as just doing it in your program
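
A quick way to check the NOT EXISTS fallback is to run it against two small data sets, one with type 1 rows and one without. The sketch below uses SQLite through Python's `sqlite3`; the helper name `fetch_with_fallback`, the two-column `tbl` layout and the sample rows are assumptions.

```python
import sqlite3

def fetch_with_fallback(rows):
    """Load rows into a throwaway tbl and run the UNION / NOT EXISTS query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER, type INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    return conn.execute(
        """
        SELECT * FROM tbl WHERE type = 1
        UNION
        SELECT * FROM tbl
        WHERE type = 2
          -- the second branch only contributes when no type 1 row exists
          AND NOT EXISTS (SELECT * FROM tbl WHERE type = 1)
        ORDER BY id
        """
    ).fetchall()

print(fetch_with_fallback([(1, 1), (2, 2)]))  # [(1, 1)]
print(fetch_with_fallback([(2, 2), (3, 2)]))  # [(2, 2), (3, 2)]
```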

This does the trick:
SELECT tbl.* FROM tbl JOIN (SELECT min(type) min_type FROM tbl WHERE type between 1 and 2 ) m on m.min_type = tbl.type
First, it selects the lesser of these two types, if any exists, and then joins this one-number table to your table. It is actually a simple filter. You can use WHERE instead of JOIN, if you want.
SELECT tbl.* FROM tbl WHERE (SELECT min(type) FROM tbl WHERE type between 1 and 2 ) = type
