using levenshtein distance ratio to compare 2 records

I've created MySQL user-defined functions from the Levenshtein distance and ratio source code. I am comparing two records, and I want to select a record when the match is 75% or better. The flow is:
An order comes into the table paypal_ipn_orders with an item title (item_name).
A query runs against the table itemkey to find a 75% match in its item column.
If a title matches at 75%, the eight-digit number (sort_id) from itemkey is assigned to the row in paypal_ipn_orders.
Here is the query:
UPDATE paypal_ipn_orders
SET sort_num = (SELECT sort_id
                FROM itemkey
                WHERE levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75)
WHERE packing_slip_printed = 0
  AND LOWER(payment_status) = 'completed'
  AND address_name <> ''
  AND shipping < 100
I have adjusted this a few times, but it fails between lines 4 and 5, at the levenshtein_ratio part. When it does run, it reports that the subquery returns more than one row. I don't know how to fix it so that it returns the correct result; I'm just lost as to how to make this work.

A subquery used in SET may return only one value. If itemkey has more than one item that matches item_name at over 75%, which one do you want? The query below uses the best match (or one of them, if several tie):
UPDATE paypal_ipn_orders
SET sort_num = (SELECT sort_id
                FROM itemkey
                WHERE levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75
                ORDER BY levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) DESC
                LIMIT 1)
WHERE packing_slip_printed = 0
  AND LOWER(payment_status) = 'completed'
  AND address_name <> ''
  AND shipping < 100
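
To make the "best match above 75%" rule concrete outside the database, here is a minimal Python sketch. It assumes levenshtein_ratio is defined as (1 - distance / length of the longer string) * 100, which is a common definition but may not be exactly what the user function in the question computes. The function names and any sample rows are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    # Assumed definition: 100 means identical, 0 means every character differs.
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return (1 - levenshtein(a, b) / longest) * 100


def best_sort_id(item_name, itemkey_rows, threshold=75):
    """Mimic WHERE ratio > threshold ORDER BY ratio DESC LIMIT 1.

    itemkey_rows is a list of (item, sort_id) pairs. Returns the sort_id of
    the best match, or None when no row clears the threshold.
    """
    scored = [(levenshtein_ratio(item, item_name), sort_id)
              for item, sort_id in itemkey_rows]
    qualifying = [pair for pair in scored if pair[0] > threshold]
    if not qualifying:
        return None
    return max(qualifying, key=lambda pair: pair[0])[1]
```

As in the SQL, a row with no qualifying match ends up with no value (None here, NULL in the UPDATE), and ties between equally good matches are broken arbitrarily.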

Related

mysql select counts by if else condition

I have a table named cq500_all that records feedback from different doctors.
I want to count the rows where both of these conditions hold:
the fields dr_1_finish and dr_2_finish are both 1
and
dr_1 is different from dr_2 (for example dr_1 = 1 and dr_2 = 0, or dr_1 = 0 and dr_2 = 1).
The reason is that I want to know how many times the two doctors gave different feedback on the same JPG.
In the example image, CQ500-CT-1_36_08.jpg and CQ500-CT-1_36_09.jpg match these conditions,
so the count would be two.
How do I write this query in MySQL?

You can count them like this:
select count(*) as total
from cq500_all
where dr_1_finish = 1 and dr_2_finish = 1 and dr_1 != dr_2
The count is returned in the total column.

Pretty much just the way you've described it:
select *
from cq500_all
where dr_1_finish = 1 and dr_2_finish = 1
and dr_1 != dr_2
or (if dr_1 or dr_2 might not be just 0 and 1):
select *
from cq500_all
where dr_1_finish = 1 and dr_2_finish = 1
and ((dr_1 = 1 and dr_2 = 0) or (dr_1 = 0 and dr_2 = 1))
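
One way to sanity-check the count is to run the same WHERE clause against a small in-memory table. The sketch below uses Python's sqlite3 module as a stand-in for MySQL. The image_name column and the sample rows are made up for illustration; only the table name and the four dr_ columns come from the question.

```python
import sqlite3


def count_disagreements(rows):
    """rows: (image_name, dr_1, dr_2, dr_1_finish, dr_2_finish) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cq500_all ("
        "image_name TEXT, dr_1 INTEGER, dr_2 INTEGER, "
        "dr_1_finish INTEGER, dr_2_finish INTEGER)"
    )
    conn.executemany("INSERT INTO cq500_all VALUES (?, ?, ?, ?, ?)", rows)
    # Same condition as the answers above: both doctors finished, feedback differs.
    (total,) = conn.execute(
        "SELECT COUNT(*) AS total FROM cq500_all "
        "WHERE dr_1_finish = 1 AND dr_2_finish = 1 AND dr_1 <> dr_2"
    ).fetchone()
    conn.close()
    return total


# Hypothetical sample data.
sample = [
    ("CQ500-CT-1_36_07.jpg", 1, 1, 1, 1),  # both finished, same feedback
    ("CQ500-CT-1_36_08.jpg", 1, 0, 1, 1),  # both finished, feedback differs
    ("CQ500-CT-1_36_09.jpg", 0, 1, 1, 1),  # both finished, feedback differs
    ("CQ500-CT-1_36_10.jpg", 1, 0, 1, 0),  # second doctor not finished
]
```

With this sample, only the _08 and _09 rows satisfy all three conditions.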

MySQL updating duplicate IDs based on match and no match criteria all in one table

Hopefully I can explain this clearly. I have a table that has what need to be unique IDs for people within a group. The IDs are generated using first 3 letters of the first name and date of birth. Normally, with smaller groups (less than 500) this works fine. However in large groups we do hit upon some duplicates. We'd then just append a -1, -2, -3 etc. to any duplicate IDs. For example:
ID  GROUP   UID          FIRST_NAME
1   123456  ALE19900123  ALEXIS
2   123456  ALE19900123  ALEXANDER
3   123456  ALE19900123  ALEJANDRO
4   789789  ALE19900123  ALEX
What I'd like to do is append -1 and -2 to the UID field of IDs 2 and 3 respectively, so that rows 1, 2 and 3 are unique on (GROUP + UID). ID 4 would be ignored because its GROUP is different.
I've started with something like this:
UPDATE table A
JOIN table B
ON B.GROUP = A.GROUP
AND B.UID = A.UID
AND B.FIRST_NAME <> A.FIRST_NAME
AND B.ID < A.ID
SET A.duplicate_record = 1;
That should set the duplicate_record field = 1 for IDs 2 and 3. But then I still need to append a -1, -2, -3 etc. to those UIDs and I'm not sure how to do that. Maybe instead of just setting a flag = 1 for duplicate I should set the count of records that are duplicates?

If the (group, UID) tuple is covered by a unique key (and it should be), why not INSERT IGNORE the row first without any suffix, check how many rows were affected with SELECT ROW_COUNT();, and if that is zero, retry with -1 appended, then -2, and so on? As a loop (pseudocode):
insert ignore into people (`group`, uid, first_name) values (123456, their_uid, first_name);
if (select row_count()) = 0 then
    i = 1;
    while i < 1000 do
        insert ignore into people (`group`, uid, first_name) values (123456, concat(their_uid, '-', i), first_name);
        if (select row_count()) = 1 then
            break;
        end if;
        i = i + 1;
    end while;
end if;
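
For rows that already exist, the numbering rule from the question can also be sketched in application code. This is not the insert-time approach described above but a one-pass alternative, shown in Python with a hypothetical function name. It assumes the lowest ID in each (group, uid) pair keeps the bare UID and that no UID in the table already ends in a -n suffix.

```python
from collections import defaultdict


def suffix_duplicates(rows):
    """Return copies of rows with -1, -2, ... appended to repeated UIDs.

    rows: dicts with at least "id", "group" and "uid" keys. Within each
    (group, uid) pair the lowest id keeps the original UID.
    """
    seen = defaultdict(int)  # (group, uid) -> number of rows seen so far
    result = []
    for row in sorted(rows, key=lambda r: r["id"]):
        key = (row["group"], row["uid"])
        n = seen[key]
        seen[key] += 1
        new_uid = row["uid"] if n == 0 else f'{row["uid"]}-{n}'
        result.append({**row, "uid": new_uid})
    return result
```

Applied to the four rows in the question, IDs 2 and 3 would get ALE19900123-1 and ALE19900123-2, while IDs 1 and 4 keep ALE19900123.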

SQL SELECT ORDER BY multiple columns depending on value of other column

I have a table with the following columns:
id | revisit (bool) | FL (decimal) | FR (decimal) | RL (decimal) | RR (decimal) | date
I need to write a SELECT statement that will ORDER BY on multiple columns, depending on the value of the 'revisit' field.
ORDER BY 'revisit' DESC - records with this field having the value 1 will be first, and 0 will be after
If 'revisit' = 1 order by the lowest value that exists in FL, FR, RL and RR. So if record 1 has values 4.6, 4.6, 3.0, 5.0 in these fields, and record 2 has values 4.0, 3.1, 3.9, and 2.8 then record 2 will be returned first as it holds a lowest value within these four columns.
If 'revisit' = 0 then order by date - oldest date will be first.
So far I have the 'revisit' alone ordering correctly, and ordering by date if 'revisit' = 0, but ordering by the four columns simultaneously when 'revisit' = 1 does not.
SELECT *
FROM vehicle
ORDER BY
`revisit` DESC,
CASE WHEN `revisit` = 1 THEN `FL` + `FR` + `RR` + `RL` END ASC,
CASE WHEN `revisit` = 0 THEN `date` END ASC
Instead it seems to be ordering by the total of the four columns (which makes sense given the addition symbols). How do I ORDER BY the lowest value among these columns rather than by their sum?
I hope this makes sense and thanks!

In your current query, you order by the sum of the four columns. You can use LEAST to get the lowest of the four values, so your ORDER BY clause could look like this:
SELECT *
FROM vehicle
ORDER BY
`revisit` DESC,
CASE WHEN `revisit` = 1 THEN LEAST(`FL`, `FR`, `RR`, `RL`) END ASC,
CASE WHEN `revisit` = 0 THEN `date` END ASC
Of course this would sort only by the lowest value. If two rows would both share the same lowest value, there is no sorting on the second-lowest value. To do that is quite a bit harder, and I didn't really get from your question whether you need that.
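
If the second-lowest tie-break does turn out to be needed, it may be easier to express outside SQL. The following Python sketch (hypothetical function name, rows represented as dicts) sorts revisit rows by their four readings in ascending order, which compares the lowest value first and falls back to the second-lowest on ties, and sorts the remaining rows by date.

```python
from datetime import date


def sort_vehicles(rows):
    """Order rows: revisit = 1 first by ascending readings, then revisit = 0 by date."""
    def key(row):
        if row["revisit"]:
            readings = sorted([row["FL"], row["FR"], row["RL"], row["RR"]])
            # Group 0 sorts first; the tuple compares lowest reading, then next lowest.
            return (0, tuple(readings))
        # Group 1 sorts after all revisit rows, oldest date first.
        return (1, row["date"])
    return sorted(rows, key=key)
```

The first element of each key keeps the two groups apart, so the readings tuple and the date are never compared with each other.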

how can I tell if the last x rows of 'state' = 1

I need help with a SQL query.
I have a table with a 'state' column. 0 means closed and 1 means opened.
Different users want to be notified after there have been x consecutive 1 events.
With an SQL query, how can I tell if the last x rows of 'state' = 1?

If, for example, you want to check whether the last 5 rows all have state equal to 1, here is how you could do it:
SELECT IF(SUM(x.state) = 5, 1, 0) AS is_consecutive
FROM (
SELECT state
FROM table
WHERE Processor = 3
ORDER BY Status_datetime DESC
LIMIT 5
) as x
If is_consecutive = 1, then yes, the last 5 rows all have state = 1.
Edit: As suggested in the comments, the query needs an ORDER BY to get the last n rows.
For accuracy, since you have a timestamp column, use Status_datetime to order the rows.

You should be able to use something like this (replace both occurrences of 10 with the value of x you want to check for). Note that TOP is SQL Server syntax; in MySQL, use LIMIT 10 at the end of the inner query instead:
SELECT Processor, SUM(Status) AS OpenCount
FROM
(
    SELECT TOP 10 Processor, Status
    FROM YourTable
    WHERE Processor = 3
    ORDER BY DateTime DESC
) AS recent
GROUP BY Processor
HAVING SUM(Status) >= 10
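
The "take the last x rows, then sum them" idea can be tried end to end with a small script. The sketch below uses Python's sqlite3 module as a stand-in for the real database. The table name status_log, its columns and the helper names are assumptions made for the example. It also checks that x rows actually exist, so a short history is not reported as x consecutive opens.

```python
import sqlite3


def make_log(states, processor=3):
    """Build an in-memory table with one row per state, oldest first (test helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE status_log ("
        "processor INTEGER, status_datetime TEXT, state INTEGER)"
    )
    conn.executemany(
        "INSERT INTO status_log VALUES (?, ?, ?)",
        [(processor, f"2012-06-01 00:00:{i:02d}", s) for i, s in enumerate(states)],
    )
    return conn


def last_x_all_open(conn, processor, x):
    """True when the x most recent rows for this processor all have state = 1."""
    n_rows, n_open = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(state), 0) FROM ("
        "  SELECT state FROM status_log"
        "  WHERE processor = ?"
        "  ORDER BY status_datetime DESC"
        "  LIMIT ?"
        ")",
        (processor, x),
    ).fetchone()
    return n_rows == x and n_open == x
```

For a history of 0, 1, 1, 1 (oldest first), the check would succeed for x = 3 and fail for x = 4.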

SQL : How to calculate limited rows and reset the counter

I am dealing with an issue and need some expert advice. My SQL query produces two columns: the first shows an id (for example abc-123 in the table below) and the second shows the result stored in the database for that id, which is either pass or fail.
Whenever the resolution is pass, I need to display the success rate of the attempts leading up to it. In the example below, abc-123 failed the first time but def-456 passed on the next attempt, so the success rate is 50%. The counter should then reset and move to the next row; that row is a pass, so it should show 100%. The counter resets again on that pass, and the next group shows 33% because there are two fails followed by one pass. How can this be achieved in SQL? (id and resolution are column names.)
date        id       resolution
6/6/2012    abc-123  fail        50%
6/7/2012    abc-456  pass
6/8/2012    abc-789  pass        100%
6/9/2012    abc-799  fail        33%
6/10/2012   abc-800  fail
6/1/2012    abc-900  pass
Thanks

SELECT
    *
FROM
    table
INNER JOIN
(
    SELECT
        MIN(g.id) AS first_id,
        MAX(g.id) AS last_id,
        COUNT(*)  AS group_size
    FROM
        table AS p
    INNER JOIN
        table AS g
            ON  g.id > COALESCE(
                    (SELECT MAX(id) FROM table WHERE id < p.id AND resolution = 'pass'),
                    ''
                )
            AND g.id <= p.id
    WHERE
        p.resolution = 'pass'
    GROUP BY
        p.id
)
    AS groups
        ON  table.id >= groups.first_id
        AND table.id <= groups.last_id

There's more than one way to do it:
SELECT st.*,
       @prev := @counter + 1,
       @counter := CASE
                       WHEN st.resolution = 'pass'
                       THEN 0
                       ELSE @counter + 1
                   END c,
       CASE WHEN @counter = 0
            THEN CONCAT(FORMAT(100/@prev, 2), '%')
            ELSE '-'
       END res
FROM so_test st, (SELECT @counter := 0) sc
Here's proof of concept.
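
The reset-on-pass arithmetic is easy to check by hand with a few lines of Python. This is only a sketch of the counting rule described in the question, with a hypothetical function name: count attempts until a pass, report 100 divided by that count, then start over.

```python
def success_rates(resolutions):
    """Return one success rate (in percent) per run of attempts ending in a pass.

    resolutions: iterable of "pass" / "fail" strings in chronological order.
    Trailing fails that are not yet followed by a pass produce no rate.
    """
    rates = []
    attempts = 0
    for resolution in resolutions:
        attempts += 1
        if resolution == "pass":
            rates.append(round(100 / attempts, 2))
            attempts = 0  # reset the counter after every pass
    return rates
```

For the six rows in the question (fail, pass, pass, fail, fail, pass) this gives 50.0, 100.0 and 33.33, matching the 50%, 100% and 33% the asker expects.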
