Query returns the wrong result when finding the shortest player in an NBA database - MySQL

I am working with an NBA script in MySQL and I have to find out who the shortest player in the database is. Heights are stored in feet and inches, and after executing the query I found that the player it returned was not the right answer.
The query is
select * from players where height=(select min(height) from players);
And it gives me:
'420', 'Carlos Arroyo', 'Florida International', ' 6-2', '202', 'G', 'Magic'
where 6-2 is the height.
Instead of giving me one of these results
'26', 'Brevin Knight', 'Stanford', '5-10', '170', 'G', 'Clippers'
'113', 'Nate Robinson', 'Washington', '5-9', '180', 'G', 'Knicks'
'182', 'Earl Boykins', 'Eastern michigan', '5-5', '133', 'G', 'Bobcats'
'372', 'Damon Stoudamire', 'Arizona', '5-10', '171', 'G', 'Spurs'
'482', 'Chucky Atkins', 'South Florida', '5-11', '185', 'G', 'Nuggets'
And if I order the players by height, the result is a bit annoying:
'Carlos Arroyo', ' 6-2'
'Shareef Abdur-Rahim', ' 6-9'
'Louis Amundson', ' 6-9'
'Brevin Knight', '5-10'
'Damon Stoudamire', '5-10'
'Chucky Atkins', '5-11'
'Earl Boykins', '5-5'
'Nate Robinson', '5-9'
'Aaron Brooks', '6-0'
'Allen Iverson', '6-0'
'Kyle Lowry', '6-0'
'Jammer Nelson', '6-0'
'Sebastian Telfair', '6-0'
'Chris Paul', '6-0'

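The height column is a string, so MIN() and ORDER BY compare it character by character: ' 6-2' sorts first because of its leading space, and '5-10' sorts before '5-5'. Convert the height string to a number which you can use for numeric comparison.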
select player, height
from players
where cast(substring_index(height, '-', 1) as unsigned)*100+
cast(right(concat('0', substring_index(height, '-', -1)),2) as unsigned)
in (
select min(cast(substring_index(height, '-', 1) as unsigned)*100+
cast(right(concat('0', substring_index(height, '-', -1)),2) as unsigned))
from players
)
See dbfiddle
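The same conversion can be sketched outside the database. This is a minimal Python illustration, assuming the 'feet-inches' format shown in the question; the helper name `height_to_inches` and the sample list are made up for the example. It shows why comparing the raw strings picks ' 6-2' while comparing the converted numbers picks '5-5'.

```python
def height_to_inches(height: str) -> int:
    """Convert a 'feet-inches' string such as ' 6-2' or '5-10' to total inches."""
    feet, inches = height.strip().split("-")
    return int(feet) * 12 + int(inches)


# Sample values copied from the question, including the leading space on ' 6-2'.
heights = [" 6-2", "5-10", "5-9", "5-5", "5-11", "6-0"]

# String comparison: the leading space sorts before any digit.
shortest_as_text = min(heights)

# Numeric comparison: convert each value before comparing.
shortest_as_number = min(heights, key=height_to_inches)

print(repr(shortest_as_text), repr(shortest_as_number))
```

Storing the height as a number (total inches, for instance) in the table would avoid the conversion in every query.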

...
where 6-2 is the height. Instead of giving me one of these results
...
You say that all '5-xx' values are equivalent to each other, i.e. only the value before the dash is taken into account.
You also say that you need only one output row, and that any of the 5 rows shown would do, i.e. you do not need a secondary sort.
If so, you can simply do
SELECT *
FROM players
ORDER BY CAST(height AS UNSIGNED) LIMIT 1

BLEU - Error N-gram overlaps of lower order

I ran the code below
a = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']
b = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']
from nltk.translate.bleu_score import corpus_bleu
bleu1 = corpus_bleu(a, b, weights=(1.0, 0, 0, 0))
print(bleu1)
This is the error
The hypothesis contains 0 counts of 3-gram overlaps. Therefore the
BLEU score evaluates to 0, independently of how many N-gram overlaps
of lower order it contains. Consider using lower n-gram order or use
SmoothingFunction() warnings.warn(_msg)
Can someone tell me what is the problem here? I can not find the solution on google. Thank you.
Best,
DD
I found the solution. Basically, I need a list inside a list for list 'a', so the code below will work without error.
a = [['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']]
b = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']
from nltk.translate.bleu_score import corpus_bleu
bleu1 = corpus_bleu(a, b, weights=(1.0, 0, 0, 0))
print(bleu1)
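Why the nesting level matters can be illustrated without NLTK. The sketch below is not NLTK's implementation; `unigram_precision` is a hypothetical helper that computes a clipped unigram precision for one tokenized hypothesis against a list of tokenized references. When a flat list of words is passed where a list of references is expected, each word is iterated as a sequence of characters, so the overlap counts are no longer about words.

```python
from collections import Counter


def unigram_precision(references, hypothesis):
    """Clipped unigram precision of one tokenized hypothesis against its references."""
    hyp_counts = Counter(hypothesis)
    max_ref_counts = Counter()
    for ref in references:
        # Each reference is expected to be a list of tokens.
        for token, count in Counter(ref).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)
    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in hyp_counts.items())
    return clipped / len(hypothesis)


reference = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']
hypothesis = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']

# One reference wrapped in a list: 'dog' and 'in' overlap, 2 of 8 tokens.
nested = unigram_precision([reference], hypothesis)

# Flat list: every word is treated as a "reference" made of single characters,
# so only the one-letter token 'a' can match.
flat = unigram_precision(reference, hypothesis)

print(nested, flat)
```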

Update birthday with full year in MySQL

I have a birthday column of type varchar. I want to change it to a date and store the full year instead of only 2 digits.
If someone was born on 05061985, MySQL removes the leading 0 and shows 50685.
Change 50685
To ==> 05061985
All users' birthdays are from 1900 to 1999.
Let's do that step by step.
The strings can have length 5 or 6, so first make sure we have a string of length 6, left-padded with zeros:
select LPAD('50685', 6, '0');
Now we insert the '19' in the string between the 4th and 5th position
select CONCAT(LEFT(LPAD('50685', 6, '0'), 4), '19', RIGHT(LPAD('50685', 6, '0'), 2));
As the last step, update all the BIRTHDAY fields in the table FOOBAR:
update FOOBAR set BIRTHDAY=CONCAT(LEFT(LPAD(BIRTHDAY, 6, '0'), 4), '19', RIGHT(LPAD(BIRTHDAY, 6, '0'), 2));
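In this case you still have a string field, though. I suggest changing the format further so that it can be converted to a proper date column, something like YYYY-MM-DD. Run the following on the original values instead of the update above, because LPAD(BIRTHDAY, 6, '0') shortens a value that is already 8 characters long: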
update FOOBAR set BIRTHDAY=LPAD(BIRTHDAY, 6, '0');
update FOOBAR set BIRTHDAY=CONCAT('19' ,
RIGHT(BIRTHDAY, 2),
'-',
SUBSTR(BIRTHDAY, 3, 2),
'-',
LEFT(BIRTHDAY, 2));
alter table FOOBAR modify BIRTHDAY date;
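The same steps can be checked on a single value before touching the table. This is a small Python sketch, assuming the day-month-year order used by the SQL above and the 1900 to 1999 range stated in the question; the function names are made up for the example.

```python
from datetime import date


def expand_birthday(raw: str) -> str:
    """'50685' -> '05061985': pad to 6 characters, then insert the century."""
    padded = raw.zfill(6)                 # same role as LPAD(raw, 6, '0')
    return padded[:4] + "19" + padded[4:]


def parse_birthday(raw: str) -> date:
    """'50685' -> date(1985, 6, 5), assuming DDMMYY and years 1900 to 1999."""
    padded = raw.zfill(6)
    day, month, year = int(padded[:2]), int(padded[2:4]), int(padded[4:6])
    return date(1900 + year, month, day)


print(expand_birthday("50685"))             # 8-digit string form
print(parse_birthday("50685").isoformat())  # YYYY-MM-DD form
```

`date()` raises a ValueError for impossible values such as month 13, which makes it easy to spot rows that would fail the final ALTER TABLE.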

Compare two strings according to defined values in a query

I have to compare two string fields containing letters but not alphabetically.
I want to compare them according to this order :
"J" "L" "M" "N" "P" "Q" "R" "S" "T" "H" "V" "W" "Y" "Z"
So if I compare H with T, H will be greater than T (unlike alphabetically)
And if I test if a value is greater than 'H' (> 'H') I will get all the entries containing the values ("V" "W" "Y" "Z") (again, unlike alphabetical order)
How can I achieve this in one SQL query?
Thanks
SELECT *
FROM yourtable
WHERE
FIELD(col, 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z') >
FIELD('H', 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z')
-- the first argument ('H') is your value
Or also:
SELECT *
FROM yourtable
WHERE
LOCATE(col, 'JLMNPQRSTHVWYZ')>
LOCATE('H', 'JLMNPQRSTHVWYZ')
Please see fiddle here.
You can do
SELECT ... FROM ... ORDER BY yourletterfield='J' DESC, yourletterfield='L' DESC, yourletterfield='M' DESC, ...
The equality operator will evaluate to "1" when it's true, "0" when false, so this should give you the desired order.
There's actually a FIELD() function that will make this a bit less verbose. See this article for details.
SELECT ... FROM ... ORDER BY FIELD(yourletterfield, 'J', 'L', 'M', ...)
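The idea behind FIELD() and LOCATE() is to compare positions in the custom sequence rather than the letters themselves. A minimal Python sketch of that idea follows; `rank` is a hypothetical helper and the sample values are invented. It returns 0 for a letter that is not in the sequence, mirroring what FIELD() and LOCATE() return when there is no match.

```python
ORDER = "JLMNPQRSTHVWYZ"

# 1-based position of each letter in the custom order.
RANK = {letter: position for position, letter in enumerate(ORDER, start=1)}


def rank(letter: str) -> int:
    """Position of the letter in the custom order, or 0 if it is not listed."""
    return RANK.get(letter, 0)


values = ["T", "H", "Z", "J", "V", "W", "Y", "L"]

# Equivalent of WHERE FIELD(col, ...) > FIELD('H', ...)
greater_than_h = [v for v in values if rank(v) > rank("H")]

# Equivalent of ORDER BY FIELD(col, ...)
sorted_values = sorted(values, key=rank)

print(greater_than_h)
print(sorted_values)
```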

SQL Server: how does the REPLACE function work? It's acting strangely for me

I am 'randomising' some strings in a SQL Server table to do a primitive encryption on them.
I have a SQL REPLACE call nested around 35 times (A-Z, 1-9) that takes every letter of the alphabet and every digit and replaces it with another letter or digit. For example:
Replace(Replace(Replace('abc', 'a', 'c'), 'b', 'a'), 'c', 'b')
I figured that the replace function would go through a string like 'abc', replace everything once and stop, giving 'cab'. It doesn't!
It seems to want to change some characters again, resulting in 'abc'->'cab'->'ccb'.
This is fine, except that if I have another string such as 'aac', this could result in a duplicate string and I lose traceability back to the original.
Can anyone explain how I could stop REPLACE() partially going back over my string?
SELECT * INTO example_temp FROM example;
Update KAP_db.dbo.example_temp Set col1 = replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
col1, 'A', 'N'),'B', 'O'), 'C', 'P'), 'D', 'Q'), 'E', 'R'), 'F', 'S'), 'G', 'T'),
'H', 'U'), 'I', 'V'), 'J', 'W'), 'K', 'X'), 'L', 'Y'), 'M', 'Z'), 'O', 'A'), 'P', 'B'),
'Q', 'C'), 'R', 'D'),'S', 'E'),'T', 'E'),'U', 'E'),'V', 'F'),'W', 'G'),'X', 'H'),
'Y', 'I'),'Z', 'J'), '1', '9'),'2','8'),'3','7'),'4','6'),'5','5'),'6','4'),'7','3'),
'8','2'),'9','1'),' ','');
The above results in '8EVHUAB' and '8EVHHAB' both outputting '2DFEENA'
Update -------------------------------------------------------------------
OK, I have redone the code and so far have:
DECLARE @Input AS VarChar(1000)
DECLARE @i AS TinyInt
Declare @Substring AS VarChar(1000)
Declare @Prestring AS VarChar(1000)
Declare @Poststring AS VarChar(1000)
Select @Input='ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789'
SELECT @i = 1
Select @Substring ='na'
WHILE @i <= LEN(@Input) BEGIN
Select @Prestring = SUBSTRING(@Input,-1,@i)
Select @Poststring = SUBSTRING(@Input,@i+1,LEN(@Input))
SELECT @Substring = replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace
(SUBSTRING(@Input,@i,1), 'A', 'N'),'B', 'O'), 'C', 'P'), 'D', 'Q'), 'E', 'R'), 'F', 'S'), 'G', 'T'), 'H', 'U'), 'I', 'V'), 'J', 'W'), 'K', 'X'), 'L', 'Y'), 'M', 'Z'), 'N', 'A'), '0', 'B'), 'P', 'C')
, 'Q', 'D'),'R', 'E'),'S', 'E'),'T', 'E'),'U', 'F'),'V', 'G'),'W', 'H'),'X', 'I'),'Y', 'J'), '1', '9'),'2','8'),'3','7'),'4','6'),'5','5'),'6','4'),'7','3'),'8','2'),'9','1'),' ','')
Select @Input = @Prestring + @Substring + @Poststring
SELECT @i = @i + 1
print 'END
'
END
This doesn't work correctly though; the code does not execute as it's written. Any suggestions?
Why you're seeing this: replace is a function; all it knows are its arguments. replace(replace('aba', 'a', 'b'), 'b', 'a') is absolutely equivalent to replace('bbb', 'b', 'a'), because the outer replace has no way of knowing that its first argument was created by a different call to replace. Does that make sense?
You can think of it just like a function in algebra. If we define f(x) = x^2, then f(f(2)) = f(2^2) = f(4) = 4^2 = 16. There's no way to tell f to behave differently when its argument is f(2) from when its argument is 4, because f(2) is 4.
Similarly, replace('aba', 'a', 'b') is 'bbb', so there's no way to tell replace to behave differently when its first argument is replace('aba', 'a', 'b') from when its first argument is 'bbb'.
(This is usually true in computer science. Functions in computer science aren't always like functions in algebra — for example, they frequently actually do things, rather than just returning a value — but it's usually the case that they receive arguments as values, or as opaque references to values, and have no way of knowing where they came from or how they were constructed.)
How to address this: I don't think there's any very clean way to do this. Gordon Linoff suggested that you could use intermediate placeholder characters (specifically — lowercase letters) that don't exist in the initial string and don't exist in the final string, so that you can safely replace them without worrying about interference; and I think that's probably the best approach.
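The behaviour is not specific to SQL Server and can be reproduced with any chained replace. The Python sketch below uses the three-letter mapping from the question (a to c, b to a, c to b); the function names are made up for the example. It contrasts the chained form, where later calls rewrite the output of earlier ones, with a single-pass mapping that looks at each original character exactly once.

```python
def chained(text: str) -> str:
    """Nested replaces: each call sees the output of the previous one."""
    return text.replace("a", "c").replace("b", "a").replace("c", "b")


# Single-pass mapping: every original character is translated exactly once.
TABLE = str.maketrans({"a": "c", "b": "a", "c": "b"})


def single_pass(text: str) -> str:
    return text.translate(TABLE)


for word in ("abc", "cba"):
    print(word, chained(word), single_pass(word))
```

With the chained version, two different inputs collapse to the same output, which is the loss of traceability described in the question; the single-pass version keeps them distinct. SQL Server 2017 and later have a built-in TRANSLATE function that performs this kind of single-pass mapping.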
Your results are not surprising, since each replace is returning the string to the next level. There is no way to distinguish between the original character value and the replaced value, when they are the same character.
If you were only working with alpha characters you could do the following.
Change the collation of the original string.
Upper case the string
Replace the upper case letters with the lower case
Lower case the entire string at the end (should be redundant)
Unfortunately, I can't think of an analog for numbers that would work the same way.
Here is a link to a site that has code for the function http://www.dbforums.com/microsoft-sql-server/1216565-oracle-translate-function-equivalent-sql-server.html. The equivalent function in Oracle is called translate.
The Replace() function simply performs the operation and returns. It doesn't keep state for the next Replace(), and it doesn't prevent a later invocation from replacing the characters you previously replaced. To cycle characters around, you have to have an additional "placeholder" value, and then take care not to replace a character that was already changed from something else.
First, let me show you an analogy:
There are three buckets full of an equal number of marbles. The buckets are labeled "A", "B", and "C". You must perform the following instructions:
Pour all marbles from A into C.
Pour all marbles from B into A.
Pour all marbles from C into B.
What result would you expect to have? The answer is: an empty bucket C, and B having twice the number of marbles as are in A.
If you want to preserve the marbles in their original counts, you have to have four containers so that you don't mix the marbles while moving them:
Pour all marbles from C into X.
Pour all marbles from A into C.
Pour all marbles from B into A.
Pour all marbles from X into B. (rather than from C--which already has A's marbles)
Now you have the results you expected, and can discard the empty bucket X.
Try this expression and see if it gives what you want:
Replace(Replace(Replace(Replace('abc', 'c', char(1)), 'a', 'c'), 'b', 'a'), char(1), 'b')
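The four-bucket steps can be tried on a couple of strings to confirm that nothing gets mixed. This is a minimal Python sketch of the placeholder approach, assuming the same a/b/c rotation and using "\x01" in the role of char(1); the function name is hypothetical.

```python
PLACEHOLDER = "\x01"  # plays the role of char(1): must not occur in the data


def rotate_with_placeholder(text: str) -> str:
    """Map a->c, b->a, c->b by parking the original 'c' in a placeholder first."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder character already present in input")
    return (text.replace("c", PLACEHOLDER)   # pour C into X
                .replace("a", "c")           # pour A into C
                .replace("b", "a")           # pour B into A
                .replace(PLACEHOLDER, "b"))  # pour X into B


print(rotate_with_placeholder("abc"))
print(rotate_with_placeholder("aac"))
```

A full 35-character mapping needs one placeholder per cycle in the mapping, or a distinct intermediate character for every source character, which is why lowercase intermediates were suggested for an uppercase-only input.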

Find duplicates mysql up to number digits

I'm looking for duplicates in a Google Map table. The problem is the latitudes are slightly different and the addresses are slightly different (but I can tell they're duplicates). Therefore, I'd like to get all records where the latitude is the same up to 4 places after the decimal point. How can this be done?
Something like:
Select * from googlemap
having count(len latitude > 4) > 1
Fields:
point_id, entry_id, latitude, longitude, address, city, zipcode, state, field_id, icon, supplier_id
Sample data from first answer:
'9', '51.5124', '9,557,885,908,964,1353,2145,2947'
'17', '32.7921', '17,19,94,2652'
'37', '32.7799', '37,101'
'54', '34.0953', '54,165'
'71', '42.3582', '71,2724'
'73', '25.7660', '73,125'
'100', '25.7906', '100,106'
'112', '25.7870', '112,378'
'113', '32.7114', '113,316'
'114', '25.7689', '114,140'
'129', '25.7708', '129,138,142'
'148', '25.7518', '148,155'
'156', '25.7710', '156,171'
'172', '35.6563', '172,175'
'174', '35.6559', '174,184'
'194', '48.8677', '194,261'
'195', '48.8661', '195,210,248,268'
SELECT *
FROM googlemap g1
INNER JOIN googlemap g2
ON TRUNCATE(g1.latitude, 4) = TRUNCATE(g2.latitude, 4)
-- AND TRUNCATE(g1.longitude, 4) = TRUNCATE(g2.longitude, 4)
AND g1.point_id < g2.point_id
How about this?
SELECT ROUND(latitude,4), GROUP_CONCAT(point_id SEPARATOR ',')
FROM googlemap
GROUP BY ROUND(latitude,4)
HAVING COUNT(*) > 1
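Both queries group rows by the latitude cut to 4 decimal places; the first truncates and the second rounds, which can put a borderline value in a different group. The Python sketch below illustrates the truncating variant on a few invented rows (the ids and coordinates are hypothetical, not taken from the table). It uses Decimal so that the cut-off is not affected by binary floating-point representation.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN


def truncate4(latitude: str) -> Decimal:
    """Cut a latitude to 4 decimal places without rounding, like TRUNCATE(x, 4)."""
    return Decimal(latitude).quantize(Decimal("0.0001"), rounding=ROUND_DOWN)


# Hypothetical (point_id, latitude) rows for illustration only.
rows = [
    (9, "51.512412"),
    (557, "51.512498"),
    (37, "32.779913"),
    (101, "32.779950"),
    (200, "40.000000"),
]

groups = defaultdict(list)
for point_id, latitude in rows:
    groups[truncate4(latitude)].append(point_id)

# Keep only the keys shared by more than one row, like HAVING COUNT(*) > 1.
duplicates = {key: ids for key, ids in groups.items() if len(ids) > 1}

print(duplicates)
```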