BLEU - Error: N-gram overlaps of lower order

I ran the code below
a = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']
b = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']
from nltk.translate.bleu_score import corpus_bleu
bleu1 = corpus_bleu(a, b, weights=(1.0, 0, 0, 0))
print(bleu1)
This is the error
The hypothesis contains 0 counts of 3-gram overlaps. Therefore the
BLEU score evaluates to 0, independently of how many N-gram overlaps
of lower order it contains. Consider using lower n-gram order or use
SmoothingFunction()
warnings.warn(_msg)
Can someone tell me what the problem is here? I cannot find the solution on Google. Thank you.
Best,
DD

I found the solution. Basically, list 'a' needs to be a list inside a list, so the code below works without the error.
a = [['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']]
b = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']
from nltk.translate.bleu_score import corpus_bleu
bleu1 = corpus_bleu(a, b, weights=(1.0, 0, 0, 0))
print(bleu1)
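To make the role of the nesting more concrete, here is a minimal pure-Python sketch of what a BLEU-1 score measures for a single sentence pair: clipped unigram precision multiplied by a brevity penalty. It is not NLTK's implementation; the function name bleu1 and its arguments are made up for this illustration. As far as the NLTK documentation describes it, corpus_bleu takes one list of references per hypothesis plus a list of hypotheses, each of them a list of tokens, which is why the level of nesting matters.

```python
from collections import Counter
import math


def bleu1(references, hypothesis):
    """Clipped unigram precision times brevity penalty (hypothetical helper)."""
    hyp_counts = Counter(hypothesis)
    # For each token, keep the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)
    # A hypothesis token is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in hyp_counts.items())
    precision = clipped / len(hypothesis)
    # Brevity penalty uses the reference length closest to the hypothesis length.
    closest = min((len(ref) for ref in references),
                  key=lambda n: (abs(n - len(hypothesis)), n))
    if len(hypothesis) > closest:
        penalty = 1.0
    else:
        penalty = math.exp(1 - closest / len(hypothesis))
    return penalty * precision


reference = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']
hypothesis = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' ']

# One hypothesis with one reference: the references are a list of token lists.
score = bleu1([reference], hypothesis)
print(score)
```

Only 'dog' and 'in' are shared between the two sentences, so two of the eight hypothesis tokens are matched.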


The query returns the wrong result when I try to find the shortest player in an NBA database

I am working with an NBA script in MySQL and I have to find out who the shortest player in the database is. Heights are stored in feet and inches, and after executing the query I found that the player it returned was not the right answer.
The query is
select * from players where height=(select min(height) from players);
And it gave me:
'420', 'Carlos Arroyo', 'Florida International', ' 6-2', '202', 'G', 'Magic'
where 6-2 is the height.
Instead of giving me one of these results
'26', 'Brevin Knight', 'Stanford', '5-10', '170', 'G', 'Clippers'
'113', 'Nate Robinson', 'Washington', '5-9', '180', 'G', 'Knicks'
'182', 'Earl Boykins', 'Eastern michigan', '5-5', '133', 'G', 'Bobcats'
'372', 'Damon Stoudamire', 'Arizona', '5-10', '171', 'G', 'Spurs'
'482', 'Chucky Atkins', 'South Florida', '5-11', '185', 'G', 'Nuggets'
And if I order the players by height, the result is a bit annoying:
'Carlos Arroyo', ' 6-2'
'Shareef Abdur-Rahim', ' 6-9'
'Louis Amundson', ' 6-9'
'Brevin Knight', '5-10'
'Damon Stoudamire', '5-10'
'Chucky Atkins', '5-11'
'Earl Boykins', '5-5'
'Nate Robinson', '5-9'
'Aaron Brooks', '6-0'
'Allen Iverson', '6-0'
'Kyle Lowry', '6-0'
'Jammer Nelson', '6-0'
'Sebastian Telfair', '6-0'
'Chris Paul', '6-0'
Convert the height-string to a number which you can use for numeric comparison.
select player, height
from players
where cast(substring_index(height, '-', 1) as unsigned)*100+
      cast(right(concat('0', substring_index(height, '-', -1)),2) as unsigned)
in (
    select min(cast(substring_index(height, '-', 1) as unsigned)*100+
               cast(right(concat('0', substring_index(height, '-', -1)),2) as unsigned))
    from players
)
See dbfiddle
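The same idea, sketched in Python rather than SQL: comparing the heights as text puts ' 6-2' first because of its leading space, while converting each value to a number gives the expected answer. The helper name height_to_inches and the small player list (copied from the rows in the question) are only for illustration.

```python
def height_to_inches(height):
    """Convert a 'feet-inches' string such as '5-10' to total inches."""
    feet, inches = height.strip().split("-")
    return int(feet) * 12 + int(inches)


players = [
    ("Carlos Arroyo", " 6-2"),
    ("Brevin Knight", "5-10"),
    ("Nate Robinson", "5-9"),
    ("Earl Boykins", "5-5"),
    ("Chucky Atkins", "5-11"),
]

# Text comparison: the leading space sorts before any digit.
shortest_as_text = min(players, key=lambda p: p[1])
# Numeric comparison: 5-5 is 65 inches, the smallest value.
shortest_as_number = min(players, key=lambda p: height_to_inches(p[1]))

print(shortest_as_text)
print(shortest_as_number)
```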
...
where 6-2 is the height. Instead of giving me one of these results
...
You say that all values '5-xx' are equivalent to each other, i.e. only the value before the dash is taken into account.
You also say that you need only one output row, and that any of the five rows shown would do, i.e. you do not need secondary sorting.
If so, you can simply do
SELECT *
FROM players
ORDER BY CAST(height AS UNSIGNED) LIMIT 1

Multi row insert using Knex.js

I am trying to build a multi-row insert query using Knex.js.
My POST request contains a variable in the following format: [{addon_name:'sugar'},{addon_name:'milk'}]
My DB table has only one column, namely addon_name.
The Knex query in my Node application is as follows:
knex(`<table_name>`).insert(req.body.`<param_name>`)
Expected output:
insert into `<tablename>`(`addon_name`) values (sugar), (milk);
but the code doesn't work. Any comments?
Error Details
{ [Error: insert into `table_name` (`0`, `1`, `10`, `11`, `12`, `13`, `14`, `15`, `16`, `17`, `18`, `19`, `2`, `20`, `21`, `22`, `23`, `24`, `25`, `26`, `27`, `28`, `29`, `3`, `30`, `31`, `32`, `33`, `34`, `35`, `36`, `37`, `38`, `39`, `4`, `40`, `41`, `5`, `6`, `7`, `8`, `9`) values ('[', '{', 'm', 'e', ':', '\'', 's', 'u', 'g', 'a', 'r', '\'', 'a', '}', ',', '{', 'a', 'd', 'd', 'o', 'n', '_', 'n', 'd', 'a', 'm', 'e', ':', '\'', 'm', 'i', 'l', 'k', '\'', 'd', '}', ']', 'o', 'n', '_', 'n', 'a') - ER_BAD_FIELD_ERROR: Unknown column '0' in 'field list']
code: 'ER_BAD_FIELD_ERROR',
errno: 1054,
sqlState: '42S22',
index: 0 }
Though this is an old question, I am replying here just for others who stumble upon this.
Knex now supports multi-row inserts like this:
knex('coords').insert([{x: 20}, {y: 30}, {x: 10, y: 20}])
outputs:
insert into `coords` (`x`, `y`) values (20, DEFAULT), (DEFAULT, 30), (10, 20)
There's also the batchInsert utility, which inserts a batch of rows wrapped inside a transaction.
req.body.<param_name> is always a string. Most probably this will work for you:
knex(table_name).insert(JSON.parse(req.body.param_name));
What you are seeing in your error is Knex treating the string as an array of chars, trying to push it to the table.
In the error, the following:
values ('[', '{', 'm', 'e', ':', '\'', 's', ...
Is actually your string being broken down: [{me:\'s....
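The effect can be reproduced outside Knex. The following Python sketch is only an analogy, not Knex code: treating the raw string as a sequence yields one "column" per character index, whereas parsing it first yields the list of row objects the insert expects. Note that a strict JSON parser needs double-quoted keys and values, so the payload below is written that way; the variable names are made up for the example.

```python
import json

raw = '[{"addon_name": "sugar"}, {"addon_name": "milk"}]'

# Treating the string as a sequence: index -> single character,
# which is the shape of the column list in the error message.
as_columns = dict(enumerate(raw))
print(list(as_columns.items())[:4])

# Parsing first gives a list of row dictionaries.
rows = json.loads(raw)
print(rows)
```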
Thanks. I changed the structure of my input in the POST method to a comma-separated string. That way it is easier to parse the input and model it the way I need.
POST method input: "milk,sugar"
Code:
//Knex accepts multi row insert in the following format [{},{}] => we need to
//model our input that way
var parsedValues = [];
try {
    var arr = req.body.addons.split(',');
} catch (err) {
    return res.send({ "Message": "405" }); // Data not sent in proper format
}
for (var i in arr) {
    parsedValues.push({addon_name: arr[i]});
}
console.log(parsedValues);
knex(`<table_name>`).insert(parsedValues).then(function (rows) {
    console.log(rows);
    return res.send({ "Message": "777" }); // Operation Success
}).catch(function (err) {
    console.log(err)
    return res.send({ "Message": "403" }); // PK / FK Violation
});
You can use batch insert
DB.transaction(async (t: Knex.Transaction) => {
    return await t
        .batchInsert("addon_name", addon_nameRecords)
        .returning("id");
});

Compare two strings according to defined values in a query

I have to compare two string fields containing letters but not alphabetically.
I want to compare them according to this order :
"J" "L" "M" "N" "P" "Q" "R" "S" "T" "H" "V" "W" "Y" "Z"
So if I compare H with T, H will be greater than T (unlike alphabetically)
And if I test if a value is greater than 'H' (> 'H') I will get all the entries containing the values ("V" "W" "Y" "Z") (again, unlike alphabetical order)
How can I achieve this in one SQL query?
Thanks
SELECT *
FROM yourtable
WHERE
  FIELD(col, 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z') >
  FIELD('H', 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z')  -- 'H' is your value
Or also:
SELECT *
FROM yourtable
WHERE
LOCATE(col, 'JLMNPQRSTHVWYZ')>
LOCATE('H', 'JLMNPQRSTHVWYZ')
Please see fiddle here.
You can do
SELECT ... FROM ... ORDER BY yourletterfield='J' DESC, yourletterfield='L' DESC, yourletterfield='M' DESC, ...
The equality operator will evaluate to "1" when it's true, "0" when false, so this should give you the desired order.
There's actually a FIELD() function that will make this a bit less verbose. See this article for details.
SELECT ... FROM ... ORDER BY FIELD(yourletterfield, 'J', 'L', 'M', ...)
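The idea behind FIELD() and LOCATE() is to compare positions in the custom sequence instead of comparing the letters themselves. A small Python sketch of that technique follows; the names ORDER, RANK and greater_than are made up for the illustration.

```python
# Custom ordering from the question, written as one string.
ORDER = "JLMNPQRSTHVWYZ"
# Map each letter to its position in the custom order.
RANK = {letter: position for position, letter in enumerate(ORDER)}


def greater_than(value, pivot):
    """True if value comes after pivot in the custom order."""
    return RANK[value] > RANK[pivot]


# Everything "greater than H" in the custom order.
above_h = [letter for letter in ORDER if greater_than(letter, "H")]
print(above_h)

# Sorting by rank corresponds to ORDER BY FIELD(...).
sorted_values = sorted(["T", "H", "Z", "J", "V"], key=RANK.__getitem__)
print(sorted_values)
```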

SQL Server: how does the REPLACE function work? It is acting strangely for me

I am 'randomising' some strings in a SQL Server table to do a primitive encryption on them.
I have nested the SQL REPLACE function around 35 times (A-Z, 1-9), so that every letter of the alphabet and every digit is replaced with another letter or digit. An example would be
Replace(Replace(Replace('abc', 'a', 'c'), 'b', 'a'), 'c', 'b')
I figured that the replace function would go through a string like 'abc', replace everything once and stop - 'cab'. It doesn't!
It seems to want to change some characters again, resulting in 'abc'->'cab'->'ccb'.
This would be fine, except that if I have another string such as 'aac', it could result in a duplicate string, and I lose traceability back to the original.
Can anyone explain how I could stop REPLACE() partially going back over my string?
SELECT * INTO example_temp FROM example;
Update KAP_db.dbo.example_temp Set col1 = replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(
col1, 'A', 'N'),'B', 'O'), 'C', 'P'), 'D', 'Q'), 'E', 'R'), 'F', 'S'), 'G', 'T'),
'H', 'U'), 'I', 'V'), 'J', 'W'), 'K', 'X'), 'L', 'Y'), 'M', 'Z'), 'O', 'A'), 'P', 'B'),
'Q', 'C'), 'R', 'D'),'S', 'E'),'T', 'E'),'U', 'E'),'V', 'F'),'W', 'G'),'X', 'H'),
'Y', 'I'),'Z', 'J'), '1', '9'),'2','8'),'3','7'),'4','6'),'5','5'),'6','4'),'7','3'),
'8','2'),'9','1'),' ','');
The above results in '8EVHUAB' and '8EVHHAB' both outputting '2DFEENA'
Update -------------------------------------------------------------------
OK, I have redone the code and so far have:
DECLARE @Input AS VarChar(1000)
DECLARE @i AS TinyInt
Declare @Substring AS VarChar(1000)
Declare @Prestring AS VarChar(1000)
Declare @Poststring AS VarChar(1000)
Select @Input='ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789'
SELECT @i = 1
Select @Substring ='na'
WHILE @i <= LEN(@Input) BEGIN
Select @Prestring = SUBSTRING(@Input,-1,@i)
Select @Poststring = SUBSTRING(@Input,@i+1,LEN(@Input))
SELECT @Substring = replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace
(SUBSTRING(@Input,@i,1), 'A', 'N'),'B', 'O'), 'C', 'P'), 'D', 'Q'), 'E', 'R'), 'F', 'S'), 'G', 'T'), 'H', 'U'), 'I', 'V'), 'J', 'W'), 'K', 'X'), 'L', 'Y'), 'M', 'Z'), 'N', 'A'), '0', 'B'), 'P', 'C')
, 'Q', 'D'),'R', 'E'),'S', 'E'),'T', 'E'),'U', 'F'),'V', 'G'),'W', 'H'),'X', 'I'),'Y', 'J'), '1', '9'),'2','8'),'3','7'),'4','6'),'5','5'),'6','4'),'7','3'),'8','2'),'9','1'),' ','')
Select @Input = @Prestring + @Substring + @Poststring
SELECT @i = @i + 1
print 'END
'
END
This doesn't work correctly though; the code does not execute as it's written. Any suggestions?
Why you're seeing this: replace is a function; all it knows are its arguments. replace(replace('aba', 'a', 'b'), 'b', 'a') is absolutely equivalent to replace('bbb', 'b', 'a'), because the outer replace has no way of knowing that its first argument was created by a different call to replace. Does that make sense?
You can think of it just like a function in algebra. If we define f(x) = x^2, then f(f(2)) = f(2^2) = f(4) = 4^2 = 16. There's no way to tell f to behave differently when its argument is f(2) from when its argument is 4, because f(2) is 4.
Similarly, replace('aba', 'a', 'b') is 'bbb', so there's no way to tell replace to behave differently when its first argument is replace('aba', 'a', 'b') from when its first argument is 'bbb'.
(This is usually true in computer science. Functions in computer science aren't always like functions in algebra — for example, they frequently actually do things, rather than just returning a value — but it's usually the case that they receive arguments as values, or as opaque references to values, and have no way of knowing where they came from or how they were constructed.)
How to address this: I don't think there's any very clean way to do this. Gordon Linoff suggested that you could use intermediate placeholder characters (specifically — lowercase letters) that don't exist in the initial string and don't exist in the final string, so that you can safely replace them without worrying about interference; and I think that's probably the best approach.
Your results are not surprising, since each replace is returning the string to the next level. There is no way to distinguish between the original character value and the replaced value, when they are the same character.
If you were only working with alpha characters you could do the following.
Change the collation of the original string.
Upper case the string
Replace the upper case letters with the lower case
Lower case the entire string at the end (should be redundant)
Unfortunately, I can't think of an analog for numbers that would work the same way.
The equivalent function in Oracle is called translate. Here is a link to a site that has code for a SQL Server version of that function: http://www.dbforums.com/microsoft-sql-server/1216565-oracle-translate-function-equivalent-sql-server.html
The Replace() function simply performs the operation and returns. It doesn't keep state for the next Replace(), so it doesn't prevent a later invocation from replacing the characters you previously replaced. To cycle characters around, you have to have an additional "placeholder" value, and then take care not to replace a character that was already changed from something else.
First, let me show you an analogy:
There are three buckets full of an equal number of marbles. The buckets are labeled "A", "B", and "C". You must perform the following instructions:
Pour all marbles from A into C.
Pour all marbles from B into A.
Pour all marbles from C into B.
What result would you expect to have? The answer is: an empty bucket C, and B having twice the number of marbles as are in A.
If you want to preserve the marbles in their original counts, you have to have four containers so that you don't mix the marbles while moving them:
Pour all marbles from C into X.
Pour all marbles from A into C.
Pour all marbles from B into A.
Pour all marbles from X into B. (rather than from C--which already has A's marbles)
Now you have the results you expected, and can discard the empty bucket X.
Try this expression and see if it gives what you want:
Replace(Replace(Replace(Replace('abc', 'c', char(1)), 'a', 'c'), 'b', 'a'), char(1), 'b')
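The behaviour is not specific to SQL Server. Here is a short Python sketch of the same three-letter cycle (a to c, b to a, c to b), comparing a plain chain of replacements, the placeholder approach, and a single-pass character mapping. The function names are made up for this illustration, and the single-pass variant relies on Python's str.translate, which is not something T-SQL's REPLACE offers.

```python
def chained(text):
    """Each replace sees the output of the previous one, so letters get remapped twice."""
    return text.replace("a", "c").replace("b", "a").replace("c", "b")


def with_placeholder(text):
    """Park the original 'c' in a placeholder (the extra bucket) before overwriting it."""
    placeholder = "\x01"  # assumed not to occur in the input
    return (text.replace("c", placeholder)
                .replace("a", "c")
                .replace("b", "a")
                .replace(placeholder, "b"))


# Single pass: every character is looked up exactly once.
CYCLE = str.maketrans("abc", "cab")


def single_pass(text):
    return text.translate(CYCLE)


print(chained("abc"), chained("cbc"))   # two different inputs collide
print(with_placeholder("abc"))
print(single_pass("abc"))
```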

replace() concat() and substr() add character at specific location in string cannot get the position right

End result: all 3 fields should be merged (solved OK), and the character "T" should be added as the 5th character in the merged string (no other characters should be removed or altered in sequence). (see all specifics below).
What am I doing wrong?
Data is in the following format:
data1: AL
data2: 33 0230S 0440E
data3: SW
Here is my current sql:
replace(concat(b.data1,
substr(b.data2, 4, 1),
'T',
substr(b.data2, 1),
b.data3), ' ', '')
AS MergedData
The final output should look like:
AL33T0230S0440ESW
I've been able to get the "T" placed at random locations, but cannot get it consistently added as the 5th character from the start of the string.
Use:
replace only on data2 (because that's the only field that needs it), then
concat() to join it all up, and finally
the insert() function to insert the T
(Don't use substr at all)
insert(concat(data1, replace(data2, ' ', ''), data3), 5, 0, 'T')
Here's a test:
set @data1 := 'AL', @data2 := '33 0230S 0440E', @data3 := 'SW';
select
insert(concat(@data1, replace(@data2, ' ', ''), @data3), 5, 0, 'T')
as MergedData;
Output:
+-------------------+
| MergedData        |
+-------------------+
| AL33T0230S0440ESW |
+-------------------+
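For comparison, the same three steps (strip the spaces from data2, join the fields, insert the marker at a 1-based position) can be sketched in Python. The function name merge_with_marker and its parameters are made up for the illustration.

```python
def merge_with_marker(data1, data2, data3, marker="T", position=5):
    """Join the fields, then insert marker so it becomes character number `position`."""
    merged = data1 + data2.replace(" ", "") + data3
    index = position - 1  # convert the 1-based position to a 0-based slice index
    return merged[:index] + marker + merged[index:]


result = merge_with_marker("AL", "33 0230S 0440E", "SW")
print(result)
```

Because the marker is inserted into the already merged string, no existing characters are removed or reordered.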
Random locations seem odd; this seems to work, though:
replace(concat(b.data1,
substr(b.data2, 1, 2),
'T',
substr(b.data2, 4),
b.data3), ' ', '')
Demo here.
Find the position of the first space in data2, replace it with T, remove the rest of the spaces in the resulting string, then concatenate it with the two other values:
CONCAT(
b.data1,
REPLACE(INSERT(b.data2, LOCATE(' ', b.data2), 1, 'T'), ' ', ''),
b.data3
) AS MergedData