I would like to compute Jaccard Similarity on text using R.
I already found a way to compute JS using a function, which works fine when I apply it stand-alone.
I have a dataset with utterances in conversation. I would like to add a column that presents the Jaccard similarity of each utterance with the (immediate) previous one.
Like I said I already use a function to compute JS.
jaccard <- function(a, b) {
  intersection = length(intersect(a, b))
  union = length(a) + length(b) - intersection
  return (intersection/union)
}
I have already tried multiple things looking like this:
Text$J <- 0
# for every row in DT
for (i in 1:length(Text)) {
  if(i==1) {
    #using NA at first line
    Text[i,2] <- NA
  } else {
    Text$J <- jaccard(Text$Utterance,Text$Utterance[i-1])
  }
}
Is it possible to integrate a function like the above in a 'for every row' loop? So far, my attempts have not been successful, but that might be me. What happens in most cases is that it just pastes one Jaccard value into the whole column. Thank you in advance!
I have the following haskell code:
Why doesn't x1 match the pattern in function f?
It's pretty hard to read as-is. Let's use some creative whitespace to line things up.
f    ([_        ]:[(x,[xs ])]:[y        ,ys       ]:[]) = 1
x1 = [[(1,[1,2])],[(1,[1,2])],[(1,[1,2]),(1,[1,2])],[]]
Okay. So there's actually a couple different things that aren't going as you expect!
[xs] does not match [1, 2], because [xs] is a one-element list and [1, 2] is a two-element list (possible fix: xs instead of [xs])
[y, ys] happens to match, but I suspect not in the way you intended: y matches the first element of the list, as I think you intend, but ys matches the second element of the list, not the remainder of the list (possible fix: (y:ys) instead of [y, ys])
your pattern's :[] matches the closing bracket of a list definition, not a final [] element (possible fix: :[]:[] instead of :[]; the first [] there matches the element, and the second [] matches the end-of-list marker)
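Putting the three possible fixes together, here is a minimal sketch of a version that does match x1. The Entry type synonym, the Int element type, and the fallback clause are assumptions added for illustration; they are not in the original code.

```haskell
-- Assumed element type; the question does not give a signature.
type Entry = (Int, [Int])

-- The original pattern with all three fixes applied:
--   xs instead of [xs], (y:ys) instead of [y, ys], and :[]:[] instead of :[]
f :: [[Entry]] -> Int
f ([_] : [(x, xs)] : (y : ys) : [] : []) = 1
f _ = 0  -- fallback clause (an addition) so that f is total

x1 :: [[Entry]]
x1 = [[(1,[1,2])], [(1,[1,2])], [(1,[1,2]),(1,[1,2])], []]

main :: IO ()
main = print (f x1)  -- prints 1
```

Here ys binds to the rest of the third list, [(1,[1,2])], and the first [] in :[]:[] matches the empty fourth element of x1.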
In my use of Lilypond, I often face the same kind of problems: Say I have four scores (3-4 lines each) that fit in two pages but not necessarily in one.
I refuse to have page breaks within scores. If possible, I want all the scores on the same page. When it's not possible however, I would like the page break to occur between the first and second scores. If that is not possible either, between the second and the third. And only if that's really necessary between the third and the fourth. That is, by order of preference, | representing the page break:
1 2 3 4 |
1 | 2 3 4
1 2 | 3 4
1 2 3 | 4
Is there a way to achieve that without trying and adding the page breaks myself? Maybe by having page-break penalties going in increasing order after each score (but remaining smaller than the penalty for adding a new page)?
Thank you in advance for your help.
You should use ly:page-turn-breaking (see Optimal page turning in the documentation). You'll probably have to play also with \pageTurn, \noPageTurn, \allowPageTurn in order to have the best control.
Here's a minimal example:
\version "2.19.82"
\header {
  title = "Page turn breaking"
}
\paper {
  % The default page breaking will make Score 1 end at beginning of page 2.
  % The following option prevents this and keeps Score 1 all in the first page.
  page-breaking = #ly:page-turn-breaking
}
\score {
  \header { piece = "Score 1" }
  \new Staff {
    \clef "treble_8"
    \repeat unfold 14 { c'1*4 }
  }
  \layout {
    indent = 0
    system-count = 14
  }
}
\score {
  \header { piece = "Score 2" }
  \new Staff {
    \clef "treble_8"
    \repeat unfold 13 { e'2 f }
  }
  \layout {
    indent = 0
    system-count = 13
  }
}
So I'm working on a spreadsheet in Google Docs, and my question is: I would like to combine, say, columns A, B, C, etc. in column I with a space between them. I'm currently using something like this: =A1&" "&B1&" "&C1&" "&etc... This works fine and dandy, but if a cell is blank I would like to ignore it. Should this be done via script or formula?
So in my head I'm thinking: if cell A has a value, then grab it and combine it with B (if that contains a value; if not, leave blank or skip). But I'm not good at PHP, so any help would be great!!! Happy NY to everyone ; )
Here's a custom function that will return a string with the given range values joined by the given separator. Any blanks are skipped.
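function joinVals( rangeValues, separator ) {
  // Strict comparison, so that a numeric 0 is not treated as a blank cell
  function notBlank(element) {return element !== '';}
  // rangeValues is a 2D array; only its first row is joined
  return rangeValues[0].filter(notBlank).join(separator);
}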
Examples:
   A  B  C  Formula                  Result
1  1  2  3  =joinVals(A1:C1," x ")   1 x 2 x 3
2  1  2     =joinVals(A2:C2," x ")   1 x 2
3  1     3  =joinVals(A3:C3," x ")   1 x 3
4  1  2  3  =joinVals(A4:C4)         1,2,3
By using IF and ISBLANK you can determine whether a cell should be included, or ignored. Like this:
=if(ISBLANK(A1),"",A1 & " ")
That reads "if the cell is blank, ignore it, otherwise echo it with a trailing space." You can daisy-chain a series of those together:
=if(ISBLANK(A1),"",A1 & " ")&if(ISBLANK(B1),"",B1 & " ")&if(ISBLANK(C1),"",C1 & " ")...
That gets pretty long and repetitive. Adding ARRAYFORMULA and JOIN, we can have that repetitive piece apply across a range of cells, A1:F1 in this case:
=JOIN("",ARRAYFORMULA(IF(ISBLANK(A1:F1),"",A1:F1&" ")))
How can I define a cluster in Haskell using list comprehension?
I want to define a function for the cluster:
( a b c ) = [ a <- [1 .. 10],b<-[2 .. 10], c = (a, b)]
In your comment you gave the example [(1,2,1),(1,3,1),(1,4,1),(1,5,1),(1,6,1),(1,7,1)].
In that example, only the middle number changes, the other two are always 1. You can do this particular one with
ones = [(1,a,1)| a<-[1..7]]
However, you might want to vary the other ones. Let's have a look at how that works, but I'll use letters instead to make it clearer:
> [(1,a,b)| a<-[1..3],b<-['a'..'c']]
[(1,1,'a'),(1,1,'b'),(1,1,'c'),(1,2,'a'),(1,2,'b'),(1,2,'c'),(1,3,'a'),(1,3,'b'),(1,3,'c')]
You can see that the letters are varying more frequently than the numbers - the a<-[1..3] is like an outer loop, with b<-['a'..'c'] being the inner loop.
You could copy the b into the first of the three elements of the tuple:
> [(b,a,b)| a<-[1..3],b<-['a'..'b']]
[('a',1,'a'),('b',1,'b'),('a',2,'a'),('b',2,'b'),('a',3,'a'),('b',3,'b')]
Or give each its own varying input
> [(a,b,c)| a<-[1..2],b<-['a'..'b'],c<-[True,False]]
[(1,'a',True),(1,'a',False),(1,'b',True),(1,'b',False),(2,'a',True),(2,'a',False),(2,'b',True),(2,'b',False)]
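Coming back to the definition in the question, here is a minimal sketch of what it might have been aiming for, assuming the intent was a list of pairs (a, b) with a from 1 to 10 and b from 2 to 10. The names pairs and pairs' are hypothetical, and this reading of the question is a guess.

```haskell
-- Every pair (a, b); a is the outer loop, b the inner loop.
pairs :: [(Int, Int)]
pairs = [(a, b) | a <- [1 .. 10], b <- [2 .. 10]]

-- The same list, naming the pair with a let binding inside the
-- comprehension, which is the valid form of "c = (a, b)".
pairs' :: [(Int, Int)]
pairs' = [c | a <- [1 .. 10], b <- [2 .. 10], let c = (a, b)]

main :: IO ()
main = do
  print (take 3 pairs)     -- prints [(1,2),(1,3),(1,4)]
  print (pairs == pairs')  -- prints True
```

Inside a comprehension, a generator uses <- while a local definition needs let; a bare c = (a, b) is not accepted there.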