I'm stuck and don't know how to proceed. How do I order results like the following?
10 x 2 ml
10 x 10 ml
4 x 20 ml
10 x 2 ml should come first because 2 ml is smaller than 10 ml.
And then order by the number that comes before the multiplication sign.
This is how I solved my own question:
ORDER BY SUBSTR(size, INSTR(size, 'x') + 2) + 0, size + 0
You could try this, but it's really ugly, especially if the tables are big and you need performance:
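ORDER BY TRIM(REPLACE(REPLACE(field_name,CONCAT(SUBSTRING_INDEX(field_name,'x',1),'x'),''),'ml','')) + 0
It removes the prefix up to and including the 'x' (for example, SUBSTRING_INDEX('ABCx123ml','x',1) returns 'ABC') and the 'ml' suffix by replacing them with blanks, then trims the result, leaving only the value needed for ordering. The + 0 converts that value to a number, so that 10 sorts after 2 instead of before it as it would in a string comparison.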
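To check the ordering logic outside the database, the same two-part key (volume first, then the count before the 'x') can be sketched in Python. The function name size_sort_key and the sample list are made up for illustration, and the sketch assumes every value has the exact form "<count> x <volume> ml".

```python
def size_sort_key(size):
    """Return (volume, count) for a string such as '10 x 2 ml'."""
    # Split once on the multiplication sign: '10 ' and ' 2 ml'.
    count_text, volume_text = size.split("x", 1)
    volume = int(volume_text.replace("ml", "").strip())
    count = int(count_text.strip())
    # Sort by volume first, then by the number in front of the 'x'.
    return (volume, count)


sizes = ["10 x 10 ml", "4 x 20 ml", "10 x 2 ml"]
ordered = sorted(sizes, key=size_sort_key)
print(ordered)  # ['10 x 2 ml', '10 x 10 ml', '4 x 20 ml']
```

Converting both parts to integers is what makes 2 sort before 10; comparing the raw strings would not.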
I have a matrix (size: 28 columns and 47 rows) with numbers. This matrix has an extra row that contains headers for the columns ("ordinal" and "nominal").
I want to use the Gower distance function on this matrix. The documentation here says that:
The final dissimilarity between the ith and jth units is obtained as a weighted sum of dissimilarities for each variable:
d(i,j) = sum_k(delta_ijk * d_ijk ) / sum_k( delta_ijk )
In particular, d_ijk represents the distance between the ith and jth unit computed considering the kth variable. It depends on the nature of the variable:
factor or character columns are considered as categorical nominal variables and d_ijk = 0 if x_ik = x_jk, 1 otherwise;
ordered columns are considered as categorical ordinal variables and the values are substituted with the corresponding position index, r_ik, in the factor levels. These position indexes (that are different from the output of the R function rank) are transformed in the following manner:
z_ik = (r_ik - 1)/(max(r_ik) - 1)
These new values, z_ik, are treated as observations of an interval scaled variable.
As far as the weight delta_ijk is concerned:
delta_ijk = 0 if x_ik = NA or x_jk = NA;
delta_ijk = 1 in all the other cases.
I know that there is a gower.dist function, but I must do it that way.
So, for "d_ijk", "delta_ijk" and "z_ik", I tried to make functions, as I didn't find a better way.
I started with "delta_ijk" and I tried this:
Delta=function(i,j){for (i in 1:28){for (j in 1:47){
+{if (MyHeader[i,j]=="nominal")
+ result=0
+{else if (MyHeader[i,j]=="ordinal") result=1}}}}
+;result}
But I got an error. So I got stuck and I can't do the rest.
P.S. Excuse me if I make mistakes, but English is not a language I use very often.
Why do you want to reinvent the wheel billyt? There are several functions/packages in R that will compute this for you, including daisy() in package cluster which comes with R.
First things first though, get those "data type" headers out of your data. If this truly is a matrix then character information in this header row will make the whole matrix a character matrix. If it is a data frame, then all columns will likely be factors. What you want to do is code the type of data in each column (component of your data frame) as 'factor' or 'ordered'.
df <- data.frame(A = c("ordinal",1:3), B = c("nominal","A","B","A"),
C = c("nominal",1,2,1))
Which gives this --- note that all are stored as factors because of the extra info.
> head(df)
A B C
1 ordinal nominal nominal
2 1 A 1
3 2 B 2
4 3 A 1
> str(df)
'data.frame': 4 obs. of 3 variables:
$ A: Factor w/ 4 levels "1","2","3","ordinal": 4 1 2 3
$ B: Factor w/ 3 levels "A","B","nominal": 3 1 2 1
$ C: Factor w/ 3 levels "1","2","nominal": 3 1 2 1
If we get rid of the first row and recode into the correct types, we can compute Gower's coefficient easily.
> headers <- df[1,]
> df <- df[-1,]
> DF <- transform(df, A = ordered(A), B = factor(B), C = factor(C))
> ## We've previously shown you how to do this (above line) for lots of columns!
> str(DF)
'data.frame': 3 obs. of 3 variables:
$ A: Ord.factor w/ 3 levels "1"<"2"<"3": 1 2 3
$ B: Factor w/ 2 levels "A","B": 1 2 1
$ C: Factor w/ 2 levels "1","2": 1 2 1
> require(cluster)
> daisy(DF)
Dissimilarities :
2 3
3 0.8333333
4 0.3333333 0.8333333
Metric : mixed ; Types = O, N, N
Number of objects : 3
This gives the same result as gower.dist() for this data, although in a slightly different format; as.matrix(daisy(DF)) would be equivalent:
> gower.dist(DF)
[,1] [,2] [,3]
[1,] 0.0000000 0.8333333 0.3333333
[2,] 0.8333333 0.0000000 0.8333333
[3,] 0.3333333 0.8333333 0.0000000
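As a cross-check, the three values above can be reproduced by hand from the definition quoted in the question. The following is only a minimal Python sketch, not how daisy() or gower.dist() are implemented. The function name gower_pair and its arguments are made up for illustration. It assumes ordinal values are given as 1-based ranks, that max(r_ik) equals the number of levels, and that the range of z_ik is 1.

```python
from itertools import combinations


def gower_pair(row_i, row_j, kinds, n_levels):
    """Gower dissimilarity between two rows of nominal/ordinal values.

    kinds[k] is "nominal" or "ordinal". Ordinal values are 1-based ranks
    and n_levels[k] is the number of ordered levels (assumed > 1).
    None marks a missing value (NA).
    """
    total, weight = 0.0, 0
    for x_i, x_j, kind, levels in zip(row_i, row_j, kinds, n_levels):
        if x_i is None or x_j is None:
            continue  # delta_ijk = 0: this variable is skipped
        if kind == "nominal":
            d = 0.0 if x_i == x_j else 1.0
        else:
            # z_ik = (r_ik - 1) / (max(r_ik) - 1), then treated as interval
            z_i = (x_i - 1) / (levels - 1)
            z_j = (x_j - 1) / (levels - 1)
            d = abs(z_i - z_j)  # assumes the range of z is 1
        total += d
        weight += 1  # delta_ijk = 1
    return total / weight if weight else float("nan")


# The three rows of DF from the example: A (ordinal), B and C (nominal).
rows = [(1, "A", "1"), (2, "B", "2"), (3, "A", "1")]
kinds = ("ordinal", "nominal", "nominal")
n_levels = (3, None, None)

for i, j in combinations(range(len(rows)), 2):
    value = gower_pair(rows[i], rows[j], kinds, n_levels)
    print(i + 1, j + 1, round(value, 7))
# 1 2 0.8333333
# 1 3 0.3333333
# 2 3 0.8333333
```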
You say you can't do it this way? Can you explain why not? You seem to be going to some degree of effort to do something that other people have already coded up for you. This isn't homework, is it?
I'm not sure what your logic is doing, but you are putting too many "{" in there for your own good. I generally use the {} pairs to surround the consequent-clause:
Delta=function(i,j){for (i in 1:28) {for (j in 1:47){
if (MyHeader[i,j]=="nominal") {
result=0
# the "{" in the next line before else was sabotaging your efforts
} else if (MyHeader[i,j]=="ordinal") { result=1} } }
# return result only after both loops have closed
result
}
Thanks Gavin and DWin for your help. I managed to solve the problem and find the right distance matrix. I used daisy() after I recoded the class of the data and it worked.
P.S. The solution that you suggested at my other topic for changing the class of the columns:
DF$nominal <- as.factor(DF$nominal)
DF$ordinal <- as.ordered(DF$ordinal)
didn't work. It changed only the first nominal and ordinal column.
Thanks again for your help.
I don't really understand how modulus division works.
I was calculating 27 % 16 and wound up with 11 and I don't understand why.
I can't seem to find an explanation in layman's terms online.
Can someone elaborate on a very high level as to what's going on here?
Most explanations miss one important step. Let's fill the gap using another example.
Given the following:
Dividend: 16
Divisor: 6
The modulus function looks like this:
16 % 6 = 4
Let's determine why this is.
First, perform integer division, which is similar to normal division, except that the fractional part of the result is discarded:
16 / 6 = 2
Then, multiply the result of the above division (2) with our divisor (6):
2 * 6 = 12
Finally, subtract the result of the above multiplication (12) from our dividend (16):
16 - 12 = 4
The result of this subtraction, 4, is the remainder, and it is the same as the result of our modulus above!
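The three steps above can be written out as a short Python sketch. The variable names are only illustrative; // is Python's integer division operator.

```python
dividend, divisor = 16, 6

quotient = dividend // divisor    # step 1: integer division gives 2
product = quotient * divisor      # step 2: 2 * 6 = 12
remainder = dividend - product    # step 3: 16 - 12 = 4

print(remainder)           # 4
print(dividend % divisor)  # 4
```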
The result of a modulo division is the remainder of an integer division of the given numbers.
That means:
27 / 16 = 1, remainder 11
=> 27 mod 16 = 11
Other examples:
30 / 3 = 10, remainder 0
=> 30 mod 3 = 0
35 / 3 = 11, remainder 2
=> 35 mod 3 = 2
The simple formula for calculating modulus is :-
[Dividend-{(Dividend/Divisor)*Divisor}]
So, 27 % 16 :-
27- {(27/16)*16}
27-{1*16}
Answer= 11
Note:
All calculations are with integers. In case of a decimal quotient, the part after the decimal point is to be ignored/truncated.
e.g. 27/16 = 1.6875 is to be taken as just 1 in the formula above; the 0.6875 is ignored.
Integer division in many programming languages (C and Java, for example) works the same way, truncating the part after the decimal point.
Maybe the example of a clock could help you understand modulo.
A familiar use of modular arithmetic is the 12-hour clock, in which the day is divided into two 12-hour periods.
Let's say the current time is 15:00.
You could also say it is 3 pm.
This is exactly what modulo does:
15 / 12 = 1, remainder 3
This example is explained in more detail on Wikipedia: Wikipedia Modulo Article
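The clock conversion can be sketched in Python as follows. The function name to_12_hour is made up for illustration, and showing hour 0 as 12 is a clock-face convention added on top of the plain modulo.

```python
def to_12_hour(hour_24):
    """Convert an hour on the 24-hour clock to the 12-hour clock face."""
    hour = hour_24 % 12
    # A clock face shows 12 rather than 0.
    return 12 if hour == 0 else hour


print(divmod(15, 12))  # (1, 3): one full turn of the clock, 3 hours left over
print(to_12_hour(15))  # 3
```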
The modulus operator takes a division and returns whatever is left over from that calculation, the "remaining" data, so to speak. For example, 13 / 5 = 2 in integer division, which means there is 3 left over, or remaining, from that calculation. Why? Because 2 * 5 = 10, and thus 13 - 10 = 3.
The modulus operator does all that calculation for you, 13 % 5 = 3.
Modulus division is simply this: divide two numbers and return only the remainder.
27 / 16 = 1 with 11 left over, therefore 27 % 16 = 11
Likewise, 43 / 16 = 2 with 11 left over, so 43 % 16 = 11 too
Very simple: a % b is defined as the remainder of the division of a by b.
See the wikipedia article for more examples.
I would like to add one more thing:
it's easy to calculate modulo when the dividend is larger than the divisor
dividend = 5
divisor = 3
5 % 3 = 2
3)5(1
3
-----
2
but what if the dividend is smaller than the divisor?
dividend = 3
divisor = 5
3 % 5 = 3 ?? how
This is because 5 goes into 3 zero times, so the quotient is 0 and the whole dividend is left over as the remainder.
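A quick way to see the small-dividend case is Python's built-in divmod, which returns the quotient and the remainder together. This is only an illustrative check for non-negative numbers.

```python
quotient, remainder = divmod(3, 5)
print(quotient, remainder)  # 0 3

# Any non-negative dividend smaller than the divisor comes back unchanged.
unchanged = [n % 5 for n in range(5)]
print(unchanged)  # [0, 1, 2, 3, 4]
```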
I hope these simple steps will help:
20 % 3 = 2
20 / 3 = 6; do not include the .6667 – just ignore it
3 * 6 = 18
20 - 18 = 2, which is the remainder of the modulo
This is easier when the part after the decimal point (0.xxx) is short. Then all you need to do is multiply that fractional part by the divisor.
Ex: 32 % 12 = 8
You do 32/12=2.666666667
Then you throw the 2 away, and focus on the 0.666666667
0.666666667*12=8 <-- That's your answer.
(again, only easy when the number after the decimal is short)
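The fractional-part trick can be tried in Python, with one caveat: floating-point division is not exact, so the product has to be rounded. The function name mod_via_fraction is made up for illustration, and the sketch assumes positive integers small enough for floating-point rounding to recover the answer.

```python
def mod_via_fraction(dividend, divisor):
    """Remainder of two positive integers via the fractional part."""
    quotient = dividend / divisor        # e.g. 32 / 12 = 2.666...
    fraction = quotient - int(quotient)  # throw the 2 away, keep 0.666...
    # The product is only approximately 8 because of floating point.
    return round(fraction * divisor)


print(mod_via_fraction(32, 12))  # 8
print(mod_via_fraction(27, 16))  # 11
```

For anything beyond a quick mental check, the % operator avoids the rounding issue entirely.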
27 % 16 = 11
You can interpret it this way:
16 goes 1 time into 27 before passing it.
16 * 2 = 32, which is already more than 27.
So you could say that 16 goes one time into 27 with a remainder of 11.
In fact,
16 + 11 = 27
Another example:
20 % 3 = 2
Well, 3 goes 6 times into 20 before passing it.
3 * 6 = 18
To add up to 20 we need 2 more, so the remainder of the modulus expression is 2.
The only important thing to understand is that modulus (denoted here by % like in C) is defined through the Euclidean division.
For any two integers (d, q) with q nonzero, the following is always true:
d = ( d / q ) * q + ( d % q )
As you can see the value of d%q depends on the value of d/q. Generally for positive integers d/q is truncated toward zero, for instance 5/2 gives 2, hence:
5 = (5/2)*2 + (5%2) => 5 = 2*2 + (5%2) => 5%2 = 1
However, for negative integers the situation is less clear and depends on the language and/or the standard. For instance, -5/2 can return -2 (truncated toward zero, as before) but can also return -3 (in another language).
In the first case:
-5 = (-5/2)*2 + (-5%2) => -5 = -2*2 + (-5%2) => -5%2 = -1
but in the second one:
-5 = (-5/2)*2 + (-5%2) => -5 = -3*2 + (-5%2) => -5%2 = +1
As said before, just remember the invariant, which is the Euclidean division.
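Both conventions can be observed from Python, which makes a convenient sketch of the two cases above. Python's own // and % round the quotient down (the -3 case), while int(d / q) together with math.fmod behaves like truncation toward zero (the -2 case). The variable names are illustrative only.

```python
import math

d, q = -5, 2

# Floored division: what Python's // and % do.
floored_quotient, floored_remainder = d // q, d % q  # -3 and 1

# Truncated division: quotient rounded toward zero.
truncated_quotient = int(d / q)                      # -2
truncated_remainder = int(math.fmod(d, q))           # -1

# The invariant d = (d / q) * q + (d % q) holds in both cases.
assert d == floored_quotient * q + floored_remainder
assert d == truncated_quotient * q + truncated_remainder

print(floored_quotient, floored_remainder)      # -3 1
print(truncated_quotient, truncated_remainder)  # -2 -1
```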
Further details:
What is the behavior of integer division?
Division and Modulus for Computer Scientists
Modulus division gives you the remainder of a division, rather than the quotient.
It's simple: the modulus operator (%) returns the remainder after integer division. Let's take the example from your question. Why is 27 % 16 = 11? When you divide 27 by 16 (27/16), you get a quotient of 1 and a remainder of 11, and that is why your answer is 11.
Let's say you have 17 mod 6.
Which multiple of 6 gets you closest to 17 without going over it? It is 12, because the next multiple after 12 is 18, which is more than the 17 in 17 mod 6. You then subtract 12 from 17, which gives you your answer, in this case 5.
17 mod 6=5
Modulus division is pretty simple. It uses the remainder instead of the quotient.
1.0833... <-- Quotient
__
12|13
12
1 <-- Remainder
1.00 <-- Remainder can be used to find decimal values
.96
.040
.036
.0040 <-- remainder of 4 starts repeating here, so the quotient is 1.083333...
13/12 = 1R1, ergo 13%12 = 1.
It helps to think of modulus as a "cycle".
In other words, for the expression n % 12 with a non-negative n, the result will always be between 0 and 11, that is, < 12.
That means the sequence for the set 0..100 for n % 12 is:
{0,1,2,3,4,5,6,7,8,9,10,11,0,1,2,3,4,5,6,7,8,9,10,11,0,[...],4}
In that light, the modulus, as well as its uses, becomes much clearer.
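The cycle is easy to see by generating the sequence. This is a small Python sketch with illustrative variable names.

```python
cycle = [n % 12 for n in range(101)]  # n = 0..100

print(cycle[:14])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 0, 1]
print(cycle[-1])   # 4, because 100 = 8 * 12 + 4
print(max(cycle))  # 11, the result never reaches 12
```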
Write out a table starting with 0.
{0,1,2,3,4}
Continue the table in rows.
{0,1,2,3,4}
{5,6,7,8,9}
{10,11,12,13,14}
Everything in column one is a multiple of 5. Everything in column 2 is a
multiple of 5 with 1 as a remainder. Now the abstract part: You can write
that (1) as 1/5 or as a decimal expansion. The modulus operator returns only
the column, or in another way of thinking, it returns the remainder on long
division. You are dealing in modulo(5). Different modulus, different table.
Think of a Hash Table.
When we divide two integers we will have an equation that looks like the following:
A/B =Q remainder R
A is the dividend; B is the divisor; Q is the quotient and R is the remainder
Sometimes, we are only interested in what the remainder is when we divide A by B.
For these cases there is an operator called the modulo operator (abbreviated as mod).
Examples
16/5= 3 Remainder 1 i.e 16 Mod 5 is 1.
0/5= 0 Remainder 0 i.e 0 Mod 5 is 0.
-14/5= -3 Remainder 1 i.e. -14 Mod 5 is 1.
See Khan Academy Article for more information.
In computer science, a hash table uses the mod operator to decide where to store an element: A is the value after hashing, B is the table size, and R is the index of the slot where the element is inserted.
See How does a hash table works for more information
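The slot calculation can be sketched in a few lines of Python. This is a toy illustration, not a real hash table: the keys are small integers used directly as their own hash values, which is an assumption made to keep the example short, and the names table_size and slots are made up.

```python
table_size = 5                             # B: the table size
slots = [[] for _ in range(table_size)]

for hashed_value in (16, 0, 21, 7):        # A: values after hashing
    slot_index = hashed_value % table_size # R: the slot to use
    slots[slot_index].append(hashed_value)

print(slots)  # [[0], [16, 21], [7], [], []]
```

16 and 21 land in the same slot because both leave remainder 1 when divided by 5; a real hash table needs a strategy for such collisions.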
This was the best approach for me for understanding modulus operator. I will just explain to you through examples.
16 % 3
When you divide these two numbers, the remainder is the result. This is how I do it: keep adding 3 for as long as the total does not go past 16.
3 + 3 = 6; 6 + 3 = 9; 9 + 3 = 12; 12 + 3 = 15
So what is left to reach 16 is 1:
16 % 3 = 1
Here is one more example, 16 % 7: 7 + 7 = 14. What is left to reach 16? It is 2, so 16 % 7 = 2.
One more, 24 % 6: 6 + 6 = 12; 12 + 6 = 18; 18 + 6 = 24. Nothing is left over, so the remainder is zero: 24 % 6 = 0.
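The counting-up approach translates directly into a loop. The function name remainder_by_counting is made up for illustration, and the sketch assumes a non-negative dividend and a positive divisor.

```python
def remainder_by_counting(dividend, divisor):
    """Add the divisor repeatedly without passing the dividend."""
    total = 0
    while total + divisor <= dividend:
        total += divisor
    # Whatever is still missing to reach the dividend is the remainder.
    return dividend - total


print(remainder_by_counting(16, 3))  # 1
print(remainder_by_counting(16, 7))  # 2
print(remainder_by_counting(24, 6))  # 0
```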
When multiplying (or doing any arithmetic with) a binary number and a decimal number, would you simply convert and then multiply in decimal?
E.g., 3(base10) * 100(base2) would = 3 * 4 = 12?
You can multiply in any base as long as the base is the same for each operand.
In your example, you could have converted the 3(base10) to 11(base2) and multiplied:
11 * 100 = 1100
1100(Base2) = 12(base10)
Numbers are numbers. 3 * 0b100 will always equal 12, regardless of whether you use a lookup table or bit shifting to multiply them.
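This can be checked in Python, where the 0b prefix only changes how a literal is written and not the number it denotes. The variable names are illustrative.

```python
a = 3      # written in base 10
b = 0b100  # written in base 2, the same number as 4

product = a * b
print(product)       # 12
print(bin(product))  # 0b1100

# Writing both operands in base 2 gives the same number.
print(0b11 * 0b100 == product)  # True
```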
You would convert them to integers before multiplying, I would hope.
Thus they're all in binary.
Convert the base 2 number into base 10, then multiply.
For example:
1000 base 2 x 100 base 10
Converting 1000 base 2:
1000 base 2 = 2 x 2 x 2 = 8
So the multiplication result will be:
8 base 10 x 100 base 10 = 800 base 10 = 800
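The conversion step can be sketched in Python by walking through the binary digits from left to right. The function name binary_to_int is made up for illustration, and the sketch assumes the input is a string containing only 0s and 1s.

```python
def binary_to_int(digits):
    """Convert a string of binary digits to an integer."""
    value = 0
    for digit in digits:
        # Shift what we have one binary place to the left, add the new digit.
        value = value * 2 + int(digit)
    return value


eight = binary_to_int("1000")
print(eight)        # 8
print(eight * 100)  # 800
```

Python's built-in int("1000", 2) performs the same conversion.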
Hope problem solved...