Question marks appear in Distance (similarity measure object) - rapidminer

I am trying to use KLDivergence for measuring similarity between text data. But, although other similarity measures work fine, KLDivergence returns question marks in the result. What can cause this problem?

If any of the attributes has a value of zero, KLDivergence produces a missing result (shown as ?). This is probably caused by a division by zero.
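To illustrate why a zero can make the measure undefined, here is a minimal Python sketch of a naive KL divergence. It is not RapidMiner's implementation; the function names `kl_divergence` and `smooth` and the `eps` value are assumptions made for this example. A zero in either vector makes the term `p_i * log(p_i / q_i)` a division by zero or a `log(0)`, which the sketch reports as NaN (the equivalent of a missing value). Adding a small constant and renormalising is one common workaround.

```python
import math

def kl_divergence(p, q):
    """Naive KL divergence sum(p_i * log(p_i / q_i)).

    Returns NaN as soon as a zero is met, because the term would need
    a division by zero or log(0). (Hypothetical helper, for illustration.)
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0 or qi == 0.0:
            return math.nan
        total += pi * math.log(pi / qi)
    return total

def smooth(values, eps=1e-9):
    """Add a small constant to every entry and renormalise to sum to 1."""
    shifted = [v + eps for v in values]
    norm = sum(shifted)
    return [v / norm for v in shifted]

p = [0.5, 0.5, 0.0]   # contains a zero, e.g. a term absent from one document
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q))                  # undefined: reported as nan
print(kl_divergence(smooth(p), smooth(q)))  # finite after smoothing
```

Whether smoothing is appropriate depends on the data; for sparse text vectors a different similarity measure may be the simpler choice.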

access 2016 calculated fields round to zero

I am trying to include a simple density calculation in Access 2016, but the form returns a value of 0 if the input dimensions (mass or sphere diameter) are < 0.5. The field works fine for larger dimensions, so I assume that the smaller values are getting rounded to 0 somewhere along the way, but I can't figure out where.
For the inputs in my table, I have Field Names "green mass", "green pole", and "green equator" where the data type for each is set to "number", the Field Size is set to "single" (vs. double or decimal), and the Decimal Places is set to 4 digits.
The resulting density is displayed in the Field "apparent green density" where the data type is set to "calculated", the Result Type is set to "single" and the Decimal Places is set to 4 digits.
After looking at various Access forums and websites, I'm pretty sure I want to use single or double as my field size, but I've also tried decimal, byte, and integer, and I keep getting 0.
Can anyone explain why this isn't working?
The equation is below. It's a bit complicated because it's a 3-part IIf statement (if dimensions for a sphere are given, calculate the density of a sphere; if dimensions of a disc are given, calculate the density of a disc; if dimensions of a cube...). All three cases work for large dimensions (>0.5), but all three result in 0 for dimensions <0.5.
IIf([GreenPole],[GreenMass]/(3.14159265359/6*2.54^3*(([GreenPole]+[GreenEquator])/2)^3),IIf([GreenDia],([GreenMass]/(3.14159265359*([GreenDia]/2)^2*[GreenHeight]*2.54^3)),IIf([GreenLength],[GreenMass]/([GreenLength]*[GreenWidth]*[GreenThickness]*2.54^3),0)))
The first part of the equation, for the density of a sphere, is:
IIf([GreenPole],[GreenMass]/(3.14159265359/6*2.54^3*(([GreenPole]+[GreenEquator])/2)^3),0)
Oliver Jacot-Descombes got me started in the right direction. I don't have much experience with coding, but I think what happened is that the field used as the condition in my IIf statement is somehow converted into a boolean (yes/no) value, so anything less than 0.5 is rounded to "no", the truepart is skipped, and the result is 0.
I modified the code to:
IIf([GreenPole]>0,[GreenMass]/(3.14159265359/6*2.54^3*(([GreenPole]+[GreenEquator])/2)^3),0)
And everything works now. (I also modified the second and third IIf statements to IIf([GreenLength]>0.. and IIf([GreenDia]>0..)
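The difference between the two conditions can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, under the assumption that the bare field is rounded to a whole number before being tested as yes/no; it is not taken from the Access documentation, and the helper names are hypothetical. The density formula is the sphere case from the expression above.

```python
import math

IN_TO_CM = 2.54  # same conversion constant as in the Access expression

def sphere_density(mass, pole, equator):
    """Mass divided by the volume of a sphere whose diameter is the mean of pole and equator."""
    diameter_cm = IN_TO_CM * (pole + equator) / 2
    return mass / (math.pi / 6 * diameter_cm ** 3)

def condition_as_yes_no(value):
    # ASSUMPTION: models the observed behaviour, where the number seems to be
    # rounded to an integer before being interpreted as yes/no.
    return round(value) != 0

def density_implicit(mass, pole, equator):
    """Like IIf([GreenPole], ..., 0): the field itself is the condition."""
    return sphere_density(mass, pole, equator) if condition_as_yes_no(pole) else 0.0

def density_explicit(mass, pole, equator):
    """Like IIf([GreenPole]>0, ..., 0): an explicit comparison is the condition."""
    return sphere_density(mass, pole, equator) if pole > 0 else 0.0

print(density_implicit(1.0, 0.3, 0.3))  # small dimension is treated as "no"
print(density_explicit(1.0, 0.3, 0.3))  # explicit comparison keeps the calculation
```

Writing the comparison out makes the condition independent of how a number is converted to yes/no.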

Access UI chart text legend sorted as number

I have searched for a solution but I can't find one suitable on this problem.
I have a chart in Access where the Y-axis is text but starts with a number, so up along the Y-axis I get this:
I know why, but I don't know how to fix it.
They all have an ID which is fine. I can choose to put the ID on the Y-axis, but then the kW range can't be visualized.
How is this changed?
Changing the text to number is not possible as it needs to be like "a-b kW".
Thanks in advance.
For those who might get this problem, I found out why.
When selecting the data, it chose to use the "Total" function, and "Avg" because that is what the values are. (The picture below says "Group By" but it was automatically set to "Avg". I just forgot to change it when I took the snip for Stack.)
That results in:
But if I remove the "Total" function I get this:
So, removing the "Total" function works here...
In your query, can you sort by ID, but not display it (e.g., in the query designer, sort 'descending/ascending' on the ID field, but clear the 'display' check box)?
As an aside, I've always had a tough time with Access charts... on my last project, the customer wanted a rather complex bar chart, and I ended up drawing and resizing rectangle objects to get the look that the customer wanted.

Converting between RGB and HSL/HSV: What to do with overflows?

I've implemented some functions according to the HSL->RGB and HSV->RGB algorithms.
They mostly work fine, but I'm not sure what the right thing to do is when a color component overflows as a result of the conversion.
E.g., the red component ends up being 1.2 whereas the allowed range is [0..1]. If I multiply that by 255 I will obviously get a value that is invalid in the RGB world.
What is the correct way of handling this -- truncating (if > 1 then set to 1) or wrapping around (if > 1 then subtract 1)?
The R, G and B values cannot fall outside their range if you have implemented the standard algorithms correctly and the inputs are within their ranges.
Which algorithms have you implemented?
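One way to check an implementation is to compare it against a reference. The sketch below uses Python's standard `colorsys` module (not the asker's code) to convert HSL to 8-bit RGB. The helper names `clamp01` and `hsl_to_rgb8` are made up for this example. With inputs in [0, 1] the components stay in [0, 1] apart from possible rounding noise in the last bits, so the clamp is only a safety net, not a fix for a real overflow such as 1.2.

```python
import colorsys

def clamp01(x):
    """Guard against tiny floating-point excursions just outside [0, 1]."""
    return min(1.0, max(0.0, x))

def hsl_to_rgb8(h, s, l):
    """Convert HSL (each component in [0, 1]) to an 8-bit (r, g, b) tuple.

    Note that colorsys orders the arguments as H, L, S.
    """
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return tuple(round(clamp01(c) * 255) for c in (r, g, b))

print(hsl_to_rgb8(0.0, 1.0, 0.5))  # pure red
```

If a component comes out as 1.2, the usual suspects are an input outside its range (for example a hue in degrees instead of [0, 1]) or a mistake in the piecewise hue formula.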

How to divide tiny double precision numbers correctly without precision errors?

I'm trying to diagnose and fix a bug which boils down to X/Y yielding an unstable result when X and Y are small:
In this case, both cx and patharea increase smoothly. Their ratio is a smooth asymptote at high numbers, but erratic for "small" numbers. The obvious first thought is that we're reaching the limit of floating point accuracy, but the actual numbers themselves are nowhere near it. ActionScript "Number" types are IEEE 754 double-precision floats, so should have 15 decimal digits of precision (if I read it right).
Some typical values of the denominator (patharea):
0.0000000002119123
0.0000000002137313
0.0000000002137313
0.0000000002155502
0.0000000002182787
0.0000000002200977
0.0000000002210072
And the numerator (cx):
0.0000000922932995
0.0000000930474444
0.0000000930582124
0.0000000938123574
0.0000000950458711
0.0000000958000159
0.0000000962901528
0.0000000970442977
0.0000000977984426
Each of these increases monotonically, but the ratio is chaotic as seen above.
At larger numbers it settles down to a smooth hyperbola.
So, my question: what's the correct way to deal with very small numbers when you need to divide one by another?
I thought of multiplying numerator and/or denominator by 1000 in advance, but couldn't quite work it out.
The actual code in question is the recalculate() function here. It computes the centroid of a polygon, but when the polygon is tiny, the centroid jumps erratically around the place, and can end up a long distance from the polygon. The data series above are the result of moving one node of the polygon in a consistent direction (by hand, which is why it's not perfectly smooth).
This is Adobe Flex 4.5.
I believe the problem most likely is caused by the following line in your code:
sc = (lx*latp-lon*ly)*paint.map.scalefactor;
If your polygon is very small, then lx and lon are almost the same, as are ly and latp. The two products are both very large compared to the result, so you are subtracting two numbers that are almost equal.
To get around this, we can make use of the fact that:
x1*y2-x2*y1 = (x2+(x1-x2))*y2 - x2*(y2+(y1-y2))
= x2*y2 + (x1-x2)*y2 - x2*y2 - x2*(y1-y2)
= (x1-x2)*y2 - x2*(y1-y2)
So, try this:
dlon = lx - lon;
dlat = ly - latp;
sc = (dlon*latp-lon*dlat)*paint.map.scalefactor;
The value is mathematically the same, but the terms being subtracted are much smaller in magnitude, so the rounding error should be correspondingly smaller.
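The effect can be measured with exact rational arithmetic. The following Python sketch compares both forms of the expression against an exact reference computed with `fractions.Fraction`. The variable names follow the answer, but the sample coordinates are hypothetical and the scale factor is left out.

```python
from fractions import Fraction

def cross_naive(lx, ly, lon, latp):
    """Original form: difference of two large, nearly equal products."""
    return lx * latp - lon * ly

def cross_rewritten(lx, ly, lon, latp):
    """Rewritten form: the small differences are taken first."""
    dlon = lx - lon
    dlat = ly - latp
    return dlon * latp - lon * dlat

def cross_exact(lx, ly, lon, latp):
    """Exact value for the given doubles, using rational arithmetic."""
    return Fraction(lx) * Fraction(latp) - Fraction(lon) * Fraction(ly)

# Hypothetical sample: two vertices a few billionths of a degree apart.
lon, latp = 51.5074, -0.1278
lx, ly = lon + 3e-9, latp + 2e-9

exact = cross_exact(lx, ly, lon, latp)
err_naive = abs(Fraction(cross_naive(lx, ly, lon, latp)) - exact)
err_rewritten = abs(Fraction(cross_rewritten(lx, ly, lon, latp)) - exact)

print("naive error:    ", float(err_naive))
print("rewritten error:", float(err_rewritten))
```

The subtraction of two nearly equal doubles (`lx - lon`) is itself exact, which is why moving it in front of the multiplication helps.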
Jeffrey Sax has correctly identified the basic issue - loss of precision from combining terms that are (much) larger than the final result.
The suggested rewriting eliminates part of the problem - apparently sufficient for the actual case, given the happy response.
You may find, however, that if the polygon becomes (much) smaller again and/or farther away from the origin, the inaccuracy will show up again. In the rewritten formula the terms are still quite a bit larger than their difference.
Furthermore, there's another 'combining-large&comparable-numbers-with-different-signs' issue in the algorithm. The various 'sc' values in subsequent cycles of the iteration over the edges of the polygon effectively combine into a final number that is (much) smaller than the individual sc(i) are. (If you have a convex polygon you will find that there is one contiguous sequence of positive values and one contiguous sequence of negative values; in non-convex polygons the negatives and positives may be intertwined.)
What the algorithm is doing, effectively, is computing the area of the polygon by adding areas of triangles spanned by the edges and the origin, where some of the terms are negative (whenever an edge is traversed clockwise, viewing it from the origin) and some positive (anti-clockwise walk over the edge).
You get rid of ALL the loss-of-precision issues by defining the origin at one of the polygon's corners, say (lx,ly), and then adding the triangle surfaces spanned by the edges and that corner (so: transforming lon to (lon-lx) and latp to (latp-ly)). As an additional bonus, you need to process two fewer triangles, because the edges that link to the chosen origin corner obviously yield zero surfaces.
For the area-part that's all. For the centroid-part, you will of course have to "transform back" the result to the original frame, i.e. adding (lx,ly) at the end.
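As a rough illustration of this approach, here is a Python sketch of the standard shoelace centroid with the origin moved to the first vertex. It is not the code from the question; the function name `centroid` and the sample square are assumptions for this example.

```python
def centroid(points):
    """Centroid of a simple polygon given as [(x, y), ...].

    All coordinates are taken relative to the first vertex, so the cross
    products involve only small numbers; the result is shifted back at the end.
    """
    x0, y0 = points[0]
    local = [(x - x0, y - y0) for x, y in points]
    twice_area = 0.0
    sx = sy = 0.0
    n = len(local)
    for i in range(n):
        xa, ya = local[i]
        xb, yb = local[(i + 1) % n]
        cross = xa * yb - xb * ya  # zero for the two edges touching the origin vertex
        twice_area += cross
        sx += (xa + xb) * cross
        sy += (ya + yb) * cross
    if twice_area == 0.0:
        raise ValueError("degenerate polygon (zero area)")
    # Cx = sum((xa + xb) * cross) / (6 * A), with A = twice_area / 2
    return (x0 + sx / (3.0 * twice_area), y0 + sy / (3.0 * twice_area))

# Hypothetical tiny square, far from the origin.
side = 2.0 ** -30
square = [(1000.0, 2000.0), (1000.0 + side, 2000.0),
          (1000.0 + side, 2000.0 + side), (1000.0, 2000.0 + side)]
print(centroid(square))
```

The sketch still visits every edge for simplicity; skipping the two edges that touch the chosen vertex is an optimisation, since their cross products are exactly zero.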

Math.round() wrong calculation in as3?

Can anyone explain this?
what am I doing wrong?
Round is doing the correct thing. 0.285 cannot be exactly represented as a binary floating point value. As you see, when multiplied by 100 it approximates to 28.4999999... which is less than 28.5, so the value is rounded down.
Math.round(x:Number) rounds x to the nearest integer value. In your case 28 is the nearest integer value for 28.499999999999996, so the behavior here is correct. What is weird is that 0.285 * 100 is not 28.5, but that is a consequence of the precision of the Number class in AS3. Here is a little more information about this and a possible solution:
Innacurate math results
Also you can see this SO question:
Very strange number operation issue
Hope this helps.
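The same effect can be reproduced in any language that uses IEEE 754 doubles. The Python sketch below is an illustration, not ActionScript: `round_half_up` only mimics the round-half-up behaviour described above, and `round_decimal_string` is a hypothetical helper showing one possible workaround, which is to do the scaling in decimal arithmetic so that the value really is 28.5 before rounding.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(x):
    """Round to the nearest integer, with ties going up."""
    return math.floor(x + 0.5)

def round_decimal_string(text, factor=100):
    """Scale a decimal literal exactly, then round half up."""
    scaled = Decimal(text) * factor
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

product = 0.285 * 100
print(product)                        # slightly below 28.5 in binary floating point
print(round_half_up(product))         # 28
print(round_decimal_string("0.285"))  # 29
```

The decimal route only helps if the value is available as text or as an exact decimal to begin with; converting an already rounded double to a decimal keeps the binary error.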