Terminology - What is the complement of memoization? - duplicates

If I am correct, memoization is associated with READ operations. What is the term which refers to the same technique used for WRITE operations?
Example:
Let's say an app receives the following inputs,
0 1 2 2 2 2 2 3 4 3 3 3 3 3 4 4 4 2 1 2 5 5 5 5 3
Instead of saving everything, we tune the app to save only the transitions. (i.e ignore consecutive duplicates)
0 1 2 3 4 3 4 2 1 2 5 3
What is the (standard) term that can be used to describe the above technique?
I feel bad about using the same term since the final outcome is quite different. In READ operations, if memoization is used, the final outcome will remain the same. But in above example for WRITE operations, the final output is different from the original input.

"Deduplication of adjacent/most-recent entries". Your example looks like what the uniq tool does.
If you preserved the count of duplicates, it would be a form of RLE (run-length encoding).
As an aside, I guess you mean memoization as a way to speed up reads, and this as a method to speed up writes, but I wouldn't say that's the opposite of memoization, since it's opposite of the general goal, but not related to the particular method.

As far as I know, there is no applicable terminology for what you are asking.
And it is NOT memoization ... or (to my mind) the reverse of memorization.
(Likewise, there is no word in English for a cat with three legs.)

Related

Reverse Check digit algorithm

I am trying to reverse engineer an algorithm used to generate a check digit.
Numbers are 8 digits long and the last digit is the check digit. I have thousands of valid numbers to test it on.
I have try several standard algorithm but come up with nothing
Here is some examples of valid numbers:
3482145 6
3482146 4
3482147 2
3482148 3
3482149 9
3482150 1
3482151 0
3482152 8
3482153 6
3482154 4
3482155 2
3482156 3
3482157 9
3482158 7
3482159 5
3482160 8
3482161 6
Is it possible to calculate this? Any ideas?
The amount of data you provided is insufficient to adequately assess the algo. The only thing I can see right now is that the sequence 64239xx8 is repeated twice, and the last digit is also 6.
Not an actual answer, I`m afraid, but StackOverflow does not yet allow me to leave comments.
The algorithm is this:
coef[]={4,2,1,6,3,7,9}
modulus 11
Case 10->0
Case 0->3

Reduction of odd number of elements CUDA

It seems that it possible to make reduction only for odd number of elements. For example, it needs to sum up numbers. When I have even number of elements, it will be like this:
1 2 3 4
1+2
3+3
6+4
But what to do when I have, for instance 1 2 3 4 5? The last iteration is the sum of three elements 6+4+5 or what? I saw the same question here, but couldn't find the answer.
A parallel reduction will add pairs of elements first:
1 1+3 4+6
2 2+4
3
4
Your example with an odd number of elements would typically be realized as:
1 1+4 5+3 8+7
2 2+5 7+0
3 3+0
4 0+0
5
0
0
0
That is to say, typically a parallel reduction will work with a power-of-2 set of threads, and at most one threadblock (the last one) will have less than a full complement of data to work with. The usual method to handle this is to zero-pad the data out to the threadblock size. If you study the cuda parallel reduction sample code, you'll find examples of this.

binary conversion using 3 figures system 0,1,2

Suppose system is evolved by extraterrestrial creatures having only 3 figures and they use the figures 0,1,2 with (2>1>0) ,How to represent the binary equivalent of 222 using this?
I calculated it to be 22020 but the book answers it 11010 .how this.Shouldn't i use the same method to binary conversion as from decimal to binary except using '3' here ???
I think you meant base 3 (not binary) equivalent of decimal 222
22020 in base 3 is 222 in decimal.
220202(your answer) in base 3 is 668 in decimal.
11010 (according to book) in base 3 is 111 in decimal.
222 in binary is 11011110
May be i will be able to tell where you went wrong if you tell the method you used to calculate base 3 equivalent of 222
Edit:
Sorry I could not understand the problem until you provide the link. It says what is binary equivalent of 222 (remember 222 is in base 3)
222 in base 3 = 26 in decimal (base 10)
26 in decimal = 11010 in binary
Mark it as accepted if it solved your problem.
Assuming the start is decimal 222.
Well, without knowing the system used in the book I would decompose it by hand in the following way:
3^4 = 81,
3^3 = 27,
3^2 = 9,
3^1 = 3,
So 81 fits twize into 222 , so the 4th "bit" has the value 2.
Remaining are 60. 27 fits twice into 60 so the next bit is 2 again.
Remaining are 6. 9 fits not into 6, so the next bit is 0.
Remaining are 6. 3 fits twice into 6, so the next bit is 2.
remaining are 0. so the last bit 0
This gives as result 22020.
One quick sanity check on how many "bits" are needed for representation of decimal 222 in a number system with 3 Numbers: 1+log(222)/log(3)=5,9 => nearly 6 "bits" are needed, which goes well with the result 22020.
First see how many figures you have, here we have 3 so
we have to convert 222 to binary when we have only 3 figures so
2×3^2+2×3^1+2×3^0 (if the number were being 121 then →
1×3^2+2×3^1+1×3^0)
which gives 26 then divide this with 2 until we don't get 1/2
when reminder is 1 then write 1 if 0 then 0 you will get
so we get 01011 just reverse it we have the answer
11010
enter image description here

Binary Voltage relation to endian

Lately, I have been studying 6502 microprocessor and came across the fact that binary and voltage relate. 0 for 0volts and 1 for 5 volts.
Now I just recently learned about endian-ness as well. So trying to learn more about both of these topics I was wondering if someone could explain the relation of binary/voltage and the little or big endian.
If there really isn't a difference because 00000001 would only use 5 volts and 10000000 would only used 5 volts as well. Then I am sorry for asking a useless topic. Now if that is the case, please share some more interesting knowledge about endian-ness, binary
and/or Voltage.
Unfortunately I don't have university experience so i am unsure if this is common knowledge, but thanks for any information that you provide.
They're not very related.
When you have a voltmeter and you read a single bit, a 0-volt would correspond to a 0, and a 5-volt would correspond to a 1. Or you could say "high voltage is 1, and low voltage is 0".
Now, to represent a number, let's simply say that we use powers of 2:
1 = 001
2 = 010
3 = 011
4 = 100
5 = 101
And so on. However, what I just used is little-endian: the end bit (the one on the right) is small, it represents 1 (if it's 1) or 0 (if it's 0), as opposed to the bit on the left (4 if it's 1, 0 if it's 0). If we flipped the order around, that would be big-endian.
You could think of each bit (each 0 or 1) as a different wire with either 0 or 5 volts on it.

calculating the density of a set

(I wish my mathematical vocabulary was more developed)
I have a website. On that website is a video. As a user watches the video, a bit of javascript stores how far they have gotten so far in the video. When they stop watching the video, that number of seconds is stored. There's no pattern to when the js will do this, unfortunately.
So if one person is watching the video, we might see this set:
3
6
8
10
12
16
And another person might get bored immediately:
1
3
This data is all stored in the same place, anonymously. So the sorted table with all this info would look like this:
1
3
3
6
8
10
12
16
Finally, the amount of times the video is started at all is stored. In this case it would be 2.
So. How do I get the average 'high-time' (the farthest reached point in the video) for all of the times the video was played?
I know that if we had a value for every second:
1
2
3
4
5
6
7
...
14
15
16
1
2
3
Then we could count up the values and divide by the number of plays:
(19) / 2 = 9.5
Or if the data was otherwise uniform, say in increments of 5, then we could count that up and multiply it by 5 (in the example, we would have some loss of precision, but that's ok):
5
10
15
5
(4) * 5 / 2 = 10
So it seems like I have a general function which would work:
count * 1/d = avg
where d is the density of the numbers (in the example above with 5 second increments, 1/5).
Is there a way to derive the density, d, from a set of effectively random numbers?
Why not just keep the last time that has been provided, and average across those? If you either throw away, or only pay attention to, the last number, it seems like you could just average over these.
You might also want to check out the term standard deviation as the raw average of this might not be the most useful measurement. If you have the standard deviation as well, it could help you realize that you have an average of 7, but it is composed of mostly 1's and 15's.
If you HAVE to have all the data, like you suggested, I will try and think about this a little bit more. I'm not totally certain how you can associate a value with all the previous values that came with it. Do you ALWAYS know the sequence by which numbers are output? If so, I think I know of a way you could derive the 'last' one, which might be slightly computationally expensive.
If you only have a sequence of integers, I think you may be able to increase each value (exponentially?) to 'compensate' for the fact that a later value 'contains' earlier values. I'm still working through this idea, but maybe it will give someone else a seed. What if you average over the sum of these, and then take the base2 logarithm of this average? Does that provide any kind of useful metric? That should 'weight' the later values to the point where they compensate for the sum of earlier values. I think.
In python-esk:
sum = 0
numberOf = 0
for node in nodes:
sum = sum + node.value ^ 2
numberOf = numberOf + 1
weightedAverage = log(sum/numberOf, 2)
print weightedAverage
print "Thanks Brian"
I think that #brian-stiner is on the right track in one of his comments.
Start with something like:
1
3
3
6
8
10
12
16
Turn that into numbers and counts.
1, 1
3, 2
6, 1
8, 1
10, 1
12, 1
16, 1
And then reading from the end down, find all of the points that happened more often than any remaining ones.
3, 2
16, 1
Take differences in counts.
3, 1
16, 1
And you have an estimate of stopping places.
This will not be an unbiased estimate. But if the JavaScript is independently inconsistent and the number of people is large, the biases should be fairly small.
It won't be right, but it will be close enough for government work.
Assuming increments are always around 5, some missing, some a bit longer or shorter. Then it won't be easy (possible?) to do this exactly. My suggestion: compute something like a 'moving count'. Similar to moving average.
So, for second 7: count how many numbers are 5,6,7,8 or 9 and divide by 5. That will give you a pretty good guess of how many people watched the 7th second. Do the same for second 10. The difference would be close to the number of the people who left between second 7 and 10.
To get the total time watched for each user, you'll have parse the list smallest to largest. If you have 4 views, you'll go through your list until you find that you no longer have 4 identical numbers, the last number where you had 4 identical numbers is the maximum of the first view. Then you'll look for when the 3 identical numbers stop, and so on. For example:
4 views data:
1111222233334445566778
4 views side by side:
1 1 1 1
2 2 2 2
3 3 3 3 <- first view max is 3 seconds
4 4 4 <- second view max is 4 seconds
5 5
6 6
7 7 <- third view max is 7 seconds
8 <- fourth view max is 8 seconds
EDIT- Oh, I just noticed that they are not uniform. In that case, the moving average would probably be your best bet.
The number of values roughly corresponds to the number of time periods in which your javascript sends the values (minus 1/2 if the video stop is accompanied with a obligatory time posting, since its moment is random within the interval).
If all clients have similar intervals and you know them, you may just use:
SELECT (COUNT(*) - 0.5) * 5.0 / (SELECT counter FROM countertable)
FROM ticktable
5.0 is the interval between the posts here.
Note that it does not even look at the values: you could as well just store "ticks".
For the max time, you could use MAX() on your field. Perhaps something like...
SELECT MAX(play_time) AS maxTime FROM video
Which would give you the longest time someone has played the video for.
If you want other things, like AVG() then you'll need more complex queries, for collecting on a per-user basis etc etc.
MySQL also contains a Standard Deviation function called STDDEV() and STD() which could help you too.