GIF Understanding Image Decoding - Lempel-Ziv-Welch - gif

I'm trying to write some code to convert a GIF file to another file format that I already know how to create. (I am trying to streamline conversion from GIF to GRF, a printer graphic file format.)
I am working off of information from Wikipedia (http://en.wikipedia.org/wiki/Graphics_Interchange_Format#Image_coding).
There is a section that describes converting from bytes to 9-bit codes. The examples they show are:
9-bit (hex)   binary   Bytes (hex)
00000000 00
100
0101000|1 51
028
111111|00 FC
0FF
00011|011 1B
103
0010|1000 28
102
011|10000 70
103
10|100000 A0
106
1|1000001 C1
107
10000011 83
00000001 01
101
0000000|1 01
I am able to generate the bytes given on the right side from the file. (I created the file exactly as the article describes: 3x5, with black pixels at 0,0 and 1,1, in MSPaint.)
What I am not understanding is how they are converting these bytes to the 9-bit hex codes.
How does 00 become 100? What does the bar (|) in the binary mean?

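Just realized what is happening: the codes are packed least significant bit first, so the bits to the right of the bar in the next byte become the high bits of the current code.
00000000 00 Add the bit after the bar in the next byte to the beginning - 100000000 = 100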
100
0101000|1 51 Next Digit - 000101000 = 028
028
111111|00 FC -011111111 = 0FF
0FF
00011|011 1B -100000011 = 103
103
0010|1000 28 -100000010 = 102
102
011|10000 70 -100000011 = 103
103
10|100000 A0 -100000110 = 106
106
1|1000001 C1 -100000111 = 107
107
10000011 83
00000001 01 -100000001 = 101
101
0000000|1 01
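The regrouping worked out above can be sketched in Python. This is a minimal sketch that assumes a fixed 9-bit code width, which holds for this small example; a real GIF decoder has to grow the code width as the LZW table fills. The name unpack_codes is made up for illustration.

```python
def unpack_codes(data, code_size=9):
    """Split a byte string into fixed-width codes, least significant bit first."""
    codes = []
    buffer = 0          # bits read so far that have not been emitted yet
    bits_in_buffer = 0
    mask = (1 << code_size) - 1
    for byte in data:
        # Each new byte is placed above the bits that are still pending.
        buffer |= byte << bits_in_buffer
        bits_in_buffer += 8
        while bits_in_buffer >= code_size:
            codes.append(buffer & mask)
            buffer >>= code_size
            bits_in_buffer -= code_size
    return codes  # leftover padding bits are dropped


# The eleven bytes from the right-hand column of the example.
data = bytes.fromhex("0051FC1B2870A0C1830101")
print([format(code, "03X") for code in unpack_codes(data)])
# ['100', '028', '0FF', '103', '102', '103', '106', '107', '101']
```

The seven bits left over after the last code are padding and are ignored.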


strange output after appending a column

I used cbind to append a column "class" to a data frame and got a new tdm1: tdm1 <- cbind(tdm1, class). So far, so good.
The content of class looks like this:
1 715
2 715
3 707
4 705
5 704
6 701
7 701
...
Then, after the cbind, I wanted to look at the class column using tdm1[,ncol(tdm1)]. Somehow I got "35 Levels: 156 174 205 250 295 324 335 340 343 345 348 349 361 370 375 381 382 428 439 451 455 701 704 705 706 ... 72" printed after the correct values for the entire column. It looks like a summary of the column values, and I don't know where it came from. This additional information makes my later knn classification behave strangely. How do I get rid of it?
Your object is a factor. Calling ?factor reveals:
factor returns an object of class "factor" which has a set of integer
codes the length of x with a "levels" attribute of mode character and
unique (!anyDuplicated(.))
The levels attribute being printed to your dismay reflects all the unique values contained within the object you are printing. To get rid of it, try:
as.numeric(as.character(tdm1[,ncol(tdm1)]))

Grab HTML table using XML

I am trying to read an html table using the package XML, but even though it looks easy, I haven’t managed to do it. I tried everything, but the names of the columns are always fixed by R as V1, V2, V3,…
This is the code:
require(XML)
tbl <- readHTMLTable("http://facedata.ornl.gov/ornl/npp_98-08.html",
header = c("year","ring","CO2", "stem","root","leaf","fine root", "NPP"),
skip.rows=c(1,2),colClasses=c(rep("factor",3),rep("numeric",5)))
Many thanks for your help
The first row of the table is causing trouble. It may be easiest to remove it:
library(XML)
appURL <- "http://facedata.ornl.gov/ornl/npp_98-08.html"
doc <- htmlParse(appURL)
removeNodes(doc["//table/tr[1]"]) # remove the first row with the troublesome header
myTable <- readHTMLTable(doc, which = 1)
> head(myTable)
Year Plot CO2 Stem Coarse Root Leaf Fine Root Total NPP
1 1998 1 elev 1540 127 362 168 2197
2 1998 2 elev 1487 139 418 175 2219
3 1998 3 amb 1085 112 333 231 1762
4 1998 4 amb 1204 113 368 185 1870
5 1998 5 amb 1136 109 382 56 1683
6 1999 1 elev 1218 98 475 295 2086

Change 1 bit in a file. What did I do wrong?

I had a file with this content:
00 00 00 00 00
I changed 1 bit. The changed file:
00 60 00 00 00
My teacher said that I don't know what a bit is. What did I do wrong? Please clarify this for me: the file has 5 blocks (10 digits). Is a bit one block (00)? Or is a bit one digit of a pair (0)? Thank you.
If this is in hexadecimal notation, then you have some terminology confusion.
00 00 00 00 00
\/           ^
 |           |
byte       nibble
A byte is two nibbles, and a nibble is 4 bits.
Decimal Hex Binary
0 0 0000 <- You went from here...
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110 <- ...to here, a change in two bits of one nibble.
7 7 0111
8 8 1000
9 9 1001
10 a 1010
11 b 1011
12 c 1100
13 d 1101
14 e 1110
15 f 1111
That depends on what that notation means, but I'm assuming it's showing 5 bytes in hexadecimal notation.
These are bytes, 8 bit, in binary notation:
00000000
00000001
00000010
...
These are the same bytes in hexadecimal notation:
00
01
02
...
Hexadecimal notation goes from 00 to FF, binary notation for the same values from 00000000 to 11111111. If you changed 00 to 60, you changed 00000000 to 01100000. So you changed 2 bits.
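To check how many bits differ between two byte values, one option is to XOR them and count the 1s in the result. A minimal Python sketch; the helper name changed_bits is made up for illustration:

```python
def changed_bits(before, after):
    """Count the bit positions in which two integers differ."""
    return bin(before ^ after).count("1")


print(format(0x60, "08b"))       # 01100000
print(changed_bits(0x00, 0x60))  # 2
print(changed_bits(0x00, 0x40))  # 1
```

So changing 00 to 40 (or to 01, 02, 04, 08, 10, 20 or 80) would have been a one-bit change.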
You are viewing the file in a hex editor/viewer. Each digit is a hexadecimal digit consisting of four bits in binary. The fact that you went from 00 to 60 means that you changed two bits in one of the hex digits. If you were viewing in binary mode, you wouldn't see anything other than 0s and 1s.
hex 0 == binary 0000
hex 6 == binary 0110
I would recommend reviewing binary and hexadecimal notation.

Automatically break comments after a certain amount of characters

Is it possible to highlight a large comment block and automatically insert breaks after a certain length?
Simple Example
# This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long.
After
# This is a super long message that has too much information in it.
# Although inline comments are cool, this sentence should not be this
# long.
Yes! And it's simpler than the length of this answer might suggest!
Background
There's a command called wrap_lines which can do exactly what you want. wrap_lines is defined in Packages/Default/paragraph.py, line 112, and is sparsely documented at the Unofficial Docs' list of commands:
wrap_lines
Wraps lines. By default, it wraps lines at the first ruler’s column.
width [Int]: Specifies the column at which lines should be wrapped.
wrap_lines is already accessible via the items located in the Edit -> Wrap menu. There are options to wrap the paragraph in which the cursor lies at column 70, 78, 80, 100, and 120, as well as the first ruler. By default, Wrap Paragraph at Ruler is mapped to Alt+Q on Windows and Super+Alt+Q on OSX.
Working with Rulers
What do I mean by the first ruler? You might summon a ruler via View -> Ruler, but if you want more than one on-screen (or would rather have your rulers in writing), you can add a JSON array of integers—each of which defines a ruler—to any .sublime-settings file. The array I've added to my user preferences file, for example, looks like this:
"rulers":
[
79,
80,
72
],
Thanks to that first ruler rule, Alt+Q will wrap a line that's longer than 79 characters at the 79-column mark, even though there's a ruler "before" it at column 72. "First ruler" doesn't mean the leftmost ruler, but the first defined ruler. If I moved 80 to index 0 like I have below, then the lines would wrap at 80 columns instead. Likewise for 72.
"rulers":
[
80,
79,
72
],
Using a Key Binding
Rulers are for the weak, you say? You can also write a new key binding to wrap a line at a column of your choice! Just add something like this to your Preferences -> Key Bindings – User file:
{ "keys": ["alt+q"], "command": "wrap_lines", "args": {"width": 80} },
Removing the args object would instead wrap at the first ruler, like the Wrap Paragraph at Ruler command does. Wrap Paragraph at Ruler is actually defined just like that in the default Windows keymap file:
{ "keys": ["alt+q"], "command": "wrap_lines" },
Caveats
One of the best (and, in some cases, worst) things about the wrap_lines command is that it will detect any sequence of non-alphanumeric characters that begins the line and duplicate it when wrapping. It's great for writing comments, since the behavior that your example suggested does indeed happen:
# This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long, because having to scroll to the right to finish reading a comment is really annoying!
Becomes:
# This is a super long message that has too much information in it.
# Although inline comments are cool, this sentence should not be this
# long, because having to scroll to the right to finish reading a
# comment is really annoying!
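For comparison outside the editor, a similar prefix-preserving wrap can be approximated with Python's standard textwrap module. This is only a sketch of the idea, not how Sublime Text's wrap_lines is implemented; the helper name wrap_comment and the fixed "# " prefix are assumptions made for illustration.

```python
import textwrap


def wrap_comment(line, width=70, prefix="# "):
    """Wrap one long comment line, repeating the prefix on every output line."""
    body = line[len(prefix):] if line.startswith(prefix) else line
    return textwrap.fill(
        body,
        width=width,
        initial_indent=prefix,
        subsequent_indent=prefix,
    )


long_comment = (
    "# This is a super long message that has too much information in it. "
    "Although inline comments are cool, this sentence should not be this "
    "long, because having to scroll to the right to finish reading a "
    "comment is really annoying!"
)
print(wrap_comment(long_comment))
```

Unlike wrap_lines, this sketch does not detect the prefix on its own; the caller has to say what it is.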
But if the line happens to start with any other symbols, like the beginning of a quote, Sublime Text doesn't know any better than to wrap those, too. So this:
# "This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long, because having to scroll to the right to finish reading a comment is really annoying!"
Becomes this, which we probably don't want:
# "This is a super long message that has too much information in it.
# "Although inline comments are cool, this sentence should not be this
# "long, because having to scroll to the right to finish reading a
# "comment is really annoying!"
It's a good idea to be careful with what starts your original line. Also, the wrap_lines command targets the entire paragraph that your cursor is touching, not just the current line nor, surprisingly, only the working selection. This means that you can use the command again on a newly-wrapped series of lines to re-wrap them at a different column, but you might also end up wrapping some lines that you didn't want to—like if you're aligning a paragraph under a header in Markdown:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the `wrap_lines` command **targets the entire paragraph**, not just the current line.
If the command is activated anywhere within that block of text, you'll get this:
# Hey, this header isn't really there! Be careful with what starts the
original line. Also, the `wrap_lines` command **targets the entire
paragraph**, not just the current line.
You can avoid this sort of issue with clever use of whitespace; another empty line between the header and the paragraph itself will fix the wrapping, so:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the `wrap_lines` command **targets the entire paragraph**, not just the current line.
Becomes:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the
`wrap_lines` command **targets the entire paragraph**, not just
the current line.
Since the operation is so fast, you shouldn't have much trouble with tracking down the causes of any errors you encounter, starting over, and creatively avoiding them. Sublime Text usually doesn't have any trouble with mixing together comments and non-comments in code, so if you're lucky, you won't ever have to worry about it!
In Sublime Text 3 (and any earlier version with regular expression support, i.e. all of them):
Here is the search/replace that inserts a newline every 48 characters:
Find:
(.{48}){1}
Replace:
\1\n
Explanation:
The parentheses form a group so that the replacement can reference it with \1.
The . matches any character, and the {n} matches exactly n of them.
The replace command takes each match and substitutes the captured group back in with a newline \n appended.
Note: \1 refers to the first capture group of a single match; using "Replace All" applies the substitution to every match.
Real-world example:
Suppose you are formatting public keys. Copied directly from a browser without the header/footer, the key for the Google Internet Authority looks like this:
9c 2a 04 77 5c d8 50 91 3a 06 a3 82 e0 d8 50 48 bc 89 3f f1 19 70 1a 88 46 7e e0 8f c5 f1 89 ce 21 ee 5a fe 61 0d b7 32 44 89 a0 74 0b 53 4f 55 a4 ce 82 62 95 ee eb 59 5f c6 e1 05 80 12 c4 5e 94 3f bc 5b 48 38 f4 53 f7 24 e6 fb 91 e9 15 c4 cf f4 53 0d f4 4a fc 9f 54 de 7d be a0 6b 6f 87 c0 d0 50 1f 28 30 03 40 da 08 73 51 6c 7f ff 3a 3c a7 37 06 8e bd 4b 11 04 eb 7d 24 de e6 f9 fc 31 71 fb 94 d5 60 f3 2e 4a af 42 d2 cb ea c4 6a 1a b2 cc 53 dd 15 4b 8b 1f c8 19 61 1f cd 9d a8 3e 63 2b 84 35 69 65 84 c8 19 c5 46 22 f8 53 95 be e3 80 4a 10 c6 2a ec ba 97 20 11 c7 39 99 10 04 a0 f0 61 7a 95 25 8c 4e 52 75 e2 b6 ed 08 ca 14 fc ce 22 6a b3 4e cf 46 03 97 97 03 7e c0 b1 de 7b af 45 33 cf ba 3e 71 b7 de f4 25 25 c2 0d 35 89 9d 9d fb 0e 11 79 89 1e 37 c5 af 8e 72 69
After search and replace (all), you get:
9c 2a 04 77 5c d8 50 91 3a 06 a3 82 e0 d8 50 48
bc 89 3f f1 19 70 1a 88 46 7e e0 8f c5 f1 89 ce
21 ee 5a fe 61 0d b7 32 44 89 a0 74 0b 53 4f 55
a4 ce 82 62 95 ee eb 59 5f c6 e1 05 80 12 c4 5e
94 3f bc 5b 48 38 f4 53 f7 24 e6 fb 91 e9 15 c4
cf f4 53 0d f4 4a fc 9f 54 de 7d be a0 6b 6f 87
c0 d0 50 1f 28 30 03 40 da 08 73 51 6c 7f ff 3a
3c a7 37 06 8e bd 4b 11 04 eb 7d 24 de e6 f9 fc
31 71 fb 94 d5 60 f3 2e 4a af 42 d2 cb ea c4 6a
1a b2 cc 53 dd 15 4b 8b 1f c8 19 61 1f cd 9d a8
3e 63 2b 84 35 69 65 84 c8 19 c5 46 22 f8 53 95
be e3 80 4a 10 c6 2a ec ba 97 20 11 c7 39 99 10
04 a0 f0 61 7a 95 25 8c 4e 52 75 e2 b6 ed 08 ca
14 fc ce 22 6a b3 4e cf 46 03 97 97 03 7e c0 b1
de 7b af 45 33 cf ba 3e 71 b7 de f4 25 25 c2 0d
35 89 9d 9d fb 0e 11 79 89 1e 37 c5 af 8e 72 69
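The same find/replace can be tried outside the editor with Python's re module. This is a minimal sketch under the assumption that the input is a single line; the helper name break_every and the sample data are made up for illustration.

```python
import re


def break_every(text, width=48):
    """Insert a newline after every `width` characters of a single line."""
    # Same pattern as above: capture `width` characters, then emit the
    # captured group followed by a newline.
    return re.sub(r"(.{%d})" % width, r"\1\n", text)


# 40 two-digit hex values separated by spaces (hypothetical sample data).
sample = " ".join(f"{i:02x}" for i in range(40))
print(break_every(sample))
```

Each full chunk is exactly 48 characters, so with space-separated hex pairs every wrapped line keeps a trailing space, and a final chunk shorter than 48 characters is left as-is.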
Hi, I came across this solution and tried it with Sublime Text 3; it works great. If I use the usual Alt+Q on a Python docstring, it does pretty much what's desirable:
it limits the scope to the docstring
it handles the beginning and ending ''' or """ properly
It involved modifying one command from the default package. Please see here.
https://gist.github.com/SmileyChris/4340807
In Sublime Text 3, the default package is at \Packages\Default.sublime-package. You will have to unzip it and find the file paragraph.py. Place (just) this file under your user package directory, e.g. \Data\Packages\Default\, so that it overrides the default package's paragraph.py.
Thanks to the original author Chris Beaven (SmileyChris).

unknown data encoding

While working with an old application whose existing MS Access database contains some strangely encoded data, I found values such as 48001700030E0F465075465A56525E1100121D04121B565A58 stored as an email address.
What kind of data encoding is this? I tried base64, but it doesn't seem to be that. Can anybody with previous MS Access experience tell me what encoding this could be?
edit:
more samples
54001700030E0F46507546474550481C1D09090D04461B565A195E5F
40001700030E0F4650755F564E545F06025D100E0C
38001700030E0F4650754545564654155C101C0C
46001700030E0F4650755D565150591D1B0007124F565A58
The samples above are surely emails; for web URLs it looks like this:
440505045D070D54585C5B50585D581C1701004F025A58
440505045D121147544C5B584D4B5D17015D100E4F5C5B
This is a VB + MS Access program, if that is any help, and I think it is some standard encoding.
edit (2) ::
From looking at the web URL encoding, it seems 0505045D could stand for http://
edit(3) ::
One known pair found:
52021301161209755354595A5E5F561D170B030E1341461B56585A == paresh#falmingoexports.com
It appears to be bytes encoded as hexadecimal. But what those bytes mean, I don't know. Decoding it to ASCII doesn't reveal much:
H \x00\x17\x00\x03\x0e\x0fFPu FZVR^ \x11\x00\x12\x1d\x04\x12\x1bVZX
T \x00\x17\x00\x03\x0e\x0fFPu FGEPH \x1c\x1d\t\t\r\x04F\x1bVZ\x19^_
# \x00\x17\x00\x03\x0e\x0fFPu _VNT_ \x06\x02]\x10\x0e\x0c
8 \x00\x17\x00\x03\x0e\x0fFPu EEVFT \x15\\\x10\x1c\x0c
F \x00\x17\x00\x03\x0e\x0fFPu ]VQPY \x1d\x1b\x00\x07\x12OVZX
Things I've noticed that may help crack the code:
The 2nd to 10th bytes appear to be constant: \x00\x17\x00\x03\x0e\x0fFPu.
The first byte is the BCD length (spotted by Daniel Brückner!)
The 16th byte onwards appears to be some binary format that either encodes the data or is perhaps a pointer to the data.
Two of them end in: \x12?VZX.
The strings seem to be hexadecimal representations of some binary data.
The first two digits are the length of the string - decimal, not hexadecimal - so not the entire string is hexadecimal.
38 001700030E0F465075 4545 5646 5415 5C10 1C0C
40 001700030E0F465075 5F56 4E54 5F06 025D 100E 0C
46 001700030E0F465075 5D56 5150 591D 1B00 0712 4F56 5A58
48 001700030E0F465075 465A 5652 5E11 0012 1D04 121B 565A 58
54 001700030E0F465075 4647 4550 481C 1D09 090D 0446 1B56 5A19 5E5F
^  ^
|  |
|  +---- constant part, 9 bytes, maybe mailto: or same domain name of
|        reversed email addresses (com.example#foo)
|
+---- length of the rest in decimal, not hexadecimal
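The length-prefix observation can be checked mechanically. The sketch below is a minimal example; the helper name split_record is made up for illustration. It reads the first two digits as a decimal count of the hexadecimal digits that follow and decodes the rest as bytes, using the sample strings from the question.

```python
def split_record(record):
    """Split a record into (declared length, payload bytes, length matches?)."""
    declared = int(record[:2])   # first two digits, read as decimal
    payload_hex = record[2:]     # the rest is ordinary hexadecimal
    return declared, bytes.fromhex(payload_hex), len(payload_hex) == declared


samples = [
    "48001700030E0F465075465A56525E1100121D04121B565A58",
    "54001700030E0F46507546474550481C1D09090D04461B565A195E5F",
    "40001700030E0F4650755F564E545F06025D100E0C",
    "38001700030E0F4650754545564654155C101C0C",
    "46001700030E0F4650755D565150591D1B0007124F565A58",
]

for record in samples:
    declared, payload, matches = split_record(record)
    # Show the declared length, whether it matches, and the first 9 bytes.
    print(declared, matches, payload[:9].hex().upper())
```

For all five email samples the declared length matches and the first 9 payload bytes are 001700030E0F465075.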
I can see no clear indication for the location of the at-sign and the dot before the top-level domain. This seems to be evidence against simple mono-alphabetic substitutions like ROT13.
paresh#falmingoexports.com
Length: 26 characters
Histogram
1x: h # f l i n g x t . c
2x: p a m r e s
3x: o
ASCII values in hexadecimal representation
70 61 72 65 73 68 40 66 61 6C
6D 69 6E 67 6F 65 78 70 6F 72
74 73 2E 63 6F 6D
The 26 characters correspond to 52 hexadecimal digits, which matches the length of the encoded string.
52 02 13 01 16 12 09 75 53 54 59
5A 5E 5F 56 1D 17 0B 03 0E 13
41 46 1B 56 58 5A
Histogram
1x: 01 02 03 09 0B 0E 12 16 17 1B 1D 41 46 53 54 58 59 5E 5F 75
2x: 13 56 5A
The histograms don't match - so this rules out mono-alphabetic substitutions possibly followed by a permutation of the string.
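The histogram comparison can be reproduced with a short Python sketch. The helper name count_profile is made up for illustration; the plaintext is used exactly as written in the question, and the encoded bytes are the 26 bytes that follow the length prefix 52.

```python
from collections import Counter


def count_profile(symbols):
    """Map 'number of occurrences' to 'how many distinct symbols occur that often'."""
    return Counter(Counter(symbols).values())


plain = "paresh#falmingoexports.com"
encoded = bytes.fromhex("021301161209755354595A5E5F561D170B030E1341461B56585A")

print(sorted(count_profile(plain).items()))    # [(1, 11), (2, 6), (3, 1)]
print(sorted(count_profile(encoded).items()))  # [(1, 20), (2, 3)]
```

A mono-alphabetic substitution, with or without a later permutation, would leave this profile unchanged, so two different profiles are what rule it out.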