Notepad++ handling / converting binary control characters

I'm using Notepad++. I want to copy my code, paste it into a simple textarea of a little program (which obfuscates variables and removes blank lines and comments), and get the result back.
The problem is that my code contains binary control characters (like the NUL shown in white text on a black background) which the program can't handle.
My question is: is there a simple way to convert these control characters into something safe, run the program, and then convert them back?
Thanks

In the SynWrite editor this conversion of NUL characters can be done: SynWrite has text converters (Run menu), described in a help-file topic.
PSPad has a similar text-converter feature (Tools menu).
Or you can use a regex to replace the control-character range [\x00-\x1F] with a placeholder string.
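One way to do that round trip yourself is with a small script. A sketch in Python (the <CTRL:xx> placeholder format is my own invention, not something the editors above prescribe; pick any marker that cannot appear in your code):

```python
import re

# Replace each control character (0x00-0x1F) with a visible, unambiguous
# placeholder such as <CTRL:00>, so the text is safe to paste anywhere.
def escape_control(text):
    return re.sub(r'[\x00-\x1f]', lambda m: '<CTRL:%02X>' % ord(m.group()), text)

# Reverse the substitution to restore the original control characters.
def unescape_control(text):
    return re.sub(r'<CTRL:([0-9A-F]{2})>', lambda m: chr(int(m.group(1), 16)), text)

original = 'line1\x00line2\x1fend'
safe = escape_control(original)    # paste this into the textarea
print(safe)                        # line1<CTRL:00>line2<CTRL:1F>end
restored = unescape_control(safe)  # convert the program's output back
assert restored == original
```

If your code could legitimately contain the literal text `<CTRL:`, choose a different marker so the reverse conversion cannot misfire.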


CSV displayed Arabic letter as?

I had a CSV file that originally contained Arabic letters. I did not know that, and I made changes to it (adding formulas) and saved it. When I opened it later, I found that all the Arabic characters were displayed as ?.
I searched the internet and tried every way of importing data from this CSV, but the Arabic characters that were saved as ? still appear as ?. I badly want to retrieve them, as those were my leads.
Is there any way I can extract the Arabic characters from the file as it was saved, or any way to restore an earlier version that still had the Arabic text in it? I do not have file history on Windows, so that option is not available.
Really, any help is appreciated.
You have to distinguish between the file's content and the display of that content.
The file does not actually contain "Arabic"; it contains a series of bytes. How those bytes are displayed is entirely up to the program reading the file. Simplified, it is like the string 1234, which you can read as either "12" and "34" or as "1234".
The display is an interpretation of the file content, so you cannot directly say whether it is wrong or not.
The offending program MAY have written the data back incorrectly, either literally as "?" characters or as meaningless bytes. In that case the file's data is changed for good and cannot be brought back from that file.
If the offending program wrote the data correctly but merely displays "?", the data may still be there - check the file in a Unicode-enabled program.
Make a simple test with a sample dummy file. See this sample file
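One concrete way to run that test is to look at the raw bytes, since they tell you which of the two cases above you are in. A sketch in Python (the sample strings are made up; on your real file you would read the bytes with open(path, 'rb')):

```python
# Two possible states of the damaged file, shown here as in-memory bytes:
destroyed = 'name,?????'.encode('ascii')   # Arabic overwritten with literal "?"
survived = 'name,مرحبا'.encode('utf-8')    # bytes intact, merely displayed wrong

# A literal 0x3F ("?") byte where a letter should be means the data is gone:
print(b'?' in destroyed)   # True
print(b'?' in survived)    # False

# If the bytes survived, decoding with the right encoding recovers the text:
print(survived.decode('utf-8'))   # name,مرحبا
```

Arabic text in a CSV is typically stored as UTF-8 or Windows-1256; if decoding as 'utf-8' raises UnicodeDecodeError on your real file, trying .decode('windows-1256') is the usual next step.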

How to run the html file created in Sublime Text in Chrome directly?

I tried creating a build system in Sublime Text and saving it as Chrome:
{
"cmd":["C:\Program Files (x86)\Google\Chrome\Application\chrome.exe","$file"]
}
But it did not work.
When I run it with Ctrl+B, it shows "No Build System".
Note: I have selected Chrome in the build-system menu.
What can I do to solve this?
The contents of a sublime-build file have to be valid JSON in order for Sublime to recognize it, and the build that you posted above is not valid.
The \ character is special in JSON (as in many programming languages) in that it indicates that the next character should be interpreted specially, for example \n meaning "start a new line".
In order to use a literal \ character in a string, you need to double it to tell the JSON parser that it is just a regular character and not special. For example:
{
"cmd":["C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe","$file"]
}
Alternatively, Windows generally accepts / in place of \ in paths, which, depending on your preference, can be a little easier on the eyes:
{
"cmd":["C:/Program Files (x86)/Google/Chrome/Application/chrome.exe","$file"]
}
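You can verify the difference yourself by feeding both variants to any JSON parser; a quick check in Python (this is just a demonstration of the escaping rule, not anything Sublime does for you; the path is shortened):

```python
import json

# Single backslashes: \P and \c are not valid JSON escape sequences.
bad = r'{"cmd": ["C:\Program Files (x86)\chrome.exe", "$file"]}'
# Doubled backslashes: each \\ decodes to a single literal backslash.
good = r'{"cmd": ["C:\\Program Files (x86)\\chrome.exe", "$file"]}'

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print('bad variant rejected:', e)   # the parser refuses the file outright

print(json.loads(good)['cmd'][0])       # C:\Program Files (x86)\chrome.exe
```

This matches the symptom in the question: when the JSON is invalid, Sublime silently ignores the file, so Ctrl+B reports "No Build System".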

When I open a docx in hex viewer, can someone explain what i'm seeing

I wanted to learn what I am looking at when I open a DOCX file in a hexadecimal viewer.
For example:
Hexadecimal is base 16, read from a 32-bit (DWORD) view of the file? So I was assuming that, going from right to left, you would do:
0*16^0 + 0*16^1 + 6*16^2 + 14*16^3 ...... all the way up to the 504B.
But I end up with a huge number that means nothing to me!
So I guess I really don't understand what I'm looking at. Why does the section to the right of the hex columns (the 0->F columns) display funny characters under each one - PK........!.ae
Any information would be very helpful. I started out with bitmaps, and now I thought I'd have a play with DOCX to see if I could write a search tool for the files. But if I don't understand this simple concept, I have no chance of cracking it.
.docx files are ZIP archives. Run unzip on the file as your first step. The right-hand section of a hex viewer shows each byte's ASCII interpretation (unprintable bytes as dots), which is why you see PK there: 0x50 and 0x4B are the ASCII codes for "P" and "K", the ZIP format's magic number.
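This is easy to confirm programmatically. A sketch in Python that builds a minimal ZIP in memory (a stand-in for a real .docx) and inspects its first bytes:

```python
import io
import zipfile

# Build a minimal ZIP archive in memory - the same container format .docx uses.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('word/document.xml', '<w:document/>')   # stand-in for real content

data = buf.getvalue()

# Every ZIP file begins with the local-file-header signature 0x50 0x4B 0x03 0x04.
# 0x50 and 0x4B are ASCII "P" and "K" (the initials of Phil Katz, author of
# PKZIP) - that is the "PK" you saw in the hex viewer's ASCII pane.
print(data[:2])        # b'PK'
print(data[:4].hex())  # 504b0304

# Listing the members shows the kind of XML files a real .docx contains:
print(zipfile.ZipFile(io.BytesIO(data)).namelist())   # ['word/document.xml']
```

So rather than interpreting the whole file as one huge base-16 number, a hex viewer is showing you individual bytes, and the first few of them identify the container format.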

Method of identifying plaintext files as scripts

I am creating a filter for files coming onto a Unix machine. I only want to allow through plain-text files that do not look like scripts.
To check for plain text, I am checking the file's executable bit and using Perl's -T file test. (I understand this is not 100% reliable, but it will catch the binary files I most want to avoid.) I think this will be sufficient, but any suggestions are welcome.
My main question is how to recognize when a plain-text file is a script. Every script I've ever written starts with a #! line, so my first thought is to read the file's first line and block any file containing that. Are there common non-script plain-text files that start with a #! line that I would flag as false positives? Are there better or additional methods of identifying a script?
That is what the file command (see Wikipedia) is for. It recognizes much more than just the shebang (#!), and can tell you what kind of script it is, if any.
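For comparison, the executable-bit-plus-shebang heuristic from the question can be sketched in a few lines of Python (the file names are made up; the function name is mine):

```python
import os

def looks_like_script(path):
    """Crude heuristic: executable bit set, or first two bytes are '#!'."""
    if os.access(path, os.X_OK):
        return True
    with open(path, 'rb') as f:
        return f.read(2) == b'#!'

# Example: a shell script (shebang, no exec bit) and a plain note.
with open('run.sh', 'w') as f:
    f.write('#!/bin/sh\necho hi\n')
with open('notes.txt', 'w') as f:
    f.write('just some text\n')

print(looks_like_script('run.sh'))     # True
print(looks_like_script('notes.txt'))  # False
```

Note that this only catches interpreter scripts with an explicit shebang; file(1) also recognizes script-like content (shell or Python syntax, for instance) that has no shebang at all, which is why it is the more robust option.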

How to display non-ASCII characters from a XML output

I get this output in an XML element:
&#163;111.00
It should be £111.00.
How can I sort this out so that all Unicode characters are displayed rather than the numeric code? I am using the Linux tool wget to fetch the XML file from the internet. Perhaps some sort of converter?
I am viewing the file in PuTTY. I am parsing the file, and I want to clean the input before parsing.
I am using xml_grep2 to get the elements I want and then cat filename | while read .....
OK, I'm going to close this question now.
After parsing the file with xml_grep2 I was able to get clean output, but I was seeing a stray Â character in the file. I changed PuTTY's character-set setting from ISO-8859 to UTF-8 to resolve that.
You can use HTML::Entities to replace the entities with the literal characters they encode. I don't know how complete its coverage is, though. There are bound to be similar tools in other languages if you are not comfortable with Perl. http://metacpan.org/pod/HTML::Entities
sh$ echo '&#163;111.00' | perl -CSD -MHTML::Entities -pe 'decode_entities($_)'
£111.00
This won't work if the HTML::Entities module is not installed. If you need to install it, there are numerous tutorials about the CPAN on the Internet.
Edit: added a usage example. The -CSD option might not be necessary on your system, but on OS X at least, I got garbage output without it.
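If Perl is not available, Python's standard library can do the same decoding; a small sketch (not part of the original answer):

```python
import html

# html.unescape replaces HTML/XML character references with the
# literal characters they encode.
raw = '&#163;111.00'
print(html.unescape(raw))              # £111.00

# It handles decimal (&#163;), hex (&#xA3;), and named (&pound;) forms alike:
print(html.unescape('&pound;111.00'))  # £111.00
```

As with HTML::Entities, this decodes character references only; it does not re-encode the file, so make sure your terminal (here, PuTTY) is set to the same character set the output uses.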