Pilcrow added in OkHttp Android logcat for long response body

I added an HttpLoggingInterceptor with log level BODY.
My response body is quite long (84.37 KB according to a measurement tool).
This response spans 22 log lines in Android Studio.
The issue is simple: when I copy it from logcat (holding Alt to select only the log text) and paste it into any online JSON parser, it gives me an error.
While investigating this I found that there is a pilcrow (paragraph sign) between lines, which I can only replace in MS Word (Notepad++ doesn't see it).
When the response was roughly a third of this size, I didn't have this problem.
Is it possible to avoid these pilcrows in the output?

I suspect that those "pilcrows" are just newlines added by Android Studio to limit the maximum line length. If so, you can replace them in Notepad++ if you enable "Extended" under "Search mode" (the radio buttons at the bottom left of the Replace dialog) and put "\r\n" as "Find what" and an empty string as "Replace with".
Also, in Notepad++ you can check View > Show Symbols > Show All Characters to see what those "pilcrows" are to Notepad++ (I suspect they are CR + LF).
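If you do this often, the wrapped output can also be cleaned up programmatically. A minimal Python sketch (the function name is illustrative) that strips the injected line breaks before parsing:

```python
import json

def parse_wrapped_logcat_json(copied_text):
    # Raw newlines are not valid inside JSON string literals anyway,
    # so stripping every CR/LF (and any literal pilcrow, U+00B6) left
    # over from logcat's line wrapping is safe before parsing.
    cleaned = (copied_text
               .replace('\r', '')
               .replace('\n', '')
               .replace('\u00b6', ''))
    return json.loads(cleaned)
```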

Related

Chrome on Windows adding trailing underscores to downloaded files?

I've got a rather odd situation that I'm having difficulty tracking down in an existing Django application. One of the views, which inherits from APIView, returns a file when a user makes a POST call. The endpoint works fine, but something odd happens when the downloaded file reaches the client machine. By the time the browser receives the file, the file extension has been renamed with a trailing underscore. (So if the file was originally "test.txt", the version the client receives is "test.txt_".)
As near as I can figure, just before the response object is returned in the APIView, the content-type and content-disposition headers look correct. E.g.:
Content-Type: application/octet-stream
Content-Disposition: attachment;filename="test.txt"
That same file, when it shows up in Chrome downloads, is named "test.txt_" - with the trailing underscore. I've tried the same thing out in Firefox, and it seems to download correctly. Unfortunately, telling the majority of our users to switch browsers isn't going to fly.
I have tried:
Forcing a different content type (e.g.: instead of "application/octet-stream", try "application/text", just to see what happens). This had no effect.
Formatting the content disposition slightly different (e.g.: space between the semicolon and filename). This also had no effect.
Removing the double quotes around the filename in the content-disposition header. No effect.
Dropping breakpoints within the Rest Framework itself, but Visual Studio Code doesn't seem to trigger on these. (I'm not super-familiar with debugging through Visual Studio Code, so this may be my fault).
Stripping out any custom middleware, so the only remaining middleware are as follows:
corsheaders.middleware.CorsMiddleware
django.contrib.sessions.middleware.SessionMiddleware
django.middleware.locale.LocaleMiddleware
django.middleware.common.CommonMiddleware
django.middleware.csrf.CsrfViewMiddleware
django.contrib.auth.middleware.AuthenticationMiddleware
django.contrib.messages.middleware.MessageMiddleware
So far, any similar issues that other people have experienced seem to be slightly different (e.g., Internet Explorer removing the period in the extension and replacing it with an underscore).
Any guesses on what might be happening here? I'm a bit stumped.
You have to remove the "" from your file name.
Change attachment; filename="filename.txt" to attachment; filename=filename.txt
Although it seems you won't be able to have spaces in the file name then.
I finally figured out what was going on here. The UI that was used to trigger the download was doing so through creating a temporary anchor tag (see the second answer here: Download data url file ). When it was doing so, it had two different cases. In one case, if downloading multiple files, it would change the file extension to .zip. In another case, if downloading a single file, it was still trying to append an extension, but the way the UI code was written, it was setting the extension to be an empty string. So the end result is a period being added, but no extension after that. For example, if the file being downloaded was "test.txt", it would end up as "test.txt.", which was then converted by Chrome to "test.txt_", on Windows, to make it a valid file extension.
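The guard that fixes that UI bug is simple. Here is a hedged sketch (in Python for illustration; the original UI code was JavaScript) of the corrected extension handling:

```python
def build_download_name(base, ext):
    # The bug described above: unconditionally appending "." + ext even
    # when ext is empty yields names like "test.txt.", which Chrome on
    # Windows then rewrites to "test.txt_". Only append when non-empty.
    return base + "." + ext if ext else base
```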
Our environment has a document storage system that contains documents with the attributes DocumentName and ContentType. In some cases, the content type would return with spaces appended to the end of the string like "pdf ".
In Internet Explorer the output would truncate the end of the string, while Chrome would convert the extra spaces to underscores, giving me this filename: "file.pdf______________"
To resolve it, I simply trim the string:
public string getFileName(string docName, string contentType)
{
    string fileName = docName + "." + contentType.Trim();
    return fileName;
}
I encountered the same problem.
Let's say your download file name is "my_report.csv"
Then, before doing the download operations, get rid of the " characters:
fileName = fileName.replace('"','') // replaces the first " character
fileName = fileName.replace('"','') // replaces the second " character (JavaScript's string replace only replaces the first match, hence the two calls)
This will resolve your issue.
My solution in ASP.NET core
[HttpGet("pdf/{fileId}")]
public IActionResult GetPdfFile([FromRoute] int fileId)
{
    var result = Repo.GetFile(fileId);
    Response.Headers.Add("Content-Disposition", $"inline; filename={result.FileName}");
    return File(result.Data, "application/pdf");
}
I resolved this issue by replacing the whitespace in the file name with a character like -.
This was happening for me when the filename included a comma.
lastname,MD.pdf
the browser would download the file stream as
_lastname,MD.pdf_
Adding code to remove a potential comma from the filename resolved the issue and made it download as expected.
filename = filename.Replace(",", ""); // replace comma characters with blank
now downloads as
lastnameMD.pdf
In my case there was a space as the first character, and it was replaced with an underscore. So I simply removed the space :-)
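Pulling the fixes from the answers above together, a hypothetical Python helper (the name and exact rules are illustrative, not from any single answer) for sanitizing a Content-Disposition filename might look like:

```python
def sanitize_download_filename(name):
    # Combines the fixes reported above: drop double quotes and commas,
    # trim leading/trailing whitespace, and strip a trailing dot, all of
    # which Chrome on Windows otherwise turns into underscores.
    cleaned = name.replace('"', '').replace(',', '').strip()
    return cleaned.rstrip('.')
```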

Google Apps Script Javascript string length not as expected

function test(){
var log=(typeof Logger=='undefined')?console:Logger;
log.log(" ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ".length);
}
The code prints 127.0 in Google Apps Script, but returns 128 in the Chrome browser and Node.js!
This is a known bug in Rhino, the JS engine that Google Apps Script uses. It doesn't correctly process "soft hyphen" character 0xAD when it's entered directly in a string; the character just gets lost. Your string contains it within "¬­®". To simplify the example,
"a­b".length
(with a soft hyphen between a and b) returns 3 in browsers but 2 in GAS.
A workaround, if you must use soft hyphen in strings, is to escape it like "a\u00ADb"
"a\u00ADb" === "a­b"
evaluates to true in browsers, and to false in GAS.
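If you generate or preprocess scripts before feeding them to GAS, one defensive option is to escape the character up front. A minimal sketch (illustrative; the function name is not from any GAS API):

```python
def escape_soft_hyphens(source):
    # Replace literal soft hyphens (U+00AD) with their \uXXXX escape so
    # engines that silently drop the raw character (like the Rhino bug
    # described above) still see the intended code point.
    return source.replace('\u00ad', '\\u00AD')
```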
This discussion, currently offline but available from the Google cache, refers to this bug. I quote it below:
Subject: Re: Rhino eats strange characters
Hi Richard,
for me this clearly looks like a bug in Rhino where I'm able to
reproduce it. I'll try to prepare a patch for Rhino for this problem.
Please open an issue in HtmlUnit to be sure that it doesn't get lost
(and to be sure that I've correctly identified the root cause).
Cheers,
Marc.
--
Web: http://www.efficient-webtesting.com
Blog: http://mguillem.wordpress.com
Richard Eggert wrote:
I recently attempted to use HtmlUnit to load pages that have been
"compressed" using HTMLZip (http://www.htmlzip.com/) and found that
HtmlUnit horribly mangles the output. Since HTMLZip claims to work
properly in every major browser (and I'll take their word for it), I
figure this is a bug in HtmlUnit, since it is supposed to mimic the
behavior of "normal" browsers.
Examining the source code of a page generated by HTMLZip, I found that
HTMLZip uses JavaScript strings that contain unprintable characters
without escaping them. When I replaced all the unprintable characters
with their corresponding \x escape sequences, HtmlUnit was able to
process the page. However, HtmlUnit was not able to process pages in
which multiple layers of HTMLZip compression had been applied.
I then did an experiment in which I created a very simple ISO-8859-1
HTML document that contained just a SCRIPT tag that declared a variable
"x" that was assigned a string containing the characters 0 through 255,
escaping only the white space and quotation characters (to avoid syntax
errors). I ran it through HtmlUnit and examined the value of the "x"
variable. I found that every character was preserved intact EXCEPT for
the 0xAD character, which corresponds to the Unicode SHY "soft hyphen"
character in ISO-8859-1. The character was just plain missing from the
string!
In order to narrow down where the 0xAD was getting dropped, I used a
ScriptPreProcessor to capture the script before it was passed to Rhino.
I examined the captured script and found that the 0xAD was still present
in the text, which indicates to me that the character is being dropped
by Rhino and not by the HTML parser.
Should I submit a bug report for this? Also, can anyone think of a
quick workaround? Off the top of my head, all I can think of would be
to write a ScriptPreProcessor that automatically converts the SHY
character to an escape sequence, but without actually parsing the
script, I could end up escaping characters that appear outside of string
literals.
Rich Eggert
Member of Technical Staff
Proteus Technologies, LLC

How to set caret to a specific address in binary file opened with UltraEdit (UE) in hex edit mode?

I have a large binary file about 2 GB. I open it with UltraEdit in Hex Edit mode, and try to drag the vertical scroll bar to set caret to a specific address like 0x12345678 or some other address. But when I drag a little, the address moves a lot! The larger the file, the harder to navigate to an address.
Is there an easy way to do so?
Clicking Goto in the Search menu, or pressing Ctrl+G, opens the Hex Goto dialog for a file currently displayed in hex editing mode.
In this dialog, the byte offset to jump to can be entered in decimal, or in hexadecimal when the entered string starts with 0x, as in your example.
On the first Goto you have to wait a few seconds because UltraEdit (v22.10) parses the entire file for line terminators to support line-number display, even though no line numbers are shown in hex edit mode and this parsing should be skipped for a binary file. I have already reported this issue to IDM Computer Solutions, Inc., but the bug has not been fixed so far. Subsequent Goto executions are much faster because the useless parsing for line terminators no longer happens.

Sikuli IDE special characters :, \ don't pass through

I am using the Sikuli IDE.
I want to do a very simple type("1440144711350.png", "C:\tests\exportDest.csv")
But it doesn't seem to work; when I run it, I get errors. Where might the problem be coming from?
Thanks
From your code, I suppose you are trying to find or open a file from Windows Explorer, or something similar.
The type function simulates the standard US keyboard, which can be tricky if you do not have one. Have a look at the SikuliX documentation on the type function. You will probably prefer the paste method.
The second issue you are likely encountering is that the backslash is interpreted inside your string (for example, \t is interpreted as a tab). You have to escape each backslash as \\.
To verify it, simply paste (Ctrl+V) into a text editor after running your script; this gives you an idea of what Sikuli actually tried to paste...
paste("1440144711350.png", "C:\\tests\\exportDest.csv")
if needed you can simply press the enter key afterwards as following:
paste("1440144711350.png", "C:\\tests\\exportDest.csv")
type(Key.ENTER)
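Since Sikuli scripts are Python (Jython), raw string literals are an alternative to doubling every backslash. A small sketch:

```python
# Both spellings produce the same Windows path; the raw-string form
# avoids having to escape each backslash by hand.
escaped_path = "C:\\tests\\exportDest.csv"
raw_path = r"C:\tests\exportDest.csv"
assert escaped_path == raw_path
```

One caveat: a raw string cannot end with a single backslash, so a trailing path separator still needs the doubled form.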

How can I read Chrome Cache files?

A forum I frequent was down today, and upon restoration, I discovered that the last two days of forum posting had been rolled back completely.
Needless to say, I'd like to get back what data I can from the forum loss, and I am hoping I have at least some of it stored in the cache files that Chrome created.
I face two problems: the cache files have no file extension and I'm unsure how to read them in an intelligent manner (trying to open them in Chrome itself seems to "redownload" them in a .gz format), and there are a ton of cache files.
Any suggestions on how to read and sort these files? (A simple string search should fit my needs)
EDIT: The answer below no longer works; see here.
In Chrome or Opera, open a new tab and navigate to chrome://view-http-cache/
Click on whichever file you want to view.
You should then see a page with a bunch of text and numbers.
Copy all the text on that page.
Paste it in the text box below.
Press "Go".
The cached data will appear in the Results section below.
Try Chrome Cache View from NirSoft (free).
EDIT: The answer below no longer works; see here.
Chrome stores the cache as a hex dump. OSX comes with xxd installed, which is a command line tool for converting hex dumps. I managed to recover a jpg from my Chrome's HTTP cache on OSX using these steps:
Go to: chrome://cache
Find the file you want to recover and click on its link.
Copy the 4th section to your clipboard. This is the content of the file.
Follow the steps on this gist to pipe your clipboard into the python script which in turn pipes to xxd to rebuild the file from the hex dump:
https://gist.github.com/andychase/6513075
Your final command should look like:
pbpaste | python chrome_xxd.py | xxd -r - image.jpg
If you're unsure which section of Chrome's cache output is the content hex dump, take a look at this page for a good guide:
http://www.sparxeng.com/blog/wp-content/uploads/2013/03/chrome_cache_html_report.png
Image source: http://www.sparxeng.com/blog/software/recovering-images-from-google-chrome-browser-cache
More info on XXD: http://linuxcommand.org/man_pages/xxd1.html
Thanks to Mathias Bynens above for sending me in the right direction.
EDIT: The answer below no longer works; see here.
If the file you try to recover has Content-Encoding: gzip in the header section, and you are using linux (or as in my case, you have Cygwin installed) you can do the following:
visit chrome://view-http-cache/ and click the page you want to recover
copy the last (fourth) section of the page verbatim to a text file (say: a.txt)
xxd -r a.txt | gzip -d
Note that other answers suggest passing the -p option to xxd; I had trouble with that, presumably because the fourth section of the cache is not in the "postscript plain hexdump style" but in the "default style".
It also does not seem necessary to replace double spaces with a single space, as chrome_xxd.py does (in case it is necessary, you can use sed 's/  / /g' for that).
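The same recovery can be scripted. A hedged Python sketch, assuming the copied fourth section is a default-style xxd dump (16 bytes per line, hex field padded to a fixed width) of a gzip-encoded body:

```python
import gzip

def xxd_to_bytes(dump):
    # Parse xxd "default style" lines: "offset: hhhh hhhh ...  ascii".
    # xxd pads the hex field to a fixed width (39 chars for 16 bytes),
    # so slicing 40 chars after the colon safely excludes the ASCII column.
    data = bytearray()
    for line in dump.splitlines():
        if ':' not in line:
            continue
        hex_field = line.split(':', 1)[1][:40]
        data += bytes.fromhex(hex_field)  # fromhex skips the spaces
    return bytes(data)

def recover_gzipped_body(dump):
    # Gunzip the reassembled bytes, like the `xxd -r a.txt | gzip -d` above.
    return gzip.decompress(xxd_to_bytes(dump))
```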
Note: The show-saved-copy flag has been removed, so the answer below will not work.
You can read cached files using Chrome alone.
Chrome has a feature called Show Saved Copy Button:
Show Saved Copy Button Mac, Windows, Linux, Chrome OS, Android
When a page fails to load, if a stale copy of the page exists in the browser cache, a button will be presented to allow the user to load that stale copy. The primary enabling choice puts the button in the most salient position on the error page; the secondary enabling choice puts it secondary to the reload button. #show-saved-copy
First, disconnect from the Internet to make sure the browser doesn't overwrite the cache entry. Then navigate to chrome://flags/#show-saved-copy and set the flag to Enable: Primary. After you restart the browser, the Show Saved Copy button will be enabled. Now enter the cached file's URI into the browser's address bar and hit Enter. Chrome will display the "There is no Internet connection" page along with the Show saved copy button:
After you hit the button, the browser will display the cached file.
I've made a short, naive script which extracts JPG and PNG files:
#!/usr/bin/php
<?php
$dir = "/home/user/.cache/chromium/Default/Cache/"; // Chrome or Chromium cache folder
$ppl = "/home/user/Desktop/temporary/";             // Place for extracted files
$list = scandir($dir);
foreach ($list as $filename)
{
    if (is_file($dir.$filename))
    {
        $cont = file_get_contents($dir.$filename);
        if (strstr($cont, 'JFIF'))
        {
            echo ($filename." JPEG \n");
            $start = strpos($cont, "JFIF", 0) - 6;      // back up to the JPEG SOI/APP0 start
            $end = strpos($cont, "HTTP/1.1 200 OK", 0); // payload ends where the headers begin
            $cont = substr($cont, $start, $end - $start);
            $wholename = $ppl.$filename.".jpg";
            file_put_contents($wholename, $cont);
            echo ("Saving: ".$wholename." \n");
        }
        elseif (strstr($cont, "\211PNG"))
        {
            echo ($filename." PNG \n");
            $start = strpos($cont, "PNG", 0) - 1;       // back up to the PNG signature
            $end = strpos($cont, "HTTP/1.1 200 OK", 0);
            $cont = substr($cont, $start, $end - $start);
            $wholename = $ppl.$filename.".png";
            file_put_contents($wholename, $cont);
            echo ("Saving: ".$wholename." \n");
        }
        else
        {
            echo ($filename." UNKNOWN \n");
        }
    }
}
?>
I had some luck with this open-source Python project, seemingly inactive:
https://github.com/JRBANCEL/Chromagnon
I ran:
python2 Chromagnon/chromagnonCache.py path/to/Chrome/Cache -o browsable_cache/
And I got a locally-browsable extract of all my open tabs cache.
The Google Chrome cache directory $HOME/.cache/google-chrome/Default/Cache on Linux contains one file per cache entry named <16 char hex>_0 in "simple entry format":
20 Byte SimpleFileHeader
key (i.e. the URI)
payload (the raw file content i.e. the PDF in our case)
SimpleFileEOF record
HTTP headers
SHA256 of the key (optional)
SimpleFileEOF record
If you know the URI of the file you're looking for, it should be easy to find. If not, a substring such as the domain name should help narrow it down. Search for a URI in your cache like this:
fgrep -Rl '<URI>' $HOME/.cache/google-chrome/Default/Cache
Note: If you're not using the default Chrome profile, replace Default with the profile name, e.g. Profile 1.
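A hedged Python sketch of reading the key (the URI) back out of such an entry, assuming the 20-byte SimpleFileHeader layout is a little-endian uint64 magic, uint32 version, uint32 key length, and uint32 key hash, immediately followed by the key:

```python
import struct

def read_simple_entry_key(path):
    # Assumed SimpleFileHeader layout: 8-byte magic, 4-byte version,
    # 4-byte key length, 4-byte key hash, little-endian; the key (URI)
    # follows immediately after these 20 bytes.
    with open(path, 'rb') as f:
        magic, version, key_len, key_hash = struct.unpack('<QIII', f.read(20))
        return f.read(key_len).decode('utf-8', errors='replace')
```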
It was removed on purpose and it won't be coming back.
Both chrome://cache and chrome://view-http-cache have been removed starting with Chrome 66. They still work in version 65.
Workaround
You can check chrome://chrome-urls/ for the complete list of internal Chrome URLs.
The only workaround that comes to mind is to use Menu > More tools > Developer tools with the Network tab selected.
The reason why it was removed is this bug:
https://chromium.googlesource.com/chromium/src.git/+/6ebc11f6f6d112e4cca5251d4c0203e18cd79adc
https://bugs.chromium.org/p/chromium/issues/detail?id=811956
The discussion:
https://groups.google.com/a/chromium.org/forum/#!msg/net-dev/YNct7Nk6bd8/ODeGPq6KAAAJ
The JPEXS Free Flash Decompiler has Java code to do this in its source tree, for both Chrome and Firefox (though with no support for Firefox's more recent cache2 format).
EDIT: The answer below no longer works; see here.
Google Chrome cache file format description.
Cache files list, see URLs (copy and paste to your browser address bar):
chrome://cache/
chrome://view-http-cache/
Cache folder in Linux: ~/.cache/google-chrome/Default/Cache
Let's check whether a file is GZIP-encoded:
$ head f84358af102b1064_0 | hexdump -C | grep --before-context=100 --after-context=5 "1f 8b 08"
Extract a Chrome cache file with a PHP one-liner (skipping the gzip header and the trailing CRC32 and ISIZE blocks):
$ php -r "echo gzinflate(substr(strchr(file_get_contents('f84358af102b1064_0'), \"\x1f\x8b\x08\"), 10, -8));"
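An equivalent sketch in Python, under the same assumption that a gzip stream is embedded somewhere in the cache file:

```python
import zlib

def extract_gzip_member(raw):
    # Locate the gzip magic bytes (1f 8b 08) inside the cache file and
    # decompress from there; wbits=47 tells zlib to auto-detect the
    # gzip header, so the 10-byte header need not be skipped by hand.
    start = raw.find(b'\x1f\x8b\x08')
    if start < 0:
        raise ValueError('no gzip stream found')
    return zlib.decompressobj(wbits=47).decompress(raw[start:])
```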
Note: The answer below is out of date, since the Chrome disk cache format has changed.
Joachim Metz provides some documentation of the Chrome cache file format with references to further information.
For my use case, I only needed a list of cached URLs and their respective timestamps. I wrote a Python script to get these by parsing the data_* files under C:\Users\me\AppData\Local\Google\Chrome\User Data\Default\Cache\:
import datetime

with open('data_1', 'rb') as datafile:
    data = datafile.read()

for ptr in range(len(data)):
    fourBytes = data[ptr : ptr + 4]
    if fourBytes == b'http':
        # Found the string 'http'. Hopefully this is a Cache Entry
        endUrl = data.index(b'\x00', ptr)
        urlBytes = data[ptr : endUrl]
        try:
            url = urlBytes.decode('utf-8')
        except:
            continue
        # Extract the corresponding timestamp: 8 little-endian bytes,
        # microseconds since 1601-01-01, located 72 bytes before the URL
        try:
            timeBytes = data[ptr - 72 : ptr - 64]
            timeInt = int.from_bytes(timeBytes, byteorder='little')
            secondsSince1601 = timeInt / 1000000
            jan1601 = datetime.datetime(1601, 1, 1, 0, 0, 0)
            timeStamp = jan1601 + datetime.timedelta(seconds=secondsSince1601)
        except:
            continue
        print('{} {}'.format(str(timeStamp)[:19], url))