Can URL #anchor contain binary data?

I'm trying to encode a web page's state in the #anchor. Right now I am base64-encoding a JSON string, but it sometimes gets too long (10K+). Apparently I hit some kind of URL length limit and it stops working: the value gets cut off and the JSON data structure can't be reconstructed.
I talked with some of my buddies and they suggested bzipping or gzipping it. I tried that, but now my #anchor is binary data.
I haven't been able to decode it properly, and I'm not sure it was even sent correctly as part of the URL.
Does anyone know how to put binary data in the #anchor, whether that's a good idea, or what an alternative working solution for my problem would be?

I would not bother with all of this.
Use Local Storage for your large data, and send a reference through your anchor to the data.
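
If the state does have to travel in the anchor itself, compressed bytes need to be turned back into URL-safe text first. A minimal sketch of that round trip in Python, assuming zlib compression and URL-safe base64; the helper names encode_state and decode_state are made up for illustration, and this only shrinks the anchor, it does not remove the browser's length limit:

```python
import base64
import json
import zlib


def encode_state(state: dict) -> str:
    """Serialize, compress, and encode state as URL-safe text."""
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    packed = zlib.compress(raw, 9)
    # URL-safe base64 uses only A-Z, a-z, 0-9, '-' and '_';
    # the '=' padding is dropped and restored when decoding.
    return base64.urlsafe_b64encode(packed).decode("ascii").rstrip("=")


def decode_state(fragment: str) -> dict:
    """Reverse encode_state: restore padding, decode, decompress, parse."""
    padded = fragment + "=" * (-len(fragment) % 4)
    packed = base64.urlsafe_b64decode(padded)
    return json.loads(zlib.decompress(packed).decode("utf-8"))


# Hypothetical page state, repetitive on purpose so compression helps.
state = {"tab": "results", "filters": ["a", "b"] * 50}
fragment = encode_state(state)
restored = decode_state(fragment)
```

The same steps exist in browser JavaScript, but the anchor stays text throughout: raw binary is never placed in the URL.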

Related

How do you save a JSON response with Emojis as Unicode?

Currently I am scraping Instagram comments for a sentiment analysis project, and am using an Instagram scraper. It is supposed to output a comment file but it doesn't, so a workaround is to find the query URL in the log file and paste it into a browser.
An example URL would be this https://www.instagram.com/graphql/query/?query_hash=33ba35852cb50da46f5b5e889df7d159&variables={%22shortcode%22:%22CMex-IGn1G-%22,%22first%22:50,%22after%22:%22QVFCaERkTm84aWF3T1Exbmw5V0xhb05haVBEY2JaYmxhSTNGWVZ4M2RQWi0yVzVUSExlUlRYOUtsOVEtM0trRzBmSGxyYjdJV094a1hlYm1aLXZjdkVpZQ==%22}.
In Firefox I am able to view the JSON response and can download it in two ways:
CTRL + A to select all and paste into a JSON file.
Download webpage as a JSON file.
The issue is that neither method retains the emoji data. The first loses the emojis: they are stored not as Unicode but as question marks (???). I assumed this was an encoding problem, so I tried pasting the raw response into Unicode-encoded files. That keeps the emojis, but as literal emoji characters (🙌👏😍) rather than in the Unicode format.
The second method either saves it with only the message {"message":"rate limited","status":"fail"} or another incorrect format.
The thing is, a few months ago I scraped some pages and managed to save the comments with the emojis stored in the Unicode format. This is frustrating because I know it can be done, but I can't remember how I did it. It would have been something basic, like the methods outlined above.
I am out of ideas and would greatly appreciate any help. Thank you.
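
One possible route, assuming "stored in the unicode format" means \uXXXX escape sequences: parse the response and re-serialize it with ASCII-only output. A minimal sketch in Python; raw_response is a made-up stand-in for the pasted JSON text and save_escaped is a hypothetical helper:

```python
import json


def save_escaped(data, path):
    """Write data as JSON with every non-ASCII character escaped."""
    with open(path, "w", encoding="ascii") as handle:
        json.dump(data, handle, ensure_ascii=True)


# Stand-in for the JSON text copied from the browser.
raw_response = '{"text": "love it 🙌👏"}'
data = json.loads(raw_response)

# ensure_ascii=True (the default) emits \uXXXX escapes, using
# surrogate pairs for characters outside the Basic Multilingual Plane.
escaped = json.dumps(data, ensure_ascii=True)
print(escaped)  # {"text": "love it \ud83d\ude4c\ud83d\udc4f"}
```

Loading the escaped text with json.loads gives back the original emoji characters, so nothing is lost in either direction.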

How to store a blob of JSON in Airtable?

There does not appear to be a dedicated field type in Airtable for "meta" data blobs and/or a JSON string.
Is the "Attachment" type my best bet?
I could store it either as a JSON attachment or in a String-type column.
Since a full JSON document in a text column would likely not be readable, I would store it as an attachment.
However, it seems that, at least for now, uploading an attachment requires the file to be already hosted somewhere, so this route might not be the easiest one:
https://community.airtable.com/t/is-it-possible-to-upload-attachments/188
Right now this isn’t possible with the Airtable API alone. It’s
something we’ll think about for future API versions though. A
workaround for now is to use a different service
(e.g. Filestack, imgur, etc.) to process the upload before then
sending the url to Airtable. When Airtable processes the attachment,
it will copy the file to Airtable’s own (S3) server for safekeeping,
so it’s OK if the original uploaded file url is just temporary
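
For the String-column option, the blob only has to be serialized to text before it is written and parsed after it is read. A minimal sketch in Python; the meta dictionary is invented for illustration, and no Airtable API call is shown because the exact client code is not part of this thread:

```python
import json

# Hypothetical metadata blob to keep alongside a record.
meta = {"source": "import", "tags": ["a", "b"], "retries": 3}

# Compact, key-sorted text keeps the cell value short and stable.
cell_value = json.dumps(meta, sort_keys=True, separators=(",", ":"))

# After reading the cell back, parse it to recover the structure.
restored = json.loads(cell_value)
```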

how to pass backslash in html form

I know very little HTML. I have a backend application that does a MongoDB lookup. I am building a simple HTML screen with forms that pass values to a web service, which runs the Mongo query and shows the reply on the screen.
When I pass a filename path field in my form like this
\\test.server.com\filetest\test
in my web service app, I see the value coming in as
%5c%5Ctest.server.com%5cfiletest%5ctest
How can I get the value without this translation?
As a matter of fact, I was hoping it would come in like this
\\\\test.server.com\\filetest\\test
as that is how things got stored in mongo.
You cannot pass a backslash as it is, because URLs may contain only a limited set of ASCII characters. Special characters such as Ü, as well as characters that are not allowed unescaped in URLs (spaces, backslashes, etc.), have to be represented with ASCII symbols through percent-encoding.
In your case the URL is getting encoded and backslashes are converted to %5c. To have them revert to '\' you need to either:
Decode them back in your server-side code. This is your best bet. How this is done depends on the technology your backend uses. In PHP, for example, you can use the urldecode function.
Decode the characters before querying in MongoDB itself. This you will need to work on, because I'm not aware of functionality that does this for you out of the box.
Hope this helps!
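
As an illustration of the server-side decoding step, here is a minimal sketch in Python (the backend language in the question is not stated, so this is an assumption), using the exact value from the question:

```python
from urllib.parse import unquote

# The value as it arrives at the web service.
received = "%5c%5Ctest.server.com%5cfiletest%5ctest"

# Percent-decoding restores the backslashes; %5c and %5C are equivalent.
decoded = unquote(received)

# Only if the stored values really contain doubled backslashes:
# turn every single backslash into two before querying.
doubled = decoded.replace("\\", "\\\\")

print(decoded)  # \\test.server.com\filetest\test
print(doubled)  # \\\\test.server.com\\filetest\\test
```

Many web frameworks decode form values automatically, so it is worth checking whether the service is reading the raw request rather than the parsed field.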

What is this method of displaying images called?

On Google Images, some images are loaded like this:

What is this called? I want to find some more information about it (e.g. how well is it supported? Is it faster than traditional images?).
This is the data URI scheme, defined in RFC 2397.
In this specific case the image is base64 encoded and embedded in the page/CSS.
Base64 encoding increases the size of the data by about 33% (every 3 bytes become 4 characters), but the data URI avoids an additional request to the server, which in many cases makes the complete page load faster.
It's called Base64 encoding, or encoding Data URIs.
It's processed at the same speed as regular images, but it doesn't require an extra HTTP request, so it will result in a faster webpage load, but a (slightly) heavier HTML/CSS file.
That's the data URI scheme, see RFC 2397 as well.
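
To make the format and the size overhead concrete, here is a minimal sketch in Python that builds such a URI; to_data_uri is a hypothetical helper, and the payload is dummy bytes standing in for real image data:

```python
import base64


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Build a base64 data URI: data:<mime type>;base64,<encoded data>."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# 768 dummy bytes standing in for an image file.
payload = bytes(range(256)) * 3
uri = to_data_uri(payload, "image/png")

# Every 3 input bytes become 4 output characters.
encoded_part = uri.split(",", 1)[1]
print(len(payload), len(encoded_part))  # 768 1024
```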

Ruby encoding question

I'm saving scraped data to a web app, and here's a sample param:
400\xB0F.
This is the 'degree' character from a website, but when I put that into my model I get the dreaded invalid byte sequence in UTF-8 error.
Since it's coming from the web I thought I might try some client-side encoding, so JavaScript turns that into 400%B0F. ActiveRecord can at least save this with no issue, but Rails seems to escape it again on the way out, so the browser doesn't decode the escape sequence and my show method displays the entire encoded string.
Where should I be cleaning up my input data, and what methods might be the best to use for unpredictable input?
Thanks!
Years ago I had, and solved, this very same problem in builder. Take a look at the to_xs method: http://builder.rubyforge.org/classes/String.html#M000007
You can require builder, and use it directly (you might want to pass false to escaping or you will get entity escaped output). Either that, or simply steal and adapt the source.
Update: here is the original, standalone, library:
http://intertwingly.net/stories/2005/09/28/xchar.rb
Perhaps you can use a binary form (as for a file upload) with enctype="multipart/form-data" in the form tag. That way you can handle this data as binary data.
It depends on what you do with this data.
URI.unescape was the trick, after I encoded it client-side.
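
The underlying issue is that \xB0 is the degree sign in ISO-8859-1 (Latin-1) but is not a valid byte sequence on its own in UTF-8. A minimal sketch of the same cleanup in Python rather than Ruby, assuming the scraped page really is Latin-1 encoded:

```python
from urllib.parse import unquote

# The scraped bytes from the question.
raw = b"400\xb0F"

# Decoding as UTF-8 fails, which matches the error in the question.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decode with the source encoding, then re-encode as UTF-8 for storage.
text = raw.decode("latin-1")
stored = text.encode("utf-8")

# The percent-encoded form can be unescaped with the same encoding.
from_escaped = unquote("400%B0F", encoding="latin-1")

print(text)  # 400°F
```

When the source encoding is unpredictable, the decode step is the place to decide on a fallback, before the data reaches the model.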