How would you convert an image to JSON in Lua?

As you all have probably guessed, I've been trying to make an image parser in a heavily modified and sandboxed version of Lua known as "RBX.Lua" on the kids' gaming platform "ROBLOX".
It is heavily limited and sandboxed to protect the site and engine from harm.
Anyway, is there any way in normal Lua to convert an online image (.png, .jpg, etc) to JSON?
This will probably be closed for being too subjective, and I acknowledge that - I just want to see if there is any way to convert an image into JSON so it returns a JSON table of all the pixel data.

The problem is that you'll have a hard time reconstructing it inside Roblox if you intend to display it. There's no way to give raw image data to GUIs; you'd have to do some trickery and create a frame for every pixel of the image, which isn't very practical.
Otherwise, try converting the image data to base64 and back again. Since it would still be compressed, you'd have to do the JPEG or PNG decoding in Lua. Painful.
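If you can pre-process the image outside the game, the conversion itself is straightforward. Below is a rough sketch in Python with Pillow (not RBX.Lua; the file names are just placeholders) of producing a JSON table of pixel data that a script could then fetch and parse.

```python
# Offline helper in Python with Pillow -- not RBX.Lua. File names are placeholders.
# Dumps an image into a JSON table of per-row pixel data that a script could fetch.
import json
from PIL import Image

def image_to_json(image_path, json_path):
    img = Image.open(image_path).convert("RGB")          # normalize to plain RGB
    width, height = img.size
    flat = list(img.getdata())                           # flat list of (r, g, b) tuples
    rows = [flat[y * width:(y + 1) * width] for y in range(height)]
    with open(json_path, "w") as f:
        json.dump({"width": width, "height": height, "pixels": rows}, f)

image_to_json("input.png", "pixels.json")
```

Be aware that even a modest image produces a very large JSON payload this way, which is another reason the frame-per-pixel approach is impractical in Roblox.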

Related

Protocol Buffers vs XML/JSON for data entry outside of programming effort

I would love to use protocol buffers, but I am not sure if they fit my use case. Here it is:
I have a Quiz app. This requires a bunch of data, like categories, questions, a list of answers (and which ones are correct). I do not want to be responsible for entering this data - I would prefer to pass it off to a non-programmer to serialize all this data for me, in either XML or JSON. Then my app would just read in the data file.
Does Google's Protocol Buffers fit my use case? Or should I stick to a more traditional format like XML or JSON?
I think not: Protobuf is a binary format, so you would end up having to support both a text format (XML or JSON) and Protobuf.
Also, it does not seem you would benefit from Protobuf's better performance at all.
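To make the comparison concrete, here is a sketch of what the JSON route could look like; the field names are invented for the example, not taken from the question. A non-programmer edits the JSON by hand and the app simply loads it.

```python
# Hypothetical quiz data in JSON (field names are assumptions for the example).
import json

quiz_json = """
{
  "categories": [
    {
      "name": "Geography",
      "questions": [
        {
          "text": "What is the capital of France?",
          "answers": ["Paris", "Lyon", "Marseille"],
          "correct": [0]
        }
      ]
    }
  ]
}
"""

quiz = json.loads(quiz_json)
for category in quiz["categories"]:
    for question in category["questions"]:
        correct = [question["answers"][i] for i in question["correct"]]
        print(category["name"], "-", question["text"], "->", correct)
```

With Protobuf you would additionally need a .proto schema and a tool to turn the hand-written text into the binary format, which is exactly the extra step the answer warns about.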

Better way to bypass Haxe 16MB embedding file size limit?

I just noticed that Haxe (OpenFL) limits a single embedded file's size to 16MB when using the @:file tag (or openfl.Assets). Flash/Flex can embed much larger files directly. My way around the problem is to split one large file into several smaller files and combine them at run time, but that is not always convenient. Is there a better way to bypass this limit in Haxe?
First of all, embedding files this big is generally not a good idea:
the binary size is huge
it slows down compilation (because a large amount of data has to be copied around on every build)
the data is forced to stay in RAM/swap while the application is running
But, speaking of solving the exact problem... I'm not sure whether SWF allows embedding chunks this big at all (you'd need to check the bytecode spec), but in any case the limitation seems to come from an internal OCaml limit on string size. It can be fixed, I believe, but you would need to rewrite part of the Haxe SWF generator.
If you don't want to patch the compiler (which may not be possible if SWF doesn't allow chunks this big), you can just go with a simple macro that transparently slices your file into parts, embeds them, and reassembles them at runtime.
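The macro approach is Haxe-specific, but the underlying slice-and-reassemble idea is simple. Here is a rough Python sketch of it (the chunk size and file names are made up); the real solution would do the equivalent at compile time in a Haxe macro and at runtime in the application.

```python
# Illustration only: split a large asset into chunks below the 16MB limit,
# then stitch them back together. Chunk size and file names are made up.
CHUNK_SIZE = 15 * 1024 * 1024   # stay safely under 16MB

def split_file(path):
    parts = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            part_path = "%s.part%d" % (path, index)
            with open(part_path, "wb") as part:
                part.write(data)
            parts.append(part_path)
            index += 1
    return parts

def reassemble(parts, out_path):
    with open(out_path, "wb") as out:
        for part_path in parts:
            with open(part_path, "rb") as part:
                out.write(part.read())

parts = split_file("big_asset.bin")
reassemble(parts, "big_asset.rebuilt.bin")
```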

How to access jpeg pixels of bytearray directly in Actionscript 3

I'm trying to:
load a jpeg through FileReference
write the result to a bytearray
extract the pixel data (and nothing else) directly from the bytearray
I've spent many hours looking for an AS3 class that can decode a jpeg object from raw binary data (such as from a bytearray), but to no avail (there is one here but it relies on Alchemy and a SWC, which isn't suitable).
Put simply, once I have the raw data in the byte array, I want to know how to discern the pixel data from the rest of the file.
I'm not interested in using the Loader class, or Bitmap's 'getPixels' function.
You will notice that steganography examples rely on using a PNG file. The reason you can't (easily) use a JPEG file is that the encoding process destroys the reliability of the pixel data. JPEG files can be encoded in several colour models, including CMYK and RGB but most often YCbCr, and the compression relies on a Fourier-related transform (the discrete cosine transform) plus quantization, which throws away pixel-level detail. Therefore you will not be able to use the same process on JPEG as on PNG, GIF, BMP, etc.
This is not to say that you cannot do it in a JPEG file, but you need to change the approach, or account for the loss of data at the compression stage (or save uncompressed).
Well, you could manipulate the compressed data directly to include your message, but you'd have to read up on how you're able to do it without totally corrupting the image.
But if you're planning to encode the message in the pixels and then do a per-pixel diff when decoding it, I'm afraid your assumption (from the comment on Daniel's answer) is wrong.
JPEG compression is lossy: when you put the amended pixel data back into the image file it has to be re-encoded, and the exact pixel values are lost. Instead of pixel data, the only information saved in the file is how to reassemble an image that looks very similar to the original to the human eye, but the pixel data is not the same.
You cannot rely on a per-pixel comparison even if you decode the image, save it as a JPEG, then transform the original image and save that as a second JPEG with the exact same compression settings.
However, as I seem to remember that JPEG compresses the image data in 8*8 pixel blocks, you might be able to manipulate and compare the image data on a per-block basis.
extract the pixel data (and nothing else) directly from the bytearray
To do this you need to decode the JPEG first (apart from some possible metadata, there is nothing other than pixel data in a typical JPEG file), and the way to do that is precisely using Loader.loadBytes and then BitmapData.getPixels. You can probably write your own decoder (like the one you posted), but I don't see any benefit in doing so.
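The question is about AS3, but to make the "decode first, then read pixels" point concrete, here is a small Python/Pillow sketch (the file name is invented) that takes the raw JPEG bytes, the rough equivalent of the ByteArray, and turns them into addressable pixels.

```python
# Python/Pillow illustration of "decode the JPEG, then read pixels".
# The AS3 equivalent of the decode step is Loader.loadBytes + BitmapData.
import io
from PIL import Image

with open("photo.jpg", "rb") as f:
    raw_bytes = f.read()                 # the raw file data, like an AS3 ByteArray

img = Image.open(io.BytesIO(raw_bytes)).convert("RGB")   # the decode step
r, g, b = img.getpixel((0, 0))           # pixel access is only meaningful after decoding
print(img.size, (r, g, b))
```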
A guy named Thibault Imbert at ByteArray.org adapted the libjpeg library for ActionScript 3. I have not tested this, but other folks seem to like it, judging by the comments at bytearray.org.
http://code.google.com/p/as3-jpeg-decoder/downloads/list

HTML5: accessing large structured local data

Summary:
Are there good HTML5/javascript options for selectively reading chunks of data (let's say to be eventually converted to JSON) from a large local file?
Problem I am trying to solve:
An existing program runs locally and outputs a ton of data. I want to provide a browser-based interactive viewer that will allow folks to browse through these results. I have control over how the data is written out. I can write it all out in one big file, but since it's quite large, I can't just read the whole thing into memory. Hence, I am looking for some kind of indexed or db-like access to it from my webapp.
Thoughts on solutions:
1. Brute-force: the HTML5 File API has a nice slice() method on File/Blob objects for random access. So I could write out some kind of an index at the beginning of the file, use it to look up the positions of the other stored objects, and read them whenever they're needed. I figured I'd ask if there are already javascript libraries that do something like this (or better) before trying to implement this ugly thing (a sketch of such an index layout follows this list).
2. HTML5 local database. Essentially, I am looking for an analog of HTML5 openDatabase() call that would open (a read-only) connection to a database based on a user-specified local file. From what I understand, there's no way to specify a file with a pre-loaded database. Furthermore, even if there was such a hack, it's not clear whether the local file format would be the same across browsers. I've seen the phonegap solution that populates the browser local database from SQL statements. I can do that too, but the data I am talking about is quite large (5-10GB): it will take a while to load, and such duplication seems rather pointless.
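For option 1, the producing side might look like the following sketch (the header layout, names, and format are all assumptions, not something from the question): a fixed-size header holding the index length, a JSON index of byte offsets, then the data chunks. A browser-side reader can then slice() the header, then the index, and finally only the chunks it needs.

```python
# Hypothetical indexed file layout (all names and the format are assumptions):
#   [8-byte big-endian index length][JSON index: name -> [offset, length]][chunk][chunk]...
# Offsets in the index are relative to the end of the index, so a reader computes
# absolute positions as 8 + index_length + offset and fetches them with Blob.slice().
import json
import struct

def write_indexed_file(path, chunks):
    """chunks: dict mapping a chunk name to a Python object stored as a JSON blob."""
    bodies = {name: json.dumps(obj).encode("utf-8") for name, obj in chunks.items()}
    index, offset = {}, 0
    for name, body in bodies.items():
        index[name] = [offset, len(body)]
        offset += len(body)
    index_bytes = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack(">Q", len(index_bytes)))   # 8-byte header: index length
        f.write(index_bytes)
        for body in bodies.values():                   # same order as the index
            f.write(body)

write_indexed_file("results.dat", {"run1": {"score": 3}, "run2": {"score": 7}})
```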
HTML5 does not sound like the appropriate answer for your needs. HTML5's focus is on the client side, and based on your description you're asking a lot out of the browsers, most likely more than they can handle.
I would instead recommend you look at a server-based solution to deliver the desired goal/results to the client view, something like Splunk would be a good product to consider.

Should HTML be encoded before being persisted?

Should HTML be encoded before being stored in say, a database? Or is it normal practice to encode on its way out to the browser?
Should all my text based field lengths be quadrupled in the database to allow for extra storage?
Looking for best practice rather than a solid yes or no :-)
Is the data in your database really HTML or is it application data like a name or a comment that you just happen to know will end up as part of an HTML page?
If it's application data, I think it's best to:
represent it in a form that is native to the environment (e.g. unencoded in the database), and
make sure it's properly translated as it crosses representational boundaries (encode when you generate the HTML page).
If you're a fan of MVC, this also helps separate the view/controller from the model (and from the persistent storage format).
Representation
For example, assume someone leaves the comment "I love M&Ms". It's probably easiest to represent it in the code as the plain-text String "I love M&Ms", not as the HTML-encoded String "I love M&amp;Ms". Technically, the data as it exists in the code is not HTML yet, and life is easiest if the data is represented as simply and accurately as possible. This data may later be used in a different view, e.g. a desktop app. This data may be stored in a database, a flat file, or an XML file, and perhaps later be shared with another program. It's simplest for the other program to assume the string is in the "native" representation for the format: "I love M&Ms" in a database and flat file, and "I love M&amp;Ms" in the XML file. I would cringe to see the HTML-encoded value encoded again in an XML file ("I love M&amp;amp;Ms").
Translation
Later, when the data is about to cross a representation boundary (e.g. displayed in HTML, stored in a database, plain-text file, or XML file), it's important to make sure it is properly translated so it is represented accurately in a format native to that next environment. In short, when you go to display it on an HTML page, make sure it's translated to properly-encoded HTML (manually or through a tool) so the value is accurately displayed on the page. When you go to store it in the database or use it in a query, use escaping and/or prepared statements with bound variables to ensure the same conceptual value is accurately represented to the database. When you go to store it in an XML file, ensure it's XML-encoded.
Failure to translate properly when crossing representation boundaries is the source of injection attacks such as SQL injection. Be conscientious of that whenever you are working with multiple representations/languages (e.g. Java, SQL, HTML, Javascript, XML, etc.).
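As a concrete sketch of "represent natively, translate at each boundary", here is a small Python example (sqlite3 for the database boundary, html.escape for the HTML boundary); the table and column names are made up for the illustration.

```python
# Store the plain-text value; translate it at each boundary it crosses.
# Table and column names are made up for this example.
import html
import sqlite3

comment = "I love M&Ms <3"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (body TEXT)")

# Database boundary: a bound parameter, not string concatenation (prevents SQL injection).
db.execute("INSERT INTO comments (body) VALUES (?)", (comment,))

# HTML boundary: encode only when generating the page (prevents XSS).
(stored,) = db.execute("SELECT body FROM comments").fetchone()
print("<p>%s</p>" % html.escape(stored))   # -> <p>I love M&amp;Ms &lt;3</p>
```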
--
On the other hand, if you are really trying to save HTML page fragments to the database, then I am unclear on what you mean by "encoded before being stored". If it is strictly valid HTML, all the necessary characters should already be encoded (e.g. &amp;, &lt;, etc.).
The practice is to HTML encode before display.
If you are consistent about encoding before displaying, you have done a good bit of XSS prevention.
You should save the original form in your database. This preserves the original, and you may want to do other processing on it rather than on the encoded version.
Database-vendor-specific escaping on the input, HTML escaping on the output.
I disagree with everyone who thinks the data should be stored unencoded and only encoded at display time. If it's encoded before it reaches the database, an attack is only possible if a developer deliberately decodes it before displaying it. If instead you rely on encoding at presentation time, there is always a chance that some other developer, like a new hire, or a bad implementation, will skip it. If it's sitting there unencoded, it's just waiting to pop out on the internet and spread like herpes. Losing the original data shouldn't be a concern: encode + decode should produce the same data every time. Just my two cents.
For security reasons, yes, you should first convert the HTML to entities and then insert it into the database. Attacks such as XSS are initiated when you allow users (or rather bad guys) to use HTML tags and then process/insert them into the database. XSS is one of the root causes of most security holes. So you definitely need to encode your HTML before storing it.