What is the most efficient and safe method to save a short array in HTML?

I'm trying to make a website to serve as the interface between a plotting program and the user's input file. The plotting program needs several parameters, which I can let the user enter with input tags. But the plotting program also needs user input for the legend that distinguishes the values in the input file, namely each range (boundary) of values and the color that corresponds to it. I made a fieldset containing the required input elements for one range. When the user clicks "Add another range", the content of the fieldset is cleared so that it is ready for new input, and the previously entered input is stored as a new row in a table below. Beside each row there is a "delete" button.
As this website is aimed at multiple users, this information should also be exclusive to the corresponding user. Could someone please tell me what approach I should use? The plotting program is written in Perl, and I'm using CGI for this website.
This approach should allow the HTML part to access the current values in the array, so I can display the entered ranges in the table dynamically. It should also allow deletion, modification, and addition of the entered range information. I'm thinking of a temporary database. But I only need the final version of all the range info as a string, so I can send it to the CGI program and organize it into the correct format for input to the Perl plotting program.
Any help or hint is greatly appreciated! I'm a newbie to this area. Thank you very much for your time and help in advance!

JSON is pretty universal these days. Use that. Many newer database systems, such as MongoDB, store JSON-like documents natively.
Most server-side languages can consume and produce JSON easily. JSON allows structured data, so it can do more than simple arrays.
JSON is also very fast to handle in the browser (compared to XML), because its syntax is based on JavaScript object literals and browsers parse it natively with JSON.parse.
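A minimal sketch of what this answer suggests, written in Python only for illustration (the question itself uses Perl and CGI). The field names min, max, and color and the helper functions add_range and delete_range are assumptions, not part of the original question:

```python
import json

# Hypothetical range records: lower bound, upper bound, and legend color.
ranges = [
    {"min": 0.0, "max": 10.0, "color": "#ff0000"},
    {"min": 10.0, "max": 20.0, "color": "#00ff00"},
]

def add_range(ranges, lo, hi, color):
    """Return a new list with one more range appended."""
    return ranges + [{"min": lo, "max": hi, "color": color}]

def delete_range(ranges, index):
    """Return a new list without the range at the given table row index."""
    return ranges[:index] + ranges[index + 1:]

ranges = add_range(ranges, 20.0, 30.0, "#0000ff")
ranges = delete_range(ranges, 0)

# One string that could be placed in a hidden form field and posted to the CGI script.
payload = json.dumps(ranges)

# The receiving side decodes the same string back into a list of dicts.
decoded = json.loads(payload)
```

Because the whole list travels as one string, only the final state of the table has to reach the server, which matches the "I only need the final version" requirement in the question.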

If the data will stay purely in Perl, then FreezeThaw or Storable are the things to use. If your data is simple, then there is nothing wrong with Diodeus' answer of using JSON, but as things get complicated, those modules will be able to handle the complexities of Perl data structures better.

Related

How should a "project" file be written?

With popular software packages, like Microsoft Word or Photoshop, we often have an option to save our progress as a "project" file and can later open that file to edit our work further. This file often contains all the options and the progress that the user has made (e.g. the essay you typed in Word).
So my question is, if I am writing a similar application that requires creating a similar "project" file, how should I go about this? My application is a scientific application, which means it requires a lot of (multi-dimensional) arrays. I understand there are a lot of options for doing this, but I would like to know the de facto way.
Here are some of the options I have outlined:
XML: Human readable. The size is too big and it's too much work to deal with arrays.
JSON: More popular/modern. Good with arrays.
Protocol Buffers: Created by Google. Probably faster.
Database: Probably not a good use case since "project" files are most likely "temporary". Also, working with arrays is not very straightforward.
Creating your own binary format: Might be the most difficult solution for an inexperienced programmer like myself.
???
I would like to get some advice from you guys. Thank you :).
(Good question. :) Just some thoughts.) I'd prefer a text format for the main project file. You can diff it, open it, and read and modify it easily. Large ASCII or binary data can be stored as serialized data in external files or in a database like SQLite, from where the application can easily access and process it. The main project file then links to the external data store. My advice for the main project file is a simple XML format that can easily be transformed to JSON. A list of key-value pairs (a dict) is good to begin with; a value can be a basic data type, an array, or a dict. A complicated XML tree is not good. The key name can also help to describe and structure the data, so I'd prefer key="rect.4711.pos.x" value="500" and not <rect id="4711"><pos><x>500</x>...</pos>.... An important aspect is that the project data is portable and self-contained, and that the user can see the project as a single unit even if it is a directory on the file system; for this purpose, supporting some kind of zipped format for the project data is good.
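The flat key naming preferred in this answer can be sketched as follows, in Python for illustration only. The flatten helper and the pos.y entry are hypothetical; only the rect.4711.pos.x key with value 500 comes from the answer:

```python
def flatten(data, prefix=""):
    """Flatten nested dicts into a single dict with dotted key names."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend into nested dicts, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            # Basic values and arrays are stored as-is under the full key name.
            flat[name] = value
    return flat

# Hypothetical nested project data; "y" is added only to show a second key.
project = {"rect": {"4711": {"pos": {"x": 500, "y": 120}}}}
flat = flatten(project)
```

Here flat holds the keys rect.4711.pos.x and rect.4711.pos.y, so the file can be written as a plain list of key-value pairs that is easy to diff.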

How to define and store data structure properties in a database

I'm working on a PHP web app with a Postgres backend. The app uses a variety of APIs, and I want to be able to add/edit the API endpoints used by the system dynamically.
I'm planning to handle variations in the API request URLs with replacement codes, for example: http://api.com/?key=%%api_key%%&user_id=%%user_id%%
The part I don't have a plan for is how to define and store the "shape" of the returned API data. For example, let's say I want to get a user's comments from different APIs. The structure of the data will likely differ from one to another. Even if they all return JSON data (vs. XML), the properties I care about will be located in different places. Is there an established way to do this?
I'm considering a text field with a json "map" to the location of the properties:
{
"user": {
"comments" : %%HERE%%
}
}
Presumably my app would parse this and loop through it to find the indicated location, and then use that to find the data in the corresponding location in the response data. But I'm not exactly sure how to do it, or whether this is even the best way. Any suggestions are welcome.
Thinking this through a bit more, I realize that an alternative approach would be to store some kind of algorithm for finding the data. Is there a precedent for this? I briefly considered the idea of storing raw PHP code that could be executed to parse the data, but this feels very wrong and potentially dangerous/insecure.
JOLT may be helpful. It's for transforming JSON to JSON, much like XSLT for XML. You could write a spec for each new API, which would transform the data into a uniform format for your app to read.
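If a full transformation spec is more than is needed, a lighter variant of the "map" idea from the question is to store a path to the property as plain data instead of storing code. The sketch below is not JOLT and is written in Python for illustration only; the extract function, the dotted path syntax, and both sample responses are assumptions:

```python
import json

def extract(document, path):
    """Follow a dotted path such as 'user.comments' through nested dicts and lists."""
    current = document
    for part in path.split("."):
        if isinstance(current, list):
            # Numeric path parts index into JSON arrays.
            current = current[int(part)]
        else:
            current = current[part]
    return current

# Hypothetical per-API configuration, as it might be stored in a text column.
comment_paths = {
    "api_a": "user.comments",
    "api_b": "data.0.profile.remarks",
}

response_a = json.loads('{"user": {"comments": ["first", "second"]}}')
response_b = json.loads('{"data": [{"profile": {"remarks": ["hello"]}}]}')

comments_a = extract(response_a, comment_paths["api_a"])
comments_b = extract(response_b, comment_paths["api_b"])
```

Since the stored value is only a path string, nothing from the database is ever executed, which avoids the security concern raised about storing raw code.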

Application Input Specification: Drawing input data of method

Does anyone know a good way to draw the exact structure of the input data for a method? In my case I have to specify the correct input data for a server application. The server gets an HTTP POST with data. Because this data is a very complex JSON data structure, I want to draw it, so that the next developer can easily check the drawing and understand what data is needed for the HTTP POST. It would be nice if I could also draw the HTTP headers and mark data as mandatory or nice to have.
I don't need a data flow diagram or something like that. What I need is a drawing of how to build valid JSON for the server method.
If anyone has an idea, please answer or comment on this question. Even if you just have ideas for buzzwords, I can google them myself.
To describe the data structure, consider (1) using a UML class diagram with multiplicities, ownership, and "named association ends". Kirill Fakhroutdinov's examples at uml-diagrams.org (Online Shopping and Sentinel HASP Licensing Domain) illustrate what your drawing might look like.
As you specifically need to describe a JSON structure, (2) Google "json schema" to see how others have approached the same problem.
Personally, besides providing the UML diagram, I'd (3) consider writing a TypeScript definition file, which can describe a JSON structure including simple types, nested structures, optional parts, etc. Moreover, the next developer can validate examples of data structures (unit tests) against the definition by writing a simple TypeScript script and trying to compile it.
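The idea of a machine-checkable description with mandatory and optional parts can be sketched without any particular tool. The example below is a hand-rolled stand-in in Python, not JSON Schema and not a TypeScript definition; ORDER_SPEC, its field names, and check_structure are all hypothetical:

```python
# Hypothetical description of the POST body: field name -> (expected type, mandatory?)
ORDER_SPEC = {
    "customer_id": (int, True),
    "items": (list, True),
    "note": (str, False),  # "nice to have"
}

def check_structure(payload, spec):
    """Return a list of problems; an empty list means the payload matches the spec."""
    problems = []
    for field, (expected_type, mandatory) in spec.items():
        if field not in payload:
            if mandatory:
                problems.append(f"missing mandatory field: {field}")
            continue
        if not isinstance(payload[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return problems

good = {"customer_id": 7, "items": ["pen"]}
bad = {"items": "pen", "note": 3}

good_problems = check_structure(good, ORDER_SPEC)
bad_problems = check_structure(bad, ORDER_SPEC)
```

Example payloads kept next to such a description double as unit tests, which is the same benefit the answer attributes to compiling examples against a TypeScript definition.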

How do I learn to verify that user input is sane?

I'm not sure of the terminology here, so let me specify that when I say "verify" user input, I mean watch out for users claiming 30 Feb 2021 as their birthdays, rather than guarding against injection attacks.
Are there any guides to doing this correctly, or lists of common ways people do it wrong? Strategies for ensuring correct input even before it's entered (e.g., picking out of a calendar instead of typing into a text field)?
Note that I am not interested in language-specific answers (e.g., ASP.NET Validation Controls) but rather general strategies and principles.
The freer you make the input field, the more you have to check. Some languages may make it easy for you to verify that a text field is a valid date; others may not.
Then again, some users will resent clicking on a calendar control or three drop-downs to enter their birthdate. They may prefer to just type it in. That's a trade-off.
The term you are looking for is input validation.
As you point out, if you use a control where it is impossible to enter invalid data you can help the client, but you still need to implement proper validation on the server.
"I mean watch out for users claiming 30 Feb 2021 as their birthdays, rather than guarding against injection attacks"
Why not do both? Is there a specific reason why you want to leave yourself open to injection attacks?
Assume that the user sends a string to the server, either one they entered themselves or one that was sent by a control you placed on the page. The first part is to find a library function for parsing the string into typed data. In your example you could use DateTime.TryParse (in .NET) to parse a string to a date. This will fail for your example, as the given date is invalid. If you cannot find a library function for what you are trying to parse, you can try to write a parser yourself. For simple validations you may be able to express the rule as a regular expression. For more complicated inputs you may need to write some code that performs the validation, perhaps even using a parser library to help you if the input language is particularly complicated.
The second part is to implement business validation rules specific to your needs. For example, you know that a birth date must be in the past, but not too far in the past. This will require some judgement: it's not impossible that someone using your site could be 100 years old, but it's highly unlikely that they are 200 years old, since no one is believed to be that old.
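The two parts described above can be sketched in Python, used here only for illustration. The ISO input format, the function names, and the 130-year limit are assumptions chosen for the example:

```python
from datetime import date, datetime

def parse_birthday(text):
    """Part 1: parse 'YYYY-MM-DD' into a date, or return None if it is not a real date."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        # Raised for malformed strings and for impossible dates such as 30 February.
        return None

def is_plausible_birthday(birthday, today, max_age_years=130):
    """Part 2: business rule, the date must be in the past but not implausibly far back."""
    if birthday >= today:
        return False
    return today.year - birthday.year <= max_age_years

today = date(2024, 1, 1)  # fixed reference date so the example is repeatable
invalid = parse_birthday("2021-02-30")
valid = parse_birthday("2021-02-28")
```

Keeping the parse step separate from the business rule means the second function only ever sees real dates, so it can focus on the judgement call about age.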
I would recommend using a design pattern called "strategy". This is one of the patterns described by the "Gang of Four", or "GoF" for short. There are some related ideas that you may have heard of, e.g. "inversion of control" and "dependency injection".
Anyway, in an object-oriented language, you create a class called "Validator", which validates data in a method called "validate". You'll have to make validate accept some relevant form of input, or overload it to have different methods for different sorts of data. Or, if you have access to some form of generics, you can use that.
Next, the constructor of this class should take a "ValidatorStrategy" object as an argument, and the actual validation is then delegated to the strategy object.
To take this even further, you could create some sort of input form generator system, where you specify input fields with your own type names. These would then generate different input fields depending on your front-end language (HTML/Android XML/Java Swing), and they would also affect the way in which the input is validated.
Hmm... I wonder how to solve the issue of two password input fields that need to have exactly the same content to validate. How would this look in the form generating system? Maybe there would be one input type named "password", which would generate one input field that doesn't show the input and has no validation, and another type named "passwordsetter", which would generate two input fields that don't show the input and have the validation strategy of comparing the data from the two fields. Creating that validation strategy could be pretty tricky though D:
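A minimal sketch of the validator and strategy arrangement described in this answer, in Python for illustration. The two concrete strategies and the is_valid method name are assumptions; the second one shows one possible way to handle the two-password-field case by passing both values as a pair:

```python
class NonEmptyStrategy:
    """Accepts any string that contains at least one non-whitespace character."""
    def is_valid(self, value):
        return isinstance(value, str) and value.strip() != ""

class MatchingPairStrategy:
    """Accepts a (first, second) tuple and requires both entries to be equal."""
    def is_valid(self, value):
        first, second = value
        return first == second

class Validator:
    """Delegates the actual check to the strategy passed to the constructor."""
    def __init__(self, strategy):
        self._strategy = strategy

    def validate(self, value):
        return self._strategy.is_valid(value)

name_validator = Validator(NonEmptyStrategy())
password_setter_validator = Validator(MatchingPairStrategy())
```

New kinds of input then only need a new strategy class; the Validator itself stays unchanged.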

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable for the file, with each method opening, appending to, and closing the file? (A problem is that methods sometimes call other methods, so this seems kind of messy.)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, with a "central" method that calls all the other tests writing all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different from checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code that focuses on the scraping activity and returns the value(s) scraped. Then use aspect-oriented programming for logging, which can simplify the problem greatly, as the aspect holds the reference to the file or logging API.
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my head, sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de-facto standards for that.
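As a rough illustration of the logging-framework approach, here is a sketch using Python's standard logging module rather than log4j or log4net. The ScrapeChecker class, the make_logger helper, the report file name, and the sample results are assumptions; the method names check_num_empty_results and check_scrape are taken from the question:

```python
import logging
import os
import tempfile

def make_logger(path):
    """Create a logger that writes one line per message to the given file."""
    logger = logging.getLogger("scrape_checker")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    handler = logging.FileHandler(path, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep the report out of the root logger's output
    return logger

class ScrapeChecker:
    def __init__(self, logger):
        self.log = logger

    def check_num_empty_results(self, results):
        empty = sum(1 for r in results if not r)
        self.log.info("empty results: %d of %d", empty, len(results))
        return empty

    def check_scrape(self, results):
        # Calls other checks; none of them needs to know about the file.
        self.check_num_empty_results(results)
        self.log.info("check_scrape finished")

report_path = os.path.join(tempfile.mkdtemp(), "scrape_report.log")
checker = ScrapeChecker(make_logger(report_path))
checker.check_scrape(["a", "", "b", ""])
logging.shutdown()  # flush and close the file handler
```

The file is opened once by the handler, so checks that call other checks never have to open, append to, or close anything themselves.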