Best practice for sending data between groovy and powershell processes? - json

I'm currently working on a project that has front-end components (Jira) written in groovy, and backend processes written in powershell. We're using json to pass information back and forth. One of the biggest problems we've encountered is coming up with a standardized "template" for the json that is being used on both ends. What we have works, but it is a frankenstein mess.
We are using json libraries for both groovy and powershell -- the json that is being constructed on either end is legit json. We are also encoding it to base64 to get around interpolation issues we've run into.
My main question is this: what is the best practice for passing data between different tools in json? I'm relatively new to it. Is there some sort of standard template we should be adhering to? I develop the groovy side, my friend the powershell side -- I was hoping to come up with something that would minimize problems if someone messed up how the json was constructed on either side. Something to check against. Something akin to an xsd.
Was curious if people have dealt with this type of thing, and what the best approach was. As I mentioned before, we have something that works now -- with error handling and whatnot, but it was very organic... and not standardized at all. I saw mention of jsonp, jsend, etc., but I'm having some difficulty grokking the options.
Tips/guidance appreciated.
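To make the idea of "something to check against" concrete, here is a minimal sketch of a shared envelope that is base64-encoded for transport and checked on receipt. It is written in Python purely for illustration (the real code would live in groovy and powershell), and the envelope fields `version`, `action` and `payload` are made-up names, not an established standard.

```python
import base64
import json

# Hypothetical envelope fields that both sides agree on.
REQUIRED_KEYS = {"version", "action", "payload"}


def encode_message(message: dict) -> str:
    """Serialize to JSON, then base64, so the text survives shell interpolation."""
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_message(encoded: str) -> dict:
    """Reverse encode_message and reject envelopes with missing keys."""
    message = json.loads(base64.b64decode(encoded).decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("envelope must be a JSON object")
    missing = REQUIRED_KEYS - message.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return message
```

Each side would run the equivalent of `decode_message` before touching the data, so a malformed message fails early with a clear error instead of deep inside the business logic.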

Related

Where shall I use protobuf

I was reading this article about protobuf and I wondered where to use it in my projects. I read some articles that said Google created protobuf to replace XML, but as far as I know, in 2008 (the first release) JSON was already there.
I searched more and I found an article that the writer suggested to use it instead of JSON, but I still don't get the idea completely.
So where shall I use it? Any special scenario, or like JSON whenever that I want to transport data? Any other scenarios?
It is useful whenever you want to serialize/deserialize your data. Typical situations include sending your data to someone else over the network, storing it to disk or keeping it in context while performing asynchronous processes.
Here is a brief explanation about the main differences between protocol buffer, json and XML: https://stackoverflow.com/a/14029040/6681872

Can/should I use YAML as payload in RESTful webservice?

As the header says.
In general I like YAML more than JSON these days. I implemented a RESTful WS PoC back in the day using JSON. I was wondering if I can instead use YAML or not.
E.g. are there enough tools/libraries/support for doing that? Or would I end up doing quite a bit of mundane/tedious coding which I would've avoided if I were using JSON instead?
Also as I understood from WWW: REST doesn't restrict one from using YAML as the payload, is that correct?
Thanks!
Yes, if it's a goal that the data be especially readable by humans. REST itself isn't focused on protocols/formats so much as patterns.
There's not a lot to gain here for webservices however, which typically represent app to app communication. Computers don't care, and JSON can be pretty-printed to improve legibility somewhat.
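As a small illustration of the pretty-printing point, the sketch below serializes the same data compactly (for the wire) and indented (for a human). The sample record is made up.

```python
import json

# Made-up sample data, only for illustration.
record = {"id": 7, "tags": ["a", "b"], "owner": {"name": "sam"}}

# Compact form: no extra whitespace, suited to app-to-app traffic.
compact = json.dumps(record, separators=(",", ":"))

# Indented form: same data, easier to read while debugging.
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```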
YAML is well supported by mainstream languages, though not always included in standard libraries as JSON typically is. So you'll probably be looking at an additional library dependency.
Also, if the client is a browser, parsing will be slower, as you'll have to use a non-native external library such as JavaScript YAML Parser. Make sure it gets compressed in transit or the extra indentation spaces will expand the size of the data.
Also, YAML has a lot of esoteric and downright potentially dangerous features. Whenever I'm using it I use the "safe" parser, and deactivate many if not most of its features besides data structures.
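For what the "safe" parser looks like in practice, here is a minimal sketch in Python. It assumes the third-party PyYAML package is installed; `parse_untrusted_yaml` is a hypothetical helper name. `yaml.safe_load` builds only plain data types and refuses tags that would construct arbitrary objects.

```python
try:
    import yaml  # PyYAML, a third-party package (assumed to be installed)
except ImportError:
    yaml = None


def parse_untrusted_yaml(text: str):
    """Parse YAML with the safe loader, which builds only plain data types."""
    if yaml is None:
        raise RuntimeError("PyYAML is not installed")
    return yaml.safe_load(text)
```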
I could imagine some utility as a debug parameter however, perhaps url.yaml or …?fmt=yaml to assist during development. But, otherwise not much gain for all the trouble.

Application Input Specification: Drawing input data of method

Does anyone know a good way to draw the exact structure of input data for a method? In my case I have to specify the correct input data for a server application. The server gets an http post with data. Because this data is a very complex json data structure, I want to draw it, so the next developer can easily check the drawing and understand what data is needed for the http post. It would be nice if I could also draw http headers and mark data as mandatory or nice to have.
I don't need a data flow diagram or something like that. What I need is a drawing of how to build a valid json for the server method.
Please, if anyone has an idea, just answer or comment on this question. Even if you just have ideas for buzzwords, I can google them myself.
In order to describe data structure consider (1) using the UML class diagram with multiplicities and ownership and "named association ends". Kirill Fakhroutdinov's examples uml-diagrams.org: Online Shopping and uml-diagrams.org: Sentinel HASP Licensing Domain illustrate what your drawing might look like.
As you need to specifically describe json structure then (2) Google: "json schema" to see how others approached the same problem.
Personally, besides providing the UML diagram I'd (3) consider writing a TypeScript definition file, which can describe the json structure including simple types, nested structures, optional parts etc. Moreover, the next developer can validate examples of data structures (unit tests) against the definition by writing a simple TypeScript script and trying to compile it.
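To give a feel for option (2), here is a rough sketch of a JSON-Schema-style description plus a hand-rolled checker, written in Python. It handles only three keywords (`type`, `required`, `properties`), which is a tiny subset of what real JSON Schema validators support, and the "order" fields are invented for the example. Mandatory data goes in `required`; anything listed only under `properties` is "nice to have".

```python
# Hypothetical schema for an illustrative "order" payload.
SCHEMA = {
    "type": "object",
    "required": ["customer_id", "items"],  # mandatory fields
    "properties": {
        "customer_id": {"type": "integer"},
        "items": {"type": "array"},
        "note": {"type": "string"},  # optional ("nice to have")
    },
}

JSON_TYPES = {"object": dict, "array": list, "string": str,
              "integer": int, "boolean": bool}


def validate(instance, schema) -> list:
    """Return a list of error strings; an empty list means the data is valid."""
    expected = JSON_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if not isinstance(instance, expected) or (
        expected is int and isinstance(instance, bool)
    ):
        return [f"expected {schema['type']}"]
    errors = []
    if schema["type"] == "object":
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required field: {key}")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(f"{key}: {e}" for e in validate(instance[key], sub))
    return errors
```

For example, `validate({"items": "x"}, SCHEMA)` reports both the missing `customer_id` and the wrongly typed `items`, which is the kind of feedback the next developer needs when building the http post body.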

twisted - transfer data using json

I need to transfer data (objects) between client and server, and Twisted seems a good way to accomplish this. I've been doing a lot of searching but still haven't found any example to understand the basic principle. So any simple code would help.
Thanks!
EDIT
Both client and server are written in python
The data may be large, so I need a fast, reliable transmission ( I've taken a look at producers, is that good?)
Flask is great, but I am using another framework, so the whole networking thing relies on Twisted.
It's hard to tell if your question is more about json, python or twisted, but here's an overview, more can follow once the specifics are known. Perhaps you could add some more info to your question so we can offer more assistance :-)
re Json: Json is just a string with a defined structure. If you are working in python and have an object to send as json, then you need to convert the object to a json string by use of
import json
json.dumps(objectName)
If your client is javascript then instead of json.dumps you might use JSON.stringify(objectname).
If you intend to use javascript for clients then some of the frameworks like jQuery make it very easy.
Python's json.dumps has a lot of optional arguments, most of which you won't need. You can see the options at https://docs.python.org/2/library/json.html
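One thing worth knowing: json.dumps only handles dicts, lists, strings, numbers, booleans and None out of the box. For your own classes, the optional `default` argument is the usual hook. A minimal sketch, where the `Reading` class and `to_json` helper are made up for illustration:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Reading:  # hypothetical object to transfer
    sensor: str
    value: float


def to_json(obj) -> str:
    # json.dumps calls `default` for any value it cannot serialize itself;
    # here that turns dataclass instances into plain dicts.
    return json.dumps(obj, default=asdict, sort_keys=True)


text = to_json({"readings": [Reading("t1", 21.5)]})
restored = json.loads(text)  # comes back as plain dicts and lists
```

Note that `json.loads` gives back dicts and lists, not `Reading` instances; rebuilding the objects on the receiving side is a separate step.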
Python is python, I assume you know how to create and populate objects. Will your client be python or javascript or something else? From a javascript client to a python server you would most likely use Ajax to send requests and get responses.
Twisted allows you to easily create a server that will listen on a given port and, when data arrives, an event will occur that supplies the data received. You can then do whatever you need to with the data. Just be careful about doing blocking things like database inserts since the server may miss some data or otherwise misbehave if you interrupt its event loop. Twisted can be difficult to learn initially, but it is a very powerful and reliable system that is well proven. One alternative to consider, particularly if your clients are not python, is node.js. In my opinion, node is a little bit easier to grasp initially and there are thousands of add-on modules that let you do almost anything you'd want. I use both twisted and node for different things.
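Here is a rough sketch of such a server, assuming one JSON document per line and a made-up reply format. The JSON handling is kept in a plain function (`handle_message`, a hypothetical name) so it can be tried without Twisted; `main` wires it into a `LineReceiver` and is not called automatically because `reactor.run()` blocks.

```python
import json


def handle_message(line: bytes) -> bytes:
    """Decode one JSON message and build a JSON reply (pure function, no I/O)."""
    try:
        obj = json.loads(line.decode("utf-8"))
    except (UnicodeDecodeError, ValueError) as exc:
        return json.dumps({"ok": False, "error": str(exc)}).encode("utf-8")
    return json.dumps({"ok": True, "received": obj}).encode("utf-8")


def main(port: int = 8000) -> None:
    # Twisted is imported here so handle_message can be used without it.
    from twisted.internet import reactor
    from twisted.internet.protocol import Factory
    from twisted.protocols.basic import LineReceiver

    class JsonLineProtocol(LineReceiver):
        # LineReceiver.MAX_LENGTH defaults to 16384 bytes; large payloads
        # need a higher limit or length-prefixed framing instead.
        def lineReceived(self, line):
            self.sendLine(handle_message(line))

    reactor.listenTCP(port, Factory.forProtocol(JsonLineProtocol))
    reactor.run()  # blocks until the reactor is stopped
```

Calling `main()` starts the server; a client then sends one JSON document per line (terminated by `\r\n`, the LineReceiver default) and reads one line back.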
Neither node.js nor twisted is software that you can use to just quickly spin up a server or client without some study and experimentation. To use Twisted or Node.js properly and confidently, using all their features and goodness, requires a bit of research and work on your part.
There are excellent frameworks like Flask that can be used to build a server that can react to a number of different Ajax calls from a client - you can have a single server be able to respond to several different kinds of requests instead of having a server for each Ajax type.
This is a small library that serializes an object with all its children to JSON and also parses it back to a fully working object:
https://github.com/Toubs/PyJSONSerialization/

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable to the file, and each method open/appends/closes the file? (A problem is that methods sometimes call other methods, so this seems kinda messy?)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, and a "central" method that calls all the other tests outputs all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different from checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code that focuses on the scraping activity and returns the value(s) scraped. Then use aspect-oriented programming for logging, which can simplify the problem greatly, as the aspect holds the reference to the file or logging API.
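The question doesn't say which language is in use, so as an assumption here is what this could look like in Python, where a decorator is a lightweight stand-in for an aspect. The check only computes and returns its summary; the hypothetical `logged` decorator holds the reference to the output stream.

```python
import functools
import io


def logged(stream):
    """Decorator factory: write each check's returned summary to `stream`."""
    def decorate(check):
        @functools.wraps(check)
        def wrapper(*args, **kwargs):
            summary = check(*args, **kwargs)
            stream.write(f"{check.__name__}: {summary}\n")
            return summary
        return wrapper
    return decorate


report = io.StringIO()  # stand-in for an open file


@logged(report)
def check_num_empty_results(results):
    # The check knows nothing about files; it just returns a summary.
    empty = sum(1 for r in results if not r)
    return f"{empty} empty of {len(results)}"


check_num_empty_results(["a", "", "b", ""])
```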
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my head, sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
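A minimal sketch of the first solution, assuming Python: the class stores an already-open stream in a field, so methods that call other methods never reopen the file. The method names follow the question; the check logic itself is made up.

```python
import io


class ScrapeChecker:
    def __init__(self, out):
        self.out = out  # any writable text stream, opened once by the caller

    def check_num_empty_results(self, results):
        empty = sum(1 for r in results if not r)
        self.out.write(f"empty results: {empty}\n")

    def check_weird_characters(self, results):
        weird = [r for r in results if not r.isprintable()]
        self.out.write(f"results with non-printable characters: {len(weird)}\n")

    def check_scrape(self, results):
        # Calls other checks; they all share the same stream.
        self.check_num_empty_results(results)
        self.check_weird_characters(results)
```

In real use the caller would open the file once, for example `with open("report.txt", "a") as f: ScrapeChecker(f).check_scrape(results)`, where `report.txt` is just a placeholder name.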
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de-facto standards for that.
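If the checker happens to be written in Python (an assumption, the question doesn't say), the standard `logging` module plays the same role. A minimal sketch, with a hypothetical logger name and `configure` helper: the file is attached once, and every check just calls the module-level logger.

```python
import logging

log = logging.getLogger("scrape_checker")  # hypothetical logger name


def configure(path: str) -> None:
    """Send all check output to one file; call once at program start."""
    log.setLevel(logging.INFO)
    log.handlers.clear()   # avoid duplicate handlers if called twice
    log.propagate = False  # keep check output out of the root logger
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    log.addHandler(handler)


def check_num_empty_results(results):
    empty = sum(1 for r in results if not r)
    # Use a higher level when something looks wrong, so it stands out.
    level = logging.WARNING if empty else logging.INFO
    log.log(level, "empty results: %d of %d", empty, len(results))
```

No check needs a file parameter or an instance variable; changing the destination or format later only touches `configure`.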