Finding it difficult to process JSON in Delphi

I'm currently working on an application that will get data about your character from the WoW Armory.
Example character: My WoW Character(link)
I will get all the info I want by calling the API provided by Blizzard and I will get the response in JSON.
Example JSON: JSON response for the character above(link)
At first I tried to get the data from the JSON by string manipulation.
That meant splitting my strings, searching for keywords to find positions, and formatting the results into individual pieces of data such as talents and stats.
This worked great at the beginning, but as I wanted more data it became harder: with the many functions I was running on all the strings, it turned into one big blur and it was unclear what I was doing at any given point.
Is there a good way to process my JSON?
I was thinking about getting the JSON and creating an empty class.
While working through the JSON it would generate properties and store the values in there.
But I have no idea if and how it's possible to generate properties dynamically.
In the future I would like to get even more data but first I want to get this up and running before even thinking about that.
Does anyone have any ideas/advice on this?
Thanks in advance.

Your JSON seems rather short and basic. It does not seem that you need special speed or exotic features. http://jsonviewer.stack.hu/#http://eu.battle.net/api/wow/character/moonglade/Xaveak?fields=stats,talents
And while, since Delphi XE2, you do have a stock JSON parser as part of the DBExpress suite, there are still some concerns:
1. It is reported to cause problems with both speed and reliability.
2. It would make your program dependent on the DBExpress package (why, if you are not actually using it for DB access?).
3. It would tie your future to the Enterprise edition of Delphi.
So you'd better try some 3rd-party library.
One of the fastest would probably be the Synopse JSON parser, a side project of their mORMot library. It is generally good code, with great attention to speed, and the developers actively help on their forum.
Another well-known and widely used library is Henri Gourvest's SuperObject.
It once claimed to be the fastest parser for Delphi, and while for the reasons above that is probably no longer true, its speed is quite adequate for most tasks. Henri himself does not actively support his former projects, always moving on to something new, so the scarce documentation (also duplicated in the install package) is all you have officially, plus there is a forum where other users might help you. On the other hand, the main idea behind the SuperObject design was uniformity, and while some tasks could really be documented better, that is mostly due to uncertainty over whether a given task really works in a uniform manner without any special treatment. But usually it does.
PS. Since that documentation is a wiki, you may try to enhance it for future users ;-)
So, coming back to the documentation, you would need:
1) to load the whole JSON into the library. You can do that by creating a TStream with your HTTP library or by providing a string buffer with the data: that is the "Parsing a JSON data structure" section of the manual;
2) to read values like "name" and "level" - described in the "How to read a property value of an object?" section there;
3) to enumerate arrays like "talents" - described in the "Browsing data structure" section; a short sketch follows below.
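For illustration, a rough sketch of those three steps with SuperObject could look like this (the field names come from the armory JSON linked above; the exact accessors may differ slightly between SuperObject versions, so treat it as a starting point rather than finished code):
uses
  SuperObject;

procedure ShowCharacter(const JsonString: string);
var
  Character, TalentGroup: ISuperObject;
begin
  // 1) load the whole JSON: SO() parses the string you downloaded with your HTTP library
  Character := SO(JsonString);

  // 2) read simple values by name; nested objects can be addressed with a path
  Writeln(Character.S['name']);
  Writeln(Character.I['level']);
  Writeln(Character.I['stats.health']);

  // 3) enumerate an array such as "talents"
  for TalentGroup in Character['talents'] do
    Writeln(TalentGroup.B['selected']);
end;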

XE3 has "built-in" JSON support (see the docwiki), but I have heard (I haven't used it myself) that it isn't very well optimised.
So perhaps look at some third-party option like SuperObject.
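For reference, a minimal sketch with the stock classes (the unit is DBXJSON in XE2/XE3 and System.JSON in later versions; depending on your version you may need the TBytes overload of ParseJSONValue rather than the plain string one):
uses
  DBXJSON; // System.JSON in newer Delphi versions

procedure ShowName(const JsonString: string);
var
  Root: TJSONObject;
begin
  Root := TJSONObject.ParseJSONValue(JsonString) as TJSONObject;
  try
    Writeln(Root.GetValue('name').Value); // a simple top-level string field
  finally
    Root.Free;
  end;
end;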

Your task is easily achievable using TSvSerializer, which is included in my delphi-oop library. You only need to declare your model type and deserialize it from your JSON string. Your model (a very simplified, incomplete, and untested version) should look something like this:
type
  TStats = class
  private
    Fhealth: Integer;
  public
    property health: Integer read Fhealth write Fhealth;
    ...
  end;

  TTalent = class
  private
    Ftier: Integer;
  public
    property tier: Integer read Ftier write Ftier;
    ...
  end;

  TMainTalent = class
  private
    Fselected: Boolean;
    Ftalents: TObjectList<TTalent>; // TObjectList lives in Generics.Collections
  public
    property selected: Boolean read Fselected write Fselected;
    property talents: TObjectList<TTalent> read Ftalents write Ftalents;
  end;

  TWowCharacter = class
  private
    FlastModified: Int64;
    Fname: string;
    Fstats: TStats;
    Ftalents: TObjectList<TMainTalent>;
  public
    property lastModified: Int64 read FlastModified write FlastModified;
    property name: string read Fname write Fname;
    ...
    property stats: TStats read Fstats write Fstats;
    property talents: TObjectList<TMainTalent> read Ftalents write Ftalents;
    ...
  end;
Then you just need to do:
uses
  SvSerializer;

var
  LWowCharacter: TWowCharacter;
begin
  LWowCharacter := TWowCharacter.FromJson(YourJsonString);
  ...
You can find my contact email in the delphi-oop project; ask me if something's unclear and I'll try to help you in my spare time.

Related

What makes JSON or YAML syntax able to be sent "through the wire"?

require 'yaml'
class Person
  attr_accessor :name, :age
end
fred = Person.new
fred.name = "Fred Bloggs"
fred.age = 45
laura = Person.new
laura.name = "Laura Smith"
laura.age = 23
test_data = [ fred, laura ]
puts test_data.to_yaml
# YAML output:
- !ruby/object:Person
  age: 45
  name: Fred Bloggs
- !ruby/object:Person
  name: Laura Smith
  age: 23
This is an example of YAML serialization from a book I am reading. I'm having trouble understanding what makes YAML syntax any different from normal Ruby code when it comes to being saved/sent. If it were converted to binary, as in "binary serialization", it would make sense to me, since it could be sent faster. If the point of serialization is to capture the state of an object and turn it into a stream, why not just make it a stream of its original form and syntax?
Concerning the question whether binary serialization would be faster: Yes, it would. If you are concerned about speed, YAML is not the tool you want – you should turn to other tools like Cap'n Proto. YAML has been designed to be human readable.
So why send YAML instead of Ruby code? Well, for starters: Security. If one end sends Ruby code to the other end and the code gets evaluated there, this may easily turn into a vulnerability if an unauthorized third party finds a way to inject a message into this stream; it can lead to arbitrary code execution.
So let's assume we don't actually want to send arbitrary Ruby code. Instead, we want to send a subset which is a single expression which evaluates to the data we want to send. Incidentally, this is how JSON came into existence: As a subset of JavaScript evaluating to an object value.
Since JSON already exists, there is no point in reinventing the wheel by basing some serialization language on Ruby, unless you want to add some feature missing from JSON¹. You would need to write a complete parser and emitter (note that you cannot simply use your Ruby implementation because, as described above, this would let an attacker execute arbitrary code). And JSON is already supported in a wide range of programming languages and ecosystems, making it an ideal data interchange format if you value cross-platform compatibility.
So now the question remains what YAML offers in addition to JSON. Some argue that YAML syntax is far more readable than JSON; YMMV. But there are a number of features in YAML that make it superior to JSON:
YAML has an extensible tagging system for annotating content with a type. An example from your code: !ruby/object:Person. This ensures that if you have a field in your data structure where differently typed values can occur, the receiving side immediately knows which type to use for deserialization. In JSON, you would need type inference (deducing the type from the value of the expression) to make that decision, and that is not always possible².
Data structures may contain cycles (e.g. ring lists, strongly connected graphs). These are difficult to serialize. YAML has built-in anchors and aliases, making it possible to reference a previously started node to denote a cyclic structure. JSON has no such thing. I assume it would be difficult to include this feature in a Ruby-based serialization language without adding features alien to Ruby itself.
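As a small illustration (reusing the Person data from the question), an anchor (&) and an alias (*) let two places in a document point to the very same node; a cycle is simply the case where a node aliases itself or one of its ancestors, which not every parser accepts:
fred: &fred
  name: Fred Bloggs
laura:
  name: Laura Smith
  friend: *fred   # alias: this is the same node as the one anchored above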
Lastly, and that's the answer to the question in the title, YAML has been designed for streaming (JSON to a far lesser extent). A YAML stream can contain any number of documents. This makes it possible to keep a stream open and wait for new data on the receiving side. In contrast, JSON expects the input to end after one object.
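For example, one YAML stream can carry any number of documents separated by ---, so the receiver can keep reading as new documents arrive (the events here are made up):
---
event: login
user: fred
---
event: logout
user: fred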
All of this does not mean that YAML (or JSON) is the one and only way to go. Don't have any cycles or heterogeneous fields in your data? You won't need anchors/aliases or tags! Don't need human-readable serialization? You can go with a binary format! JSON and YAML have been successful because their feature set pretty well mirrors the requirements in a lot of applications. Whether it is the right tool for your application is up to you to decide.
¹ There are surely projects that do exactly that for any number of reasons. The point I want to make is that in general, implementing proper (de-)serialization is an involved task and you usually want to use what's already there.
² You can, of course, extend your JSON schema so that every node has a structure like this:
{
"type": "myType",
"value": ...
}
But that would make the serialization pretty verbose.

Best practice for sending data between groovy and powershell processes?

I'm currently working on a project that has front-end components (Jira) written in Groovy, and back-end processes written in PowerShell. We're using JSON to pass information back and forth. One of the biggest problems we've encountered is coming up with a standardized "template" for the JSON that is being used on both ends. What we have works, but it is a Frankenstein mess.
We are using JSON libraries on both the Groovy and PowerShell sides -- the JSON that is being constructed on either end is legit JSON. We are also encoding it to Base64 to get around interpolation issues we've run into.
My main question is this: what is the best practice for passing data between different tools in JSON? I'm relatively new to it. Is there some sort of standard template we should be adhering to? I develop the Groovy side, my friend the PowerShell side -- I was hoping to come up with something that would minimize problems if someone messed up how the JSON was constructed on either side. Something to check against. Something akin to an XSD.
I was curious if people have dealt with this type of thing, and what the best approach was. As I mentioned before, we have something that works now -- with error handling and whatnot -- but it was very organic... and not standardized at all. I saw mention of JSONP, JSend, etc., but I'm having some difficulty grokking the options.
Tips/guidance appreciated.

What alternatives are there for creating a REST-full web service API based on JSON?

We're creating a web service and we'd like 2 things:
- to be JSON based
- to be REST-full - how much so, we haven't decided
We've already implemented custom APIs but now we'd like to follow some standards, since at some point it gets a little crazy to remember all the rules, all the exceptions, and all the undocumented parts that the creator also forgot.
Are any of you using some standards that you've found useful? Or at least, what are some alternatives?
So far I know of jsonapi and HAL.
These don't seem to be good enough though, since what we'd optimally like is to be able to:
+ define, expose and update entities and relations between them
+ define, expose and invoke operations
+ small numbers of requests are preferable, at least where it "makes sense" (I'll leave that as a blank check)
[EDIT]
Apparently, there's OData too: http://www.odata.org/
Are any of you using some standards that you've found useful? Or at least, what are some alternatives?
Between your own question and the comments, most of the big names have been mentioned. I would just like to also add JSON Hyper-Schema:
"JSON Schema is a JSON based format for defining the structure of JSON data. This document specifies hyperlink- and hypermedia-related keywords of JSON Schema."
http://json-schema.org/latest/json-schema-hypermedia.html
It's an extension to JSON schema and fulfils a very similar role to the others mentioned above.
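As a rough illustration (draft-04 hyper-schema syntax; the resource and fields are invented for the example), such a schema both describes the data and declares its links:
{
  "$schema": "http://json-schema.org/draft-04/hyper-schema#",
  "title": "Character",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "level": { "type": "integer" }
  },
  "links": [
    { "rel": "self", "href": "/characters/{name}" }
  ]
}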
I've been using json-hal for a while and like it a lot, but I'm increasingly drawn to the JSON Schema family of schemas which also handle data model definition and validation. These schemas are also the basis of the excellent Swagger REST API standard:
http://swagger.io/specification/
Hope this helps.

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable for the file, with each method opening/appending/closing the file? (A problem is that methods sometimes call other methods, so this seems kinda messy?)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, and a "central" method that calls all the other tests output all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different from checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code that focuses on the scraping activity and returns the value(s) scraped. Then use aspect-oriented programming for logging, which can simplify the problem greatly, as the aspect holds the reference to the file or logging API.
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my head, sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de-facto standards for that.

How to save and load different types of objects?

During coding I frequently encounter this situation:
I have several objects (ConcreteType1, ConcreteType2, ...) with the same base type AbstractType, which has abstract methods save and load. Each object can (and has to) save some specific kind of data by overriding the save method.
I have a list of AbstractType objects which contains various ConcreteTypeX objects.
I walk the list and call the save method for each object.
At this point I think it's a good OO design. (Or am I wrong?) The problems start when I want to reload the data:
Each object can load its own data, but I have to know the concrete type in advance, so I can instantiate the right ConcreteTypeX and call the load method. So the loading method has to know a great deal about the concrete types. I usually "solved" this problem by writing some kind of marker before calling save, which is used by the loader to determine the right ConcreteTypeX.
I always had/have a bad feeling about this. It feels like some kind of anti-pattern...
Are there better ways?
EDIT:
I'm sorry for the confusion, I re-wrote some of the text.
I'm aware of serialization and perhaps there is some next-to-perfect solution in Java/.NET/yourFavoriteLanguage, but I'm searching for a general solution, which might be better and more "OOP-ish" compared to my concept.
Is this either .NET or Java? If so, why aren't you using serialisation?
If you can't simply use serialization, then I would still definitely pull the object loading logic out of the base class. Your instinct is correct, leading you to correctly identify a code smell. The base class shouldn't need to change when you change or add derived classes.
The problem is, something has to load the data and instantiate those objects. This sounds like a job for the Abstract Factory pattern.
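A minimal sketch of that idea in Delphi (the class and method names are illustrative, not taken from your code): the factory keeps a registry from the marker written by save to the concrete class, so only the factory, not the loading loop, knows about the ConcreteTypeX classes.
uses
  System.SysUtils, System.Classes, System.Generics.Collections;

type
  TAbstractType = class abstract
  public
    procedure Save(Stream: TStream); virtual; abstract;
    procedure Load(Stream: TStream); virtual; abstract;
  end;

  TAbstractTypeClass = class of TAbstractType;

  TTypeFactory = class
  private
    FRegistry: TDictionary<string, TAbstractTypeClass>;
  public
    constructor Create;
    destructor Destroy; override;
    procedure RegisterType(const Marker: string; AClass: TAbstractTypeClass);
    function CreateFromMarker(const Marker: string): TAbstractType;
  end;

constructor TTypeFactory.Create;
begin
  inherited Create;
  FRegistry := TDictionary<string, TAbstractTypeClass>.Create;
end;

destructor TTypeFactory.Destroy;
begin
  FRegistry.Free;
  inherited;
end;

procedure TTypeFactory.RegisterType(const Marker: string; AClass: TAbstractTypeClass);
begin
  FRegistry.AddOrSetValue(Marker, AClass);
end;

function TTypeFactory.CreateFromMarker(const Marker: string): TAbstractType;
begin
  // Only the factory knows which marker maps to which concrete class;
  // the loading code just calls Load on whatever TAbstractType it gets back.
  Result := FRegistry[Marker].Create;
end;
Each concrete class gets registered once at startup (for example, Factory.RegisterType('ConcreteType1', TConcreteType1)), and the loading loop then reads the marker, asks the factory for an instance, and calls Load on it.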
There are better ways, but let's take a step back and look at it conceptually. What are all the objects doing? Loading and saving. When you get the object from memory, you really don't have to care whether it gets its information from a file, a database, or the Windows registry. You just want the object loaded. That's important to remember because later on, your maintenance programmer will look at the LoadFromFile() method and wonder, "Why is it called that since it really doesn't load anything from a file?"
Secondly, you're running into the issue that we all run into, and it's based in dividing work. You want a level that handles getting data from a physical source; you want a level that manipulates this data, and you want a level that displays this data. This is the crux of N-Tier Development. I've linked to an article that discusses your problem in great detail, and details how to create a Data Access Layer to resolve your issue. There are also numerous code projects here and here.
If it's Java you seek, simply substitute 'java' for .NET and search for 'Java N-Tier development'. However, besides syntactical differences, the design structure is the same.