OversizedPayloadException Akka - exception

This is the exception if the payload sent is greater than a threshold set. Is there a way I can know the size of the cluster message(not just the json size) being exchanged and handle this case before passing the cluster message to the sender?

It's not possible to say the exact overhead in general, since the envelope contains the sender and receiver system addresses which in turn contain the host names of the systems, actor system names and the actor paths, additionally it contains the string manifest from the serializer used to serialize the actual payload, all of these may be of different sizes.
For reasonably short strings - ip based hostname, top level actor with short name, single character serializer string manifest the overhead is something around 120 bytes.
Note that this is for the current stable remoting, in the next-gen remoting, "Artery", we have compression implemented for actor refs making those as small as an index into a cache.

Related

Convention so represent string in .properties

If I have two properties:
foo=1
bar=2345
Is there a way to specify that foo is a number and bar is a string?
I assume: bar="2345" would do but I wonder if there's a widely accepted convention
A properties file is a text file which contains data in some standard format, which can be read by the application using it. It is mostly used for configuration of the application and also for internationalization.
As per the wiki document https://en.wikipedia.org/wiki/.properties
Each parameter is stored as a pair of strings, one storing the name of
the parameter (called the key), and the other storing the value.
There is no way to specify / force the value to be number or string only (instead it is always a string). It is majorly the functionality of the framework / application which; while reading the properties file tries to parse the values. If it fails to parse the value (of certain specific type like number) it may fallback to some default value or will simply terminate the program.

Viewstate: 2 different formats?

Trying to scrape a webpage, I hit the necessity to work with ASP.NET's __VIEWSTATE variables. So, ever the optimist, I decided to read up on those variables, and their formats. Even though classified as Open Source by Microsoft, I couldn't find any formal definition:
Everybody agrees the first step to do is decode the string, using a Base64 decoder. Great - that works...
Next - and this is where the confusion sets in:
Roughly 3/4 of the decoders seem to use binary values (characters whose values indicate the the type of field which is follow). Here's an example of such a specification. This format also seems to expect a 'signature' of 0xFF 0x01 as first two bytes.
The rest of the articles (such as this one) describe a format where the fields in the format are separated (or marked) by t< ... >, p< ... >, etc. (this seems to be the case of the page I'm interested in).
Even after looking at over a hundred pages, I didn't find any mention about the existence of two formats.
My questions are: Are there two different formats of __VIEWSTATE variables in use, or am I missing something basic? Is there any formal description of the __VIEWSTATE contents somewhere?
The view state is serialized and deserialized by the
System.Web.UI.LosFormatter class—the LOS stands for limited object
serialization—and is designed to efficiently serialize certain types
of objects into a base-64 encoded string. The LosFormatter can
serialize any type of object that can be serialized by the
BinaryFormatter class, but is built to efficiently serialize objects
of the following types:
Strings
Integers
Booleans
Arrays
ArrayLists
Hashtables
Pairs
Triplets
Everything you need to know about ViewState: Understanding View State

http rest: Disadvantages of using json in query string?

I want to use query string as json, for example: /api/uri?{"key":"value"}, instead of /api/uri?key=value. Advantages, from my point of view, are:
json keep types of parameters, such are booleans, ints, floats and strings. Standard query string treats all parameters as strings.
json has less symbols for deep nested structures, For example, ?{"a":{"b":{"c":[1,2,3]}}} vs ?a[b][c][]=1&a[b][c][]=2&a[b][c][]=3
Easier to build on client side
What disadvantages could be in that case of json usage?
It looks good if it's
/api/uri?{"key":"value"}
as stated in your example, but since it's part of the URL then it gets encoded to:
/api/uri?%3F%7B%22key%22%3A%22value%22%7D
or something similar which makes /api/uri?key=value simpler than the encoded one; in this case both for debugging outbound calls (ie you want to check the actual request via wireshark etc). Also notice that it takes up more characters when encoded to valid url (see browser limitation).
For the case of 'lesser symbols for nested structures', It might be more appropriate to create a new resource for your sub resource where you will handle the filtering through your query parameters;
api/a?{"b":1,}
to
api/b?bfield1=1
api/a?aBfield1=1 // or something similar
Lastly for the 'easier to build in client side', i think it depends on what you use to create your client, usually query params are represented as maps so it is still simple.
Also if you need a collection for a single param then:
/uri/resource?param1=value1,value2,value3

rest api response format

Should I treat all api response as "resource" and return a JSON object or simple array would be appropriate as well ?
for instance are all of the below responses valid?
GET /rest/someresource should return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
GET /rest/someresource?id>0 search for ids bigger than zero and return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
Collection Resources
It is acceptable to return an array of resources - either a list of ids, or object structures - such a thing is commonly known as a 'collection' resource.
See http://51elliot.blogspot.com.au/2014/06/rest-api-best-practices-4-collections.html for an examination of resources and collections.
While not required by REST, it's common to use a plural noun to refer to a collection resource - e.g.
/rest/someresources
REST also requires the use of defined media types, and there are a couple available to assist with collections, e.g.:
Collection+json
Provides a structure with meta data around a list of items wherein you define the structure of each item as your resource
HAL
provides a structure with embedded collections and embedded resources
And many more
All provide a defined structure for including hypermedia links for your resource, or each resource in your collection - and if you are doing REST this is one of the things that the spec says you MUST do (even though many people don't).
Your Proposed Json Structures
Some more specific comments on your proposed json structures:
Option 2 is not valid json. Consider:
{{id:1},{id:2}}
A json object must have a name:value pair, e.g.
{somename:{id:1},someothername:{id:2}}
would be valid - but not very useful!
Also - strictly for json, the name should be enclosed in quotes. the value may be enclosed in quotes if it is a string.
So if you don't want to use a commonly used media type as referenced above, your options are 1 or 3. which should be:
[{"id":1},{"id":2}]
[1, 2]
Both are valid, however option 1 will give you more flexibility to add more properties to each element of the array if you decide in the future you would like to return more than an id. e.g. at some point in the future you might decide to return:
[{"id":1,"name":"fred"},{"id":2,"name":"wilma"}]
Option 3 will only ever be able to return a list of ids.
So personally I would go with option 1.
Depends on how RESTful you're aiming to be.
In addition to what #Chris Simon said, I'll add that if the server would only return IDs at GET /rest/someresource, the client would have to repeatedly call something like GET /rest/someresource/{id} in order to obtain data (it can display on the UI), right? This in turn would just increase the load on the server. If the id would be enough, you can probably get away with the proposed solution.
Also, once you decide you'd better be consistent.
Given that the 2nd option is not even valid, and the last is pretty limiting, I'd also go for the first option, JSON.
Just to make it clear we are talking about different representations of the same resource here:
By GET /rest/someresource both [{id:1},{id:2}] and [1,2] are valid responses, but you should make clear which one you want to see, e.g. with the prefer header. So by Prefer: return=minimal you would return [1,2] and if the header is not present, then [{id:1},{id:2}]. Just make sure that the prefer header is registered by the vary header, or you will have caching troubles.
By GET /rest/someresource?id>0 you filter your collection. So either the /rest/someresource?id>0 URI identifies a different filtered collection resource or it identifies the same collection resource, but with the filter query string your client indicates that it is waiting for a filtered representation of the resource and not the full representation. You can use the same by the minimal representation if you don't want to use the prefer header: GET /rest/someresource?return=minimal.
Note that if you want your client to query again, then you should send them hyperlinks in your response. The REST client must get the URIs (or URI templates) from these hyperlinks and it should not start to build URIs on its own.

Gemfire pdxInstance datatype

I am writing pdxInstances to GemFire using the sequence: rabbitmq => springxd => gemfire.
If I put this JSON into rabbitmq {'ID':11,'value':5}, value appears as a byte value in GemFire. If I put {'ID':11,'value':500}, value appears as a word and if I put {'ID':11,'value':50000} it appears as an Integer.
A problem arises when I query data from GemFire and order them. For example, if I use a query such as select * from /my_region order by value it fails, saying it cannot compare a byte with a word (or byte with an integer).
Is there any way to declare the data type in JSON? Or any other method to get rid of this problem?
To add a bit of insight into this problem... in reviewing GemFire/Geode source code, it would seem it is not possible to configure the desired value type and override GemFire/Geode's default behavior, which can be seen in JSONFormatter.setNumberField(..).
I will not explain how GemFire/Geode involves the JSONFormatter during a Region.put(key, value) operation as it is rather involved and beyond the scope of this discussion.
However, one could argue that the problem is not necessarily with the JSONFormatter class, since storing a numeric value in a byte is more efficient than storing the value in an integer, especially when the value would indeed fit into a byte. Therefore, the problem is really that the Comparator used in the Query processor should be able to compare numeric values in the same type family (byte, short, int, long), upcasting where appropriate.
If you feel so inclined, feel free to file a JIRA ticket in the Apache Geode JIRA repository at https://issues.apache.org/jira/browse/GEODE-72?jql=project%20%3D%20GEODE
Note, Apache Geode is the open source "core" of Pivotal GemFire now. See the Apache Geode website for more details.
Cheers!
Your best bet would be to take care of this with a custom module or a groovy script. You can either write a custom module in Java to do the conversion and then upload the custom module into SpringXD, then you could reference your custom module like any other processor. Or you could write a script in Groovy and pass the incoming data through a transform processor.
http://docs.spring.io/spring-xd/docs/current/reference/html/#processors
The actual conversion probably won't be too tricky, but will vary depending on which method you use. The stream creation would look something like this when you're done.
stream create --name myRabbitStream --definition "rabbit | my-custom-module | gemfire-json-server etc....."
stream create --name myRabbitStream --definition "rabbit | transform --script=file:/transform.groovy | gemfire-json-server etc...."
It seems like you have your source and sink modules set up just fine, so all you need to do is get your processor module setup to do the conversion and you should be all set.