How can I map an NSDictionary to an NSManagedObject without writing to Core Data?

I use MagicalRecord, and am having a little trouble with it.
A server sends me JSON, and I need to map it to the existing NSManagedObject as quickly as possible and hand it to a block.
NSManagedObjectContext *localContext = [NSManagedObjectContext MR_contextForCurrentThread];
[Review MR_importFromObject:dictionary inContext:localContext];
[localContext MR_saveOnlySelfAndWait];
And after:
[[CacheOperation sharedOperation]saveBestRateProductByDict:reviewDict];
Review *review = [Review MR_findFirstByAttribute:@"id" withValue:[reviewDict objectForKey:@"id"]];
But if I have many objects, it takes a lot of time.
How can I map an NSDictionary to an NSManagedObject without writing to Core Data?

I guess MR_importFromObject checks for the existence of the object in order to get insert-or-update behavior.
That's great for most cases (and it was made for the 90%: http://www.cimgf.com/2012/05/29/importing-data-made-easy/).
But you are in the other 10% (me too, if that is any consolation).
This behavior means there is one request to find the object and another to update it. Multiply that by the number of objects, and the cost can be huge.
You can refer to a good Apple doc (the section "Implementing Find-or-Create Efficiently"):
https://developer.apple.com/library/ios/documentation/Cocoa/Conceptual/CoreData/Articles/cdImporting.html
One trick is to make only one fetch request for all the objects you want to update, and a single save for all of them. It is worse in memory usage but better in I/O, which should speed you up (see the sketch below).
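A rough sketch of that batched find-or-create pattern, using plain Core Data rather than MagicalRecord's importer. Only localContext and the "id" attribute come from the question; reviewDicts (the parsed JSON array) and everything else are assumptions for illustration:

// One fetch for all candidate objects instead of one fetch per object.
NSSortDescriptor *byId = [NSSortDescriptor sortDescriptorWithKey:@"id" ascending:YES];
NSArray *sortedDicts = [reviewDicts sortedArrayUsingDescriptors:@[byId]];
NSArray *ids = [sortedDicts valueForKey:@"id"];

NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Review"];
request.predicate = [NSPredicate predicateWithFormat:@"%K IN %@", @"id", ids];
request.sortDescriptors = @[byId];
NSArray *existing = [localContext executeFetchRequest:request error:NULL];

// Walk both id-sorted lists in parallel: update matches, insert the rest.
NSUInteger matchIndex = 0;
for (NSDictionary *dict in sortedDicts) {
    NSManagedObject *review;
    if (matchIndex < existing.count &&
        [[existing[matchIndex] valueForKey:@"id"] isEqual:dict[@"id"]]) {
        review = existing[matchIndex++];                       // update the existing object
    } else {
        review = [NSEntityDescription insertNewObjectForEntityForName:@"Review"
                                               inManagedObjectContext:localContext];
    }
    [review setValue:dict[@"id"] forKey:@"id"];                // map the remaining attributes here
}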
We went another way, using TMCache and storing raw JSON for objects that change frequently.
Hope this helps.

Related

Splitting a feature collection by system index in Google Earth Engine?

I am trying to export a large feature collection from GEE. I realize that the Python API allows for this more easily than the JavaScript API does, but given a time constraint on my research, I'd like to see if I can extract the feature collection in pieces and then append the separate CSV files once exported.
I tried to use a filtering function to perform the task, one that I've seen used before with image collections. Here is a mini example of what I am trying to do.
Given a feature collection of 10 spatial points called "points", I tried to create a new feature collection that includes only the first five points:
var points_chunk1 = points.filter(ee.Filter.rangeContains('system:index', 0, 5));
When I execute this function, I receive the following error: "An internal server error has occurred"
I am not sure why this code is not executing as expected. If you know more than I do about this issue, please advise on alternative approaches to splitting my sample, or on where the error in my code lurks.
Many thanks!
system:index is actually an ID given by GEE to each feature, and it is not supposed to be used like an index in an array. I think the JavaScript API should be enough to export a large FeatureCollection, but there is a way to do what you want without relying on system:index, as that might not be consistent.
First, it would be a good idea to know the number of features you are dealing with, because calling size().getInfo() on large feature collections can freeze the UI and sometimes make the tab unresponsive. Here I have defined chunk and collectionSize. They must be defined client-side, because we want to call Export within the loop, which is not possible in server-side loops. Within the loop, you simply create a subset of features starting at different offsets by converting the collection to a list and wrapping that slice back into a FeatureCollection.
var chunk = 1000;
var collectionSize = 10000;
for (var i = 0; i < collectionSize; i = i + chunk) {
  var subset = ee.FeatureCollection(fc.toList(chunk, i));
  Export.table.toAsset(subset, "description", "/asset/id");
}

Performance of /ui2/cl_json serialization

In the past I used this to return any data structure via SAP RFC:
json = /ui2/cl_json=>serialize( data = <lt_result>
pretty_name = /ui2/cl_json=>pretty_mode-low_case ).
This works very well if <lt_result> is small, but for bigger data sets this is slow.
How can I return any data structure via a generic ABAP RFC function module? I use PyRFC, but AFAIK this should not matter much for this question.
This may perform better:
DATA(lo_json_writer) = cl_sxml_string_writer=>create( type = if_sxml=>co_xt_json ).
CALL TRANSFORMATION id
SOURCE result = <lt_result>
RESULT XML lo_json_writer.
ev_json_data = lo_json_writer->get_output( ). " your exporting parameter
Taken from the official documentation.
If performance is most important for you, then /ui2/cl_json is the wrong choice, even though it is pure ABAP code with SAP_BASIS 7.00-compatible syntax.
CALL TRANSFORMATION id is better with respect to performance. This is also written in my blog. BTW: I am the author of /ui2/cl_json.
But when it comes to flexibility, comfort, supported data types and control over the desired format, there is currently no better solution than /ui2/cl_json.
Potentially, one could get a better, specialized implementation using CALL TRANSFORMATION with your own XSLT transformation, but it would already be slower than the id transformation and would cost more coding effort.
There is still potential to make /ui2/cl_json faster, by dropping support for lower releases (below 7.40) and using the built-in SXML parser for processing the JSON, but that would be some work to do, and I do not have the time or an actual request for that.
@Sandra Rossi: I would be happy to apply any performance suggestions for /ui2/cl_json, so if you have concrete examples, please send them to me, here or in the blog. But please take into consideration that, at the moment, I need to stay within SAP_BASIS 7.00 limits.

Apigility GET collection returns only 10 results when content negotiation is set to JSON

This issue has been bugging me for some time now. To test it, I just installed a fresh Apigility, set up the DB (PDO: mysql) and added a DB-Connected service. The table contains 40+ records. When I make a GET collection request, the response looks OK (with the default HAL content negotiation). Then I change the content negotiation to JSON. Now when I make a GET collection request, my response contains only 10 elements.
So my question is: where do I set/change this limit?
You can set the page size manually, like so:
$paginator = $this->getAlbumTable()->fetchAll(true);
// set the current page to what has been passed in query string, or to 1 if none set
$paginator->setCurrentPageNumber((int) $this->params()->fromQuery('page', 1));
// set the number of items per page to 10
$paginator->setItemCountPerPage(10);
http://framework.zend.com/manual/current/en/tutorials/tutorial.pagination.html
Could you please post the page_size and total_items part at the end of the JSON output?
It looks like this:
"page_count": 140002,
"page_size": 25,
"total_items": 3500035,
"page": 1
This is not an ideal fix, because it requires you to go into the source code rather than using the page size given in the UI.
The collection class that is auto-generated for you by the DB-Connected style derives from Zend\Paginator\Paginator. This class defines the static protected member $defaultItemCountPerPage, which defaults to 10. That's why you're only getting 10 results. If you open up the auto-generated collection class for your entity and add protected static $defaultItemCountPerPage = 100; to the otherwise empty class (as in the sketch below), you will see that you now get up to 100 results in the response. You can look at other Paginator class variables and methods that you could override in your derived class to get your desired behavior.
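A sketch of what that override looks like in the otherwise empty generated class; the namespace and class name here are assumptions for illustration, yours will match your service:

<?php
namespace MyApi\V1\Rest\Status;

class StatusCollection extends \Zend\Paginator\Paginator
{
    // Overrides Zend\Paginator\Paginator::$defaultItemCountPerPage (which defaults to 10).
    protected static $defaultItemCountPerPage = 100;
}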
This is not an ideal solution. I'd prefer that the generated code automatically used the same configured page size that the HalJson strategy uses. Maybe I'll contribute a PR to change that. Or maybe I'll just use the HalJson approach; it does seem like the better way to go. You should have some limit on how much data you load from the DB at a time, so you don't end up with an overly long-running query or an overly large collection of data to deal with. And whatever limit you set, what do you do when you hit it? With the simple JSON method, you can't ever get "page 2" of data. So, if you are going to work with a sizeable amount of data, it might be better to use HalJson and then have some logic on the client side to grab pages of data as needed. The returned JSON structure is a little more complicated, but not terribly so.
I'm probably in the same spot you are: I'm trying to build a simple little API to play with while keeping everything simple, so I didn't want the client to have to deal with the other stuff in HalJson. But it's probably better to deal with that complexity and have a smooth way to page through data if you're going to use this with a real set of data. At least, that's the pep talk I'm giving myself right now. :-)

How to best validate JSON on the server-side

When handling POST, PUT, and PATCH requests on the server-side, we often need to process some JSON to perform the requests.
It is obvious that we need to validate these JSONs (e.g. structure, permitted/expected keys, and value types) in some way, and I can see at least two ways:
Upon receiving the JSON, validate the JSON upfront as it is, before doing anything with it to complete the request.
Take the JSON as it is, start processing it (e.g. access its various key-values) and try to validate it on the go while performing the business logic, possibly using some exception handling to deal with invalid data.
The 1st approach seems more robust than the 2nd, but probably more expensive (in time), because every request will be validated (and hopefully most of them are valid, so the validation is somewhat redundant).
The 2nd approach may save the compulsory validation on valid requests, but mixing the checks into the business logic might be buggy or even risky.
Which of the two above is better? Or, is there yet a better way?
What you are describing with POST, PUT, and PATCH sounds like you are implementing a REST API. Depending on your back-end platform, you can use libraries that will map JSON to objects, which is very powerful and performs that validation for you (see the sketch below). In Java, you can use Jersey, Spring, or Jackson. If you are using .NET, you can use Json.NET.
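For instance, here is a minimal Jackson sketch of that mapping step; the request class and its fields are made-up names for illustration, not anything from the question:

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonMappingExample {
    // Hypothetical request payload; the field names are assumptions.
    public static class CreateCustomerRequest {
        public String name;
        public int age;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                // Reject keys the model does not declare instead of silently ignoring them.
                .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, true);

        String json = "{\"name\":\"Alice\",\"age\":30}";
        // Malformed JSON or wrong value types throw here, before any business logic runs.
        CreateCustomerRequest request = mapper.readValue(json, CreateCustomerRequest.class);
        System.out.println(request.name + " is " + request.age);
    }
}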
If efficiency is your goal and you want to validate every single request, it would be ideal if you could also validate on the front end; if you are using JavaScript, you can use json2.js.
Regarding the comparison of your two methods, here is a pros and cons list.
Method #1: Upon Request
Pros
The business logic's integrity is maintained. As you mentioned, trying to validate while processing business logic could result in data being flagged invalid that is actually valid (and vice versa), or the validation could inadvertently impact the business logic negatively.
As Norbert mentioned, catching the errors beforehand will improve efficiency. The logical question this poses is: why spend the time processing if there are errors in the first place?
The code will be cleaner and easier to read. Having validation and business logic separated will result in cleaner, easier to read and maintain code.
Cons
It could result in redundant processing meaning longer computing time.
Method #2: Validation on the Go
Pros
It is theoretically efficient, saving processing and compute time by doing both at the same time.
Cons
In reality, the process time that is saved is likely negligible (as mentioned by Norbert). You are still doing the validation check either way. In addition, processing time is wasted if an error was found.
The data integrity can be compromised. It could be possible that the JSON becomes corrupt when processed this way.
The code is not as clear. When reading the business logic, it may not be as apparent what is happening because validation logic is mixed in.
What it really boils down to is accuracy vs. speed. They generally have an inverse relationship. As you become more accurate and validate your JSON more thoroughly, you may have to compromise some on speed. This is really only noticeable for large data sets, as computers are really fast these days. It is up to you to decide what is more important, given how accurate you think your data will be when you receive it and whether that extra second or so is crucial. In some cases it does matter (i.e. with stock market and healthcare applications, milliseconds matter) and both are highly important. It is in those cases that, as you increase one, for example accuracy, you may have to recover speed by getting a more performant machine.
Hope this helps.
The first approach is more robust, but does not have to be noticeably more expensive. It becomes far less expensive when you are able to abort the parsing process as soon as errors are found: your business logic usually takes >90% of the resources in a process, so if you have an error rate of 10%, you are already resource neutral. If you optimize the validation process so that the validations from the business process are performed upfront, an even lower error rate (like 1 in 20 to 1 in 100) is enough to stay resource neutral.
For an example of an implementation that assumes upfront data validation, look at GSON (https://code.google.com/p/google-gson/).
GSON works as follows: every part of the JSON can be mapped to an object. This object is typed or contains typed data.
Sample object (Java used as the example language):
public class SomeInnerDataFromJSON {
    String name;
    String address;
    int housenumber;
    String buildingType;

    // Getters and setters
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    // etc.
}
The data parsed by GSON, by using the model provided, is already type-checked.
This is the first point where your code can abort.
After this exit point, assuming the data conformed to the model, you can validate whether the data is within certain limits. You can also write that into the model.
Assume for this example that buildingType is restricted to a list:
Single family house
Multi family house
Apartment
You can check the data during parsing by creating a setter which checks the data, or you can check it after parsing, in a first pass of your business rule application (see the sketch below). The benefit of checking the data first is that your later code will need less exception handling, so you get less and easier-to-understand code.
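As a rough illustration of such a validating setter (a sketch only; the class name is made up and the allowed values are the three listed above):

import java.util.Arrays;
import java.util.List;

public class BuildingTypeExample {
    private static final List<String> ALLOWED =
            Arrays.asList("Single family house", "Multi family house", "Apartment");

    private String buildingType;

    // Rejects any value outside the assumed buildingType list.
    public void setBuildingType(String buildingType) {
        if (!ALLOWED.contains(buildingType)) {
            throw new IllegalArgumentException("Unknown buildingType: " + buildingType);
        }
        this.buildingType = buildingType;
    }

    public String getBuildingType() { return buildingType; }
}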
I would definitely go for validation before processing.
Let's say you receive some JSON data with 10 variables, of which you expect:
the first 5 variables to be of type string
6 and 7 are supposed to be integers
8, 9 and 10 are supposed to be arrays
You can do a quick variable type validation before you start processing any of this data and return a validation error response if one of the ten fails.
foreach ($data as $varName => $varValue) {
    $varType = gettype($varValue);
    if (!$this->isTypeValid($varName, $varType)) {
        // return validation error
    }
}
// continue processing
Think of the scenario where you are directly processing the data and then the 10th value turns out to be of an invalid type. The processing of the previous 9 variables was a waste of resources, since you end up returning a validation error response anyway. On top of that, you have to roll back any changes already persisted to your storage.
I only use the variable type in my example, but I would suggest full validation (length, max/min values, etc.) of all variables before processing any of them.
In general, the first option would be the way to go. The only reason you might need to think about the second option is if you are dealing with JSON data that is tens of MB or more.
In other words, only if you are trying to stream JSON and process it on the fly do you need to think about the second option.
Assuming that you are dealing with a few hundred KB at most per JSON, you can just go for option one.
Here are some steps you could follow:
Go for a JSON parser like GSON that would just convert your entire JSON input into the corresponding Java domain model object. (If GSON doesn't throw an exception, you can be sure that the JSON is perfectly valid.)
Of course, the objects which were constructed using GSON in step 1 may not be in a functionally valid state. For example, functional checks like mandatory fields and limit checks would still have to be done. For this, you could define a validateState method which recursively validates the state of the object itself and its child objects.
Here is an example of a validateState method:
public void validateState() {
    // Assume this validateState is part of the Customer class.
    if (age < 12 || age > 150)
        throw new IllegalArgumentException("Age should be in the range 12 to 150");
    if (age < 18 && (guardianId == null || guardianId.trim().equals("")))
        throw new IllegalArgumentException("Guardian id is mandatory for minors");
    for (Account a : getAccounts()) {
        a.validateState(); // Throws appropriate exceptions if any inconsistency in state
    }
}
The answer depends entirely on your use case.
If you expect all calls to originate from trusted clients, then the upfront schema validation should be implemented so that it is activated only when you set a debug flag.
However, if your server delivers public api services then you should validate the calls upfront. This isn't just a performance issue - your server will likely be scrutinized for security vulnerabilities by your customers, hackers, rivals, etc.
If your server delivers private api services to non-trusted clients (e.g., in a closed network setup where it has to integrate with systems from 3rd party developers), then you should at least run upfront those checks that will save you from getting blamed for someone else's goofs.
It really depends on your requirements. But in general I'd always go for #1.
A few considerations:
For consistency I'd use method #1; for performance, #2. However, when using #2 you have to take into account that rolling back in case of invalid input may become complicated in the future as the logic changes.
JSON validation should not take that long. In Python you can use ujson for parsing JSON strings, which is an ultra-fast C implementation of the json Python module.
For validation, I use the jsonschema Python module, which makes JSON validation easy.
Another approach:
If you use jsonschema, you can validate the JSON request in steps. I'd perform an initial validation of the most common/important parts of the JSON structure, and validate the remaining parts along the business logic path. This allows you to write simpler, and therefore more lightweight, JSON schemas (a sketch follows below).
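A minimal sketch of that stepwise idea with the jsonschema module; the two schemas, the field names and the create_user action are made up purely for illustration:

import json
from jsonschema import validate, ValidationError

# Lightweight schema for the parts every request must have (checked upfront).
ENVELOPE_SCHEMA = {
    "type": "object",
    "required": ["action", "payload"],
    "properties": {
        "action": {"type": "string"},
        "payload": {"type": "object"},
    },
}

# Stricter schema for one specific action, applied later along the business logic path.
CREATE_USER_SCHEMA = {
    "type": "object",
    "required": ["name", "age"],
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "age": {"type": "integer", "minimum": 0},
    },
}

def handle_request(raw_body: str) -> str:
    data = json.loads(raw_body)          # raises ValueError on malformed JSON
    validate(data, ENVELOPE_SCHEMA)      # raises ValidationError on a bad envelope
    if data["action"] == "create_user":
        validate(data["payload"], CREATE_USER_SCHEMA)
        # ... business logic for user creation goes here ...
    return "ok"

try:
    handle_request('{"action": "create_user", "payload": {"name": "Ada", "age": 36}}')
except (ValueError, ValidationError) as err:
    print("rejected:", err)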
The final decision:
If (and only if) this decision is critical, I'd implement both solutions, time-profile them under both correct and wrong input conditions, and weight the results by the wrong-input frequency. Therefore:
1c = average time spent with method 1 on correct input
1w = average time spent with method 1 on wrong input
2c = average time spent with method 2 on correct input
2w = average time spent with method 2 on wrong input
CR = correct input rate (or frequency)
WR = wrong input rate (or frequency)
if (1c * CR) + (1w * WR) <= (2c * CR) + (2w * WR):
    choose method 1
else:
    choose method 2

Set UITableViewCell data from remote JSON file

I have a UITableView representing a list of cities (100 cities).
For each city I want to call a specific remote (URL) JSON to get that city's weather information and populate the response data into the city's cell in the UITableView.
When I run the application, I want to see my table as fast as possible, so I don't want to wait for all the JSON responses. I want the information fetched asynchronously (when a specific JSON is loaded, set its information on the corresponding city cell in the UITableView).
Note: It is important for me to call separate remote JSON files.
Which technique is best for this task?
I would start with the following approach:
Create a data structure to hold city information, including:
path to your data service,
service call "state" (idle, waiting, completed, error),
weather information (from JSON returned by service call)
When you first show the table, you will want to:
initialize your array (of the aforementioned data structure),
initiate each service call asynchronously,
set each row (city) state to waiting (a sketch of such a structure follows this list).
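A minimal sketch of such a per-city record and the asynchronous kick-off. The class, property and array names are assumptions, and NSURLSession with a completion block is used purely to keep the sketch short (the answer itself describes a delegate-based approach):

typedef NS_ENUM(NSInteger, CityWeatherState) {
    CityWeatherStateIdle,
    CityWeatherStateWaiting,
    CityWeatherStateCompleted,
    CityWeatherStateError
};

// One entry per row, held in an NSMutableArray owned by the view controller.
@interface CityWeather : NSObject
@property (nonatomic, copy) NSString *cityName;
@property (nonatomic, strong) NSURL *serviceURL;         // path to the weather service
@property (nonatomic, assign) CityWeatherState state;    // service call "state"
@property (nonatomic, copy) NSDictionary *weatherInfo;   // parsed JSON, nil until loaded
@end

@implementation CityWeather
@end

// In the view controller (assumes self.cities and self.tableView properties):
- (void)loadWeatherForCityAtIndex:(NSUInteger)index {
    CityWeather *city = self.cities[index];
    city.state = CityWeatherStateWaiting;
    NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:city.serviceURL
            completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
        dispatch_async(dispatch_get_main_queue(), ^{
            if (error != nil || data == nil) {
                city.state = CityWeatherStateError;
            } else {
                city.weatherInfo = [NSJSONSerialization JSONObjectWithData:data options:0 error:NULL];
                city.state = CityWeatherStateCompleted;
            }
            // Refresh just this row instead of the whole table.
            NSIndexPath *path = [NSIndexPath indexPathForRow:(NSInteger)index inSection:0];
            [self.tableView reloadRowsAtIndexPaths:@[path] withRowAnimation:UITableViewRowAnimationNone];
        });
    }];
    [task resume];
}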
You will also probably want to return a custom UITableViewCell with the city name (if you already have it) and a spinning activity indicator. This is your best option for a fast load time (not waiting for services to complete) while giving some visual indication that the data is loading.
Each service call should use the ViewController as its delegate; you will need a key field so that when the services return, they can identify with which row/city they are associated.
As each service completes and calls the delegate, it will send the data to the ViewController, which (in turn) will update the array and initiate a UITableView update.
The UITableView update is, in my opinion, the most difficult part. Typically cells are drawn or updated when they become visible; the table pre-fetches all visible cells' geometry and then queries the actual contents when it's ready to draw each cell; as a result, your strategy for updating cells will depend on how your table is used.
If your cell geometry changes, you will most likely need to redraw your entire table; I shudder to think what 50 simultaneous UITableView redraws will do to your app, so you might need to set a time threshold to "chunk" updates and handle drawing more intelligently.
[theTableView reloadData] will cause the entire table to be re-queried and redrawn.
If your cell geometry does not change, you can try to be more surgical and update only the visible cells (the non-visible ones aren't an issue, since their data will be queried when they become visible).
[theTableView visibleCells] returns an array of visible cells; when your service call returns, you could update the data and then search the array to see if the cell in question is visible; if it is, you will probably need to send the specific UITableViewCell a setNeedsDisplay message (see the sketch below).
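A rough sketch of that surgical update; indexPathForCity: and the cityKey parameter are hypothetical helpers, not something the answer defines:

- (void)serviceDidReturnWeatherForCity:(NSString *)cityKey {
    // Map the returned key back to its row (hypothetical helper).
    NSIndexPath *indexPath = [self indexPathForCity:cityKey];
    NSArray *visiblePaths = [self.tableView indexPathsForVisibleRows];
    if ([visiblePaths containsObject:indexPath]) {
        // Only touch the cell if it is currently on screen.
        UITableViewCell *cell = [self.tableView cellForRowAtIndexPath:indexPath];
        [cell setNeedsDisplay];   // or update the cell's labels directly here
    }
    // Off-screen rows need nothing: their data is queried when they scroll into view.
}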
There is a good explanation of setNeedsDisplay, setNeedsLayout, and reloadData at http://iosdevelopertips.com/cocoa/understanding-reload-repaint-and-re-layout-for-uitableview.html.
There is a relevant SO question at How to refresh UITableViewCell?
Lastly, you will probably want to implement some updating logic in the service delegate error routine, just so you don't create endlessly spinning activity indicators.
I do this now while searching multiple servers. I use Core Data, but you can use an NSMutableArray to accumulate your JSON responses.
Every time you finish receiving data from one of your servers (for example, when connectionDidFinishLoading executes), take the JSON data object and add it to an NSMutableArray (let's call it weatherResults) using the addObject method. You may want to convert the JSON to an NSDictionary before adding it to the mutable array weatherResults.
Assuming your dataSource delegate methods refer to what is in the weatherResults NSMutableArray (for example, getting the number of rows from the size of the array using [weatherResults count]), you can do the following:
After inserting the object into the array, you can simply call reloadData in the dataSource controller. You will see the table update as each new JSON result arrives. The results will append to the bottom of the table as they come in. If you want to sort the NSMutableArray each time a JSON result arrives, you can do that too (see the sketch below).
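A minimal sketch of that flow, assuming an NSURLConnection delegate with a receivedData buffer, a weatherResults NSMutableArray property, and a "city" key in each JSON response (all of these names are assumptions):

- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
    NSDictionary *weather = [NSJSONSerialization JSONObjectWithData:self.receivedData
                                                            options:0
                                                              error:NULL];
    if (weather != nil) {
        [self.weatherResults addObject:weather];   // accumulate the parsed response
        // Optional: keep the array sorted each time a result arrives.
        [self.weatherResults sortUsingDescriptors:
            @[[NSSortDescriptor sortDescriptorWithKey:@"city" ascending:YES]]];
        [self.tableView reloadData];               // table redraws with the new data
    }
}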
I do this and the time it takes to resort and reload the table is insignificant on my iPad. If you do not resort, it should be even faster.
By the way, in this explanation, I assume that the JSON response contains all of the information that you need to fill in your table cell. That may not be the case. If it's not, you will have to correlate the response with other information you have, such as a list of cities that your program is presenting.