Edit json object by lua script in redis - json

I want edit my json object before back from the Redis server,
In my Redis server I have 4 keys:
user:1 {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:post:1 {"PostId":"1","Content":"Test content"}
user:1:post:2 {"PostId":"2","Content":"Test content"}
user:1:post:3 {"PostId":"3","Content":"Test content"}
I want to get this context by lua script,How ? :
{"Id":"1","Name":"Gholzam","Posts":"[{"PostId":"1","Content":"Test
content"},{"PostId":"1","Content":"Test
content"},{"PostId":"1","Content":"Test content"}]}

The choice of client here is largely irrelevant; the important thing to do is: figure out the data storage. You say you have 4 keys - but it is not obvious to me how we we know, given user:1, what the posts are. Common approaches there include:
have a set called user:1:posts (or something similar) which contains either the full keys (user:1:post:1, etc) or the relative keys (1, etc)
have a hash called user:1:posts (or something similar) which contains the posts keyed by their id
I'd be tempted to use the latter approach, as it is more direct - so I might have:
user:1, a string with contents {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:posts, a hash with 3 pairs:
key 1 with value {"PostId":"1","Content":"Test content"}
key 2 with value {"PostId":"2","Content":"Test content"}
key 3 with value {"PostId":"3","Content":"Test content"}
Then you can use hgetall or hvals to get the posts easily.
The second part is how to manipulate json at the server. The good news here is that redis provides access to json tools inside lua via cjson.
I am an expert in neither cjson nor lua; however, frankly my advice is: don't do this. IMO, redis works best if you let it focus on what it is great at: storage and retrieval. You probably can bend it to your whim, but I would be very tempted to do any json manipulation outside of redis.

Related

Is Athena a viable/sensible choice for infrequently searching *unstructured* JSON?

I'm recording request and response headers and bodies for all traffic to our API and from our API to 3rd party services into S3 as into tiny objects.
I want to be able to query this data infrequently. For example (pseudo-code):
select $.cars[0].color from "objects" where object_path in (....);
Other info:
Many "objects" in S3 won't have a valid path to $.cars[0].color (it's just one example).
I hope to not use Glue.
Cost is important - this is something that will be queried very infrequently. Configuring some ElasticSearch/similar solution is terribly out of budget for the use case.
I hope to not define my own set of schemas (this is simply not feasible).
Athena says it can search unstructured JSON. I'm having trouble creating a proof-of-concept to show this is true.
Is Athena right fr me? Am I missing a better solution?
I think Athena will work for your case.
Athena handles missing properties in JSON objects. For example, if you define the cars column as array<struct<color:string>>:
the property can be missing ⇒ SELECT cars … will be NULL
it can be an empty list ⇒ SELECT cars[1] … (Athena arrays start at 1) will result in an error, but element_at(cars, 1) and try(cars[1]) will return NULL
the object may be missing the color property ⇒ SELECT cars[1].color … will be NULL
for completely free-form JSON define the column as string and use the JSON functions to query it.
Glue is not necessary. Create the table manually, from your application, or with CloudFormation, and configure it to use partition projection and you will not have to think about using Glue crawlers at all.
Athena doesn't cost anything when you don't use it, and if you will query only infrequently this is key. Make sure to compress your data, and partition it in a way that supports your query patterns (e.g. by date or month if you most often will query recent data).
Not sure what you mean by having to define your "own set of schemas", so perhaps you can clarify that part?

What's the difference between json data being encoded or not

What's the purpose (not what it becomes) of doing json_encode on this before I am putting into the database
rating: {cleanliness: 3, publicFacility: 1, roomFacility: 2, security: 2}
to become this
rating: "{"cleanliness":3,"publicFacility":1,"roomFacility":2,"security":2}"
I see no point of doing this cause I need to json_decode it again before serving it back... can anybody clear me out?
Do not store json encoded data in the database. You mitigate the whole point of a relational database this way and make searching for values an expensive task. I see in your sample the attributes cleanliness, publicFacility, roomFacility and security. Those should be columns in your database so you can search for something like "all entries with a cleanliness higher than 3".
It works with the JSON column type but it is more expensive than using normal columns.
Edit: Check the use-case for your database entry. If you are sure you never need to search in or order by the encoded attributes you can store data encoded as json string. However, if your database supports the JSON column type, you should use that one because it allows searching in the stored JSON (but is more expensive than searching in normal columns). </Edit>
Second point: The second code snipped (with the quotation marks) looks like invalid syntax for json.

Store multiple authors in to couchbase database

I am a newbie to "couchbase server". What i am looking for is to store 10 author names to couchbase document one after another. Someone please help me whether the structure is like a single document "author" and multiple values
{ id : 1, name : Auther 1}, { id : 2, name : Author 2}
OR store Author 1 to a document and Author 2 to another document.
If so, how can i increment the id automatically before "insert" command.
you can store all authors in a single document
{ doctype : "Authors",
AuthorNames:[
{
id: 1,
Name : "author1"
}
{
id: 2,
Name : "author2"
}
so on
]
IF you want to increase the ID, one is to enter one author name at a time in new document, but ID will be randomly generated and it would not in incremental order.
In Couchbase think more about how your application will be using the data more than how you are want to store it. For example, will your application need to get all of the 10 authors all of the time? If so, then one document might be worthwhile. Perhaps your application needs to only ever read/write one of the authors at a time. Then you might want to put each in their own, but have an object key pattern that makes it so you can get the object really fast. Objects that are used often are kept in the managed cache, other objects that are not used often may fall out of the managed cache...and that is ok.
The other factor is what your reads to writes ratio is on this data.
So like I said, it depends on how your application will be reading and writing your data. Use this as the guidance for how your data should be stored.
The single JSON document is pretty straight forward. The more advanced schema design where each author is in its own document and you access them via object key, might be a bit more complicated, but ultimately faster and more scalable depending on what I already pointed out. I will lay out an example schema and some possibilities.
For the authors, I might create each author JSON document with an object key like this:
authors::ID
Where ID is a value I keep in a special incrementer object that I will called authors::incrementer. Think of that object as a key value pair only holding an integer that happens to be the upper bound of an array. Couchbase SDKs include a special function to increment just such an integer object. With this, my application can put together that object key very quickly. If I want to go after the 5th author, I do a read by object key for "authors::5". If I need to get 10, I do a parallelized BulkGet function and get authors::1 through authors::10. If I want to get all the authors, I get the incrementer object, and get that integer and then to a parallelized bulk get. This way i can get them in order or in whatever order I feel like and I am accessing them by object key which is VERY fast in Couchbase.
All this being said, I could use a view to query this data or the upcoming "SQL for Documents" in Couchbase 4.0 or I can mix and match when I query and when I get objects by their key. Key access will ALWAYS be faster. It is the difference between asking a question then going and getting the object and simply knowing the answer and getting it immediately.

If JSON represents the 'object', what represents the 'class'?

JSON appears to be a nice way to represent a complex data structure in plain text. If we think of this complex data structure as analogous to an OOP object - an instance of a class - then is there a commonly used JSON-like format that represents the class itself (just the data part - forget methods)? Can JSON itself be used for this?
To put it another way, if JSON encodes name-value pairs, what should I use if I want to encode only the names?
The reason I want this is that I am designing a protocol to use with jQuery (to which I am a complete novice by the way). The client will communicate to the server the structure of the JSON object it wants back, and the server will return a JSON object of that structure with the values added.
The key point is that it is the client that is in full control of what data fields (name-value pairs) the server returns. It's a bit different from all the examples of jQuery that I've found so far on the web where the client makes a request (which usually includes a very limited set of parameters, if any) and the server makes the decision as to what fields to return in the JSON reply.
(Obviously, what the client asks for must be congruent with the server's data model; if the server has an array of widgets each with its own price, the client can't ask for an array of prices each with its own widget.)
This must be a common problem, and I don't want to reinvent the wheel. I want to adopt a solution that is already in common use across the web.
Edit
I just found JSON Schema. This is not what I am looking for. It contains way more than I need.
Edit
I'm looking more for a 'this is how it is usually done' answer, rather than a 'you could try…' answer. (I can invent dozens of possible answers myself.)
To encode only names within JSON, you could use a key/value pair where the key is either the class name or just a key named 'values' - with the value being an array of strings that are the names to be returned by the server. For example:
{ 'class_name' : [ "name1", "name2", "name3" ] }
The server can then either detect the class name from the key used and return the supplied values for the names in the array if the class supports it or ignore if it does not.
I'm looking more for a 'this is how it is usually done' answer
There is no single "correct" way to do what you want. Many people have their implementation. It depends on various factors -- what you want to do, where you want to do, how efficiently you want it to do?
For simple structures I would prefer and suggest the answer given by #dbr9979.
For nested structures, you can have nested arrays. Something like:
{
"nestedfield1": {
"nestedfield11":["nestedfield111", "nestedfield112"],
"nestedfield12":["nestedfield121", "nestedfield122"],
"__SIMPLE_FIELDS__": ["simplefield13", "simplefield14"]
}
}
The point is, if the key is __SIMPLE_FIELDS__, the value is an array of simple fields (string, numbers etc..), else the key stands for the key in the object.
For something more complex, what I would suggest is you have predefined structures, that both the server and the client know of. This is particularly useful when you have to make multiple identical requests. Assign some unique number for each of them. Something like:
1 => <the structure above>
2 => ["simplefield1", "simplefield2" ..]
3 => etc .. etc
The server stores the above structure and the relevant number in the database or something. And now, as it may be obvious by now, client sends across the id of the required structure, and the server responds in the appropriate fashion.
I think what you meant by this:
the client that is in full control of what data fields (name-value pairs) the server returns.
is like the difference between SELECT * FROM Bags and SELECT color, price FROM Bag in SQL. Am I interpreting you correctly?
You could query with:
{
'resource': 'Bag',
'field_names': ['color', 'price']
}
which will return the response:
{
'status': 'success',
'result': [
{'color': 'red', 'price': 50},
{'color': 'blue', 'price': 45},
]
}
most likely though, you may not actually need your request to be a JSON object; I've seen implementations where the field names is taken from the query string, like http://foo.com/bag?fields=color,price
I was looking for Partial Response.
RESTful API Design: can your API give developers just the information they need? explains it all and gives examples from LinkedIn, Facebook, and Google. Google and Facebook both have similar approaches. Here's how Lie Ryan's example would look using Google's approach:
url?fields=status,result(color,price)
Since Google and Facebook are behind this, I would not be surprised to see this become a de facto standard.
In my case I am likely to run into a length limitation on the URL and so have to use POST instead, but this is an excellent starting point for me.

Where Do I Store Hash Table or Dictionary Key Names

When I'm working with Hash Tables/Dictionaries I sometimes struggle with how to specify keys.
For example: if I create a simple Dictionary (using Python for this example),
foo = {'bar': 'baz', 'foobar': 'foobaz' }
I can access values (in other modules) with the key values: (foo['bar']) and get baz back.
In the words of Dr. Evil, "pretty standard, really."
Unfortunately, using static strings for keys tightly couples any modules using this Dictionary to its implementation. Of course, this can also apply when using other key types (e.g. Enums, Objects, etc.); anyway you slice it, all modules which access the Dictionary need to know the values for the keys.
To resolve this, I typically use static constant string values (or Enums if available in the language) for keys, and either store them publicly in the local class/module, or in a separate module/class. Therefore any changes to the dictionary keys themselves are kept in a single location.
This usually looks like this:
BAR_KEY = 'bar'
foo[BAR_KEY] = 'foobar'
Are there better ways of specifying keys such that the use of the Dictionary doesn't necessarily couple a module/class to its implementation?
Note: I've seen a few responses in SO which address this (e.g. property-to-reference-a-key-value-pair-in-a-dictionary), but the topics didn't seem to address this issue specifically. The answers were helpful, but I'd like a wider range of experience.
Why not make a class for this, which only contains properties? This is done nicely with Python (from what I know), and works well with other languages, too. Refactoring the names is trivial with today's tools, too.
In cases where I'm passing the object around and I've got known keys, I'd always prefer adding an attribute to an object. IMO, the use case of dictionaries is when you don't know what the keys are.
Python is trivial:
foo.bar=baz
Java is pretty much the same:
class Foo { public String bar="baz"; }
The Python performance would be pretty much identical, since a property lookup is just a dictionary lookup and the Java performance would be better.
I sometimes create a separate class to hold the dictionary keys. That gives them their own namespace, as well as the regular benefits of having the keys be in const strings, namely that you don't have the risk of typos, you get code completion, and the string value is easy to change. If you don't want to create a separate class, you get all of the benefits except a namespace just from have const strings.
That said, I think you're getting close to soft coding territory. If the keys in the dictionary change, it's OK for code using the dictionary to change.
Personally, I use your method. It's pretty sensible, simple, and solves the actual problem.
I usually do the same thing; if the key is always going to be the same, make a 'constant static' in whatever language to hold the key.