Where Do I Store Hash Table or Dictionary Key Names - language-agnostic

When I'm working with Hash Tables/Dictionaries I sometimes struggle with how to specify keys.
For example: if I create a simple Dictionary (using Python for this example),
foo = {'bar': 'baz', 'foobar': 'foobaz' }
I can access values (in other modules) with the key values: (foo['bar']) and get baz back.
In the words of Dr. Evil, "pretty standard, really."
Unfortunately, using static strings for keys tightly couples any modules using this Dictionary to its implementation. Of course, this can also apply when using other key types (e.g. Enums, Objects, etc.); anyway you slice it, all modules which access the Dictionary need to know the values for the keys.
To resolve this, I typically use static constant string values (or Enums if available in the language) for keys, and either store them publicly in the local class/module, or in a separate module/class. Therefore any changes to the dictionary keys themselves are kept in a single location.
This usually looks like this:
BAR_KEY = 'bar'
foo[BAR_KEY] = 'foobar'
Are there better ways of specifying keys such that the use of the Dictionary doesn't necessarily couple a module/class to its implementation?
Note: I've seen a few responses in SO which address this (e.g. property-to-reference-a-key-value-pair-in-a-dictionary), but the topics didn't seem to address this issue specifically. The answers were helpful, but I'd like a wider range of experience.

Why not make a class for this, which only contains properties? This is done nicely with Python (from what I know), and works well with other languages, too. Refactoring the names is trivial with today's tools, too.

In cases where I'm passing the object around and I've got known keys, I'd always prefer adding an attribute to an object. IMO, the use case of dictionaries is when you don't know what the keys are.
Python is trivial:
foo.bar=baz
Java is pretty much the same:
class Foo { public String bar="baz"; }
The Python performance would be pretty much identical, since a property lookup is just a dictionary lookup and the Java performance would be better.

I sometimes create a separate class to hold the dictionary keys. That gives them their own namespace, as well as the regular benefits of having the keys be in const strings, namely that you don't have the risk of typos, you get code completion, and the string value is easy to change. If you don't want to create a separate class, you get all of the benefits except a namespace just from have const strings.
That said, I think you're getting close to soft coding territory. If the keys in the dictionary change, it's OK for code using the dictionary to change.

Personally, I use your method. It's pretty sensible, simple, and solves the actual problem.

I usually do the same thing; if the key is always going to be the same, make a 'constant static' in whatever language to hold the key.

Related

Remove JSON keys with wildcards from a MySQL field

I have a MySQL 8.0.20 database with a table that describes metadata about uploaded image files. One column contains a JSON object with a whole bunch of auto-generated data that I'm trying to clean up.
This JSON object sometimes contains one or more variable key names that match a specific pattern. Something like
{
"image_name": "P10043983",
"image_size": "60138",
"image_original_exifdata": "{
'FileName':'P10043983.jpg',
'MimeType':'image/jpeg',
'UndefinedTag:0xA435':'\u0000\u0000\u0000\u0000\u0000\u0000'
}"
}
That UndefinedTag:0xA435 (with many permutations) is the problem. It's referring to various image Exif details like lens type, GPS data, etc. It's stuff that I'm not interested in and that these cameras mostly don't provide, so I've ended up with a table full of long strings of useless characters just taking up space. I want those JSON fields gone for performance and cleanliness.
Is there a way to run a SQL query that would use wildcards or regular expressions to find (and, ideally, remove) all of these pesky variable keys? I'd like to avoid manually making a list of all of the possible "UndefinedTag" keys to search against, and I also didn't like the results when I just treated the whole thing as a string and did REGEXP_REPLACE calls (it sometimes left trailing commas that broke my JSON and were difficult for me to avoid/resolve).
I know some of the JSON functions like JSON_SEARCH() accept wildcards, but it explicitly says the search path can't end in a wildcard (so no UndefinedTag:0x** allowed). Many of the functions I'm after (e.g., JSON_REMOVE()) don't accept wildcards at all. Hell, I've even had trouble finding known keys, and I suspect that silly colon in the key name might have something to do with it.
So, how can I clean up my table and remove the many forms of this UndefinedTag problem? Maybe it's easier to just go back to the regex_replace plan and deal instead with the trailing commas?

Deserialize an anonymous JSON array?

I got an anonymous array which I want to deserialize, here the example of the first array object
[
{ "time":"08:55:54",
"date":"2016-05-27",
"timestamp":1464332154807,
"level":3,
"message":"registerResourcePath ('', '/sap/bc/ui5_ui5/ui2/ushell/resources/')",
"details":"","component":"sap.ui.ModuleSystem"},
{"time":"08:55:54","date":"2016-05-27","timestamp":1464332154808,"level":3,"message":"URL prefixes set to:","details":"","component":"sap.ui.ModuleSystem"},
{"time":"08:55:54","date":"2016-05-27","timestamp":1464332154808,"level":3,"message":" (default) : /sap/bc/ui5_ui5/ui2/ushell/resources/","details":"","component":"sap.ui.ModuleSystem"}
]
I tried deserializing using CL_TREX_JSON_SERIALIZER, but it is corrupt and does not work with my JSON, here is why
Then I tried /UI2/CL_JSON, but it needs a "structure" that perfectly fits the object given by the JSON Object. "Structure" means in my case an internal table of objects with the attributes time, date, timestamp, level, messageanddetails. And there was the problem: it does not properly handle references and uses class description to describe the field assigned to the field-symbol. Since I can not have a list of objects but only a list of references to objects that solution also doesn't works.
As a third attempt I tried with the CALL TRANSFORMATION as described by Horst Keller, but with this method I was not able to read in an anonymous array, and here is why
My major points:
I do not want to change the JSON, since that is what I get from sap.ui.log
I prefere to use built-in functionality and not a thirdparty framework
Your problem comes out not from the anonymity of array, but from the awkwardness of SAP JSON (De)serializer, which doesn't respect double quotes, which enclose JSON attributes. The issue is thoroughly described in this answer.
If you don't want to change your JSON on-the-fly, the only way you have is to change CL_TREX_JSON_DESERIALIZER class like this.
/UI5/CL_JSON_PARSER parses JSONs with unknown format.
Note that it's got "for internal use" written on it so many times that you probably should take it seriously and clone its code to fixate it.

Edit json object by lua script in redis

I want edit my json object before back from the Redis server,
In my Redis server I have 4 keys:
user:1 {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:post:1 {"PostId":"1","Content":"Test content"}
user:1:post:2 {"PostId":"2","Content":"Test content"}
user:1:post:3 {"PostId":"3","Content":"Test content"}
I want to get this context by lua script,How ? :
{"Id":"1","Name":"Gholzam","Posts":"[{"PostId":"1","Content":"Test
content"},{"PostId":"1","Content":"Test
content"},{"PostId":"1","Content":"Test content"}]}
The choice of client here is largely irrelevant; the important thing to do is: figure out the data storage. You say you have 4 keys - but it is not obvious to me how we we know, given user:1, what the posts are. Common approaches there include:
have a set called user:1:posts (or something similar) which contains either the full keys (user:1:post:1, etc) or the relative keys (1, etc)
have a hash called user:1:posts (or something similar) which contains the posts keyed by their id
I'd be tempted to use the latter approach, as it is more direct - so I might have:
user:1, a string with contents {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:posts, a hash with 3 pairs:
key 1 with value {"PostId":"1","Content":"Test content"}
key 2 with value {"PostId":"2","Content":"Test content"}
key 3 with value {"PostId":"3","Content":"Test content"}
Then you can use hgetall or hvals to get the posts easily.
The second part is how to manipulate json at the server. The good news here is that redis provides access to json tools inside lua via cjson.
I am an expert in neither cjson nor lua; however, frankly my advice is: don't do this. IMO, redis works best if you let it focus on what it is great at: storage and retrieval. You probably can bend it to your whim, but I would be very tempted to do any json manipulation outside of redis.

If JSON represents the 'object', what represents the 'class'?

JSON appears to be a nice way to represent a complex data structure in plain text. If we think of this complex data structure as analogous to an OOP object - an instance of a class - then is there a commonly used JSON-like format that represents the class itself (just the data part - forget methods)? Can JSON itself be used for this?
To put it another way, if JSON encodes name-value pairs, what should I use if I want to encode only the names?
The reason I want this is that I am designing a protocol to use with jQuery (to which I am a complete novice by the way). The client will communicate to the server the structure of the JSON object it wants back, and the server will return a JSON object of that structure with the values added.
The key point is that it is the client that is in full control of what data fields (name-value pairs) the server returns. It's a bit different from all the examples of jQuery that I've found so far on the web where the client makes a request (which usually includes a very limited set of parameters, if any) and the server makes the decision as to what fields to return in the JSON reply.
(Obviously, what the client asks for must be congruent with the server's data model; if the server has an array of widgets each with its own price, the client can't ask for an array of prices each with its own widget.)
This must be a common problem, and I don't want to reinvent the wheel. I want to adopt a solution that is already in common use across the web.
Edit
I just found JSON Schema. This is not what I am looking for. It contains way more than I need.
Edit
I'm looking more for a 'this is how it is usually done' answer, rather than a 'you could try…' answer. (I can invent dozens of possible answers myself.)
To encode only names within JSON, you could use a key/value pair where the key is either the class name or just a key named 'values' - with the value being an array of strings that are the names to be returned by the server. For example:
{ 'class_name' : [ "name1", "name2", "name3" ] }
The server can then either detect the class name from the key used and return the supplied values for the names in the array if the class supports it or ignore if it does not.
I'm looking more for a 'this is how it is usually done' answer
There is no single "correct" way to do what you want. Many people have their implementation. It depends on various factors -- what you want to do, where you want to do, how efficiently you want it to do?
For simple structures I would prefer and suggest the answer given by #dbr9979.
For nested structures, you can have nested arrays. Something like:
{
"nestedfield1": {
"nestedfield11":["nestedfield111", "nestedfield112"],
"nestedfield12":["nestedfield121", "nestedfield122"],
"__SIMPLE_FIELDS__": ["simplefield13", "simplefield14"]
}
}
The point is, if the key is __SIMPLE_FIELDS__, the value is an array of simple fields (string, numbers etc..), else the key stands for the key in the object.
For something more complex, what I would suggest is you have predefined structures, that both the server and the client know of. This is particularly useful when you have to make multiple identical requests. Assign some unique number for each of them. Something like:
1 => <the structure above>
2 => ["simplefield1", "simplefield2" ..]
3 => etc .. etc
The server stores the above structure and the relevant number in the database or something. And now, as it may be obvious by now, client sends across the id of the required structure, and the server responds in the appropriate fashion.
I think what you meant by this:
the client that is in full control of what data fields (name-value pairs) the server returns.
is like the difference between SELECT * FROM Bags and SELECT color, price FROM Bag in SQL. Am I interpreting you correctly?
You could query with:
{
'resource': 'Bag',
'field_names': ['color', 'price']
}
which will return the response:
{
'status': 'success',
'result': [
{'color': 'red', 'price': 50},
{'color': 'blue', 'price': 45},
]
}
most likely though, you may not actually need your request to be a JSON object; I've seen implementations where the field names is taken from the query string, like http://foo.com/bag?fields=color,price
I was looking for Partial Response.
RESTful API Design: can your API give developers just the information they need? explains it all and gives examples from LinkedIn, Facebook, and Google. Google and Facebook both have similar approaches. Here's how Lie Ryan's example would look using Google's approach:
url?fields=status,result(color,price)
Since Google and Facebook are behind this, I would not be surprised to see this become a de facto standard.
In my case I am likely to run into a length limitation on the URL and so have to use POST instead, but this is an excellent starting point for me.

Which word do you use to describe a JSON-like object?

I have a naming issue.
If I read an object x from some JSON I can call my variable xJson (or some variation). However sometimes it is possible that the data could have come from a number of different sources amongst which JSON is not special (e.g. XMLRPC, programmatically constructed from Maps,Lists & primitives ... etc).
In this situation what is a good name for the variable? The best I have come up with so far is something like 'DynamicData', which is ok in some situations, but is a bit long and not probably not very clear to people unfamiliar with the convention.
SerializedData?
A hierarchical collection of hashes and lists of data is often referred to as a document no matter what serialization format is used. Another useful description might be payload or body in the sense of a message body for transmission or a value string written to a key/value store.
I tend to call the object hierarchy a "doc" myself, and the serialized format a "document." Thus a RequestDocument is parsed into a RequestDoc, and upon further identification it might become an OrderDoc, or a CustomerUpdateDoc, etc. An InvoiceDoc might become known generically as a ResponseDoc eventually serialized to a ResponseDocument.
The longer form is awkward, but such serialized strings are typically short-lived and localized in the code anyway.
If your data is the model, name it after the model it's representing. e.g., name it after the purpose of the contents, not the format of the contents. So if it's a list of customer information, name it "customers", or "customerModel", or something like that.
If you don't know what the contents are, the name isn't important, unless you want to differentiate the format. "responseData", "jsonResponse", etc...
And "DynamicData" isn't a long name, unless there is absolutely nothing descriptive to be said about the data. "data" might be just fine.