Need help separating unorganized JSON/JSON arrays


I don't really understand how to do this, so I'm looking for either a way to split this MySQL JSON column into separate rows, or a way to export it as CSV or JSON and then split one part off from the rest.
Example of the JSON:
[{"id":2, "identifier":"IDENTIFIER:", "license":"LICENSE:", "firstname":"FIRSTNAME", "lastname":"LASTNAME", "accounts":"{"money":9595,"bank":9595}"},
{"id":2, "identifier":"IDENTIFIER", "license":"LICENSE", "firstname":"FIRSTNAME", "lastname":"LASTNAME", "accounts":"{"black_money":9595,"bank":9595,"money":9595}"}]
I want to separate the three values in the JSON object called accounts. This is all held in a MySQL DB, and I want to be able to run something that produces an exportable table I can import into Google Sheets (or something of that sort) so I can sort the rows if need be.

I expect that your JSON actually looks like this:
[{"id":2, "identifier":"IDENTIFIER:", "license":"LICENSE:", "firstname":"FIRSTNAME", "lastname":"LASTNAME", "accounts":"{\"money\":9595,\"bank\":9595}"}, {"id":2, "identifier":"IDENTIFIER", "license":"LICENSE", "firstname":"FIRSTNAME", "lastname":"LASTNAME", "accounts":"{\"black_money\":9595,\"bank\":9595,\"money\":9595}"}]
but the backslashes disappeared when you copied and pasted it. Your JSON is unusual in that the accounts value is itself a JSON string nested inside the outer JSON, so it has to be parsed a second time. Try:
function myFunction() {
  // Read the raw JSON text from cell A1 of the active sheet
  var json = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getRange('A1').getValue()
  var data = JSON.parse(json)
  var result = []
  // Header row
  result.push(['account.black_money', 'account.bank', 'account.money'])
  data.forEach(function (elem) {
    // accounts is itself a JSON string, so it needs a second parse
    var account = JSON.parse(elem.accounts)
    result.push([account.black_money, account.bank, account.money])
  })
  return result
}
Take a copy of this sheet: https://docs.google.com/spreadsheets/d/1NwSUF7hRNjcLRbr2HP_mjj-fPfbKJz7BNLYFr184P4o/copy
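If you would rather produce a CSV outside of Google Sheets, the same double-parse idea can be sketched in Python. This is only an illustration under assumptions: the function name accounts_to_csv and the fixed ACCOUNT_KEYS list are made up here, and the sample is built with json.dumps so that the inner JSON is escaped correctly.

```python
import csv
import io
import json

# Assumed list of account fields; extend it if your data has others.
ACCOUNT_KEYS = ["black_money", "bank", "money"]


def accounts_to_csv(raw_json):
    """Return CSV text with one row per record and one column per account field."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "firstname", "lastname"] + ACCOUNT_KEYS)
    for record in json.loads(raw_json):
        # "accounts" is a JSON string inside the outer JSON, so parse it again.
        accounts = json.loads(record["accounts"])
        writer.writerow(
            [record["id"], record["firstname"], record["lastname"]]
            + [accounts.get(key, "") for key in ACCOUNT_KEYS]
        )
    return out.getvalue()


# Sample with correctly escaped inner JSON (values mirror the question).
sample = json.dumps([
    {"id": 2, "firstname": "FIRSTNAME", "lastname": "LASTNAME",
     "accounts": json.dumps({"money": 9595, "bank": 9595})},
    {"id": 2, "firstname": "FIRSTNAME", "lastname": "LASTNAME",
     "accounts": json.dumps({"black_money": 9595, "bank": 9595, "money": 9595})},
])
print(accounts_to_csv(sample))
```

Records that lack one of the account fields get an empty cell, so every row keeps the same columns and the file imports cleanly into a spreadsheet.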

Related

dumping list to JSON file creates list within a list [["x", "y","z"]], why?

I want to append multiple list items to a JSON file, but it creates a list within a list, so I cannot access the list from Python. Since the code overwrites the existing data in the JSON file, there should not be any list there already. I also tried starting with plain text in the file, without brackets. It still creates a list within a list: [["x", "y","z"]] instead of ["x", "y","z"]
import json
filename = 'vocabulary.json'
print("Reading %s" % filename)
try:
    with open(filename, "rt") as fp:
        data = json.load(fp)
    print("Data: %s" % data)#check
except IOError:
    print("Could not read file, starting from scratch")
    data = []
# Add some data
TEMPORARY_LIST = []
new_word = input("give new word: ")
TEMPORARY_LIST.append(new_word.split())
print(TEMPORARY_LIST)#check
data = TEMPORARY_LIST
print("Overwriting %s" % filename)
with open(filename, "wt") as fp:
    json.dump(data, fp)
Example output when appending a list of split words:
Reading vocabulary.json
Data: [['my', 'dads', 'house', 'is', 'nice']]
give new word: but my house is nicer
[['but', 'my', 'house', 'is', 'nicer']]
Overwriting vocabulary.json

So, if I understand correctly, you are trying to overwrite a list in a JSON file with a new list created from user input. The nesting comes from TEMPORARY_LIST.append(new_word.split()): split() already returns a list, and append() adds that whole list as a single element. For easiest data manipulation, set up your JSON file in dictionary form:
{
    "words": [
        "my",
        "dad's",
        "house",
        "is",
        "nice"
    ]
}
You should then set up functions to separate your functionality to make it more manageable:
def load_json(filename):
    with open(filename, "r") as f:
        return json.load(f)
def write_json(filename, data):
    with open(filename, "w") as f:
        json.dump(data, f, indent=4)
Now, we can use those functions to load the JSON, access the words list, and overwrite it with the new word.
data = load_json("vocabulary.json")
new_word = input("Give new word: ").split()
data["words"] = new_word
write_json("vocabulary.json", data)
If the user inputs "but my house is nicer", the JSON file will look like this:
{
    "words": [
        "but",
        "my",
        "house",
        "is",
        "nicer"
    ]
}
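As for why the original code produced a list within a list, here is a small sketch of the difference between append() and extend(). The variable names are only for illustration.

```python
words = "but my house is nicer".split()  # split() already returns a list

nested = []
nested.append(words)  # append() adds the whole list as ONE element

flat = []
flat.extend(words)    # extend() adds each word as its own element

print(nested)
print(flat)
```

Assigning the result of split() directly, as in data["words"] = new_word above, avoids the nesting in the same way.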
Edit
Okay, I have a few suggestions to make before I get into solving the issue. Firstly, it's great that you have split much of the program's functionality into separate functions. However, using global variables is generally discouraged because it makes debugging difficult: any function that uses the variable could have mutated it by accident. To fix this, use function parameters and pass the data around explicitly. With small programs like this, you can think of main() as the hub that all data passes through: main() passes data to other functions and receives new or edited data back. One final recommendation: only use all-capital names for variables that are meant to be constant. For example, PI = 3.14159 is a constant, so by convention its name is written in all caps.
Without using global, main() will look much cleaner:
def main():
    choice = input("Do you want to start or manage the list? (start/manage)")
    if choice == "start":
        data = load_json("vocabulary.json")
        words = data["words"]
        dictee(words)
    elif choice == "manage":
        manage_list()
You can use the load_json() function from earlier (notice that I deleted write_json(), more on that later) if the user chooses to start the game. If the user chooses to manage the file, we can write something like this:
def manage_list():
    choice = input("Do you want to add or clear the list? (add/clear)")
    if choice == "add":
        words_to_add = get_new_words()
        add_words("vocabulary.json", words_to_add)
    elif choice == "clear":
        clear_words("vocabulary.json")
We get the user input first and then we can call two other functions, add_words() and clear_words():
def add_words(filename, words):
    with open(filename, "r+") as f:
        data = json.load(f)
        data["words"].extend(words)
        f.seek(0)  # rewind so the dump overwrites from the start
        json.dump(data, f, indent=4)
        f.truncate()  # drop leftover bytes if the new content is shorter
def clear_words(filename):
    with open(filename, "w+") as f:
        data = {"words":[]}
        json.dump(data, f, indent=4)
I did not use the load_json() function in the two functions above. My reasoning is that it would mean opening the file more times than needed, which would hurt performance. Furthermore, both functions already need to open the file, so it is okay to load the JSON data here because it only takes one line: data = json.load(f). You may also notice that in add_words(), the file mode is "r+". This is the basic mode for reading and writing an existing file. "w+" is used in clear_words() because it not only opens the file for reading and writing, it also truncates the file if it already exists (that is also why we don't need to load the JSON data in clear_words()). Because we have these two functions for writing and/or overwriting data, we don't need the write_json() function that I had initially suggested.
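manage_list() calls get_new_words(), which is not shown in this answer. A minimal sketch of what it might look like, assuming the prompt text from the example session and a hypothetical parse_words() helper that keeps the splitting testable without keyboard input:

```python
def parse_words(text):
    """Split one line of user input into a list of words."""
    return text.split()


def get_new_words():
    # Hypothetical helper: the prompt text is taken from the example session.
    text = input("Please enter the words you want to add, separated by spaces: ")
    return parse_words(text)


# Non-interactive check of the splitting logic.
print(parse_words("these are new words"))
```

Because split() with no arguments ignores repeated whitespace, extra spaces between words do not create empty entries.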
We can then add to the list like so:
>>> Do you want to start or manage the list? (start/manage)manage
>>> Do you want to add or clear the list? (add/clear)add
>>> Please enter the words you want to add, separated by spaces: these are new words
And the JSON file becomes:
{
    "words": [
        "but",
        "my",
        "house",
        "is",
        "nicer",
        "these",
        "are",
        "new",
        "words"
    ]
}
We can then clear the list like so:
>>> Do you want to start or manage the list? (start/manage)manage
>>> Do you want to add or clear the list? (add/clear)clear
And the JSON file becomes:
{
    "words": []
}
Great! Now, we implemented the ability for the user to manage the list. Let's move on to creating the functionality for the game: dictee()
You mentioned that you want to randomly select an item from a list and remove it from that list so it doesn't get asked twice. There are a multitude of ways you can accomplish this. For example, you could use random.shuffle:
import random
def dictee(words):
    correct = 0
    incorrect = 0
    random.shuffle(words)
    for word in words:
        # ask word
        # evaluate response
        # increment correct/incorrect
        # ask if you want to play again
        pass
random.shuffle shuffles the list in place. Then, you can iterate through the list using for word in words: and start the game. You don't necessarily need random.choice here because shuffling the list and iterating through it already visits every word once in random order.
I hope this helped illustrate how powerful functions and function parameters are. They not only help you separate your code, but also make it easier to manage, understand, and write cleaner code.
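For illustration, the dictee() skeleton above could be filled in roughly like this. It is a sketch under assumptions: the ask parameter is hypothetical and stands in for however your game presents a word and collects the answer, which makes the loop easy to try without typing.

```python
import random


def dictee(words, ask):
    """Ask each word once in random order and return (correct, incorrect).

    ask is a hypothetical callable: it receives the word and returns
    the player's answer as a string.
    """
    order = list(words)    # copy, so the caller's list is left untouched
    random.shuffle(order)  # each word is visited exactly once, in random order
    correct = 0
    incorrect = 0
    for word in order:
        if ask(word).strip() == word:
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect


# Non-interactive demo: a fake player who always answers "house".
print(dictee(["my", "house", "is", "nice"], ask=lambda word: "house"))
```

Returning the two counters instead of printing them keeps the function free of global state, in line with the advice above.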

Python: import JSON file into SQLAlchemy JSON field

I'm relatively new to Python so I'm hoping that I've just missed something really obvious... But all the similar questions/answers here on StackOverflow seem really overly complex for the simple task that I am trying to achieve.
I have a few hundred text files containing JSON data (the actual data structure isn't important, this block below is just to show you what kind of thing I have, the actual structure of the data could be wildly different but it will always be valid JSON data).
{
    "config": {
        "item1": "value1",
        "item2": "value2"
    },
    "data": [
        {
            "dataA1": "valueA1",
            "itemA2": "valueA2"
        },
        {
            "dataB1": "valueB1",
            "itemB2": "valueB2",
            "itemB3": "valueB3"
        }
    ]
}
My Model is something like this:
class ModelName(db.Model):
    __tablename__ = 'table_name'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    data1 = db.Column(db.JSON)
    data2 = db.Column(db.JSON)
I have multiple data columns here, data1 and data2, simply so I can do a visual comparison of the inserted data. The final model will only have a single data field.
Here is the data insert where everything seems to be going wrong:
import json
new_record = ModelName(
    name='Foo',
    data1=open('./filename.json').read(),
    data2=json.dumps(open('./filename.json').read(), indent=2)
)
try:
    db.session.add(new_record)
    db.session.commit()
    print('Insert successful')
except:
    print('Insert failed')
The data that ends up in data1 and data2 gets littered with varying numbers of \ to escape double quotes and line breaks, and the whole inserted value is wrapped in a set of double quotes. As a result, the data is simply unusable. So I'm currently copying and pasting the data into the DB manually, which works fine but is tedious and far from the right way to do it.
I don't need to edit, manipulate, or do anything to the data in any way. I simply want to read the JSON string from a given file and then insert its content into a record in the database, that is it, end of story, nothing else.
Is there really no SIMPLE way to achieve this?

When you read JSON text from a file, you need to parse it with json.loads(), not serialize it again with json.dumps().
json.loads() has no indent kwarg.
So instead do:
data2=json.loads(open('filename.json').read())
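The stray backslashes and outer quotes come from serializing text that is already JSON. A small sketch of the difference, using an inline string in place of the file contents (the sample text is made up for illustration):

```python
import json

# Stand-in for the text you would get from open('filename.json').read()
text = '{"config": {"item1": "value1"}, "data": [{"dataA1": "valueA1"}]}'

# dumps() treats the text as a plain string and encodes it AGAIN,
# adding outer quotes and escaping every inner double quote.
double_encoded = json.dumps(text)

# loads() parses the text into Python objects (here a dict).
parsed = json.loads(text)

print(double_encoded)
print(type(parsed).__name__, parsed["config"]["item1"])
```

As far as I understand, a JSON column serializes whatever Python object it is given, so handing it the parsed dict stores the document itself, while handing it a str stores a quoted JSON string.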

Azure tables unable to store flattened JSON

I am using the npm flat package, and arrays/objects are flattened, but object/array keys are surrounded by '' , like in 'task_status.0.data' using the object below.
These specific fields do not get stored into AzureTables - other fields go through, but these are silently ignored. How would I fix this?
var obj1 = {
    "studentId": "abc",
    "task_status": [
        {
            "status":"Current",
            "date":516760078
        },
        {
            "status":"Late",
            "date":1516414446
        }
    ],
    "student_plan": "n"
}
Here is how I am using it - simplified code example: Again, it successfully gets written to the table, but does not write the properties that were flattened (see further below):
var flatten = require('flat')
newObj1 = flatten(obj1);
var entGen = azure.TableUtilities.entityGenerator;
newObj1.PartitionKey = entGen.String(uniqueIDFromMyDB);
newObj1.RowKey = entGen.String(uniqueStudentId);
tableService.insertEntity(myTableName, newObj1, myCallbackFunc);
In the above example, the flattened object would look like:
var obj1 = {
    studentId: "abc",
    'task_status.0.status': 'Current',
    'task_status.0.date': 516760078,
    'task_status.1.status': 'Late',
    'task_status.1.date': 1516414446,
    student_plan: "n"
}
Then I would add PartitionKey and RowKey.
All the task_status fields would silently fail to be inserted.
EDIT: This does not have anything to do with the actual flattening process. I just checked a perfectly good JSON object with keys that had 'x.y.z' in them, and AzureTables doesn't seem to accept these column names, which almost completely destroys the value proposition of storing schema-less data without significant rework.

A . in a column name is not supported. You can flatten your objects with a custom delimiter instead.
For example:
newObj1 = flatten(obj1, {delimiter: '__'});
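To show what the custom delimiter does to the keys, here is a rough Python re-implementation of the flattening idea. It is not the npm flat package, just a sketch for illustration, and the function name and defaults are assumptions.

```python
def flatten(value, delimiter="__", prefix=""):
    """Flatten nested dicts and lists into one dict with delimiter-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list indexes become key parts: 0, 1, ...
    else:
        return {prefix: value}    # leaf value: emit it under the built-up key
    flat = {}
    for key, child in items:
        child_prefix = f"{prefix}{delimiter}{key}" if prefix else str(key)
        flat.update(flatten(child, delimiter, child_prefix))
    return flat


obj1 = {
    "studentId": "abc",
    "task_status": [
        {"status": "Current", "date": 516760078},
        {"status": "Late", "date": 1516414446},
    ],
    "student_plan": "n",
}
for key, val in flatten(obj1).items():
    print(key, "=", val)
```

With "__" as the delimiter, none of the resulting keys contain a dot, which is the property the insert needs.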

Parse complex Json string contained in Hadoop

I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. I found that complex JSON can be parsed by using Twitter's Elephant Bird or Mozilla's Akela library. (I found some additional libraries, but I cannot use 'Loader' based approach since I use HCatalog Loader to load data from Hive.)
But the problem is the structure of my data: each value in the map holds a piece of complex JSON as a string. For example,
1. My table looks like (WARNING: type of 'complex_data' is not STRING, a MAP of <STRING, STRING>!)
TABLE temp_table
(
user_id BIGINT COMMENT 'user ID.',
complex_data MAP <STRING, STRING> COMMENT 'complex json data'
)
COMMENT 'temp data.'
PARTITIONED BY(created_date STRING)
STORED AS RCFILE;
2. And 'complex_data' contains (a value that I want to get is marked with two *s, so basically #'d'#'f' from each PARSED_STRING(complex_data#'c') )
{ "a": "[]",
"b": "\"sdf\"",
"**c**":"[{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},
{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},]"
}
3. So, I tried... (same approach for Elephant Bird)
REGISTER '/path/to/akela-0.6-SNAPSHOT.jar';
DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();
data = LOAD temp_table USING org.apache.hive.hcatalog.pig.HCatLoader();
values_of_map = FOREACH data GENERATE complex_data#'c' AS attr:chararray; -- IT WORKS
-- dump values_of_map shows correct chararray data per each row
-- eg) ([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }])
([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }]) ...
attempt1 = FOREACH data GENERATE JsonTupleMap(complex_data#'c'); -- THIS LINE CAUSES AN ERROR
attempt2 = FOREACH data GENERATE JsonTupleMap(CONCAT(CONCAT('{\\"key\\":', complex_data#'c'), '}')); -- IT ALSO DOES NOT WORK
I guessed that "attempt1" failed because the value isn't a complete JSON object. However, when I CONCAT as in "attempt2", an additional \ is generated (so each line starts with {\"key\": ). I'm not sure whether this extra character breaks the parsing. In any case, I want to parse the given JSON string into something Pig can understand. If you have any method or solution, please feel free to let me know.

I finally solved my problem by using the jyson library in a Jython UDF.
I know it could also be solved with Java or other languages,
but I think Jython with jyson is the simplest answer to this issue.
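The UDF itself is not shown here, so the following is only a sketch of the parsing step in plain Python. It uses the standard json module rather than jyson, it is not registered as a Pig UDF, and the function name extract_f_values is made up. The sample mirrors the structure in the question but is built with json.dumps so that it is valid JSON.

```python
import json


def extract_f_values(c_value):
    """Return the d.f value of every element in the JSON array string."""
    return [item["d"]["f"] for item in json.loads(c_value)]


# complex_data is a map of string to string; the "c" value is a JSON array
# stored as text, which is why it has to be parsed before it can be walked.
complex_data = {
    "a": "[]",
    "b": "\"sdf\"",
    "c": json.dumps([
        {"d": {"e": "sdfsdf", "f": "sdfs", "g": "qweqweqwe"},
         "c": [{"d": 21321, "e": "ewrwer"}]},
        {"d": {"e": "sdfsdf", "f": "sdfs", "g": "qweqweqwe"},
         "c": [{"d": 21321, "e": "ewrwer"}]},
    ]),
}
print(extract_f_values(complex_data["c"]))
```

Note that a trailing comma before the closing bracket, as in the sample in the question, would make a strict parser such as json.loads() raise an error.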

What's the best way to send in multiple coordinates in a JSON to RethinkDB in order to create an r.polygon?

I'm using an Express server with RethinkDB, and I want to send in multiple coordinates into my 'locations' table on RethinkDB and create an r.polygon(). I understand how to do the query via RethinkDB's data explorer , but I'm having trouble figuring out how to send it via JSON from the client to the server and insert it through my query there.
I basically want to do this:
r.db('places').table('locations').insert({
    name: req.body.name,
    bounds: r.polygon(req.body.bounds)
})
where req.body.bounds looks like this:
[long, lat],[long, lat], [long, lat]
I can't send it in as a string because then it gets read as one single input instead of three arrays. I'm sure there's a 'right in front of me' way, but I'm drawing a blank.
What's the best way to do this?
Edit: To clarify, my question is, what should my JSON look like and how should it be received on my server?
This is what RethinkDB wants in order to make a polygon:
r.polygon([lon1, lat1], [lon2, lat2], [lon3, lat3], ...) → polygon
As per the suggestion, I've added in r.args() to my code:
r.db('places').table('locations').insert({
    name: req.body.name,
    bounds: r.polygon(r.args(req.body.bounds))
})
Edit
Ok, I was dumb and had a typo in one of my coordinates!
Sending it as an array of arrays and wrapping it in r.args() on the server side works.

What you need is r.args, which unpacks the array into arguments for r.polygon. https://www.rethinkdb.com/api/javascript/args/
Assuming that req.body.bounds is:
[[long, lat],[long, lat], [long, lat]]
and that you are submitting a raw JSON string from the client,
you first need to decode the JSON payload, get the bounds field, and wrap it with r.args as follows:
var body = JSON.parse(req.body)
r.db('places').table('locations').insert({
    name: body.name,
    bounds: r.polygon(r.args(body.bounds))
})
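To make the expected payload shape concrete, here is a small Python sketch that decodes such a JSON body and sanity-checks the coordinates before they would be handed to the database. The function name parse_bounds, the sample values, and the specific checks are assumptions for illustration, and no RethinkDB call is made.

```python
import json


def parse_bounds(payload):
    """Decode a JSON request body and return (name, bounds) after basic checks."""
    body = json.loads(payload)
    bounds = body["bounds"]  # expected: an array of [longitude, latitude] arrays
    if len(bounds) < 3:
        raise ValueError("a polygon needs at least three points")
    for point in bounds:
        if len(point) != 2:
            raise ValueError("each point must be [longitude, latitude]")
        lon, lat = point
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError("coordinate out of range: %r" % (point,))
    return body["name"], bounds


# Example body: one array containing three [longitude, latitude] pairs.
payload = '{"name": "Park", "bounds": [[-122.42, 37.77], [-122.42, 37.55], [-121.88, 37.55]]}'
name, bounds = parse_bounds(payload)
print(name, bounds)
```

A check like this would catch the kind of coordinate typo mentioned in the question before the insert runs.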
