I extracted some data from a mongo database using the RMongo library. I have been working with the data with no problem. However, I need to access a field that was saved, originally in the database, as JSON. Since rmongodb saves the data as data frame, I now have a large character vector of length 1:
res1 = "[ { \"text\" : \"#Kayture Beyoncé jam session ?\" , \"name\" : \"beponcé \xed\xa0\xbc\xed\xbc\xbb\" , \"screenName\" : \"ColaaaaTweedy\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"108061963\"} , { \"text\" : \"#Kayture fucking marry me\" , \"name\" : \"George McQueen\" , \"screenName\" : \"GeorgeMcQueen12\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"67896750\"}]"
I need to extract all the "text" attributes of the objects from this array (there are 2 in this example), but I can not figure out a fast way. I was trying using strsplit, or going from character to json files using jsonlite, and then to list, but it does not work.
Any ideas?
Thanks!
Starting from
res1 = "[ { \"text\" : \"#Kayture Beyoncé jam session ?\" , \"name\" : \"beponcé \xed\xa0\xbc\xed\xbc\xbb\" , \"screenName\" : \"ColaaaaTweedy\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"108061963\"} , { \"text\" : \"#Kayture fucking marry me\" , \"name\" : \"George McQueen\" , \"screenName\" : \"GeorgeMcQueen12\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"67896750\"}]"
you can use fromJSON() from the jsonlite package to parse that JSON object.
library(jsonlite)
fromJSON(res1)
text name screenName follower mentions userTwitterId
1 #Kayture Beyoncé jam session ? beponcé í ¼í¼» ColaaaaTweedy FALSE Kayture 108061963
2 #Kayture fucking marry me George McQueen GeorgeMcQueen12 FALSE Kayture 67896750
Related
I have JSON data in a file json_format.py as follows:
{
"name" : "ramu",
"place" : "hyd",
"height" : 5.10,
"list" : [1,2,3,4,5,6],
"tuple" : (0,1,2),
"colors" : {"mng":"white","aft" : "blue","night":"red"},
"car" : "None",
"bike" : "True",
}
I'm reading the above with this code:
import json
from pprint import pprint
with open (r'C:/PythonPrograms\Json_example/json_format.py') as jobj:
fp = jobj.readlines()
b = json.dumps(fp) # ---> I get string
print(type(b))
c = json.loads(b)
print(type(c)) # ---> List
pprint(c)
print(c[0])
pprint(c["name"])
Now, I would like to access the JSON object as c['name'] and the output should be ramu.
Since c is a list, I can't do so. How can I read my JSON data so that I can access it with keys?
Thanks in advance!
You're effectively doing c = json.loads(json.dumps(jobj.readlines())) when you just need:
c = json.load(jobj)
print(c["name"]) # ramu
Also, your JSON is malformed.
There are no tuples in JSON: "tuple" : (0,1,2),
Your last item should not end with a comma: "bike" : "True",
In my system from json-call http://192.168.1.6:8080/json.htm?type=devices&rid=89 I get the output below.
{
"ActTime" : 1501360852,
"ServerTime" : "2017-07-29 22:40:52",
"Sunrise" : "05:50",
"Sunset" : "21:28",
"result" : [
{
"AddjMulti" : 1.0,
"AddjMulti2" : 1.0,
"AddjValue" : 0.0,
"AddjValue2" : 0.0,
"BatteryLevel" : 255,
"CustomImage" : 0,
"Data" : "73 Lux",
"Description" : "",
"Favorite" : 1,
"HardwareID" : 4,
"HardwareName" : "Dummies",
"HardwareType" : "Dummy (Does nothing, use for virtual switches only)",
"HardwareTypeVal" : 15,
"HaveTimeout" : true,
"ID" : "82089",
"LastUpdate" : "2017-07-29 21:16:22",
"Name" : "ESP8266C_Licht1",
"Notifications" : "false",
"PlanID" : "0",
"PlanIDs" : [ 0 ],
"Protected" : false,
"ShowNotifications" : true,
"SignalLevel" : "-",
"SubType" : "Lux",
"Timers" : "false",
"Type" : "Lux",
"TypeImg" : "lux",
"Unit" : 1,
"Used" : 1,
"XOffset" : "0",
"YOffset" : "0",
"idx" : "89"
}
],
"status" : "OK",
"title" : "Devices"
}
'Automating' such call by the below scripttime lua-script is aimed at extraction of specific information, to be further used in applications.
The first 11 lines run without problems, but further extraction of the information is a problem.
I have tried various scriptlines to get a solution for A), B), C), D) and E), but either they generate error-report, or they don't give results: see further below for the 'best' trial-scriptlines and related results.
To avoid misunderstanding: those dashed commentlines in the script just below this question with A), B), C), D) and E) are only describing the desired actions/functions and are in no way meant as scriptlines!
Question:
Help requested in the form of better applicable scriptlines for A) till E) in the trialscript at the end of this message, or hints where to find applicable example scriptlines.
-- Lua-script to determine staleness, time-out and value for data from Json-call
print('Start of Timeout-script')
commandArray = {}
TimeOutLimit = 10 -- Allowed timeout in seconds
json = (loadfile "/home/pi/domoticz/scripts/lua/JSON.lua")() -- For Linux
-- json = (loadfile "D:\\Domoticz\\scripts\\lua\\json.lua")() -- For Windows
-- Line 07
local content=assert(io.popen('curl "http://192.168.1.6:8080/json.htm?type=devices&rid=89"')) -- notice double quotes
local list = content:read('*all')
content:close()
local jsonList = json:decode(list)
-- Line 12 Next scriptlines describe desired actions
-- A) Extract ServerTime as numeric value (not as string)
-- B) Extract LastUpdate as numeric value (not as string)
-- Staleness = ServerTime - LastUpdate
-- C) Extract HaveTimeout as boolean (not as string)
-- If HaveTimeout and (Staleness > TimeOutlimit) then
-- Print('TimeOutLimit exceeded by ' .. (Staleness - TimeOutLimit) .. 'seconds')
-- End
-- D) Extract textstring from Type or Data
-- E) Extract numeric value from Data
print('End of Timeout-script')
return commandArray
For lines 11 etc, the following trial-scriptlines gave 'best' results (= no errors):
-- Line 11
-- local Servertime = json:decode(ServerTime)
-- print('Servertime : '..Servertime)
-- Line 14
-- CheckTimeOut =jsonValue.result[1].HaveTimeout -- value from "HaveTimeout", inside "result" bloc number 1 (even if it's the only one)
CurrentServerTime =jsonValue.Servertime -- value from "ServerTime"
CurrentLastUpdate = jsonValue.result[1].LastUpdate
CurrentData = jsonValue.result[1].Data
-- Line 19
print('TimeOut : '..CheckTimeOut)
print('Servertime : '..CurrentServerTime)
print('LastUpdate : '..CurrentLastUpdate)
print('Data-content : '..CurrentData)
print('End of Timeout-script')
return commandArray
Results:
Without dashes before the lines 12 and 13, respectively 15, then the following error reports:
660: nil passed to JSON:decode()
lua:15: attempt to index global 'jsonValue' (a nil value)
With dashes before lines 12, 13 and 15 for the trial-scriptlines shown above, according to the log no errors exist (as demonstrated by the 2 prints)
2017-07-31 16:30:02.520 LUA: Start of Timeout-script
2017-07-31 16:30:02.563 LUA: End of Timeout-script
But why no print-results in the log from Lines 20 till 23?
Not having those print-results makes it difficult to determine next steps in data-extraction, to achieve the objectives described under A) till E).
;-) Error reports generally are more useful information than "no errors, but no results"
Without downloading content over the network, the working script looks like this:
local json = require("json")
local jj = [[
{
"ActTime" : 1501360852,
"ServerTime" : "2017-07-29 22:40:52",
"Sunrise" : "05:50",
"Sunset" : "21:28",
"result" : [
{
"AddjMulti" : 1.0,
"AddjMulti2" : 1.0,
"AddjValue" : 0.0,
"AddjValue2" : 0.0,
"BatteryLevel" : 255,
"CustomImage" : 0,
"Data" : "73 Lux",
"Description" : "",
"Favorite" : 1,
"HardwareID" : 4,
"HardwareName" : "Dummies",
"HardwareType" : "Dummy (Does nothing, use for virtual switches only)",
"HardwareTypeVal" : 15,
"HaveTimeout" : true,
"ID" : "82089",
"LastUpdate" : "2017-07-29 21:16:22",
"Name" : "ESP8266C_Licht1",
"Notifications" : "false",
"PlanID" : "0",
"PlanIDs" : [ 0 ],
"Protected" : false,
"ShowNotifications" : true,
"SignalLevel" : "-",
"SubType" : "Lux",
"Timers" : "false",
"Type" : "Lux",
"TypeImg" : "lux",
"Unit" : 1,
"Used" : 1,
"XOffset" : "0",
"YOffset" : "0",
"idx" : "89"
}
],
"status" : "OK",
"title" : "Devices"
}
]]
print('Start of Timeout-script')
local jsonValue = json.decode(jj)
CheckTimeOut =jsonValue.result[1].HaveTimeout
CurrentServerTime =jsonValue.Servertime
CurrentLastUpdate = jsonValue.result[1].LastUpdate
CurrentData = jsonValue.result[1].Data
print('TimeOut : '.. (CheckTimeOut and "true" or "false") )
print('Servertime : '.. (CurrentServerTime or "nil") )
print('LastUpdate : '.. (CurrentLastUpdate or "nil") )
print('Data-content : '.. (CurrentData or "nil") )
print('End of Timeout-script')
result:
Start of Timeout-script
TimeOut : true
Servertime : nil
LastUpdate : 2017-07-29 21:16:22
Data-content : 73 Lux
End of Timeout-script
import urllib.request as request
import json
api = "https://kr.api.pvp.net/championmastery/location/KR/player/38281748/topchampions?api_key=RGAPI-6bdee369-a91d-485a-9280-444de0e37afe"
api_data = request.urlopen(api).read().decode("utf-8")
apiload = json.loads(api_data)
print(apiload)
I want to print my League of Legends champion Points.
So I use https://developer.riotgames.com/api/methods#!/1091/3768 this API, and
convert to Python object. but this API's Return Value is List[ChampionMasteryDTO],
which means I can't use it as dictionary.
apiload contain [{"key" : "value"}, ... {"key" : "value"}]
how can I make apiload as dictionary?
The apiload variable prints two dictionaries within a list.
If you would like to create a new dictionary using the apiload you can do the following:
#create a new dictionary
my_dict = {}
#now iterate through the list
for item in apiload:
#now iterate through the dictionaries that are in the list:
for key, value in item.items():
#assign the key value to the new declared dictionary
new_dict[key] = value
This will create a new dictionary with the following output:
championId : 91
tokensEarned : 0
championPointsSinceLastLevel : 339079
chestGranted : True
lastPlayTime : 1478451844000
playerId : 38281748
championLevel : 7
championPoints : 360679
championPointsUntilNextLevel : 0
championId : 5
tokensEarned : 0
championPointsSinceLastLevel : 129110
chestGranted : True
lastPlayTime : 1478454752000
playerId : 38281748
championLevel : 7
championPoints : 150710
championPointsUntilNextLevel : 0
championId : 21
tokensEarned : 0
championPointsSinceLastLevel : 2018
chestGranted : False
lastPlayTime : 1476197348000
playerId : 38281748
championLevel : 4
championPoints : 14618
championPointsUntilNextLevel : 6982
Hope that helps.
Regarding the structure of your response, the easy one if you just want to print your points, would be to just use a loop actually :
for el in apiload:
print(el["championPoints"])
If your problem is to print the total of points collected, use a collections.Counter :
from collections import Counter
cnt = Counter()
for col in apiload:
for k in col.keys():
cnt[k] += c.get(k, 0)
print(cnt['championPoints']) # should print the sum of championPoints
So I'm trying to pull data from a JSON string (as seen below). When I decode the JSON using the code below, and then attempt to index the duration text, I get a nil return. I have tried everything and nothing seems to work.
Here is the Google Distance Matrix API JSON:
{
"destination_addresses" : [ "San Francisco, CA, USA" ],
"origin_addresses" : [ "Seattle, WA, USA" ],
"rows" : [
{
"elements" : [
{
"distance" : {
"text" : "1,299 km",
"value" : 1299026
},
"duration" : {
"text" : "12 hours 18 mins",
"value" : 44303
},
"status" : "OK"
}]
}],
"status" : "OK"
}
And here is my code:
local json = require ("json")
local http = require("socket.http")
local myNewData1 = {}
local SaveData1 = function (event)
distanceReturn = ""
distance = ""
local URL1 = "http://maps.googleapis.com/maps/api/distancematrix/json?origins=Seattle&destinations=San+Francisco&mode=driving&&sensor=false"
local response1 = http.request(URL1)
local data2 = json.decode(response1)
if response1 == nil then
native.showAlert( "Data is nill", { "OK"})
print("Error1")
distanceReturn = "Error1"
elseif data2 == nill then
distanceReturn = "Error2"
native.showAlert( "Data is nill", { "OK"})
print("Error2")
else
for i = 1, #data2 do
print("Working")
print(data2[i].rows)
for j = 1, #data2[i].rows, 1 do
print("\t" .. data2[i].rows[j])
for k = 1, #data2[i].rows[k].elements, 1 do
print("\t" .. data2[i].rows[j].elements[k])
for g = 1, #data2[i].rows[k].elements[k].duration, 1 do
print("\t" .. data2[i].rows[k].elements[k].duration[g])
for f = 1, #data2[i].rows[k].elements[k].duration[g].text, 1 do
print("\t" .. data2[i].rows[k].elements[k].duration[g].text)
distance = data2[i].rows[k].elements[k].duration[g].text
distanceReturn = data2[i].rows[k].elements[k].duration[g].text
end
end
end
end
end
end
timer.performWithDelay (100, SaveData1, 999999)
Your loops are not correct. Try this shorter solution.
Replace all your "for i = 1, #data2 do" loop for this one below:
print("Working")
for i,row in ipairs(data2.rows) do
for j,element in ipairs(row.elements) do
print(element.duration.text)
end
end
This question was solved on Corona Forums by Rob Miracle (http://forums.coronalabs.com/topic/47319-parsing-json-from-google-distance-matrix-api/?hl=print_r#entry244400). The solution is simple:
"JSON and Lua tables are almost identical data structures. In this case your table data2 has top level entries:
data2.destination_addresses
data2.origin_addresses
data2.rows
data2.status
Now data2.rows is another table that is indexed by numbers (the [] brackets) but here is only one of them, but its still an array entry:
data.rows[1]
Then inside of it is another numerically indexed table called elements.
So far to get to the element they are (again there is only one of them
data2.rows[1].elements[1]
then it's just accessing the remaining elements:
data2.rows[1].elements[1].distance.text
data2.rows[1].elements[1].distance.value
data2.rows[1].elements[1].duration.text
data2.rows[1].elements[1].duration.value
There is a great table printing function called print_r which can be found in the community code which is great for dumping tables like this to see their structure."
I have a JSON file (an export from mongoDB) that I'd like to load into R. The document is about 890 MB in size with roughly 63,000 rows of 12 fields. The fields are numeric, character and date. I'd like to end up with a 63000 x 12 data frame.
lines <- readLines("fb2013.json")
result: jFile has all 63,000 elements in char class and all fields are lumped into one field.
Each file looks something like this:
"{ \"_id\" : \"10151271769737669\", \"comments_count\" : 36, \"created_at\" : { \"$date\" : 1357941938000 }, \"icon\" : \"http://blahblah.gif\", \"likes_count\" : 450, \"link\" : \"http://www.blahblahblah.php\", \"message\" : \"I wish I could figure this out!\", \"page_category\" : \"Computers\", \"page_id\" : \"30968999999\", \"page_name\" : \"NothingButTrouble\", \"type\" : \"photo\", \"updated_at\" : { \"$date\" : 1358210153000 } }"
Using rjson,
jFile <- fromJSON(paste(readLines("fb2013.json"), collapse=""))
only the first row is read into jFile but there are 12 fields.
Using RJSONIO:
jFile <- fromJSON(lines)
results in the following:
Warning messages:
1: In if (is.na(encoding)) return(0L) :
the condition has length > 1 and only the first element will be used
Again, only the first row is read into jFile and there are 12 fields.
The output from rjson and RJSONIO looks something like this:
$`_id`
[1] "1018535"
$comments_count
[1] 0
$created_at
$date
1.357027e+12
$icon
[1] "http://blah.gif"
$likes_count
[1] 20
$link
[1] "http://www.chachacha"
$message
[1] "I'd love to figure this out."
$page_category
[1] "Internet/software"
$page_id
[1] "3924395872345878534"
$page_name
[1] "Not Entirely Hopeless"
$type
[1] "photo"
$updated_at
$date
1.357027e+12
try
library(rjson)
path <- "WHERE/YOUR/JSON/IS/SAVED"
c <- file(path, "r")
l <- readLines(c, -1L)
json <- lapply(X=l, fromJSON)
Since you want a data.frame, try this:
# three copies of your sample...
line.1<- "{ \"_id\" : \"10151271769737669\", \"comments_count\" : 36, \"created_at\" : { \"$date\" : 1357941938000 }, \"icon\" : \"http://blahblah.gif\", \"likes_count\" : 450, \"link\" : \"http://www.blahblahblah.php\", \"message\" : \"I wish I could figure this out!\", \"page_category\" : \"Computers\", \"page_id\" : \"30968999999\", \"page_name\" : \"NothingButTrouble\", \"type\" : \"photo\", \"updated_at\" : { \"$date\" : 1358210153000 } }"
line.2<- "{ \"_id\" : \"10151271769737669\", \"comments_count\" : 36, \"created_at\" : { \"$date\" : 1357941938000 }, \"icon\" : \"http://blahblah.gif\", \"likes_count\" : 450, \"link\" : \"http://www.blahblahblah.php\", \"message\" : \"I wish I could figure this out!\", \"page_category\" : \"Computers\", \"page_id\" : \"30968999999\", \"page_name\" : \"NothingButTrouble\", \"type\" : \"photo\", \"updated_at\" : { \"$date\" : 1358210153000 } }"
line.3<- "{ \"_id\" : \"10151271769737669\", \"comments_count\" : 36, \"created_at\" : { \"$date\" : 1357941938000 }, \"icon\" : \"http://blahblah.gif\", \"likes_count\" : 450, \"link\" : \"http://www.blahblahblah.php\", \"message\" : \"I wish I could figure this out!\", \"page_category\" : \"Computers\", \"page_id\" : \"30968999999\", \"page_name\" : \"NothingButTrouble\", \"type\" : \"photo\", \"updated_at\" : { \"$date\" : 1358210153000 } }"
x <- paste(line.1, line.2, line.3, sep="\n")
lines <- readLines(textConnection(x))
library(rjson)
# this is the important bit
df <- data.frame(do.call(rbind,lapply(lines,fromJSON)))
ncol(df)
# [1] 12
# finally, there's some cleaning up to do...
df$created_at
# [[1]]
# [[1]]$`$date`
# [1] 1.357942e+12
# ...
df$created_at <- as.POSIXlt(unname(unlist(df$created_at)/1000),origin="1970-01-01")
df$created_at
# [1] "2013-01-11 17:05:38 EST" "2013-01-11 17:05:38 EST" "2013-01-11 17:05:38 EST"
df$updated_at <- as.POSIXlt(unname(unlist(df$updated_at)/1000),origin="1970-01-01")
Note that this conversion assumes that the dates were stored as milliseconds since the epoch.