Convert tweets into Bson using twitteR and rmongo library - json

Since the streamR connection API no longer works with Twitter, I am trying to convert the output of the searchTwitter function (from twitteR) into BSON before inserting it into a MongoDB database.
test.tweets = searchTwitter("mongodb", n=10, lang="en")
class(test.tweets)
test.text=laply(test.tweets,function(t) t$getText())
class(toJSON(test.text))
bson <- mongo.bson.from.JSON(test.text)
R returns an error: "Error in mongo.bson.from.JSON(test.text) : Not a valid JSON content:..."
How can I fix this conversion, or is there another solution?
Thank you

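This works. The key is to pass the JSON string produced by toJSON() to mongo.bson.from.JSON(), not the raw character vector: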
library(rmongodb)
library(jsonlite)
test.text <- c("A tweet", "Another tweet")
(bson <- mongo.bson.from.JSON(toJSON(test.text)))
# 1 : 2 A tweet
# 2 : 2 Another tweet

Related

Error while trying to parse json into R

I have recently started using R and have a task that involves parsing JSON in R to get it into a non-JSON format. For this, I am using the fromJSON() function. I have tried to parse the JSON from a text file. It runs successfully when the file contains just a single row entry, but when I try it with multiple row entries, I get the following errors:
fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
lexical error: invalid char in json text.
[{'CategoryType':'dining','City':
(right here) ------^
> fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: trailing garbage
"mumbai","Location":"all"}] [{"JourneyType":"Return","Origi
(right here) ------^
> fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: after array element, I expect ',' or ']'
:"mumbai","Location":"all"} {"JourneyType":"Return","Origin
(right here) ------^
The above errors come from three different formats in which I tried to parse the JSON text. The result was the same each time; only the location indicated by the parser changed.
Please help me identify the cause of this error, or tell me if there is a more efficient way of performing the task.
The original file is an Excel sheet with multiple columns, one of which contains JSON text. What I have tried so far is extracting just the JSON column, converting it to tab-separated text, and then parsing it as:
fromJSON("D:/Eclairs/Printing/test3.txt")
Please also suggest whether this can be done more efficiently. I also need to map all the other columns in the Excel sheet to the parsed, non-JSON text.
Example:
[{"CategoryType":"dining","City":"mumbai","Location":"all"}]
[{"CategoryType":"reserve-a-table","City":"pune","Location":"Kothrud,West Pune"}]
[{"Destination":"Mumbai","CheckInDate":"14-Oct-2016","CheckOutDate":"15-Oct-2016","Rooms":"1","NoOfPax":"3","NoOfAdult":"3","NoOfChildren":"0"}]
Consider reading in the text line by line with readLines(), iteratively saving the JSON dataframes to a growing list:
library(jsonlite)
con <- file("C:/Path/To/Jsons.txt", open="r")
jsonlist <- list()
while (length(line <- readLines(con, n=1, warn = FALSE)) > 0) {
jsonlist <- append(jsonlist, list(fromJSON(line)))
}
close(con)
jsonlist
# [[1]]
# CategoryType City Location
# 1 dining mumbai all
# [[2]]
# CategoryType City Location
# 1 reserve-a-table pune Kothrud,West Pune
# [[3]]
# Destination CheckInDate CheckOutDate Rooms NoOfPax NoOfAdult NoOfChildren
# 1 Mumbai 14-Oct-2016 15-Oct-2016 1 3 3 0

Get a JSON text

I have this JSON text:
data = {"one":"number","two":"string","three":"number","four":[{"five":"number","six","string"},{"five":"number","six":"string"}]}
How can I get "five"'s number and "six"'s string using Python 3.3 and the json module?
P.S.: If I do print(data['five']), it fails with this error:
print(data['five'])
KeyError: 'five'
Thanks,
Marco
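"five" and "six" are not top-level keys; they sit inside the dictionaries in the list stored under "four", so you have to index into that list first. Try this: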
data = {"one":"number","two":"string","three":"number","four":[{"five":"number","six":"string"},{"five":"number","six":"string"}]}
print(data['four'][0]['five']) # number
print(data['four'][0]['six']) # string
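If the data really arrives as JSON text rather than as a Python literal, a minimal sketch of the same lookup with the json module could look like the following. The string text is a hypothetical stand-in that mirrors the structure in the question, and the loop assumes every entry under "four" has both keys.

```python
import json

# Hypothetical JSON text with the same structure as in the question.
text = (
    '{"one": "number", "two": "string", "three": "number", '
    '"four": [{"five": "number", "six": "string"}, '
    '{"five": "number", "six": "string"}]}'
)

# json.loads turns the JSON text into nested dicts and lists.
data = json.loads(text)

# "four" holds a list of dicts, so collect "five" and "six" from each entry.
pairs = [(item["five"], item["six"]) for item in data["four"]]

for five, six in pairs:
    print(five, six)
```

Iterating over data["four"] avoids hard-coding the index 0 and picks up every entry in the list.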

JSON (using jsonlite) parsing error in R

I have the following JSON file:
{"id":1140854908,"name":"'Amran"}
{"id":1140852651,"name":"'Asir"}
{"id":1140855190,"name":"'Eua"}
{"id":1140851307,"name":"A Coruna"}
{"id":1140854170,"name":"A`Ana"}
I used the package jsonlite but I get a parsing error
library(jsonlite)
try <- fromJSON("states.txt",simplifyDataFrame = T)
# Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
# parse error: trailing garbage
# :1140854908,"name":"'Amran"} {"id":1140852651,"name":"'Asir"
# (right here) ------^
Try changing your data file to the following:
[
{"id":1140854908,"name":"'Amran"}
,{"id":1140852651,"name":"'Asir"}
,{"id":1140855190,"name":"'Eua"}
,{"id":1140851307,"name":"A Coruna"}
,{"id":1140854170,"name":"A`Ana"}
]
With that change, the same code worked for me. The parser expects a single JSON value, in this case an array.
Your file is newline-delimited JSON (http://ndjson.org/). You can read it with jsonlite like this:
try <- stream_in(file("states.txt"))

httr GET operation unable to access JSON response

I am trying to access the JSON response from an API call in my R script. The API call is successful, and I can view the JSON response in the console. However, I am unable to access any data from it.
A sample code segment is:
require(httr)
target <- '#trump'
sentence<- 'Donald trump has a wonderful toupe, it really is quite stunning that a man can be so refined and elegant'
query <- url_encode(sentence)
target <- gsub('#', '', target)
endpoint <- "https://alchemy.p.mashape.com/text/TextGetTargetedSentiment?outputMode=json&target="
apiCall <- paste(endpoint, target, '&text=', query, sep = '')
resp <-GET(apiCall, add_headers("X-Mashape-Key" = sentimentKey, "Accept" = "application/json"))
stop_for_status(resp)
headers(resp)
str(content(resp))
content(resp, "text")
I followed the examples in the httr quickstart guide on CRAN as well as a Stack Overflow post.
Unfortunately, I keep getting either "unused parameters 'text' in content()" or "no definition exists for content() accepting a class of 'response'". Does anyone have any advice? P.S.: the headers will print, and resp$content will print the raw bitstream.
Expanding on the comment, you need to set the content type explicitly in the call to content(...). Since your code is not reproducible, here is an example using the Census Bureau's geocoder (which returns a JSON response).
library(httr)
url <- "http://geocoding.geo.census.gov/geocoder/locations/onelineaddress"
resp <-GET(url, query=list(address="1600 Pennsylvania Avenue, Washington DC",
benchmark=9,
format="json"))
json <- content(resp, type="application/json")
json$result$addressMatches[[1]]$coordinates
# $x
# [1] -77.038025
#
# $y
# [1] 38.898735
Assuming you are actually getting a JSON response, and that it is well-formed, simply using content(resp, type="application/json") should work.

R JSON UTF-8 parsing

I have an issue when trying to parse a JSON file containing Russian (Cyrillic) text in R. The file looks like this:
[{"text": "Валера!", "type": "status"}, {"text": "когда выйдет", "type": "status"}, {"text": "КАК ДЕЛА?!)", "type": "status"}]
and it is saved in UTF-8 encoding. I tried the rjson, RJSONIO and jsonlite libraries to parse it, but none of them work:
library(jsonlite)
allFiles <- fromJSON(txt="ru_json_example_short.txt")
gives me this error:
Error in feed_push_parser(buf) :
lexical error: invalid char in json text.
[{"text": "Валера!", "
(right here) ------^
When I save the file in ANSI encoding, it parses without error, but the Russian characters turn into question marks, so the output is unusable.
Does anyone know how to parse such a JSON file in R, please?
Edit: The above applies to a UTF-8 file saved in Windows Notepad. When I save it in PSPad and then parse it, the result looks like this:
text type
1 <U+0412><U+0430><U+043B><U+0435><U+0440><U+0430>! status
2 <U+043A><U+043E><U+0433><U+0434><U+0430> <U+0432><U+044B><U+0439><U+0434><U+0435><U+0442> status
3 <U+041A><U+0410><U+041A> <U+0414><U+0415><U+041B><U+0410>?!) status
Try the following:
dat <- fromJSON(sprintf("[%s]",
paste(readLines("./ru_json_example_short.txt"),
collapse=",")))
dat
[[1]]
text type
1 Валера! status
2 когда выйдет status
3 КАК ДЕЛА?!) status
ref: Error parsing JSON file with the jsonlite package