Recently I ran into a problem parsing JSON responses that contain embedded HTML.
[{"Date":"\/Date(1316445326553+0400)\/",
"Dishes":null,"Id":103,"Name":"Menutka уже с Вами!",
"PictureId":130144,
"TextHtml":"<!DOCTYPE html PUBLIC '-\/\/W3C\/\/DTD XHTML 1.0 Transitional\/\/EN' 'http:\/\/www.w3.org\/TR\/xhtml1\/DTD\/xhtml1-transitional.dtd'>\u000a<html xmlns='http:\/\/www.w3.org\/1999\/xhtml'>\u000a<head>\u000a... etc",
"Type":1,"UserId":1,"UserName":"Администратор"}]
I tried to call JSON.parse response.body, where response.body is my JSON. It silently obeys, but returns an empty collection. I validated this JSON on this site and it says it is valid.
So I'm a bit confused about what's gone wrong.
PS: this is my parse method:
def self.get(uri)
  raw_url = BASE_URL+uri
  url = "#{BASE_URL}"+CGI::escape(uri)
  f = File.open('response.log', 'a')
  start = Time.new
  f.print "#{start.to_s}\t#{uri}"
  resp = Net::HTTP.get_response(URI.parse(url))
  stop = Time.new
  f.puts "\t\t#{stop-start} seconds"
  f.close
  data = resp.body
  begin
    if data.blank? or data.include?('<html')
      return {}
    end
    object = JSON.parse(data)
  rescue JSON::ParserError
    raise Exceptions::NothingReturned, "GET Error on #{raw_url}"
  end
end
I'm not sure it's that simple, but you're deliberately returning {} whenever the response body contains '<html'. Your JSON has that string inside TextHtml, so the method returns an empty hash before JSON.parse is ever called. What behavior do you want?
If you want to distinguish between HTML and JSON responses, just use the Content-Type response header.
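The Content-Type check suggested above could look roughly like this. It is a minimal Python sketch rather than Ruby (the function name `is_json_response` is hypothetical); it only looks at the media type and ignores parameters such as charset.

```python
def is_json_response(content_type_header):
    """Return True when a Content-Type header value names a JSON media type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type_header or "").split(";", 1)[0].strip().lower()
    # Accept plain JSON and structured-syntax suffixes like application/problem+json.
    return media_type == "application/json" or media_type.endswith("+json")
```

With a check like this, the body is never scanned for '<html', so HTML embedded inside a JSON string no longer short-circuits parsing.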
I'm trying to write a simple Roku application.
When I load the JSON file via roURLTransfer, the ParseJSON function gives me BRIGHTSCRIPT: ERROR: ParseJSON: Unknown identifier.
If I load the JSON file via ReadAsciiFile("pkg:/feed/feed.json") it works.
The JSON file is the same and I'm pretty sure that my JSON is correct.
url = "http://192.168.1.36/misc/roku/ifilm/feed.json"
result = ""
timeout = 10000
ut = CreateObject("roURLTransfer")
ut.SetPort(CreateObject("roMessagePort"))
ut.SetURL(url)
if ut.AsyncGetToString()
    event = wait(timeout, ut.GetPort())
    if type(event) = "roUrlEvent"
        result = event.GetString()
    elseif event = invalid
        ut.AsyncCancel()
    else
        print "roUrlTransfer::AsyncGetToString(): unknown event"
    end if
end if
' `print result` shows the correct lintable JSON
' print result
' Next line gives me: BRIGHTSCRIPT: ERROR: ParseJSON: Unknown identifier
json = ParseJSON(result)
But putting the JSON file inside the app works:
feed = ReadAsciiFile("pkg:/feed/feed.json")
sleep(2000)
json = ParseJson(feed)
I need to load the data from the Internet, so using the embedded version doesn't help me. Does anyone know what I should do to make it work?
The "Unknown identifier" error is usually because there's a character in the json string that ParseJson() does not support. The reason why ReadAsciiFile() works is likely because the function "cleans up" the json string by applying UTF-8 encoding.
A common character that's present at the beginning of some JSON responses and causes this issue is the Unicode Byte Order Mark (BOM).
If you google "byte order mark json" you'll see lots of cases where this affects other platforms as well.
You can just do a simple find and replace to get rid of that character before attempting to parse the string.
bomChar = Chr(65279)
if result.left(len(bomChar)) = bomChar ' Check if the string has the BOM char prefix
    result = result.replace(bomChar, "")
end if
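The same BOM problem shows up on other platforms, as mentioned above. For comparison, here is a minimal Python sketch of the same idea (the function names are hypothetical): strip a leading U+FEFF from an already-decoded string, or decode raw bytes with the `utf-8-sig` codec, which skips the BOM for you.

```python
import json

BOM = "\ufeff"  # U+FEFF, the same code point as Chr(65279) in the BrightScript snippet

def parse_json_text(text):
    """Parse JSON from an already-decoded string, dropping one leading BOM."""
    if text.startswith(BOM):
        text = text[len(BOM):]
    return json.loads(text)

def parse_json_bytes(raw):
    """Parse JSON from raw bytes; 'utf-8-sig' skips a leading UTF-8 BOM."""
    return json.loads(raw.decode("utf-8-sig"))
```

Only the leading character is removed here, so a U+FEFF that legitimately appears later in a string value is left alone.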
If that doesn't work, then your response may have some other conflicting character, in that case I would advise using ifUrlTransfer::AsyncGetToFile() instead of AsyncGetToString() and then use ReadAsciiFile() which should guarantee a properly formatted json string every time (as long as your json is valid).
I receive POSTed JSON with mod_wsgi on Apache. I have to forward the JSON to some API (using POST), take the API's response, and respond back to where the initial POST came from.
Here is the Python code:
import requests
import urllib.parse

def application(environ, start_response):
    url = "http://texchange.nowtaxi.ru/api/secret_api_key/"
    query = environ['QUERY_STRING']
    if query == "get":
        url += "tariff/list"
        r = requests.get(url)
        response_headers = [('Content-type', 'application/json')]
    else:
        url += "order/put"
        input_len = int(environ.get('CONTENT_LENGTH', '0'))
        data = environ['wsgi.input'].read(input_len)
        decoded = data.decode('utf-8')
        unquoted = urllib.parse.unquote(decoded)
        print(decoded)   # 'from%5Baddress%5D=%D0%'
        print(unquoted)  # 'from[address]=\xd0\xa0'
        r = requests.post(url, data)
        # Content-Length counts bytes of the encoded body, not characters
        output_len = len(r.text.encode('utf-8'))
        response_headers = [('Content-type', 'application/json'),
                            ('Content-Length', str(output_len))]
    status = "200 OK"
    start_response(status, response_headers)
    return [r.text.encode('utf-8')]
The actual JSON starts "{"from":{"address":"Россия
I thought those \x sequences were escaped symbols, so I tried ast.literal_eval and codecs.getdecoder("unicode_escape"), but it didn't help. I can't properly google the case, because I feel like I've misunderstood what is happening here. Maybe I have to somehow change the $.post() call in the .js file that sends the POST to the WSGI script?
UPD: my bro said that it's totally unclear what I need, so I'll clarify. I need to get the string that represents the received JSON in its initial form, with Cyrillic letters, quotes, braces, etc. What I DO get after decoding the received byte sequence is 'from%5Baddress%5D=%D0%'. If I unquote it, it turns into 'from[address]=\xd0\xa0', but that's still not what I want.
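The strings quoted above look like `application/x-www-form-urlencoded` data with bracketed keys, which is what jQuery's `$.post()` produces when it is handed a plain object instead of a JSON string. Assuming that is what is happening here, the following sketch rebuilds a nested structure from such a body. The function name `form_to_nested` is hypothetical, and the sketch handles only `key[sub]` style object keys, not array keys such as `items[]` or keys that are used both as a value and as a parent.

```python
import json
import re
from urllib.parse import parse_qsl

def form_to_nested(body):
    """Turn b'from%5Baddress%5D=...' into {'from': {'address': '...'}}."""
    result = {}
    # parse_qsl percent-decodes keys and values as UTF-8 by default.
    for key, value in parse_qsl(body.decode("utf-8"), keep_blank_values=True):
        parts = re.findall(r"[^\[\]]+", key)  # 'from[address]' -> ['from', 'address']
        if not parts:
            continue  # skip pairs with an empty key
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result

def form_to_json(body):
    """Serialise the rebuilt structure, keeping Cyrillic letters unescaped."""
    return json.dumps(form_to_nested(body), ensure_ascii=False)
```

If the client side can be changed, sending `JSON.stringify(data)` with a JSON content type would likely avoid the conversion altogether, since the WSGI script would then receive the JSON text as-is.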
I have a backend for an iOS application. I am trying to read data from my Rails backend using JSON. My BubbleWrap GET request is as follows.
BW::HTTP.get("url_here/children/1.json") do |response|
  json = BW::JSON.parse response.body.to_str
  for line in json
    p line[:name]
  end
end
It doesn't bring any data back, it actually breaks my code. I can't find any documentation with an example of how to use REST from rubymotion/Bubblewrap and pull data back to my application.
Any help is appreciated.
Here's a handy class abstraction that I use in a lot of my applications... it completely abstracts the API call logic from the view controller logic for separation of concerns and is heavily modeled after Matt Green's Inspect 2013 talk.
class MyAPI
  APIURL = "http://your.api.com/whatever.json?date="

  def self.dataForDate(date, &block)
    BW::HTTP.get(APIURL + date) do |response|
      json = nil
      error = nil
      if response.ok?
        json = BW::JSON.parse(response.body.to_str)
      else
        error = response.error_message
      end
      block.call json, error
    end
  end
end
Then to call this class we do:
MyAPI.dataForDate(dateString) do |json, error|
  if error.nil?
    if json.count > 0
      json.each do |cd|
        # Whatever with the data
      end
    else
      App.alert("No Results.")
    end
  else
    App.alert("There was an error downloading data from the server. Please check your internet connection or try again later.")
  end
end
Always check the response code before parsing the response body. You might
BW::HTTP.get(url) do |response|
  if response.ok?
    data = BW::JSON.parse(response.body.to_str)
    # make sure you have an array or hash before you try iterating over it
    data.each {|item| p item}
  else
    warn "Trouble"
  end
end
Also make sure you sanity-check the JSON response vs your code's expectation. Perhaps the JSON is an array and not a hash?
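The two checks described above (status first, then the shape of the parsed data) are not specific to RubyMotion. A minimal Python sketch of the same pattern follows; `parse_items` is a hypothetical helper that takes the status code and body text from whatever HTTP client is in use.

```python
import json

def parse_items(status_code, body_text):
    """Return a list of items, or raise ValueError describing what went wrong."""
    # 1. Check the response code before touching the body.
    if not 200 <= status_code < 300:
        raise ValueError(f"request failed with HTTP status {status_code}")
    # 2. Parse; json.JSONDecodeError is a subclass of ValueError.
    data = json.loads(body_text)
    # 3. Sanity-check the shape: the caller expects something it can iterate over.
    if isinstance(data, dict):
        return [data]  # a single object, e.g. /children/1.json
    if isinstance(data, list):
        return data
    raise ValueError(f"expected a JSON array or object, got {type(data).__name__}")
```

Wrapping a single object in a list is one possible design choice; it lets the calling loop stay the same whether the endpoint returns one record or many.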
I have code that looks like this:
def client = new groovyx.net.http.RESTClient('myRestFulURL')
def json = client.get(contentType: JSON)
net.sf.json.JSON jsonData = json.data as net.sf.json.JSON
def slurper = new JsonSlurper().parseText(jsonData)
However, it doesn't work! :( The code above gives an error in parseText because the json elements are not quoted. The overriding issue is that the "data" is coming back as a Map, not as real Json. Not shown, but my first attempt, I just passed the parseText(json.data) which gives an error about not being able to parse a HashMap.
So my question is: how do I get JSON returned from the RESTClient to be parsed by JsonSlurper?
The RESTClient class automatically parses the content and it doesn't seem possible to keep it from doing so.
However, if you use HTTPBuilder you can override the behavior. You want to get the information back as text, but setting only the contentType to TEXT won't work, because HTTPBuilder uses the contentType parameter of the HTTPBuilder.get() method to determine both the Accept HTTP header to send and the parsing to apply to the returned object. In this case, you need application/json in the Accept header, but you want the parsing for TEXT (that is, no parsing).
The way you get around that is to set the Accept header on the HTTPBuilder object before calling get() on it. That overrides the header that would otherwise be set on it. The below code runs for me.
@Grab(group='org.codehaus.groovy.modules.http-builder', module='http-builder', version='0.6')
import static groovyx.net.http.ContentType.TEXT
def client = new groovyx.net.http.HTTPBuilder('myRestFulURL')
client.setHeaders(Accept: 'application/json')
def json = client.get(contentType: TEXT)
def slurper = new groovy.json.JsonSlurper().parse(json)
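The idea above, asking for JSON in the Accept header while receiving the body as unparsed text, can be sketched outside Groovy as well. The following Python version uses only the standard library; the function names are hypothetical and the URL passed in is whatever endpoint you are calling.

```python
import json
import urllib.request

def build_json_request(url):
    """Build a GET request that asks the server for JSON via the Accept header."""
    return urllib.request.Request(url, headers={"Accept": "application/json"})

def fetch_json(url, timeout=10):
    """Fetch the body as raw text first, then parse it explicitly."""
    with urllib.request.urlopen(build_json_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        raw_text = resp.read().decode(charset)
    return json.loads(raw_text)
```

Keeping the request-building step separate from the parsing step mirrors the split described above: the header controls what the server sends, and parsing stays under the caller's control.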
The type of the response from RESTClient depends on the version of:
org.codehaus.groovy.modules.http-builder:http-builder
For example, with version 0.5.2, I was getting a net.sf.json.JSONObject back.
In version 0.7.1, it returns a HashMap, as observed in the question.
When it's a map, you can simply access the JSON data using the normal map operations:
def jsonMap = restClientResponse.getData()
def user = jsonMap.get("user")
....
The solution posted by jesseplymale works for me, too.
HTTPBuilder depends on some Apache libraries, so to avoid adding those dependencies to your project, you can use this solution, which does not rely on HTTPBuilder:
import groovy.json.JsonSlurper

def jsonSlurperRequest(urlString) {
    def url = new URL(urlString)
    def connection = (HttpURLConnection) url.openConnection()
    connection.setRequestMethod("GET")
    connection.setRequestProperty("Accept", "application/json")
    connection.setRequestProperty("User-Agent", "Mozilla/5.0")
    // parse the response body straight from the connection's input stream
    new JsonSlurper().parse(connection.getInputStream())
}
I'm trying to write a simple server frontend to a python3 application, using a restful JSON-based protocol. So far, bottle seems the best suited framework for the task (it supports python3, handles method dispatching in a nice way, and easily returns JSON.) The problem is parsing the JSON in the input request.
The documentation only mentions request.fields and request.files, both of which I assume refer to multipart/form-data data. There is no mention of accessing the request data directly.
Peeking at the source code, I can see a request.body object of type BytesIO. json.load refuses to act on it directly, dying in the json lib with can't use a string pattern on a bytes-like object. The proper way to do it may be to first decode the bytes to unicode characters, according to whichever charset was specified in the Content-Type HTTP header. I don't know how to do that; I can see a StringIO class and assume it may hold a buffer of characters instead of bytes, but see no way of decoding a BytesIO to a StringIO, if this is even possible at all.
Of course, it may also be possible to read the BytesIO object into a bytestring, then decode it into a string before passing it to the JSON decoder, but if I understand correctly, that breaks the nice buffering behavior of the whole thing.
Or is there a better way to do it?
It seems that io.TextIOWrapper from the standard library does the trick!
import json
from io import TextIOWrapper

def parse(request):
    encoding = ...  # get encoding from headers
    return json.load(TextIOWrapper(request.body, encoding=encoding))
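The snippet above leaves the encoding lookup open. One possible way to fill it in, as a sketch with hypothetical helper names, is to read the charset parameter from the Content-Type header with the standard library's `email.message.Message` and fall back to UTF-8 when none is declared. An `io.BytesIO` stands in for `request.body` here.

```python
import io
import json
from email.message import Message

def charset_from_content_type(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default

def load_json_stream(byte_stream, content_type):
    """Decode a bytes stream with the declared charset and parse it as JSON."""
    encoding = charset_from_content_type(content_type)
    return json.load(io.TextIOWrapper(byte_stream, encoding=encoding))
```

Note that `json.load` reads the whole stream before parsing, so the wrapper saves the manual decode step rather than memory.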
Here's what I do to read in JSON on a RESTful service with Python 3 and Bottle:
import bottle
import bson.json_util as bson_json

@app.post('/location/API')  # app is assumed to be a bottle.Bottle() instance created elsewhere
def post_json_example():
    """
    param: _id, value
    return: I usually return something like {"status": "successful", "message": "description"}
    """
    query_string = bottle.request.query.json
    query_dict = bson_json.loads(query_string)
    _id = query_dict['_id']
    value = query_dict['value']
Then, to test from a Python 3 interpreter:
import requests
s = requests.Session()
r = s.post('http://youserver.com:8080/location/API?json={"_id":"540a16663dafb492a0a7626c","value":"test"}')
Use r.text to verify what was returned.
I wrote a helper based on b0fh's good idea.
After two weeks of analyzing response.json, I came to Stack Overflow and understood that we need a workaround.
Here it is:
import json
from io import TextIOWrapper
from bottle import request, response, post

# _allow_origin and _allow_methods are defined elsewhere in the application

def json_app_rqt():
    # about request
    request.accept = 'application/json, text/plain; charset=utf-8'

def json_app_resp():
    # about response
    response.headers['Access-Control-Allow-Origin'] = _allow_origin
    response.headers['Access-Control-Allow-Methods'] = _allow_methods
    # response.headers['Access-Control-Allow-Headers'] = _allow_headers
    response.headers['Content-Type'] = 'application/json; charset=utf-8'

def json_app():
    json_app_rqt()
    json_app_resp()

def get_json_request(rqt):
    # decode the request body as UTF-8 text, then parse it as JSON
    with TextIOWrapper(rqt.body, encoding="UTF-8") as json_wrap:
        json_text = ''.join(json_wrap.readlines())
        json_data = json.loads(json_text)
        return json_data
To use it, we can do:
if __name__ == "__main__":
    json_app()

@post("/train_control/:control")
def do_train_control(control):
    json_app_resp()
    data = get_json_request(request)
    print(json.dumps(data))
    return data
Thanks to all