Why can't this text be parsed through fastjson2?

import com.alibaba.fastjson2.JSONArray
JSONArray.parseArray(str).toString()
I parse this JSON string with fastjson2 and then call toString() on the result, but I get an error:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at com.alibaba.fastjson2.JSONWriterUTF16JDK8.writeString(JSONWriterUTF16JDK8.java:183)
at com.alibaba.fastjson2.writer.ObjectWriterImplMap.write(ObjectWriterImplMap.java:428)
at com.alibaba.fastjson2.writer.ObjectWriterImplMap.write(ObjectWriterImplMap.java:457)
at com.alibaba.fastjson2.writer.ObjectWriterImplList.write(ObjectWriterImplList.java:278)
at com.alibaba.fastjson2.JSONArray.toString(JSONArray.java:871)
Similar strings work normally. I can't find which special character causes the error.
My str is:
[{"response_info":{"header":"Content-Length: 388\r\nContent-Type: application/octet-stream\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36\r\nHost: 180.102.211.212\r\n","body":"\u0000\u0000\u0000\u0003seq\u0000\u0000\u0000\u000241\u0000\u0000\u0000\u0003ver\u0000\u0000\u0000\u00011\u0000\u0000\u0000\tweixinnum\u0000\u0000\u0000\n1429629729\u0000\u0000\u0000\u0007authkey\u0000\u0000\u0000D0B\u0002\u0001\u0001\u0004;09\u0002\u0001\u0002\u0002\u0001\u0001\u0002\u0004U6k!\u0002\u0003\u000fBA\u0002\u0004\u0015zXu\u0002\u0004\ufffd\ufffdf\ufffd\u0002\u0003\u000fU\ufffd\u0002\u0003\u0006\u0000\u0000\u0002\u0004U6k!\u0002\u0004d=\u001eS\u0002\u0004\ufffd\ufffd7\u0019\u0004\u0000\u0000\u0000\u0000\u0006rsaver\u0000\u0000\u0000\u00011\u0000\u0000\u0000\brsavalue\u0000\u0000\u0000\ufffd\ufffd\ufffd\ufffd\ufffd\u0006\u001d\ufffd_;\ufffdi\ufffdT.\ufffd\ufffd\"CK\ufffd/\u00169\u0018\u0015bI\ufffd\ufffd`<n\ufffd\ufffd\ufffdw\ufffd\ufffd\ufffd!\ufffd\u001a\u0003\ufffdHh\ufffdP%i$\ufffd$\ufffd\u0005\ufffd<\ufffd8\ufffd\ufffd\ufffd\ufffd\n\ufffd$\u0016A-O5\ufffd`\r\ufffd\ufffdc\ufffd\ufffd\u001b\ufffd\ufffd\r3\ufffd\ufffd`\ufffd)\ufffd\ufffdV\ufffdf \ufffd`\t\ufffd%\u0010\ufffd\ufffd\ufffdJ\ufffd\u001aCu\u0010\u000b\ufffd\u0001X\ufffd\ufffd\u01b7\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd.\u0000\u0000\u0000\u0007filemd5\u0000\u0000\u0000 0d65f9a4beb26b55874965490344abef\u0000\u0000\u0000\bfiletype\u0000\u0000\u0000\u00015\u0000\u0000\u0000\u0006touser\u0000\u0000\u0000\u00101688854880368629"}}]
The fastjson2 version is 2.0.10.

How can I get a proper failure description, or an exception ID (Icode), in Couchbase N1QL audit logs when a query fails?

Below is one failure log entry for the same id. The record only marks the status as success or errors.
In this failure case the description just says "A N1QL EXPLAIN statement was executed" and gives no exception ID or detailed description:
{"clientContextId":"INTERNAL-b8d19563-94a1-442d-9a09-dde36743fb7d","description":"A N1QL EXPLAIN statement was executed","id":28673,"isAdHoc":true,"metrics":{"elapsedTime":"11.921ms","executionTime":"11.764ms","resultCount":1,"resultSize":649},"name":"EXPLAIN statement","node":"127.0.0.1:8091","real_userid":{"domain":"builtin","user":"Administrator"},"remote":{"ip":"127.0.0.1","port":44695},"requestId":"958a7e12-d5a6-4d7b-bd40-ac9bb60cf4a3","statement":"explain INSERT INTO `Guardium` (KEY, VALUE) \nVALUES ( "id::5554\n", { "Emp Name": "Test4", "Emp Company" : "GS Lab", "Emp Country" : "India"} )\nRETURNING *;","status":"errors","timestamp":"2021-01-07T09:37:00.486Z","userAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 (Couchbase Query Workbench (6.6.1-9213-enterprise))"}
Please advise. I want audit.log to contain a proper description of why the N1QL statement failed.
Thank you.

Powerschool Login Form Data

I'm trying to log in to PowerSchool to scrape my grades. Whenever I run the code, it gives me the login page's HTML instead of the secured page's HTML.
Question 1: How do I get the values of the 3 changing fields, labeled 'this changes' in the code below, and submit them with the current POST?
Question 2: Do I need to add anything to the code for my password, which gets hashed on each POST?
https://ps.lphs.net/public/home.html <--- Link to login page for HTML code.
import requests
payload = {
'pstoken': 'this changes',
'contextData': 'this changes',
'dbpw': 'this changes',
'translator_username': '',
'translator_password': '',
'translator_ldappassword': '',
'serviceName':' PS Parent Portal',
'serviceTicket':'',
'pcasServerUrl':' /',
'credentialType':'User Id and Password Credential',
'account':'200276',
'pw':'my password',
'translatorpw':''
}
head = {'User-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3180.0 Safari/537.36'}
with requests.Session() as s:
    p = s.post('https://ps.lphs.net/public/', data=payload, headers=head)
    r = s.get('https://ps.lphs.net/guardian/home.html')
    print(r.text)
EDIT 1:
s.headers = {
'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3180.0 Safari/537.36'}
p = s.get('https://ps.lphs.net/guardian/home.html')
print(p.text)
r = s.post('https://ps.lphs.net/guardian/home.html', data=payload,
headers={'Content-Type': 'application/x-www-form-urlencoded',
'Referer': 'https://ps.lphs.net/public/home.html'})
print(r.text)
Give this a shot. It should fetch you the valid response:
import requests
payload = {
'pstoken': 'this changes',
'contextData': 'this changes',
'dbpw': 'this changes',
'translator_username': '',
'translator_password': '',
'translator_ldappassword': '',
'serviceName':' PS Parent Portal',
'serviceTicket':'',
'pcasServerUrl':' /',
'credentialType':'User Id and Password Credential',
'account':'200276',
'pw':'my password',
'translatorpw':''
}
with requests.Session() as s:
    s.headers = {'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3180.0 Safari/537.36'}
    r = s.post('https://ps.lphs.net/guardian/home.html', data=payload,
               headers={'Content-Type': 'application/x-www-form-urlencoded',
                        'Referer': 'https://ps.lphs.net/public/home.html'})
    print(r.text)
By the way, change the parameters in the payload (if needed) to get logged in.
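For Question 1, one possible approach is to fetch the login page first and copy its hidden form fields into the payload before posting. The sketch below uses only Python's standard-library HTML parser. `extract_hidden_fields` and `login` are hypothetical helper names, and it assumes that pstoken, contextData and dbpw appear as hidden <input> elements in the login page's HTML. It does not reproduce any password hashing the page may do in JavaScript.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Keep only hidden inputs that actually have a name.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of hidden form fields found in the given HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def login(session, login_url, post_url, account, password):
    """Hypothetical helper: GET the form, then POST fresh tokens plus credentials.

    `session` is expected to be a requests.Session (an assumption).
    """
    page = session.get(login_url)
    payload = extract_hidden_fields(page.text)  # fresh per-request values
    payload.update({"account": account, "pw": password})
    return session.post(post_url, data=payload, headers={"Referer": login_url})
```

With a requests.Session, this would be called as login(s, 'https://ps.lphs.net/public/home.html', 'https://ps.lphs.net/guardian/home.html', '200276', 'my password'). Whether that is enough to log in depends on how the site treats the pw field.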

Get data from URL json in R

I'm using R and would like to get JSON from a URL. I have around 5000 user agents to send to this API (http://www.useragentstring.com/pages/api.php).
I use this code to create the url and concatenate the user-agent:
url_1<-paste(" \"http://www.useragentstring.com/?uas=",uaelenchi[11,1],"&getJSON=all\"",sep = '');
json_data2<-fromJSON(readLines(cat(url_1)))
But I receive this error:
Error in readLines(cat(url_1)) : 'con' is not a connection
Any suggestions would be really appreciated! Thanks
I use rjson::fromJSON(file = paste(your_url)). If you make a reproducible example, I could check if it is working in your case.
library(httr)
library(jsonlite)
library(purrr)
uas <- c("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0",
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0",
"Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.6 Safari/537.11",
"Mozilla/5.0 (X11; OpenBSD amd64; rv:28.0) Gecko/20100101 Firefox/28.0",
"Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.6 Safari/537.11",
"Mozilla/5.0 (X11; OpenBSD amd64; rv:28.0) Gecko/20100101 Firefox/28.0",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120405 Firefox/14.0a1",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120405 Firefox/14.0a1",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36")
parse_uas <- function(uas) {
  res <- GET("http://www.useragentstring.com/", query=list(uas=uas, getJSON="all"))
  stop_for_status(res)
  # content() returns the body as text; the pipe passes it to fromJSON() as its first argument
  content(res, as="text", encoding="UTF-8") %>%
    fromJSON(flatten=TRUE) %>%
    as.data.frame(stringsAsFactors=FALSE)
}
map_df(uas, parse_uas)
To save API calls you should add a caching layer to the parse_uas() function, which could be done pretty easily with the memoise package:
library(memoise)
.parse_uas <- function(uas) {
  res <- GET("http://www.useragentstring.com/", query=list(uas=uas, getJSON="all"))
  stop_for_status(res)
  # content() returns the body as text; the pipe passes it to fromJSON() as its first argument
  content(res, as="text", encoding="UTF-8") %>%
    fromJSON(flatten=TRUE) %>%
    as.data.frame(stringsAsFactors=FALSE)
}
parse_uas <- memoise(.parse_uas)
If you're on Linux, you can also try this package (it doesn't compile well on macOS and not at all on Windows, IIRC), which does all the processing locally.
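For comparison, the same two ideas (letting a library build and encode the query string, and caching repeated lookups) can be sketched in Python with only the standard library. `build_url` and `parse_uas` here are illustrative names, and the sketch assumes the API still answers at the URL used above and returns UTF-8 JSON.

```python
import json
from functools import lru_cache
from urllib.parse import urlencode
from urllib.request import urlopen

API = "http://www.useragentstring.com/"


def build_url(uas):
    """Build the request URL; urlencode() escapes spaces, slashes and parentheses."""
    return API + "?" + urlencode({"uas": uas, "getJSON": "all"})


@lru_cache(maxsize=None)
def parse_uas(uas):
    """Fetch and decode the API response; repeated user agents hit the cache."""
    with urlopen(build_url(uas)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Because many of the 5000 user agents are likely to be duplicates, the cache means each distinct string is sent to the API only once. Note that lru_cache hands back the same dict object on every hit, so callers should not modify it in place.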

Error in http_statuses - subscript out of bounds

Can someone explain why session2 in the code below gives me the error that follows it?
library("rvest")
uastring = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
session = html_session("https://www.linkedin.com/job/", user_agent(uastring))
session2 = html_session("https://www.linkedin.com/job/")
Error in http_statuses[[as.character(status)]] : subscript out of bounds
I took this example from https://stat4701.github.io/edav/2015/04/02/rvest_tutorial/
How can I check which value of uastring I have to pass to html_session (for different sites)? I'm not asking about this specific site (I used it here because it comes from the tutorial).

Using Jsawk to parse JSON access logs

With our new web servers, the access logs are in JSON, and I can't use the usual awk commands to pull out traffic info. I've found jsawk, but I get a parse error any time I try to pull anything out of the access logs. I have the feeling that the logs are not in a format that the parser likes.
Here is a sample entry from the logs:
{ "#timestamp": "2014-09-30T21:33:56+00:00", "webserver_remote_addr": "24.4.209.153", "webserver_remote_user": "-", "webserver_body_bytes_sent": 193, "webserver_request_time": 0.000, "webserver_status": "404", "webserver_request": "GET /favicon.ico HTTP/1.1", "webserver_request_method": "GET", "webserver_http_referrer": "-", "webserver_http_user_agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36" }
So for example if I want to pull the IP addresses out of the logs, I would use this:
cat access.log | jsawk 'return this.webserver_remote_addr'
However this only results in 'jsawk: JSON parse error:' and the entire access log printed.
Am I correct in assuming that the access logs are in a format the parser doesn't recognize? Each entry in the logs is all on one line. How can I get jsawk to parse properly?
I tried this:
$ echo '{ "#timestamp": "2014-09-30T21:33:56+00:00", "webserver_remote_addr": "24.4.209.153", "webserver_remote_user": "-", "webserver_body_bytes_sent": 193, "webserver_request_time": 0.000, "webserver_status": "404", "webserver_request": "GET /favicon.ico HTTP/1.1", "webserver_request_method": "GET", "webserver_http_referrer": "-", "webserver_http_user_agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36" }' | jsawk 'return this.webserver_remote_addr'
and got this:
24.4.209.153
Update:
I think the problem is that each line is a separate JSON object, and access.log contains multiple lines. There's a good workaround described here: How to use jsawk if every line is a json object?
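If the cause is indeed one JSON object per line, each line has to be parsed on its own rather than the file as a whole. A minimal Python sketch of that idea, as an alternative to jsawk, is shown here. `remote_addrs` is a made-up helper name, and the field name is taken from the sample entry above.

```python
import json


def remote_addrs(lines, field="webserver_remote_addr"):
    """Yield one field from each line of a log with one JSON object per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and field in entry:
            yield entry[field]


if __name__ == "__main__":
    import sys

    # Usage: python remote_addrs.py < access.log
    for addr in remote_addrs(sys.stdin):
        print(addr)
```

Passing another key as `field`, for example "webserver_status", pulls out a different column.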