This isn't a programming question, but some developers may find it interesting.
Today I came across a question in which a user asked about parsing JSON data in C#. Nothing new, but he included a link to a JSON file:
http://sapi.confirmtkt.com/api/platform/hotel/gethotels?city=Bangalore&checkinDate=08-01-2016&checkoutDate=09-01-2016&adults=2&rooms=1&children=0&childrenages=
The page is no longer available.
If you open the link above in Google Chrome, the response is displayed as XML. When I opened the same URL in Edge, it showed JSON.
(Screenshot: the response in Google Chrome)
(Screenshot: the response in Edge)
I was a bit confused: why does Google Chrome show the JSON file as XML?
That's because Chrome sends a different value in the Accept HTTP header:
Chrome's request:
GET http://sapi.confirmtkt.com/api/platform/hotel/gethotels?city=Bangalore&checkinDate=08-01-2016&checkoutDate=09-01-2016&adults=2&rooms=1&children=0&childrenages= HTTP/1.1
Host: sapi.confirmtkt.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.48 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: es,en;q=0.8
Edge's request:
GET http://sapi.confirmtkt.com/api/platform/hotel/gethotels?city=Bangalore&checkinDate=08-01-2016&checkoutDate=09-01-2016&adults=2&rooms=1&children=0&childrenages= HTTP/1.1
Accept: text/html, application/xhtml+xml, image/jxr, */*
Accept-Language: es-CL,es;q=0.8,en-US;q=0.5,en;q=0.3
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586
Accept-Encoding: gzip, deflate
Host: sapi.confirmtkt.com
Connection: Keep-Alive
Chrome's Accept header explicitly lists application/xml;q=0.9, which ranks above the catch-all */*;q=0.8, while Edge's header does not mention XML at all. To confirm, I fired up Fiddler, intercepted Chrome's GET request, and deleted that part of the Accept header; the server then replied with JSON rather than XML.
TL;DR: The server returned two different responses for the same URL because of the Accept header.
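The effect can be reproduced with a small content-negotiation sketch. It is only an illustration: the real server's logic is unknown, and the assumptions here are that it can produce both JSON and XML, that it prefers JSON on a tie, and that it honours q-values in the usual way. The names parse_accept, quality, negotiate and OFFERS are made up for this example.

```python
# Minimal content-negotiation sketch (assumed server behaviour, not the real API).

CHROME_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
EDGE_ACCEPT = "text/html, application/xhtml+xml, image/jxr, */*"

# Assumption: the server can emit these two formats and prefers JSON on a tie.
OFFERS = ["application/json", "application/xml"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, in header order."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media range without an explicit q counts as q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    return entries


def quality(offer, accepted):
    """Return the q-value the client assigns to one offered media type."""
    offer_main = offer.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_type, q in accepted:
        if media_type == offer:
            specificity = 2          # exact match, e.g. application/xml
        elif media_type == offer_main + "/*":
            specificity = 1          # type wildcard, e.g. application/*
        elif media_type == "*/*":
            specificity = 0          # catch-all
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(header, offers):
    """Pick the offer with the highest q; ties go to the earlier offer."""
    accepted = parse_accept(header)
    best = max(offers, key=lambda offer: quality(offer, accepted))
    return best if quality(best, accepted) > 0 else None


print("Chrome:", negotiate(CHROME_ACCEPT, OFFERS))
print("Edge:", negotiate(EDGE_ACCEPT, OFFERS))
```

With Chrome's header, XML matches exactly at q=0.9 while JSON only matches the catch-all at q=0.8, so XML wins. With Edge's header both formats match */* at q=1, and the assumed tie-break picks JSON.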
I have a web server set up on my IoT device. The device is not very powerful and has no file system to serve images from.
Nonetheless, I would still like to serve a favicon to the browsers that request it. Since I do not have a file system, I was planning to store the image directly in the device's source code. I have read in this thread that you can convert an image into an encoded string, so I could save that string in a variable, something like
String imageString = "Encoded String Here";
Here is what Google Chrome's favicon HTTP request looks like:
20:46:21.767 -> GET /favicon.ico HTTP/1.1
20:46:21.767 -> Host: 192.168.1.8
20:46:21.767 -> Connection: keep-alive
20:46:21.767 -> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36
20:46:21.814 -> Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
20:46:21.814 -> Referer: http://192.168.1.8/
20:46:21.814 -> Accept-Encoding: gzip, deflate
20:46:21.814 -> Accept-Language: en-US,en;q=0.9,fil;q=0.8
My web server can only respond to requests in plain text. So what should my response look like now that the image is in text format?
HTTP/1.1 200 OK
Connection: close
Content-Type: text/html
//Do I place the encoded string here? Will the browser understand that?
You can generate a base64 string of your favicon and then insert it into the head of the html document like this:
<link href="data:image/x-icon;base64,<your_base64_here>" rel="icon" type="image/x-icon" />
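The Base64 string can be produced once on a development machine and pasted into the device's source. A minimal sketch of that step follows; favicon_link_tag is a hypothetical helper name, and the sample bytes are only a placeholder (the first four bytes of an ICO header), not a usable icon.

```python
import base64


def favicon_link_tag(icon_bytes, mime_type="image/x-icon"):
    """Build a <link> tag that embeds the icon as a Base64 data URI."""
    encoded = base64.b64encode(icon_bytes).decode("ascii")
    return '<link href="data:{0};base64,{1}" rel="icon" type="{0}" />'.format(
        mime_type, encoded
    )


# Placeholder bytes for illustration; in practice, read the real .ico file
# in binary mode and pass its contents instead.
sample_bytes = b"\x00\x00\x01\x00"
print(favicon_link_tag(sample_bytes))
```

The printed tag goes into the head of the HTML page the device already serves as text, so the binary image never has to be sent as a separate response.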
A couple of days ago in Power BI, I was able to create a web query that allowed me to extract the JSON data from NBA Player Stats without using any headers. As of today, I have noticed that the query no longer works; I am getting the following error message:
DataSource.Error: The underlying connection was closed. An unexpected error occurred on a receive.
Details: https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight=
On a related note, I used to be able to pull the JSON data from NBA Team Stats using https://stats.nba.com/ as a Referer header, but now it's giving me the same error message as shown above. To try and get around these errors, I have tried entering the following headers:
Host: stats.nba.com
Connection: keep-alive
Accept: application/json
x-nba-stats-token: true
User-Agent: Chrome/79.0.3945.130
x-nba-stats-origin: stats
Referer: https://stats.nba.com/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
When I do submit the query with the above headers, it comes back with the following error message:
Unable to connect
We encountered an error while trying to connect.
Details: "The 'Host' header must be modified using the appropriate property or method.
Parameter name: name"
I have run out of ideas for how to run the query properly. I'm really new to web scraping and HTML, and I've been trying to teach myself. Any help is greatly appreciated.
All headers for GET request:
Host: stats.nba.com
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Accept: application/json, text/plain, */*
x-nba-stats-token: true
X-NewRelic-ID: VQECWF5UChAHUlNTBwgBVw==
DNT: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
x-nba-stats-origin: stats
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: cors
Referer: https://stats.nba.com/teams/traditional/?sort=TEAM_NAME&dir=-1
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US;q=0.9,en;q=0.7
URL:
https://stats.nba.com/stats/leaguedashteamstats?Conference=&DateFrom=&DateTo=&Division=&GameScope=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=
Required Headers:
Accept: application/json, text/plain, */*
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
x-nba-stats-origin: stats
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: cors
Referer: https://stats.nba.com/teams/traditional/?sort=TEAM_NAME&dir=-1
Not sure if required:
x-nba-stats-token: true
X-NewRelic-ID: VQECWF5UChAHUlNTBwgBVw==
Possible problems:
You were detected as a bot and blocked.
The X-NewRelic-ID header is a token (maybe with a timeout). It is probably assigned based on parameters such as IP address and User-Agent, among others.
You can get a fresh X-NewRelic-ID from the HTML response to a GET request to https://stats.nba.com/.
Here is the part of the HTML that contains the xpid token:
<script type="text/javascript">(window.NREUM||(NREUM={})).loader_config={xpid:"VQECWF5UChAHUlNTBwgBVw==",licenseKey:"09f0cb5c68",applicationID:"76210961"};
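Pulling the token out of that HTML could look like the sketch below. It assumes the page still embeds the token as xpid:"..." exactly as in the snippet above, which may have changed; extract_xpid is a name made up for this example, and fetching the page itself is left out.

```python
import re

# Matches the xpid value inside the NREUM loader_config shown above.
XPID_PATTERN = re.compile(r'xpid:"([^"]+)"')


def extract_xpid(html):
    """Return the xpid token found in the HTML, or None if it is absent."""
    match = XPID_PATTERN.search(html)
    return match.group(1) if match else None


# The fragment quoted in the answer, used here as sample input.
SAMPLE_HTML = (
    '<script type="text/javascript">(window.NREUM||(NREUM={})).loader_config='
    '{xpid:"VQECWF5UChAHUlNTBwgBVw==",licenseKey:"09f0cb5c68",applicationID:"76210961"};'
)

print(extract_xpid(SAMPLE_HTML))
```

The returned value would then be sent as the X-NewRelic-ID header of the stats request, if that header turns out to be required at all.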
On Thursday (2017-04-26), I began seeing the following error when I logged into my application using my Authenticator JSF page.
[#|2017-04-30T15:18:51.649-0500|WARNING|glassfish 4.1|javax.enterprise.web|_ThreadID=30;_ThreadName=http-listener-1(2);_TimeMillis=1493583531649;_LevelValue=
StandardWrapperValve[Faces Servlet]: Servlet.service() for servlet Faces Servlet threw exception
javax.faces.application.ViewExpiredException: viewId:/security/Authenticator.xhtml - View /security/Authenticator.xhtml could not be restored.
    at com.sun.faces.lifecycle.RestoreViewPhase.execute(RestoreViewPhase.java:212)
    at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:101)
    at com.sun.faces.lifecycle.RestoreViewPhase.doPhase(RestoreViewPhase.java:123)
    at com.sun.faces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:198)
My Authenticator.xhtml page is backed by an Authenticator.java class with the following header.
@Named
@ViewScoped
public class Authenticator implements Serializable {
During my research, I discovered the following:
I am able to log into my application using Chrome 58.0.3029.81 one time after restarting the computer running the GlassFish 4.1.2 server. If I log off, I will get the above error on every future log in attempt. (This is a weird one.)
I can log in using Internet Explorer
I can log in using Chrome versions older than 58.0.3029.81.
I can log in using Chrome 57.0.2987.132 on my Android telephone
I can log in using Chrome 58.0.3029.81 if I change the javax.faces.STATE_SAVING_METHOD variable in my web.xml file from server to client.
Why would Chrome 58.0.3029.81 kill the Authenticator view resulting in the ViewExpiredException?
As requested, I analyzed the network traffic and determined that Chrome 58.0.3029.81 sends two more GET requests during the Authenticator.xhtml display process than Chrome 57.0.2987.133 does.
Chrome 57:
GET /webapp/security/Authenticator.xhtml HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
POST /webapp/security/Authenticator.xhtml HTTP/1.1
Chrome 58:
GET /webapp/security/Authenticator.xhtml HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
GET /webapp/security/RES_NOT_FOUND HTTP/1.1
POST /webapp/security/Authenticator.xhtml HTTP/1.1
Since I don't know why Chrome sends the RES_NOT_FOUND GETs in the first place, I don't know whether sending two extra is a bad thing, but it seems to be related to GlassFish 4.1.2 not being able to reconnect to the Authenticator view.
Could this be an issue with my Authenticator.xhtml page or is it a Chrome 58/GlassFish 4.1.2 issue?
The following is a comparison of the Post information:
Chrome 57 Post
POST /webapp/security/Authenticator.xhtml HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Content-Length: 205
Cache-Control: max-age=0
Origin: http://localhost:8081
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://localhost:8081/webapp/security/Authenticator.xhtml
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8
Cookie: JSESSIONID=4067aa3d0df7f2bc26b8200a8c4a;
modena_expandeditems=j_idt32%3Awelcome-menu
authentication-form=authentication-form&authentication-form%3AuserName=XXX&authentication-form%3Apassword=XXX&authentication-form%3Aj_idt93=&javax.faces.ViewState=-4577625721740212982%3A4298605796688550126
Chrome 58 Post
POST /webapp/security/Authenticator.xhtml HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Content-Length: 204
Cache-Control: max-age=0
Origin: http://172.24.1.125:8081
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://172.24.1.125:8081/webapp/security/Authenticator.xhtml
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Cookie: JSESSIONID=4089ef02f0bca32d331de1f5404f
authentication-form=authentication-form&authentication-form%3AuserName=XXX&authentication-form%3Apassword=XXX&authentication-form%3Aj_idt93=&javax.faces.ViewState=3383766421781608154%3A6418504070036764787
The only difference that I see is that Chrome 57 appended "; modena_expandeditems=j_idt32%3Awelcome-menu" after the JSESSIONID.
This turned out to be an issue with version 2.1.1 of the PrimeFaces premium theme called Modena and PrimeFaces 6. During HTTP analysis, I noticed that Chrome 57 sent 2 RES_NOT_FOUND requests and Chrome 58 sent 4 RES_NOT_FOUND requests. This was a known issue with Modena 2.1.1 as documented in the following PrimeFaces Modena Forum issue:
PrimeFaces Modena Forum Issue
During each RES_NOT_FOUND request, the JSESSIONID would change, and something about the two additional changes in Chrome 58 would break the link between the JSESSIONID and the ViewState.
Upgrading Modena to version 2.1.3 eliminated all the RES_NOT_FOUND requests and resolved the ViewExpired issue.
I have seen a lot of URLs lately that seem to have hijacked a portion of a site, here 'www.example.com'. In the example URL "http://www.example.com//wp-includes/js/apparently_user_defined_dir/from_here_on/index.html", 'wp-includes' is WordPress and 'js' is JavaScript. What has (typically) been done to these sites, and what is to be done about it (aside from notifying example.com or their host)?
Thank you.
Apart from the domain, everything else in the URL is the "path".
The browser translates the link you entered into an HTTP request with headers like these (example):
GET //wp-includes/js/apparently_user_defined_dir/from_here_on/index.html HTTP/1.1
Host: www.example.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2024.2 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: el-GR,el;q=0.8
Check this line:
GET //wp-includes/js/apparently_user_defined_dir/from_here_on/index.html HTTP/1.1
It's the server's job to reply to the request, so if the server knows what
//wp-includes/js/apparently_user_defined_dir/from_here_on/index.html
means, it's OK.
Most of the time this is mapped to a path in a folder on the server, but probably not in this case.
The page you entered returns Status Code: 404 Not Found, so your requested page was not found and the server responds with this error page, which for some reason shows the user no error. (We all know this is an example page.)
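How a browser splits that URL into a host and a path can be sketched with Python's standard urllib.parse module. The helper name request_head is made up for this example, and real browsers send many more headers than the two lines built here.

```python
from urllib.parse import urlsplit

URL = "http://www.example.com//wp-includes/js/apparently_user_defined_dir/from_here_on/index.html"


def request_head(url):
    """Return the request line and Host header a client would derive from url."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return ["GET {} HTTP/1.1".format(target), "Host: {}".format(parts.netloc)]


for line in request_head(URL):
    print(line)
```

Everything after the host, including the doubled slash, is passed to the server unchanged; whether it maps to a folder, a script, or nothing at all is entirely the server's decision.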
We have an SPA that loads additional JavaScript modules from a separate backend server, assisted by Require.js. By the nature of the XHR loading procedure, pre-flight OPTIONS requests are made to the backend server, and the Access-Control-Allow-Origin response is perfectly valid. Login and initial module loading work just fine, as expected.
XHR finished loading: "http://backend.cloudapp.net/api/modules/resourceA".
XHR finished loading: "http://backend.cloudapp.net/api/modules/resourceB".
Funny thing is, certain subsequent actions that call for more modules would unexpectedly raise a CORS error in Chrome.
XMLHttpRequest cannot load
http://backend.cloudapp.net/api/modules/resourceC. Origin
https://frontend.cloudapp.net is not allowed by
Access-Control-Allow-Origin.
This does not make sense, since the previous modules loaded just fine. Even the actual OPTIONS preflight came back correct for resourceC. Some other places in the UI load their modules fine too, and Firefox does not appear to suffer from this problem. Has anybody experienced similar CORS errors?
Request/response headers for successful (expected) module
Request URL:http://backend.cloudapp.net/api/modules/resourceA
Request Method:OPTIONS
Status Code:200 OK
Request Headers
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Access-Control-Request-Headers:accept, origin, content-type
Access-Control-Request-Method:GET
Host:backend.cloudapp.net
Origin:https://frontend.cloudapp.net
Proxy-Connection:keep-alive
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
Response Headers
Access-Control-Allow-Headers:accept, origin, content-type
Access-Control-Allow-Methods:GET
Access-Control-Allow-Origin:https://frontend.cloudapp.net
Cache-Control:no-cache
Connection:Keep-Alive
Content-Length:0
Date:Wed, 19 Jun 2013 07:12:42 GMT
Expires:-1
Pragma:no-cache
Proxy-Connection:Keep-Alive
Server:Microsoft-IIS/7.5
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
Request URL:http://backend.cloudapp.net/api/modules/resourceA
Request Method:GET
Status Code:200 OK
Request Headers
Accept:application/json
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Content-Type:application/json
Host:backend.cloudapp.net
Origin:https://frontend.cloudapp.net
Proxy-Connection:keep-alive
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
Response Headers
Access-Control-Allow-Origin:https://frontend.cloudapp.net
Cache-Control:no-cache
Connection:Keep-Alive
Content-Length:5048
Content-Type:application/json; charset=utf-8
Date:Wed, 19 Jun 2013 07:12:42 GMT
Expires:-1
Pragma:no-cache
Proxy-Connection:Keep-Alive
Server:Microsoft-IIS/7.5
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
Request/response headers for unsuccessful module
Request URL:http://backend.cloudapp.net/api/modules/resourceC
Request Method:OPTIONS
Status Code:200 OK
Request Headers
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Access-Control-Request-Headers:accept, origin, content-type
Access-Control-Request-Method:GET
Host:backend.cloudapp.net
Origin:https://frontend.cloudapp.net
Proxy-Connection:keep-alive
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
Response Headers
Access-Control-Allow-Headers:accept, origin, content-type
Access-Control-Allow-Methods:GET
Access-Control-Allow-Origin:https://frontend.cloudapp.net
Cache-Control:no-cache
Connection:Keep-Alive
Content-Length:0
Date:Wed, 19 Jun 2013 07:12:59 GMT
Expires:-1
Pragma:no-cache
Proxy-Connection:Keep-Alive
Server:Microsoft-IIS/7.5
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
Request URL:http://backend.cloudapp.net/api/modules/resourceC
Request Headers
Accept:application/json
Content-Type:application/json
Origin:https://frontend.cloudapp.net
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
(And browser blocks further action)
Chrome has since gone through many version updates, and we now deploy some module resources packaged differently on different AWS infrastructure; we no longer experience this problem.
Inspired by my answer here
It might be worth investigating whether any of the failing XHRs are sending a peculiar Unicode character. In our case, one of our users' names contained a Unicode character, and our HTTP proxy wasn't handling it properly, which led to a CORS error.
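One way to run that check is to scan the captured request headers for anything outside ASCII. This is only a sketch: non_ascii_headers is a hypothetical helper, and the X-User-Name header in the sample is invented for illustration, since the answer does not say where in the request the user's name was sent.

```python
def non_ascii_headers(headers):
    """Return {header name: [offending characters]} for non-ASCII values."""
    flagged = {}
    for name, value in headers.items():
        bad = [ch for ch in value if ord(ch) > 127]
        if bad:
            flagged[name] = bad
    return flagged


# Sample capture; X-User-Name is a made-up header used only for this example.
captured = {
    "Accept": "application/json",
    "Origin": "https://frontend.cloudapp.net",
    "X-User-Name": "Jos\u00e9",
}

print(non_ascii_headers(captured))
```

Any header reported here is a candidate for the kind of proxy mishandling described above; the same scan can be applied to the URL and the request body.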