I am trying to take a packet capture and work backwards to determine which objects are associated with each page request. For example, if a packet capture contains requests for two different webpages, I want to be able to determine, for each object (TCP stream), which root page it is associated with. Is there an easy way to do this?
I know there are tools that will isolate the TCP streams and pull out the data within them; however, I am not looking to replicate the webpage. I am simply looking to associate each stream with the original page that requested it.
What you are trying to do is reconstruct the "call graph" of a browsing session. For a simple analysis, you can inspect just the HTTP headers, and Bro makes this process very convenient: if site A loads site B, A typically shows up in the Referer header of B's request.
However, if you aim for completeness, the task becomes a daunting challenge: you need to parse the HTTP body payload and even the JavaScript to determine all the URLs that are created at runtime in the client, e.g., via AJAX, iframes, and friends.
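To illustrate the simple case, here is a minimal sketch in Java of the Referer-based grouping, assuming you have already extracted (request URL, Referer) pairs from the capture (for example from Bro's http.log); the URLs here are hypothetical and Referer chains are assumed to be acyclic:

    import java.util.HashMap;
    import java.util.Map;

    public class RefererGraph {
        public static void main(String[] args) {
            // Hypothetical (requestUrl, refererUrl) pairs; root pages have no Referer.
            String[][] requests = {
                {"http://a.example/index.html", null},
                {"http://a.example/style.css",  "http://a.example/index.html"},
                {"http://cdn.example/lib.js",   "http://a.example/index.html"},
                {"http://b.example/home",       null},
                {"http://b.example/logo.png",   "http://b.example/home"},
            };

            Map<String, String> refererOf = new HashMap<>();
            for (String[] r : requests) refererOf.put(r[0], r[1]);

            // Walk each object's Referer chain back to a request with no Referer:
            // that request is the root page the object belongs to.
            for (String[] r : requests) {
                String root = r[0];
                while (refererOf.get(root) != null) root = refererOf.get(root);
                System.out.println(r[0] + "  ->  root page: " + root);
            }
        }
    }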
I'm trying to get data from this site: [1] https://www.eurobet.it/it/scommesse/#!/calcio/?temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI
I found this link where I can get the data in JSON format: [2] https://www.eurobet.it/detail-service/sport-schedule/services/discipline/calcio?prematch=1&live=0&temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI
But there is a problem:
The JSON link doesn't work every time; in fact, sometimes I get a 404 error.
I noticed that if I open the first link [1] before opening the second [2], it works perfectly.
This error is also more frequent when I try to scrape other data on the same site: [3] https://www.eurobet.it/detail-service/sport-schedule/services/discipline/calcio/piu-giocate/u-o-goal?prematch=1&live=0&temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI
With this link [3] I try to get all the "u-o-goal" odds, but it works only if, before starting my program to scrape the data, I press the "U/O GOAL" button on the main page [1] -> https://i.stack.imgur.com/Nei5u.png
In my code, I'm using Java and htmlunit to scrape the data.
My question is: how does this webpage work, and why can't I open links [2]/[3] directly? I know there is some sort of request-and-approval system behind it, but I can't see where.
You cannot directly open these URLs because the website (and many like it) uses cookies and bot-prevention/session-tracking techniques so it can gather data about how its website is used - e.g., it expects a "Referer" header to be set.
I'm not going to code a solution for you but I can at least help you understand what you need to do to get to where you want...
I've attempted to summarise how I'd typically unpick a request like this to recreate it, but in essence, you need to understand the sequence of HTTP requests being made (this is how the web works - HTTP requests).
First, you typically start with no session cookies and access the site directly (no Referer).
Once you access a website, the server typically responds with a session cookie - a unique session ID your browser communicates back to the server so it has some record of your browser having already been in contact.
Your browser may make more requests (asynchronously), and in doing so it typically sends the cookies and the referring URL (usually the base URL will work; just don't use a Referer that starts with anything other than "https://www.eurobet.it").
Anything else, you're going to need to figure out. Lots of headers are optional; lots of query params have defaults.
https://stackoverflow.com/a/64671815/7619034 - here's an answer I've given before to this type of question, which comes up often enough.
So, to explain a bit further for your specific scenario...
When you access https://www.eurobet.it/it/scommesse/#!/calcio/?temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI, the server responds with HTTP headers:
...
set-cookie: __cfduid=dd38d***********41125; ...
...
The rest doesn't look that relevant.
Going straight to the other request: https://www.eurobet.it/detail-service/sport-schedule/services/discipline/calcio?prematch=1&live=0&temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI
This HTTP request takes (as input):
cookie: __cfduid=dd38d***********41125; mbox=session#6661556c.....b6e8cc1fa6f03#1608242987; at_check=true; s_ecid=MCMID%***********2021453010; AMCVS_45F10C3A53DAEC9F0A490D4D%40AdobeOrg=1; AMCV_45F10C3A53DAEC9F0A490D4D%40AdobeOrg=1075005958%7CMCIDTS%7C18614%7CMCMID%7C91883906030825914429183258312021453010%7CMCAID%7CNONE%7CMCOPTOUT-1608248327s%7CNONE%7CvVersion%7C4.4.1; s_cc=true
...
referer: https://www.eurobet.it/it/scommesse/
...
x-eb-accept-language: it_IT
x-eb-marketid: 5
x-eb-platformid: 1
Cookies are set in an initial request (typically) using Set-Cookie header and then are passed back to the server in subsequent requests using the cookie header.
I'm not certain how many of these values are relevant, but you'd need to figure out where each came from in the chain of HTTP requests between the initial one and this one, and replicate them (see the URL of my previous answer above - warning: this can be time consuming).
The other headers can most likely be set statically, since they probably aren't going to change.
If you have access to curl on the command line, you can attempt to reconstruct some of these requests by hand. Some will be time-sensitive, since cookies do expire after a while (see the set-cookie header details for exactly when). Once you've reconstructed a working request, you can then start coding it in your application.
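To make that last step concrete, here is a minimal sketch using Java 11's built-in HttpClient (the asker is already in Java). It assumes the cookie round-trip plus the Referer and x-eb-* headers captured above are what the site actually checks - which of them are truly mandatory is something you would have to establish by trial and error:

    import java.net.CookieManager;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class EurobetFetch {
        public static void main(String[] args) throws Exception {
            // The CookieManager stores whatever Set-Cookie headers the first
            // response carries and replays them automatically on later requests.
            HttpClient client = HttpClient.newBuilder()
                    .cookieHandler(new CookieManager())
                    .build();

            // Step 1: hit the main page first so the session cookies get set.
            client.send(
                    HttpRequest.newBuilder(URI.create("https://www.eurobet.it/it/scommesse/")).build(),
                    HttpResponse.BodyHandlers.discarding());

            // Step 2: request the JSON endpoint with the Referer and the custom
            // headers observed in the captured request.
            HttpRequest json = HttpRequest.newBuilder(URI.create(
                    "https://www.eurobet.it/detail-service/sport-schedule/services/discipline/calcio?prematch=1&live=0&temporalFilter=TEMPORAL_FILTER_OGGI_DOMANI"))
                    .header("Referer", "https://www.eurobet.it/it/scommesse/")
                    .header("x-eb-accept-language", "it_IT")
                    .header("x-eb-marketid", "5")
                    .header("x-eb-platformid", "1")
                    .build();
            HttpResponse<String> response = client.send(json, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }

Note that htmlunit's WebClient keeps a cookie jar automatically as well, so loading the main page and then the JSON URL from the same WebClient instance achieves the same cookie round-trip.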
If you can work all this out, you should be able to reconstruct the chain of HTTP GET requests to get the JSON data you want. Good luck!
I am having trouble finding a way to perform a simple operation without making it way more complex than it has to be.
Example: I want to say "Alexa, What is the status of my website?"
I want it to know I'm referring to http://refindustry.com/index1.php
And I want it to read the single line on that page, which currently says "Our website is under construction".
It is a really simple operation: each time, I want it to request the defined page and read the single line of HTML text on it.
Keep in mind I don't want to have to host a server or pay Amazon to do this. I just want Alexa to simply request the page and read the single line.
I tried going to the Amazon developer site and it looked insane: account linking, Lambda requests... just way more difficult than it should be.
You can use the free tier of AWS to host a Lambda function that fetches the text from your website and returns the response to Alexa.
Then you need to set up your skill with a few sample utterances for asking about the status of your website, and point the skill at this Lambda.
You have to have a service that will fulfil the intent from Alexa - this could be a server somewhere (so the skill would send the request via HTTP), or a Lambda function.
As Josep says, you can get a free AWS tier - write some code to fetch the page and "grab" the status, then use it in the response to Alexa. It should be pretty straightforward to do this.
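As a rough sketch of what that code could look like with the ASK SDK v2 for Java - the intent name WebsiteStatusIntent is made up, and the tag-stripping is deliberately naive since the page is assumed to contain a single line of text:

    import com.amazon.ask.dispatcher.request.handler.HandlerInput;
    import com.amazon.ask.dispatcher.request.handler.RequestHandler;
    import com.amazon.ask.model.Response;
    import com.amazon.ask.request.Predicates;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Optional;

    public class WebsiteStatusHandler implements RequestHandler {
        @Override
        public boolean canHandle(HandlerInput input) {
            return input.matches(Predicates.intentName("WebsiteStatusIntent"));
        }

        @Override
        public Optional<Response> handle(HandlerInput input) {
            String speech;
            try {
                // Fetch the page and strip the markup, leaving the one line of text.
                HttpResponse<String> page = HttpClient.newHttpClient().send(
                        HttpRequest.newBuilder(URI.create("http://refindustry.com/index1.php")).build(),
                        HttpResponse.BodyHandlers.ofString());
                speech = page.body().replaceAll("<[^>]*>", " ").trim();
            } catch (Exception e) {
                speech = "Sorry, I could not reach the website.";
            }
            return input.getResponseBuilder().withSpeech(speech).build();
        }
    }

The handler would then be registered in the Lambda entry point via the SDK's Skills.standard().addRequestHandlers(...) builder.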
If you're a little daunted about AWS, you could look at Hosted Skills - https://developer.amazon.com/blogs/alexa/post/ebc1c777-2cb2-4210-8c89-2a70dd1a0248/get-your-head-out-of-the-cloud-with-alexa-hosted-skills-preview
I want to make a turn-based game (something like Checkers) with the help of servlets and JSP pages. I created a page with a newGame button that redirects to the game page (the first request is redirected to Black.jsp and the second to Red.jsp).
My problem is: how can I refresh the other JSP automatically when one of them changes?
Note: after a change in one of the JSPs, the request is sent to the servlet and the servlet updates the changed JSP's graphics, but the other JSP stays inactive. I want to make it active.
Thank You
It sounds like what you need is Comet. Here's an overview of how it works.
http://www.ibm.com/developerworks/web/library/wa-cometjava/
Basically, the "other" user's browser will send a request to a servlet to get an update, but that request won't receive its response until the current player makes a move. This gets around the problem posed by the fact that, with traditional HTTP, the browser always has to be the one sending the request to the server; it can't be the other way around.
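For instance, a minimal long-polling sketch with a plain servlet might look like this - the single shared queue (one game, one waiting player) and the "move" parameter are simplifying assumptions, and a real implementation would rather use the Servlet 3.0 async API so a container thread isn't tied up while waiting:

    import java.io.IOException;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class MoveServlet extends HttpServlet {
        private static final BlockingQueue<String> MOVES = new LinkedBlockingQueue<>();

        // The waiting player's page polls this; the request "hangs" until a
        // move arrives or the timeout fires (the page then simply re-polls).
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            try {
                String move = MOVES.poll(30, TimeUnit.SECONDS);
                resp.setContentType("text/plain");
                resp.getWriter().write(move == null ? "none" : move);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // The current player's page posts a move here, releasing the waiter.
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String move = req.getParameter("move");
            if (move != null) {
                MOVES.add(move);
            }
            resp.setStatus(HttpServletResponse.SC_NO_CONTENT);
        }
    }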
There are some variations on the technique. Now that you know the name, I'm sure you'll be able to find lots of useful information about it.
There's another technology called WebSocket which can also serve this purpose, but it requires additional capability built into the browser and, as of now, probably not all of your users will be using compatible browsers.
I am creating a dashboard application in which I show information about servers. I have a servlet called "poller.java" that collects information from the servers and sends it back to a client.jsp file. In client.jsp, I make AJAX calls every 2 minutes to the poller.java servlet to get information about the servers.
The client.jsp file shows information in the form of a table like
server1 info
server2 info
Now, I want to add one more piece of functionality: when the user clicks on server1, I should show a separate page (call it server1.jsp) containing the timestamps at which the AJAX calls were made by client.jsp and the server information that was retrieved. This information is available in my client.jsp page. But how do I show it on the next page?
Initially, I thought of writing to a file and then retrieving it in my server1.jsp file. But I don't think that is a good approach. I am sure I am missing a much simpler way to do this. Can someone help me?
You should name your servlet Poller.java, not poller.java; class names should always start with an uppercase letter. You can implement your servlet to forward to a different page: for example, if somebody clicks on server1, the servlet forwards to server1.jsp. Have a look at RequestDispatcher for this. Passing information between requests should be done with request attributes; if you need to retain information over several requests, you could think about using the session.
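A rough sketch of that suggestion - pollServers(), the pollHistory session attribute, and the detail parameter are all hypothetical names:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class Poller extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // Keep a history of polls in the session so it survives between requests.
            @SuppressWarnings("unchecked")
            List<String> history = (List<String>) req.getSession().getAttribute("pollHistory");
            if (history == null) {
                history = new ArrayList<>();
                req.getSession().setAttribute("pollHistory", history);
            }
            history.add(new Date() + ": " + pollServers());

            if ("server1".equals(req.getParameter("detail"))) {
                // Forward to the detail page; server1.jsp reads pollHistory from the session.
                req.getRequestDispatcher("/server1.jsp").forward(req, resp);
            } else {
                // The regular AJAX response with the latest poll result.
                resp.setContentType("text/plain");
                resp.getWriter().write(history.get(history.size() - 1));
            }
        }

        private String pollServers() {
            return "server1 info, server2 info"; // placeholder for the real polling code
        }
    }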
In the .NET world, we use SessionState to maintain data that must persist between requests. Surely there's something similar for JSP? (The session object, perhaps.)
If you can't use session state in a servlet, you're going to have to fall back on a physical backing store. I'd use a database, or a known standard file format (like XML). Avoid home-brew file formats that require you to write your own parser.
I have a simple RESTful web service and I wish to test the PUT method on a certain resource. I would like to do it in the most simple way using as few additional tools as possible.
For instance, testing the GET method of a resource is the peak of simplicity - just going to the resource URL in the browser. I understand that it is impossible to reach the same level of simplicity when testing a PUT method.
The following two assumptions should ease the task:
The request body is a JSON string prepared beforehand. Meaning, whatever the solution to my problem is, it does not have to compose a JSON string from the user input - the user input is the final JSON string.
The REST engine I use (OpenRasta) understands certain URL decorators, which tell it the desired HTTP method. Hence I can issue a POST request that is treated as a PUT request inside the REST engine. This means a regular HTML form can be used to test the PUT action.
However, I wish the user to be able to enter the URL of the resource to PUT to, which makes the task more complicated, but eases the testing.
Thanks to all the good samaritans out there in advance.
P.S.
I have neither PHP nor Perl installed, but I do have Python. However, staying within the realm of JavaScript seems to be the simplest approach, if possible. My OS is Windows, if that matters.
I'd suggest using the Poster add-on for Firefox. You can find it over here.
As well as providing a means to inspect HTTP requests coming from desktop and web applications, Fiddler allows you to create arbitrary HTTP requests (as well as resend ones that were previously sent by an application).
It is browser-agnostic.
I use the RESTClient Firefox plugin (you cannot use a URL for the message body, but at least you can save your request), but I would also recommend curl on the command line.
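If you'd rather script the request than rely on a browser plugin, issuing the PUT takes only a few lines in any language with an HTTP client; here is a sketch using Java 11's built-in HttpClient, with a placeholder resource URL and JSON body:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PutTest {
        public static void main(String[] args) throws Exception {
            String json = "{\"status\": \"example\"}"; // the JSON body prepared beforehand
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://localhost:8080/myresource/1")) // placeholder URL
                    .header("Content-Type", "application/json")
                    .PUT(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + "\n" + response.body());
        }
    }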
Maybe you should also have a look at this SO question.