I'm struggling to figure out how to do an exact search in the Chrome DevTools Elements panel. For example, if I search for http, it also shows me https. How do I do an exact search for http only?
I have been using the Chrome dev console's Application tab to look through various cookies stored by some sites. But one key piece of information that seems to be missing is a log of what actually created the cookie (i.e. where it came from).
Is there some tool in Chrome that will show me where a cookie actually came from (e.g., the JavaScript filename and line of code, or a Set-Cookie response header)?
If it's set in an HTTP response header, you can search for Set-Cookie in the Network panel's search UI.
Press Command+F or Control+F to open that search box.
You can search across all text by pressing Command+Shift+F or Control+Shift+F and then typing document.cookie into the Search tab. If you know the name of the cookie, it may be more straightforward to search for that instead.
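If you've copied a response's Set-Cookie header out of the Network panel, Python's standard library can parse it to confirm exactly which cookie that response sets. A small sketch (the header value below is a made-up example):

```python
from http.cookies import SimpleCookie

def cookies_in_header(set_cookie_value):
    """Return {name: value} for the cookies in a Set-Cookie header string."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return {name: morsel.value for name, morsel in jar.items()}

# Hypothetical header copied from the Network panel:
print(cookies_in_header("sessionid=abc123; Path=/; HttpOnly"))
# → {'sessionid': 'abc123'}
```

This tells you which response created a cookie of a given name; cookies set via document.cookie will have no matching Set-Cookie header at all.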
Forgive me if I don't use the proper terminology. I have a webpage that I'm trying to scrape information from. The problem is that when I view the page source, the data I want to scrape is not there. I've encountered this problem before: the main HTTP request triggers other requests, so the information I'm looking for is actually somewhere else, which I find using Chrome's Inspect > Network feature. I manually search the various documents and XHR responses for the one that has the correct information, which is sometimes long and tedious. I can also use Chrome's inspect feature to inspect the element that contains the information I want, and that brings up the correct source code, but I can't seem to figure out how I can use that to quickly find the corresponding HTTP request.
Restated briefly: can I use Chrome's inspect-element feature and then ask it to show me the corresponding network event (HTTP request) that produced that code?
I'll add the case study I'm working on.
http://www.flashscore.com/tennis/atp-singles/acapulco/results/
shows the different matches that took place at a tennis tournament. I'm trying to scrape the match hrefs, but if you view the source of the page you'll see they're not there.
Restated briefly: can I use Chrome's inspect-element feature and then ask it to show me the corresponding network event (HTTP request) that produced that code?
No. This isn't something that the browser keeps track of.
In most situations, the HTTP response will pass through a good deal of JavaScript code before eventually being turned into elements on the page. Tracing which HTTP response was "responsible" for a given element would require a great deal of data-flow analysis, which is impractical for a browser to do.
One way: open Firefox, install the LiveHttpHeaders add-on, and run it; you will see the expected headers.
There is a similar add-on for Google Chrome, but I haven't tested it.
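A practical workaround is to export the whole page load from the Network panel ("Save all as HAR with content" in Chrome) and search the captured response bodies offline instead of clicking through each request by hand. A minimal sketch in Python; the function name and the search string are my own, not a standard API:

```python
import json

def responses_containing(har, needle):
    """Return URLs of HAR entries whose captured response body contains `needle`.

    `har` is the parsed JSON of a HAR file exported with response bodies
    included ("Save all as HAR with content" in Chrome's Network panel).
    """
    hits = []
    for entry in har["log"]["entries"]:
        # Bodies live at response.content.text in HAR 1.2; may be absent.
        body = entry["response"]["content"].get("text", "")
        if needle in body:
            hits.append(entry["request"]["url"])
    return hits

# Usage (hypothetical file name):
# with open("flashscore.har") as f:
#     print(responses_containing(json.load(f), "/match/"))
```

Searching the HAR for a distinctive fragment of the data you saw in the Elements panel quickly points you at the XHR request that delivered it.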
I'm trying to browse this url: googleweblight.com/?lite_url=http://www.google.com
but I'm not able to, since I got:
Transcoding test failed:
This page could not be transcoded due to technical issues.
The problem is that I need to copy-paste every search result I get from the Google search page into googleweblight.com/?lite_url=[here]
Why am I not able to use GoogleWebLight for Google itself? How can I make my URLs go to the GoogleWebLight version directly, without copying and pasting and without using a device emulator to change the user agent?
On Firefox, I am using the UAControl add-on to change my browser's User-Agent to Mobile Safari, and it gives me the mobile version of Google Search, which by default has all search result URLs pointing via GoogleWebLight.
In fact, since I didn't want the URLs to be redirected via GoogleWebLight, I had to write a GreaseMonkey script to convert them back to regular (direct) URLs. Maybe you can modify it to do the opposite on the Google Search page if you're not comfortable with the user-agent-switch approach. I believe you can use something like TamperMonkey if you're on a different browser such as Google Chrome.
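The rewriting itself is just query-string manipulation. A sketch of both directions in Python; the prefix is taken from the URLs in the question, and the function names are my own:

```python
from urllib.parse import urlsplit, parse_qs, quote

# Prefix taken from the URLs in the question.
LITE_PREFIX = "http://googleweblight.com/?lite_url="

def to_lite(url):
    """Wrap a URL so it loads through GoogleWebLight (percent-encoded)."""
    return LITE_PREFIX + quote(url, safe="")

def from_lite(lite_url):
    """Recover the original URL from a GoogleWebLight link."""
    query = urlsplit(lite_url).query
    # parse_qs percent-decodes the value, so no extra unquoting is needed.
    return parse_qs(query)["lite_url"][0]

print(to_lite("http://www.google.com"))
# → http://googleweblight.com/?lite_url=http%3A%2F%2Fwww.google.com
```

Percent-encoding the wrapped URL matters once the original URL carries its own query string; otherwise its & separators would be misread as parameters of the outer URL.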
I am encountering a problem in which elements that I am trying to select by XPath do not exist according to the Scrapy response. However, when I inspect the same page in Google Chrome, the element DOES exist.
This problem is occurring on a LinkedIn scrape after using LinkedIn advanced search and getting to a results page. I want to scrape links in the results container.
For example: on the results page for a search on "John," there should be a div element with id="results-container" according to Inspect Element in Google Chrome. When I use Scrapy's response.xpath('//div[@id="results-container"]'), no selectors are returned.
url of result page: https://www.linkedin.com/vsearch/p?firstName=John&openAdvancedForm=true&locationType=Y&rsid=4319659841436374935558&orig=ADVS
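As an aside, XPath attribute predicates need an @ before the attribute name and matched quotes around the value. You can sanity-check the expression against a small sample of the markup with Python's standard library before debugging Scrapy itself (the HTML below is a made-up stand-in for the logged-in results page):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of the logged-in results page.
html = """
<html><body>
  <div id="results-container">
    <a href="/in/john-doe">John Doe</a>
  </div>
</body></html>
"""

root = ET.fromstring(html)
# Attribute tests use @; note the quotes around the id value.
container = root.find(".//div[@id='results-container']")
print(container is not None)  # → True in this sample
links = [a.get("href") for a in container.findall(".//a")]
print(links)  # → ['/in/john-doe']
```

If the corrected expression works on a saved copy of the rendered page but not on the Scrapy response, then Scrapy is simply receiving different HTML (here, the logged-out page), which is the actual problem.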
Did you try to look up the URL you provided in a private browsing window of your browser (sometimes called incognito mode)?
If you do, you'll see that you get a LinkedIn registration form.
As alecxe suggests in his comment, try using the LinkedIn API (it is REST-based); you can get XML responses which you can parse to gather the information you need.
Alternatively, you could try to log in with Scrapy, store the authentication credentials, and repeat your request (but I would use the API anyway).
Does anyone know what's going on with the Google Driving Directions Gadget?
If you click "Add a gadget to your website", you just get a 404.
I have it on a page here, and if I open Chrome Developer Tools I get:
Failed to load resource: the server responded with a status of 404 (Not Found)
http://www.gmodules.com/ig/ifr?url=http://hosting.gmodules.com/ig/gadgets/f…&lang=en&country=US&border=%23ffffff%7C3px%2C1px+solid+%23999999&output=js Failed to load
On this page.
Does anyone know a working alternative or a work around for this? It's strange that Google would let something like this break.
If you click the Google Maps gadget above it, that also gives a 404. It has been like that for the last few days that I've tried it, so I'm not sure how long it's been broken.
Those gadgets are by third-party developers. From the bottom of the page you link to:
Much of the content in this directory was developed by other companies or by Google's users, not by Google. Google makes no promises or representations about its performance, quality, or content. Google doesn't charge for inclusion in this directory or accept payment for better placement.