So I have a school issue where I need to access a site, but the site requires me to go through around four portals to get there, and I'm hoping to write a quick script to do this for me. The problem is that the site is very sloppy and written with the same names on certain buttons, so I would like to click the buttons based on class.
the classes are readit2, readit23, readit239, and readit2394
$(function(){
document.getElementByClassName(readit2).click();
});
I thought the above code would click it as soon as the first page loads, but it does not. Any help would be great.
// ==UserScript==
// @name dumb spider web
// @namespace ===============
// @version 0.1
// @description gets me through this dumb stuff
// @match ===============
// @copyright 2012+, You
// @require http://code.jquery.com/jquery-latest.js
// ==/UserScript==
Edit: ^ added the header stuff.
More edits:
It works through the console now, but through Tampermonkey OR Greasemonkey I cannot get it to actually perform the action.
$(function(){
document.getElementsByClassName("readit2")[0].click();
});
It works through the console but does not run when the script starts.
I started by looking at all the wrong things. Your original code:
$(function(){
document.getElementByClassName(readit2).click();
});
The issue with this is that the call should be getElementsByClassName (plural "Elements"), since a class is usually shared by several elements. That call also returns an array-like collection, so if you are positive that there will only ever be one:
$(function(){
document.getElementsByClassName("readit2")[0].click();
});
If not, I'd suggest getting the text and verifying that it contains something that you expect.
Edit: Added quotes.
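For instance, a minimal sketch of that check ("Read more" is just a placeholder for whatever label you expect):
var buttons = document.getElementsByClassName("readit2");
for (var i = 0; i < buttons.length; i++) {
    // click only the button whose text contains the expected label
    if (buttons[i].textContent.indexOf("Read more") !== -1) {
        buttons[i].click();
        break;
    }
}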
The problem is that you are trying to access some nodes that still don't exist. Add this to your script:
// @run-at document-end
That way you make sure that all the nodes are loaded and rendered before your script runs.
Another workaround would be this:
$(document).ready(function () {
    // your script
});
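Putting the pieces together, a sketch of the whole userscript might look like this (header values left elided as in the question):
// ==UserScript==
// @name dumb spider web
// @match ===============
// @require http://code.jquery.com/jquery-latest.js
// @run-at document-end
// ==/UserScript==
$(function () {
    var button = document.getElementsByClassName("readit2")[0];
    if (button) {
        button.click(); // guard in case the button isn't on this page
    }
});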
I'm trying to grab a table from the following webpage
http://www.bloomberg.com/markets/companies/country/hong-kong/
I have some sample code which was kindly provided by Phil Bozak here:
grabbing table from html using Google script
which grabs the table for this website:
http://www.airchina.com.cn/www/en/html/index/ir/traffic/
As you can see from Phil's code, there is a lot of "getElement()" in it. If I look at the HTML code for the Air China website, the table looks like it's nested four levels deep; is that why there's a string of .getElement() calls?
Now I look at the source code for the Bloomberg page and it is loaded with "div"s...
The question is: can someone show me how to grab the table from the Bloomberg page?
A brief explanation of the theory would also be useful. Thanks a bunch.
Let's flip your question upside down, and start with the theory. Methodology might be a better word for it.
You want to get at something specific in a structured page. To do that, you either need a way to zap right to the element (which can be done if it's labeled in a unique way that we can access), OR you need to navigate the structure more-or-less manually. You already know how to look at the source of a page, so you're familiar with this step. Here's a screenshot of Firefox Inspector, highlighting the element we're interested in.
We can see the hierarchy of elements that lead to the table: html, body, div, div, div.ticker, table.ticker_data. We can also see the source:
<table class="ticker_data">
Neat! It's labeled! Unfortunately, that class info gets dropped when we process the HTML in our script. Bummer. If it was id="ticker_data" instead, we could use the getElementByVal() utility from this answer to reach it, and give ourselves some immunity from future restructuring of the page. Put a pin in that - we'll come back to it.
It can help to visualize this in the debugger. Here's a utility script for that - run it in debug mode, and you'll have your HTML document laid out to explore:
/**
* Debug-run this in the editor to be able to explore the structure of web pages.
*
* Set target to the page you're interested in.
*/
function pageExplorer() {
var target = "http://www.bloomberg.com/markets/companies/country/hong-kong/";
var pageTxt = UrlFetchApp.fetch(target).getContentText();
var pageDoc = Xml.parse(pageTxt,true);
debugger; // Pause in debugger - explore pageDoc
}
This is what our page looks like in the debugger:
You might be wondering what the numbered elements are, since you don't see them in the source. When there are multiples of an element type at the same level in an XML document, the parser presents them as an array, numbered 0..n. Thus, when we see 0 under a div in the debugger, that's telling us that there are multiple <div> tags in the HTML source at that level, and we can access them as an array, for example .div[0].
Ok, theory behind us, let's go ahead and see how we can access the table by brute-force.
Knowing the hierarchy, including the div arrays shown in the debugger, we could do this, à la Phil's previous answer. I'll do some weird indenting to illustrate the document structure:
...
var target = "http://www.bloomberg.com/markets/companies/country/hong-kong/";
var pageTxt = UrlFetchApp.fetch(target).getContentText();
var pageDoc = Xml.parse(pageTxt,true);
var table = pageDoc.getElement()
.getElement("body")
.getElements("div")[0] // 0-th div under body, shown in debugger
.getElements("div")[5] // 5-th div under there
.getElement("div") // another div
.getElement("table"); // finally, our table
As a much more compact alternative to all those .getElement() calls, we can navigate using dot notation.
var table = pageDoc.getElement().body.div[0].div[5].div.table;
And that's that.
Let's go back to that pinned idea. In the debugger, we can see that there are various attributes attached to elements. In particular, there's an "id" on that div[5] that contains the div that contains the table. Remember, in the source we saw "class" attributes, but note that they don't make it this far.
Still, the fact that a kindly programmer put this "id" in place means we can do this, with getDivById() from that earlier question:
var contentDiv = getDivById( pageDoc.getElement().body, 'content' );
var table = contentDiv.div.table;
If they move things around, we might still be able to find that table, without changing our code.
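In case that earlier answer moves, here's a rough sketch of what such a helper might look like against the deprecated Xml service used above (the recursive search and attribute calls are my assumption, not necessarily the original helper):
function getDivById(element, targetId) {
  // Depth-first search of child <div> elements for a matching id attribute.
  var divs = element.getElements("div");
  for (var i = 0; i < divs.length; i++) {
    var idAttr = divs[i].getAttribute("id");
    if (idAttr && idAttr.getValue() === targetId) {
      return divs[i];
    }
    var found = getDivById(divs[i], targetId); // recurse into nested divs
    if (found) return found;
  }
  return null; // no matching div in this subtree
}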
You already know what to do once you have the table element, so we're done here!
First of all, I know there are libraries that provide polyfills for history.pushState/popstate (History.js, Hash.js, jQuery hashchange), so please don't just link to those.
I need a more powerful library to achieve the following in a RIA:
User clicks a link
library is notified and loads content via Ajax (no complete reload!)
All <a> elements are leveraged with a click handler that
prevents page reloads in 2. (preventDefault) and
calls history.pushState instead / sets location.hash for older browsers
loaded content is inserted in page and replaces current content
Continue with 1.
Also, previously loaded content should be restored as the user navigates back.
As an example, click through Google+ in Internet Explorer <10 and any other browser.
Is there anything that comes even close? I need support for IE8, FF10, Safari 5 and Chrome 18. Also, it should have a permissive license like MIT or Apache.
I believe Sammy.js (http://sammyjs.org) (MIT-licensed) has the best focus on what you want to do, with its two main pillars being:
Routes
Events
I could quote from the docs, but it's pretty straightforward:
set up client-side routes that relate to stuff to be done, e.g. update the view through Ajax
link events to call routes, e.g. call the route above when I click a link. (You would have to make sure e.preventDefault is called in the defined event, I believe, since this is really an app decision, so it can't be abstracted away by any library you're going to use; see the sketch below.)
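For illustration, a minimal sketch of that wiring (my own code, not from the Sammy docs; the "ajax" class is a hypothetical marker for in-app links):
$(document).on("click", "a.ajax", function (e) {
    e.preventDefault();                          // no full page reload
    location.hash = "#/" + $(this).attr("href"); // Sammy's matching route fires
});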
Some relevant docs
http://sammyjs.org/docs
http://sammyjs.org/docs/routes
http://sammyjs.org/docs/events
Example for a route: (from http://sammyjs.org/docs/tutorials/json_store_1)
this.get('#/', function(context) {
$.ajax({
url: 'data/items.json',
dataType: 'json',
success: function(items) {
$.each(items, function(i, item) {
context.log(item.title, '-', item.artist);
});
}
});
});
Or something like
this.get('#/', function(context) {
context.app.swap(''); // the 'swap' here indicates a cleaning of the view
// before partials are loaded, effectively re-rendering the entire screen.
// Not doing the swap enables you to do infinite-scrolling / appending styles, etc.
// ...
});
Of course other clientside MVC-frameworks could be an option too, which take away even more plumbing, but might be overkill in this situation.
A pretty good (and still fairly recent) comparison:
http://codebrief.com/2012/01/the-top-10-javascript-mvc-frameworks-reviewed/
(I use Spine.js myself.)
Lastly, I thought it might be useful to include an answer I wrote a while ago that goes into detail on the whole best practice (as I see it) of client-side refreshes, etc. Perhaps you'll find it useful:
Accessibility and all these JavaScript frameworks
I currently use PathJS in one of my applications, and it has been the best decision I have made.
For your particular use case, take a look at their HTML5 Example.
The piece of code that makes the example work (from the source):
<script type="text/javascript">
// This example makes use of the jQuery library.
// You can use any methods as actions in PathJS. You can define them as I do below,
// assign them to variables, or use anonymous functions. The choice is yours.
function notFound(){
$("#output .content").html("404 Not Found");
$("#output .content").addClass("error");
}
function setPageBackground(){
$("#output .content").removeClass("error");
}
// Here we define our routes. You'll notice that I only define three routes, even
// though there are four links. Each route has an action assigned to it (via the
// `to` method), as well as an `enter` method. The `enter` method is called before
// the route is performed, which allows you to do any setup you need (changing
// classes, performing AJAX calls, adding animations, etc.).
Path.map("/users").to(function(){
$("#output .content").html("Users");
}).enter(setPageBackground);
Path.map("/about").to(function(){
$("#output .content").html("About");
}).enter(setPageBackground);
Path.map("/contact").to(function(){
$("#output .content").html("Contact");
}).enter(setPageBackground);
// The `Path.rescue()` method takes a function as an argument, and will be called when
// a route is activated that you have not yet defined an action for. On this example
// page, you'll notice there is no defined route for the "Unicorns!?" link. Since no
// route is defined, it calls this method instead.
Path.rescue(notFound);
$(document).ready(function(){
// This line is used to start the HTML5 PathJS listener. This will modify the
// `window.onpopstate` method accordingly, check that HTML5 is supported, and
// fall back to hashtags if you tell it to. Calling it with no arguments will
// cause it to do nothing if HTML5 is not supported
Path.history.listen();
// If you would like it to gracefully fallback to Hashtags in the event that HTML5
// isn't supported, just pass `true` into the method.
// Path.history.listen(true);
$("a").click(function(event){
event.preventDefault();
// To make use of the HTML5 History API, you need to tell your click events to
// add to the history stack by calling the `Path.history.pushState` method. This
// method is analogous to the regular `window.history.pushState` method, but
// wraps calls to it around the PathJS dispatched. Conveniently, you'll still have
// access to any state data you assign to it as if you had manually set it via
// the standard methods.
Path.history.pushState({}, "", $(this).attr("href"));
});
});
</script>
PathJS has some of the most wanted features of a routing library:
Lightweight
Supports the HTML5 History API, the onhashchange event, and graceful degradation
Supports root routes, rescue methods, parameterized routes, optional route components (dynamic routes), and Aspect Oriented Programming
Well Tested (tests available in the ./tests directory)
Compatible with all major browsers (Tested on Firefox 3.6, Firefox 4.0, Firefox 5.0, Chrome 9, Opera 11, IE7, IE8, IE9)
Independent of all third-party libraries, but plays nice with all of them
I found the last two points the most attractive.
You can find them here
I hope you find this useful.
I'd like to suggest a combination of
crossroads.js as a router
http://millermedeiros.github.com/crossroads.js/
and hasher for handling browser history and hash URLs (with plenty of fallback solutions):
https://github.com/millermedeiros/hasher/
(based on http://millermedeiros.github.com/js-signals/)
This will still require a few lines of code (to load ajax content etc.), but give you loads and loads of other possibilities when handling a route.
Here's an example using jQuery (none of the above libraries require jQuery; I'm just lazy...):
http://fiddle.jshell.net/Fe5Kz/2/show/light
HTML
<ul id="menu">
    <li>
        <a href="foo">foo</a>
    </li>
    <li>
        <a href="bar/baz">bar/baz</a>
    </li>
</ul>
<div id="content"></div>
JS
//register routes
crossroads.addRoute('foo', function() {
$('#content').html('this could be ajax loaded content or whatever');
});
crossroads.addRoute('bar/{baz}', function(baz) {
//maybe do something with the parameter ...
//$('#content').load('ajax_url?baz='+baz, function(){
// $('#content').html('bar route called with parameter ' + baz);
//});
$('#content').html('bar route called with parameter ' + baz);
});
//setup hash handling
function parseHash(newHash, oldHash) {
crossroads.parse(newHash);
}
hasher.initialized.add(parseHash);
hasher.changed.add(parseHash);
hasher.init();
//add click listener to menu items
$('#menu li a').on('click', function(e) {
e.preventDefault();
$('#menu a').removeClass('active');
$(this).addClass('active');
hasher.setHash($(this).attr('href'));
});
Have you looked at the BigShelf sample SPA (Single Page Application) from Microsoft? It sounds like it covers how to achieve most of what you're asking.
It makes use of History.js, a custom wrapper object called NavHistory to easily control navigation, and Knockout.js for click handling.
Here's an extremely abbreviated workflow of how this works: first, you'll need to initialize a NavHistory object, which wraps History.js and registers a callback that executes when there is a pushState or hash change:
var nav = new NavHistory({
params: { page: 1, filter: "all", ... etc ... },
onNavigate: function (navEntry) {
// Respond to the incoming sort/page/filter parameters
// by updating booksDataSource and re-querying the server
}
});
Next, you'll define one or more Knockout.js view models with commands that can be bound to links, buttons, etc.:
var ViewModel = function (nav) {
this.search = function () {
nav.navigate({ page: 2, filter: '', ... }); // JSON object matching the NavHistory params
};
}
Finally, in your markup, you'll use Knockout.js to bind your commands to various elements:
<a data-bind="click: search">...</a>
The linked resources are much more detailed in explaining how all of this works. Unfortunately, it's not a single framework like you're seeking, but you'd be surprised how easy it is to get this working.
One more thing: following the BigShelf example, the site I'm building is fully cross-browser compatible: IE6+, Firefox, Safari (mobile and desktop), and Chrome (mobile and desktop).
The AjaxTCR Library seems to cover all bases and contains robust methods that I haven't seen before. It's released under a BSD License (Open Source Initiative).
For example, here are five AjaxTCR.history methods:
init(onStateChangeCallback, initState);
addToHistory(id, data, title, url, options);
getAll();
getPosition();
enableBackGuard(message, immediate);
The above addToHistory() has enough parameters to allow for deep hash-linking in websites.
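A hypothetical usage sketch, based only on the signatures listed above (renderPanel() is a made-up app function; check the API docs for the exact parameter semantics):
// Initialize with a callback that restores the view for a given state.
AjaxTCR.history.init(function (state) {
    renderPanel(state); // re-render whatever this history state describes
}, { page: "home" });
// Later, record a deep-linkable entry as the user navigates.
AjaxTCR.history.addToHistory("products", { page: "products" }, "Products", "#products");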
More eye candy in .com.cookie(), .storage(), and .template() provides more than enough methods to handle any session-data requirements.
The well-documented AjaxTCR API webpage has a plethora of information, with downloadable docs to boot!
Status Update:
That website also has an examples section, including downloadable .zip files with ready-to-use front-end (client) and back-end (server) project files.
Notable among them are the following ready-to-use examples:
One-way Cookie
HttpOnly Cookies
History Stealing
History Explorer
There are quite a few other examples that round out the process of using many of their API methods, making the small learning curve quicker to get through.
Several suggestions
ExtJs, see their History Example, and here are the docs.
YUI Browser History Manager.
jQuery BBQ seems to provide a more advanced feature set than jQuery hashchange.
ReallySimpleHistory may also be of help, though it's quite old and possibly outdated.
Note: ExtJs History has been extended to optimize duplicate (redundant) calls to add().
PJAX is the process you're describing.
The more advanced pjax techniques will even start to preload the content when the user hovers over the link.
This is a good pjax library.
https://github.com/MoOx/pjax
You mark the containers that need to be updated on subsequent requests:
new Pjax({ selectors: ["title", ".my-Header", ".my-Content", ".my-Sidebar"] })
So in the above, only the title, .my-Header, .my-Content, and .my-Sidebar elements will be replaced with the content from the Ajax call.
Some things to look out for
Pay attention to how your JS loads and detects when the page is ready: the JavaScript will not reload on new pages. Also pay attention to when any analytics calls get fired, for the same reason.
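For example, a sketch of re-running per-page setup after each navigation; the event name is an assumption on my part, so check the MoOx/pjax README (initWidgets() and the analytics call are hypothetical):
// Re-run page setup whenever pjax swaps in new content.
document.addEventListener("pjax:success", function () {
    initWidgets();                              // hypothetical per-page setup
    analytics.trackPageView(location.pathname); // hypothetical analytics hook
});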
Our web app is built entirely in JS.
To make it snappy we cache resources (models) between page views and reload the resource when you view a page.
Our flow is like this:
The user is in ViewA
The user switches to ViewB
We use the cached resource to render ViewB
We start a fetch for the resource
When the resource is fetched we render again
This has a nasty drawback of causing <img> tags to flicker, even if they are the same.
The problem is that Backbone.js, which we use, doesn't tell us whether anything changed when fetching a collection, just that it was fetched.
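In code, the flow looks roughly like this (a sketch with hypothetical collection/view names, not our actual app):
var books = cache.books || (cache.books = new BookCollection());
var view = new BookListView({ collection: books });
view.render();            // 3. render immediately from the cached resource
books.fetch({             // 4. start a fetch for the resource
    success: function () {
        view.render();    // 5. re-render - identical <img> tags still flicker
    }
});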
Here's a quick demo of what I mean: http://jsfiddle.net/p7DdG/
It only happens in WebKit and with <img> tags, not with background images, as you can see.
We think it's kinda ugly to use background-image instead of a proper img tag.
Is there any solution to this?
The problem is gone in Chrome 19, problem solved :)
Not knowing exactly how the URL of each image is being built, I'm not certain this will work, but could you check the src attribute of each image tag against the one you are replacing it with before doing the replace?
e.g.
var newImageSrc = "http://www.google.com/intl/en_com/images/srpr/logo3w.png";
if (newImageSrc != $("img").attr("src")) {
$('img').replaceWith('<img src="'+newImageSrc +'">');
}
Alternatively - load the image offscreen, and attach an event handler to the onload event of the image, which moves the image to the current image's parent tag, and remove the old one.
e.g.
var oldImage = $("#oldImageId");
var newImageSrc = "http://www.google.com/intl/en_com/images/srpr/logo3w.png";
var newImage = new Image();
$(newImage).load(function (event) {
$(oldImage).parent().append(newImage);
$(oldImage).detach();
});
$(newImage).attr("src", newImageSrc);
I ran into the same problem and noticed that sometimes images do flicker and sometimes don't. Even in latest Chrome (v33 as of now).
For posterity, flickering happens with uncached images.
In my case, Cache-Control: public, max-age=31536000 totally eliminated it.
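If your images happen to be served from Node, a minimal sketch of sending that header with Express (assuming express.static fronts the image directory):
var express = require("express");
var app = express();
// serve-static sends "Cache-Control: public, max-age=31536000" for these files
app.use("/images", express.static("images", { maxAge: "365d" }));
app.listen(3000);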
Is it possible to extend the addEvent function in MooTools to do something and also call the normal addEvent method? Or, if someone has a better way to do what I need, I'm all ears.
I have different 'click' handlers depending on which page of the site I'm on, and there might be more than one on each page. I want every click on the page to execute a piece of code, besides doing whatever that click listener already does. Adding those two lines to each of the handlers would be a PITA, to say the least, so I thought about overriding addEvent so that every time I add a 'click' listener, it creates a new function that executes the code and then calls the original function.
Any idea how I could do it?
Whereas this is not impossible, changing MooTools' internal APIs is a questionable practice. Unless you are well versed in MooTools, follow dev direction on GitHub, and know your change won't break future compatibility, I would recommend against it.
The way I see it, you have two routes:
Make a new Element method via implement that does your logic, e.g. Element.addMyEvent, which does your thing and then calls the normal element.addEvent after. This is preferable and has no real adverse effects (see above).
Change the prototype directly. This means you don't have to refactor any code and it will just work, but it can mean that others who work with your code will have difficulty following it, as well as difficulty tracing/troubleshooting; think of somebody who knows MooTools and the standard addEvent behaviour, who won't even think to check the prototypes when they hit problems.
The coming MooTools 2.0 will likely INVALIDATE method 2 above if MooTools moves away from Element.prototype modification in favour of a wrapper (for compatibility with other frameworks). Go back to method 1 :)
I think solution 1 is better and the more obvious choice; here's a rough sketch:
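A minimal sketch, assuming MooTools' Element.implement (addMyEvent is just the hypothetical name from above):
Element.implement({
    addMyEvent: function (type, fn, internal) {
        console.log("adding " + type + " on", this); // your shared logic here
        return this.addEvent(type, fn, internal);    // then the normal behaviour
    }
});
// usage, instead of addEvent:
document.id("foo").addMyEvent("click", function (e) {
    e.stop();
});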
as for 2: http://jsfiddle.net/dimitar/aTukP/
(function() {
// setup a proxy via the Element prototype.
var oldProto = Element.prototype.addEvent;
// you really need [Element, Document, Window] but this is fine.
Element.prototype.addEvent = function(type, fn, internal){
console.log("added " + type, this); // add new logic here. 'this' == element.
oldProto.apply(this, arguments);
};
})();
document.id("foo").addEvent("click", function(e) {
e.stop();
console.log("clicked");
console.log(e);
});
It is that simple. Keep in mind that the prototype change should also be applied to Document and Window (see the comment in the code). Also, this won't change the Events class mixin; for that you need to refactor Events.addEvent instead.