Page.setHtmlContent error with embedded gadgets - google-apps-script

In an attempt to fudge a workaround for the weird "viewers cannot leave comments" issue in google sites*, I wrote the following silly test script:
function addComment(e) {
var currentPage = SitesApp.getActivePage();
var pageHTML = currentPage.getHtmlContent();
var newHTML = pageHTML.replace("BEGIN", "BEGIN "+Session.getActiveUser().getEmail());
currentPage.setHtmlContent(newHTML);
};
When the user presses the button, the current page content should be changed to include the current user's email address right after the word BEGIN (which I manually inserted- if this works I can just stick in a comment tag thingamabob.
This more or less works. The problem is that the setHtmlContent call does all sorts of weird things to the apps script gadget that contains the button in the first place. Here's the gadget before:
<img src="https://www.google.com/chart?chc=sites&cht=d&chdp=sites&chl=%5B%5BGoogle+Apps+Script'%3D20'f%5Cv'a%5C%3D0'10'%3D499'0'dim'%5Cbox1'b%5CF6F6F6'fC%5CF6F6F6'eC%5C0'sk'%5C%5B%22Apps+Script+Gadget%22'%5D'a%5CV%5C%3D12'f%5C%5DV%5Cta%5C%3D10'%3D0'%3D500'%3D197'dim'%5C%3D10'%3D10'%3D500'%3D197'vdim'%5Cbox1'b%5Cva%5CF6F6F6'fC%5CC8C8C8'eC%5C'a%5C%5Do%5CLauto'f%5C&sig=TbGPi2pnqyuhJ_BfSq_CO5U6FOI" data-origsrc="https://sites.google.com/a/macros/kstf.org/s/AKfycbzEsLBQucXCZZJwEh9c3RYhn81uJucvz3R5vHeJ2w/exec" data-type="maestro" data-props="align:left;borderTitle:Apps Script Gadget;height:200;showBorder:false;showBorderTitle:false;" width="500" height="200" style="display:block;text-align:left;margin-right:auto;"></div>
and here it is after:
<img src="https://www.google.com/chart?chc=sites&cht=d&chdp=sites&chl=%5B%5BGoogle+Gadget'%3D20'f%5Cv'a%5C%3D0'10'%3D499'0'dim'%5Cbox1'b%5CF6F6F6'fC%5CF6F6F6'eC%5C0'sk'%5C%5B%22Include+gadget+(iframe)%22'%5D'a%5CV%5C%3D12'f%5C%5DV%5Cta%5C%3D10'%3D0'%3D500'%3D197'dim'%5C%3D10'%3D10'%3D500'%3D197'vdim'%5Cbox1'b%5Cva%5CF6F6F6'fC%5CC8C8C8'eC%5C'a%5C%5Do%5CLauto'f%5C&sig=CvjXRgodwYVKPvmsyZR7EbHx2uM" data-igsrc="http://0.gmodules.com/ig/ifr?mid=0&synd=trogedit&url=http%3A%2F%2Fwww.gstatic.com%2Fsites-gadgets%2Fiframe%2Fiframe.xml&up_iframeURL=%2Fa%2Fmacros%2Fkstf.org%2Fs%2FAKfycbzEsLBQucXCZZJwEh9c3RYhn81uJucvz3R5vHeJ2w%2Fexec%3Fmid%3DACjPJvFOqF88RUUrqDeapp1PHF_lI3Xc3g5Hd3euTifzUYeaILmTTlMfBQ13yI_6%26bc%3Dtransparent%26f%3DArial%2C%2BVerdana%2C%2Bsans-serif%26tc%3D%2523444444%26lc%3D%25230033cc&up_scroll=no&w=100%&h=200" data-type="ggs-gadget" data-props="height:200;igsrc:http#58//0.gmodules.com/ig/ifr?mid=0&synd=trogedit&url=http%3A%2F%2Fwww.gstatic.com%2Fsites-gadgets%2Fiframe%2Fiframe.xml&up_iframeURL=%2Fa%2Fmacros%2Fkstf.org%2Fs%2FAKfycbzEsLBQucXCZZJwEh9c3RYhn81uJucvz3R5vHeJ2w%2Fexec%3Fmid%3DACjPJvFOqF88RUUrqDeapp1PHF_lI3Xc3g5Hd3euTifzUYeaILmTTlMfBQ13yI_6%26bc%3Dtransparent%26f%3DArial%2C%2BVerdana%2C%2Bsans-serif%26tc%3D%2523444444%26lc%3D%25230033cc&up_scroll=no&w=100%&h=200;mid:0;spec:http#58//www.gstatic.com/sites-gadgets/iframe/iframe.xml;up_iframeURL:/a/macros/kstf.org/s/AKfycbzEsLBQucXCZZJwEh9c3RYhn81uJucvz3R5vHeJ2w/exec?mid=ACjPJvFOqF88RUUrqDeapp1PHF_lI3Xc3g5Hd3euTifzUYeaILmTTlMfBQ13yI_6&bc=transparent&f=Arial,+Verdana,+sans-serif&tc=%23444444&lc=%230033cc;up_scroll:no;width:100%;" width="500" height="200" style="display:block;text-align:left;margin-right:auto;" class="igm"></div>
As best as I can tell, the "hey please set this as your HTML" method seems to be doing some chicanery to make certain that the document is properly parsed, but it's getting caught up in a tail-chasing effect of the iframe redirect. If I could hand in some DOM or something similar, this wouldn't be an issue.
Any advice? This was just a kind of exercise to see if I could finesse the visitor comment system anyway, so I'll probably just take another approach.
*: I know about several other ways to handle visitor comments, but this system needs to be able to work on many site pages, for many site authors in an apps domain, without needing complicated setup on the part of the site author. Eventually, I'll use something else (probably a variation on one of the two app engine forum systems I found this morning), but this was a quick stab at an interim solution. Next interim step is to save these data to the site DB and lay out the comments in the gadget itself. Gadget sizing is unsatisfying, however - I'd rather have the comments right in the page instead of in a separate iframe that has its own scroll bars.

First of all, the getEmail() method doesn't work with consumer accounts or when people outside your domain visit the site (unless your script runs as the user accessing the app)
Next, when you change the HTML of a page, it doesn't make the change immediately, but should happen on the next refresh.
Having said that, what you could do is have your script display the comments as well. You can have labels in your script that display previous comments.

Related

retrieving URLs from functions within HTML (python)

I need to scrape some URLs from some retailer product pages, but the specific URLs I need to get aren't in the html part of the page. The html looks like this for each of the items on which one would click to get to the page with the URL I need to grab:
<div id="name" class="hand bold" onclick="AVON.productcontrol.Go(45714);">ADVANCE TECHNIQUES Color Protection Conditioner Bonus Size</div>
I wrote the following to get URLs from the page, but since the actual URLs I need don’t seem to be stored in the page, it doesn’t get what I need:
def getUrls(URL):
"""input: product page url
output: list of urls to products
"""
connection = urllib.urlopen(URL)
dom = lxml.html.fromstring(connection.read())
selAnchor = CSSSelector('a')
foundElements = selAnchor(dom)
urlList = [e.get('href') for e in foundElements]
return urlList
Is there a way to get the link that the function after ‘onclick’ (I guess AVON.productcontrol.Go(#);) takes you to? I don’t fully understand html, and while I’ve read a bit about onclick, I can’t figure out how the function after 'onclick' works.
In order to find the URL that you are taken to on click, you need to find the JavaScript source code of the 'Go' function and read and understand it. It's buried somewhere within a tag or some JavaScript .js file that is referenced directly or indirectly by the HTML page. Happy digging!
Or: you automate the interaction with the web page with a tool like Selenium (http://docs.seleniumhq.org/) and just check where it takes you if you click.

HTML injection into someone else's website?

I've got a product that embeds into websites similarly to Paypal (customers add my button to their website, users click on this button and once the service is complete I redirect them back to the original website).
I'd like to demo my technology to customers without actually modifying their live website. To that end, is it possible to configure http://stackoverflow.myserver.com/ so it mirrors http://www.stackoverflow.com/ while seamlessly injecting my button?
Meaning, I want to demo the experience of using my button on the live website without actually re-hosting the customer's database on my server.
I know there are security concerns here, so feel free to mention them so long as we meet the requirements. I do not need to demo this for website that uses HTTPS.
More specifically, I would like to demonstrate the idea of financial bounties on Stackoverflow questions by injecting a Paypal button into the page. How would I demo this off http://stackoverflow.myserver.com/ without modifying https://stackoverflow.com/?
REQUEST TO REOPEN: I have reworded the question to be more specific per your request. If you still believe it is too broad, please help me understand your reasoning by posting a comment below.
UPDATE: I posted a follow-up challenge at How to rewrite URLs referenced by Javascript code?
UPDATE2: I discarded the idea of bookmarklets and Greasemonkey because they require customer-side installation/modification. We need to make the process as seamless as possible, otherwise many of get turned off by the process and won't let us pitch.
I would suggest to create a proxy using a HTTP handler.
In the ProcessRequest you can do a HttpWebRequest to get the content on the other side, alter it and return the adjusted html to the browser. You can rewrite the urls inside to allow the loading of images, etc from the original source.
public void ProcessRequest(HttpContext context)
{
// get the content using HttpWebRequest
string html = ...
// alter it
// write back the adjusted html
context.Response.Write(html);
}
If you're demoing on the client-side and looking to just hack it in quickly, you could pull it off with some jQuery. I slapped the button after the SO logo just for a demo. You could type this into your console:
$('head').append('<script src="https://www.paypalobjects.com/js/external/dg.js" type="text/javascript"></script>')
$('#hlogo').append('<form action="https://www.sandbox.paypal.com/webapps/adaptivepayment/flow/pay" target="PPDGFrame" class="standard"><label for="buy">Buy Now:</label><input type="image" id="submitBtn" value="Pay with PayPal" src="https://www.paypalobjects.com/en_US/i/btn/btn_paynowCC_LG.gif"><input id="type" type="hidden" name="expType" value="light"><input id="paykey" type="hidden" name="paykey" value="insert_pay_key">')
var embeddedPPFlow = new PAYPAL.apps.DGFlow({trigger: 'submitBtn'});
Now, I'm not sure if I did something wrong or not because I got this error on the last part:
Expected 'none' or URL but found 'alpha('. Error in parsing value for 'filter'. Declaration dropped.
But at any rate if you are demoing you could just do this, maybe as a plan B. (You could also write a userscript for this so you don't have to open the console, I guess?)
After playing with this for a very long time I ended up doing the following:
Rewrite the HTML and JS files on the fly. All other resources are hosted by the original website.
For HTML files, inject a <base> tag, pointing to the website being redirected. This will cause the browser to automatically redirect relative links (in the HTML file, CSS files, and even Flash!) to the original website.
For the JS files, apply a regular expression to patch specific sections of code that point to the wrote URL. I load up the redirected page in a browser, look for broken links, and figure out which section of JS needs to be patched to correct the problem.
This sounds a lot harder than it actually is. On average, patching each page takes less than 5 minutes of work.
The big discovery was the <base> tag! It corrected the vast majority of links on my behalf.

IE injects VBScript tags in the middle of rendering, causing malformed HTML

For some reason, it seems like IE9 (I believe IE8 too, but not sure), is injecting
<SCRIPT LANGUAGE=VBScript>on error resume next pluginFound = IsObject(CreateObject("DIFFERENT PLUGIN EVERY TIME"))
in the middle of my content without any regards to surrounding context. This means it gets added in the middle of an attribute, or in the middle of some JavaScript, causing the HTML to be malformed and causing all sorts of problems.
This happens on multiple computers with different plugins, so it's not machine specific. And it's also not consistent: the location in which the offending script gets injected varies, the offending script varies. Sometimes you'll get several page loads without a problem and then you'll get the broken HTML.
My page is using a fair amount of JS, but nothing crazy. It's currently using jQuery, Google Maps, Bootstrap, Google Tag Manager, and loading a couple of Twitter, Google+, Facebook Iframes with their own little JS snippets. So, there are some asynchronous callbacks happening, but I wouldn't think this would interfere with how the browser renders the DOM and when it decides to inject plugin code.
You can see the problem if you reload http://www.rew.ca/properties/search/839721 enough times. If you scroll to the bottom of the page, you'll see raw JSON, or sometimes just some random HTML snippet will show in the middle of the page (because of mismatched tags).
Any ideas of why these scripts get injected arbitrarily and how to work around that?
Thanks
[UPDATE]
Here's another example of the script tags getting included in the middle of HTML content:
In my opinion, the problem lies in the http://cn.clickable.net/js/cct.js script, and specifically in the IsIEPlugin method of the __cct_tracker class:
this.IsIEPlugin = function (e) {
var t = !1;
return document.write('<SCRIPT LANGUAGE=VBScript>\n on error resume next \n pluginFound = IsObject(CreateObject("' + e + '")) </SCR' + "IPT>\n"), t ? 1 : 0
}
This method is called several times, with different arguments:
this.pixelRequestParams.cctDir = this.IsIEPlugin("SWCtl.SWCtl.1"),
this.pixelRequestParams.cctFlashPlugin = this.IsIEPlugin("ShockwaveFlash.ShockwaveFlash.1");
if (this.IsIEPlugin("PDF.PdfCtrl.1") == 1 ||
this.IsIEPlugin("PDF.PdfCtrl.5") == 1 ||
this.IsIEPlugin("PDF.PdfCtrl.6") == 1)
this.pixelRequestParams.cctPdf = 1;
this.pixelRequestParams.cctQuickTime = this.IsIEPlugin("Quicktime.Quicktime"),
this.pixelRequestParams.cctRealPlayer = this.IsIEPlugin("rmocx.RealPlayer G2 Control.1"),
this.pixelRequestParams.cctWmPlayer = this.IsIEPlugin("wmplayer.ocx")
I suppose (even I'm not sure) that "cct" stands for "Clickable Conversion Tracking", so it must be some sort of tracking code.
After further investigations, I've determined that the "cct.js" script gets loaded by the Google Tag Manager (GTM) script http://www.googletagmanager.com/gtm.js.
So, if I'm not wrong, by removing the GTM code snippet from your HTML page, you should be able to solve the problem.
It seems that the VBSCRIPT code dynamically injected by the GTM JavaScript code sometimes is not executed, but don't ask me why, since I don't really know.

CodeIgniter + jQuery(ajax) + HTML5 pushstate: How can I make a clean navigation with real URLs?

I'm currently trying to build a new website, nothing special, nice and small, but I'm stuck at the very beginning.
My problems are clean URLs and page navigation. I want to do it "the right way".
What I would like to have:
I use CodeIgniter to get clean URLs like
"www.example.com/hello/world"
jQuery helps me using ajax, so I can
.load() additional content
Now I want to use HTML5 features like pushstate to
get rid of the # in the URL
It should be possible to go back and forth without a page refresh but the page will still display the right content according to the current URL.
It should also be possible to reload a page without getting a 404 error. The site should exist thanks to CodeIgniter. (there is a controller and a view)
For example:
A very basic website. Two links, called "foo" and "bar" and a emtpy div box beneath them.
The basic URL is example.com
When you click on "foo" the URL changes to "example.com/foo" without reloading and the div box gets new content with jQuery .load(). The same goes for the other link, just of course different content and URL.
After clicking "foo" and then "bar" the back button will bring me back to "example.com/foo" with the according content. If I load this link directly or refresh the page, it will look the same. No 404 error or something.
Just think about this page and tell me how you would do this.
I would really love to have this kind of navigation and so I tried several things.
So far...
I know how to use CodeIgniter to get the URLs like this. I know how to use jQuery to load additional content and while I don't fully understand the html5 pushstate stuff, I at least got it to work somehow.
But I can't get it to work all together.
My code right now is a mess, that's the reason I don't really want to post it here. I looked at different tutorials and copy pasted some code together. Would be better to upload my CI folder I guess.
Some of the tutorials I looked at:
Dive into HTML5
HTML5 demos
Mozilla manipulating the browser history
Saner HTML5 history
Github: History.js
(max. number of links reached :/)
I think my main problem is, that everybody tries to make it compatible with all browsers and different versions, adds scripts/jQuery plugins and whatnot and I get confused by all the additional code. There is more code between my script-tags then actual html content.
Could somebody post the most basic method how to use HTML5 for my example page?
My failed attemp:
On my test page, when I go back, the URL changes, but the div box will still show the same content, not the old one. I also don't know how to change the URL in the script according to the href attribute from the link. Is there something like $(this).attr('href'), that changes according to which link I click? Right now I would have to use a script for every link, which of course is bad.
When I refresh the site, CodeIgniter kicks in and loads the view, but really only the view by itself, the one I loaded with ajax, not the whole page. But I guess that should be easy to fix with a layout and the right controller settings. Haven't paid much attention to this yet.
Thanks in advance for any help.
If you have suggestions, ideas, or simple just want to mention something, please let me know.
regards
DiLer
I've put up a successful minimal example of HTML5 history here: http://cairo140.github.com/html5-history-example/one.html
The easiest way to get into HTML5 pushstate in my opinion is to ignore the framework for a while and use the most simplistic state transition possible: a wholesale replacement of the <body> and <title> elements. Outside of those elements, the rest of the markup is probably just boilerplate, although if it varies (e.g., if you change the class on HTML in the backend), you can adapt that.
What a dynamic backend like CI does is essentially fake the existence of data at particular locations (identified by the URL) by generating it dynamically on the fly. We can abstract away from the effect of the framework by literally creating the resources and putting them in locations through which your web server (Apache, probably) will simply identify them and feed them on through. We'll have a very simple file system structure relative to the domain root:
/one.html
/two.html
/assets/application.js
Those are the only three files we're working with.
Here's the code for the two HTML files. If you're at the level when you're dealing with HTML5 features, you should be able to understand the markup, but if I didn't make something clear, just leave a comment, and I'll walk you through it:
one.html
<!doctype html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.js"></script>
<script src="assets/application.js"></script>
<title>One</title>
</head>
<body>
<div class="container">
<h1>One</h1>
Two
</div>
</body>
</html>
two.html
<!doctype html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.js"></script>
<script src="assets/application.js"></script>
<title>Two</title>
</head>
<body>
<div class="container">
<h1>Two</h1>
One
</div>
</body>
</html>
You'll notice that if you load one.html through your browser, you can click on the link to two.html, which will load and display a new page. And from two.html, you can do the same back to one.html. Cool.
Now, for the history part:
assets/application.js
$(function(){
var replacePage = function(url) {
$.ajax({
url: url,
type: 'get',
dataType: 'html',
success: function(data){
var dom = $(data);
var title = dom.filter('title').text();
var html = dom.filter('.container').html();
$('title').text(title);
$('.container').html(html);
}
});
}
$('a').live('click', function(e){
history.pushState(null, null, this.href);
replacePage(this.href);
e.preventDefault();
});
$(window).bind('popstate', function(){
replacePage(location.pathname);
});
});
How it works
I define replacePage within the jQuery ready callback to do some straightforward loading of the URL in the argument and to replace the contents of the title and .container elements with those retrieved remotely.
The live call means that any link clicked on the page will trigger the callback, and the callback pushes the state to the href in the link and calls replacePage. It also uses e.preventDefault to prevent the link from being processed the normal way.
Finally, there's a popstate event that fires when a user uses browser-based page navigation (back, forward). We bind a simple callback to that event. Of note is that I couldn't get the version on the Dive Into HTML page to work for some reason in FF for Mac. No clue why.
How to extend it
This extremely basic example can more or less be transplanted onto any site because it does a very uncreative transition: HTML replacement. I suggest you can use this as a foundation and transition into more creative transitions. One example of what you could do would be to emulate what Github does with the directory navigation in its repositories. It's an intermediate manoever that requires floats and overflow management. You could start with a simpler transition like appending the .container in the loaded page to the DOM and then animating the old container to {height: 0}.
Addressing your specific "For example"
You're on the right track for using HTML5 history, but you need to clarify your idea of exactly what /foo and /bar will contain. Basically, you're going to have three pages: /, /foo, and /bar. / will have an empty container div. /foo will be identical to / except in that container div has some foo content in it. /bar will be identical to /foo except in that the container div has some bar content in it. Now, the question comes to how you would extract the contents of the container through Javascript. Assuming that your /foo body tag looked something like this:
<body>
foo
bar
<div class="container">foo</div>
</body>
Then you would extract it from the response data through var html = $(data).filter('.container').html() and then put it back into the parent page through $('.container').html(html). You use filter instead of the much more reasonable find because from some wacky reason, jQuery's DOM parser produces a jQuery object containing every child of the head and every child of the body elements instead of just a jQuery object wrapping the html element. I don't know why.
The rest is just adapting this back into the "vanilla" version above. If you are stuck at any particular stage, let me know, and I can guide you better though it.
Code
https://github.com/cairo140/html5-history-example
Try this in your controller:
if (!$this->input->is_ajax_request())
$this->load->view('header');
$this->load->view('your_view', $data);
if (!$this->input->is_ajax_request())
$this->load->view('footer');

MediaWiki: page editing allowed by creator only or with approval

I'm trying to restraint editing on the Wiki (using MediaWiki) that I'm creating as an internal project for my company.
We would like to be able to let the page creators specify none or one of the two following options:
Nobody besides the creator of this page can edit the content of this page
Anybody can edit the content of this page, but there must be an approval by the page creator before the changes are visible (whether it'd be by mail, on the wiki directly or something else - does not matter).
If the creator does not specify any of the 2 options, anybody can edit the page, and the changes are immediatly visible (default behaviour).
I've been browsing the net but I did not find an out-of-the-box solution for this. We managed to make some great custom stuff thanks to the edition of the LocalSettings file but not this.
Is there a solution for that functionality?
I don't know of an extension that would make this easy.
What I think you could do would be to take an extension like Flagged Revs or Approved Revs and make it so that instead of using groups as the determiner of approval status, it uses username. This might not be too difficult. Does this make sense?
I had the same problem as you and now i fixed it, here is the solution:
I am using http://www.mediawiki.org/wiki/Extension%3aApproved_Revs for article protection but it didn't fulfil my need it allowed the user to change the currently approved revision of the article and so the change was immediately reflected on the main page so I hacked it a bit, basically you need only one change
go to ApprovedRevs/ApprovedRevs.hooks.php
and find the following code:
static public function setLatestAsApproved( &$article , &$user, $text,
$summary, $flags, $unused1, $unused2, &$flags, $revision,
&$status, $baseRevId ) {
this is a function declaration just after it add the following code:
return false;
and it will work the way you wanted it to be i.e (the change you did will not be reflected until you approve it)