I want to create a link like
<a href="mailto:someone@example.com">Link text</a>
But instead of mailing a specific person, I want it to open a PostgreSQL tool with as many settings pre-filled as are available.
I understand that this will depend a lot on the client's settings, but I just want to give it a decent chance of working, if there is such a thing.
I experimented unsuccessfully with
<a href="postgresql://…">Connect</a>
According to the documentation, href links are not restricted to HTTP-based URLs, but the list of supported schemes is quite short: the usual http, the mailto you mentioned, or even tel for phone numbers. postgres is not an option; however, with registerProtocolHandler() you can add your own.
That means you can use registerProtocolHandler() to register postgres so that it opens a specific link on your page. But that is not yet the solution to your question: you then need that specific link to point to some service that actually handles the postgres connection.
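As a minimal sketch of what that registration could look like (the handler URL is a placeholder of my own, and note that current browsers generally only accept safelisted schemes or custom ones prefixed with web+, so web+postgres rather than plain postgres):

```typescript
// Minimal sketch: register a handler page for a custom "web+postgres" scheme.
// "https://example.com/open-connection" is a placeholder; it would have to be
// a page on your own origin that knows what to do with the connection string.
if ("registerProtocolHandler" in navigator) {
  navigator.registerProtocolHandler(
    "web+postgres",
    "https://example.com/open-connection?uri=%s" // %s receives the clicked URL
  );
}
```

A link such as <a href="web+postgres://db.example.com/mydb">Connect</a> would then be routed to that handler page, which is the part that would ultimately have to launch or emulate the PostgreSQL tool.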
I'm wondering if there's a way to make google searches where you can set filters you want to be in effect permanently - like a filter profile. So, for instance, every time you would do a search, you could get results that didn't include say, Yahoo Answers, without having to type in -yahoo -answers.
A feature like this would be invaluable because it's very common to perform a search and want to filter out a lot of popular sites that would normally top the rankings. For example, suppose you're searching for a news topic and don't want to read mainstream media articles. You could add the words reuters, cnn, huffington post, daily mail, and so on to your filter profile and never see those sites turn up in any of your searches ever again.
I'm asking because I'm interested in making an extension that would do precisely this, but there's no point if such a feature already exists.
You can create a Custom Search in minutes. It's called Google CSE (Custom Search Engine)
This is a sample public link that I've created based on your example above: https://www.google.com/cse/publicurl?cx=006201654654568968489:1kv4asuwfvs
In the settings:
I can choose to exclude by URL, URL pattern, or even URLs within my search results.
If you need more ways, here's a good and relevant link.
Search filters can be specified as part of the URL (e.g. append site:example.com/section1 to a Google query to only yield results whose locations start with that prefix). So you can make a search plugin that substitutes your query into such a template and install it into your browser.
Search plugins are generally XML files with a standardized schema. OpenSearch is one such standard supported by Chrome.
There are sites that host collections of user-submitted plugins as well as tools to generate your own. An example that I use is the Mycroft project (originally created for Apple Sherlock software that pioneered the concept and later accepted into the Mozilla project when Firefox took on the feature).
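To make the template-substitution idea above concrete, here is a rough sketch; the excluded sites are just example values and the function name is my own:

```typescript
// Build a Google search URL with a fixed "filter profile" baked in.
// The site list below is purely illustrative.
const excludedSites = ["answers.yahoo.com", "example-content-farm.com"];

function buildSearchUrl(query: string): string {
  const filters = excludedSites.map((site) => `-site:${site}`).join(" ");
  return (
    "https://www.google.com/search?q=" +
    encodeURIComponent(`${query} ${filters}`)
  );
}

// buildSearchUrl("news topic")
//   -> "https://www.google.com/search?q=news%20topic%20-site%3Aanswers.yahoo.com%20-site%3Aexample-content-farm.com"
```

An OpenSearch plugin or a browser search shortcut would use the same URL with {searchTerms} (or %s) in place of the query.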
What methods are used to show a user help the first time they use a page, to showcase certain features they might not realize are there?
For instance, say a search form is introduced that has a hidden "advanced search" option:
I would think most people would see the chevron and click it, but... you never know. I know that I could set a cookie to say "Hey, this user has seen it" or create a table in the database.
The problem I see with the cookie approach is that if the user deletes their cookies and logs back in, they will always have to dismiss the alert/error/whatever again. Unless, after a period of time, I go in and remove it manually (at which point new users wouldn't see the alert).
Alternatively, adding a table to the database seems too much for such a simple task. It's what I'm leaning towards, but I hate it...there has to be a better way.
Are there any other ways to show a one time alert for certain pages?
Edit - I used a pretty trivial example on purpose.
I guess both of your options are workable. The cookie option is a bit better because it is lighter on the server; if you have many users, the database option won't be great.
You may also look at the HTML5 feature for storing data on the client side: Web Storage, a better kind of local storage.
It works like localStorage.uid = "1234", or something like a click count. Refer to the HTML5 docs; it's a great feature as well.
Here's the link:
http://www.w3schools.com/html/html5_webstorage.asp
Have fun.
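As a minimal sketch of that localStorage approach, assuming a flag name and a hint function of my own invention (it has the same weakness as a cookie: clearing browser data resets it):

```typescript
// Show the "advanced search" hint only once per browser.
const FLAG = "hasSeenAdvancedSearchTip"; // made-up key name

if (!localStorage.getItem(FLAG)) {
  showAdvancedSearchTip();
  localStorage.setItem(FLAG, "1"); // don't show it again on this browser
}

function showAdvancedSearchTip(): void {
  // Placeholder: highlight the chevron / advanced search toggle here.
  console.log("Tip: there is an advanced search hidden behind the chevron.");
}
```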
I have written a website for a local Go meeting in Berlin. It is translated into German, English and Chinese. Currently, I use the naming scheme index.<lang>.html for the three translations and a navigation bar on top to let the user choose.
Is it possible to use meta tags on the index.html (which currently is just a symlink) to have the user agent automagically redirect to the page in the right language where possible? I am interested in solutions that neither involve reconfiguring the server nor need JavaScript to be enabled, although the first one might be possible.
You can use HTTP content negotiation to select a version that best matches the language preference information that the browser sends. So it is possible without scripting, but you need to set things up in the server for the negotiation.
However, this is not very practical, because the language preference information cannot be relied on. It is mostly based on browser defaults, since few users even know about the relevant settings in their browser, still less set them appropriately.
Is it possible to use meta tags on the index.html (which currently is just a symlink) to let the user agent automagically redirect to the site with the right language if possible?
No.
If you want automatic selection, then you need to pay attention to the Accept-Language header in the request. That needs server configuration or scripting.
Without it, the best you can have is links to the translations of the document which the user can select manually.
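For illustration, a minimal sketch of the scripting variant, assuming a small Node front end sits in front of the static files (i.e. exactly the server-side piece described above):

```typescript
// Redirect "/" to index.<lang>.html based on the Accept-Language header.
import * as http from "http";

const SUPPORTED = ["de", "en", "zh"]; // the three available translations

http
  .createServer((req, res) => {
    if (req.url === "/" || req.url === "/index.html") {
      const header = req.headers["accept-language"] ?? "";
      // e.g. "de-DE,de;q=0.9,en;q=0.8" -> take the first tag's primary subtag
      const preferred = header.split(",")[0].trim().slice(0, 2).toLowerCase();
      const lang = SUPPORTED.includes(preferred) ? preferred : "en";
      res.writeHead(302, { Location: `/index.${lang}.html` });
      res.end();
      return;
    }
    // ... serve the remaining static files here ...
    res.writeHead(404);
    res.end();
  })
  .listen(8080);
```

Most web servers can achieve the same effect through their built-in content-negotiation features, which is what the first answer refers to.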
I want to store video embed code in a MySQL database. What is the best way to store it, given that the code can come from various video hosting services?
I would suggest the following:
add a field video_embed of type TEXT;
make sure to sanitize the embed code before storing it.
Now here comes the tricky bit: if you allow HTML code to be stored, you should be very strict about what users are allowed to enter. If you don't check it, users can insert arbitrary code and do bad things.
A safer way is to store the URL only, as it's easier to validate. Also, given that you store the URL, it's always possible to reverse-engineer the embed code ;)
So it depends on your skills and needs, after all.
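A small sketch of the "store the URL and reverse-engineer the embed code" idea; only YouTube is handled here, and the function name is my own:

```typescript
// Validate a submitted video URL and derive the embed URL from it.
// Other services would need their own patterns.
function youtubeEmbedUrl(videoUrl: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(videoUrl);
  } catch {
    return null; // not even a valid URL -> reject before it reaches the database
  }
  // Accept the two common YouTube URL shapes.
  if (parsed.hostname === "youtu.be") {
    return `https://www.youtube.com/embed/${parsed.pathname.slice(1)}`;
  }
  if (parsed.hostname.endsWith("youtube.com") && parsed.searchParams.has("v")) {
    return `https://www.youtube.com/embed/${parsed.searchParams.get("v")}`;
  }
  return null;
}

// youtubeEmbedUrl("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
//   -> "https://www.youtube.com/embed/dQw4w9WgXcQ"
```

The validated URL can then be stored with an ordinary parameterized INSERT, and the embed markup generated at render time.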
The best way is to save the entire address in a VARCHAR column in your database.
Why, you ask?
Let's say you strip all the inputs of their tags and keep only the video code; if something changes in the future, you only have codes in your database, which tell you nothing.
Try adding a dropdown with the various hosting services so you can distinguish the link.
Example:
Dropdown [youtube] --- input [youtube link].
When you want to show the video on the front end, you then know what type it is and therefore how to handle the link.
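A rough sketch of that dropdown idea, where the stored provider value decides how the link is turned into an embed URL (the provider names and URL patterns are illustrative assumptions):

```typescript
// The provider chosen in the dropdown is stored next to the link;
// the front end then knows how to build the embed from it.
type Provider = "youtube" | "vimeo";

function embedUrlFor(provider: Provider, link: string): string {
  switch (provider) {
    case "youtube": {
      const id = new URL(link).searchParams.get("v") ?? "";
      return `https://www.youtube.com/embed/${id}`;
    }
    case "vimeo": {
      const id = new URL(link).pathname.split("/").pop() ?? "";
      return `https://player.vimeo.com/video/${id}`;
    }
  }
}
```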
When screen-scraping, what are the "gotcha"s to look out for?
The inspiration for this is: my spouse's co-worker asked me to scrape all the pages from a Blogger-hosted blog that her friend with cancer kept in her final months and this lady wanted to keep all of the posts in case the blog were ever deleted. I eventually found a free tool that was barely good enough.
One issue with scraping many Blogger pages is that there's often a navigation menu where you can click on the triangles to expand the post lists by year or month. These little buggers created insane amounts of duplicate content because you'd have the same page over and over again with different combinations of the menus being expanded/collapsed. In Blogger's case I'm not sure this is avoidable since the links are all formatted as real http links and not obvious JavaScript calls. Still, it got me thinking:
If you were to scrape a website, what kinds of potentially non-obvious things would you compensate for?
Do not use regex to scrape
While regular expressions can be good for a large variety of tasks, I find they usually fall short when parsing the HTML DOM. The problem with HTML is that the structure of a document is so variable that it is hard to accurately (and by accurately I mean a 100% success rate with no false positives) extract a tag.
What I recommend you do is use a DOM parser such as BeautifulSoup or equivalent (SimpleHTMLDom in PHP).
Some may think this is overkill, but in the end, it will be easier to maintain and also allows for more extensibility.
A regular expression could be devised to achieve the same goal but would be limited. For example, developing a regex to get the src and alt tag would force the alt attribute to be after the src or the opposite, and to overcome this limitation would add more complexity to the regular expression.
Also, consider the following. To properly match an <img> tag using regular expressions and to get only the src attribute (captured in group 2), you need the following regular expression:
<\s*?img\s+?[^>]*?\s*?src\s*?=\s*?(["'])((\\?+.)*?)\1[^>]*?>
And then again, the above can fail if:
The attribute or tag name is in capital and the i modifier is not used.
Quotes are not used around the src attribute.
An attribute other than src uses the > character somewhere in its value.
Some other reason I have not foreseen.
So again, simply don't use regular expressions to parse a DOM document.
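For contrast, a small sketch of the same extraction done with a real parser (the browser's DOMParser here; BeautifulSoup or Simple HTML DOM play the same role in Python and PHP):

```typescript
// Extract all image src attributes by parsing the HTML instead of regexing it.
function imageSources(html: string): string[] {
  const doc = new DOMParser().parseFromString(html, "text/html");
  return Array.from(doc.querySelectorAll("img"))
    .map((img) => img.getAttribute("src") ?? "")
    .filter((src) => src.length > 0);
}

// Handles attribute order, casing, unquoted values and ">" inside attribute
// values without any extra effort on our part:
// imageSources('<IMG alt="a > b" src=/pics/cat.png>') -> ["/pics/cat.png"]
```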
I screen scrape a lot. Some advice:
Emulate a User-Agent string for some browser you want to use. Different websites frequently return very different results depending on what your user agent is. If they don't recognize the User-Agent they will often revert to lowest common denominator, so it's usually best to start with some recent browser. (For example the World of Warcraft Armory returns beautiful, easy to parse XML if it thinks you're a recent Firefox. If it doesn't know what you are it sends terrible HTML).
Be polite to the site you're scraping; don't hit it too hard. Your scraper will go faster if you multi-thread it, making many requests at once, but that will annoy the site owner.
Be smart about error handling. Do not write code like while (1) { makeRequest(); }. If your code or the server throws an error a loop like this will immediately fetch another request, generating another error. It can get ugly quickly. Handle errors well and consider putting in sleeps or exits if you see a lot of errors.
When developing your parsing code, test against a cached copy of the page rather than hitting the server every time. It will make your development go faster and gives you the basis of a simple test suite. A rough sketch pulling these tips together follows.
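Something along these lines, where the User-Agent string, cache directory, and delay are arbitrary choices rather than recommendations:

```typescript
// Polite fetch with a browser-like User-Agent, simple error handling,
// and an on-disk cache for use while developing the parser.
import { promises as fs } from "fs";
import * as path from "path";
import { createHash } from "crypto";

const USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0";
const CACHE_DIR = "./cache";
const DELAY_MS = 2000; // be polite: one request every couple of seconds

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function fetchPage(url: string): Promise<string> {
  // Serve from the local cache while developing the parsing code.
  const cacheFile = path.join(
    CACHE_DIR,
    createHash("sha1").update(url).digest("hex") + ".html"
  );
  try {
    return await fs.readFile(cacheFile, "utf8");
  } catch {
    /* not cached yet */
  }

  await sleep(DELAY_MS); // don't hammer the site
  const res = await fetch(url, { headers: { "User-Agent": USER_AGENT } });
  if (!res.ok) {
    // Bail out instead of looping straight into the next failing request.
    throw new Error(`Request for ${url} failed with status ${res.status}`);
  }
  const html = await res.text();
  await fs.mkdir(CACHE_DIR, { recursive: true });
  await fs.writeFile(cacheFile, html, "utf8");
  return html;
}
```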
First, I'd check for an RSS feed. On Blogger, you just have to add /rss to the root URL, if I remember correctly.
Then I'd check if there isn't already some tool to scrape blogger.
Then if there's no RSS feed, and no existing tool, I'd give up and do it by hand with copy/paste. Unless we're talking 5000 pages, it's much faster and easier that way. Take it from someone who's tried.
If you have access to the actual account, blogger has an export function.
Edit: or of course, you could try Mechanical Turk.
As far as gotchas are concerned: it's usually a good idea to limit the number of requests made over a given period of time. Hammering a site with a lot of requests in a short space of time is a good way to have your requests rejected.
Aside from the technical considerations, make sure you're not putting yourself at legal risk. Most large sites have specific language in their terms of use that disallows programmatic access to their services via an automated computer program, and there are also the obvious copyright concerns.
From a technical standpoint, definitely use a DOM parser library and you'll save loads of time. Many provide the ability to read HTML into an XML structure that can be queried using XPath to find exactly what you need.
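A small sketch of that XPath idea using the browser's built-in evaluator (an HTML-to-XML parsing library would expose a very similar interface); the class name in the expression is a made-up example, not a real Blogger selector:

```typescript
// Query parsed HTML with XPath to pull out exactly the nodes we care about.
function postTitles(html: string): string[] {
  const doc = new DOMParser().parseFromString(html, "text/html");
  const result = doc.evaluate(
    "//h3[@class='post-title']",
    doc,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const titles: string[] = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    titles.push(result.snapshotItem(i)?.textContent?.trim() ?? "");
  }
  return titles;
}
```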
If you know someone who has access to the account, they can use Blogger's "Export blog" feature.