Search box microformat? - html

Is there any microformat/standard for implementing search form on the site?
(access keys, naming etc.)
Any good practices?

The only things I can think of are that searches should be GET requests, and that you may want to offer a RESTful API that lets developers query for JSON and XML results in addition to HTML.
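For example, such an API could be exercised straight from the browser. This is purely an illustration: the /search endpoint, the q parameter, and the Accept-header negotiation shown here are assumptions about how you might design it, not any standard:

// Hypothetical JSON search endpoint; parameter names are placeholders.
fetch('/search?q=hats', { headers: { 'Accept': 'application/json' } })
    .then(function (res) { return res.json(); })
    .then(function (results) { console.log(results); });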
If you are trying to implement a plugin for browsers like IE and Firefox to allow for search/autocomplete in the search box of the browser, check out this: https://developer.mozilla.org/en/creating_opensearch_plugins_for_firefox
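At its core, an OpenSearch plugin is a small XML description document. A minimal sketch (the ShortName, Description, and template URL are placeholders for your own site):

<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Search</ShortName>
  <Description>Search example.com</Description>
  <Url type="text/html" template="https://example.com/search?q={searchTerms}"/>
</OpenSearchDescription>

You then point browsers at it with <link rel="search" type="application/opensearchdescription+xml" href="/opensearch.xml" title="Example Search"> in your page head.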

Here are some conventions I follow. Forgive me if these are not exactly on topic for microformats, or are "technically not" microformats in the way I describe the various parts of my answer.
I have found validation in these few standards I've copied from others:
HTML form ID = "search"
Action URL for the form is //root-of-site/search/
Search Results URL construct:
//root-of-site/search?q=queryClause1+Clause2&AnotherParamName=foo
[personally that structure bugs me a little because search-forward-slash appears to be a directory, and the search-question-mark looks like a page taking a query string, and IMO a page should have a suffix. I've been tempted to use search.cgi or search.app, but I see the big guys using /search?q= and so it is]
Name (and usually ID) of the search query input is "q" (this is nearly universal in adoption)
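Putting those conventions together, a minimal search form might look like this (a sketch; adjust the action path to your own site):

<form id="search" action="/search/" method="get">
  <input type="text" id="q" name="q">
  <input type="submit" value="Search">
</form>

Because the method is GET, submitting it produces a URL of the kind shown above: /search/?q=queryClause1+Clause2.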

How to extract data and generate a URL from it?

I'm new to stackoverflow (Hello World!). I have some basic understanding of JS, C++, HTML, and CSS, and I have been looking in this and other forums, but I am having trouble figuring this one out, mostly because I don't know what this would be called (TLDR at the bottom):
Essentially, I would like to build a Chrome extension that extracts data from a website (in this case, copart, a website where people sell cars) and creates a link from it that opens another window to one of three car evaluators (Edmunds, KBB, NADA). I fix cars as a hobby, but it's a pain to have to input vehicle info over and over, so I wanted to automate the process as much as possible. Hopefully this will help others as well.
E.g. a generic link to Edmunds is: https://www.edmunds.com/ford/escape/2018/appraisal-value/?vin=XXXXXXXXXXXXXX. I would like to know how to extract the make, model, year, and VIN, in this case, from copart (Example copart page). On KBB, e.g., all I see that can be automated is inputting the VIN into the window and clicking "Go". Is there a way to have the plugin automatically select "VIN", copy the VIN into the field, and click the "Go" button?
I know, a lot of questions. I'm also not quite sure what this would be called. A crawler? A scraper? A craper? :)
Either way, here is the basic (TLDR) question:
How to create a chrome plugin that extracts data from one website, opens a URL using that data, and which then performs an action like switching a label, populating a textbox, and clicking a button on that URL?
I have only posed this question here so if there's a better place to put it, please let me know.
Mark
Extracting data from one website and searching for the scraped data on another website
1. For this project you can use a combination of Selenium and Scrapy.
2. Since both sites are dynamic pages powered by JavaScript, you need to check their security constraints.
3. You can use a Scrapy spider per site, each with Selenium support.
4. Pressing the Go button can be done with Selenium.
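Staying closer to the Chrome extension the question actually asks for, the extraction step could also be done in a content script. A sketch only: every selector below is a hypothetical placeholder, since the real copart markup has to be inspected first, and a real extension would normally open tabs via the tabs API rather than window.open.

// Content script sketch -- all selectors are hypothetical placeholders.
function grab(selector) {
    var el = document.querySelector(selector);
    return el ? el.textContent.trim() : '';
}

var vin   = grab('#vin');    // placeholder selector
var year  = grab('#year');   // placeholder selector
var make  = grab('#make');   // placeholder selector
var model = grab('#model');  // placeholder selector

// Build the Edmunds appraisal URL from the question:
var url = 'https://www.edmunds.com/' + make.toLowerCase() + '/' +
          model.toLowerCase() + '/' + year + '/appraisal-value/?vin=' + vin;

window.open(url, '_blank'); // simplest option for a sketch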

Send and receive data to and from a website using the TWebbrowser component in Delphi

I'm creating a VCL Application with Delphi 10.3 and want to support some web functionality by having the user enter the ISBN of a book into a TEdit component, and from there passing this value to the search field on this website: https://isbnsearch.org, after which the website looks up the ISBN and displays the author of the book. I want to somehow access the information (i.e. the author) presented by the search result and use it again in my application.
This is my GUI, for a better idea of what I want to accomplish:
What code can I use for this? Any other feasible suggestions or approaches are acceptable.
When performing a search on that website, it simply loads a page with a specific URL query string...
https://isbnsearch.org/search?s=suess
The above example is when I search for "suess", so you can easily concatenate a search URL.
You can use any HTTP component, such as TIdHTTP, to load this search page, then use an HTML parser to scrape the page and read what you need. Much, much easier than trying to read through the TWebBrowser.
In the end, you won't actually display the HTML (I mean you can if you want to), but the idea is to read the data and display it in your own format.
On that specific page, start by locating the ul element with id searchresults. Then, each li element contains an individual result. Unfortunately, this website uses pagination and only shows 10 results per page. To get further pages, call the page again with an extra parameter: &p=2 for the 2nd page, &p=3 for the 3rd page, and so on.
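As a rough illustration of the structure being described (hypothetical markup; inspect the live page for the real layout):

<ul id="searchresults">
  <li>
    <!-- one result: title link, author, ISBNs, ... -->
    <a href="...">Book Title</a>
  </li>
  <!-- up to 10 li elements per page; request &p=2 for the next batch -->
</ul>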
On the other hand, that is the worst way to acquire such information. What you should be doing is using a proper API which gives you machine-friendly data. The service you are referencing doesn't appear to offer one, but here's an example of one which does:
https://openlibrary.org/dev/docs/api/books - this also appears to provide MUCH more information than the service you're using.
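For instance, the Open Library Books API can be queried by ISBN. A quick browser-console sketch (the jscmd=data form is taken from their documentation; verify the parameters against the current docs, and the same request works just as well from TIdHTTP):

fetch('https://openlibrary.org/api/books?bibkeys=ISBN:9780140328721&format=json&jscmd=data')
    .then(function (res) { return res.json(); })
    .then(function (data) {
        var book = data['ISBN:9780140328721'];
        console.log(book.title, book.authors[0].name);
    });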

Using Google Chrome's console to perform searches

I work at a gym at the front desk. The website we use takes a while to navigate using the GUI. The console in the Google Chrome browser seems pretty powerful. Can someone please direct me to some sort of tutorial, or even answer this question yourself?
How would I use the Google Chrome console (inspect element > console) to perform searches using the website's search ability?
Thanks for your help!
You will need some basic understanding of JavaScript and the DOM to do this.
If your site has a simple form for searching, you could use something like this:
// Fill in the search field, then submit its form (IDs are site-specific):
document.getElementById('IdOfSearchField').value = 'test';
document.getElementById('IdOfSearchForm').submit();
What it does is find the search field, set its value to "test", and then submit the search form.
On this site (sorry for it being in Swedish, it was the one I was currently working on) you could use the following to search for 'test'.
document.getElementById('query').value = 'test'; document.getElementById('SpeedSearchForm').submit();
If your site has jQuery loaded you could simplify this a bit.
$('#query').val('test'); $('#SpeedSearchForm').submit();
Another way of doing it would be to navigate directly to the search results page with the proper query string (if that is supported by your website). In my sample case, it would look like this (because the search page is located at /search and just needs the query string parameter query to work):
window.location = '/search?query=test';
But as ajp15243 noted in the comments on the question, it all depends on how your site is built. It's also a bit messy to type all that for every search.
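One way to cut down on the typing is to paste a small helper into the console once per session (using the same site-specific IDs as above):

// Define once, then call search('whatever'):
function search(term) {
    document.getElementById('query').value = term;
    document.getElementById('SpeedSearchForm').submit();
}
search('test');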

HTML form method with nice URL

I just want to know whether there is a way to answer this question with "Yes" without using JavaScript.
What I want to do is have a search form that automatically generates URLs like http://example.com/search/my+search+term or something similar when I enter my search term into a search text field.
EDIT: Due to some mis-understanding (and not being clear on my part), a clarification: I want the browser to generate that URL based on the value of the text field when the form is submitted.
No, it's not possible without using JavaScript.
The best you can do is use a GET action and have a URL like http://example.com/search/?q=my+search+term, where q is the name of the input search box.
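In markup, that fallback is just a plain GET form (a minimal sketch):

<form action="/search/" method="get">
  <input type="text" name="q">
  <input type="submit" value="Search">
</form>

Submitting it takes the browser to /search/?q=my+search+term with no JavaScript involved.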
Using HTML only, no.
You could have something server side that might work. You could have the server respond with a 302 response code. If you are using Apache, you could probably use mod_rewrite to take the GET request and generate a new URL.
For example, the browser might ask for http://example.com/search/?q=blah+foo+bar, the server could then take that and send the browser a 302 redirect for http://example.com/search/blah+foo+bar.
See more information at the Apache url rewriting guide, or by using your favorite search engine.
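A sketch of what such a rule could look like in a .htaccess file (assuming mod_rewrite is enabled; test against your own setup):

# Redirect /search/?q=blah+foo+bar to /search/blah+foo+bar
RewriteEngine On
RewriteCond %{QUERY_STRING} ^q=(.+)$
RewriteRule ^search/?$ /search/%1? [R=302,L]

The trailing ? on the substitution drops the original query string from the redirect target.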
You could still use JavaScript to generate the correct URL, but if someone has JavaScript disabled, this would work as a fallback.
The answer is No
No if you want it to be client side; if you can do it server side (by submitting the form), you can use something like PHP.
Yes you could perform something like this server-side pretty easily as long as you don't mind submitting a form.
EDIT: Upon further clarification from the author in comments below: It is not possible in a pure client-side manner without JavaScript or some other client-side tool like Flash/Silverlight (which is admittedly overkill).

REST/Ajax deep linking compatibility - Anchor tags vs query string

So I'm working on a web app, and I want to filter search results.
A nice restful implementation might look like this:
1. mysite.com/clothes/men/hats+scarfs
But let's say we want to Ajax up the filtering, like the cool kids, and we want to retain deep linking; we might use the anchor tag and parse that with JavaScript to show the correct listings:
2. mysite.com/clothes#/men/hats+scarfs
However, if someone clicks the first link with JS enabled, and then changes filters, we might get:
3. mysite.com/clothes/men/hats+scarfs#/women/shoes
Urk.
Similarly, if someone does not have JS enabled, and clicks link 2 - JS will not parse the options and the correct listings will not be shown.
Are Ajax deep links and non-Ajax links incompatible? It would seem so, as servers cannot parse the # part of a url, since it is not sent to the server.
There's a monkeywrench being thrown into this issue by Google: A proposal for making Ajax crawlable. Google is including recommendations for url structure there that may give you ideas for your own application.
Here's the wrapup:
In summary, starting with a stateful URL such as http://example.com/dictionary.html#AJAX, it could be available to both crawlers and users as http://example.com/dictionary.html#!AJAX, which could be crawled as http://example.com/dictionary.html?_escaped_fragment_=AJAX, which in turn would be shown to users and accessed as http://example.com/dictionary.html#!AJAX.
View Google's Presentation here (note: google docs presentation)
In general I think it's useful to simply turn off JavaScript and CSS entirely and browse your website and web application and see what ends up getting exposed. Once you get a sense of what's visible, you will understand what most search engines see and that in turn will show you what is and is not getting spidered.
If you go to mysite.com/clothes/men/hats+scarfs with JavaScript enabled, then your JavaScript should automatically rewrite that to mysite.com/clothes#men/hats+scarfs. When you click on a filter, the filters should be controlled by JavaScript, meaning you'll only change the hash rather than the entire URL (as you're going to have to return false anyway).
The problem you have is for non-JS users going to your JS-enabled deep links, as the server can't determine that stuff. Unfortunately, the only thing you can do is take them to mysite.com/clothes and make them start their journey again (as far as I'm aware). You'll need to try and ensure that when people link to the site, they use the hardcoded deep link rather than the hashed deep link.
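The automatic rewrite described above could be as simple as this (a sketch; the path names follow the examples in the question):

// If a JS-capable user lands on the hard URL, move the filter part into
// the hash so later Ajax filtering stays on /clothes.
var match = window.location.pathname.match(/^\/clothes\/(.+)$/);
if (match) {
    window.location.replace('/clothes#' + match[1]);
}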
I don't recommend ever using the query string, as you are sending data back to the server without direct relevance to the prior specified destination. That is a corruptible security hole, as malicious code can be manually added to the query string to attempt an XSS or buffer overflow attack against your web server.
I believe REST was intended to work with absolute URIs without a query string, because then you're specifying only the location of a resource, and it is that location that is descriptive and semantically relevant, in addition to the resource itself possibly being equally relevant. Even if there is no resource at the specified path, you have still instantiated a potentially unique and descriptive location that can be processed accordingly.
Users entering the site via deep links
Nonsensical links (like /clothes/men/hats#women/shoes) can be avoided if you construct your Ajax initialisation code in such a way that users who enter the site on filtered pages (e.g. /clothes/women/shoes) are taken to the /clothes page before any Ajax filtering happens. For example, you might do something like this (using jQuery):
$("a.filter")
.each(function() {
var href = $(this).attr("href").replace("/clothes/", "/clothes#");
$(this).attr("href", href);
})
.click(function() {
update_filter($(this).attr("href").split("#")[1]);
});
Users without JavaScript
As you said in the question, there's no way for the server to know about the URL fragment so filtering would not be applied for users without JavaScript enabled if they were given a link to /clothes#filter.
However, even without filtering, these links could be made more meaningful for non-JS users by using the filter strings as IDs in your /clothes page. To prevent this messing with the Ajax experience the IDs would need to be changed (or the elements removed) with JavaScript before the Ajax links were initialised.
How practical this is depends on how many categories you have and what your /clothes page contains.