How to make a link URL go through another page when clicked (HTML)

I'm sorry I do not know how to word that title better. I have tried searching google but my terminology isn't helping my results.
Let me explain the context. When you're on a news website or blog, you start on the homepage, like www.homepage.co.uk/, and when you click an article it goes somewhere like www.homepage.co.uk/2017/article/. How do they make the 2017 appear? And why, if you remove the /article/ from the URL, does it take you to an archive of all the links from that year? I don't understand - is there a process to this?
When I click a link in my website it goes to: www.website.co.uk/link
I want to be able to have that 2017/link/ in the URL so readers can find the archive of that year, just like on those websites.
How do I do this?
I am sorry if I am not explaining this very well.
I understand changing my filenames to "2017/article.html" might work, but I do not believe that is the correct way of doing it?
Thanks a lot for your time and suggestions!

You're asking about a couple of things. One is the taxonomy of the site. Taxonomy, if you don't know, is the "shape" of your site, or how it is organized. News sites, for instance, are usually organized by date and perhaps topic (Health and Leisure, Politics, Entertainment, etc.).
The other aspect of your question is what you might call RESTful "hacking" of URLs. One of the tenets of REST is that URLs (URIs, to be accurate) are supposed to be hackable. A news site might have /2017/10/10 to display all articles for Oct 10. Remove the last "10" and you get all the articles for October so far.
If you are not using a site platform that does this for you, you will have to maintain that taxonomy yourself and manually write all the links. Systems such as Drupal and Joomla, among others, will translate your taxonomy into automatically-maintained links. When editing a page on one of these platforms, you typically only refer to the system's internal name of the page (which could be a shortened version of the article's title in the above example), and the underlying engine takes care of reconstructing the URL for you (in case the page moves, or its tags/taxonomy change).
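If you are hand-maintaining a plain static site, one simple way to get URLs like /2017/article/ is to mirror the taxonomy in your folder structure and keep a hand-written archive page at each level. A minimal sketch is below; the folder and file names are only hypothetical examples, and it assumes your web server serves index.html for a folder URL (most do by default):

<!-- Hypothetical layout:
       /2017/index.html          the year archive, listing every 2017 article
       /2017/article/index.html  the article itself
     Visiting /2017/article/ serves the article, and trimming the URL back
     to /2017/ lands on the year archive. -->

<!-- /2017/index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Archive: 2017</title>
</head>
<body>
  <h1>Articles from 2017</h1>
  <ul>
    <li><a href="/2017/article/">My article</a></li>
    <!-- add another link here by hand each time you publish a 2017 article -->
  </ul>
</body>
</html>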
This is a big topic, and I encourage you to do some further reading:
http://searchcontentmanagement.techtarget.com/feature/Building-a-website-taxonomy-in-eight-steps
https://www.drupal.org/docs/7/organizing-content-with-taxonomies/organizing-content-with-taxonomies

Related

Beginning HTML/CSS designer - how can I add tags to posts that people can use to sort content?

I'm working on a site to help students with ACT prep, and I want to have a page where I can post explanations to questions that people submit. I want to be able to put a few tags on each post so that site visitors can click on or search whatever's relevant for them in the archives ("semicolons", "geometry", etc.) and all the relevant posts will come up, blog style. I'm very new to this, though, and I don't know how to do it or even what to search - when I search for tags I keep getting SEO recommendations, and that doesn't seem like the right thing.
Here's a solution (but it's not great)
It might be the only way to make what you want happen with a static HTML site.
You could, by hand, create pages that you fill with links to all of the posts that fit a certain category or "tag". For example, you could make a page that has links to all of your posts concerning geometry. Let's call this your archive page for geometry.
Then, when you include tags in a post, you would make each tag link to its corresponding archive page.
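As a rough sketch of what that looks like in plain HTML (the file names and post titles here are just hypothetical examples):

<!-- inside a post, e.g. posts/triangle-angles.html -->
<p>Tags:
  <a href="../tags/geometry.html">geometry</a>
  <a href="../tags/semicolons.html">semicolons</a>
</p>

<!-- the hand-maintained archive page, tags/geometry.html -->
<h1>Posts tagged "geometry"</h1>
<ul>
  <li><a href="../posts/triangle-angles.html">Explaining a triangle angle question</a></li>
  <!-- add a new link here by hand every time you write a geometry post -->
</ul>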
Why do I say it's not the best solution?
Virtually every blog that you see has a "back end" with a database that stores posts. When someone comes to your website and looks at a post, that post's data is inserted into a template and displayed to the user. You do not have to re-write the entire web page every time. Things like the header, sidebar, footer, main page background, etc. are all in a template.
Having a database also lets you search the database and return relevant results. And a blog with a back end will typically let you write rules (or have them already written) that say, when you add a "tag" to a post, a link to that post should be automatically added to an archive page etc.
As far as I can tell you don't have a database, so you'll just be linking static HTML pages. That means that every time you make a new post, you'll have to add a link to all of its relevant archive pages by hand. Maybe you don't mind that now, but eventually it will be a nightmare to maintain.
I would strongly encourage you to look into a blogging platform like Wordpress to make your site. It will be more complicated to learn at first, but technology that's meant to do what you want it to do will ultimately be easier to use and maintain than technology that's simply meant to mark up a page.

Using a list of dynamic links throughout website

By "dynamic links", I mean a list of links that will constantly be updated.
To illustrate my question, I have a website that I am constantly writing new articles for. I currently have about 10 articles. If someone is to read article #5, there is a list of links to all 10 articles in the right panel of the page. As I update the site, and article #1 becomes out of date, I'd like to replace article #1 with article #11. Rather than updating the links within every article (so 10 times), is there a way to update the links once and have them all update simultaneously to every page?? Could I create an iframe for this??
Thanks for any and all help!
What's your goal? Do you want to learn to be a web developer? Or are you mostly concerned with getting your articles published?
If you want to be a web developer, I'd recommend steering clear of large CMS systems like WordPress or Drupal. Those are great products, but you want to learn the basics first. I think starting a PHP tutorial is the way to go.
If you just want to publish your articles, I'd recommend you find a nice place to create a blog. There are so many to choose from. It all depends on how much you want to spend.
Feel free to ask follow-up questions. Web development sounds simple, but it's really a complex topic. I can't imagine what it must be like starting out these days with so many choices and competing technologies.
One way to do it would be to use Server Side Includes (see Wikipedia). They work like this:
<!--#include file="some-content.html" -->
or
<!--#include virtual="some-folder/some-content.html" -->
The difference is file="" finds a file relative to the current page, whereas virtual="" finds it from the domain root. Either way, this method can use any type of regular text file as a source. The actual addition of the content is done by the server (hence the name) so its contents will be parsed as regular HTML and all CSS will apply to it as if the file were part of your page. I don't know about compatibility with different hosts, but if your web server supports it, this is probably the easiest way to go.
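For the "same list of links on every page" case in the question, a minimal sketch might look like the following. The file names are hypothetical, and note that many servers only process SSI directives in files with a .shtml extension unless they are configured otherwise:

<!-- sidebar-links.html : the one file you edit when the article list changes -->
<ul>
  <li><a href="/articles/article11.html">Article 11</a></li>
  <li><a href="/articles/article10.html">Article 10</a></li>
  <!-- ...and so on -->
</ul>

<!-- article5.shtml : every article page pulls in the same list -->
<div class="sidebar">
  <!--#include virtual="/sidebar-links.html" -->
</div>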

Crawling data or using API

How do these sites gather all their data - questionhub, bigresource, thedevsea, developerbay?
Is it legal to show data in a frame as bigresource does?
#amazed
How do these sites gather all the data - questionhub, bigresource ...
Here's a very general sketch of what is probably happening in the background at a website like questionhub.com:
1. Spider program (google "spider program" to learn more)
   a. Configured to start reading web pages at stackoverflow.com (for example).
   b. Runs so it goes to the home page of stackoverflow.com and starts visiting all the links that it finds on those pages.
   c. Returns the HTML data from all of those pages.
2. Search index program
   Reads the HTML data returned by the spider and creates a search index, storing the words that it found AND what URLs those words were found at.
3. User interface web page
   Provides a feature-rich user interface so you can search the sites that have been spidered.
Is it legal to show data in a frame as bigresource does?
To be technical, "it all depends" ;-)
Normally, websites want to be visible in Google, so why not other search engines too? Just as Google displays part of the text that was found when a site was spidered, questionhub.com (or others) has chosen to show more of the text found on the original page, possibly keeping the formatting that was in the original HTML or changing the formatting to fit their standard visual styling.
A remote site can 'request' that spiders do NOT go through some or all of its web pages by adding rules to a well-known file called robots.txt. Spiders do not have to honor the robots.txt, but a vigilant website will track the IP addresses of spiders that do not honor its robots.txt file and then block those IP addresses from looking at anything on the website. You can find plenty of information about robots.txt here on stackoverflow OR by running a query on google.
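To make that concrete, robots.txt is just a plain text file at the root of the site. A minimal, hypothetical example that asks all spiders to stay out of an archive folder, and one particular crawler (the name here is made up) to stay out entirely, might look like this:

User-agent: *
Disallow: /archive/

User-agent: SomeBadBot
Disallow: /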
There are several industries (besides Google) built around what you are asking. There are tags on Stack Overflow for search-engine and search; read some of those questions and answers. Lucene/Solr are open source search engine components. There is a companion open-source spider, but the name eludes me right now. Good luck.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, or give it a + (or -) as a useful answer. This goes for your other posts here too ;-)

How can I make my web pages Read Only for people?

I have a website, http://www.bccfalna.com/, and the contents on this site are in the HINDI language. I want to make all these pages read-only for people so that they cannot copy the content.
I have written some books in the HINDI language on computer technology, and I know that there is very little information in HINDI on the Internet about computers and I.T., so I want to sell my ebooks in PDF format.
To show the usefulness of the contents of my books, I have placed all the contents in TEXT format on my website, so that people can see them, read them, and decide to buy the book if it is useful for them.
I have placed my whole books on my site in text form so that the various search engines can send more and more traffic to my site, but I am afraid that since all my content is on the site in text form, anyone can copy it and will not be interested in buying it as a PDF-format EBOOK.
I want people to be able to read the content of my site but not be able to copy the contents into any word processor.
Is it possible?
I don't want to make image-like content, because modern search engines like Google and Yahoo don't give much importance to image sites.
I don't want to use Flash-like sites either, for the same reason: modern search engines don't give much attention to these kinds of sites.
I want my contents in TEXT format, but I want to make them READ ONLY. Is it possible? If yes, I would like to know how, and if no, I would like an alternative solution.
Is there someone genius enough to solve this problem? Thanks.
Generally speaking, any web content that is readable by a search engine will also be readable and copyable by people visiting your page.
I suppose you could examine the user_agent in the HTTP request to determine whether it originated from a popular search engine or not; if it did, return the plain-text of your content; if it did not, return a raster image of your content (text in an image can't be selected for copying and pasting, but it could be OCR'd or otherwise printed by the user). Some websites will use a script to disable right-clicking to save an image (but such scripts can easily be circumvented). Some sites will place a transparent image over the image containing the content (but this, too, can be circumvented). Note that the user_agent can be falsified if the web surfer knows you're treating search engines specially.
I suggest the best approach, though, is to keep things simple. Only publish the first chapter of your book and a table of contents online, or else only publish the first page of each chapter, or something similar. Search engines do not need the complete text of your book, only representative samples. Nobody will go to the trouble of copy/pasting your text if they can only get to a portion of the complete book.
You can't make it indexable to search engines and incapable of being copied and pasted... Google has to be able to copy words from your text to use in its index. Maybe you could put snippets of the parts you want indexed in text format and put the majority in image/Flash. It's not uncommon to see chapter previews on websites selling books.
Try Google Books:
I don't know if it works with the HINDI language (it does; some examples: http://www.scribd.com/doc/15257971/Google-Hindi-Books).
This solution allows Google to index the whole content and everyone to read it. Copying, however, remains awkward.
http://books.google.com/googlebooks/tour/
"Read-only" means they cant modify your webpages, "readable but not copyable" is impossible by definition, and makes about as much sense as "I want to give someone some water, but I dont want it to be wet". So, to answer your question, no this is not possible at all. (I regularly have to deal with people who think that this (and others) law of physics/mathematics doesn't apply to them, so sorry if I sound a bit rude.)
On a practical level, if you only give them some of the information, then they will only be able to copy that part of the information. (If they buy the book, they will be able to copy the rest from there.)
As others here have said, what you are asking is not possible.
If you host content for people to view in a browser, and for Google to index, there is absolutely no way to stop anyone from copying it. It is possible to make copying the content difficult (or at least inconvenient), but there's no way to stop someone from copying it if that's what they really want to do.
The only alternative, as others have already said, is to only post the first chapter of the book, and allow your readers to make a judgement based on that chapter. If they like the chapter they'll buy the whole book. This is a pretty common practice.
I understand that posting only part of the content is not what you want, but if you want to make it impossible to copy the whole book then this is your only real option.
The other alternative is to not worry about it. Cory Doctorow (and others I'm sure) publishes all his books under a Creative Commons license. They are free to download from his website but he still manages to make money from selling actual books. If people like your work enough, they'll pay to have it in a nice format.
There is a way to instruct the browser to disable copying text. This does not, however, prevent copying; it just makes it difficult. Not all browsers recognize it, especially older browsers, and there are ways around it: the user can download the entire page and read the text embedded in the HTML.
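A minimal sketch of that kind of deterrent (assuming the answer has the CSS selection and copy-event hooks in mind; the text remains plainly visible in the page source):

<!-- discourages selecting and copying; does not actually protect the text -->
<div style="user-select: none; -webkit-user-select: none;"
     oncopy="return false;">
  Your chapter text goes here...
</div>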
Another way is to make it a graphic rather than ASCII text. That way, if anyone really wanted to copy your content, they would have to go through the process of using OCR (optical character recognition) and then proofread and correct the result.
Another way is to make it into a Flash animation, but that can also be bypassed by doing a screen capture and then OCR. In short, there is no way to prevent copying of material displayed in a browser... but you can make it difficult and, hopefully, people won't bother.
FYI, typically people want their website to be read-only to make it difficult or impossible for hackers to change their website content (i.e. replace content with vandalized content), not to prevent people from accessing the content legitimately uploaded to the website.
Hope this helps.
Scan the text and post it as an image; people can still read it but cannot copy the text directly. They can copy the image, but that will not matter, as it would be the same as just reading from the screen: they would have to retype it all if they wanted to steal the work.

Distinguishing features of a blog, i.e. the difference between a blog and a normal site

I'm looking at things that can distinguish a blog from a normal website. These are things that a program needs to be able to identify from the HTML of a website, or particular features that a site supports, for example pings. The same goes for news websites.
I'm working on a blog/news monitor program. It will index sites to automatically determine whether each is a blog or a news site, and then monitor user feedback in comments, etc., on posts from sites that it determines to be of a blog or news nature.
So what I'm really after is suggestions on what I can use or look out for in identifying these sites.
It's going to be a desktop app written in Java, so if you have any code specifics in Java, that'll be great.
Thanks in advance.
You can search the page for the word "blog", as this will probably be present. Specifically, you can look for it in parts of the HTML page, or exclude parts - like links. This will give you a decent starting point.
Ultimately, though, this is something that will have to be done manually. You should construct an interface for people to specify if it's a blog or news site, or different features of it, when the site is submitted. Then you should create a database of sites and features, and flag them so that you or another administrator can review them and make changes. Once you do this for a site, you'll never need to do it again, so for example http://*.wordpress.com/ is all going to be blogs.
Some features you can automatically detect or get a pretty good chance of detecting, but ultimately you will need a manual review.
Look for a discoverable RSS or Atom feed, which should be present on a blog or serially-updated news site.
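Concretely, the kind of markup a program can look for in a page's <head> includes feed autodiscovery links and, often, a generator tag; the URLs below are just placeholders:

<!-- feed autodiscovery links - a strong hint that the site is serially updated -->
<link rel="alternate" type="application/rss+xml"
      title="Example feed" href="http://example.com/feed/">
<link rel="alternate" type="application/atom+xml"
      title="Example feed (Atom)" href="http://example.com/feed/atom/">

<!-- many blog platforms also announce themselves -->
<meta name="generator" content="WordPress">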