Distinguishing features of a blog, i.e. the difference between a blog and a normal site - blogs

I'm looking for things that can distinguish a blog from a normal website. These are things that a program needs to be able to identify from the HTML of a website, or particular features that a site supports, e.g. pings. The same goes for news websites.
I'm working on a blog/news monitor program. It will index sites, automatically determine whether each one is a blog or a news site, and then monitor user feedback (comments etc.) on posts from the sites it determines to be of a blog or news nature.
So what I'm really after is suggestions on what I can use or look out for in identifying these sites.
It's going to be a desktop app written in Java, so if you have any code specifics in Java, that'll be great.
Thanks in advance.

You can search the page for the word "blog", as this will probably be present. Specifically, you can look for it in parts of the HTML page, or exclude parts - like links. This will give you a decent starting point.
Ultimately, though, this is something that will have to be done manually. You should construct an interface for people to specify whether it's a blog or a news site, or which features it has, when the site is submitted. Then you should create a database of sites and features, and flag entries so that you or another administrator can review them and make changes. Once you do this for a site, you'll never need to do it again; for example, everything under http://*.wordpress.com/ is going to be a blog.
Some features you can automatically detect or get a pretty good chance of detecting, but ultimately you will need a manual review.
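As a rough illustration of the automatic part, here is a minimal Java sketch of the two checks mentioned above: a known-host rule and a search for the word "blog" outside of links. The class name, the method names and the host suffix list are all made up for this example, and regex-based HTML handling is only an approximation; a real HTML parser would be more robust.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class BlogHeuristics {
    // Hypothetical list of hosting suffixes whose sites are assumed to be blogs.
    private static final List<String> BLOG_HOST_SUFFIXES =
            List.of(".wordpress.com", ".blogspot.com");

    // Whole anchor elements, so link text and hrefs can be excluded.
    private static final Pattern ANCHORS = Pattern.compile("(?is)<a\\b.*?</a>");
    // Any remaining tag; stripping these leaves roughly the visible text.
    private static final Pattern TAGS = Pattern.compile("<[^>]*>");
    private static final Pattern BLOG_WORD =
            Pattern.compile("\\bblogs?\\b", Pattern.CASE_INSENSITIVE);

    // True when the host name ends with one of the known blog-hosting suffixes.
    static boolean isKnownBlogHost(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        return BLOG_HOST_SUFFIXES.stream().anyMatch(h::endsWith);
    }

    // True when "blog" or "blogs" appears in the page text outside <a> elements.
    static boolean mentionsBlogOutsideLinks(String html) {
        String withoutLinks = ANCHORS.matcher(html).replaceAll(" ");
        String text = TAGS.matcher(withoutLinks).replaceAll(" ");
        return BLOG_WORD.matcher(text).find();
    }
}
```

Either result is only a hint to store alongside the site, not a verdict; the manual review still has the final say.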

Look for a discoverable RSS or Atom feed, which should be present on a blog or serially-updated news site.
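A minimal Java sketch of that check, assuming the page advertises its feed with the usual <link rel="alternate" type="application/rss+xml"> (or atom+xml) element in the HTML. FeedFinder and findFeeds are names invented for this example, and the regex scan is an approximation of what an HTML parser would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FeedFinder {
    private static final Pattern LINK_TAG = Pattern.compile("(?is)<link\\b[^>]*>");
    private static final Pattern REL_ALTERNATE = Pattern.compile(
            "(?i)rel\\s*=\\s*[\"'][^\"']*\\balternate\\b[^\"']*[\"']");
    private static final Pattern FEED_TYPE = Pattern.compile(
            "(?i)type\\s*=\\s*[\"']application/(rss|atom)\\+xml[\"']");
    private static final Pattern HREF = Pattern.compile(
            "(?i)href\\s*=\\s*[\"']([^\"']+)[\"']");

    // Returns the href of every <link> element that advertises an RSS or Atom feed.
    static List<String> findFeeds(String html) {
        List<String> feeds = new ArrayList<>();
        Matcher link = LINK_TAG.matcher(html);
        while (link.find()) {
            String tag = link.group();
            if (!REL_ALTERNATE.matcher(tag).find()) continue;
            if (!FEED_TYPE.matcher(tag).find()) continue;
            Matcher href = HREF.matcher(tag);
            if (href.find()) feeds.add(href.group(1));
        }
        return feeds;
    }
}
```

The returned hrefs may be relative, so they would still need resolving against the page URL before fetching.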

Related

How to make a link url go through another page when clicked HTML

I'm sorry, I do not know how to word that title better. I have tried searching Google, but my terminology isn't helping my results.
Let me explain the context. When you're on the homepage of a news website or blog, like www.homepage.co.uk/, and you click an article, it goes somewhere like www.homepage.co.uk/2017/article/. How do they make the 2017 appear? If you remove the /article/ from the URL, it takes you to an archive of all the links in that year. I don't understand; is there a process to this?
When I click a link in my website it goes to: www.website.co.uk/link
I want to be able to have that 2017/link/ in the URL so visitors can find the archive of that year, just like on those websites.
How do I do this?
I am sorry if I am not explaining this very well.
I understand that changing my filenames to "2017/article.html" might work, but I do not believe that is the correct way of doing it?
Thanks a lot for your time and suggestions!
You're asking about a couple of things. One is the taxonomy of the site. Taxonomy, if you don't know, is the "shape" of your site, or how it is organized. News sites, for instance, are usually organized by date and perhaps topic (Health and Leisure, Politics, Entertainment, etc.).
The other aspect of your question is what you might call RESTful "hacking" of URLs. One of the tenets of REST is that URLs (URIs, to be accurate) are supposed to be hackable. A news site might have /2017/10/10 to display all articles for Oct 10. Maybe you remove the last "10" and get all the articles for October so far.
If you are not using a site platform that does this for you, you will have to maintain that taxonomy yourself and manually write all the links. Systems such as Drupal and Joomla, among others, will translate your taxonomy into automatically maintained links. When editing a page on one of these platforms, you typically refer only to the system's internal name for the page (which could be a shortened version of the article's title in the above example), and the underlying engine takes care of reconstructing the URL for you (in case the page moves, or its tags/taxonomy change).
This is a big topic, and I encourage you to do some further reading:
http://searchcontentmanagement.techtarget.com/feature/Building-a-website-taxonomy-in-eight-steps
https://www.drupal.org/docs/7/organizing-content-with-taxonomies/organizing-content-with-taxonomies
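To make the "hackable URL" idea concrete, here is a small Java sketch of how a site engine might interpret date-style paths such as /2017, /2017/10 and /2017/10/10. It is only an illustration under my own assumptions: ArchivePath and describe are invented names, and this is not how Drupal or Joomla actually route requests.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArchivePath {
    // Accepts /YYYY, /YYYY/MM and /YYYY/MM/DD, with an optional trailing slash.
    private static final Pattern DATE_PATH =
            Pattern.compile("^/(\\d{4})(?:/(\\d{1,2}))?(?:/(\\d{1,2}))?/?$");

    // Describes which archive listing a path would map to.
    static String describe(String path) {
        Matcher m = DATE_PATH.matcher(path);
        if (!m.matches()) return "not an archive path";
        String year = m.group(1);
        String month = m.group(2);
        String day = m.group(3);
        if (day != null) return "articles for " + year + "-" + pad(month) + "-" + pad(day);
        if (month != null) return "articles for " + year + "-" + pad(month);
        return "articles for " + year;
    }

    // Left-pads a one-digit month or day with a zero.
    private static String pad(String n) {
        return n.length() == 1 ? "0" + n : n;
    }
}
```

Chopping segments off the end of the path simply widens the date range, which is what makes the URL feel hackable to a reader.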

Followers with Jekyll

I am developing a blog using Jekyll. I've been reading over the documentation, however I realized that I haven't seen anything about followers. Does Jekyll offer an option for people to follow a blog, as well as a means to keep track of followers?
Thanks for your help!
You would have to extend the functionality of Jekyll with a custom plugin to build followers with the site. You can use social media to promote and build followers. Jekyll is just a static site/blog generator.
I would look at adding a plugin like AddThis for social sharing, etc. This will allow your users to promote content on the various social platforms.
"Does Jekyll offer an option for people to follow a blog, as well as a means to keep track of followers?"
The answer to that question is NO.
You can build your own plugin, but having a dynamic (user influenced) number on your website does not fit the concept of a static site (and is therefore not part of Jekyll).
An RSS feed that uses server metrics (log files) to measure the number of readers, however, fits the Jekyll concept perfectly. A mailing list, as Brandon suggests, is also a very good solution. Adding spyware through AddThis works too, but that would be my very last resort. However, these are not solutions 'Jekyll offers'. Therefore, the answer remains 'no'.
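For the log-file idea, a rough Java sketch could look like the one below. It assumes access-log lines in Common Log Format (client address first, request line in quotes) and a feed served at /feed.xml; both are assumptions for the example, as are the names FeedReaders and countUniqueReaders. Counting distinct client addresses is only a crude proxy for readers, since feed aggregators fetch on behalf of many people.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FeedReaders {
    // Counts distinct client addresses that requested the feed path.
    static int countUniqueReaders(List<String> logLines, String feedPath) {
        Set<String> clients = new HashSet<>();
        for (String line : logLines) {
            // The request line looks like "GET /feed.xml HTTP/1.1".
            if (!line.contains("\"GET " + feedPath + " ")) continue;
            // In Common Log Format the client address is the first field.
            int space = line.indexOf(' ');
            if (space > 0) clients.add(line.substring(0, space));
        }
        return clients.size();
    }
}
```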

Using a list of dynamic links throughout website

By "dynamic links", I mean a list of links that will constantly be updated.
To illustrate my question, I have a website that I am constantly writing new articles for. I currently have about 10 articles. If someone reads article #5, there is a list of links to all 10 articles in the right panel of the page. As I update the site and article #1 becomes out of date, I'd like to replace article #1 with article #11. Rather than updating the links within every article (so 10 times), is there a way to update the links once and have the change appear on every page simultaneously? Could I create an iframe for this?
Thanks for any and all help!
What's your goal? Do you want to learn to be a web developer? Or are you mostly concerned with getting your articles published?
If you want to be a web developer, I'd recommend steering clear of large CMS systems like WordPress or Drupal. Those are great products, but you want to learn the basics first. I think starting with a PHP tutorial is the way to go.
If you just want to publish your articles, I'd recommend you find a nice place to create a blog. There are so many to choose from. It all depends on how much you want to spend.
Feel free to ask follow-up questions. Web development sounds simple, but it's really a complex topic. I can't imagine what it must be like starting out these days with so many choices and competing technologies.
One way to do it would be to use server-side includes (see Wikipedia). They work like this:
<!--#include file="some-content.html" -->
or
<!--#include virtual="some-folder/some-content.html" -->
The difference is that file="" finds a file relative to the directory of the current page, whereas virtual="" takes a URL path, which is resolved from the domain root when it starts with a slash. Either way, this method can use any type of regular text file as a source. The server itself adds the content (hence the name), so it is parsed as regular HTML and all your CSS applies to it as if the file were part of your page. I don't know about compatibility with different hosts, but if your web server supports it, this is probably the easiest way to go.
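If your host turns out not to support server-side includes, the same substitution could be approximated offline before uploading. The Java sketch below is only that, an approximation: IncludeExpander and expand are invented names, it handles only the file="" form, and it takes the fragments from an in-memory map as a stand-in for reading them from disk.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncludeExpander {
    // Matches directives of the form <!--#include file="name" -->.
    private static final Pattern INCLUDE =
            Pattern.compile("<!--#include\\s+file=\"([^\"]+)\"\\s*-->");

    // Replaces each include directive with the fragment of the same name.
    // Unknown fragment names expand to an empty string.
    static String expand(String page, Map<String, String> fragments) {
        Matcher m = INCLUDE.matcher(page);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = fragments.getOrDefault(m.group(1), "");
            // quoteReplacement keeps any $ or \ in the fragment literal.
            m.appendReplacement(out, Matcher.quoteReplacement(body));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this approach the list of links still lives in one file; you just regenerate the pages whenever it changes.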

How to allow clients to manage their website?

I do small websites for local companies. All I know is HTML5 and CSS3, no JavaScript, no PHP.
I have this client who wants me to make a website for his coffee shop. All good so far. I have an idea for a beautiful responsive design which will get his coffee shop a lot of fame.
The problem:
The guy wants to be able to manage his website, meaning he wants to add a photo if he needs to, or even some text on a particular page. He doesn't want to depend on me, so he wants to do it by himself. The problem is that I can't teach him HTML so that he could download the HTML file and write the code for the desired change himself. I also need to make beautiful websites for my portfolio.
No WordPress: I don't like WordPress because it's limited, so I can't be creative with the design. So far it's the only solution I've thought of that would meet his needs.
I'm willing to learn more: if there is a solution that I could implement in one month or two, I will do this and learn what is needed, but can't learn PHP in two months.
Any advice?
You might find that CushyCMS does what you want. From the site:
Allow clients to safely edit content
No software to install, no programming required
Takes just a few minutes to setup
Define exactly which parts of the page can be changed
Produces standards compliant, search engine friendly content
From experience, a couple of downsides:
No choice of editor
You have to add pages that can be edited - the client cannot create new pages.

Generating a static website from a set of content data (possibly with webgen, webby or a similar toolkit)

My company (an engineering firm) is looking to redesign their website with some dynamic content. We have a nice portfolio of projects that we'd like to present on our site by category.
To elaborate, I'd like to have a "Projects Category" menu, where you can choose a sub-project category (such as churches, schools, etc) which links to a page with images of all projects which have been tagged with that category attribute. Clicking on an image would then take you to a detailed page for that project.
I have done a good bit of ASP and JSP page development, but I've always worked on the front end in an enterprise environment; I've never built a production site from the back end. The advice I've gotten so far is that a full-blown CMS solution would be somewhat overkill, as we won't have a large hit count, and we'll be displaying a few hundred projects at most.
The big-picture choice I appear to have is whether to dynamically generate the pages (with ASP or JSP) or to use a tool to generate a set of static HTML pages. The tool would build the menus, project summary pages, and individual project pages based on a set of data I could provide (in the form of a database or text file).
I'm leaning towards trying to use a tool like webgen or webby to statically generate the site due to our current web hosting situation. Any thoughts on which approach is more appropriate? Is webgen or webby capable of doing what I am trying to do? Or can anyone recommend other web authoring tools better equipped to accomplish this?
Thanks for any feedback!
You could always use Template Toolkit :)
Jekyll may be worth a look.
Refer: https://github.com/jekyll/jekyll/wiki/
I've been told that webgen can't do what I'm trying to do (not without my writing some extensions by hand), but that nanoc can.
http://nanoc.stoneship.org/
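For a sense of scale, the static-generation approach described in the question can be sketched in plain Java, independent of any of the tools above. Everything here is an assumption made for illustration: the PortfolioGenerator and Project names, the output paths, and the bare-bones HTML. A real generator would also escape the text and write the pages to disk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PortfolioGenerator {
    // One portfolio entry; in practice this would be loaded from a database or text file.
    static final class Project {
        final String slug;
        final String title;
        final String category;

        Project(String slug, String title, String category) {
            this.slug = slug;
            this.title = title;
            this.category = category;
        }
    }

    // Builds one listing page per category, keyed by its output path.
    static Map<String, String> categoryPages(List<Project> projects) {
        Map<String, List<Project>> byCategory = new LinkedHashMap<>();
        for (Project p : projects) {
            byCategory.computeIfAbsent(p.category, k -> new ArrayList<>()).add(p);
        }
        Map<String, String> pages = new LinkedHashMap<>();
        for (Map.Entry<String, List<Project>> e : byCategory.entrySet()) {
            StringBuilder html = new StringBuilder();
            html.append("<h1>").append(e.getKey()).append("</h1>\n<ul>\n");
            for (Project p : e.getValue()) {
                // Each item links to the detail page for that project.
                html.append("  <li><a href=\"/projects/").append(p.slug)
                    .append(".html\">").append(p.title).append("</a></li>\n");
            }
            html.append("</ul>\n");
            pages.put("categories/" + e.getKey() + ".html", html.toString());
        }
        return pages;
    }
}
```

Rerunning something like this whenever the project data changes gives you the category menus without needing any server-side code on the host.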