Why can't I use <p> </p>? - html

I can't find a convincing answer for this. Is it wrong in terms of semantic HTML? SEO unfriendly? Accessibility?
A lot of WYSIWYG editors use it. I think it is a good way to add some extra space between paragraphs, like you do when you're writing a document and want to express 'extra differentiation' between 2 specific paragraphs. Of course you can do that with CSS, but you need to add extra classes like so:
<p class="extra-space">
Some text
</p>
<p>
Other topic
</p>
I'm sure this is not a problem for screen readers. And as for semantics: why doesn't an 'empty paragraph' have a valid meaning by itself?

It's because an empty paragraph is not a paragraph at all. Wikipedia defines it this way: "A paragraph is a self-contained unit of a discourse in writing dealing with a particular point or idea."
From a typographic point of view, there are other elements that help separate content in a more semantic way. If the next paragraph is that important, you may want to put a heading in front of it.
You also violate the rule of separating content from design. Think of a blog you're writing. If you do something like that, your typographic appearance may become totally inconsistent, because sometimes you use 2 empty paragraphs and sometimes none. It's just not a good idea to mix content and appearance.

It is wrong in terms of semantics because these semantics define a way to properly "Talk" to the machine.
Although it displays fine, it can become an issue when the markup has to go through some automated process.
To put it in another perspective, let's say you have a recipe that works well when you tell it to someone. At some point you write the recipe down for an automated robot.
Instead of writing "Add ice cream in the blender"
you are writing "Add I scream in the blender"
When you hear it, it is the same.
When someone reads it and can correct the mistake, it is fine.
What about when the robot reads it?

Related

What is a correct approach to using strong/em tags when localising strings?

I know some languages emphasise words differently to English, e.g. via changing word endings rather than stressing words with inflection of the voice.
If you are localising a site, would you trust that <strong> and <em> tags (and their placement) will have the same meaning in other languages — would you maintain this emphasis, check with your translator or leave them out?
What I'm wondering is how this translates (excuse the pun) into the semantics of the web? — Strong and em tags carry semantic meaning that is used within SEO, screen-readers etc. So should they be left in place so this isn't lost, or dropped to better conform with the target language?
Mark-up is there to convey meaning as a whole, and so long as the meaning is conveyed, you have succeeded in your mark-up. So in a language where stress emphasis is conveyed in the text, using tags to signal the emphasis is redundant and optional.
Inline level markup, much more so than block level markup, may need to be radically different in different languages. In a good translation, the text should be marked up from scratch in each language.
For individual words, I would leave the markup tags in the translation strings for the reasons outlined in comments above. If emphasising blocks of text, whole sentences, numerals, etc, I'd put it in the template if possible, as it's not really something that would need to be messed with by the translation.
A good idea might be to flag in the template that you have done this, using comments. You will also need some reliable process for getting all the translation files changed if you ever decide to alter the emphasis (which you inevitably will). This is a pain, so I tend to avoid adding emphasis to individual words wherever possible :)
Interesting question! The only thing I can add is that strong and em tags are only useful for SEO if the search engines connect those tags with the content on your site.
I'd recommend using these tags only if there's an actual reason (for emphasis, say) rather than hoping to gain SEO benefit from bolding or italicizing keywords.
For screen readers, it comes down to which language you are talking about. With JAWS, for example, you can download voice files. If the language isn't listed, users have to choose another language or find alternative means. The key thing is for you to set the lang attribute correctly.

Reasons Against Empty Paragraphs in HTML

EDIT: Rephrased question.
Other than being bad practice, what other reasons are there against empty paragraphs in HTML?
ORIGINAL:
Background
Currently, to add a nicely spaced paragraph in our CMS you press the Enter key twice. I don't like empty paragraphs because they seem unnecessary to me. If you want a new paragraph, just press Enter and space it with CSS. If you want to write just below some text (e.g. to display code), then do a line break with Shift+Enter.
Question
Is there any very good reason for not allowing empty paragraphs? Is there a standard here? It seems like I just have a philosophical issue right now -- i.e. avoiding empty paragraphs probably won't make page viewing faster or save that much space.
One thing I've learnt the hard way is that any time you have a WYSIWYG editor for a web page, you stand a risk of ending up with poor quality HTML.
It doesn't matter how good the editor is, or how well trained your people are to use it, you will end up with bad code.
They'll click the 'bold' button instead of selecting your sub-title class. They'll create spurious paragraph tags rather than line breaks. And I've had to explain to one person several times why it's a bad idea to use multiple spaces to indent stuff.
Even when people are very good at using the editor and understand the implications, you'll still get things like stray markup setting styles and then unsetting them without any content, because if you (for example) make a word bold and then delete it, it generally doesn't delete the bold tags, and no-one thinks to switch to the HTML view to check.
The basic problem is that when you make it easy to use like a word processor, people will treat it like a word processor, and the underlying code becomes completely irrelevant to them. Their job is to produce content that looks good, and as long as they can achieve that, they don't generally care for how the code looks.
The good thing is that there is a solution. In general, the people generating the content are the same people who care the most about SEO. If you emphasise that there might be SEO consequences to poor quality HTML, I find that they suddenly care a lot more about the code they're generating. They still don't generally have the skills to fix it when they've broken it, but it does seem to make people take more care to follow the rules.
To directly answer your question, I don't think it's a disaster to have empty paragraph tags like that. It's preferable not to though, and you need to consider how the content would look semantically to a search engine - it may cause the search engine to see the two paragraphs of content as being less connected to each other than they should be. This may affect how it weights the content of each paragraph when it comes to deciding its page rank. In truth, it's unlikely to be a huge difference; in fact, I'd say it's probably very tiny, but in a competitive world, it could be enough to push you down a few places. There are probably other more important SEO issues for you to deal with, but as they say, every little helps.
There are times when you have CSS styling a particular element, in your case a paragraph. If you use empty paragraphs, they will unnecessarily pick up that styling, which might not be wanted.
By styling paragraphs with CSS, you can change the way paragraphs are styled easily in future.
For example, you might want to style them differently if the user is browsing on a mobile device, or you might just decide that you want to add more or less space between paragraphs (using properties like margin-top and margin-bottom on the p tag I guess) because it just looks better that way. If the spacing is done with extra p tags it'd be a lot harder to change.
I expect that things like screen readers for the visually impaired would deal with CSS-styled paragraphs better than if the structure of the page is changed by adding empty paragraphs.
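If you decide to clean up what the editor produces, one option is to strip empty paragraphs when the content is saved. Below is a minimal sketch in Python, assuming the CMS hands you the HTML as a string. The function name strip_empty_paragraphs is made up for illustration, and a regular expression is only a rough approximation of HTML parsing, so treat it as a starting point rather than a robust sanitiser.

```python
import re

# Matches a <p> element whose content is only whitespace (which in Python
# includes the literal no-break space), &nbsp; / &#160; entities, or <br> tags.
# The \b after "p" keeps tags such as <pre> from matching.
EMPTY_P = re.compile(
    r"<p\b[^>]*>(?:\s|&nbsp;|&#160;|<br\s*/?>)*</p>\s*",
    re.IGNORECASE,
)


def strip_empty_paragraphs(html: str) -> str:
    """Return html with empty <p> elements (and trailing whitespace) removed."""
    return EMPTY_P.sub("", html)


cleaned = strip_empty_paragraphs("<p>One</p>\n<p>&nbsp;</p>\n<p> </p>\n<p>Two</p>")
```

The spacing that the empty paragraphs used to provide then comes from the margin on p in the stylesheet, so it stays consistent across the site.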

HTML5 for marking up functionality - what semantic tags should I use?

When it comes to writing blog markup, I absolutely understand the use of article and section tags. But my masthead sections have two widgets. One has a search engine embedded and the other is marketing copy leading to an FAQ page.
What would be the correct HTML5 markup in this case? How do I mark up widget functionality?
my masthead sections have two widgets. One has a search engine embedded...
A search engine embedded? Do you mean a search field, i.e. a text field into which you can type search terms? For that, you want <input type="search">.
...and the other is marketing copy leading to an FAQ page.
Does this really qualify as a “widget”? If it’s marketing copy “leading” to an FAQ page, that just sounds like a link to me, which has been semantically represented in HTML since version 1 with the <a> element.
HTML is pretty simple, you really don’t want to over-think it. You don’t need specific tags for everything people could possibly give a name to. (What exactly is a “widget”? Isn’t it just a section of the page?) For most things, <section> is fine.
While HTML5 is a big improvement, there's one thing it doesn't fix: The subjectivity of what is considered proper semantics for every situation.
And, I doubt HTML will ever fix that.
If you're already using HTML5 containers for other more obvious parts of the page, I wouldn't sweat these two elements much. You could put the marketing stuff in an aside. Search could be considered a form of nav. But...I don't think bad karma will come your way if you just stick them in a couple of divs, either. ;)

Quantify the semantic value of <p> as opposed to <div>

I'm transforming some XML, which I have no control over, to XHTML. The XML schema defines a <para> tag for paragraphs and <unordered-list> and <ordered-list> for lists.
Frequently in this XML, I find lists nested within paragraphs. So, a straight-forward transformation causes <ul>s to get nested within <p>s, which is illegal in XHTML.
I've created a list of ways to deal with it and here are the most obvious:
Just don't worry about it. The browsers will do fine. Who cares. (I don't like this option, but it's an option!)
Write a fancy-pants component to my transform that makes sure all <para> tags get closed before unordered lists start, and re-opened afterward. (I like this option the most, but it's complicated due to multiple levels of nesting, and we may not have the budget for this)
Just transform <para> to <div> and set the margins on the divs so it looks like a paragraph in the browser. This is the easiest solution that emits valid XHTML, but it takes from the semantic value of the markup.
My questions are:
how much value do I lose if I go with option 3?
Does it really matter?
What is the actual effect on the user experience?
If you can cite references, please do (this is easy to speculate on). For example, I was thinking it might affect search results from a Google Search Appliance that we are using.
If search terms appear in divs, do they carry less weight?
Or is there less of an association between them and preceding header tags?
How can I find this out?
I've come up against this too.
Personally, I consider it a grave mistake on part of the standard that a p cannot contain lists. I think it's typographically legal, so it should be legal in what was originally intended to be a markup for text.
I may be flamed for this, but XHTML has crashed and burned in the real world, regardless of whether it was a good idea or not. The often horrible tag soup that is today's HTML markup will continue to survive for a goodly long time, if only because bad markup and lenient browsers will continue to perpetuate each other forever.
Thus, I tend to go with Option 1.
Option 3 is also viable, in my opinion. While I don't have proof, I'm pretty sure no search engine is crazy enough to actually put any trust in most of the formatting tags we apply to our HTML. meta and a tags are obvious exceptions, of course.
First of all, unless you set every CSS property available now plus every one possibly available in the future, then you can't guarantee your <div> will match up, WRT styles, with <p>. (Though I agree you can get close and this is probably good enough, but read on.) I don't know of any visual browsers or other tools that would seriously treat them differently, but this is just as much an artifact, IMHO, of the current widespread loose interpretation on the web, as it is of them being close in meaning.
Is <ul> the right transformation for every <unordered-list> in your source data? If they are always displayed as block-level content instead of 1) an, 2) inline, 3) list; then that's a safe bet. If so, you can break the paragraph into two (and wrap the whole thing in <div> if you like).
Example input:
<para>Yadda yadda: <unordered-list/> And so fin.</para>
Output:
<div>
<p>Yadda yadda:</p>
<ul/>
<p>And so fin.</p>
</div>
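The split shown above could be sketched roughly like this in Python with the standard library's xml.etree.ElementTree. This is only an illustration under several assumptions: the function name split_para and the LIST_TAGS mapping are made up here, list items are copied across unchanged (the source schema's item tag isn't given in the question), only one level of nesting is handled, and any child of <para> that is not a list raises an error instead of being silently dropped.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the source schema's list tags to XHTML list tags.
LIST_TAGS = {"unordered-list": "ul", "ordered-list": "ol"}


def split_para(para: ET.Element) -> ET.Element:
    """Turn one <para> into a <div> of alternating <p> and list elements."""
    div = ET.Element("div")

    def add_p(text):
        # Skip missing or whitespace-only runs so no empty <p> is emitted.
        if text and text.strip():
            ET.SubElement(div, "p").text = text.strip()

    add_p(para.text)  # text before the first child element
    for child in para:
        if child.tag not in LIST_TAGS:
            raise ValueError(f"unsupported child element: {child.tag}")
        lst = ET.SubElement(div, LIST_TAGS[child.tag])
        lst.extend(list(child))  # list items are copied as they are
        add_p(child.tail)  # text between this list and the next element
    return div


source = ET.fromstring("<para>Yadda yadda: <unordered-list/> And so fin.</para>")
print(ET.tostring(split_para(source), encoding="unicode"))
```

Handling paragraphs nested inside list items would mean applying the same function recursively to each item, which is where most of the extra complexity of Option 2 comes from.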
The good news is that any of these 3 options would work.
There are many, many people on SO that will tell you "if it works, forget semantics and do it." So Option 1 would probably be a site favorite if everyone here was asked.
Option 2 is my favorite and would be the best semantically. I would definitely do it if time/budget allows.
However, Option 3 is a close second and hopefully this will answer your question: the <div> element and the <p> element are near-identical. In fact, the biggest difference is semantics. In most browsers' default stylesheets both get display: block; the main visual difference is that <p> also gets a default top and bottom margin, which you can reproduce on the <div>.

Does the CSS property "text-transform" affect SEO results?

I am building a site with a ton of 1999 style capitalization of navigation and headings. I have been simply adding in the text content as it appears (capitalized), but the other designer on the project insists on using lower case text in his HTML and capitalizing it with an applied style:
.tedious {text-transform:uppercase;}
I understand the argument of separation of style from content, but in this case it really doesn't matter because I personally will not maintain the site, nor do I ever imagine that the client will need to un-capitalize all of this text. The question is: 1. will search engines pay any attention at all to capitalization of text in a document and 2. would a crawler go so far as to read my style sheet and look for such things (me thinks not). I know that BOLD, STRONG, EM, etc have a (diminishing) effect on SEO so I can imagine a scenario where CAPS would, but have never heard of anyone actually claiming, let alone confirming this.
Digging this site the last few months. First post.
It will only affect what is shown in the search results: your colleague's work will show as lower case in the results.
You mentioned separation of style from content, but I'm not convinced that text-transform is really a style. It's a change of content. I'm sure some people would argue the other side though.
If I were a search engine, I wouldn't care about casing. I would care about the content.
From a human readability standpoint - upper case isn't as easy to read.
Well, I was taught at school that all proper nouns (eg names and names of places) should begin with capital letters.
How would Google know whether I was talking about reading (as in a book) or Reading (as in the town of Reading, Berkshire), without taking into account the capitalisation? I would argue that capitalisation is definitely a semantic indicator rather than simply a case of aesthetics, and is therefore one factor that could be used for SEO.
As noted elsewhere, Google clearly does have knowledge of the CSS being used to render a page (eg Google can spot black-hat techniques such as white text on a white background).
So if capitalisation (or lack of) is a relevant SEO factor, can the CSS text-transform (or lack of) value also be an SEO factor?
Yes - because Google considers page speed to be an important factor. Text that doesn't need to be transformed by CSS will display faster.
Answer from Google:
I don't think we'd do anything special with all-caps headings, but it feels like the kind of thing you'd want to do in CSS instead of in the content, since it's more about styling.
https://mobile.twitter.com/JohnMu/status/1438159561391751170?s=19