Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I'm writing a story. Whenever I'm done with a chapter, I place it in an HTML file for fun. I've been using the paragraph tag for each paragraph, but I have a lot of paragraphs and there are a lot more to come. I'm here to ask if there's an easier way to do this. I'm willing to use CSS.
This is not a programming question, but since I know the answer, I'll answer it anyway. What you are looking for is Markdown.
Markdown is a wiki-like syntax used for writing rich text content. Since your problem is writing too many paragraph tags, Markdown helps: it converts every double line break into a paragraph for you.
For example, consider the text below:
Pages that you view in incognito tabs won’t stick around in your browser’s history, cookie store or search history after you’ve closed all of your incognito tabs. Any files that you download or bookmarks that you create will be kept.
Going incognito doesn’t hide your browsing from your employer, your internet service provider or the websites that you visit.
You can see that there are two line breaks after the word "kept." When you run the text above through a Markdown converter, it is delivered to the browser like this:
<p>Pages that you view in incognito tabs won’t stick around in your browser’s history, cookie store or search history after you’ve closed all of your incognito tabs. Any files that you download or bookmarks that you create will be kept.</p>
<p>Going incognito doesn’t hide your browsing from your employer, your internet service provider or the websites that you visit.</p>
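If you only need the paragraph rule and not the rest of Markdown, the same idea can be sketched in a few lines of Python. This is not a Markdown converter, just a minimal illustration of "blank line means new paragraph"; the function name text_to_paragraphs and the sample chapter text are made up for the example.

```python
import html
import re


def text_to_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated block of text in <p>...</p>."""
    # Split wherever two line breaks are separated only by whitespace.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join the lines inside one block with spaces, as a browser would.
        body = " ".join(line.strip() for line in block.splitlines())
        if body:
            # Escape &, < and > so the story text cannot break the markup.
            paragraphs.append(f"<p>{html.escape(body, quote=False)}</p>")
    return "\n".join(paragraphs)


# Hypothetical chapter text, only for illustration.
chapter = "First paragraph of the chapter.\n\nSecond paragraph,\nwrapped over two lines."
print(text_to_paragraphs(chapter))
```

You would paste the printed output into the body of the HTML file. A real Markdown converter does the same thing and also handles emphasis, headings and so on.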
Depending on how picky you are about presentation, you may consider wrapping the entire thing in...
<div style="white-space: pre-wrap">
Epic story of epic unicorn epicness!
</div>
This will preserve all whitespace, not just newlines but also spaces between words. This means that if you have a large space in your text (conveying a pause through typography, or whatever) then you can just type a bunch of spaces and they will work.
However, you won't have the ability to control how much space there is between paragraphs, other than by adding more newlines. You can't make the space whatever size you want like you could with paragraphs. Consider what you need and choose a course of action.
Personally, I'd use Notepad++ (or a similar editor) and do a "Find and Replace" in regular expression mode, replacing (.+) with <p>\1</p>.
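For anyone who prefers a script to an editor, the same regex idea can be sketched in Python. This is only an illustration of the technique, not what Notepad++ does internally; the function name wrap_lines_in_p is made up, and the sketch assumes the text uses plain "\n" line endings.

```python
import re


def wrap_lines_in_p(text: str) -> str:
    # (.+) matches each non-empty line, because "." does not match a newline by default.
    # \1 in the replacement refers to the captured line.
    return re.sub(r"(.+)", r"<p>\1</p>", text)


print(wrap_lines_in_p("One.\n\nTwo."))
```

Empty lines are left alone, because (.+) needs at least one character to match.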
You can use a simple search and replace, as Niet the Dark Absol has suggested. Depending on your Notepad++ version, the replacement may need $0 instead of \1 to refer to the matched text. (There is more discussion of this in "Notepad++ Regex replace - \1 doesn't work?".)
Here is how you can search for the paragraph and replace with a paragraph tag.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 months ago.
So I am now learning html, and I was just wondering why tags such as cite even exist. When I open a website as a user, I still see the text as italic when the code is written as cite.
I found that the tags are useful when it comes to screen readers, so basically for users that have problems with their vision.
Are there any more reasons for these tags? Thank you so much in advance!
Tags are small snippets of HTML coding that tell engines how to properly “read” your content. In fact, you can vastly improve search engine visibility by adding SEO tags in HTML.
When a search engine’s crawler comes across your content, it takes a look at the HTML tags of the site. This information helps engines like Google determine what your content is about and how to categorize the material.
Some of them also improve how visitors view your content in those search engines. And this is in addition to how social media uses content tags to show your articles.
In the end, it’s HTML tags for SEO that will affect how your website performs on the Internet. Without these tags, you’re far less likely to really connect with an audience.
About the cite tag: the <cite> tag defines the title of a creative work (e.g. a book, a poem, a song, a movie, a painting, a sculpture, etc.). Note: a person's name is not the title of a work. The text in the <cite> element usually renders in italic.
Regarding the cite tag, according to MDN:
The <cite> HTML element is used to describe a reference to a cited creative work, and must include the title of that work. The reference may be in an abbreviated form according to context-appropriate conventions related to citation metadata.
This enables you to manage all the CSS applied to citations easily, if that is your use case (for example, if you have a lot of them on a site). The italics you observed come from that CSS, or rather from the default CSS applied by the browser.
In the broader spectrum
Oftentimes you will run into tags that are no longer in use today. There are different industry standards for different time periods.
All of the tags exist because there was a reason for web browsers to have a specific way of reading a piece of content.
For example, centering a div used to be an almost legendary task that could be achieved using several methods, all of which had different advantages and disadvantages. Nowadays, however, it's customary to use flexbox.
The bottom line is that it's a way for web browsers and search engines to read and interpret the content you're providing.
Tags such as and are used for text decoration nothing else you can also change text fonts and styles by using CSS.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
The website I'm making is for an author/illustrator and she wants words in the navigation bar to be written in her own handwriting, so the links in the navigation bar and her own name, which serves as the title, are all in the form of pictures rather than text.
Similarly, the homepage consists of some of her illustrations, each accompanied by a handwritten link, so that there is no text at all. I'm starting to realise from reading online that this may be seen as 'bad practise', so I want to ask those with more experience than me: how problematic is the lack of text?
I am not too worried about loading times and such - I've managed to make the image files quite small - but more things concerning accessibility and whether the site will appear in search engines.
And are there any ways I can avoid problems whilst still using the handwriting?
When you want to use images as navigation elements and are concerned about SEO and accessibility, you can use the alt attribute, which you should use anyway.
Example:
<img src='images/nav1.png' alt='Home' />
Screen readers and search engines use these attributes to deal with images, which they of course cannot read.
There are two issues here. First, people who do not see images (for one reason or another) will find the site almost impossible to navigate, unless the img elements have alt attributes. Correctly written alt attributes resolve this problem and can be expected to provide adequate information to search engines as well. Second, people who use normal graphic browsers will see the texts in a specific appearance. This may mean that they find it less legible than normal text, perhaps even illegible. This greatly depends on the style of the text, including size and the contrast between text and background color.
If a downloadable font were used instead, via @font-face, then the latter problem would in principle be less severe, since users could disable page fonts and see the text in their preferred font. This is rather theoretical, though, and creating a font is nontrivial and probably not worth the effort here.
On the practical side, write the alt attributes and ask the author to consider whether the font is legible enough to all visitors, including people with eyesight problems. It’s up to the author to decide whether the reduction in usability and accessibility is justified by the artistic impression made.
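To check that no navigation image was forgotten, a small script can list the images that lack an alt attribute. The following is a minimal sketch using Python's standard html.parser; the class name MissingAltFinder, the helper images_missing_alt and the sample markup are made up for illustration, and the check only tests whether the attribute is present, not whether its text is well written.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; alt="" counts as present,
        # which is the usual choice for purely decorative images.
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(markup: str) -> list:
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


# Hypothetical navigation markup: the second image has no alt attribute.
page = "<a href='/'><img src='images/nav1.png' alt='Home' /></a><img src='images/nav2.png' />"
print(images_missing_alt(page))
```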
You can choose from many handwriting fonts and link them via @font-face.
If she wants to use her own 'font', use images (ideally a single image; look up sprites) and put text underneath. This is called image replacement.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
I run a site that was recently indexed by Google (a few days ago). The main page has a few small images separating paragraphs of text. When I search Google for the site, it happens to show the parts of the paragraphs with the images, in the search snippet, which would be fine except it displays the alt text of the images, which looks bad.
Is there a way to stop this from happening, besides removing the alt text or toying with the images' placement?
As far as I know, there is no way to prevent this. But looking at your problem from a more technical perspective, you could of course:
simply remove the alt text or use a better alt text
remove the images from the DOM and put a placeholder element in their place, like:
<div class="img-holder" data-src="/img/example.jpg"></div>
With JavaScript you could find all instances of .img-holder and replace them with an inline image with the given source (and alt text, if you also store that as a data attribute).
You cannot prevent search engines, or other user agents, from doing whatever they like with attributes in your markup. You can add attributes, or change their values, in client-side scripting, and then the odds are that search engines do not see such additions or changes (since they normally do not run client-side script code).
If the images are just decorative separators between paragraphs, then you should simply use alt="", avoiding this problem (and other problems too).
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 months ago.
I have a blog (you can see it if you want, from my profile). It's new, and so are the results from Google's robots parsing it.
The results were alarming to me. Apparently the most common 2 words on my site are "rss" and "feed", because I use text for links like "Comments RSS", "Post Feed", etc. These 2 words will be present in every post, while other words will be more rare.
Is there a way to make these links disappear from Google's parsing? I don't want technical links getting indexed. I only want content, titles, descriptions to get indexed. I am looking for something other than replacing this text with images.
I found some old discussions on Google, back from 2007 (I think in 3 years many things could have changed, hopefully this too)
This question is not about robots.txt and how to make Google ignore pages. It is about making it ignore small parts of the page, or transforming the parts in such a way that it will be seen by humans and invisible to robots.
There is a simple way to tell google to not index parts of your documents, that is using googleon and googleoff:
<p>This is normal (X)HTML content that will be indexed by Google.</p>
<!--googleoff: index-->
<p>This (X)HTML content will NOT be indexed by Google.</p>
<!--googleon: index-->
In this example, the second paragraph will not be indexed by Google. Notice the “index” parameter, which may be set to any of the following:
index: content surrounded by “googleoff: index” will not be indexed by Google
anchor: anchor text for any links within a “googleoff: anchor” area will not be associated with the target page
snippet: content surrounded by “googleoff: snippet” will not be used to create snippets for search results
all: content surrounded by “googleoff: all” is treated with all of the above
Google does not use content inside HTML elements that have the data-nosnippet attribute when it builds search snippets:
<p>
This text can be included in a snippet
<span data-nosnippet>and this part would not be shown</span>.
</p>
Source: Special tags that Google understands - Inline directives
I work on a site with top-3 google ranking for thousands of school names in the US, and we do a lot of work to protect our SEO. There are 3 main things you could do (which are all probably a waste of time, keep reading):
Move the stuff you want to downplay to the bottom of your HTML and use CSS and/or to place it where you want readers to see it. This won't hide it from crawlers, but they'll value it lower.
Replace those links with images (you say you don't want to do that, but don't explain why not)
Serve a different page to crawlers, with those links stripped. There's nothing black hat about this, as long as the content is fundamentally the same as a browser sees. Search engines will ding you if you serve up a page that's significantly different from what users see, but if you stripped RSS links from the version of the page crawlers index, you would not have a problem.
That said, crawlers are smart, and you're not the only site filled with permalink and rss links. They care about context, and look for terms and phrases in your headings and body text. They know how to determine that your blog is about technology and not RSS. I highly doubt those links have any negative effect on your SEO. What problem are you actually trying to solve?
If you want to build SEO, figure out what value you provide to readers and write about that. Say interesting things that will lead others to link to your blog, and crawlers will understand that you're an information source that people value. Think more about what your readers see and understand, and less about what you think a crawler sees.
Firstly, think about the issue. If Google thinks "RSS" is the main keyword, that may suggest the rest of your content is a bit shallow and needs expanding. Perhaps this should be the focus of your attention. If the rest of your content is rich, I wouldn't worry about the issue, as a search engine should know what the page is about from the title and headings. Just make sure RSS etc. is not in a heading or in a bold or strong tag.
Secondly, as you rightly mention, you probably don't want to use images, as they are not accessible to screen readers without alt text, and if they have alt text or supporting text then you add the keyword back in. However, aria live may help you get around this issue, but I'm not an expert on accessibility.
Options:
Use JavaScript to write that bit of content (maybe ajax it in after load). Search engines like Google can execute JavaScript, but I would guess they won't value any JS-written content very highly.
Re-word the content or remove duplicates of it, one prominent RSS feed link may be better than several smaller ones dotted around the page.
Use the CSS content property with the :before or :after pseudo-elements to add your content. I'm not sure whether bots index words in CSS content properties or weigh that content's value for each page, but it seems unlikely. Putting words like RSS in the CSS basically says it's a style thing, not an HTML thing, so even if engines do index it they won't give it much, if any, value. For example, the CSS could be:
.add-text:after { content:'View my RSS feed'; }
Note the above will not work in older versions of IE, so you may need some IE version comments if you care about that.
"googleon" and "googleoff" are only supported by the Google Search Appliance (when you host your own search results, usually for your own internal website).
They are not supported by Google's web-search at all. So please refrain from doing that and I think that should not be marked as a correct answer as this might create ambiguity.
Now, to get Google to exclude part of a page, you will need to place that content in a separate file, such as excluded.html, and use an iframe to display that content in the host page.
The iframe tag grabs content from another file and inserts it into the host page. I think there is no other available method so far.
The only control that you have over the indexing robots, is the robots.txt file. See this documentation, linked by Google on their page explaining the usage of the file.
You can basically prohibit certain links and URLs, but not individual keywords.
Other than black-hat server-side methods, there is nothing you can do. You may want to look at why you have those words so often and remove some of them from the site.
It used to be that you could use JS to "hide" things from googlebot, but you can't now that it parses JS. ( http://www.webmasterworld.com/google/4159807.htm )
Google's crawlers are smart, and the people who program them are smarter. Humans see what is sensible on a page; they will spend time on a blog that has good, rare and unique content.
It is all about common sense: how people visit your blog and how much time they spend there. Google measures search results in the same way. Your page ranking also increases as daily visits increase and as the site content gets better and is updated every day.
This page has the word "Answer" repeated multiple times. That doesn't mean it will not get indexed. What matters is how useful it is to everyone.
I hope it will give you some idea
You have to detect Googlebot manually from the request's user agent and serve it slightly different content than you normally serve to your users.
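A minimal sketch of that idea in Python, independent of any web framework. The function names is_googlebot and render_post and the sample strings are made up for illustration. The sketch assumes the crawler announces itself with "Googlebot" in its User-Agent header, which any client can fake, and, as another answer here points out, the crawler's version should stay fundamentally the same as what visitors see.

```python
import re

# Assumption: the crawler identifies itself with "Googlebot" in the User-Agent header.
GOOGLEBOT_PATTERN = re.compile(r"googlebot", re.IGNORECASE)


def is_googlebot(user_agent: str) -> bool:
    # Treat a missing header (None or "") as a normal visitor.
    return bool(GOOGLEBOT_PATTERN.search(user_agent or ""))


def render_post(body_html: str, feed_links_html: str, user_agent: str) -> str:
    # The crawler gets the same article body, only without the feed links.
    if is_googlebot(user_agent):
        return body_html
    return body_html + feed_links_html


# Hypothetical post body and feed links.
body = "<article>My post</article>"
feeds = "<a href='/feed'>Post Feed</a>"
print(render_post(body, feeds, "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
```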
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Simple question - I've got a bucketload of cruddy HTML pages to clean up and I'm looking for an open source or freeware script/utility to remove any junk and reformat them into nicely laid out, consistent code. Any recommendations?
If it's relevant, I generally manipulate HTML inside Dreamweaver - but by editing the code and using the WYSIWYG window as a preview rather than vice versa - so a Dreamweaver-compatible script would be a plus.
I don't think it plugs into Dreamweaver, but whenever I need HTML cleaned up, HTML Tidy is my go-to tool.
I second HTML Tidy.
I just wanted to add it is a library with various ports and bindings. As such it is also integrated in some editors like HTML-Kit or NoteTab, and it has a GUI front end. All these are linked in the page given above.
Note also that the W3C Markup Validation Service has an option to "Clean up Markup with HTML Tidy" (after validation result display).
Dreamweaver CS3 has a built in "Clean up HTML" choice under the "Commands" menu item. I don't think it is nearly as comprehensive as HTML Tidy though.
From the Adobe site:
Clean up code
You can automatically remove empty tags, combine nested font tags, and otherwise improve messy or unreadable HTML or XHTML code.
For information on how to clean up HTML generated from a Microsoft Word document, see Open and edit existing documents.
Open a document:
If the document is in HTML, select Commands > Clean Up HTML.
If the document is in XHTML, select Commands > Clean Up XHTML. -- For an XHTML document, the Clean Up XHTML command fixes XHTML syntax errors, sets the case of tag attributes to lowercase, and adds or reports the missing required attributes for a tag in addition to performing the HTML cleanup operations.
In the dialog box that appears, select any of the options, and click OK. -- Note: Depending on the size of your document and the number of options selected, it may take several seconds to complete the cleanup.
Remove Empty Container Tags Removes any tags that have no content between them. For example, <b></b> and <font color="#FF0000"></font> are empty tags, but the <b> tag in <b>some text</b> is not.
Remove Redundant Nested Tags Removes all redundant instances of a tag. For example, in the code <b>This is what I <b>really</b> wanted to say</b>, the b tags surrounding the word really are redundant and would be removed.
Remove Non-Dreamweaver HTML Comments Removes all comments that were not inserted by Dreamweaver. For example, <!--begin body text--> would be removed, but <!-- TemplateBeginEditable name="doctitle" --> wouldn’t, because it’s a Dreamweaver comment that marks the beginning of an editable region in a template.
Remove Dreamweaver Special Markup Removes comments that Dreamweaver adds to code to allow documents to be automatically updated when templates and library items are updated. If you select this option when cleaning up code in a template-based document, the document is detached from the template. For more information, see Detach a document from a template.
Remove Specific Tag(s) Removes the tags specified in the adjacent text box. Use this option to remove custom tags inserted by other visual editors and other tags that you don’t want to appear on your site (for example, blink). Separate multiple tags with commas (for example, font,blink).
Combine Nested <font> Tags When Possible Consolidates two or more font tags when they control the same range of text. For example, <font size="7"><font color="#FF0000">big red</font></font> would be changed to <font size="7" color="#FF0000">big red</font>.
Show Log On Completion Displays an alert box with details about the changes made to the document as soon as the cleanup is finished.
I use the HTML Formatter...it does exactly what you are looking for.
I definitely think the best tool out there is the HTML Formatter from Logichammer.com. It does exactly what you need and is dead simple to use. Worth it to check out...the guy even has a video on his site showing how easy it is to use. I've been using it for two years now and couldn't live with out it...I get lots of messy code.
I use Cleanup HTML it does the job well cleaning and formatting HTML
I would suggest purehtml.in...it beautifies html, style and JavaScript tags...
You can even buffer your existing HTML through HTML Tidy before it reaches the browser - if it's a low traffic site, then this will make things neat without any effort.
I too recommend HTML Tidy. Whilst it's not maintained by Dave Raggett anymore, the tool is definitely being updated frequently with tweaks.
I use HTML Trim which is a win32 app to cleanup some awful autogenerated blobs of code that some of our devs knock up.
You can also grab the command line version, which you may be able to integrate into Dreamweaver.
Sorry, I can't post more than one hyperlink; still a n00b here.
I've been using Polystyle for a long time, and I'm quite happy. It's fairly flexible about formatting rules and costs around $15. A trial version is available.
I would recommend vim. You could format a block of code with v to select the block and '=' to indent the code.