I hope I am not breaking any rules of this website, but posting a link to the issue is a necessity. I have copied the HTML from the source and tested it in a local HTML file, and it does not break there. I cannot work this out for the life of me.
If you look at the demo web page you can see where the text breaks throughout the whole site (it is a WordPress site, if that helps).
Online Demo
Here is the html:
<h2>Our Core Values:</h2>
<strong>Relationship with God: </strong> This is our primary relationship. We were created to serve and give praise to our Creator, through our thoughts, words, and actions. When we do this, we experience the presence of God as our Heavenly Father and live in a joyful, intimate relationship with Him.
<strong>Relationship with Self:</strong> People are uniquely created in the image of God and thus have inherent worth and dignity. While we must remember that we are not God, we have the high calling of reflecting God’s being, making us superior to the rest of creation.
<strong>Relationship with Others:</strong> God created us to live in loving relationship with one another, and to encourage one another to use the gifts God has given to each of us to fulfill our calling.
<strong>Relationship with the rest of Creation:</strong> The cultural mandate of Gen 1:28-30 teaches that God created us to be stewards, people who understand, subdue and manage the world that God created in order to produce bounty. While God made the World ‘perfect’ He left it incomplete. God called humans to interact with creation to make possibilities into realities and to be able to sustain ourselves via the fruit of our stewardship. The economically poor are singled out in the Scriptures as being in a particularly desperate category and as needing very specific attention (Acts :-1:6-7)
<ul>
<li>Faith – God is our provider and equips in all He calls us to do.</li>
<li>The Great Commissions - We are called to make disciples of all nations (Matthew 28:19-20)</li>
<li>Relationship - The body of Christ is held together in relationship with the Lord and each other and self.</li>
<li>Partnership – The Lord never calls one person to work alone. A biblical, effective model of missionary involvement. Ministry partnerships should promote interdependence, not dependence.</li>
<li>Leadership – The five-fold gifts are meant to operate in the establishment and leading of the church.</li>
<li>Faithful stewardship and accountability are essential for successful ministry.</li>
</ul>
Like I said this works fine if you copy the source into a local html page and test using WAMP.
Please I hope someone can help me with this and again if posting a link is against the rules I am sorry but as the issue is localised to this one instance I have no other choice.
Add this CSS
li { word-wrap: break-word; }
Your page http://kenyaaustraliamission.com/statement-of-faith/ is breaking out of the container because the spaces in the text have been replaced with non-breaking space entity references (&nbsp;).
Check the actual text in your WordPress Dashboard:
We believe the Bible to be the inspired, only, infallible and authoritative Word of God
The problem is in your HTML: please remove the non-breaking spaces (&nbsp;) inside the p tags in your HTML code.
Example is below
Now check to this answer
Your page http://kenyaaustraliamission.com/statement-of-faith/ is breaking because the browser treats each whole line as a single word, so the text runs out of its div. This happens because every space in the text is a non-breaking space.
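If there is too much text to fix by hand, the clean-up can be scripted. Here is a minimal sketch in Python, assuming you can get the post content out as a plain string; the function name `normalize_spaces` and the sample text are made up for illustration, and this is not a WordPress API.

```python
import re

# Matches the named entity, the decimal and hex numeric entities,
# and the literal U+00A0 character.
NBSP_PATTERN = re.compile(r"&nbsp;|&#160;|&#x0*a0;|\u00a0", re.IGNORECASE)


def normalize_spaces(html_text: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return NBSP_PATTERN.sub(" ", html_text)


# Hypothetical sample resembling the affected paragraph.
sample = "We&nbsp;believe&nbsp;the\u00a0Bible"
cleaned = normalize_spaces(sample)
print(cleaned)
```

Once the spaces are ordinary again, the browser can wrap the line at word boundaries and the `word-wrap` workaround should not be needed.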
I am using a standard HTML <textarea>. All I care about is new lines and plaintext. Textarea works great for me, except when pasting content that is embedded in a <blockquote>. This makes the textarea uneditable, unless I delete the last line of the blockquote.
Example: http://jsfiddle.net/enf84xmj/1/
<textarea
autoCapitalize='sentences'
cols='69'
maxLength='1000'
minLength='1'
rows='1000'>
“One of us can be dismissed. Two of us can be ignored. But together, we are a movement and we are unstoppable.”
- Cecile Richards, president of Planned Parenthood Federation of America & Planned Parenthood Action Fund
In a surreal time when our hard-fought reproductive rights are in real peril, it is imperative that all-of-us stand strong and stand together in their defense.
To help Planned Parenthood launch its new project, UNSTOPPABLE, our Bay Area filmmaker friend and fellow activist Tiffany Shlain made a powerful short film called “Unstoppable Manifesto.” Hear what Tiffany had to share about her own personal experience and motivation:
“I grew up hearing stories from my late father, who, as a young surgeon, would try to save women in the emergency room after they were unable to get a safe abortion and ended up trying to do it themselves. In my early twenties, long before I was ready to have children, or had started my career or even met Ken, I became pregnant. I was in no way ready to become a mother, and was so grateful to be able to get a safe abortion. But even though I had a place to go, I still, like so many women, had to make my way through a line of protestors shouting terrible things at me, making what was already so difficult, worse.
</textarea>
I am trying to avoid adding a WYSIWYG editor and doing it using just HTML. Is it not possible?
As @chaska pointed out, the maxlength attribute was responsible for it; it had nothing to do with the type of content being pasted.
Does Google really care if I use an <h5> as a <b> tag?
What are some real-world, practical reasons I should care about semantic markup?
A few examples
Many visually impaired people rely on speech browsers to read pages back to them. These programs cannot interpret pages very well unless they are clearly explained. In other words semantic code aids accessibility
Search engines need to understand what your content is about in order to rank you properly on search engines.
Semantic code tends to improve your placement on search engines, as it is easier for the "search engine spiders" to understand.
However, semantic code has other benefits too:
As you can see from the example above, semantic code is shorter and so downloads faster.
Semantic code makes site updates easier because you can apply design style to headings across an entire site instead of on a per page basis.
Semantic code is easier for people to understand too so if a new web designer picks up the code they can learn it much faster.
Because semantic code does not contain design elements it is possible to change the look and feel of your site without recoding all of the HTML.
Once again, because design is held separately from your content, semantic code allows anybody to add or edit pages without having to have a good eye for design.
You simply describe the content and the cascading style sheet defines what that content looks like.
Source: boagworld
Semantics and the Web
Semantics are the implied meaning of a subject, like a word or sentence. It aids how humans (and these days, machines) interpret subject matter. On the web, HTML serves both humans and machines, suggesting the purpose of the content enclosed within an HTML tag. Since the dawn of HTML, elements have been revised and adapted based on actual usage on the web, ideally so that authors can navigate markup with ease and create carefully structured documents, and so that machines can infer the context of the wonderful collection of data we humans can read.
Until — and perhaps even after — machines can understand language and all its nuances at the same level as a human, we need HTML to help machines understand what we mean. A computer doesn’t care if you had pizza for dinner. It likely just wants to know what on earth it should do with that information.
HTML semantics are a nuanced subject, widely debated and easily open to interpretation. Not everyone agrees on the same thing right away, and this is where problems arise.
Allow me to paint a picture:
You are busy creating a website.
You have a thought, “Oh, now I have to add an element.”
Then another thought, “I feel so guilty adding a div. Div-itis is terrible, I hear.”
Then, “I should use something else. The aside element might be appropriate.”
Three searches and five articles later, you’re fairly confident that aside is not semantically correct.
You decide on article, because at least it’s not a div.
You’ve wasted 40 minutes, with no tangible benefit to show for it.
— Divya Manian
This generated a storm of responses, both positive and negative. In Pursuing Semantic Value, Jeremy Keith argued that being semantically correct is not fruitless, and he even gave an example of how <section> can be used to adjust a document’s outline. He concludes:
But if you can get past the blustery tone and get to the kernel of the article, it’s a fairly straightforward message: don’t get too hung up on semantics to the detriment of other important facets of web development.
— Jeremy Keith
Naming Things
Of all the possible new element names in HTML5, the spec is pretty set on things like <nav> and <footer>. If you’ve used either of those as a class or id in your own markup, it’s no coincidence. Studies of the web from the likes of Google and Opera (amongst others) looked at which names people were using to hint at the purpose of a part of their HTML documents. The authors of the HTML5 spec recognised that developers needed more semantic elements and looked at what classes and IDs were already being used to convey such meaning.
Of course, it isn’t possible to use all of the names researched, and of the millions of words in the English language that could have been used, it’s better to focus on a small subset that meets the demands of the web. Yet some people feel that the spec isn’t yet doing so.
Source: html5doctor (This goes on for quite a while so I've only put a few examples here.)
Hope this helps!
I am very well aware of the principles behind deprecating layout HTML like FONT, CENTER, etc., in favor of the more-or-less equivalent CSS.
But, in reading the extensive amount of verbiage on this over the years, I've never read anything about what this does to the teaching of HTML, especially to those new to the entire concept of computer languages, markup, and metadata.
I'm thinking of 4th and 5th graders, but there are many older people who fit this profile, too.
Suppose one of these kids wants to build a website with things that kids like, such as colors, font families and sizes, formatting, and so on?
In putting together a course for schoolchildren, I find they want to know about controlling presentation at about lesson 5, just after I've introduced paragraphs. But this is no time to introduce CSS, as I haven't even gotten to attributes yet. Equal signs, colons, quotes, and brackets are still a sea of confusing characters at this point.
Trying to teach:
<p style='text-align: center;'>
instead of
<center>
just doesn't work. The students get discouraged and the course grinds to a halt.
Worse, the two aren't equivalent, since the inline styles only apply to one tag, whereas the deprecated tag applies until ended. Introducing DIV to get around this is no real help, as it still suffers from taking the student down a complex path much too early.
OK, so there must be a question in here at some point, right? ;-)
How about these for some questions:
Has anyone actually taught HTML to 4th and 5th graders without using deprecated tags?
Is anyone aware of any part of the various HTML standards development that included educational issues? (I'm not talking about education for highly motivated, somewhat technically-inclined adults here. I'm talking about HTML for kids and casual computer users.)
Is anyone out there willing to agree with me that teaching deprecated HTML is a way to lower the slope of the learning curve?
Reviewing in my mind the various reasons for deprecation, I can't see why, for example, deprecating CENTER is justified. While it has been removed from certain strict HTML standards, there will never be a browser that doesn't handle it. (Other than research tools.) While more powerful constructs exist, none of them are even close to as convenient to code. Thoughts on this issue?
(Please, no responses along the lines of "let beginners use WYSIWYG editors". These kids want to learn HTML, not just post some pretty content. There's a big difference.)
I first have to commend you on a bold mission. My wife is a 5th grade teacher and I have seen worlds of difference in one student's inability to spell their own name vs. some that can do serious math. To teach them html when they haven't even mastered basic spelling is a challenge I would not dare take. But that is not the point...
I am a professional web developer. I have tried to teach people (grown up interns at our firm) basic code before, but limited time prevents me from getting too involved in more technical details. Even they had a hard time "getting it" so to say. So it could be similar to a kind of curriculum. I would say this - for students in 4th or 5th grade only - basic deprecated html 4 is perfectly acceptable to teach. I would boilerplate a basic template because I can't imagine them understanding headers or anything like that.
Why is it acceptable? Because the odds are that less than 1% will actually go on to do anything with it. If they do, they will be competent enough to understand the progression of code and will have little trouble adapting to the real-world environment. Anyhow, they may end up taking a CS course and figuring out what's happening behind the code. The rest of them will probably dismiss it and never look back.
Good luck. I am curious to see the results of this plan. Make a site and report the progress.
I'd argue that while more complicated to start with, the separation of content from style is a key concept for writing real web pages and if learnt early on would give anyone a huge advantage later if they were to continue with their learning rather than use deprecated techniques.
Would doing something using classes to replicate the basic style properties provided by these tags be that hard to teach?
<html>
<head>
<style type="text/css">
.center {text-align: center;}
.bold {font-weight: bold;}
.italic {font-style: italic;}
.big {font-size: 18px;}
</style>
</head>
<body>
<p class="big center bold">
Hello world!
</p>
</body>
</html>
I guess as it's at such a young age, maybe it would. Although as Kai suggested you will need a boilerplate template to provide them anyway, so setting up a bunch of classes and placing all the formatting for a paragraph in one place (in the class attribute) is in my opinion easier to read and understand than having a whole lot of tags you have to remember to close.
I know that Google’s search algorithm is mainly based on PageRank. However, it also analyzes the structure of the document (H1, H2, title, and other HTML tags) and uses it to enhance the search results.
What is the name of this technique "using the document structure to enhance the search results"?
And are there any academic papers to help me study this area?
The fact that Google takes the HTML structure into account is well covered in SEO articles; however, I could not find it in academic papers.
I think it's called "Semantic Markup"
[...] semantic markup is markup that is descriptive enough to allow us and the machines we program to recognize it and make decisions about it. In other words, markup means something when we can identify it and do useful things with it. In this way, semantic markup becomes more than merely descriptive. It becomes a brilliant mechanism that allows both humans and machines to “understand” the same information. http://www.digital-web.com/articles/writing_semantic_markup/
A more practical article here
http://robertnyman.com/2007/10/29/explaining-semantic-mark-up/
SEO has become almost a religion to some people where they obsess about minutiae. Frankly, I'm not convinced that all this effort is justified.
My advice? Ignore what so-called pundits say and just follow Google's guidelines.
You might be looking for an academic answer but honestly, this isn't an academic question beyond the very basics of how Web indexing works. The reality of a modern page indexing and ranking algorithm is far more complex.
You may want to look at one of the earlier works on search engines. Note the authors' names. You may also want to read Google Patent application 20050071741.
These general principles aside, Google's search algorithm is constantly tweaked based on actual and desired results. The exact workings are a closely guarded secret just to make it harder for people to game the system. Much of the "advice" or descriptions on how Google's search algorithm works is pure supposition.
So, apart from having a title and having well-formed and valid HTML, I don't think you're going to find what you're looking for.
Google very deliberately doesn't give away too much information about its search algorithm, so it's unlikely you will find a definitive answer or academic paper that confirms this. If you're interested from an SEO point of view, just write your pages so they are good for humans and the robots will like them too.
To make a page good for humans, you SHOULD use tags such as h1, h2 and so on to create a hierarchical page outline... a bit like this...
h1 "Contact Us"
...h2 "Contact Details"
......h3 "Telephone Numbers"
......h3 "Email Addresses"
...h2 "How To Find Us"
......h3 "By Car"
......h3 "By Train"
The difficulty with your question is that if you put something in your h1 tag hoping that it would increase your position in Google, but it didn't match up with other content on your page, you could look like you are spamming. Similarly, if your page is made up of too many headings and not enough actual content, you could look like you are spamming. It's not as simple as adding an h1 and an h2 tag and watching your ranking go up! That's why you need to write websites for humans, not robots.
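If you want to check that a page really follows a hierarchy like the one above, you can pull the headings out and look at them as an outline. The following is a rough sketch using Python's standard `html.parser`; the class name `HeadingOutline`, the helper `skipped_levels`, and the sample page are my own assumptions for illustration, and it says nothing about how any search engine actually reads headings.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._chunks = []    # text pieces seen inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._chunks = []

    def handle_data(self, data):
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._chunks).strip()))
            self._level = None


def skipped_levels(outline):
    """Return headings that jump more than one level below the previous heading."""
    problems = []
    previous = 0
    for level, text in outline:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems


# Hypothetical page: the last heading jumps from h2 straight to h4.
page = """
<h1>Contact Us</h1>
<h2>Contact Details</h2>
<h3>Telephone Numbers</h3>
<h3>Email Addresses</h3>
<h2>How To Find Us</h2>
<h4>By Car</h4>
"""

parser = HeadingOutline()
parser.feed(page)
parser.close()
outline = parser.outline

for level, text in outline:
    print("..." * (level - 1) + f"h{level} {text}")

problems = skipped_levels(outline)
print("Skipped levels:", problems)
```

A skipped level is not an error in itself, but it is a cheap hint that a heading was picked for its size rather than its place in the structure.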
I have found this paper: A New Study on Using HTML Structures to Improve Retrieval.
However, it is an old paper (1999), and I am still looking for more recent ones.
Check out
http://jcmc.indiana.edu/vol12/issue3/pan.html
http://www.springerlink.com/content/l22811484243r261/
Some time spent on scholar.google.com might help you find what you are looking for
You can also try searching the 'Computer Science' section of arXiv: http://arxiv.org for "search engine" and the various terms that others have suggested.
It contains many academic papers, all freely available... hopefully some of them will be relevant to your research. (Of course the caveat of validating any paper's content applies.)
Like cletus said, follow the Google guidelines.
I ran a few tests and came to the conclusion that the title, image alt text, and h tags are the most important. Google AdSense is also worth mentioning. I had the feeling that if you implement these, the rank of your site increases.
I believe what you are interested in is called structural fingerprinting, and it is often used to determine the similarity of two structures. In Google's case, this would mean applying a weight to different tags and feeding the result to a secret algorithm that (probably) uses the frequencies of the different elements in the fingerprint. This is deeply rooted in information theory; if you are looking for academic papers on information theory, I would start with "A Mathematical Theory of Communication" by Claude Shannon.
I would also suggest looking at Microformats and RDF. Both are used to enhance searching. These are mostly search engine agnostic, but there are some specific things as well. For Google-specific guidelines for HTML content, read this link.
In short; very carefully. In long:
Quote from The Anatomy of a Large-Scale Hypertextual Web Search Engine:
[...] This gives us some limited phrase searching as long as there are not that many anchors for a particular word. We expect to update the way that anchor hits are stored to allow for greater resolution in the position and docIDhash fields. We use font size relative to the rest of the document because when searching, you do not want to rank otherwise identical documents differently just because one of the documents is in a larger font. [...]
It goes on:
[...] Another big difference between the web and traditional well controlled collections is that there is virtually no control over what people can put on the web. Couple this flexibility to publish anything with the enormous influence of search engines to route traffic and companies which deliberately manipulating search engines for profit become a serious problem. This problem that has not been addressed in traditional closed information retrieval systems. Also, it is interesting to note that metadata efforts have largely failed with web search engines, because any text on the page which is not directly represented to the user is abused to manipulate search engines. [...]
The Challenges in a web search engine addresses these issues in a more modern fashion:
[...] Web pages in HTML fall into the middle of this continuum of structure in documents, being neither close to free text nor to well-structured data. Instead HTML markup provides limited structural information, typically used to control layout but providing clues about semantic information. Layout information in HTML may seem of limited utility, especially compared to information contained in languages like XML that can be used to tag content, but in fact it is a particularly valuable source of meta-data in unreliable corpora such as the web. The value in layout information stems from the fact that it is visible to the user [...]:
And adds:
[...] HTML tags can be analyzed for what semantic information can be inferred. In addition to the header tags mentioned above, there are tags that control the font face (bold, italic), size, and color. These can be analyzed to determine which words in the document the author thinks are particularly important. One advantage of HTML, or any markup language that maps very closely to how the content is displayed, is that there is less opportunity for abuse: it is difficult to use HTML markup in a way that encourages search engines to think the marked text is important, while to users it appears unimportant. For instance, the fixed meaning of the <H1> tag means that any text in an H1 context will appear prominently on the rendered web page, so it is safe for search engines to weigh this text highly. However, the reliability of HTML markup is decreased by Cascading Style Sheets which separate the names of tags from their representation. There has been research in extracting information from what structure HTML does possess. For instance, [Chakrabarti et al., 2001; Chakrabarti, 2001] created a DOM tree of an HTML page and used this information to increase the accuracy of topic distillation, a link-based analysis technique.
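To make the idea in that quote concrete, here is a toy sketch in Python of scoring words by the tag they appear in. The weights in `TAG_WEIGHTS` are numbers I invented purely for illustration, and the class name `WeightedTerms` is hypothetical; nothing here reflects how Google or any real search engine weighs tags.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Invented weights for illustration only; text outside these tags counts as 1.0.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "h3": 2.0, "b": 1.5, "strong": 1.5}


class WeightedTerms(HTMLParser):
    """Add up a per-word score, using the heaviest enclosing tag as the weight."""

    def __init__(self):
        super().__init__()
        self.scores = Counter()
        self._open = []  # weighted tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        # Drop the most recently opened matching tag, if there is one.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i] == tag:
                del self._open[i]
                break

    def handle_data(self, data):
        weight = max((TAG_WEIGHTS[t] for t in self._open), default=1.0)
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.scores[word] += weight


# Hypothetical page fragment.
page = (
    "<title>Contact us</title>"
    "<h1>Contact details</h1>"
    "<p>Call us or <b>email</b> us.</p>"
)

parser = WeightedTerms()
parser.feed(page)
parser.close()
scores = parser.scores

for word, score in scores.most_common():
    print(f"{word}: {score}")
```

Even this toy version shows the abuse problem the papers describe: anyone can raise a word's score by wrapping it in a heading, which is why visible rendering matters more than the tag name alone.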
There are a number of issues a modern search engine needs to combat, for example web spam and blackhat SEO schemes.
Combating webspam with trustrank
Webspam taxonomy
Detecting spam web pages through content analysis
But even in a perfect world, e.g. after eliminating the bad apples from the index, the web is still an utter mess because no-one has identical structures. There are maps, games, video, photos (flickr) and lots and lots of user generated content. In other words, the web is still very unpredictable.
Resources
Hypertext and the web:
Extracting knowledge from the World Wide Web
Rich media and web 2.0
Thresher: automating the unwrapping of semantic content from the World Wide Web
Information retrieval
Webspam papers
Combating webspam with trustrank
Webspam taxonomy
Detecting spam web pages through content analysis
To keep it painfully simple: make your information architecture logical. If the most important elements for user comprehension are highlighted with headings and grouped logically, then the document is easier to interpret using information processing algorithms. Magically, it will also be easier for users to interpret. Remember the search engine algorithms were written by people trying to interpret language.
The Basic Process is:
Write well structured HTML - using header tags to indicate the most critical elements on the page. Use logical tags based on the structure of your information. Lists for lists, headers for major topics.
Supply relevant alt attributes and names for any visual elements, and then use simple CSS to arrange these elements.
If the site works well for users and contains relevant information, you don't risk becoming a blacklisted spammer, and search engine algorithms will favor your page.
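The alt text step in the process above is easy to check automatically. Here is a small sketch in Python that lists images with no alt attribute at all; the class name `MissingAlt` and the sample markup are assumptions of mine, and an empty `alt=""` is deliberately not flagged because it is a valid choice for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAlt(HTMLParser):
    """Record the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src", "<no src>"))


# Hypothetical markup: only the first image lacks an alt attribute.
page = """
<img src="logo.png">
<img src="team.jpg" alt="Our team at the 2009 conference">
<img src="divider.gif" alt="">
"""

checker = MissingAlt()
checker.feed(page)
checker.close()
missing = checker.missing
print("Images without alt:", missing)
```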
I really enjoyed the book Transcending CSS for a clean explanation of properly structured HTML.
I suggest trying Google scholar as one of your avenues when looking for academic articles
semantic search
I found it interesting that, with no meta keywords or description provided, in a scenario like this:
<p>Some introduction</p>
<h1>headline 1</h1>
<p>text for section one</p>
the "text for section one" is always shown on the search result page.
Google now also supports a new tag called CANONICAL.
Does anyone know of a library or bit of code that converts British English to American English and vice versa?
I don't imagine there's too many differences (some examples that come to mind are doughnut/donut, colour/color, grey/gray, localised/localized) but it would be nice to be able to provide localised site content.
I've been working on one to convert US English to UK English. As I've discovered it's actually a lot harder to write something to convert the other way but I hope to get around to providing a reverse conversion one day.
This isn't perfect, but it's not a bad effort (even if I do say so myself). It'll convert most US spellings to UK ones but there are some words where UK English retains the US spelling (e.g. "program" where this refers to computer software). It won't convert words like pants to trousers because my main goal was simply to make the spelling uniform across the whole document.
There are also words such as practice and license where UK English uses either those or practise & licence, depending on whether the word's being used as a verb or a noun. For those two examples the conversion tool will highlight them and an explanatory note pops up on the lower left hand of your screen when you hover your mouse over them. All word patterns which are converted are underlined in red, and the output is shown in a side by side comparison with your original input.
It'll do quite large blocks of text quite quickly, but I prefer to use it just for a couple of paragraphs at a time, copying them in from a Word doc.
It's still a work in progress so if anyone has any comments or suggestions then I'd appreciate feedback I can use to improve it.
http://www.us2uk.eu/
The difference between UK and US English is far greater than just a difference in spelling. There is also the hood/bonnet, sidewalk/pavement, pants/trousers idea.
Guess it depends how far you need to take it.
I looked forever to find a solution to this, but couldn't find one, so, I wrote my own bit of code for it, using a master list of ~20,000 different spellings that were freely available from the varcon project and the language experts at wordsworldwide:
https://github.com/HoldOffHunger/convert-british-to-american-spellings
Since I had two source lists, I used each to crosscheck the other, and I found numerous errors and typos (varcon lists the British equivalent of "preexistent" as "preaexistent"). It is possible that I may have accidentally made typos, too, but since I didn't do any wordsmithing here, I don't believe that to be the case.
Example:
require('AmericanBritishSpellings.php');
$american_british_spellings = new AmericanBritishSpellings();
$text = "Axiomatically ax that door, would you, my neighbour?";
$text = $american_british_spellings->SwapBritishSpellingsForAmericanSpellings(['text'=>$text]);
print($text); // output: Axiomatically axe that door, would you, my neighbor?
I think if you're thinking of converting from American English to British English, I personally wouldn't bother. Britain is very Americanised anyway, we accept silly yank spellings on the net :)
I had a similar problem recently. I discovered the following tool, called VarCon. I haven't tested it out, but I needed a rough converter for some text data. Here's an example.
echo "I apologise for my colourful tongue ." | ./translate british american
# >> I apologize for my colorful tongue .
It looks like it works for various dialects. Be sure to read the README and proceed with caution.
*note: This will only correct spelling variations.
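For anyone who just wants to see the shape of the word-list approach these answers describe, here is a minimal sketch in Python. The five-entry dictionary is only a stand-in that I typed in from the examples in the question; a real converter would load a full list such as the VarCon data, and the function names are hypothetical.

```python
import re

# Tiny illustrative word list, not a real spelling database.
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "grey": "gray",
    "doughnut": "donut",
    "localised": "localized",
    "neighbour": "neighbor",
}

WORD = re.compile(r"[A-Za-z]+")


def match_case(source: str, replacement: str) -> str:
    """Give the replacement the same capitalisation style as the source word."""
    if source.isupper():
        return replacement.upper()
    if source[0].isupper():
        return replacement.capitalize()
    return replacement


def to_american(text: str) -> str:
    """Swap whole words found in the list; leave everything else untouched."""

    def swap(match):
        word = match.group(0)
        american = BRITISH_TO_AMERICAN.get(word.lower())
        return match_case(word, american) if american else word

    return WORD.sub(swap, text)


converted = to_american("Grey doughnuts? No, one grey doughnut in full COLOUR.")
print(converted)
```

Note that "doughnuts" is left alone because only exact whole words in the list are swapped, so plurals and other inflections need their own entries. Like the tools above, this only touches spelling; vocabulary differences such as pavement/sidewalk and words that depend on usage such as practice/practise are out of scope.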