Different webfont for Japanese special characters

I'm working on a Japanese website and I'm using the Meiryo webfont. I really like this font but the special characters have too much margin.
The exact issue is that there's too much space in front of special characters like 【, which makes the design look off if such a character is the first on a line.
font-family: "ヒラギノ角ゴ Pro W3","Hiragino Kaku Gothic Pro",Osaka,"メイリオ",Meiryo,"ＭＳ Ｐゴシック","MS PGothic",sans-serif;
<h3>【好き】</h3>
<p>猫、アクリル板、真空管、リンゴジュース、ゾロ目、柄物、ドラッグストア、レゴ、アイドル、アニメ、光る物、製菓、イラスト、ドラム(叩けない)、ダリ</p>
<h3>【好きくない】</h3>
<p>煙草、カフェイン、満員電車及び人混み、パンチのあるアルコール類、刺激物全般(辛い、苦い)、算数、スポーツ全般、読書、プンプンしてる人</p>
<h3>【弱点】</h3>
<p>近眼乱視、虚弱、猫舌、譜面が読めない、書けない。方向音痴、機械音痴、運動音痴。睡眠をとらないと死ぬ。強く怒られると死ぬ。</p>
An image of the issue in my browser:
http://i.stack.imgur.com/OYqMt.jpg
My idea to solve this is to write a script that puts every special character in a container which has a negative margin. That is obviously very hacky, and not practical at all, so are there any better solutions, like a different font for special characters only?

I wouldn't say this is a great option, but you could let CSS render the lenticular brackets instead. By doing so, you're removing the brackets from text entirely, so they're no longer selectable, indexable, etc., but it would solve the padding problem for these characters with a simple CSS class and it should work with any element.
This would only apply to these two characters. Other special characters would need similar CSS rules.
.brackets:before {
content:'\3010';
margin-left:-1.5%;
}
.brackets:after {
content:'\3011';
margin-right:-1.5%;
}
And then in your HTML, just add a brackets class and remove the brackets in text:
<h3 class="brackets">好き</h3>
<h3 class="brackets">好き</h3>
<p>猫、アクリル板、真空管、リンゴジュース、ゾロ目、柄物、ドラッグストア、レゴ、アイドル、アニメ、光る物、製菓、イラスト、ドラム(叩けない)、ダリ</p>
<h3 class="brackets">好きくない</h3>
<p>煙草、カフェイン、満員電車及び人混み、パンチのあるアルコール類、刺激物全般(辛い、苦い)、算数、スポーツ全般、読書、<span class="brackets">プンプンしてる人</span></p>
<h3 class="brackets">弱点</h3>
<p>近眼乱視、虚弱、猫舌、譜面が読めない、書けない。方向音痴、機械音痴、運動音痴。睡眠をとらないと死ぬ。強く怒られると死ぬ。</p>
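For comparison, the script idea from the question could be sketched roughly as below. This is only an illustration: the function name wrapOpeningPunctuation, the class name tight-punct and the list of bracket characters are assumptions that do not come from the question or the answer, and the input is assumed to be plain text rather than arbitrary HTML.

```javascript
// Hypothetical list of opening punctuation that carries built-in leading space.
const OPENING_PUNCTUATION = "【「『（〔";

// Wrap each opening bracket in a span so CSS can pull it back with a negative margin.
// Assumes `text` is plain text (no markup), so no HTML parsing is attempted.
function wrapOpeningPunctuation(text, className = "tight-punct") {
  const pattern = new RegExp("[" + OPENING_PUNCTUATION + "]", "g");
  return text.replace(pattern, (ch) => `<span class="${className}">${ch}</span>`);
}

console.log(wrapOpeningPunctuation("【好き】"));
```

A rule such as .tight-punct { margin-left: -0.5em; } (the value is a guess that would need tuning per font) could then be applied to the wrapped characters. Unlike the CSS-generated brackets, the characters stay in the text, so they remain selectable.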

Related

What is content in CSS before or after?

.icon-a:before { content: '\e803'; }
.icon-b:before { content: '\e96f'; }
I know content can be used to render a URL or quotes, but what is happening in the code above?
I came across this code and find it confusing; searching did not turn up an explanation.
Any help would be appreciated.
Thanks.

Quoting papiro as suggested here
Put simply, they're Unicode references. The "\e601", for example, is the hex code 0xe601. If you go here: http://unicodelookup.com/#0xe601/1 you'll see that the entry for that character is totally blank. It's in a part of the Unicode character set reserved for "private" use. Meaning icon libraries and the like can place whatever they want in those spots and not have to worry about overriding common characters like those of any of the alphabets of the world or a Chinese character, for instance.
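In your case, \e803 refers to the Unicode code point U+E803, which lies in that private-use range, so the glyph shown is whatever the icon font assigns to it.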
Hope this helps

It depends on the font currently used in the parent element. The value is a Unicode character code: the hexadecimal code of the character is written after the backslash, and the font decides which glyph is displayed for it.
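To make the escape notation concrete, here is a small sketch that decodes a CSS-style escape such as \e803 into its code point and checks whether it falls in the Basic Multilingual Plane's Private Use Area (U+E000 to U+F8FF). The function name describeCssEscape is made up for this illustration, and only the simple "backslash plus 1 to 6 hex digits" form is handled.

```javascript
// Decode a simple CSS escape (backslash followed by 1-6 hex digits).
// Returns null for anything else; this is not a full CSS tokenizer.
function describeCssEscape(escape) {
  const match = /^\\([0-9a-fA-F]{1,6})$/.exec(escape);
  if (!match) return null;
  const codePoint = parseInt(match[1], 16);
  return {
    codePoint,
    char: String.fromCodePoint(codePoint),
    // BMP Private Use Area: icon fonts commonly place their glyphs here.
    inPrivateUse: codePoint >= 0xe000 && codePoint <= 0xf8ff,
  };
}

console.log(describeCssEscape("\\e803")); // private-use code point, glyph depends on the font
console.log(describeCssEscape("\\3010")); // an ordinary character
```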

Show paragraph marks, spaces and other formatting marks in a contenteditable div

I need to show paragraph marks, spaces and other formatting marks in a contenteditable div, as you can in MS Word by pressing the Formatting Marks button: http://blogs.mccombs.utexas.edu/the-most/files/2011/04/show-hide-button-in-outlook.jpg
Is there a simple way to achieve this?

<html>
<head>
<style>
span::after{
color:black;
content:"\00b6";
}
p::after{
color:black;
content:"\00b6";
}
</style>
</head>
<body>
<h3>
<span class="label">This is the main label</span>
<span class="secondary-label">secondary label</span>
</h3>
<p>Quote me</p>
</body>
</html>

Creating a font which draws spaces as dots and newlines as paragraph marks should solve your problem.
In code it will look like
.editable-div {
font-family: "Your custom font with spaces as dots and stuff", "Actual character font";
}
Here's an article which elaborates on this approach http://www.sitepoint.com/joy-of-subsets-web-fonts/

(I don't have access to Word, but I'm assuming it's the exact same functionality present in most text editors, or InDesign's 'show hidden characters' option &c.)
No, there definitely isn't a simple way to do this, because it's a fairly complex feature.
Your best bet if you really want to do this is to capture the input within the div as a user enters text. Something like Bacon that can easily capture keyed user input as a stream (and allow you to map across the stream) would simplify the process somewhat.
You'll then need to replace* (in realtime) every space/paragraph mark/&c with a relevant marker for the user. The actual input still needs to be either saved as typed, or parsed again before saving to strip the new, pretend characters. And though you can use Unicode entities for many of the markers (pilcrows, maybe?), a space (for example) will still show as whitespace (or as the entity code if escaped), so you would need to use a representative icon. Essentially, the majority of the hidden characters will each need to have their own specific, defined rendering rules.
This is all fairly nightmarish. It's doable if you can ensure the max amount of text can be kept small, and if you can control what users can enter. For large amounts of text, I can see it becoming horrific: not sure what the JS overhead would be in terms of performance, but I can't imagine it would be particularly good.
* or append - for example newlines/carriage returns etc need to be both displayed as a marker, and actually occur within the contenteditable element.
Edit: What you could do in addition to the above is to edit a font, replacing/adding unicode points for hidden characters instead/as well as visible ones - you would still need to capture input, but this would remove a few headaches. It would deal with spaces quite nicely, for example. Still a bit of a nightmare, but hey.
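The replace-and-strip step described above might look something like the following sketch. The marker characters (a middle dot for spaces, a pilcrow for line ends) and both function names are assumptions chosen for the example; it only handles plain spaces and newlines, and stripping is lossy if the user's own text already contains a middle dot.

```javascript
const SPACE_MARK = "\u00B7";     // middle dot, shown in place of a space
const PARAGRAPH_MARK = "\u00B6"; // pilcrow, shown before each line break

// Produce the text to display: spaces become dots, newlines get a visible pilcrow.
function showFormattingMarks(text) {
  return text.replace(/ /g, SPACE_MARK).replace(/\n/g, PARAGRAPH_MARK + "\n");
}

// Recover the text to save by removing the pretend characters again.
function stripFormattingMarks(marked) {
  return marked
    .split(PARAGRAPH_MARK + "\n").join("\n")
    .split(SPACE_MARK).join(" ");
}

const shown = showFormattingMarks("a b\nc");
console.log(shown);
console.log(stripFormattingMarks(shown) === "a b\nc");
```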

How to display raw HTML code in PRE or something like it but without escaping it

I'd like to display raw HTML. We all know one has to escape each "<" and ">" like this:
<PRE> this is a test &lt;DIV&gt; </PRE>
However, I do not want to do this. I'd like a way to keep the HTML code as is, since it is easier to read inside the editor, and I might want to copy it and use it again myself as actual HTML code. I do not want to have to change it again, or keep two versions of the same code, one escaped and one not.
Is there any other environment that is more "raw" than PRE that might allow this, maybe in HTML5, so that one does not have to edit the HTML and change everything each time they want to show some raw HTML code?
Something like <REALLY_REALLY_VERBATIM> ...... </REALLY_REALLY_VERBATIM>
The JavaScript solution does not work on Firefox 21.
The first solution still does not work on Firefox.

You can use the xmp element, see What was the <XMP> tag used for?. It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (though it also tells authors not to use it, but it cannot really prevent you).
Everything inside xmp is taken as such; no markup (tags or character references) is recognized there, except, for obvious reasons, the end tag of the element itself, </xmp>.
Otherwise xmp is rendered like pre.
When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:
<pre><![CDATA[
This is a demo, tags like <p> will
appear literally.
]]></pre>
I don’t see how you could combine xmp and a CDATA section to achieve so-called polyglot markup.

Essentially the original question can be broken down in 2 parts:
Main objective/challenge: embedding (or transporting) a raw formatted code-snippet (any kind of code) in a web-page's markup (for simple copy/paste/edit due to no encoding/escaping)
correctly displaying/rendering that code-snippet (possibly edit it) in the browser
The short (but ambiguous) answer is: you can't, ...but you can (get very close).
(I know, those are 3 contradicting answers, so read on...)
(polyglot)(x)(ht)ml Markup-languages rely on wrapping (almost) everything between begin/opening and end/closing tags/character(sequences).
So, to embed any kind of raw code/snippet inside your markup-language, one will always have to escape/encode every instance (inside that snippet) that resembles the character(-sequence) that would close the wrapping 'container' element in the markup. (During this post I'll refer to this as rule no 1.)
Think of "some "data" here" or <i>..close italics with '</i>'-tag</i>, where it is obvious one should escape/encode (something in) </i and " (or change container's quote-character from " to ').
So, because of rule no 1, you can't 'just' embed 'any' unknown raw code-snippet inside markup.
Because, if one has to escape/encode even one character inside the raw snippet, then that snippet would no longer be the same original 'pure raw code' that anyone can copy/paste/edit in the document's markup without further thought. It would lead to malformed/illegal markup and Mojibake (mainly) because of entities.
Also, should that snippet contain such characters, you'd still need some javascript to 'translate' that character(sequence) from (and to) its escaped/encoded representation to display the snippet correctly in the 'webpage' (for copy/paste/edit).
That brings us to (some of) the datatypes that markup-languages specify. These datatypes essentially define what are considered 'valid characters' and their meaning (per tag, property, etc.):
PCDATA (Parsed Character DATA): will expand entities, and one must escape <, & (and > depending on markup language/version). Most tags like body, div, pre, etc., but also textarea (until HTML5), fall under this type. So not only do you need to encode all the container's closing character-sequences inside the snippet, you also have to encode all <, & (and >) characters (at minimum). Needless to say, encoding/escaping this many characters falls outside this objective's scope of embedding a raw snippet in the markup.
'..But a textarea seems to work...': yes, either because of the browser's error handling trying to make something out of it, or because of HTML5:
RCDATA (Replaceable Character DATA): will not treat tags inside the text as markup (but is still governed by rule 1), so one doesn't need to encode < (>). BUT entities are still expanded, so they and 'ambiguous ampersands' (&) need special care. The current HTML5 spec says the textarea is now an RCDATA field and (quote):
The text in raw text and RCDATA elements must not contain any occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS) followed by characters that case-insensitively match the tag name of the element followed by one of U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).
Thus no matter what, textarea needs a hefty entity translation handler or it will eventually Mojibake on entities!
CDATA (Character Data): will not treat tags inside the text as markup and will not expand entities. So as long as the raw snippet code does not violate rule 1 (that one can't have the container's closing character(sequence) inside the snippet), this requires no other escaping/encoding.
Clearly this boils down to: how can we minimize the number of characters/character-sequences that still need to be encoded in the snippet's raw source and the number of times that character(sequence) might appear in an average snippet; something that is also of importance for the javascript that handles the translation of these characters (if they occur).
So what 'containers' have this CDATA context?
Most value properties of tags are CDATA, so one could (ab)use a hidden input's value property (proof of concept jsfiddle here).
However (per rule 1) this creates an encoding/escape problem with nested quotes (" and ') in the raw snippet, and one needs some javascript to get/translate and set the snippet in another (visible) element (or simply set it as a textarea's value). Somehow this gave me problems with entities in FF (just like in a textarea). But it doesn't really matter, since the 'price' of having to escape/encode nested quotes is higher than with an (HTML5) textarea (quotes are quite common in source code..).
What about trying to (ab)use <![CDATA[<tag>bla & bla</tag>]]>?
As Jukka points out in his extended answer, this would only work in (rare) 'real xhtml'.
I thought of using a script-tag (with or without such a CDATA wrapper inside the script-tag) together with a multi-line comment /* */ that wraps the raw snippet (script-tags can have an id and you can access them by count). But since this obviously introduces an escaping problem with */, ]]> and </script in the raw snippet, this doesn't seem like a solution either.
Please post other viable 'containers' in the comments to this answer.
By the way, encoding or counting the number of - characters and balancing them out inside a comment tag <!-- --> is just insane for this purpose (apart from rule 1).
That leaves us with Jukka K. Korpela's excellent answer: the <xmp> tag seems the best option!
The 'forgotten' <xmp> holds CDATA, is intended for this purpose AND is indeed still in the current HTML 5 spec (and has been at least since HTML3.2); exactly what we need! It's also widely supported, even in IE6 (that is.. until it suffers from the same regression as the scrolling table-body).
Note: as Jukka pointed out, this will not work in true xhtml or polyglot (that will treat it as a pre) and the xmp tag must still adhere to rule no 1. But that's the 'only' rule.
Consider the following markup:
<!-- ATTENTION: replace any occurrence of </xmp with &lt;/xmp -->
<xmp id="snippet-container">
<div>
<div>this is an example div &amp; holds an xmp tag:<br />
<xmp>
<html><head> <!-- indentation col 0!! -->
<title>My Title</title>
</head><body>
<p>hello world !!</p>
</body></html>
&lt;/xmp> <!-- note this encoded/escaped tag -->
</div>
This line is also part of the snippet
</div>
</xmp>
The above code block illustrates a raw piece of markup where <xmp id="snippet-container"> contains an (almost raw) code-snippet (containing div>div>xmp>html-document).
Notice the encoded closing tag in this markup? To comply with rule no 1, this was encoded/escaped.
So embedding/transporting the (sometimes almost) raw code is/seems solved.
What about displaying/rendering the snippet (and that encoded &lt;/xmp>)?
The browser will (or it should) render the snippet (the contents inside snippet-container) exactly the way you see it in the codeblock above (with some discrepancy amongst browsers whether or not the snippet starts with a blank line).
That includes the formatting/indentation, entities (like the string &amp;), full tags, comments AND the encoded closing tag &lt;/xmp> (just like it was encoded in the markup). And depending on browser(version) one could even try to use the attribute contenteditable="true" to edit this snippet (all that without javascript enabled). Doing something like textarea.value=xmp.innerHTML is also a breeze.
So you can... if the snippet doesn't contain the containers closing character-sequence.
However, should a raw snippet contain the closing character-sequence </xmp (because it is an example of xmp itself or it contains some regex, etc), you must accept that you have to encode/escape that sequence in the raw snippet AND need a javascript handler to translate that encoding to display/render the encoded &lt;/xmp> like </xmp> inside a textarea (for editing/posting) or (for example) a pre just to correctly render the snippet's code (or so it seems).
A very rudimentary jsfiddle example of this here. Note that getting/embedding/displaying/retrieving-to-textarea worked perfect even in IE6. But setting the xmp's innerHTML revealed some interesting 'would-be-intelligent' behavior on IE's part. There is a more extensive note and workaround on that in the fiddle.
But now comes the important kicker (another reason why you only get very close):
Just as an over-simplified example, imagine this rabbit-hole:
Intended raw code-snippet:
<!-- remember to translate between &lt;/xmp> and </xmp> -->
<xmp>
<p>a paragraph</p>
</xmp>
Well, to comply with rule 1, we 'only' need to encode those </xmp[> \n\r\t\f\/] sequences, right?
So that gives us the following markup (using just a possible encoding):
<xmp id="container">
<!-- remember to translate between &lt;/xmp> and &lt;/xmp> -->
<xmp>
<p>a paragraph</p>
&lt;/xmp>
</xmp>
Hmm.. shalt I get my crystal ball or flip a coin? No, let the computer look at its system-clock and state that a derived number is 'random'. Yes, that should do it..
Using a regex like: xmp.innerHTML.replace(/&lt;(?=\/xmp[> \n\r\t\f\/])/gi, '<');, would translate 'back' to this:
<!-- remember to translate between </xmp> and </xmp> -->
<xmp>
<p>a paragraph</p>
</xmp>
Hmm.. seems this random generator is broken... Houston..?
Should you have missed the joke/problem, read again starting at the 'intended raw code-snippet'.
Wait, I know, we (also) need to encode .... to ....
Ok, rewind to 'intended raw code-snippet' and read again.
Somehow this all begins to smell like the famous hilarious-but-true regex answer on SO, a good read for people fluent in mojibake.
Maybe someone knows a clever algorithm or solution to fix this problem, but I assume that the embedded raw code will get more and more obscure, to the point where you'd be better off properly escaping/encoding just your <, & (and >), just like the rest of the world.
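The rabbit-hole can be reproduced outside the browser with a small sketch. The function names encodeForXmp, decodeFromXmp and survivesRoundTrip are invented for this illustration, and the encoding (replacing the < of a closing xmp tag with &lt;) is just the one possible encoding used in the walk-through above, not a recommended scheme.

```javascript
// Rule no 1: any "</xmp" followed by a tag-terminating character must be encoded.
function encodeForXmp(snippet) {
  return snippet.replace(/<(?=\/xmp[> \n\r\t\f\/])/gi, "&lt;");
}

// The 'translate back' step performed by the javascript handler.
function decodeFromXmp(markup) {
  return markup.replace(/&lt;(?=\/xmp[> \n\r\t\f\/])/gi, "<");
}

// True only when decoding gives back exactly the snippet that was encoded.
function survivesRoundTrip(snippet) {
  return decodeFromXmp(encodeForXmp(snippet)) === snippet;
}

const simple = "<xmp>\n<p>a paragraph</p>\n</xmp>";
const tricky = "<!-- translate between &lt;/xmp> and </xmp> -->\n" + simple;

console.log(survivesRoundTrip(simple)); // first-level escaping is reversible
console.log(survivesRoundTrip(tricky)); // a snippet that already holds the encoded form is not
```

The second snippet comes back with both occurrences decoded, because after encoding the two forms are indistinguishable.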
Conclusion: (using the xmp tag)
it can be done with known snippets that do not contain the container's closing character-sequence,
we can get very close to the original objective with known snippets that only use 'basic first-level' escaping/encoding so we don't fall in the rabbithole,
but ultimately it seems that one can't do this reliably in a 'production-environment' where people can/should copy/paste/edit 'any unknown' raw snippets while not knowing/understanding the implications/rules/rabbithole (depending on your implementation of handling/translating for rule 1 and the rabbit-hole).
Hope this helps!
PS:
Whilst I would appreciate an upvote if you find this explanation useful, I kind of think Jukka's answer should be the accepted answer (should no better option/answer come along), since he was the one who remembered the xmp tag (that I forgot about over the years and got 'distracted' by the commonly advocated PCDATA elements like pre, textarea, etc.).
This answer originated in explaining why you can't do it (with any unknown raw snippet) and explain some obvious pitfalls that some other (now deleted) answers overlooked when advising a textarea for embedding/transport. I've expanded my existing explanation to also support and further explain Jukka's answer (since all that entity and *CDATA stuff is almost harder than code-pages).

Cheap and cheerful answer:
<textarea>Some raw content</textarea>
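The textarea will handle tabs, multiple spaces, newlines and line wrapping verbatim.
It copies and pastes nicely and it's valid HTML all the way. It also allows the user to resize the code box.
You don't need any CSS or JS. Most content also needs no escaping, although character references such as &amp; are still expanded and a literal </textarea> inside the content would end the element early.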
You can alter the appearance and behaviour as well.
Here's a monospace font, editing disabled, smaller font, no border:
<textarea
style="width:100%; font-family: Monospace; font-size:10px; border:0;"
rows="30" disabled
>Some raw content</textarea>
This solution is probably not semantically correct. So if you need that, it might be best to choose a more sophisticated answer.

xmp is the way to go, i.e.:
<xmp>
# your code...
</xmp>

echo '<pre>' . htmlspecialchars("<div><b>raw HTML</b></div>") . '</pre>';
I think that's what you're looking for?
In other words, use htmlspecialchars() in PHP
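Where the page is assembled in JavaScript rather than PHP, the same escape-then-wrap idea can be sketched as follows. The names escapeHtml and toPreBlock are made up for this example, and the replacement table only roughly mirrors what htmlspecialchars() covers; it is not a drop-in equivalent.

```javascript
// Characters that must not appear raw inside element content or attribute values.
const HTML_ESCAPES = new Map([
  ["&", "&amp;"],
  ["<", "&lt;"],
  [">", "&gt;"],
  ['"', "&quot;"],
  ["'", "&#039;"],
]);

// Replace each special character with its character reference.
function escapeHtml(raw) {
  return String(raw).replace(/[&<>"']/g, (ch) => HTML_ESCAPES.get(ch));
}

// Wrap the escaped snippet in a pre element, like the PHP one-liner above.
function toPreBlock(raw) {
  return "<pre>" + escapeHtml(raw) + "</pre>";
}

console.log(toPreBlock("<div><b>raw HTML</b></div>"));
```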

@GitaarLAB and @Jukka explain that the <xmp> tag is obsolete, but still the best option. When I use it like this
<xmp>
<div>Lorem ipsum</div>
<p>Hello</p>
</xmp>
then the first EOL is inserted in the code, and it looks awful.
It can be solved by removing that EOL
<xmp><div>Lorem ipsum</div>
<p>Hello</p>
</xmp>
but then it looks bad in the source. I used to solve it with wrapping <div>, but recently I figured out a nice CSS3 rule, I hope it also helps somebody:
xmp { margin: 5px 0; padding: 0 5px 5px 5px; background: #CCC; }
xmp:before { content: ""; display: block; height: 1em; margin: 0 -5px -2em -5px; }
This looks better.

If your pages are JSP with the JSTL functions tag library available, you can use the fn:escapeXml function and not have to worry about escaping angle brackets or special characters.
<pre>
${fn:escapeXml('
<!-- all your code -->
')};
</pre>

The <code> tag is a good option, because <xmp> and <pre> do not wrap long lines by default:
echo '<code>' . htmlspecialchars("<div><b>hello world</b></div>") . '</code>';

Inserting HTML tag in the middle of Arabic word breaks word connection (cursive)

From wikipedia:
Cursive (from Latin curro, currere, cucurri, cursum, to run, hasten) is any style of handwriting that is designed for writing notes and letters quickly by hand. In the Arabic, Latin, and Cyrillic writing systems, the letters in a word are connected, making a word one single complex stroke.
In these writing systems, when we wrap part of a single word in e.g. a <span> tag to apply a custom CSS style, it breaks the connection between the letters. Is there any solution for this?
For example, this is a normal Arabic word: كتب
But when we color the last letter differently using a span tag, the letters are no longer joined,
because the first two letters are in one element and the last is in another in order to color it.
Is there something I can do to avoid these breaks within the word?
Here is the full html:
<p>كت<span style="color: Red;">ب</span></p>

I'm not sure if there's any HTML way to do it, but you can fix it by adding a zero-width joiner Unicode character before the opening span tag:
<p>كت&#x200d;<span style="color: Red;">ب</span></p>
You can use the actual Unicode character instead of the HTML character reference, of course, but that wouldn't be visible here. Or you can use the prettier &zwj; entity.
Here it is in action (using an invisible <b> tag, since I can't do color here), without the joiner:
كتب
and with the joiner:
كت‍ب
It's supposed to work without the joiner as far as I understand it, though, and it does in some browsers, but clearly not all of them.

Update 2020/5
Google Chrome (checked with version 81.0.4044.138) and Firefox (76.0.1) have solved this issue when rendering Arabic and Farsi words, and there is no longer any need to handle the situation manually. Simply wrapping the keyword with <span style="color:red">Keyword</span> works fine with both connecting and non-connecting characters.
For this reason, you probably can not see the difference between Correct and Wrong examples below:
Main post:
Seven years after the accepted answer, I would like to add a new answer with more practical details, as my native language is Farsi. I assume that we want to highlight a keyword within a longer word. This answer considers the following details:
1- Sometimes it is not enough to add &zwj; only after the previous character, because the next character also needs a tail to complete the connection.
body{font-size:36pt;}
span{color:red}
Wrong: مک‍<span>انیک</span>
<br>
Correct: مک‍<span>‍انیک</span>
2- We may also need to add &zwj; after the keyword to connect it to the next character.
body{font-size:36pt;}
span{color:red}
Wrong: مک‍<span>‍انیک</span>ی
<br>
Correct: مک‍<span>‍انیک‍</span>‍ی
3- There are some characters that accept tail before but not after. So we have to exclude them from accepting tail after them. This is the list of non-connecting characters to next characters: ا آ د ذ ر ز ژ و
4- Finally, to respect search engines and scrapers, I recommend using JavaScript (jQuery) to replace keywords after DOM ready, to keep the page source clean.
This is my final code with regards to all details above:
$(document).ready(function(){
var tail="\u200D";
var keyword="ستر";
$(".searchableContent").each(function(){
var htm=$(this).html();
/*
preserve keywords which have space both before and after
with a temp sign say #fullHolder#
*/
htm=htm.split(' '+keyword+' ').join(' #fullHolder# ');
/*
preserve keywords which have only space after
with a temp sign say #preHolder#
*/
htm=htm.split(keyword+' ').join('#preHolder#'+' ');
/*
preserve keywords which have only space before
with a temp sign say #nextHolder#
*/
htm=htm.split(' '+keyword).join(' '+'#nextHolder#');
/*
replace remaining keywords with marked up span.
Add tail to both side of span to make sure it is
connected to both letters before and after
*/
htm=htm.split(keyword).join(tail+'<span style="color:#ff0000">'+tail+keyword+tail+'</span>'+tail);
//Deal #preHolder# by adding tail only before the keyword
htm=htm.split('#preHolder#'+' ').join(tail+'<span style="color:#ff0000">'+tail+keyword+'</span>'+' ');
//Deal #nextHolder# by adding tail only after the keyword
htm=htm.split(' '+'#nextHolder#').join(' '+'<span style="color:#ff0000">'+keyword+tail+'</span>'+tail);
//Deal #fullHolder# by adding markup only without tail
htm=htm.split(' '+'#fullHolder#'+' ').join(' '+'<span style="color:#ff0000">'+keyword+'</span>'+' ');
//Remove all possible combination of added tails to non-connecting characters
var nonConnectings=['ا','آ','د','ذ','ر','ز','ژ','و'];
for (var x = 0; x < nonConnectings.length; x++) {
htm=htm.split(nonConnectings[x]+tail).join(nonConnectings[x]);
htm=htm.split(nonConnectings[x]+'<span style="color:#ff0000">'+tail).join(nonConnectings[x]+'<span style="color:#ff0000">');
htm=htm.split(nonConnectings[x]+'</span>'+tail).join(nonConnectings[x]+'</span>');
}
$(this).html(htm);
})
})
div{font-size:26pt}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="searchableContent">
سترون - بستری - آستر - بستر - استراحت
</div>
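The same rules (join before, join after, skip non-connecting letters) can also be written as a single pass over plain text without jQuery. The sketch below is not the author's code: the function name highlightKeyword is invented, it assumes the input is plain text rather than HTML, and it treats any Unicode letter as a possible neighbour, which is a simplification.

```javascript
const ZWJ = "\u200D";
// Letters that do not connect to the following letter: ا آ د ذ ر ز ژ و
const NON_CONNECTING = new Set("\u0627\u0622\u062F\u0630\u0631\u0632\u0698\u0648");
const OPEN_SPAN = '<span style="color:#ff0000">';

function highlightKeyword(text, keyword) {
  if (!keyword) return text; // nothing to highlight, also avoids an endless loop
  const isLetter = (ch) => ch !== undefined && /\p{L}/u.test(ch);
  const lastOfKeyword = keyword[keyword.length - 1];
  let out = "";
  let pos = 0;
  let hit;
  while ((hit = text.indexOf(keyword, pos)) !== -1) {
    const prev = text[hit - 1];
    const next = text[hit + keyword.length];
    // Join backwards only if a connecting letter precedes the keyword.
    const joinBefore = isLetter(prev) && !NON_CONNECTING.has(prev);
    // Join forwards only if a letter follows and the keyword's last letter connects.
    const joinAfter = isLetter(next) && !NON_CONNECTING.has(lastOfKeyword);
    out += text.slice(pos, hit);
    out += (joinBefore ? ZWJ : "") + OPEN_SPAN + (joinBefore ? ZWJ : "");
    out += keyword;
    out += (joinAfter ? ZWJ : "") + "</span>" + (joinAfter ? ZWJ : "");
    pos = hit + keyword.length;
  }
  return out + text.slice(pos);
}

console.log(highlightKeyword("\u0628\u0633\u062A\u0631\u06CC", "\u0633\u062A\u0631")); // بستری with ستر highlighted
```

Because the joiners are decided per occurrence, no placeholder strings or clean-up pass are needed, at the cost of not handling markup that is already present in the text.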

Is there an upside down caret character?

I have to maintain a large number of classic ASP pages, many of which have tabular data with no sort capabilities at all. Whatever order the original developer used in the database query is what you're stuck with.
I want to tack on some basic sorting to a bunch of these pages, and I'm doing it all client side with javascript. I already have the basic script done to sort a given table on a given column in a given direction, and it works well as long as the table is limited by certain conventions we follow here.
What I want to do for the UI is just indicate sort direction with the caret character ( ^ ) and ... what? Is there a special character that is the direct opposite of a caret? The letter v won't quite cut it. Alternatively, is there another character pairing I can use?

There's ▲: &#9650; and ▼: &#9660;

Don't forget the ∧ (logical and) and ∨ (logical or) characters; that's what I use for indicating sort direction. The HTML entities are &and; and &or; respectively.

There's always a lowercase "v". But seriously, aside from Unicode, all I can find would be &darr;, which looks like ↓.

An upside-down circumflex is called a caron, or a háček.
It has an HTML entity in the TADS Latin-2 extension to HTML: &caron; and looks like this: &caron; which unfortunately doesn't display in the same size/proportion as the ^ caret.
Or you can use the unicode U+30C.

˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅˅
˅˅˅ Hǝɹǝ,s ɐ ɯɐʇɔɥᴉuƃ sǝʇ˙ ˅˅˅
˄˄˄ Here's a matching set. ˄˄˄
˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄˄
"Actual size": ˅˄˅˄
 
(more info)
Edit: Another Option...
⋁⋁⋁⋁⋁⋁⋁⋁⋁⋁ Unicode #8897 / U+22C1 ("n-ary logical or")
⋀⋀⋀⋀⋀⋀⋀⋀⋀⋀ Unicode #8896 / U+22C0 ("n-ary logical and")
"Actual size": ⋁⋀⋁⋀

A powerful option – and one which also boosts creativity – is designing your own characters using box drawing characters.
Want a down pointing "caret"? Here's one: ╲╱
I've recently discovered them — and I take great pleasure at using such custom designed characters for labeling things all around :) .

You might be able to use the black triangles, Unicode values U+25b2 and U+25bc. Or the arrows, U+2191 and U+2193.
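Since the question's sorting is done client side, the triangles could be attached to a column header with a helper along these lines. The names sortIndicator and headerLabel and the "asc"/"desc" direction values are assumptions for the sketch, not part of the asker's existing script.

```javascript
// U+25B2 BLACK UP-POINTING TRIANGLE and U+25BC BLACK DOWN-POINTING TRIANGLE.
const UP_TRIANGLE = "\u25B2";
const DOWN_TRIANGLE = "\u25BC";

// Map a sort direction to its glyph; unknown values mean "not sorted".
function sortIndicator(direction) {
  if (direction === "asc") return UP_TRIANGLE;
  if (direction === "desc") return DOWN_TRIANGLE;
  return "";
}

// Build the text for a table header cell, e.g. for th.textContent.
function headerLabel(title, direction) {
  const glyph = sortIndicator(direction);
  return glyph ? title + " " + glyph : title;
}

console.log(headerLabel("Name", "asc"));
console.log(headerLabel("Date", "desc"));
```

Setting the result through textContent rather than innerHTML avoids needing the HTML entity forms at all.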

c# code
int i = 0;
char c = '↑';
i = (int)c;
Console.WriteLine(i); // prints 8593
int j = 0;
char d = '↓';
j = (int)d;
Console.WriteLine(j); // prints 8595

You might consider using Font Awesome instead of using the unicode or other icons
The code can be as simple as (a) including font-awesome e.g. <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> (b) making a button such as <button><i class="fa fa-arrow-down"></i></button>

I'd use a couple of tiny images. Would look better too.
Alternatively, you can try the Character Map utility that comes with Windows or try looking here.
Another solution I've seen is to use the Wingdings font for symbols. That has a lot of arrows.

I did subscript capital & bolded V. It works perfectly (although it takes some effort, if it needs to be done repetitively)
Syntax:
<sub><strong>v</strong></sub>
Output:
v

U+2304 DOWN ARROWHEAD, in HTML as &#x2304; (⌄)

The ^ (Caret - or Ascii Circumflex), produced by pressing shift + 6, does not appear to have an Ascii opposite, namely an Ascii Inverted Circumflex.
But for your alternative character pairing that also have keyboard combinations, you could use:
ˆ (Circumflex) shift + alt + i and
ˇ (Caron) shift + alt + t
Source: fileformat.info

There is no upside down caret character, but you can easily rotate the caret with CSS. This is a simple solution that looks perfect. Press 'Run code snippet' to see it in action:
.upsidedown {
-webkit-transform:rotate(180deg);
-o-transform:rotate(180deg);
-ms-transform:rotate(180deg);
transform:rotate(180deg);
}
.upsidedown.caret {
display: inline-block;
position:relative;
bottom: 0.15em;
}
more items <span class="upsidedown caret">^</span>
Please note the following...
I did a little correction for the positioning of the caret, as it is normally high (thus low in the rotated version). You want to move it a little up. This 'little' is relative to the font-size, hence the 'em'. Depending on your font choice, you might want to fiddle with this to make it look good.
This solution does not work in IE8. You should use a filter if you want IE8 support. IE8 support is not really required nor common in 2018.
If you want to use this in combination with Twitter Bootstrap, please rename the class 'caret' to something else, like 'caret_down' (as it collides with a class name from Twitter Bootstrap).

So I wanted the caret exactly as in OWA, so I downloaded office365icons.woff from
https://owa.example.com/owa/prem/15.1.1913.10/resources/styles/fonts/office365icons.woff (have to be logged in to do it, so did it through browser) and then, copying the boiled-down style from the website:
@font-face {
font-family: 'Office365Icons';
src: url('/fonts/office365icons.woff') format('woff');
font-weight: normal;
font-style: normal;
}
span.o-icon {
font-family: 'Office365Icons';
font-size: 14pt;
line-height: 21px;
color: #666;
}
And finally:
<span class="o-icon"></span>

Could you just draw an SVG path inside of a span by setting innerHTML from JavaScript? The span isn't required for the SVG to work; it just ensures that the SVG remains inline with whatever text the caret is next to. I used margin-bottom to vertically center it with the text; there might be another way to do that though. This is what I did on my blog's side nav (minus the JS). If you don't have text next to it you wouldn't need the span or the margin-bottom offset.
<div id="ID"></div>
<script type="text/javascript">
var x = document.getElementById('ID');
// your "margin-bottom" is the negative of 1/2 of the font size (in this example the font size is 16px)
// change the "stroke=" to whatever color your font is too
x.innerHTML = '<span><svg style="margin-bottom: -8px; height: 30px; width: 25px;" viewBox="0,0,100,50"><path fill="transparent" stroke-width="4" stroke="black" d="M20 10 L50 40 L80 10"/></svg></span>';
</script>

If you are needing font-awesome for React Apps then React Icons is a very good resource and very easy to implement. It includes a lot more libraries than just font-awesome.
