I'm dumping the schema from the table into a Tag Annotation for the package.
<Annotation AnnotationType="Tag" Tag="PackageSchema">
<#=Table.Schema#>
</Annotation>
In the BIML for creating a master package I'm creating a sequence container for each schema and putting the packages in the corresponding container. At least that's what I'm asking it to do.
<Package Name="01-Master" ConstraintMode="Linear">
<Tasks>
<# foreach (var SchemaNode in RootNode.Schemas) { #>
<Container Name="SEQC <#=SchemaNode.Name#>" ConstraintMode = "Parallel">
<Tasks>
<# foreach (var Pckg in RootNode.Packages.Where(pkgschema => pkgschema.GetTag("PackageSchema")==SchemaNode.Name)) { #>
<ExecutePackage Name="EP <#=Pckg.Name#>" DelayValidation="true">
<ExternalProjectPackage Package="<#=Pckg.Name#>.dtsx">
</ExternalProjectPackage>
</ExecutePackage>
<# } #>
</Tasks>
</Container>
<# } #>
</Tasks>
When that runs I get a Master package with empty sequence containers. I took the where out of the package foreach, and it generates but puts all packages in every container. I put the GetTag in the name of the package just to make sure it picked it up correctly.
<# foreach (var Pckg in RootNode.Packages) { #>
<ExecutePackage Name="EP <#=Pckg.Name#>" DelayValidation="true">
<ExternalProjectPackage Package="<#=Pckg.Name#>.dtsx--<#=Pckg.GetTag("PackageSchema")#>--<#=SchemaNode.Name#>">
The tag was put into the package name but it is padded with lots of space around it.
<ExecutePackage Name="EP Application_TransactionTypes" DelayValidation="true">
<ExternalProjectPackage Package="Application_TransactionTypes.dtsx-- Application --Application" />
</ExecutePackage>
<ExecutePackage Name="EP Purchasing_PurchaseOrderLines" DelayValidation="true">
<ExternalProjectPackage Package="Purchasing_PurchaseOrderLines.dtsx-- Purchasing --Application" />
</ExecutePackage>
So I'm guessing that the padded value is why the RootNode.Packages.Where is not matching up to the schema name. I can't figure out how to trim off the spaces though. I tried putting a trim() in different places but the BIML engine complains about it. I was able to get rid of the leading spaces by taking the tabs out in front of the actual annotation in the BIML but it still pads the end.
Any ideas on why the tag is getting padded, or maybe I'm completely off base here and it's not the spaces around the tag.
This is one of those "rock and a hard place" situations. In a much earlier version, we actually automatically trimmed annotation tag values to remove the leading and trailing whitespace. This caused issues for users in scenarios where they really needed that whitespace.
There are a few workarounds for this:
As Bill pointed out, just delete the whitespace in your BimlScript.
If you really want the whitespace in the BimlScript for readability, wrap the value in a CDATA block so that the newlines outside of the CDATA block are ignored. The syntax for this would be:
<Annotation AnnotationType="Tag" Tag="PackageSchema">
<![CDATA[<#=Table.Schema#>]]>
</Annotation>
Alternatively, if you want to keep the whitespace for readability and don't like CDATA, you can trim the whitespace from the annotation value at the point of use. The syntax for this would be:
<#=Pckg.GetTag("PackageSchema").Trim()#>
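Why the comparison fails can be reproduced outside Biml. The following is a small Python sketch, not BimlScript; the `annotation_xml` string and the variable names are made up for illustration. It shows that an XML text node keeps the newlines and indentation around the value, and that trimming at the point of use makes the equality test succeed.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, laid out across several lines as in the question.
annotation_xml = """<Annotation AnnotationType="Tag" Tag="PackageSchema">
    Application
</Annotation>"""

raw_tag = ET.fromstring(annotation_xml).text
schema_name = "Application"

print(repr(raw_tag))                   # the value still carries newlines and spaces
print(raw_tag == schema_name)          # False: the padded value never matches
print(raw_tag.strip() == schema_name)  # True: trimmed at the point of use
```

In the BimlScript from the question, the equivalent would be to compare `pkgschema.GetTag("PackageSchema").Trim()` against `SchemaNode.Name` inside the `Where` predicate.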
This is one of the ugly places where Biml/XML formatting is biting you in the backside:
<Annotation AnnotationType="Tag" Tag="PackageSchema">
<#=Table.Schema#>
</Annotation>
If you change that definition to the following, does everything "magically" work?
<Annotation AnnotationType="Tag" Tag="PackageSchema"><#=Table.Schema#></Annotation>
I assume so because I ran into a similar issue with package parameters...
Background:
I am trying to implement a simple (?) markup language to be used to write novels.
This is quite different from the usual markup languages because its semantics are centered on different primitives; in particular, direct speech is nothing like a list.
The basic structure is well-known: #part{title}, #chapter{title} and #scene[{title}] have the usual meanings and double-\n indicates a paragraph break.
Specific features include:
#speech[speaker]{utterance, possibly complex}
#stress{something that should be visually enhanced}
#standout{some part that should have a different visual enhancement}
#quotation[original author]{possibly long block quotation}
This should be parsed and translated to different output formats (e.g.: html and LaTeX).
I have a pyparsing grammar able to parse a non-trivial input.
The problem is generating paragraphs for HTML:
As said, a paragraph ends with a double newline, but it essentially starts at the end of the previous paragraph unless some top-level construct (e.g. #chapter) intervenes to break the sequence.
My first naive attempt was to accumulate text fragments in a global buffer and to emit them at selected points; this would logically work, but it seems pyparsing calls its parse actions multiple times, so my global buffer ends up holding the same fragment duplicated.
I have not found a way either to avoid such duplication or to mark the "start of paragraph" in such a way that I can come back to it later to generate the well-known <p>Long line, maybe containing #speech{possibly nested with #standout{!} and other constructs}</p> (of course #standout should map to <b>!</b> and #speech to some specific <div class="speech"></div>).
What is the "best practice" for handling this kind of problem?
Note: LaTeX code generation is much less problematic because paragraphs are simply terminated (like in the markup) either with a blank line or with \par.
Is it possible for you to recast this not as a "come back to the beginning later" problem but as a "read ahead as far as I need to get the whole thing" problem?
I think nestedExpr might be a way for you to read ahead to the next full markup, and then have a parse action re-parse the contents in order to process any nested markup directives. nestedExpr returns its parsed input as a nested list, but to get everything as a flattened string, wrap it in originalTextFor.
Here is a rework of the simpleWiki.py example from the pyparsing examples:
import pyparsing as pp

wiki_markup = pp.Forward()

# a method that will construct and return a parse action that will
# do the proper wrapping in opening and closing HTML, and recursively call
# wiki_markup.transformString on the markup body text
def convert_markup_to_html(opening, closing):
    def conversionParseAction(s, l, t):
        # t[1] is the brace-delimited body; strip the outer braces before re-parsing
        return opening + wiki_markup.transformString(t[1][1:-1]) + closing
    return conversionParseAction

# use a nestedExpr with originalTextFor to parse nested braces, but return the
# parsed text as a single string containing the outermost nested braces instead
# of a nested list of parsed tokens
markup_body = pp.originalTextFor(pp.nestedExpr('{', '}'))
italicized = ('ital' + markup_body).setParseAction(convert_markup_to_html("<I>", "</I>"))
bolded = ('bold' + markup_body).setParseAction(convert_markup_to_html("<B>", "</B>"))

# another markup and parse action to parse links - again using transform string
# to recursively parse any markup in the link text
def convert_link_to_html(s, l, t):
    t['link_text'] = wiki_markup.transformString(t['link_text'])
    return '<A href="{url}">{link_text}</A>'.format_map(t)

urlRef = ('link'
          + '{' + pp.SkipTo('->')('link_text') + '->' + pp.SkipTo('}')('url') + '}'
          ).setParseAction(convert_link_to_html)

# now inject all the markup bits as possible markup expressions
wiki_markup <<= urlRef | italicized | bolded

Try it out!

wiki_input = """
Here is a simple Wiki input:
ital{This is in italics}.
bold{This is in bold}!
bold{This is in ital{bold italics}! But this is just bold.}
Here's a URL to link{Pyparsing's bold{Wiki Page}!->https://github.com/pyparsing/pyparsing/wiki}
"""
print(wiki_markup.transformString(wiki_input))

Prints:

Here is a simple Wiki input:
<I>This is in italics</I>.
<B>This is in bold</B>!
<B>This is in <I>bold italics</I>! But this is just bold.</B>
Here's a URL to <A href="https://github.com/pyparsing/pyparsing/wiki">Pyparsing's <B>Wiki Page</B>!</A>
Given your markup examples, I think this approach may get you further along.
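For the paragraph problem itself, the same "read ahead" idea can be applied one level up: split the input into blank-line-separated blocks first, then convert each block as a whole. The following is only a sketch using the standard library. The function name `paragraphs_to_html`, the `convert_inline` hook, and the mapping of #part/#chapter/#scene to h1/h2/h3 are assumptions made for illustration, and it assumes a top-level directive sits alone in its block with a {title}.

```python
import re

# Assumed heading levels for the top-level directives (not from the question).
HEADING_LEVELS = {"part": 1, "chapter": 2, "scene": 3}
DIRECTIVE_RE = re.compile(r"#(part|chapter|scene)\{(.*)\}")

def paragraphs_to_html(text, convert_inline=lambda s: s):
    """Wrap blank-line-separated blocks in <p>; top-level directives become headings."""
    out = []
    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        if not block:
            continue
        m = DIRECTIVE_RE.fullmatch(block)
        if m:
            level = HEADING_LEVELS[m.group(1)]
            out.append(f"<h{level}>{m.group(2)}</h{level}>")
        else:
            # convert_inline is a hook for the nested-markup conversion,
            # e.g. a transformString call on a pyparsing expression.
            out.append("<p>" + convert_inline(block) + "</p>")
    return "\n".join(out)

print(paragraphs_to_html("#chapter{One}\n\nFirst line\nstill first.\n\nSecond para."))
```

Because each paragraph is complete before any inline conversion runs, there is no global buffer to keep in sync with the parse actions.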
I have an HTML Table of Contents page containing a list of book chapters with hyperlinks:
Multimedia Implementation<br/>
Table of Contents<br/>
About the Author<br/>
About the Technical Reviewers<br/>
Acknowledgments<br/>
Part I: Introduction and Overview<br/>
Chapter 1. Technical Overview<br/>
...
I want to create an NCX file for a Kindle book, which must contain entries as follows:
<navPoint id="n1" playOrder="1">
<navLabel>
<text>Multimedia Implementation</text>
</navLabel>
<content src="final/main.html"/>
</navPoint>
<navPoint id="n2" playOrder="2">
<navLabel>
<text>Table of Contents</text>
</navLabel>
<content src="final/toc.html"/>
</navPoint>
<navPoint id="n3" playOrder="3">
<navLabel>
<text>About the Author</text>
</navLabel>
<content src="final/pref01.html"/>
</navPoint>
...
I'm using Notepad++: is it possible to automate this process with a regular expression?
You cannot do everything using regex, but you can split the problem into two parts:
generate strings like <navPoint id="n1" playOrder="1"> using program logic (an incrementing variable)
do the rest with regex
Use the following regex to match:
<a\shref="([^"]*)">([^<]*)<\/a><br\/>
And replace with:
(generated string)\n<navLabel>\n<text>\2</text>\n</navLabel>\n<content src="\1"/>\n</navPoint>
Yes, it is possible to replace the links with <navPoint> tags. The only thing I found no solution for is the incremental numbering of the <navPoint> attributes id and playOrder...
The following regex will do most of the work:
/^<a[^>]*href="([^"]+)"[^>]*>([^<]+).*$/gm
substitute with:
<navPoint id="n" playOrder="">\n<navLabel><text>$2</text></navLabel>\n<content src="$1" />\n</navPoint>\n
Regex details
/^<a .. only parse lines that start with an `<a` tag
[^>]*href=" .. find the first occurrence of `href="` inside the tag
([^"]+) .. capture the text and stop when a " is found
"[^>]*> .. find the end of the <a> tag
([^<]+) .. capture the text and stop when a < is found (i.e. the </a> tag)
.*$/ .. continue to end of the line
gm .. search the whole string and parse each line individually
A more detailed (but also more confusing) explanation is here:
https://regex101.com/r/gA0yJ2/1
This link also demonstrates how the regex is working. You can test changes there if you like
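The incremental numbering that a plain search-and-replace cannot provide is easy to add once the substitution runs in a script instead of the editor. The following is a minimal Python sketch of that "program logic plus regex" split. It assumes each TOC entry has the form <a href="...">title</a><br/>, and the function name `toc_to_navpoints` is made up for illustration.

```python
import re
from itertools import count

# One TOC entry: <a href="...">title</a><br/>
LINK_RE = re.compile(r'<a\s+href="([^"]*)"[^>]*>([^<]*)</a>\s*<br\s*/?>', re.IGNORECASE)

def toc_to_navpoints(toc_html):
    """Replace every TOC link with a numbered NCX navPoint element."""
    numbers = count(1)  # one counter feeds both id and playOrder

    def build(match):
        n = next(numbers)
        src, title = match.group(1), match.group(2).strip()
        return (
            f'<navPoint id="n{n}" playOrder="{n}">\n'
            f'<navLabel>\n<text>{title}</text>\n</navLabel>\n'
            f'<content src="{src}"/>\n'
            f'</navPoint>'
        )

    # re.sub calls build() once per match, in document order.
    return LINK_RE.sub(build, toc_html)

toc = (
    '<a href="final/main.html">Multimedia Implementation</a><br/>\n'
    '<a href="final/toc.html">Table of Contents</a><br/>'
)
print(toc_to_navpoints(toc))
```

The title is copied through unchanged, on the assumption that text taken from an HTML page is already escaped well enough for the NCX file.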
Something that's really bothered me about XHTML, and XML in general is why it's necessary to indicate what tag you're closing. Things like <b>bold <i>bold and italic</b> just italic</i> aren't legal anyways. Thus I think using {} makes more sense. Anyway, here's what I came up with:
doctype;
html
{
    head
    {
        title "my webpage"
        javascript '''
            // code here
            // single quotes do not allow variable substitution, like PHP
            // triple quotes can be used like Python
        '''
    }
    body
    {
        table {
            tr {
                td "cell 1"
                td "cell 2"
                td #var|filter1|filter2:arg
            }
        }
        p "variable #var in a string"
        p "variable #{var|withfilter}"
        input(type=password, value=secret); // attributes are specified like this
        br; // semi-colons are used on elements that don't have content
        p { "strings are" "automatically" "concatenated together" #andvars "too" }
    }
}
Tags that only contain one element do not need to be enclosed in braces (for example, in td "cell 1" the td is closed immediately after the text). Strings are output directly, except that double-quoted strings allow variable substitution and single-quoted strings do not. I'm adopting a filtering scheme similar to Django's. The thing I'm most concerned about, I think, is variable substitution in double quotes. I don't want people to have to switch to single quotes everywhere because things that shouldn't be are being treated as variables. I don't think the # character is very commonly used in code. I was going to use $ like PHP, but jQuery uses that, and I want to allow people to do substitutions in their JS too (of course, if they don't need to, they should use single quotes!).
Templates will use "dictionaries". By default, it uses this HTML dict, with familiar tags, but you can easily add your own. "Tags" may consist of not just one, but multiple HTML tags.
Still need to decide how to do loops and including partials...
Edit: Started an open source project, for those interested.
I believe you can get close to that with the syntax of the Tcl scripting language.
The thing I like the most about your idea is the removal of the (to me very) redundant information in the closing tags of SGML-derived markup.
Another clean option IMO is to go down the road of using indentation to specify scope, eliminating braces altogether. With the assumption of a little editor support, I can imagine this happening.
I think it's possibly stifling that globally used specifications cater to the theoretical person using VI or Notepad to type out their markup...
I have some fixed strings inside my strings.xml, something like:
<resources>
<string name="somestring">
<B>Title</B><BR/>
Content
</string>
</resources>
and in my layout I've got a TextView which I'd like to fill with the html-formatted string.
<TextView android:id="@+id/formattedtext"
    android:layout_width="fill_parent"
    android:layout_height="wrap_content"
    android:text="@string/somestring"/>
if I do this, the content of formattedtext is just the content of somestring stripped of any html tags and thus unformatted.
I know that it is possible to set the formatted text programmatically with
.setText(Html.fromHtml(somestring));
because I use this in other parts of my program where it is working as expected.
To call this function I need an Activity, but at the moment my layout is just a simple more or less static view in plain XML and I'd prefer to leave it that way, to save me from the overhead of creating an Activity just to set some text.
Am I overlooking something obvious? Is it not possible at all? Any help or workarounds welcome!
Edit: Just tried some things and it seems that HTML formatting in XML has some constraints:
tags must be written lowercase
some tags which are mentioned here do not work, e.g. <br/> (it's possible to use \n instead)
Just in case anybody finds this, there's a nicer alternative that's not documented (I tripped over it after searching for hours, and finally found it in the bug list for the Android SDK itself). You CAN include raw HTML in strings.xml, as long as you wrap it in
<![CDATA[ ...raw html... ]]>
Edge Cases:
Characters like apostrophe ('), double-quote ("), and ampersand (&) only need to be escaped if you want them to appear in the rendered text AS themselves, but they COULD be plausibly interpreted as HTML.
' and " can be represented as \' and \", or &apos; and &quot;.
< and > always need to be escaped as &lt; and &gt; if you literally want them to render as '<' and '>' in the text.
Ampersand (&) is a little more complicated.
Ampersand followed by whitespace renders as ampersand.
Ampersand followed by one or more characters that don't form a valid HTML entity code renders as ampersand followed by those characters. So... &qqq; renders as &qqq;, but &lt;1 renders as <1.
Example:
<string name="nice_html">
<![CDATA[
<p>This is a html-formatted \"string\" with <b>bold</b> and <i>italic</i> text</p>
<p>This is another paragraph from the same \'string\'.</p>
<p>To be clear, 0 &lt; 1, & 10 &gt; 1</p>
]]>
</string>
Then, in your code:
TextView foo = (TextView)findViewById(R.id.foo);
foo.setText(Html.fromHtml(getString(R.string.nice_html), Html.FROM_HTML_MODE_LEGACY));
IMHO, this is several orders of magnitude nicer to work with :-)
August 2021 update: My original answer used Html.fromHtml(String), which was deprecated in API 24. The alternative fromHtml(String,int) form is suggested as its replacement.
FROM_HTML_MODE_LEGACY is likely to work... but one of the other flags might be a better choice for what you want to do.
On a final note, if you'd prefer to render Android Spanned text suitable for use in a TextView using Markdown syntax instead of HTML, there are now multiple third-party libraries that make it easy, including https://noties.io/Markwon.
As the top answer here is suggesting something wrong (or at least too complicated), I feel this should be updated, although the question is quite old:
When using String resources in Android, you just have to call getString(...) from Java code or use android:text="@string/..." in your layout XML.
Even if you want to use HTML markup in your Strings, you don't have to change a lot:
The only characters that you need to escape in your String resources are:
double quotation mark: " becomes \"
single quotation mark: ' becomes \'
ampersand: & becomes &#38; or &amp;
That means you can add your HTML markup without escaping the tags:
<string name="my_string"><b>Hello World!</b> This is an example.</string>
However, to be sure, you should only use <b>, <i> and <u> as they are listed in the documentation.
If you want to use your HTML strings from XML, just keep on using android:text="@string/..."; it will work fine.
The only difference is that, if you want to use your HTML strings from Java code, you have to use getText(...) instead of getString(...) now, as the former keeps the style and the latter will just strip it off.
It's as easy as that. No CDATA, no Html.fromHtml(...).
You will only need Html.fromHtml(...) if you did encode your special characters in HTML markup. Use it with getString(...) then. This can be necessary if you want to pass the String to String.format(...).
This is all described in the docs as well.
Edit:
There is no difference between getText(...) with unescaped HTML (as I've proposed) or CDATA sections and Html.fromHtml(...).
See the following graphic for a comparison:
Escape your HTML tags ...
<resources>
<string name="somestring">
&lt;B&gt;Title&lt;/B&gt;&lt;BR/&gt;
Content
</string>
</resources>
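Escaping by hand gets tedious for longer snippets, so it can be scripted ahead of time. The following is a hedged Python sketch of a build-time helper, not part of the Android SDK. The function name `to_string_resource` is made up, and it assumes quotes should be backslash-escaped in the way the other answers here describe.

```python
from html import escape

def to_string_resource(name, html_markup):
    """Return a <string> element whose body is entity-escaped HTML."""
    # quote=False escapes only &, < and >; quotes get a backslash instead.
    body = escape(html_markup, quote=False)
    body = body.replace("'", "\\'").replace('"', '\\"')
    return f'<string name="{name}">{body}</string>'

print(to_string_resource("somestring", "<B>Title</B><BR/>Content"))
```

A string escaped this way reaches the app as literal markup text, so it still has to go through Html.fromHtml(getString(...)) at runtime.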
Android does not have a specification to indicate the type of resource string (e.g. text/plain or text/html). There is a workaround, however, that will allow the developer to specify this within the XML file.
Define a custom attribute to specify that the android:text attribute is html.
Use a subclassed TextView.
Once you define these, you can express yourself with HTML in xml files without ever having to call setText(Html.fromHtml(...)) again. I'm rather surprised that this approach is not part of the API.
This solution works to the degree that the Android studio simulator will display the text as rendered HTML.
res/values/strings.xml (the string resource as HTML)
<resources>
<string name="app_name">TextViewEx</string>
<string name="string_with_html"><![CDATA[
<em>Hello</em> <strong>World</strong>!
]]></string>
</resources>
layout.xml (only the relevant parts)
Declare the custom attribute namespace, and add the android_ex:isHtml attribute. Also use the subclass of TextView.
<RelativeLayout
...
xmlns:android_ex="http://schemas.android.com/apk/res-auto"
...>
<tv.twelvetone.samples.textviewex.TextViewEx
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:text="@string/string_with_html"
android_ex:isHtml="true"
/>
</RelativeLayout>
res/values/attrs.xml (define the custom attributes for the subclass)
<resources>
<declare-styleable name="TextViewEx">
<attr name="isHtml" format="boolean"/>
<attr name="android:text" />
</declare-styleable>
</resources>
TextViewEx.java (the subclass of TextView)
package tv.twelvetone.samples.textviewex;

import android.content.Context;
import android.content.res.TypedArray;
import android.support.annotation.Nullable;
import android.text.Html;
import android.util.AttributeSet;
import android.widget.TextView;

public class TextViewEx extends TextView {

    public TextViewEx(Context context, @Nullable AttributeSet attrs) {
        super(context, attrs);
        TypedArray a = context.obtainStyledAttributes(attrs, R.styleable.TextViewEx, 0, 0);
        try {
            // Only re-interpret the text when the layout opts in via isHtml="true".
            boolean isHtml = a.getBoolean(R.styleable.TextViewEx_isHtml, false);
            if (isHtml) {
                String text = a.getString(R.styleable.TextViewEx_android_text);
                if (text != null) {
                    setText(Html.fromHtml(text));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            a.recycle();
        }
    }
}
Latest update:
Html.fromHtml(String) is deprecated as of Android N (API 24).
The following code supports both Android N and above and older versions:
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
    textView.setText(Html.fromHtml(yourHtmlString, Html.FROM_HTML_MODE_LEGACY));
} else {
    textView.setText(Html.fromHtml(yourHtmlString));
}
String termsOfCondition="<font color=#cc0029>Terms of Use </font>";
String comma="<font color=#000000>, </font>";
String privacyPolicy="<font color=#cc0029>Privacy Policy </font>";
Spanned text=Html.fromHtml("I am of legal age and I have read, understood, agreed and accepted the "+termsOfCondition+comma+privacyPolicy);
secondCheckBox.setText(text);
I have another case, where I have no chance to put CDATA into the XML because I receive the HTML string from a server.
Here is what I get from a server:
<p>The quick brown <br />
fox jumps <br />
over the lazy dog<br />
</p>
It seems more complicated, but the solution is much simpler.
private TextView textView;
protected void onCreate(Bundle savedInstanceState) {
    .....
    textView = (TextView) findViewById(R.id.text); // needs to be defined in your layout
    String htmlFromServer = getHTMLContentFromAServer();
    // Pass the Spanned result directly; calling toString() on it would drop the formatting.
    textView.setText(Html.fromHtml(htmlFromServer));
}
Hope it helps!
Linh
If you want to show HTML in an Android view such as a TextView, use the following code.
Kotlin
val stringvalue = "Your String"
yourTextView.text = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
    Html.fromHtml(stringvalue, Html.FROM_HTML_MODE_COMPACT)
} else {
    Html.fromHtml(stringvalue)
}
Java
String stringvalue = "Your String";
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
    yourTextView.setText(Html.fromHtml(stringvalue, Html.FROM_HTML_MODE_COMPACT));
} else {
    yourTextView.setText(Html.fromHtml(stringvalue));
}
I have a problem creating a regular expression for the following task:
Suppose we have HTML-like text of the kind:
<x>...<y>a</y>...<y>b</y>...</x>
I want to get a collection of values inside <y></y> tags located inside a given <x> tag, so the result of the above example would be a collection of two elements ["a","b"].
Additionally, we know that:
<y> tags cannot be enclosed in other <y> tags
... can include any text or other tags.
How can I achieve this with RegExp?
This is a job for an HTML/XML parser. You could do it with regular expressions, but it would be very messy. There are examples in the page I linked to.
I'm taking your word on this:
"y" tags cannot be enclosed in other "y" tags
input looks like: <x>...<y>a</y>...<y>b</y>...</x>
and the fact that everything else is also not nested and correctly formatted. (Disclaimer: If it is not, it's not my fault.)
First, find the contents of any X tags with a loop over the matches of this:
<x[^>]*>(.*?)</x>
Then (in the loop body) find any Y tags within match group 1 of the "outer" match from above:
<y[^>]*>(.*?)</y>
Pseudo-code:
input = "<x>...<y>a</y>...<y>b</y>...</x>"
x_re = "<x[^>]*>(.*?)</x>"
y_re = "<y[^>]*>(.*?)</y>"
for each x_match in input.match_all(x_re)
    for each y_match in x_match.group(1).value.match_all(y_re)
        print y_match.group(1).value
    next y_match
next x_match
Pseudo-output:
a
b
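The pseudo-code above translates almost line for line into a real language. Here it is as a Python sketch, under the same assumptions as the answer (no nested <x> or <y> elements, reasonably well-formed input); the function name `y_values_inside_x` is made up for illustration.

```python
import re

# DOTALL lets the contents span several lines.
X_RE = re.compile(r"<x[^>]*>(.*?)</x>", re.DOTALL)
Y_RE = re.compile(r"<y[^>]*>(.*?)</y>", re.DOTALL)

def y_values_inside_x(text):
    """Collect the contents of every <y> element found inside an <x> element."""
    values = []
    for x_match in X_RE.finditer(text):
        # Second pass: search only within the body of this <x> element.
        values.extend(Y_RE.findall(x_match.group(1)))
    return values

print(y_values_inside_x("<y>c</y>...<x>...<y>a</y>...<y>b</y>...</x>...<y>d</y>"))
# prints ['a', 'b']
```

The <y> elements outside the <x> element (c and d) are skipped because the inner pattern only ever sees text captured by the outer one.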
Further clarification in the comments revealed that there is an arbitrary amount of Y elements within any X element. This means there can be no single regex that matches them and extracts their contents.
Short and simple: Use XPath :)
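As an illustration of the XPath route, here is a sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset. It assumes the input is well-formed XML, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed input with <y> elements both inside and outside <x>.
doc = ET.fromstring(
    "<root><y>c</y><x>text<y>a</y><z><y>b</y></z></x><y>d</y></root>"
)

# ".//x//y" selects every <y> at any depth below any <x>.
values = [y.text for y in doc.findall(".//x//y")]
print(values)  # prints ['a', 'b']
```

Unlike the regex approach, this copes with <y> elements nested inside other tags within <x>, but it fails outright on markup that is not well-formed.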
It would help if we knew what language or tool you're using; there's a great deal of variation in syntax, semantics, and capabilities. Here's one way to do it in Java:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "<y>c</y>...<x>...<y>a</y>...<y>b</y>...</x>...<y>d</y>";
String regex = "<y[^>]*+>(?=(?:[^<]++|<(?!/?+x\\b))*+</x>)(.*?)</y>";
Matcher m = Pattern.compile(regex).matcher(str);
while (m.find())
{
    System.out.println(m.group(1));
}
Once I've matched a <y>, I use a lookahead to affirm that there's a </x> somewhere up ahead, but there's no <x> between the current position and it. Assuming the pseudo-HTML is reasonably well-formed, that means the current match position is inside an "x" element.
I used possessive quantifiers heavily because they make things like this so much easier, but as you can see, the regex is still a bit of a monster. Aside from Java, the only regex flavors I know of that support possessive quantifiers are PHP and the JGS tools (RegexBuddy/PowerGrep/EditPad Pro). On the other hand, many languages provide a way to get all of the matches at once, but in Java I had to code my own loop for that.
So it is possible to do this job with one regex, but a very complicated one, and both the regex and the enclosing code have to be tailored to the language you're working in.