Can I automatically generate a Table of Contents (.hhc) using hhc.exe or other commandline tool? - chm

The HTML Help Workshop gui (hhw.exe) has an option 'Automatically create contents file (.hhc) when compiling'. Is there a way to get the same behavior from a commandline tool (e.g., hhc.exe)?

Afaik the cmdline tool still uses the .hhp?
From
http://www.nongnu.org/chmspec/latest/INI.html#HHP
put in your .hhp:
Auto TOC=number
This uses the heading tags in your HTML files to generate the contents. The number is the maximum heading level to place in the contents, e.g. 1 = <h1> only, 2 = <h1> and <h2>, and so on up to 9. WARNING: HHW modifies your HHC file if you specify this. IMO this is a bug: HHW should just slurp through a pipe or temporary file.
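To make the suggestion concrete, here is a minimal sketch in Python that writes out a project file containing the Auto TOC option. The function name, the file names and the choice of level 2 are illustrative assumptions. I have not verified that hhc.exe honors the option the same way the GUI does.

```python
def build_hhp(chm_name, hhc_name, html_files, auto_toc_level=2):
    """Return the text of a minimal .hhp project with Auto TOC enabled.

    All names passed in are placeholders; adjust them to your project.
    """
    lines = [
        "[OPTIONS]",
        f"Compiled file={chm_name}",
        f"Contents file={hhc_name}",
        # Maximum heading level to pull into the contents (assumed value).
        f"Auto TOC={auto_toc_level}",
        "",
        "[FILES]",
    ]
    lines.extend(html_files)
    # .hhp files are Windows INI-style text, so use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


hhp_text = build_hhp("manual.chm", "manual.hhc", ["index.html", "usage.html"])
print(hhp_text)
```

Saving that text as, say, manual.hhp and running hhc.exe manual.hhp would then show whether the contents file is generated on your setup.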


Sublime Text - find all instances of an html class name project-wide

I want to find all instances of a class named "validation" in all of my html files project wide. It's a very large project and a search for the word "validation" gives me hundreds of irrelevant results (js functions, css, js/css minified, other classes, functions and html page content containing the word validation, etc). It can sometimes be the second, third, or fourth class declared so searching for "class='validation" doesn't work.
Is there a way to specify that I only want results where validation is a class declared on an html block?
Yes. In the sublime menu go to Find --> Find in Files...
Then match what is in the following image.
First, consider other ways to solve this problem. It sounds like you are currently using only Sublime Text. Have you considered a command-line tool like grep?
Here is an example of how it could be used.
I have a project called enfold-child with a bunch of frontend assets for a WordPress project. Let's say I want to find all of my scss files with the class "home" listed in them somewhere, but I do NOT want to pull in built css files or anything in my node_modules folder. The way I would do that is as follows:
Folder structure:
..
|build
|scss_files
|node_modules
|css_files
|style.css
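grep -rnw build --exclude='*.css' --exclude-dir=node_modules -e home
grep = handy search utility.
-r = recursive search.
-n = provide line numbers for each match.
-w = select only those lines containing matches that form whole words.
--exclude='*.css' = skip files whose names end in .css (quoted so the shell does not expand the glob).
--exclude-dir=node_modules = skip the node_modules directory.
-e = the pattern to match against (a regular expression).
home = the expression I want to search for.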
In general, the command line has most anything one could want/need to do most of the nifty operations offered by most text-editors -- such as Sublime. Becoming familiar with the command line will save you a bunch of time and headaches in the future.
In SublimeText, right-click on the folder you want to start the search from and click on Find in Folder. Make sure regex search is enabled (the .* button in the search panel) and use this regex as the search string:
class="([^"]+ )?validation[ "]
That regex will handle cases where "validation" is the only classname as well as cases where it's one of several classnames (in which case it can be anywhere in the list).
If you didn't stick to double quotes, this version will work with single or double quotes:
class=['"]([^'"]+ )?validation[ '"]
If you want to use these regexes from the command line with grep, you'll need to include a -E argument for "extended regular expressions".
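To check what the regex does and does not match before running it across a large project, a quick sketch in Python can help. The sample lines below are made up for illustration; the pattern is the single-or-double-quote version above.

```python
import re

# Same pattern as the single-or-double-quote regex above.
CLASS_VALIDATION = re.compile(r"""class=['"]([^'"]+ )?validation[ '"]""")


def has_validation_class(html_line):
    """True if the line declares 'validation' as one of its classes."""
    return CLASS_VALIDATION.search(html_line) is not None


# Hypothetical sample lines, not taken from any real project.
samples = [
    '<div class="validation">',
    "<span class='error validation hidden'>",
    '<p class="validations">',
    "function validation() {}",
]
for sample in samples:
    print(has_validation_class(sample), sample)
```

The first two samples match. The last two do not, because "validations" is a different class name and the JavaScript line has no class attribute.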

Taking count in Rapidminer

How can I take a row count of a list that is in a Word document? If the same list is in Excel, I can take the count using the Aggregate operator, but with a Word document it does not work.
I recommend the answer from @awchisholm as it's the easiest solution. However, if you have several Word documents this might become impractical.
In this case you can use the Loop Zip Files operator to unzip the Word document and look inside for the file /word/document.xml. Using RapidMiner's text functions (or Read XML), look for each instance of <w:p ...>...</w:p>. Each one represents a new line, so you can count them from there.
There is also an XML document in the unzipped directory called /docProps/app.xml. You can read it to find some meta information about the document, such as the number of words, characters and pages. Unfortunately I've found it unreliable for the number of lines, which is why I recommend searching for the <w:p> tag.
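The same unzip-and-count idea can be sketched outside RapidMiner in a few lines of Python, which may be useful for sanity-checking the counts you get from the process. The function name is an illustrative assumption. Note that it counts every <w:p> element, including empty paragraphs and paragraphs inside tables.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace bound to the "w:" prefix in WordprocessingML documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_paragraphs(docx):
    """Count <w:p> elements in word/document.xml of a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as archive:
        with archive.open("word/document.xml") as xml_file:
            root = ET.parse(xml_file).getroot()
    return sum(1 for _ in root.iter(f"{{{W_NS}}}p"))
```

Calling count_paragraphs("list.docx") (a hypothetical file name) returns the number of paragraph elements, which for a simple one-item-per-line list is the row count.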
RapidMiner cannot easily read Word documents. You have to save the document as a text file and use the Read CSV operator to read the file.

PanDoc: How to assign level-one Atx-style header (markdown) to the contents of html title tag

I am using PanDoc to convert a large number of markdown (.md) files to html. I'm using the following Windows command-line:
for %%i in (*.md) do pandoc -f markdown -s %%~ni.md > html/%%~ni.html
On a test run, the html looks OK, except for the title tag - it's empty. Here is an example of the beginning of the .md file:
#Topic Title
- [Anchor 1](#anchor1)
- [Anchor 2](#anchor2)
<a name="anchor1"></a>
## Anchor 1
Is there a way I can tell PanDoc to parse the
#Topic Title
so that, in the html output files, I will get:
<title>Topic Title</title>
?
There are other .md tags I'd like to parse, and I think solving this will help me solve the rest of it.
I don't believe Pandoc supports this out-of-the-box. The relevant part of the Pandoc documentation states:
Templates may contain variables. Variable names are sequences of alphanumerics, -, and _, starting with a letter. A variable name surrounded by $ signs will be replaced by its value. For example, the string $title$ in
<title>$title$</title>
will be replaced by the document title.
It then continues:
Some variables are set automatically by pandoc. These vary somewhat depending on the output format, but include metadata fields (such as title, author, and date) as well as the following:
And proceeds to list a bunch of variables (none of which are relevant to your question). However, the above quote indicates that the title variable is a metadata field. The metadata field can be defined in a pandoc_title_block, a yaml_metadata_block, or passed in as a command line option.
The docs note that:
... you may also keep the metadata in a separate YAML file and pass it to pandoc as an argument, along with your markdown files ...
So you have a couple of options:
Edit each document to add metadata defining its title (this could possibly be scripted).
Write your script to extract the title (perhaps with a regex which looks for #header in the first line) and pass it to Pandoc as a command line option.
If you intend to start including the metadata in new documents you create going forward, then the first option is probably the way to go. Run a script once to batch edit your documents and then you're done. However, if you have no intention of adding metadata to any documents, I would consider the second option. You are already running a loop, so just get the title before calling Pandoc within your loop (although I'm not sure how to do that in a Windows script).
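Since I can't offer the Windows batch version, here is a rough sketch of the second option in Python instead. It assumes the title is the first level-one ATX header in each file and passes it with pandoc's --metadata option. The function names and the html output directory are illustrative, and the file name is used as a fallback when no header is found.

```python
import re
import subprocess
from pathlib import Path

# A single leading '#', not followed by another '#', then the header text.
H1_RE = re.compile(r"^#(?!#)\s*(.+?)\s*#*\s*$")


def extract_title(markdown_text):
    """Return the text of the first level-one ATX header, or None."""
    for line in markdown_text.splitlines():
        match = H1_RE.match(line)
        if match:
            return match.group(1)
    return None


def pandoc_command(md_path, title, out_dir="html"):
    """Build the pandoc argument list for one file (does not run it)."""
    md_path = Path(md_path)
    out_path = Path(out_dir) / (md_path.stem + ".html")
    return [
        "pandoc", "-f", "markdown", "-s",
        f"--metadata=title={title}",
        "-o", str(out_path), str(md_path),
    ]


def convert_all(directory=".", out_dir="html"):
    """Convert every .md file in `directory`; requires pandoc on the PATH."""
    Path(out_dir).mkdir(exist_ok=True)
    for md_file in sorted(Path(directory).glob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        title = extract_title(text) or md_file.stem
        subprocess.run(pandoc_command(md_file, title, out_dir), check=True)
```

Calling convert_all() from the folder that holds the .md files would replace the for loop in the question. Building the command as a list avoids quoting problems with titles that contain spaces.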

In Stata, how do I add variable labels from a separate csv file?

I have a set of csv files that are very simple to load into Stata using the -insheet- command. But they have very uninformative variable names. For each of these files, I also have a file of metadata consisting of two columns: the original (uninformative) variable names, and a description of what the variables actually mean. I'd like to use these metadata files to create variable labels, preferably without going through and typing up all the separate label commands or turning the metadata file into a dictionary for each file. It seems like there must be a quick way of loading the metadata file into Stata and looping through it to generate the label commands, but I don't know what it is. Any thoughts?
Ideally each line of the metadata is something like
varname1 "more interesting description"
in which case you can prefix each line with
label var
and then run the file as if it were a do-file using do. See the help for label. That is easy in a decent text editor, for example by searching for the start of each line and replacing it with label var followed by a space (note the need for the space).
What could bite here includes:
You don't have double quotes " " as delimiters, in which case you need to insert them.
The extra information does not qualify as a variable label because it is more than 80 characters long. See help limits.
There are other ways to do this with Stata. You could write a program to read in the metadata and write out a do-file using file, but if this were my problem I would reach first for my text editor. (Most experienced Stata programmers use something else as well as doedit.)

Natural ordering files in directory into a cell array using Octave

I have files being generated by another program/user that have names such as "jh-1.txt, jh-2.txt, ..., jh-100.txt, ..., jh-1024.txt". I'm extracting a column from these files, manipulating the data, and outputting to a new matrix. The only problem is that Octave is using ASCII ordering and not natural ordering when reading in the files. Thus, the output matrix is not ordered in a natural way. My question is, can Octave sort file names in a natural order? I'm getting file names in the standard method:
fileDirectory = '/path/to/directory';
filePattern = fullfile(fileDirectory, '*.txt'); % Selects only the txt files.
dataFiles = dir(filePattern); % Gets the info from the txt files in the directory.
baseFileName = {dataFiles.name}'; % Gets all the txt file names.
I can't rename the files because this is a script for another user. They are on a Windows machine and already have Octave installed with Cygwin, and I don't want to make them use the command line more than they have to because they are unfamiliar with it. Alternatively, it would be nice to have the output with the file names in a column, but I haven't figured that one out either (bit of a noob with Octave myself). That way the user could use Excel (which they are familiar with) to sort the columns.
I don't think there's a built-in natural sort in Octave. However, there is a natural sort submission on the MathWorks File Exchange. I've not used it, but the comments imply it works in Octave too.
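For reference, the idea behind a natural sort is small enough to sketch. The example below is in Python rather than Octave, purely to illustrate the technique: split each name into text and number chunks, then compare the number chunks as integers. The function name is illustrative.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so numbers compare numerically."""
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", name)
    ]


names = ["jh-100.txt", "jh-2.txt", "jh-1024.txt", "jh-1.txt"]
print(sorted(names, key=natural_key))
# → ['jh-1.txt', 'jh-2.txt', 'jh-100.txt', 'jh-1024.txt']
```

For file names with a fixed prefix like jh-N.txt, the same effect should be achievable in Octave by extracting the number from each name, sorting those numbers, and using the resulting index to reorder the cell array of names.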