Excluding files based on parent directory in RuboCop

I've been scratching my head a bit on how to exclude files based on the name of one of the parent directories. I'm not sure if this is something that's not possible or if I'm just missing some real obvious syntax here.
I have a cop that examines large repositories where I don't control the exact directory structure. I need to skip files in any directory named test, no matter where it occurs in the directory structure. For instance, my-repo/foo/bar/test/foo/bar/file.rb should not be examined, and neither should my-repo/test/another_file.rb. Is there any way to define an exclude that's basically '**/test/**/*.rb'?

Yes, your assumption is correct: a glob like that works. You can find an example in, e.g., the rubocop-rspec extension:
Include:
- "**/spec/**/*"
It would work the same way with Exclude.
It works for AllCops, and specific cops can be configured the same way.
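For example, a minimal .rubocop.yml along these lines should skip everything under any directory named test, either globally or for a single cop (the cop name below is chosen only for illustration):

AllCops:
  Exclude:
    - "**/test/**/*"

Metrics/MethodLength:
  Exclude:
    - "**/test/**/*.rb"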

How do I configure the qooxdoo generator to include a dynamically referenced class?

First, some context: I drive qooxdoo from other languages such as Lisp and ClojureScript, and I dynamically generate code to reference individual classes.
This normally fails because the qooxdoo generator looks through the static source to see which classes to include.
In the past I have just whomped explicit mentions of classes into Application.js. This works great, but recently I started to grok the config.json syntax and thought it would be nice to take a less kludgy approach.
I managed to add code like this to the "source-build" job and that build then worked:
"include" : ["qx.ui.mobile.page.Manager"]
But I use many classes in an app, so adding that to each job would be error-prone and still ugly.
I tried adding the "include" to the "mobile-common" job, which the other jobs extend, but to my surprise that did not work. Hmm... could there be a bug in the job "extend" logic?
I could just add "include" : ["qx.ui.mobile.*"] to all the jobs but that is still ugly and excessive (and I would have still to pull in multiple other classes in each job).
Looking back at all this, it seems there would be no problem if the job "extends" mechanism successfully picked up the "include" option. I just ran the generator with the verbose option -v and can confirm the page manager class is not included if I add the "include" to mobile-common, but it is if I do so on the specific job.
Am I missing something?
Kenny,
you're quite right to use the "mobile-common" job, and it is really strange that it doesn't work. As I don't know your exact config.json file, I can only offer some guesses here:
The default "mobile-common" job provided with the mobile skeleton already contains an "include" key. You did not by any chance add a second one to the job?!
Are you using the mobile config.json directly, or did you create another config file and are including the one that contains the default "mobile-common"? If you use job shadowing (i.e. define "mobile-common" in one config file but also in another which is included by the first), this will influence the content of the resulting job definition (maybe in an unexpected way).
The default "mobile-common" job has (for whatever reason) a = in front of the include key, to protect from overriding. You might want to remove that and see what happens.
If all else fails you can still create your own includer job (like "my-includes"), add an "include" key to it, and then add this job to the "extend" list of the relevant source* and build* jobs. Make sure to add it before the "mobile-common" entry. This way you can at least maintain your additional include patterns in a single place.
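A sketch of that includer-job idea; the job name "my-includes" is made up, and only qx.ui.mobile.page.Manager comes from the question:

"jobs" :
{
  "my-includes" :
  {
    "include" : [ "qx.ui.mobile.page.Manager" ]
  },

  "source" :
  {
    "extend" : [ "my-includes", "mobile-common" ]
  },

  "build" :
  {
    "extend" : [ "my-includes", "mobile-common" ]
  }
}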

Searching for the change history of a partial file name or path in Mercurial or TortoiseHg

Each time I need anything beyond the standard search, I find myself trying several things, searching Google, and in the end failing miserably. Apparently the Hg search syntax is pretty extensive, and I would like to use its power, but I can't seem to find a good reference for it.
For instance, quite often I want to find all changes in the repository related to a partial path match. I know that the following works:
file('path:full/path/file.txt')
But I would like to search for files by partial match, and none of the following worked:
jquery -- seems to find everything
file(jquery*) -- finds nothing
file('jquery*') -- finds nothing
file('path:jquery.*') -- finds nothing
file('name:jquery.*') -- finds nothing
file('path:jquery.js') -- finds every revision, it seems
From the popup in TortoiseHg I see that there are a gazillion options, but no hint on how to use them; the help link shows a little bit more, but nothing on what a pattern should look like in file(pattern).
In the end I usually find what I want using other ways of searching, but it would be so nice to be able to use this power of expression, and it's quite a shame that after so many years, I've never found out how to leverage this.
I can very much recommend the built-in hg help system for this. The most useful pages to look at (in my view):
hg help revsets
hg help filesets
hg help patterns
In the page about patterns, you can find this about 'path:':
To use a plain path name without any pattern matching, start it with
"path:". These path names must completely match starting at the current
repository root.
In other words: using 'path:' is not suitable for this purpose. Slightly below, 'glob:' is mentioned:
To use an extended glob, start a name with "glob:". Globs are rooted at
the current directory; a glob such as "*.c" will only match files in the
current directory ending with ".c".
The supported glob syntax extensions are "**" to match any string across
path separators and "{a,b}" to mean "a or b".
In other words, it should be possible to use the pattern file('glob:**jquery*').
In fact, the above pattern would also work without the glob prefix, so: file('**jquery*'). See part of the page about revsets:
"file(pattern)"
Changesets affecting files matched by pattern.
For a faster but less accurate result, consider using "filelog()"
instead.
This predicate uses "glob:" as the default kind of pattern.
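Putting it together on the command line (the same revset should also work in TortoiseHg's revset filter field):

hg log -r "file('**jquery*')"

or, trading some accuracy for speed as the help text suggests:

hg log -r "filelog('glob:**jquery*')"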

How to organize code so that we can move and update it without having to edit the location of the configuration file?

The issue I consider here is how to write code that can easily know the location of a required config file and yet is portable, without any edits, from one environment to another. We don't want to edit the location of the configuration file to adapt the code to each new environment, say each time we move the code from a development environment to production. The method should not rely on resources that are not universally available, such as user-defined environment variables or access to a specific directory. For example, it may seem that using the DOCUMENT_ROOT as a base location for the config file is the way to go, but that is not universal: first, in a command-line environment the DOCUMENT_ROOT makes no sense; second, a programmer might be given access to only a sub-folder of the DOCUMENT_ROOT. Another requirement is that the configuration file could depend on values known only at run time, say the user who calls the application, as in the question "How to load a config file based on user selection from 'unknown' location".
The question is not what the best location of the configuration file is in specific environments, as in "Location to put user configuration files in Windows". The programmers would still have to figure out the best location so that end users could easily find the configuration file. The question is how this location, whatever it is, even if it depends on values known only at run time, can be passed to the code in a portable manner.
One approach is to design every script file with the assumption that it will be included by another file, and so on, until we get to a wrapper script that does nothing but define the directory of the config file for the benefit of the included file and any files it includes in turn. Once this directory path is known, other configuration values can be obtained from a named configuration file within it. This works because the wrapper scripts are not updated when we update the code from a repository or testing environment. The approach seems universally applicable: no special support of any kind, such as access to user-defined environment variables or to some specific directory on the server, is needed. As long as you have access to the code, which is the strict minimum to expect, it works. It is also natural, since scripts are often designed to be included by another file anyway.
The approach only requires that we agree on a convention for the name of the constant, say CONFIG_DIRECTORY. If every programmer agreed to look for the config file at the location specified by this constant, then any user of the code could put the config file anywhere and simply define the constant accordingly.
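To make this concrete, here is a minimal PHP sketch of the idea (PHP because the question mentions DOCUMENT_ROOT; all file names, paths, and the constant name are illustrative conventions, not an established standard):

<?php
// wrapper.php -- environment-specific; the only file that is edited
// when the code moves to a new environment
define('CONFIG_DIRECTORY', '/srv/myapp/config');
require '/srv/myapp/code/main.php';

<?php
// main.php -- portable; updated from the repository without edits
if (!defined('CONFIG_DIRECTORY')) {
    // fall back to a default if no wrapper defined the constant
    define('CONFIG_DIRECTORY', __DIR__ . '/config');
}
$config = parse_ini_file(CONFIG_DIRECTORY . '/app.ini', true);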
In Linux, there is the folder /etc for config files, so the notion of a universally agreed standard in a very large context already exists. The idea proposed here is the same, except that /etc is the same constant for all machines, someone might not have access to that level of the server, and we would lose the possibility of having different configuration directories for different wrapper scripts. Allowing the universal standard to be a constant name, say 'CONFIG_DIRECTORY', instead of the fixed constant '/etc', is just extra flexibility with no additional inconvenience. It does require that we define this constant in some wrapper script, but we could fall back to a default if it is not defined. If the approach were strictly applied, the only scripts in the server document root would be simple wrappers that define a configuration directory. That seems cool: people often say that it is safer to keep important code outside the document root.

Can I generate content in Jekyll from two different directories

I have two pieces of somewhat unrelated source that I want to turn into one "site" using Jekyll, but they are in two directories. Let's say PROJECT/site/ has the homepage and copy and so forth, and PROJECT/clientlib/ has a bunch of libraries. I'd like, for example, PROJECT/site/index.md to become /index.html and PROJECT/clientlib/foo.js to become /clientlib/foo.js.
This is an open source project so I'd really like to avoid fooling around with symlinks or submodules that might make it harder for someone to check the project out and start using it. And I want to be able to use the Jekyll dev server, without doing fixup on the generated files.
Is there a way to configure (or hack) Jekyll to get the layout I'm hoping for?
I just finished publishing a custom gem that accomplishes this. It lets you specify a shared_dir that can be used between multiple Jekyll configurations for a common base:
https://github.com/sumdog/jekyll-multisite
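Judging by that description, each site's _config.yml would point at the common base, something like the following (I'm only echoing the shared_dir key named above; check the project's README for the exact layout):

shared_dir: ../shared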
You can give them the same destination path in your _config.yml file (in place of the default _site; see https://github.com/mojombo/jekyll/wiki/Configuration),
e.g.
destination: ../_site
but files with the same name will overwrite each other if they are duplicated between the two sources.
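A concrete sketch with the question's layout:

# PROJECT/site/_config.yml
destination: ../_site

# PROJECT/clientlib/_config.yml
# same target as the site, per the note above about duplicate names;
# pointing at ../_site/clientlib instead (untested) would nest the
# library files under /clientlib/ as in the question
destination: ../_site

# then build each source tree into the shared output
cd PROJECT/site && jekyll build
cd PROJECT/clientlib && jekyll build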

What are MonoDevelop's .pidb files?

MonoDevelop creates those for every project. Should I include them in source control?
From a MonoDevelop blog post:
There were several long-pending bug reports, and I also wanted to improve the performance and memory use a bit. MonoDevelop creates a Parser Information Database (pidb) file for each assembly or project. This file contains all the information about classes implemented in an assembly, together with documentation pulled from Monodoc. A pidb file has three sections: the first one is a header which contains, among other things, the version of the file format (that version is checked when loading the pidb, and the file will be regenerated if it doesn't match the current implementation version). The second section is the index of the pidb file. It contains an index of all classes in the database. The index is always fully loaded in memory to be able to quickly locate classes. The third section of the file contains all the class information: list of methods, fields, properties, documentation for each of those, and so on. Each entry in the index has a file offset field, which can be used to completely load all the information of a class (the index only contains the name).
So it sounds like it's really just an optimization. I would personally not include it in source control unless you find it makes a big difference to performance: my guess is it will only really stay valid if only one person is working on the project at a time. (If it's big and changes regularly, you could find it adds significant overhead to the repository too. I haven't checked to see what the size is actually like, but it's worth checking.)
They're just cached code completion data. As the post Jon linked explains, the main reason is to save memory, though they do also save you from waiting for MD to parse all the source files and referenced assemblies when you open a project.
The pidb files can be regenerated pretty quickly, so there's no advantage to keeping them in the VCS. Indeed, beyond the repository overhead, they could also cause problems if people are using different versions of MD with different pidb formats, so I'd strongly recommend against keeping them in source control.
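In practice that just means treating them like any other generated cache and ignoring them, e.g. with a one-line .gitignore entry (or the equivalent ignore file for your VCS):

*.pidb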