Sublime Text 2 search: "Where" filter expression - AND(?) instead of OR(,)

I'm trying to figure out how to search within files that have a certain extension and live within a certain directory. I've tried */app/*,*.jade, but that seems to match anything within the app directory OR any file whose extension is .jade, wherever it is. I just need a way to filter for files that satisfy both conditions. The obvious solution would be an AND character in place of the comma (which seems to act as an OR), but I'm not sure the syntax has such a feature. Does anyone know if it does, or can offer an alternative solution for my use case? Thanks.

You are indeed correct that the Find in Files functionality allows you to provide a comma-separated list of search terms, and that they are effectively OR'ed together.
Note that you can also prefix a term with - to make it mean BUT NOT, which you can use to filter results out.
In order to find only files of a certain extension AND in a certain path, you need to create a search term that includes both of them at the same time:
*/app/*.jade
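And since the comma still gives you OR semantics, you can combine such a term with the - prefix mentioned above; for example (the vendor folder here is hypothetical):
*/app/*.jade, -*/app/vendor/*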

Related

What are the possible outputs of 'NET USE'?

The question is: what are the possible outputs of NET USE?
You can drown in websites explaining how to use the NET USE command, but not a single one explains what comes out of it.
In my case I'm interested in the various error messages, and in the interaction with the PowerShell automatic variable $LASTEXITCODE. I want to handle its output correctly, but I don't know what can even happen (and no, I won't use New-PSDrive).
Does someone know what the outputs are, or where I can find that information?
Thanks
You can use the example in https://www.compatdb.org/forums/topic/20487-net-use-return-code/ to obtain a list of the numeric return codes for your evaluation.
If you want to dig deeper, you need to download the Win32 SDK and go through the definitions in the header files (see https://learn.microsoft.com/en-us/windows/win32/api/winnetwk/ns-winnetwk-netresourcea etc.).
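As a minimal sketch of the $LASTEXITCODE interaction (the drive letter and share are hypothetical): PowerShell stores the exit code of the last native command in $LASTEXITCODE, and net use reports 0 on success:
net use Z: \\server\share
if ($LASTEXITCODE -ne 0) {
    Write-Warning "net use failed with exit code $LASTEXITCODE"
}
Non-zero values correspond to the numeric codes in the list linked above, so you can branch on specific values if you need to.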

Hide my primefaces version from the inspector [duplicate]

I'm using primefaces and primefaces-extensions in my application. For each and every resource, like .css and .js files, there are also "ln" and "v" query parameters in the GET request for that resource, like below:
primefaces-extensions.js?ln=primefaces-extension&v=6.1
validation.js?ln=primefaces&v=6.1
As a security concern, since these parameters show the exact version of the framework I'm using, how can I hide them?
Hiding the 'ln' is kind of useless since, with a very small amount of effort, you can get the same information from the JavaScript files and from the source of the page too (PF() is all over the place).
The 'v' however is a slightly different issue. If you use the unmodified PrimeFaces source, hiding it is sort of useless too: with very little effort (creating a hash), a possible attacker can download your resources, hash them, and compare the resulting hashes against a dictionary they can easily build from existing PrimeFaces releases, and then know which version you use. So the only thing to do here is to modify the source so it does not produce 'known or comparable' hashes, by making some slight modifications (adding whitespace should already help).
But if you really want the version not to be shown, you can download the PrimeFaces sources, replace the version info with some obfuscated number, and build that custom version. Keep in mind that if you don't make any changes in the sources, the dictionary lookups mentioned above still work. So it is only a minor inconvenience for hackers.

Searching in HTML/txt files without loading them into the program [duplicate]

I have a FindFile routine in my program which will list files, but if the "Containing Text" field is filled in, then it should only list files containing that text.
If the "Containing Text" field is entered, then I search each file found for the text. My current method of doing that is:
var
  FileContents: TStringList;
begin
  FileContents := TStringList.Create;  // the list must be created before use
  try
    FileContents.LoadFromFile(Filepath);
    // Pos > 0 means the text was found somewhere in the file
    Found := Pos(TextToFind, FileContents.Text) > 0;
  finally
    FileContents.Free;
  end;
The above code is simple, and it generally works okay. But it has two problems:
It fails for very large files (e.g. 300 MB)
I feel it could be faster. It isn't bad, but why wait 10 minutes searching through 1000 files, if there might be a simple way to speed it up a bit?
I need this to work for Delphi 2009 and to search text files that may or may not be Unicode. It only needs to work for text files.
So how can I speed this search up and also make it work for very large files?
Bonus: I would also want to allow an "ignore case" option. That's a tougher one to make efficient. Any ideas?
Solution:
Well, mghie pointed out my earlier question How Can I Efficiently Read The First Few Lines of Many Files in Delphi, and as I answered, it was different and didn't provide the solution.
But he got me thinking that I had done this before, and I had. I built a block-reading routine for large files that breaks them into 32 MB blocks. I use it to read the input file of my program, which can be huge. The routine works fine and fast. So step one is to do the same for the files I am searching through.
So now the question was how to efficiently search within those blocks. Well I did have a previous question on that topic: Is There An Efficient Whole Word Search Function in Delphi? and RRUZ pointed out the SearchBuf routine to me.
That solves the "bonus" as well, because SearchBuf has options which include Whole Word Search (the answer to that question) and MatchCase/noMatchCase (the answer to the bonus).
So I'm off and running. Thanks once again SO community.
The best approach here is probably to use memory mapped files.
First you need a file handle; use the CreateFile Windows API function for that.
Then pass that to CreateFileMapping to get a file mapping handle. Finally use MapViewOfFile to map the file into memory.
To handle large files, MapViewOfFile is able to map only a certain range into memory, so you can e.g. map the first 32MB, then use UnmapViewOfFile to unmap it followed by a MapViewOfFile for the next 32MB and so on. (EDIT: as was pointed out below, make sure that the blocks you map this way overlap by a multiple of 4kb, and at least as much as the length of the text you are searching for, so that you are not overlooking any text which might be split at the block boundary)
To do the actual searching once (part of) the file is mapped into memory, you can make a copy of the source for StrPosLen from SysUtils.pas (it's unfortunately defined in the implementation section only and not exposed in the interface). Leave one copy as is and make another copy, replacing Wide with Ansi every time. Also, if you want to be able to search binary files which might contain embedded #0's, you can remove the '(Str1[I] <> #0) and' part.
Either find a way to identify if a file is ANSI or Unicode, or simply call both the Ansi and Unicode version on each mapped part of the file.
Once you are done with each file, make sure to call CloseHandle, first on the file mapping handle and then on the file handle (and don't forget to call UnmapViewOfFile before that).
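Putting those steps together, here is a minimal sketch of the mapping sequence (error handling kept to a minimum; ScanBlock is a hypothetical stand-in for the StrPosLen-style search described above):
// uses Windows, SysUtils
procedure SearchMappedFile(const FileName: string);
var
  hFile, hMapping: THandle;
  View: Pointer;
begin
  hFile := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if hFile = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    hMapping := CreateFileMapping(hFile, nil, PAGE_READONLY, 0, 0, nil);
    if hMapping = 0 then
      RaiseLastOSError;
    try
      // For simplicity this maps the whole file in one view. For very
      // large files, map e.g. 32 MB views in a loop (view offsets must be
      // multiples of the allocation granularity) and let consecutive views
      // overlap by at least Length(TextToFind) bytes so a match straddling
      // a view boundary is not missed.
      View := MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
      if View = nil then
        RaiseLastOSError;
      try
        // ScanBlock(View, FileSize) would run the actual search here.
      finally
        UnmapViewOfFile(View);
      end;
    finally
      CloseHandle(hMapping);
    end;
  finally
    CloseHandle(hFile);
  end;
end;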
EDIT:
A big advantage of using memory mapped files instead of using e.g. a TFileStream to read the file into memory in blocks is that the bytes will only end up in memory once.
Normally, on file access, Windows first reads the bytes into the OS file cache and then copies them from there into the application's memory.
If you use memory mapped files, the OS can directly map the physical pages from the OS file cache into the address space of the application without making another copy (reducing the time needed for the copy and halving memory usage).
Bonus Answer: By calling StrLIComp instead of StrLComp you can do a case insensitive search.
If you are looking for text string searches, look at the Boyer-Moore search algorithm. Combined with memory mapped files it makes for a really fast search engine, and there are some Delphi units around that contain implementations of this algorithm.
To give you an idea of the speed: I currently search through 10-20 MB files and it takes on the order of milliseconds.
Oh, I just read that the text might be Unicode - I'm not sure every implementation supports that - but definitely look down this path.
This is a problem connected with your previous question How Can I Efficiently Read The First Few Lines of Many Files in Delphi, and the same answers apply. If you don't read the files completely but in blocks, then large files won't pose a problem. There's also a big speed-up to be had for files containing the text: you should cancel the search upon the first match. Currently you read whole files even when the text to be found is in the first few lines.
May I suggest a component? If so, I would recommend ATStreamSearch.
It handles ANSI and UNICODE (and even EBCDIC and Korean and more).
Or the class TUTBMSearch from JclUnicode (Jedi JCL). It was mainly written by Mike Lischke (of VirtualTreeView). It uses a tuned Boyer-Moore algorithm that ensures speed. The bad point in your case is that it works fully in Unicode (WideStrings), so the conversion from String to WideString risks being a performance penalty.
It depends on what kind of data you are going to search. To achieve really efficient results you will need to let your program parse the interesting directories, including all the files in there, and keep the data in a database which you can then query for a specific word against a specific list of files (generated from the search path). A database statement can provide results in milliseconds.
The issue is that you will have to let it run and parse all the files after installation, which may take more than an hour depending on the amount of data you wish to parse.
This database should be updated each time your program starts; that can be done by comparing the MD5 value of each file to see whether it has changed, so you don't have to parse all your files every time.
This way of working is interesting if you have all your data in a constant place and you analyse the same files repeatedly rather than completely new files each time; some code analysers work like this, and they are really efficient. You invest some time in parsing and saving the interesting data, and afterwards you can jump to the exact place where a search word appears and present a list of all the places it appears in, in a very short time.
If the files are to be searched multiple times, it could be a good idea to use a word index.
This is called "Full Text Search".
It will be slower the first time (text must be parsed and indexes must be created), but any future search will be immediate: in short, it will use only the indexes, and not read all text again.
You have the exact parser you need in The Delphi Magazine Issue 78, February 2002:
"Algorithms Alfresco: Ask A Thousand Times
Julian Bucknall discusses word indexing and document searches: if you want to know how Google works its magic this is the page to turn to."
There are several FTS implementations for Delphi:
Rubicon
Mutis
ColiGet
Google is your friend...
I'd like to add that most DBs have an embedded FTS engine. SQLite3 even has a very small but efficient implementation, with page ranking and such.
We provide direct access from Delphi, with ORM classes, to this Full Text Search engine, named FTS3/FTS4.
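To give an idea of what that looks like at the SQL level, a minimal SQLite FTS4 sketch (the table and column names here are made up):
CREATE VIRTUAL TABLE docs USING fts4(path, body);
INSERT INTO docs(path, body) VALUES ('notes.txt', 'the quick brown fox');
-- Returns every document whose indexed text contains the word 'fox'.
SELECT path FROM docs WHERE body MATCH 'fox';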

Searching for the change history of partial file or path in Mercurial or TortoiseHg

Each time I need anything beyond the standard search, I find myself trying several things, searching Google, and in the end failing terribly. Apparently the Hg search syntax is pretty extensive, and I would like to use its power, but I can't seem to find a good reference.
For instance, quite often I want to find all changes in the repository related to a partial path match. I know that the following works:
file('path:full/path/file.txt')
But I would like to search for files by partial match, and neither of the following worked:
jquery -- seems to find everything
file(jquery*) -- finds nothing
file('jquery*') -- finds nothing
file('path:jquery.*') -- finds nothing
file('name:jquery.*') -- finds nothing
file('path:jquery.js') -- finds every revision, it seems
From the popup in TortoiseHg I see that there are a gazillion options, but no hint on how to use them (the help link shows a little bit more, but nothing on what a pattern should look like in file(pattern)).
In the end I usually find what I want using other ways of searching, but it would be so nice to be able to use this power of expression, and it's quite a shame that after so many years, I've never found out how to leverage this.
I can very much recommend using the hg help system for this. The most useful pages to look at (in my view):
hg help revsets
hg help filesets
hg help patterns
In the page about patterns, you can find about 'path:':
To use a plain path name without any pattern matching, start it with
"path:". These path names must completely match starting at the current
repository root.
In other words: using 'path:' is not suitable for this purpose. Slightly below, 'glob:' is mentioned:
To use an extended glob, start a name with "glob:". Globs are rooted at
the current directory; a glob such as "*.c" will only match files in the
current directory ending with ".c".
The supported glob syntax extensions are "**" to match any string across
path separators and "{a,b}" to mean "a or b".
In other words, it should be possible to use the pattern file('glob:**jquery*').
In fact, the above pattern would also work without the glob prefix, so: file('**jquery*'). See part of the page about revsets:
"file(pattern)"
Changesets affecting files matched by pattern.
For a faster but less accurate result, consider using "filelog()"
instead.
This predicate uses "glob:" as the default kind of pattern.
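For example, from the command line (the same revset also works in TortoiseHg's revision set filter):
hg log -r "file('glob:**jquery*')"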

How can I extract addresses and phone number from HTML?

Is there a library that specializes in parsing such data?
You could use something like Google Maps. Geocode the address and, if successful, Google's API will return an XML representation of the address with all of the elements separated (and corrected or completed).
EDIT:
I'm being voted down and not sure why. Parsing addresses can be a little difficult. Here's an example of using Google to do this:
http://blog.nerdburn.com/entries/code/how-to-parse-google-maps-returned-address-data-a-simple-jquery-plugin
I'm not saying this is the only way or necessarily the best way. Just a way to parse addresses on a web site.
There are 2 parts to this: extract the complete address from the page, and parse that address into something you can use (store the various parts in a DB for example).
For the first part you will need a heuristic, most likely country-dependent: for US addresses, [A-Z][A-Z],?\s*\d\d\d\d\d should give you the end of an address, provided the two letters turn out to be a state. Finding the beginning of the string is left as an exercise.
The second part can be done either through a call to Google maps, or as usual in Perl, using a CPAN module: Lingua::EN::AddressParse (test it on your data to see if it works well enough for you).
In any case this is a difficult task, and you will most likely never get it 100% right, so plan for manually checking the addresses before using them.
You don't need regular expressions (yet) or a general parser like pyparsing (at all). Look at something like Beautiful Soup, which will parse even bad HTML into something like a tree of tags. From there, you can look at the source of the page and find out which tags to drill down through to get to the data. Then, from Beautiful Soup's tree, you can search for those nodes with its find methods and loop directly over the tags you're interested in, getting at the actual data easily. From there, you can parse the data out using a quick regex or something. This will be more flexible and more future-proof, and also possibly less head-exploding, than trying to do it all in pure regular expressions.
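A rough sketch of that approach in Python (the URL, the div class, and the phone pattern are all assumptions; a real page needs its own drill-down logic):
import re
import urllib.request
from bs4 import BeautifulSoup  # Beautiful Soup 4

html = urllib.request.urlopen("http://example.com/contacts").read()
soup = BeautifulSoup(html, "html.parser")

# Drill down to whichever tags hold the contact details on this page.
for block in soup.find_all("div", class_="contact"):
    text = block.get_text(" ", strip=True)
    # Very rough US-style phone pattern; adjust per country/format.
    for phone in re.findall(r"\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}", text):
        print("phone:", phone)
    # The end-of-address heuristic from above: two capitals, then a ZIP code.
    for addr_end in re.findall(r"[A-Z][A-Z],?\s*\d{5}", text):
        print("address ends with:", addr_end)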