Regex To Exclude Email-Expression - html

I have 430 HTML files of different organizations' "contact us" web pages, and I was given these files to extract emails from.
This simple regex I came up with finds the emails throughout the files:
\S*@\S*
My Problem
I'm trying to select everything besides the emails so I can use Notepad++'s "Replace All in All Opened Documents" function to delete everything except the emails. Is this possible with regular expressions?
Is there any way I can select everything outside of the regular expression provided above?

Make sure you have a recent version of Notepad++ installed to have the necessary regex support:
Find what : (^|\s+)[^@]+(\s+|$)
Replace with : \n
🔘 Regular expression
The . matches newline option does not influence the action.

You need to remove all text that does not match some pattern.
You need to match and capture the emails with a (...) capture group and then you need to just match everything else.
Use a pattern like this: ( + your_pattern + )|., and replace with $1.
Or, use:
([^\s<>"]+@[^\s<>"]+)|.
or
(\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b)|.
Replace with: $1
Then, you might want to use Edit -> Blank Operations -> Remove Unnecessary Blank and EOL menu option.
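If you would rather script this than click through Notepad++, the same "capture what you want, match everything else" idea can be sketched in Python. This is only a rough illustration: the function name keep_only_emails is made up, the pattern assumes addresses use the usual @ separator, and the {2,4} limit will miss longer top-level domains.

```python
import re

# Assumed pattern: the stricter one from this answer, with "@" as the separator.
EMAIL = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b"

# Group 1 captures an address; the "|." branch consumes any other single character.
KEEP_EMAILS = re.compile(rf"({EMAIL})|.", re.DOTALL)


def keep_only_emails(text):
    """Return the addresses found in text, one per line, dropping everything else."""
    def replace(match):
        # Keep a captured address (plus a newline); replace anything else with nothing.
        return match.group(1) + "\n" if match.group(1) else ""
    return KEEP_EMAILS.sub(replace, text)


if __name__ == "__main__":
    sample = '<p>Contact <a href="mailto:info@example.org">us</a> or sales@example.com.</p>'
    print(keep_only_emails(sample), end="")
```

Appending a newline to each kept address does the job of the blank-line cleanup step, since nothing else survives the replacement.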

How to convert text with things like %C3%A0

I have many lists of names containing things like %C3%A0, which I believe stands for an a with an accent.
M%C3%A0rius Torres should be Màrius Torres.
But the problem is that there are many different kinds of these and I cannot keep changing them manually. There are about 13,000 unique names.
How can I convert them into their correct names in Excel?
As a reference, I queried many names in the Wikipedia database. Here is the link
The names have most likely been URL encoded. This is done to anything that is included in a URL. For example, if I try to search for " it's " on Google, my browser goes to the address https://www.google.com.au/search?hl=en&q=it%27s. As you can see, the " it's " has been changed to " it%27s ".
All that you need to do in PHP to undo this is to put the string through the urldecode() function. You'd do the following:
$string = "M%C3%A0rius Torres";
$decoded = urldecode($string);
echo $decoded;
That should give you the decoded string. Read more about the urldecode() function at http://au1.php.net/urldecode.
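If PHP is not handy, the same decoding can be sketched in Python with the standard library's urllib.parse.unquote, which interprets the percent-encoded bytes as UTF-8 by default. The helper name decode_names is made up for this example, and getting the list out of and back into Excel (for instance via a CSV file) is left out.

```python
from urllib.parse import unquote


def decode_names(names):
    """Percent-decode each name, interpreting the bytes as UTF-8 (unquote's default)."""
    return [unquote(name) for name in names]


if __name__ == "__main__":
    for name in decode_names(["M%C3%A0rius Torres", "it%27s"]):
        print(name)
```

Note that %C3%A0 decodes to à, so the first name comes out as Màrius Torres, accent included.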
Your text is URL encoded. Since you have php as one of your tags, I'm assuming the output goes through some form of PHP processing. In that case, you'll want to use the urldecode function. Documentation can be found here.
In Excel, as in Word, you should be able to use "Find & Select" followed by "Replace".
You can also use "Sort & Filter" to group things.
Both are in the top right-hand corner of the Home tab in Excel.

Mediawiki blank all pages per namespace. I want to blank all User_talk pages

I want to know if there is a way to blank all user_talk pages en masse. Not delete them, just blank them. I don't know how to write bots, so I'm really asking if there is an extension or prewritten bot for this. Thank you
You could write a simple SQL query to do this. Look at the page table: in my installation the namespace value for User talk: is 3, so I could just delete all pages with namespace=3.
Deleting the row from the database will leave the page blank (not created).
I suggest using AWB. You can easily have it build a list based on a namespace and then use a simple regex replace such as: Search: (.*)* Replace with: (empty space).

HTML Search Function

I need a search function for an intranet. Can I do this purely in HTML? If so, could you point me in the right direction? I tried googling it, but the results were all about searching via search engines.
Cheers
EDIT: I only need it to search for text on one page. It's not a whole website, just one page.
You will need some server-side scripting in order to provide the directory listings.
I recommend using PHP's glob function recursively, but there might be a better option.
Edit:
For one page, using JavaScript, you could get the contents of all of the elements, and use regex or indexOf to determine if the string exists within the text, and if so, where.
If you are to use the indexOf function, as the function only returns the index of the first occurrence of the string, you will need to repeat the search until you've gathered all occurrences.
You may specify the start parameter to snip the front of the searching area, to begin the new search after your last found occurrence.
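The repeat-until-done loop described above looks roughly like this, sketched in Python, where str.find plays the role of JavaScript's indexOf (both take a start position and return -1 when nothing is found). The function name find_all is made up, and this version skips overlapping matches.

```python
def find_all(text, needle):
    """Return the start index of every non-overlapping occurrence of needle in text."""
    if not needle:
        return []  # an empty needle would match everywhere; treat it as no match
    positions = []
    start = 0
    while True:
        index = text.find(needle, start)  # like indexOf(needle, start): -1 when absent
        if index == -1:
            break
        positions.append(index)
        start = index + len(needle)  # resume just past the last hit
    return positions


if __name__ == "__main__":
    print(find_all("the cat sat on the mat", "at"))
```

To allow overlapping matches, advance start by 1 instead of by the length of the needle.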

separating values in a URL, not with an &

Each parameter in a URL can have multiple values. How can I separate them? Here's an example:
http://www.example.com/search?queries=cars,phones
So I want to search for 2 different things: cars and phones (this is just a contrived example). The problem is the separator, a comma. A user could enter a comma in the search form as part of their query and then this would get screwed up. I could have 2 separate URL parameters:
http://www.example.com/login?name1=harry&name2=bob
There's no real problem there, in fact I think this is how URLs were designed to handle this situation. But I can't use it in my particular situation. Requires a separate long post to say why... I need to simply separate the values.
My question is basically, is there a URL encodable character or value that can't possibly be entered in a form (textarea or input) which I can use as a separator? Like a null character? Or a non-visible character?
UPDATE: thank you all for your very quick responses. I should've listed the same parameter name example too, but order matters in my case so that wasn't an option either. We solved this by using a %00 URL encoded character (UTF-8 \u0000) as a value separator.
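For anyone curious what the %00 approach looks like in practice, here is a minimal sketch in Python using only the standard library. The function names build_query and read_query are made up for the example, and it assumes the values themselves never contain a NUL character. It is also worth checking that your own server stack accepts %00 in a URL, since not every one does.

```python
from urllib.parse import parse_qs, urlencode

SEPARATOR = "\x00"  # U+0000, which percent-encodes as %00


def build_query(values):
    """Join the values with NUL and percent-encode them into one 'queries' parameter."""
    return urlencode({"queries": SEPARATOR.join(values)})


def read_query(query_string):
    """Decode the 'queries' parameter and split it back into the original values."""
    joined = parse_qs(query_string, keep_blank_values=True).get("queries", [""])[0]
    return joined.split(SEPARATOR)


if __name__ == "__main__":
    query = build_query(["cars", "phones, tablets"])
    print(query)
    print(read_query(query))
```

Commas typed by the user survive the round trip because they are percent-encoded like any other character, and the order of the values is preserved.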
The standard approach to this is to use the same key name twice.
http://www.example.com/search?queries=cars&queries=phones
Most form libraries will allow you to access it as an array automatically. (If you are using PHP (and making use of $_POST/GET and not reinventing the wheel) you will need to change the name to queries[].)
You can give them each the same parameter name.
http://www.example.com/search?query=cars&query=phones
The average server side HTTP API is able to obtain them as an array. As per your question history, you're using JSP/Servlet, so you can use HttpServletRequest#getParameterValues() for this.
String[] queries = request.getParameterValues("query");
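The same idea outside of servlets, as a small sketch in Python: the standard library's parse_qs collects repeated keys into a list, in the order they appear in the URL. The helper name query_values is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def query_values(url, key):
    """Return every value given for key in the URL's query string, in order."""
    return parse_qs(urlsplit(url).query).get(key, [])


if __name__ == "__main__":
    print(query_values("http://www.example.com/search?query=cars&query=phones", "query"))
```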
Just URL-encode the user input so that their commas become %2C.
Come up with your own separator that is unlikely to be entered in a query. Two underscores '__', for example.
Why not just use something like "||"? Anyone who types that into a search area probably fell asleep on their keyboard :} Then just explode it on the backend.
The easiest thing to do would be to use a custom separator like [!!ValSep!!].

Show Vim omnicomplete on certain characters instead of Ctrl-X Ctrl-O?

In Vim 7, Ctrl-X Ctrl-O shows a list of possible values, but I find this key sequence too long when I use the autocomplete feature frequently. For instance, in an HTML file, I'd like to see the list pop up automatically after I type a < followed by one or two letters. In a CSS file, I'd like to see the list after I hit the ":" key. Is there a way to set this up?
To activate the omnicompletion on typing a ":" you can use
the following mapping.
imap : :<c-x><c-o>
The disadvantage is that each time you press ":" omnicompletion will
be activated, even when typing ":" in comments or in any other context
in which you do not want omnicompletion.
I have mapped ctrl-space to activate omnicompletion:
imap <c-space> <c-x><c-o>
This gives me the choice to activate omni whenever I need it.
Another alternative that I found easier is to press tab twice when you want autocomplete, and once for a regular tab.
Add the following line to your ~/.vimrc:
imap <tab><tab> <c-x><c-o>