How to find only English language words using Wiktionary search box? - mediawiki

How can I find only English-language words using Wiktionary's search box? I'm using d*e as the query (words that begin with the letter d and end with the letter e).
By default, Wiktionary's search box also returns words from other languages, such as dainuojančiosiose or daiktavardžiuose.

An option would be to filter the results using incategory (and possibly intitle, if you want to be more specific). For example:
intitle:d*e incategory:"English nouns"
intitle:d*e incategory:"English idioms"
One problem with this approach is that you need to pick one of the many very specific "English *" categories which exist on Wiktionary.
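If you want to run the same filtered query from a script, here is a minimal sketch that builds a request URL for the standard MediaWiki search API (action=query, list=search). The helper name build_search_url and the limit default are assumptions for illustration; the sketch only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint of English Wiktionary.
API_URL = "https://en.wiktionary.org/w/api.php"


def build_search_url(pattern: str, category: str, limit: int = 50) -> str:
    """Build a search API URL for titles matching `pattern` inside `category`.

    Hypothetical helper: it only mirrors the search-box query shown above.
    """
    params = {
        "action": "query",
        "list": "search",
        # Same syntax as typed into the search box.
        "srsearch": f'intitle:{pattern} incategory:"{category}"',
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


url = build_search_url("d*e", "English nouns")
print(url)
```

You would still need one request per "English *" category, so the limitation described above applies here too.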

Related

Word unscrambler wildcards using mysql

I'm building a word unscrambler using MySQL. Think of it like Scrabble: there is a string of letter tiles, and the query should return all words that can be constructed from those letters. I was able to achieve that using this query:
SELECT * FROM words
WHERE word REGEXP '^[hello]{2,}$'
AND NOT word REGEXP 'h(.*?h){1}|e(.*?e){1}|l(.*?l){2}|o(.*?o){1}'
The first part of the query makes sure that the output words are built only from the letter tiles; the second part limits how many times each letter can occur. The query above returns words like hello, hell, hole, etc.
My issue is with blank tiles (wildcards). For example, if the string is "he?lo", the "?" can be replaced with any letter, so the output would include helio and helot.
Can someone suggest a modification to the query that supports wildcards and still respects the occurrence limits? (There can be up to 2 blank tiles.)
I've got something that comes close. With a single blank tile, use:
SELECT * FROM words
WHERE word REGEXP '^[acre]*.[acre]*$'
AND word NOT REGEXP 'a(.*?a){1}|r(.*?r){1}|c(.*?c){1}|e(.*?e){1}'
With 2 blank tiles, use:
SELECT * FROM words
WHERE word REGEXP '^[acre]*.[acre]*.[acre]*$'
AND word NOT REGEXP 'a(.*?a){1}|r(.*?r){1}|c(.*?c){1}|e(.*?e){1}'
The . in the first regexp allows a character that isn't one of the tiles with a letter on it.
The only problem with this is that the second regexp prevents duplicates of the lettered tiles, but a blank should be allowed to duplicate one of the letters. I'm not sure how to fix this. You could add 1 to the counts in {}, but then it would allow you to duplicate multiple letters even though you only have one blank tile.
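One way around the duplicate-letter problem is to do the counting outside the regular expression. The sketch below is a swapped-in technique (letter counting in Python, not a MySQL query): it counts how many letters the word needs beyond what the lettered tiles provide and checks that the blanks can cover the shortfall. The function name can_form, the "?" blank marker and the sample word list are assumptions for illustration.

```python
from collections import Counter


def can_form(word: str, tiles: str, blank: str = "?") -> bool:
    """Return True if `word` can be spelled from `tiles`.

    Each `blank` character in `tiles` may stand in for any single letter.
    """
    rack = Counter(tiles)
    blanks = rack.pop(blank, 0)  # number of blank tiles on the rack
    # Counter subtraction keeps only positive counts, i.e. the letters
    # the word needs beyond what the lettered tiles provide.
    missing = Counter(word) - rack
    return sum(missing.values()) <= blanks


# Hypothetical word list standing in for the `words` table.
words = ["helio", "helot", "hello", "hell", "hole", "hollow"]
matches = [w for w in words if len(w) >= 2 and can_form(w, "he?lo")]
print(matches)
```

Because the shortfall is summed over all letters, one blank can duplicate any single letter but never more than one, which is exactly what the counts in {} cannot express.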
A possible starting point:
Sort the letters in the words; sort the letters in the tiles (e.g., "ehllo", "acer", "aerr").
That will avoid some of the ORing, but still has other complexities.
If this is really Scrabble, what about the need to attach to an existing letter or letters? And do you primarily want to find a way to use all 7 letters?
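The sorted-letters idea can be sketched as a lookup table keyed by each word's sorted letters. This is a minimal illustration in Python rather than MySQL; the names signature, build_index and playable are hypothetical, and blank tiles are not handled here.

```python
from collections import defaultdict
from itertools import combinations


def signature(letters: str) -> str:
    """Sorted letters of a word or rack, e.g. 'hello' -> 'ehllo'."""
    return "".join(sorted(letters))


def build_index(words):
    """Map each signature to the words that share it (anagram groups)."""
    index = defaultdict(list)
    for w in words:
        index[signature(w)].append(w)
    return index


def playable(tiles: str, index, min_len: int = 2):
    """Look up every sub-multiset of the tiles by its signature."""
    sorted_tiles = signature(tiles)
    found = set()
    for n in range(min_len, len(sorted_tiles) + 1):
        # Combinations of a sorted sequence are themselves sorted,
        # so each one is already a valid signature.
        for combo in set(combinations(sorted_tiles, n)):
            found.update(index.get("".join(combo), []))
    return sorted(found)


index = build_index(["hello", "hell", "hole", "helot", "ole"])
result = playable("hello", index)
print(result)
```

With at most 7 tiles there are at most 2^7 subsets to look up, so every check becomes an exact-match lookup instead of a regular expression scan.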

Selecting all same html attributes with (different) values included in VSCode

I'm using VSCode for HTML editing. In VSCode it's very easy to select all occurrences of the same piece of code. What I need is to select all occurrences of an HTML attribute (like class, aria-label, etc.) regardless of its value. Here's an example:
I want to select all "aria-label" occurrences with their values included, so these would be selected:
aria-label="Apple"
aria-label="Oranges"
aria-label="Multiple Fruit Names"
aria-label=""
...
Is there a way to do that in VSCode?
I realized that regex knowledge is essential, so for the last couple of days I studied with Regex101. This is what worked for me for this question:
aria-[a-zA-Z]*="[A-Za-z\s]*"
You could use a regex for that:
^aria-label="[^"]*"
Explanation:
'^' ... anchors the match to the start of a line (drop it to match the attribute anywhere in a line)
'aria-label=' ... that's your "search word"
'[^"]' ... any character except a double quote
'*' ... zero or more occurrences of the preceding character class
Don't forget to enable regex search in the search dialog (the .* toggle in the search box).
This is a good starting point to grasp the regex magic: https://en.wikipedia.org/wiki/Regular_expression#Basic_concepts .
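To check the pattern outside the editor, here is a small sketch using Python's re module, whose syntax agrees with VSCode's search for this simple pattern. The leading ^ is left out so the attribute is found anywhere in a line; the sample HTML is made up for illustration.

```python
import re

# Made-up sample markup containing the four cases from the question.
html = '''<button aria-label="Apple">A</button>
<button aria-label="Oranges">O</button>
<a href="#" aria-label="Multiple Fruit Names">M</a>
<span aria-label=""></span>'''

# The attribute name, an opening quote, any run of non-quote
# characters (possibly empty), and the closing quote.
ATTR = re.compile(r'aria-label="[^"]*"')

matches = ATTR.findall(html)
for m in matches:
    print(m)
```

Note that the pattern aria-[a-zA-Z]*="[A-Za-z\s]*" from the other answer is broader on the attribute name but narrower on the value: it would skip values that contain digits or punctuation.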

MySQL: Search for words that may have interferring characters inbetween

I store lyrics of songs and also allow chords to be added by putting them between square brackets (e.g: [Dm7]). Here's an example of lyrics stored in my database:
Left my fear [Dm7]by the side of the [B]road
Hear You[C] speak won't let[E] go
Fall to my knees
...
What I want to do is search for lyrics in songs. For example, I might want to search for the lyrics "fear by the side". The problem is that the [Dm7] in my example above does not allow a simple LIKE search.
Is it possible to do a search (REGEX?) that excludes text such as [Dm7] from a query? If so how? Please note that the chords between the square brackets can vary.
You might like to consider a fulltext index, and then use match() against() in your where clause. Example:
create fulltext index ftx on songs(lyrics);
select *
from songs
where match(lyrics) against('fear by the side');
The matching is a little fuzzy, and you can't use the boolean mode matching because the chords don't have whitespace on both sides, but the normal mode should be sufficient.
The 'fuzziness' of the match can be used to provide a match ranking, which works best on English-language text, as this seems to be. For example:
select match(lyrics) against('fear by the side') as relevance,
lyrics from songs
where match(lyrics) against('fear by the side')
order by match(lyrics) against('fear by the side') desc;
Would sort the results by best match, and also return the matching rank.
The fulltext index also has a boolean mode, which, as the name suggests, can be used to force the results to include or exclude certain words, like so:
match(column) against('+word -otherword' in boolean mode) would return all rows for which column contains word but does not have otherword.
Your fulltext index can also be multi-column, if you desire.
Thanks to #SvenB and his suggestion of this post, this was my answer.
REPLACE(col, SUBSTRING(col, (LOCATE('[', col)), LOCATE(']', col) - (LOCATE('[', col)) + 1), '') LIKE '%fear by the side%'
It's a bit messy but works! I think in the long term FULLTEXT search is the way to go, based on others' comments.
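One caveat with the REPLACE expression: LOCATE finds only the first "[" and "]", so only the first chord (and identical copies of it) is removed. A phrase that spans a later, different chord, such as "side of the road" in the example, would still fail to match. A minimal sketch of stripping every bracketed chord before searching, done here in Python on the application side, is below. The function names are assumptions for illustration; MySQL 8.0 also offers REGEXP_REPLACE, which may allow a similar approach inside the query.

```python
import re

# A chord is anything between square brackets, e.g. [Dm7] or [B].
CHORD = re.compile(r"\[[^\]]*\]")


def strip_chords(lyrics: str) -> str:
    """Remove every bracketed chord from the lyrics."""
    return CHORD.sub("", lyrics)


def contains_phrase(lyrics: str, phrase: str) -> bool:
    """Case-insensitive phrase search that ignores chords."""
    return phrase.lower() in strip_chords(lyrics).lower()


lyrics = (
    "Left my fear [Dm7]by the side of the [B]road\n"
    "Hear You[C] speak won't let[E] go\n"
    "Fall to my knees"
)

print(strip_chords(lyrics))
print(contains_phrase(lyrics, "fear by the side"))
print(contains_phrase(lyrics, "side of the road"))
```

If searches are frequent, storing a chord-free copy of the lyrics in a second column would avoid redoing the stripping on every query.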

Full text search is not working if search for characters anywhere in the string

I have created a full-text search index on the ClientReference column, but it doesn't work if I want to search for characters appearing anywhere in the string.
String = ' abcdef '
This won't work:
SELECT * FROM Proposals
WHERE CONTAINS([ClientReference], '"*bc*"')
But it works if I use a prefix:
SELECT * FROM Proposals
WHERE CONTAINS([ClientReference], '"a*"')
ADDED
Someone has just mentioned that "it is not possible. You can only search based on words, not on letters within a word."
So why does the following work and find '223' anywhere in the string?
select ClientReference1 from ClientReferences
where CONTAINS([ClientReference1], '"*223*"')
If you don't have lots of text and/or lots of rows (millions+), you may be better served just using LIKE instead of CONTAINS.
SELECT * FROM Proposals WHERE ClientReference LIKE '%re%'
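To illustrate the LIKE approach, here is a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for SQL Server (an assumption made only so the example runs anywhere); the sample rows and the helper name search_anywhere are made up.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Proposals (ClientReference TEXT)")
conn.executemany(
    "INSERT INTO Proposals VALUES (?)",
    [("abcdef",), ("xyz223q",), ("bcd",), ("nomatch",)],
)


def search_anywhere(fragment: str):
    """Return references containing `fragment` at any position."""
    rows = conn.execute(
        "SELECT ClientReference FROM Proposals "
        "WHERE ClientReference LIKE ? ORDER BY ClientReference",
        (f"%{fragment}%",),  # leading and trailing wildcard
    ).fetchall()
    return [r[0] for r in rows]


print(search_anywhere("bc"))
print(search_anywhere("223"))
```

A pattern with a leading % generally cannot use an ordinary index, which is why this is suggested only for modest row counts. A fragment that itself contains % or _ would also be treated as a wildcard unless it is escaped.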

Formatting a String Array to Display to Users

What is the best format to communicate an array of strings in one string to users who are not geeks?
I could do it like this:
Item1, Item2, Item3
But that becomes meaningless when the strings contain spaces and commas.
I could also do it this way:
"Item1", "Item2", "Item3"
However, I would like to avoid escaping the array elements because escaped characters can be confusing to the uninitiated.
Edit: I should have clarified that I need the formatted string to be one-line. Basically, I have a list of lists displayed in a .Net Winforms ListView (although this question is language-agnostic). I need to show the users a one-line "snapshot" of the list next to the list's name in the ListView, so they get a general idea of what the list contains.
You can pick a character like the pipe (|), which is not used much outside programs. It is also used in wiki markup for tables, which may make it intuitive to those who are familiar with wiki markup.
Item1| Item2| Item3
In a GUI or color TUI, shade each element individually. In a monochrome TUI, add a couple of spaces and advance to the next tab position (\t) between each word.
Using JSON, the above list would look like:
'["Item1", "Item2", "Item3"]'.
This is unambiguous and a syntax in widespread use. Just explain the nested syntax a little bit and they'll probably get it.
Of course, if this is to be displayed in a UI, then you don't necessarily want unambiguous syntax as much as you want it to actually look like something intended for the end user. In that case it would depend exactly how you are displaying this to the user.
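As a quick illustration of the JSON option, here is a minimal sketch using Python's standard json module. The sample items are made up to include a comma and quotes.

```python
import json

# Made-up items, including a comma and embedded double quotes.
items = ["Item1", "Item, with comma", 'Say "hi"']

line = json.dumps(items)  # one-line, unambiguous representation
print(line)

# The original list can be recovered exactly from the string.
restored = json.loads(line)
```

The embedded quotes come out escaped with backslashes, which is the trade-off the question mentions: unambiguous for programs, but possibly confusing for non-technical readers.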
Display each element as a cell in a table.
How about line breaks after each string? :>
Display each string on a separate line, with line numbers:
1. Make a list
2. Check it twice
3. Say something nice
It's the way people write lists in the real world, y'know :)
Use some kind of typographical convention, for example a bold hashmark and space between strings.
milk # eggs # bread # apples # lettuce # carrots
CSV. Because the very first thing your non-technical user is going to do with delimited data is import it into a spreadsheet.
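For the CSV option, a minimal sketch with Python's standard csv module shows how items containing commas or quotes end up on a single line. The helper name to_csv_line and the sample items are assumptions for illustration.

```python
import csv
import io


def to_csv_line(items) -> str:
    """Format a list of strings as one CSV record without a trailing newline."""
    buf = io.StringIO()
    csv.writer(buf).writerow(items)
    return buf.getvalue().rstrip("\r\n")


items = ["milk", "eggs, large", 'the "good" bread']
line = to_csv_line(items)
print(line)

# Reading the line back yields the original items.
parsed = next(csv.reader([line]))
```

Only the items that need it are quoted, so simple lists still read as plain comma-separated text.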