How to select rows that start with a digit in Rails? - mysql

I have a page that shows items in an index.
I'm able to get items by letter using the following:
scope :by_letter, lambda { |letter| where("name LIKE '#{letter}%'") }
But I can't figure out an elegant solution for names that start with a number (0-9).
How could I rewrite this or a separate scope that would let me search for names starting with a digit?
EDIT: I'm trying to get all rows that start with 0-9 in one go (not separately for each number).

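This should work. Note that % is a LIKE wildcard, not a regular-expression one, so the pattern is anchored with ^ instead:
scope :starts_with_number, where("name REGEXP '^[0-9]'")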

Jacob, try this slightly rewritten version of what you ended up with:
@letter_merchants = (0..9).map { |d| Merchant.by_letter(d) }
Please note that this only illustrates how awesome a language Ruby is, not how the problem should be solved (it makes too many database calls).

Here's how I ended up doing it:
@letter_merchants = []
(0..9).to_a.each do |digit|
  @letter_merchants |= Merchant.by_letter(digit)
end

One disadvantage of REGEXP is that it can't use indexes. However,
scope :starts_with_number, where("name >= '0' and name < ':'")
can use an index on name. It relies on the characters 0-9 and : being in precisely that order, with nothing in between. That is the case in ASCII and UTF-8, but not in EBCDIC or anything crazy like that.
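The ordering assumption behind that range condition is easy to check outside the database. This is a minimal Python sketch, assuming plain code-point comparison (MySQL's actual ordering depends on the column's collation, so this only illustrates the idea). The sample names are made up.

```python
# ':' is the character immediately after '9' in ASCII/UTF-8, so the
# half-open range ['0', ':') covers exactly the strings that start with 0-9.
assert ord(":") == ord("9") + 1


def starts_with_digit(name):
    # Same shape as the SQL condition: name >= '0' AND name < ':'
    return "0" <= name < ":"


# Hypothetical sample data, not from the original question.
names = ["7-Eleven", "99 Cent Store", "Acme", "3M", ":colon", "/slash", ""]
digit_names = [n for n in names if starts_with_digit(n)]
print(digit_names)
```

Because the condition is a plain range, a database can satisfy it with an index range scan, which is the advantage over REGEXP.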

Related

mySQL replace string + additional string with a static value

Unlike PHP, I don't believe mySQL has any preg_replace() feature, only matching via REGEXP. Here are the strings I have in the code:
http://ourcompany.com/theapplestore/...
http://ourcompany.com/anotherstore/...
http://ourcompany.com/yetanotherstore/...
As you can see, there is a constant in there, http://ourcompany.com/, but there is also a variable string namely theapplestore, anotherstore, etc. etc.
I want to replace the constant string, plus the variable string(s), and then the trailing slash (/) after the variable string(s), with a single shortcode value, namely {{store url=''}}
EDIT
If it helps, the store codes are always the same length, they are going to be
sch131785
sch185399
sch634019
etc.
i.e., they are all 9 characters long
How would I do this? Thanks.
I thought this might be useful: there is currently NO WAY to do this in mysql. Find using REGEXP, yes; replace, no. That said, there is another post with an extension library mentioned, sagi:
Is there a MySQL equivalent of PHP's preg_replace?
MariaDB-10.0.5 has REGEXP_REPLACE(), REGEXP_INSTR() and REGEXP_SUBSTR()
You can use the following regex (the dot is escaped so that it matches only a literal "."):
(ourcompany\.com\/\w+\/)
Uses the concept of Group Capture
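If the replacement can be done in application code rather than in SQL, the same pattern works with an ordinary regex library. This is a rough Python sketch using the URLs from the question. The function name to_shortcode is made up for illustration, and the elided "..." tails are kept exactly as they appear in the question.

```python
import re

# Constant prefix, one store segment, and the trailing slash.
# The dots are escaped so they match only a literal '.'.
STORE_URL = re.compile(r"http://ourcompany\.com/\w+/")


def to_shortcode(text):
    # Replace every matching store URL prefix with the shortcode.
    return STORE_URL.sub("{{store url=''}}", text)


urls = [
    "http://ourcompany.com/theapplestore/...",
    "http://ourcompany.com/anotherstore/...",
    "http://ourcompany.com/yetanotherstore/...",
]
converted = [to_shortcode(u) for u in urls]
for line in converted:
    print(line)
```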

What type of programming is this?

I just finished taking an entry placement test for computer science in college. I passed, but missed a bunch of questions in a specific category: variable assignment. I want to make sure I understand this before moving on.
It started out with easy things, like "set age equal to 18"
int age = 18, pretty simple
But then it had a question that I had no clue how to approach. It went something like...
"Determine if character c is in alphabet and assign to a variable"
I could easily do that with a function, but the issue is, it gave me literally a line to write my entire answer (so about 50 characters max). Here is how the answer box looked:
My first thought was to do something like
in_alphabet = function(c) {
var alphabet = ["a", "b" ... "z"]
if(alphabet.indexOf(c) != -1)
return true;
}
But this solution has two issues:
How can I set the "c" value when the whole function is equal to in_alphabet?
I can't fit this into the small answer box. I am 99% sure they were looking for something else. Does anybody know what they were looking for? I can't think of a one line solution for this
Language doesn't matter (although a solution in java/c++ would be preferred). I would appreciate any guidance (doesn't have to be a solution, I just don't even know where to begin)
The question "Determine if character c is in alphabet and assign to a variable" does not ask you to create a function (although in many languages this would be the best way to do this).
In R you could do something like:
inAlphabet <- c %in% letters
So you can certainly do it in one line in some real-world languages. Note that letters is a built-in character vector of the lowercase letters.
Here's a VBA solution that returns C in the variable:
LetterC = Mid("ABCDEFGHIJKLMNOPQRSTUVWXYZ", InStr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "C"), 1)
Is that what you're after?
Many languages have a data type that represents a single character, and they often can be compared using binary operators like < > <= >=, wherein the characters are compared numerically.
So something like this should suffice:
in_alphabet = c >= 'a' && c <= 'z'
And some languages already have built in methods to do things similar to this (e.g., Character.isLetter).
Adapted from How to check if character is a letter in Javascript?
in_alphabet = c.length === 1 && /[a-z]/i.test(c)
In Java, Character.isLetter(c)
In .NET, Char.IsLetter(c)
Perhaps you were being tested on knowledge of basic data types and some of the facilities they provide.
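For comparison, here is the same one-line assignment as a small Python sketch. It assumes "alphabet" means the ASCII letters a-z and A-Z, and the helper name is_ascii_letter is made up for illustration.

```python
import string

c = "q"  # hypothetical input character

# The one-line answer: True only for a single ASCII letter.
in_alphabet = len(c) == 1 and c in string.ascii_letters


def is_ascii_letter(ch):
    # Same expression wrapped in a helper so it can be reused.
    # The length check matters: "ab" in string.ascii_letters is True,
    # because 'in' on strings is a substring test.
    return len(ch) == 1 and ch in string.ascii_letters


print(in_alphabet)
```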

NLTK letter 'u' in front of text result?

I'm learning NLTK with a tutorial and whenever I try to print some text contents, it returns with 'u' in front of it.
In the tutorial it looks like this,
firefox.txt Cookie Manager: "Don't allow sites that set removed cookies to se...
But in my result, it looks like this
(u'firefox.txt', u'Cookie Manager: "Don\'t allow sites that set removed cookies to se', '...')
I am not sure why. I followed exact way the tutorial is explaining. Can someone help me understand this problem? Thank you!
That leading u just means that the string is Unicode. All strings are Unicode in Python 3. The parentheses mean that you are dealing with a tuple. Both will go away if you print the individual elements of the tuple, as with t[0], t[1], and so on (assuming that t is your tuple).
If you want to print the whole tuple as a whole, removing u's and parentheses, try the following:
print " ".join (t)
As mentioned in the other answer, the leading u just means that the string is Unicode. str() can be used to convert unicode to str, but there doesn't seem to be a direct way to convert all the values in a tuple from unicode to str.
You can define a simple function like the one below and use it whenever you refer to a tuple in NLTK:
>>> def str_tuple(t, encoding="ascii"):
...     return tuple([i.encode(encoding) for i in t])
...
>>> str_tuple(nltk.corpus.gutenberg.fileids())
('austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt', 'bryant-stories.txt', 'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt', 'chesterton-brown.txt', 'chesterton-thursday.txt', 'edgeworth-parents.txt', 'melville-moby_dick.txt', 'milton-paradise.txt', 'shakespeare-caesar.txt', 'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt', 'whitman-leaves.txt')
I guess you are using Python 2.6 or some other version before 3.0.
In Python 2, str and unicode are separate types that support many of the same operations. Python 2 converts between them implicitly in some cases, relying on the default encoding, which on most platforms is ASCII. That is probably what causes your problem. Here are two ways that may solve it:
First, encode each value explicitly. For example:
>>> for name in nltk.corpus.gutenberg.fileids():
...     print(name.encode('utf-8'))
The other way is to upgrade to Python 3.0 or later (recommended), where all strings are Unicode. The change is described in detail here:
https://docs.python.org/release/3.0.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit
Hope this helps you.
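The difference between printing a tuple and printing its elements can be seen without NLTK. This is a small Python 3 sketch that uses the tuple from the question as a literal. In Python 3 there is no u prefix, but the quotes, escapes and parentheses still appear, because printing a tuple shows the repr() of each element.

```python
# The tuple from the question, written as plain Python 3 strings.
t = (
    "firefox.txt",
    'Cookie Manager: "Don\'t allow sites that set removed cookies to se',
    "...",
)

# Printing the tuple shows the repr() of every element.
print(t)

# Joining the elements gives the plain display text the tutorial shows.
line = " ".join(t)
print(line)
```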

How to compile a complete list of MySQL "Words"

Really getting into MySQL and one thought I've had on mastering one aspect of it is to gather a complete listing of MySQL words. One example of this might be the Reserved Words list, though it appears that's not a complete list; example: CONCAT, CRC32, etc.
Bizarre as it may seem, I was thinking that such a list might exist, or that there might even be a query that would yield it, and/or a way to extract it from the source code of MySQL.
It is a non-scientific method, but what I would do is:
Extract all strings from Native_func_registry func_array. Look for it in sql/item_create.cc, e.g. in
http://bazaar.launchpad.net/~mysql/mysql-server/mysql-trunk/view/head:/sql/item_create.cc
Those should cover the built-in functions.
Extract the strings from 'symbols' and 'functions' in the lexer:
http://bazaar.launchpad.net/~mysql/mysql-server/mysql-trunk/view/head:/sql/lex.h
Extract the symbols from the bison input http://bazaar.launchpad.net/~mysql/mysql-server/mysql-trunk/view/head:/sql/sql_yacc.yy from lines of the form
%token SOMETOKEN
except when the tokens have a _SYM suffix (those are covered by sql/lex.h).
Combine all of those, and the resulting set might come near :)
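The last two steps could be scripted roughly like this in Python. This is a sketch only: the grammar excerpt is a hypothetical sample written in the style of a bison file and is not copied from sql_yacc.yy. The function names are made up, and the real file may need a more careful pattern.

```python
import re

# Hypothetical excerpt in the style of a bison grammar (not the real file).
GRAMMAR_SAMPLE = """\
%token  ABORT_SYM
%token  ACCOUNT_SYM
%token  ADDDATE
%token  <lexer.keyword> BIT_AND
%left   JOIN_SYM
%token  CAST_SYM
%token  CURDATE
"""

# A %token line, an optional <type> tag, then the token name.
TOKEN_LINE = re.compile(r"^%token\s+(?:<[^>]*>\s+)?(\w+)", re.MULTILINE)


def extract_tokens(grammar_text):
    # Keep %token names, skipping the _SYM ones (covered by sql/lex.h).
    names = TOKEN_LINE.findall(grammar_text)
    return sorted({n for n in names if not n.endswith("_SYM")})


def combine(*word_lists):
    # Merge word lists from all sources into one de-duplicated, sorted list.
    words = set()
    for word_list in word_lists:
        words.update(w.upper() for w in word_list)
    return sorted(words)


tokens = extract_tokens(GRAMMAR_SAMPLE)
all_words = combine(["concat", "crc32"], tokens)
print(all_words)
```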

How should substring() work?

I do not understand why Java's [String.substring() method](http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int,%20int%29) is specified the way it is. I can't tell it to start at a numbered-position and return a specified number of characters; I have to compute the end position myself. And if I specify an end position beyond the end of the String, instead of just returning the rest of the String for me, Java throws an Exception.
I'm used to languages where substring() (or substr()) takes two parameters: a start position, and a length. Is this objectively better than the way Java does it, and if so, can you prove it? What's the best language specification for substring() that you have seen, and when if ever would it be a good idea for a language to do things differently? Is that IndexOutOfBoundsException that Java throws a good design idea, or not? Does all this just come down to personal preference?
There are times when the second parameter being a length is more convenient, and there are times when the second parameter being the "offset to stop before" is more convenient. Likewise there are times when "if I give you something that's too big, just go to the end of the string" is convenient, and there are times when it indicates a bug and should really throw an exception.
The second parameter being a length is useful if you've got a fixed length of field. For instance:
// C#
String guid = fullString.Substring(offset, 36);
The second parameter being an offset is useful if you're going up to a delimiter:
// Java
int nextColon = fullString.indexOf(':', start);
if (nextColon == -1)
{
    // No colon found: handle error
}
else
{
    String value = fullString.substring(start, nextColon);
}
Typically, the one you want to use is the opposite to the one that's provided on your current platform, in my experience :)
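Both conventions are easy to express on top of each other. This rough Python sketch shows the two styles side by side. The function names and the sample record are made up for illustration.

```python
def substring_by_length(s, start, length):
    # (start, length) convention, as in the C# example above.
    return s[start:start + length]


def substring_to_delimiter(s, start, delimiter):
    # (start, end) convention: stop just before the next delimiter.
    end = s.find(delimiter, start)
    if end == -1:
        raise ValueError("delimiter %r not found after index %d" % (delimiter, start))
    return s[start:end]


record = "id:3f2c:rest"  # hypothetical input

fixed_width = substring_by_length(record, 3, 4)
delimited = substring_to_delimiter(record, 3, ":")
print(fixed_width, delimited)
```

Here both calls return the same field, but the first needs to know the field width in advance and the second needs to know the delimiter.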
"I'm used to languages where substring() (or substr()) takes two parameters: a start position, and a length. Is this objectively better than the way Java does it, and if so, can you prove it?"
No, it's not objectively better. It all depends on the context in which you want to use it. If you want to extract a substring of a specific length, it's bad, but if you want to extract a substring that ends at, say, the first occurrence of "." in the string, it's better than if you first had to compute a length. The question is: which requirement is more common? I'd say the latter. Of course, the best solution would be to have both versions in the API, but if you need the length-based one all the time, using a static utility method isn't that horrible.
As for the exception, yeah, that's definitely good design. You asked for something specific, and when you can't get that specific thing, the API should not try to guess what you might have wanted instead - that way, bugs become apparent more quickly.
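The trade-off between failing fast and guessing can be sketched in Python, where slicing clamps out-of-range indices silently. The strict variant below mimics the bounds rule described for Java's substring(begin, end). Both function names are made up for illustration.

```python
def substring_lenient(s, begin, end):
    # Python slicing clamps: an end past the string just means "to the end".
    return s[begin:end]


def substring_strict(s, begin, end):
    # Reject indices outside the string instead of guessing what was meant.
    if begin < 0 or end > len(s) or begin > end:
        raise IndexError(
            "range [%d, %d) is outside a string of length %d" % (begin, end, len(s))
        )
    return s[begin:end]


print(substring_lenient("hello", 2, 99))
print(substring_strict("hello", 2, 5))
```

With the strict version, an off-by-one in the caller shows up as an error at the call site. With the lenient one, the same mistake quietly yields a shorter string.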
Also, Java DOES have an alternative substring() method that returns the substring from a start index until the end of the string.
The second parameter should be optional, and the first parameter should accept negative values.
If you leave off the second parameter, it will go to the end of the string for you without you having to compute it.
Having gotten some feedback, I see when the second-parameter-as-index scenario is useful, but so far all of those scenarios seem to be working around other language/API limitations. For example, the API doesn't provide a convenient routine to give me the Strings before and after the first colon in the input String, so instead I get that String's index and call substring(). (And this explains why the second position parameter in substr() overshoots the desired index by 1, IMO.)
It seems to me that with a more comprehensive set of string-processing functions in the language's toolkit, the second-parameter-as-index scenario loses out to second-parameter-as-length. But somebody please post me a counterexample. :)
If you store this away, the problem should stop plaguing your dreams and you'll finally achieve a good night's rest:
public String skipsSubstring(String s, int index, int length) {
    // Java's method is substring (lowercase s); convert (start, length) to (start, end)
    return s.substring(index, index + length);
}