I'm having the following situation:
I have a table with a list of postcodes with the format:
1234 AA (Dutch postcode)
2345 ZF
B-2345 (Belgium postcode)
B-4355
I have another table which contains postcode ranges:
| PostcodeFrom | PostcodeTo |
| 1000 AF | 1999 ZX |
| 2000 ZF | 2999 ZF |
| B-1234 | B-1889 |
I am looking for a way to find out which of these ranges a given postcode falls into.
First I was thinking of
SUBSTRING(MyPostcode,1,4) BETWEEN SUBSTRING(PostcodeFrom,1,4) AND SUBSTRING(PostcodeTo,1,4)
... but then there is still the problem with the letters (not to mention the Belgian postcodes as well).
Could anyone help me?
Thanks for your reply!
The table you drew needs one more field: RegionCode.
RangeTable:
| RCode | PCodeFrom | PCodeTo |
| 001 | 1000 BA | 1999 ZZ |
| 002 | 1000 AA | 1999 AZ |
Notice that if a postcode is 1234 AC, it must return RegionCode 002. Comparing numbers is not hard, but how do I compare characters? I had an idea of making a table with AA - ZZ where each combination has a certain INT value, but I hope there is another, easier way.
You can only do this reliably (ignoring the potential un-reliability of doing this sort of range matching with postcodes) by splitting the portions of the postcode into different columns by character type.
I don't know much about Dutch postcodes, but if your formats are correct, you could create a table like:
+-------+------+
| code | city |
+-------+------+
| 1234 | AA |
+-------+------+
Splitting the postcodes up will allow you to do more fine-grained sorting.
Update:
Having looked at the Wikipedia page on Dutch postcodes it looks like this should work for all of them. My labels of code and city are inaccurate though.
Aside: I'm impressed that the Netherlands has such a sane postcode format, unlike the UK one where you need a huge regex to even decide if the format is valid.
Update 2:
Your checking will work with characters too, but you'll be better off storing the postcodes in a separate table, with an ID. The example above was just to show splitting up the characters from the numbers, so what you'll actually want is more like:
mysql> select * from postcodes;
+------+-------+-------+
| id | part1 | part2 |
+------+-------+-------+
| 1 | 1234 | AA |
| 2 | 5678 | BB |
+------+-------+-------+
When you're storing the ranges, don't store the postcodes in the ranges table, store the id for the entry in the postcodes table, like:
mysql> select * from ranges;
+-------------+---------------+-------------+
| region_code | postcode_from | postcode_to |
+-------------+---------------+-------------+
| 1 | 1 | 2 |
+-------------+---------------+-------------+
That record says "region 1 is 1234 AA to 5678 BB"
For an example, I'll say that postcodes start 0001 AA, then move to 0001 AB, all the way to 0001 ZZ, then 0002 AA and so on. This obviously isn't right but it demonstrates the theory. You need to substitute this for the algorithm you're using to define how postcodes are incremented and decremented.
When you want to find out "does postcode 3456 XY fit into region 89?", you split it into its number and letter portions, and check whether the values fit into the range. Using my algorithm, I check:
Is the number portion greater than or equal to the number portion of postcode_from?
If so, is it less than or equal to the number portion of postcode_to?
If both conditions hold, check the letters wherever the number equals one of the boundary numbers. This is the important bit: MySQL's character set collation does allow you to ask "is AB less than BC?", so you can have:
WHERE 'AB' < part2;
in your WHERE clause.
Using this method, you can figure out which of your regions has a start and an end that fit the value you're testing.
It's a bit long-winded but it will work without doing any conversions. You may need to check that the collation you're using fits the way the lettering sequence works for the specific type of postcode you're using though.
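The check described above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python with an in-memory SQLite database standing in for MySQL, the table layout (`ranges` with `num_from`, `let_from`, `num_to`, `let_to`) and the region codes `R1`/`R2` are hypothetical, and the range bounds are stored as split columns directly rather than as ids into a postcodes table, just to keep the query short. It also assumes the illustrative ordering used above (0001 AA, 0001 AB, ... 0001 ZZ, 0002 AA).

```python
import sqlite3

def split_postcode(postcode):
    """Split a Dutch-style postcode such as '1234 AC' into (1234, 'AC')."""
    number, letters = postcode.split()
    return int(number), letters.upper()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ranges (region_code TEXT, num_from INTEGER, let_from TEXT,"
    " num_to INTEGER, let_to TEXT)"
)
# Hypothetical regions built from the ranges in the question.
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?, ?, ?)",
    [("R1", 1000, "AF", 1999, "ZX"), ("R2", 2000, "ZF", 2999, "ZF")],
)

# The letters only decide the outcome when the number sits on a boundary.
LOOKUP = """
SELECT region_code FROM ranges
WHERE (num_from < :n OR (num_from = :n AND let_from <= :l))
  AND (num_to > :n OR (num_to = :n AND let_to >= :l))
ORDER BY region_code
"""

def find_regions(postcode):
    """Return the codes of every region whose range contains the postcode."""
    number, letters = split_postcode(postcode)
    rows = conn.execute(LOOKUP, {"n": number, "l": letters})
    return [code for (code,) in rows]
```

For instance, `find_regions("1234 AA")` returns `['R1']`, while `find_regions("1000 AA")` returns an empty list because AA sorts before the AF lower bound. If the letters are meant to be an independent range (as the RegionCode example in the question suggests), the WHERE clause would instead be two separate BETWEEN conditions, one on the number and one on the letters.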
Good day, PHP newbie here.
I use phpMyAdmin with MySQL. My problem is that I don't know what I should put in the circled field shown in the picture below (or what it is and how to use it).
So far I have left it empty. It shows up whenever I create a primary key or unique key on a table. Is this what they call the index size? I tried searching for it online and looked at other tutorials, but I don't see it mentioned anywhere (maybe I'm googling it wrong?).
So what does this do?
What value should I put here?
What is the default value?
When using unique, what do veterans put as the index name?
I hope you can enlighten me, because it's quite vague to me now that I'm studying this on my own. Thanks :)
The index size is quite simple. Imagine that you create an index on a VARCHAR(16) column. In this case each index entry is created with 16 characters. Now it may be that the strings already differ within their first few characters. In such a case, the length of the index can be set to, e.g., 8.
This makes the index shorter; it uses less memory and is therefore faster. If there are several entries in the column that are the same in the first 8 characters, all these rows are found via the index, and the row that really fits is then determined by comparing the individual rows. So if the number of entries found is very high, the whole thing will be slower.
To check how many entries in the table would be equal under a shorter index:
+----+-----+---------------------------------------------------+
| id | rev | content |
+----+-----+---------------------------------------------------+
| 2 | 1 | One hundred angels can dance on the head of a pin |
| 1 | 1 | The earth is flat |
| 3 | 2 | The earth is flat and rests on a bull's horn |
| 5 | 5 | The earth is flat type |
| 4 | 3 | X The earth is like a ball. |
+----+-----+---------------------------------------------------+
SELECT d.*,count(*) as cnt
FROM docs as d
GROUP BY SUBSTRING(d.content,1,8)
ORDER BY cnt DESC;
+----+-----+---------------------------------------------------+-----+
| id | rev | content | cnt |
+----+-----+---------------------------------------------------+-----+
| 1 | 1 | The earth is flat | 3 |
| 2 | 1 | One hundred angels can dance on the head of a pin | 1 |
| 4 | 3 | X The earth is like a ball. | 1 |
+----+-----+---------------------------------------------------+-----+
3 rows in set (0.00 sec)
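One way to compare candidate prefix lengths is to compute, for each length, the share of rows the prefix still tells apart. The sketch below assumes the `docs` table shown above and uses Python with an in-memory SQLite database as a stand-in for MySQL; `prefix_selectivity` is a hypothetical helper name. In MySQL the same idea would be written with `COUNT(DISTINCT LEFT(content, n)) / COUNT(*)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        (1, 1, "The earth is flat"),
        (2, 1, "One hundred angels can dance on the head of a pin"),
        (3, 2, "The earth is flat and rests on a bull's horn"),
        (4, 3, "X The earth is like a ball."),
        (5, 5, "The earth is flat type"),
    ],
)

def prefix_selectivity(length):
    """Distinct prefixes of `length` characters divided by the row count."""
    distinct, total = conn.execute(
        "SELECT COUNT(DISTINCT substr(content, 1, ?)), COUNT(*) FROM docs",
        (length,),
    ).fetchone()
    return distinct / total

# Print the selectivity for a few candidate prefix lengths.
for n in (8, 18, 19):
    print(n, prefix_selectivity(n))
```

A value close to 1.0 means the prefix separates almost every row, so a longer index would add little; a low value means many rows share the prefix and have to be compared individually.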
Alright, for starters, I am not very experienced with MySQL.
The situation is as follows:
I have a data table with zip code records. I need a query that finds the correct row with a zip code and number, where the table has a number range, something like this.
Zipcode | NumberLower | NumberUpper | Street name
1234AB | 10 | 20 | Imaginary Drive
1234AB | 30 | 40 | Fantasy Street
7261XY | 2 | 4 | Rainbow Road
My current query is
SELECT * FROM zipcodetable WHERE zipcode="1234AB"
which returns the first two rows, as expected. What query should I use if I want to find the street name for the address with zipcode 1234AB and number 34?
Add a BETWEEN clause:
SELECT *
FROM zipcodetable
WHERE Zipcode='1234AB'
AND 34 BETWEEN NumberLower AND NumberUpper
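To try the query against the sample rows, here is a small sketch in Python using an in-memory SQLite database in place of MySQL. The column name `StreetName` (without the space) and the helper `street_for` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zipcodetable (Zipcode TEXT, NumberLower INTEGER,"
    " NumberUpper INTEGER, StreetName TEXT)"
)
conn.executemany(
    "INSERT INTO zipcodetable VALUES (?, ?, ?, ?)",
    [
        ("1234AB", 10, 20, "Imaginary Drive"),
        ("1234AB", 30, 40, "Fantasy Street"),
        ("7261XY", 2, 4, "Rainbow Road"),
    ],
)

def street_for(zipcode, number):
    """Return the street whose number range contains `number`, or None."""
    row = conn.execute(
        "SELECT StreetName FROM zipcodetable"
        " WHERE Zipcode = ? AND ? BETWEEN NumberLower AND NumberUpper",
        (zipcode, number),
    ).fetchone()
    return row[0] if row else None
```

BETWEEN is inclusive on both ends, so numbers 30 and 40 both resolve to Fantasy Street, while 25 falls in the gap between the two ranges and finds nothing.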
Given is data which contains a period of time, spanning years. Just like this:
| ID | Name | Alive |
|----|--------------------|-----------------------|
| 1 | Washington, George | 1732-02-22/1799-12-14 |
| 2 | Adams, John | 1735-10-30/1826-07-04 |
| 3 | Jefferson, Thomas | 1743-04-13/1826-07-04 |
…
Is it possible to store this data in MySQL in a way that a search for an intermediate date (over all fields, just a year), like the search term 1788, yields results?
What I am looking for is something like this:
CREATE TABLE t (
id INT NOT NULL,
name VARCHAR(30),
alive DATERANGE
);
SELECT * FROM t WHERE * LIKE '%1788%'
The only solution I see is to add another column which contains a list of years (1732,1733,…), but I guess there are better solutions. Do I need the date in one field or two, and what’s the column type I need for this to work? Can I have underspecified date ranges in that column (such as 1155/1227) or do I have to rewrite them before insert (like 1155-01-01/1227-12-31)?
Border matches shall be returned as well. A search for the string 1799 should still return George Washington, even though he was not alive from 1st of January until 31st of December inclusively. I guess this is rather simple since it is a string match already.
If you can edit your data, I suggest changing it to separate Born and Died fields. If not, you can use the LEFT and INSTR functions for Born and the SUBSTRING_INDEX function for Died:
SELECT ID, Name, Alive,
LEFT(Alive, INSTR(Alive, '/') - 1) AS Born,
SUBSTRING_INDEX(Alive, '/', -1) AS Died
FROM t
Which will split out Born and Died dates:
| ID | Name | Alive | Born | Died |
|----|--------------------|-----------------------|------------|------------|
| 1 | Washington, George | 1732-02-22/1799-12-14 | 1732-02-22 | 1799-12-14 |
| 2 | Adams, John | 1735-10-30/1826-07-04 | 1735-10-30 | 1826-07-04 |
| 3 | Jefferson, Thomas | 1743-04-13/1826-07-04 | 1743-04-13 | 1826-07-04 |
Then you can use:
WHERE Alive LIKE '%1788%'
To search dates.
Or individually as Born:
WHERE LEFT(Alive, INSTR(Alive, '/') - 1) LIKE '%1788%'
Died:
WHERE SUBSTRING_INDEX(Alive,'/',-1) LIKE '%1788%'
Or if you just wanted the years in the Born and Died fields use an additional LEFT function:
SELECT ID, Name, Alive,
LEFT(LEFT(Alive, INSTR(Alive, '/') - 1), 4) AS Born,
LEFT(SUBSTRING_INDEX(Alive, '/', -1), 4) AS Died
FROM t
Which would give you:
| ID | Name | Alive | Born | Died |
|----|--------------------|-----------------------|------|------|
| 1 | Washington, George | 1732-02-22/1799-12-14 | 1732 | 1799 |
| 2 | Adams, John | 1735-10-30/1826-07-04 | 1735 | 1826 |
| 3 | Jefferson, Thomas | 1743-04-13/1826-07-04 | 1743 | 1826 |
EDIT:
You can use the BETWEEN operator the other way around for that:
SELECT ID, Name, Alive,
LEFT(LEFT(Alive, INSTR(Alive, '/') - 1), 4) AS Born,
LEFT(SUBSTRING_INDEX(Alive, '/', -1), 4) AS Died
FROM t
WHERE 1788 BETWEEN LEFT(LEFT(Alive, INSTR(Alive, '/') - 1), 4) AND LEFT(SUBSTRING_INDEX(Alive, '/', -1), 4)
Do I need the date in one field or two
Definitely two, birth and death, and use the predicate BETWEEN ... AND ... for your searches. It’s less expensive than splitting a field in two at every query, and it makes better use of indexes.
and what’s the column type I need for this to work
That’s trickier. I would normally agree with the comments saying that you must use date fields, for a variety of well-known good reasons. However, it is obvious from your question that you are interested only in years and effectively disregard the actual dates. Furthermore, you are dealing with historic data that might be incomplete: missing days or even months are usual in this context. Such incomplete dates can be stored in date fields but return NULL on some operations, which might create problems. Also, when you have a date field you cannot create an index on the year, so your queries would all be full table scans. In short, in your particular case, I’d go for SMALLINT UNSIGNED for the years and CHAR(5) to store the less useful month-and-day info, just in case you might need it in the future to build a real date on the fly with CAST(CONCAT(year, '-', month_and_day) AS DATE).
In conclusion, this is the design I propose:
CREATE TABLE t (
id INT NOT NULL,
name VARCHAR(30),
birth_year SMALLINT UNSIGNED,
birth_md CHAR(5),
death_year SMALLINT UNSIGNED,
death_md CHAR(5)
);
CREATE INDEX t_ndx ON t(birth_year, death_year);
SELECT * FROM t WHERE 1788 BETWEEN birth_year AND death_year;
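As a rough sketch of how the existing `born/died` strings could be rewritten before insert, the Python below splits each value into the year and month-day parts of the proposed design and runs the same BETWEEN query. It uses an in-memory SQLite database in place of MySQL; the helpers `split_partial_date`, `split_alive` and `alive_in`, and the placeholder person for the underspecified range 1155/1227, are assumptions for illustration. It also assumes positive years with no era suffix.

```python
import sqlite3

def split_partial_date(text):
    """Split 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (year, rest or None)."""
    year, _, rest = text.partition("-")
    return int(year), (rest or None)

def split_alive(alive):
    """Turn 'born/died' into (birth_year, birth_md, death_year, death_md)."""
    born, died = alive.split("/")
    return split_partial_date(born) + split_partial_date(died)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER NOT NULL, name TEXT, birth_year INTEGER,"
    " birth_md TEXT, death_year INTEGER, death_md TEXT)"
)
people = [
    (1, "Washington, George", "1732-02-22/1799-12-14"),
    (2, "Adams, John", "1735-10-30/1826-07-04"),
    (3, "Jefferson, Thomas", "1743-04-13/1826-07-04"),
    (4, "Example, Underspecified", "1155/1227"),  # hypothetical entry
]
for person_id, name, alive in people:
    conn.execute(
        "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
        (person_id, name, *split_alive(alive)),
    )

def alive_in(year):
    """Names of everyone whose lifespan includes `year` (borders included)."""
    rows = conn.execute(
        "SELECT name FROM t WHERE ? BETWEEN birth_year AND death_year"
        " ORDER BY id",
        (year,),
    )
    return [name for (name,) in rows]
```

With this layout, year-only ranges such as 1155/1227 need no padding to 1155-01-01: the month-day columns are simply left NULL, and border years such as 1799 still match because BETWEEN is inclusive.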
Like @CBroe suggested, you should have two columns instead (startDate & endDate, or bornDate & deathDate). You can then write your query this way:
select * from t where YEAR(startDate) <= 1788 AND YEAR(endDate) >= 1788
I am working for a travel site, where I need to store the tourist spots that tourists have traveled to. I need the spots to be unique in the locations table so that I can know the popularity of a particular spot, etc.
I will also need all countries, states and cities stored on my side, because I cannot depend on user input.
The database is MySQL.
Looking at the data sets available for such locations, I see there is a problem with how cities are nested across countries, which may use provinces, states, counties, etc.
So, my question is how to design the schema so that I can store all the locations.
I was thinking about having tables for countries, states, cities, and spots.
The spots table will contain spot_name, cityId, stateId, countryId, and some fields to hold longitude and latitude bounds.
This way I can identify identical spots by their geopositions.
But again, this solution won't work because of the states/provinces/counties problem.
Can you please suggest how to build the schema and how to seed it with correct data so that dependency on user input is minimal?
You should use a geospatial database, as then you can store your locations like countries and states as spatial entities and so can determine the nesting correctly.
If you can't use one, you can simulate geospatial positions using strings in a normal table by dividing the world up into a grid, then subdividing each square of the grid recursively.
For example, divide the world into 9 squares, numbered 1-9 from top left to bottom right. Anything which is in these large squares has only a single-digit reference. Then divide each square into 9, and anything which is at this level has a 2-digit reference, so 11 is the top left square and 99 is the bottom right square.
Repeat this process until you have the precision that you need. A single feature might have a reference 10 digits long, such as 5624357899, and you would know that it lies inside any larger feature whose shorter reference is a prefix of it, like 5624357. So your countries would have fewer digits because they are larger, but your individual locations would have more because they are smaller and more accurately located.
This will only give you a coarse approximation of location (and will be bad for long thin features) but might be suitable enough.
The first grid will look like this:
______________________________
| | | |
| 1 | 2 | 3 |
| | | |
|_________|_________|_________|
| | | |
| 4 | 5 | 6 |
| | | |
|_________|_________|_________|
| | | |
| 7 | 8 | 9 |
| | | |
|_________|_________|_________|
The second round looks like this (only first square completed for simplicity):
______________________________
|11|12 |13| | |
|---------| 2 | 3 |
|14|15 |16| | |
|---------| | |
|17|18 |19| | |
|_________|_________|_________|
| | | |
| 4 | 5 | 6 |
| | | |
|_________|_________|_________|
| | | |
| 7 | 8 | 9 |
| | | |
|_________|_________|_________|
You repeat this process until you have a fine enough approximation for your purposes.
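The subdivision above could be computed along these lines. This is a minimal sketch in Python, assuming the "world" is the plain latitude/longitude rectangle (90 to -90, -180 to 180) and that squares are numbered 1-9 from top left to bottom right as in the diagrams; `grid_reference` and `contains` are hypothetical helper names.

```python
def grid_reference(lat, lon, digits):
    """Return a string of `digits` digits (each 1-9) locating (lat, lon)."""
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    reference = []
    for _ in range(digits):
        cell_height = (north - south) / 3
        cell_width = (east - west) / 3
        # Row 0 is the top (northern) band, column 0 the left (western) band.
        # min() keeps points on the bottom or right edge inside the grid.
        row = min(int((north - lat) / cell_height), 2)
        col = min(int((lon - west) / cell_width), 2)
        reference.append(str(row * 3 + col + 1))
        # Shrink the bounding box to the chosen cell and subdivide again.
        north -= row * cell_height
        south = north - cell_height
        west += col * cell_width
        east = west + cell_width
    return "".join(reference)

def contains(outer, inner):
    """A larger feature contains a smaller one if its reference is a prefix."""
    return inner.startswith(outer)
```

Because a containing feature's reference is a prefix of the contained one, the nesting test in SQL becomes a prefix match, for example `spot_ref LIKE CONCAT(country_ref, '%')` with hypothetical column names.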
I think the schema part of your problem would be pretty simple. But the real problem is how you would get the data for your user to select - you are imagining the (almost) impossible! I don't think there is any database in existence which would translate a co-ordinate to a place name. Even Google can't (yet) do that for you - for example, a search for "Lat Long Taj Mahal" provides 27.1750, 78.0419 (Google have used their own and other people's experience to tell you that); but a search for "27.1750, 78.0419" just yields a pin on the map, and then our human eyes can see that the pin is 'pretty close' to a place named "Taj Mahal" (or ताज महल in Hindi, or તાજ મહેલ in Gujarati )...
Just imagine - how you would populate your schema? Think about how many co-ordinates you would need in your table if you wanted decent accuracy (needing at least 6 decimal places)! And who would be the authority on place names?
So I think your best approach might be to:
Use the publicly available lists of country/city names translated to their co-ordinates,
Build your app so it pre-populates the closest co-ordinate to the user's precise location, and then
Allow the user to qualify the match with their own (more specific) chosen place name.
Then YOU could store the precise co-ordinate gathered by your app, along with the place name the user specified; and sell the data for $millions! (I suspect Google are already doing this ;)
I have this problem.
One table with.
id | routename | usersid |
1 | route 1 | 1,2,3,5 |
2 | route 2 | 5,20,15 |
4 | route 4 | 10,15,7,5 |
I need to search for, e.g., user id 5 in the usersid column, but I have no idea how to do it, because it occurs in multiple rows.
If you cannot change the schema then you will have to use the REGEXP operator to match on a regular expression. For example
where column REGEXP '(^|,)5(,|$)'
This matches the number 5 either at the beginning or end of the field or surrounded by commas (or any combination thereof), to avoid matching other numbers like 15, 55 or 1234567890.
If the table is large this will perform very slowly as it will require a full table scan
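To see why the pattern matches 5 but not 15 or 55, the same expression can be tried outside the database. The sketch below uses Python's `re` module purely as an illustration (the pattern uses only syntax that both engines share); `has_user` is a hypothetical helper name.

```python
import re

def has_user(usersid, user_id):
    """True if user_id is one of the comma-separated values in usersid."""
    # Same shape as the REGEXP above: the id must be delimited on each side
    # by a comma or by the start/end of the string.
    pattern = r"(^|,)" + re.escape(str(user_id)) + r"(,|$)"
    return re.search(pattern, usersid) is not None

def has_user_by_split(usersid, user_id):
    """Equivalent check that splits the list instead of using a regex."""
    return str(user_id) in usersid.split(",")
```

Both helpers agree on the sample rows: `1,2,3,5`, `5,20,15` and `10,15,7,5` all contain user 5, while a value such as `15,55,1234567890` does not.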
You might be looking for FIND_IN_SET().
select * from Table1
WHERE FIND_IN_SET(5,usersid)