How to import an AWS public dataset into a MySQL instance?

I'm trying to use a public dataset on Amazon AWS called the Twilio/Wigle.net Street Vector Data Set. It contains US street names and address ranges, and its size is about 10 GiB. On Linux, the mounted dataset looks like this:
ubuntu@ip-172-31-xxx-xxx:/data-us-street$ pwd
/data-us-street
ubuntu@ip-172-31-xxx-xxx:/data-us-street$ ll
total 20576
drwxr-xr-x 6 27 sudo 4096 May 19 2009 ./
drwxr-xr-x 23 root root 4096 Apr 20 18:10 ../
-rw-r--r-- 1 root root 8339 May 19 2009 README
drwx------ 2 27 sudo 4096 Mar 18 2009 addresses/
-rw-rw---- 1 27 sudo 5242880 Mar 18 2009 ib_logfile0
-rw-rw---- 1 27 sudo 5242880 Mar 8 2009 ib_logfile1
-rw-rw---- 1 27 sudo 10485760 Mar 18 2009 ibdata1
drwx------ 2 27 sudo 16384 Mar 8 2009 lost+found/
drwx------ 2 27 sudo 4096 Mar 8 2009 mysql/
-rw-rw---- 1 27 sudo 117 Mar 18 2009 mysql-bin.000001
-rw-rw---- 1 27 sudo 19 Mar 18 2009 mysql-bin.index
drwx------ 2 27 sudo 4096 Mar 8 2009 test/
To use its data, I want to link it to a MySQL database on the same host. Can somebody tell me how to do it?
What I've tried
I tried pointing MySQL's storage directory at the dataset by changing the datadir setting in /etc/mysql/my.cnf as follows:
#datadir = /var/lib/mysql
datadir = /data-us-street
I stopped the MySQL server, changed the value, then restarted the server. However, it doesn't work.
README
/*
Copyright (c) 2009 Twilio, Inc.
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
*/
1. The Twilio/Wigle.net public address dataset
This public dataset contains the street data for the U.S., based on
the geodata published by the U.S. Census Bureau's TIGER project.
We have reformatted the data from GIS friendly shapefiles to a more
generally accessible MySQL database.
This dataset covers all the streets, roads, and highways in the U.S.
These streets are represented as "polylines" which are shapes made up
of individual line segments. Each polyline has its own unique ID,
and each segment that makes up the polyline has a sequence number.
The combination of this ID and sequence number is the primary key of
the address table. Each row in the database represents one of these
line segments.
2. Description of Columns
+-------------+------------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+------------+-------+
| id | char(10) | NO | PRI | | |
| seq | char(3) | NO | PRI | | |
| name | char(30) | YES | MUL | NULL | |
| prefix | char(2) | YES | MUL | NULL | |
| type | char(4) | YES | | NULL | |
| startlat | float(12,8)| NO | | 0.00000000 | |
| startlong | float(12,8)| NO | | 0.00000000 | |
| endlat | float(12,8)| NO | | 0.00000000 | |
| endlong | float(12,8)| NO | | 0.00000000 | |
| leftzip | int(5) | YES | MUL | NULL | |
| rightzip | int(5) | YES | MUL | NULL | |
| leftaddr1 | char(11) | YES | | NULL | |
| leftaddr2 | char(11) | YES | | NULL | |
| rightaddr1 | char(11) | YES | | NULL | |
| rightaddr2 | char(11) | YES | | NULL | |
| name_dtmf | char(30) | YES | | NULL | |
| prefix_dtmf | char(2) | YES | | NULL | |
+-------------+------------+------+-----+------------+-------+
id - this is the unique ID of a street, road, or highway polyline,
according to US census blocks
seq - this is the sequence number of this segment in the polyline,
used to correlate 2 segments of the same street within a US
census block
name - the name of this street, road, or highway. example: Main
prefix - the prefix of the street name. examples: N,S,E,W
type - the suffix of the street name. examples: Blvd, St, Rd
startlat/startlong - the lat/long pair of this segment's starting
point
endlat/endlong - the lat/long pair of this segment's ending point
leftzip/rightzip - the zipcode of the addresses on the corresponding
side of the street
leftaddr1/leftaddr2/rightaddr1/rightaddr2 - the starting and ending
address numbers for each side of the street
name_dtmf/prefix_dtmf/type_dtmf - the name, prefix and type columns,
represented as DTMF encoded numbers. For example, (A,B,C) = 2,
(D,E,F) = 3, etc. Useful for telephony applications.
3. Example Segment
--------------------------------------------------------
id: 111710515
seq: 0
name: Main
prefix:
type: St
startlat: 41.49493408
startlong: -87.70324707
endlat: 41.49483490
endlong: -87.70324707
leftzip: 60466
rightzip: 60443
leftaddr1: 21801
leftaddr2: 21805
rightaddr1: 21800
rightaddr2: 21804
name_dtmf: 6246
prefix_dtmf:
--------------------------------------------------------
Name: Main St.
Left Zip: 60466
21801 21805
X------------------------------------>
21800 21804
Right Zip: 60443
4. Common Queries
a. Find the street segment record for 30 Rockefeller Plaza, NY, NY,
10020
SELECT * FROM address WHERE leftzip=10020 AND name LIKE 'Rockefeller'
AND 30 BETWEEN leftaddr1 AND leftaddr2
UNION
SELECT * FROM address WHERE rightzip=10020 AND name LIKE
'Rockefeller' AND 30 BETWEEN rightaddr1 AND rightaddr2
--------------------------------------------------------
id: 59657155
seq: 0
name: Rockefeller
prefix:
type: Plz
startlat: 40.7585
startlong: -73.979
endlat: 40.7592
endlong: -73.9786
leftzip: 10020
rightzip: 10020
leftaddr1: 22
leftaddr2: 38
rightaddr1: 21
rightaddr2: 39
name_dtmf: 7625335537
prefix_dtmf:
--------------------------------------------------------
b. Find all the street names within a zipcode.
SELECT DISTINCT prefix, name, type FROM address WHERE leftzip=10009
OR rightzip = 10009
+--------+----------+------+
| prefix | name | type |
+--------+----------+------+
| | 1st | Ave |
| | Avenue A | |
| E | 13th | St |
| E | 14th | St |
| E | 12th | St |
| E | 11th | St |
| E | 10th | St |
| E | 2nd | St |
| E | 3rd | St |
| E | 4th | St |
| E | 6th | St |
| E | 7th | St |
| | St Marks | Pl |
| E | 9th | St |
| | Avenue B | |
| E | 5th | St |
| | Avenue C | |
| E | 20th | St |
| E | 15th | St |
| E | 16th | St |
| E | 8th | St |
| | Avenue D | |
| | Szold | Pl |
+--------+----------+------+
c. Streets starting with F in a zipcode
SELECT DISTINCT prefix, name, type from address WHERE leftzip = 94117
AND name LIKE 'F%'
UNION
SELECT DISTINCT prefix, name, type from address WHERE rightzip =
94117 AND name LIKE 'F%'
+--------+------------+------+
| prefix | name | type |
+--------+------------+------+
| | Fulton | St |
| | Fell | St |
| | Frederick | St |
| | Farnsworth | Ln |
| | Fillmore | St |
| | Friendship | Ct |
+--------+------------+------+
d. Streets starting with "FRE" as DTMF digits in a zipcode:
SELECT DISTINCT prefix, name, type from address WHERE leftzip = 94117
AND name_dtmf LIKE '373%'
UNION
SELECT DISTINCT prefix, name, type from address WHERE rightzip =
94117 AND name_dtmf LIKE '373%'
+--------+------------+------+
| prefix | name | type |
+--------+------------+------+
| | Fulton | St |
| | Fell | St |
| | Frederick | St |
| | Farnsworth | Ln |
| | Fillmore | St |
| | Friendship | Ct |
+--------+------------+------+
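For reference, the DTMF encoding that the README describes for the name_dtmf and prefix_dtmf columns can be sketched in a few lines of Python. This is only an illustration using the standard telephone keypad layout; the function name to_dtmf is hypothetical, and dropping characters that are not letters is an assumption, since the README does not say how the dataset treats digits or spaces.

```python
# Standard telephone keypad: each digit maps to a group of letters.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_dtmf(text):
    """Encode letters as keypad digits.

    Assumption: characters that are not letters are dropped; the dataset
    may handle them differently.
    """
    return "".join(LETTER_TO_DIGIT.get(ch, "") for ch in text.upper())


print(to_dtmf("Main"))  # the README's example segment stores 6246 for "Main"
print(to_dtmf("FRE"))   # the prefix searched for in query d
```

The second call yields the 373 prefix that query d passes to LIKE.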

Every sub-folder under the root folder /data-us-street represents a database. So to use the addresses database, I need the folder ./addresses:
Copy the target database folder into the data directory of the current MySQL server instance, e.g. /var/lib/mysql
$ cp -r /data-us-street/addresses /var/lib/mysql/addresses
Change the owner of the folder and everything in it recursively
$ chown -R mysql:mysql /var/lib/mysql/addresses
Use the addresses database and enjoy!

Related

mysql weirdly formatting output

mysql is weirdly formatting my output even though the table isn't overflowing with data in any way (only 30-4 rows, and 4 columns).
Is there something I can do to adjust this?
mysql> select id, city, state, zip from location;
+----+----------------+-------+-------+
| id | city | state | zip |
+----+----------------+-------+-------+
| 97227 |and | OR
| 95814 |mento | CA
| 94607 |nd | CA
| 90245 |gundo | CA
| 90015 |ngeles | CA
| 85004 |ix | AZ
| 84101 |Lake City | UT
| 80204 |r | CO
| 78219 |ntonio | TX
| 77002 |on | TX
| 75219 |s | TX
| 73102 |oma City | OK
| 70113 |rleans | LA
| 60612 |go | IL
| 55403 |apolis | MN
| 53203 |ukee | WI
| 48326 |n Hills | MI
| 46204 |napolis | IN
| 44115 |land | OH
| 38103 |is | TN
| 33132 | | FL
| 32801 |do | FL
| 30303 |ta | GA
| 28202 |otte | NC
| 20004 |ngton | DC
| 19148 |delphia | PA
| 11217 |lyn | NY
| 10121 |ork | NY
| 29 | Boston | MA | 2114 |
+----+----------------+-------+-------+
29 rows in set (0.00 sec)
Somehow you got carriage returns at the end of most of the state values. You can remove them with:
UPDATE location SET state = TRIM(TRAILING '\r' FROM state);
And you should investigate the code you use to add rows to this table, to see why it's leaving those characters in the data. You're probably using a file that was created on Windows and loading it into a program that runs on Unix. You can use the dos2unix command on Linux to fix all the newlines in a file. Or you can fix the program so it removes extraneous carriage return characters.
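If the rows are loaded by a script, the fix can also be applied at load time. The following is a minimal Python sketch of the idea, not the asker's actual loader: the sample data and the helper name clean_line are made up for illustration. It shows how splitting a Windows-style file on "\n" alone leaves a "\r" on the last field of every row, and how stripping the full line terminator avoids it.

```python
# Hypothetical sample of a file saved on Windows: every line ends in "\r\n".
windows_file = "1,Portland,OR\r\n2,Sacramento,CA\r\n"


def clean_line(line):
    """Drop a trailing "\\n" or "\\r\\n", then split the line into fields."""
    return line.rstrip("\r\n").split(",")


# Naive parsing: splitting on "\n" only keeps the "\r" on the last field.
naive = [line.split(",") for line in windows_file.split("\n") if line]

# Cleaned parsing: the carriage return is removed before the split.
cleaned = [clean_line(line) for line in windows_file.split("\n") if line]

print(repr(naive[0][-1]))
print(repr(cleaned[0][-1]))
```

When a value ending in "\r" is printed in the mysql client, the cursor jumps back to the start of the line, which is what overwrites the left side of each row in the output above.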

SQL Query to get most frequently visited place Laravel

I'm developing a project in Laravel 5.2. I have a database structure like this:
user_visited
id | user_id | latitude | longitude
01 | 1 | 140.5938388 | 36.3335513
02 | 1 | 140.2631739 | 36.3724621
03 | 1 | 140.0804782 | 36.083233
04 | 1 | 140.0855777 | 36.1048973
05 | 1 | 140.2215081 | 35.981243
06 | 1 | 140.577927 | 36.3114456
07 | 1 | 140.65826 | 36.6068145
08 | 1 | 140.109301 | 36.0865606
09 | 1 | 140.2055252 | 35.926693
10 | 1 | 139.7540075 | 36.1662458
11 | 1 | 140.2637594 | 36.241148
12 | 1 | 139.8043185 | 36.1115211
13 | 1 | 140.2183821 | 36.0601167
14 | 1 | 139.7540075 | 36.1662458
15 | 1 | 140.0309725 | 36.0381176
locations
id | Location name | Type | Address | Latitude | Longitude
31 | Murse Park | Theme Park | 552-18 | 140.6066128 | 36.3985857
32 | Dom Park | Theme Park | 552-12 | 140.6417064 | 36.5436575
33 | Football Park | Theme Park | 588-1 | 140.3690094 | 36.4195418
34 | Istanbul Park | Theme Park | 37 | 140.3330587 | 36.5449685
user_visited holds each user's location history, collected by a mobile app that records the location every n seconds.
From this history I want to know which places users visit most frequently. How can I do that?
The problem is that a user can visit the same location several times, but the recorded longitude and latitude are never exactly the same.
Maybe we must set radius for 500 mill or what?
How to query that?
Answering in case it helps someone.
For this purpose, two readings can be treated as the same place if their latitude and longitude agree to 5 or 6 decimal places.
Say a place is stored at (140.5938388,36.3335513) and a user visits (140.59383723,36.33355213). Truncating both to 5 decimal places gives (140.59383,36.33355) in each case, so they fall in the same block. Counting the visits that match each location then gives the most frequently visited places.
Edited:
SELECT locations.location_name, COUNT(*) AS visits FROM locations
INNER JOIN user_visited ON TRUNCATE(user_visited.latitude,5) = TRUNCATE(locations.latitude,5) AND TRUNCATE(user_visited.longitude,5) = TRUNCATE(locations.longitude,5)
WHERE user_visited.user_id = 1
GROUP BY locations.id, locations.location_name
ORDER BY visits DESC
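The truncation idea can be checked outside the database with a short Python sketch. It is only an illustration: the function name cell and the sample list are hypothetical, using a few coordinate pairs from the question plus the second pair from this answer.

```python
import math
from collections import Counter


def cell(latitude, longitude, places=5):
    """Truncate both coordinates to `places` decimals, kept as integers
    so that the keys compare exactly."""
    factor = 10 ** places
    return (math.trunc(latitude * factor), math.trunc(longitude * factor))


# Sample readings, in the same (latitude, longitude) column order as the question.
visits = [
    (140.5938388, 36.3335513),
    (140.59383723, 36.33355213),  # nearly the same spot as the first reading
    (139.7540075, 36.1662458),
    (139.7540075, 36.1662458),
    (140.2631739, 36.3724621),
]

counts = Counter(cell(lat, lon) for lat, lon in visits)
for block, visit_count in counts.most_common():
    print(block, visit_count)
```

Five decimal places of a degree is roughly one metre of latitude, so this only merges near-identical readings. Two points on opposite sides of a cell boundary still land in different cells, and matching within a radius of hundreds of metres would need fewer decimals or an actual distance calculation.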

How do I sort a master table based on metadata in a separate table in MySQL 5.5.x?

I'm hoping to get some advice on an SQL problem...
We have a master table (MySQL 5.5.x) that contains very little information. We also have a metadata table that stores variable/value pairs and references the master table. The issue I'm having is that we need to retrieve the information using a JOIN to combine both tables, but we need to sort the output based on a particular meta-datum. The following trivial example will illustrate.
Here's a super-distilled version of the schema:
CREATE TABLE fundraise (
id INTEGER NOT NULL,
charity TEXT NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE meta (
master_id INTEGER REFERENCES fundraise(id),
variable TEXT NOT NULL,
value TEXT NOT NULL
);
We then enter some information for all three charities:
INSERT INTO fundraise(id, charity) VALUES
(1, 'save the dolphins'),
(2, 'feed the kids'),
(3, 'cloth the homeless');
We also insert some metadata:
INSERT INTO meta(master_id, variable, value) VALUES
(1, 'name', 'Mike'), (1, 'priority', 'high'), (1, 'start','2016'),
(2, 'name', 'Barb'), (2, 'priority', 'veryhigh'), (2, 'start','2012'),
(3, 'name', 'Sam'), (3, 'priority', 'veryhigh'), (3, 'start','2013');
Note that the metadata variable 'start' is intended to be used as the sort order of the required report. Here's the SQL statement I'm using to generate the report (unsorted):
SELECT f.charity, m.variable, m.value
FROM fundraise f
LEFT OUTER JOIN meta m ON (f.id = m.master_id);
The output I'm getting seems correct, for the most part, except that we haven't sorted yet:
+--------------------+----------+----------+
| charity | variable | value |
+--------------------+----------+----------+
| save the dolphins | name | Mike |
| save the dolphins | priority | high |
| save the dolphins | start | 2016 |
| feed the kids | name | Barb |
| feed the kids | priority | veryhigh |
| feed the kids | start | 2012 |
| cloth the homeless | name | Sam |
| cloth the homeless | priority | veryhigh |
| cloth the homeless | start | 2013 |
+--------------------+----------+----------+
But what I really need is for it to display sorted on the "start" year, while keeping all the details about a particular charity together. In other words, I need to see the report order by year, like this:
+--------------------+----------+----------+
| charity | variable | value |
+--------------------+----------+----------+
| feed the kids | name | Barb |
| feed the kids | priority | veryhigh |
| feed the kids | start | 2012 |
| cloth the homeless | name | Sam |
| cloth the homeless | priority | veryhigh |
| cloth the homeless | start | 2013 |
| save the dolphins | name | Mike |
| save the dolphins | priority | high |
| save the dolphins | start | 2016 |
+--------------------+----------+----------+
But I'm at a loss as to how to do this... Does anyone have any suggestions on how to do this using SQL exclusively?
Many thanks in advance!
p.s., I'd like to point out that the actual system I'm using is much much more complex, and the above is a rather contrived demo to simplify the asking of the question.
Try this.
SELECT * FROM (SELECT f.id AS id,f.charity, m.variable, m.value FROM fundraise f RIGHT OUTER JOIN meta m ON (f.id = m.master_id) GROUP BY value HAVING (variable = 'start') ORDER BY value) as sorted_table LEFT JOIN meta m2 ON sorted_table.id = m2.master_id ORDER BY sorted_table.value
This is my result using that query.
MariaDB [fbb]> SELECT * FROM (SELECT f.id AS id,f.charity, m.variable, m.value FROM fundraise f RIGHT OUTER JOIN meta m ON (f.id = m.master_id) GROUP BY value HAVING (variable = 'start') ORDER BY value) as sorted_table LEFT JOIN meta m2 ON sorted_table.id = m2.master_id ORDER BY sorted_table.value
-> ;
+------+--------------------+----------+-------+-----------+----------+----------+
| id | charity | variable | value | master_id | variable | value |
+------+--------------------+----------+-------+-----------+----------+----------+
| 2 | feed the kids | start | 2012 | 2 | name | Barb |
| 2 | feed the kids | start | 2012 | 2 | priority | veryhigh |
| 2 | feed the kids | start | 2012 | 2 | start | 2012 |
| 3 | cloth the homeless | start | 2013 | 3 | name | Sam |
| 3 | cloth the homeless | start | 2013 | 3 | priority | veryhigh |
| 3 | cloth the homeless | start | 2013 | 3 | start | 2013 |
| 1 | save the dolphins | start | 2016 | 1 | name | Mike |
| 1 | save the dolphins | start | 2016 | 1 | priority | high |
| 1 | save the dolphins | start | 2016 | 1 | start | 2016 |
+------+--------------------+----------+-------+-----------+----------+----------+
9 rows in set (0.01 sec)
MariaDB [fbb]>
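An alternative that keeps the three-column report from the question is to join meta a second time, restricted to the 'start' rows, and sort on that copy. The sketch below runs the idea against SQLite through Python's sqlite3 module so that it is self-contained; SQLite is only a stand-in here, and the alias s is an arbitrary name. The SELECT uses plain joins and ORDER BY, so it should carry over to MySQL 5.5, but that is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fundraise (
    id INTEGER NOT NULL PRIMARY KEY,
    charity TEXT NOT NULL
);
CREATE TABLE meta (
    master_id INTEGER REFERENCES fundraise(id),
    variable TEXT NOT NULL,
    value TEXT NOT NULL
);
INSERT INTO fundraise(id, charity) VALUES
    (1, 'save the dolphins'), (2, 'feed the kids'), (3, 'cloth the homeless');
INSERT INTO meta(master_id, variable, value) VALUES
    (1, 'name', 'Mike'), (1, 'priority', 'high'),     (1, 'start', '2016'),
    (2, 'name', 'Barb'), (2, 'priority', 'veryhigh'), (2, 'start', '2012'),
    (3, 'name', 'Sam'),  (3, 'priority', 'veryhigh'), (3, 'start', '2013');
""")

# m supplies the rows to display; s supplies only the 'start' value to sort on.
rows = conn.execute("""
    SELECT f.charity, m.variable, m.value
    FROM fundraise f
    LEFT OUTER JOIN meta m ON f.id = m.master_id
    LEFT OUTER JOIN meta s ON s.master_id = f.id AND s.variable = 'start'
    ORDER BY s.value, f.id, m.variable
""").fetchall()

for row in rows:
    print(row)
```

Sorting on f.id after s.value keeps all rows of one charity together even if two charities share a start year.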

Merge a specific table column from one database to another

I have two SQL databases. One is an older backup of the other.
I would like to merge only 1 specific table from [user_database0] into [user_database1] using either ssh or inside phpMyAdmin.
That table that I want to restore from backup is called [prefix_table].
However, I don't want to restore all columns from that table [prefix_table], just the [comment] column.
One of my biggest concerns is that some of the rows from the [prefix_table] have been deleted and I DO NOT want to restore those deleted rows from the old database.
Here is an Example:
*- Merge table [prefix_table] from [user_database0]:
prefix_table
+---------------------------------------------------+
| id | name | comment | age | person_id |
+++++++++++++++++++++++++++++++++++++++++++++++++++++
| 1111 | name1 | old text 1 | 01 | 001 |
+------------+-------+------------+-----+-----------+
| 2222 | name2 | old text 2 | 02 | 002 |
+------------+-------+------------+-----+-----------+
| 3333 | name3 | old text 3 | 03 | 003 |
+------------+-------+------------+-----+-----------+
| 4444 | name4 | old text 4 | 04 | 004 |
+------------+-------+------------+-----+-----------+
*-Into table [prefix_table] in [user_database1] :
prefix_table
+-----------------------------------------------------+
| id | name | comment | age | person_id |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
| 1111 | namenew | new text 1 | 99 | 001 |
+------------+---------+------------+-----+-----------+
| 4444 | name4 | new text 4 | 04 | 004 |
+------------+---------+------------+-----+-----------+
| 5555 | name5 | text 1 | 05 | 005 |
+------------+---------+------------+-----+-----------+
| 6666 | name6 | text 2 | 06 | 006 |
+------------+---------+------------+-----+-----------+
*- Resulting database [user_database1]:
prefix_table
+-----------------------------------------------------+
| id | name | comment | age | person_id |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
| 1111 | namenew | old text 1 | 99 | 001 |
+------------+---------+------------+-----+-----------+
| 4444 | name4 | old text 4 | 04 | 004 |
+------------+---------+------------+-----+-----------+
| 5555 | name5 | text 1 | 05 | 005 |
+------------+---------+------------+-----+-----------+
| 6666 | name6 | text 2 | 06 | 006 |
+------------+---------+------------+-----+-----------+
So it basically has to check if the table [prefix_table] matches in both databases then overwrite the data in the [comment] column. Note that if other column data changes, it should leave it as is, only the [comment] should be updated.
In Summary (both databases are on same server in one phpMyAdmin account)
FROM: [user_database0].[prefix_table].[comment]
TO: [user_database1].[prefix_table].[comment]
IF: [id] column matches in both tables.
Here is a working version from the recommendations below:
UPDATE
[new_db_name].[table_name]
INNER JOIN
[old_db_name].[table_name]
ON
[new_db_name].[table_name].[column name] = [old_db_name].[table_name].[column name]
SET
[new_db_name].[table_name].[matching column name] = [old_db_name].[table_name].[matching column name]
The following will work. Here it is updating the new comment with the old comment for any matching IDs in both.
Edit: This should point you in the correct direction. I had your schema before, I misunderstood what you wrote - my bad.
UPDATE [new_table_name] SET comment = [old_table_name].onecomment
FROM [new_table_name]
INNER JOIN [old_table_name] ON [new_table_name].aboutme_id = [old_table_name].aboutme_id
Note: the above syntax is SQL Server because the question was at first erroneously tagged sql-server.
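The intended result can be checked on the sample data from the question with a small Python sketch using the sqlite3 module. Several things here are assumptions made so the example is self-contained: SQLite stands in for MySQL, the two databases are modelled as two tables named old_prefix_table and new_prefix_table, and the update is written as a correlated subquery because SQLite does not accept MySQL's UPDATE ... INNER JOIN ... SET form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE old_prefix_table (
    id INTEGER PRIMARY KEY, name TEXT, comment TEXT, age TEXT, person_id TEXT
);
CREATE TABLE new_prefix_table (
    id INTEGER PRIMARY KEY, name TEXT, comment TEXT, age TEXT, person_id TEXT
);
INSERT INTO old_prefix_table VALUES
    (1111, 'name1', 'old text 1', '01', '001'),
    (2222, 'name2', 'old text 2', '02', '002'),
    (3333, 'name3', 'old text 3', '03', '003'),
    (4444, 'name4', 'old text 4', '04', '004');
INSERT INTO new_prefix_table VALUES
    (1111, 'namenew', 'new text 1', '99', '001'),
    (4444, 'name4',   'new text 4', '04', '004'),
    (5555, 'name5',   'text 1',     '05', '005'),
    (6666, 'name6',   'text 2',     '06', '006');
""")

# Copy comment from the old table only where the id exists in both tables.
# Rows deleted from the new table (2222, 3333) are never re-inserted.
conn.execute("""
    UPDATE new_prefix_table
    SET comment = (SELECT o.comment FROM old_prefix_table o
                   WHERE o.id = new_prefix_table.id)
    WHERE EXISTS (SELECT 1 FROM old_prefix_table o
                  WHERE o.id = new_prefix_table.id)
""")

result = conn.execute(
    "SELECT id, name, comment, age FROM new_prefix_table ORDER BY id"
).fetchall()
for row in result:
    print(row)
```

Only the comment column changes; name and age keep their new values, and ids 5555 and 6666 are untouched because they have no match in the old table.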

How to Query a Table with Multiple Foreign Keys and Return Actual Values

I'm new to MySQL, so please bear with me.
I'm working on a project that collects users' degrees. Users can save 3 degrees where the type, subject matter, and school are variable. These relations are normalized for other query uses, so 5 tables are involved and are shown below (all have more columns than shown; I've just included the relevant info). The last one, 'user_degrees', is where the keys come together.
degrees
+----+-------------------+
| id | degree_type |
+----+-------------------+
| 01 | Bachelor's Degree |
| 02 | Master's Degree |
| 03 | Ph.D. |
| 04 | J.D. |
+----+-------------------+
acad_category
+------+-----------------------------------------+
| id | acad_cat_name |
+------+-----------------------------------------+
| 0015 | Accounting |
| 0026 | Business Law |
| 0027 | Finance |
| 0028 | Hotel & Restaurant Management |
| 0029 | Human Resources |
| 0030 | Information Systems and Technology |
+------+-----------------------------------------+
institutions
+--------+--------------------------------------------+
| id | inst_name |
+--------+--------------------------------------------+
| 000001 | A T Still University of Health Sciences |
| 000002 | Abilene Christian University |
| 000003 | Abraham Baldwin Agricultural College |
+--------+--------------------------------------------+
users
+----------+----------+
| id | username |
+----------+----------+
| 00000013 | Test1 |
| 00000018 | Test2 |
| 00000023 | Test3 |
+----------+----------+
user_degrees
+---------+-----------+---------+---------+
| user_id | degree_id | acad_id | inst_id |
+---------+-----------+---------+---------+
| 18 | 1 | 4 | 1 |
| 23 | 1 | 15 | 1 |
| 23 | 2 | 15 | 1 |
| 23 | 3 | 15 | 1 |
+---------+-----------+---------+---------+
How can I query 'user_degrees' to find all degrees by user x, but return the actual values of the foreign keys? Taking user Test3 as an example, I'm looking for output like so (truncated for layout's sake):
+-------------------+-------------------+-------------------+
| degree_type | acad_cat_name | inst_name |
+-------------------+-------------------+-------------------+
| Bachelor's Degree | Accounting | A T Still Uni.. |
| Master's Degree | Accounting | A T Still Uni.. |
| Ph.D. | Accounting | A T Still Uni.. |
+-------------------+-------------------+-------------------+
I'm guessing a mix of multiple joins, temp tables and subqueries are the answer but am having trouble grasping the order of things. Any insight is much appreciated, thanks for reading.
You need to join user_degrees to degrees (and the other tables referenced by user_degrees). This is the query that will give you your example output:
SELECT
ud.user_id, d.degree_type, ac.acad_cat_name, i.inst_name
FROM
user_degrees ud
INNER JOIN degrees d ON d.id = ud.degree_id
INNER JOIN acad_category ac ON ac.id = ud.acad_id
INNER JOIN institutions i ON i.id = ud.inst_id
WHERE
ud.user_id = 23
You may also want to read this article to understand different kinds of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
The only way to understand these things at your stage of learning is to actually write the queries and then modify them until you get your desired output.
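To experiment with the joins without a MySQL server, the sample rows from the question can be loaded into an in-memory SQLite database through Python's sqlite3 module. This is a sketch under assumptions: SQLite stands in for MySQL, only the rows needed for the example are inserted, and the helper name degrees_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE degrees (id INTEGER PRIMARY KEY, degree_type TEXT);
CREATE TABLE acad_category (id INTEGER PRIMARY KEY, acad_cat_name TEXT);
CREATE TABLE institutions (id INTEGER PRIMARY KEY, inst_name TEXT);
CREATE TABLE user_degrees (
    user_id INTEGER, degree_id INTEGER, acad_id INTEGER, inst_id INTEGER
);
""")
conn.executemany(
    "INSERT INTO degrees VALUES (?, ?)",
    [(1, "Bachelor's Degree"), (2, "Master's Degree"), (3, "Ph.D."), (4, "J.D.")],
)
conn.executemany("INSERT INTO acad_category VALUES (?, ?)", [(15, "Accounting")])
conn.executemany(
    "INSERT INTO institutions VALUES (?, ?)",
    [(1, "A T Still University of Health Sciences")],
)
conn.executemany(
    "INSERT INTO user_degrees VALUES (?, ?, ?, ?)",
    [(18, 1, 4, 1), (23, 1, 15, 1), (23, 2, 15, 1), (23, 3, 15, 1)],
)


def degrees_for(user_id):
    """Resolve every foreign key in user_degrees to its display value."""
    return conn.execute(
        """
        SELECT d.degree_type, ac.acad_cat_name, i.inst_name
        FROM user_degrees ud
        INNER JOIN degrees d ON d.id = ud.degree_id
        INNER JOIN acad_category ac ON ac.id = ud.acad_id
        INNER JOIN institutions i ON i.id = ud.inst_id
        WHERE ud.user_id = ?
        ORDER BY ud.degree_id
        """,
        (user_id,),
    ).fetchall()


for row in degrees_for(23):  # user Test3
    print(row)
```

With this sample, degrees_for(18) returns no rows, because acad_id 4 has no match among the inserted acad_category rows and an INNER JOIN drops unmatched rows. Switching that join to LEFT JOIN would keep the row with a NULL category name.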