Writing a many-to-many query for common table attributes - mysql

It's been a long time since I've worked directly with MySQL, so for fun, I have a MySQL database of train information, based off the NYC Subway system. I have a Station table and Route table, and there is a many-to-many relationship between the two.
What I want to do is find out which stations two or more different routes have in common. I'm trying different join techniques but none of them seem to be working (I probably have the syntax wrong).
So for example, I want to see which stations serve both routes "1" and "2" (i.e., which stations do these routes have in common?).
mysql> describe station;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(45) | YES | UNI | NULL | |
+-------+-------------+------+-----+---------+----------------+
mysql> describe route;
+-------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(4) | YES | | NULL | |
+-------+------------+------+-----+---------+----------------+
The junction table between them is called RouteStation; it holds the IDs of the rows from each of the two tables.
The query I want is to find the names of the stations that serve any group of routes. So in this example, I want to find the names of the stations that serve both routes "1" and "2".
SELECT Station.name FROM Station
JOIN RouteStation ON (Station.id = RouteStation.stationId)
JOIN Route ON (Route.id = RouteStation.routeId)
WHERE Route.name = "1" AND Route.name = "2";
I know there is probably an issue with the last part because it's not possible for the route name to be "1" and "2" at the same time, but I hope that the gist of what I'm looking for is clear enough.
EDIT: RouteStation schema:
mysql> describe routestation;
+-----------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| stationId | int(11) | NO | MUL | NULL | |
| routeId | int(11) | NO | MUL | NULL | |
+-----------+---------+------+-----+---------+-------+

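Can you try this one:
SELECT name
FROM station
WHERE id IN (
    SELECT RS.stationId
    FROM RouteStation RS
    WHERE RS.routeId IN (1, 2)
    GROUP BY RS.stationId
    HAVING COUNT(DISTINCT RS.routeId) = 2
)
Instead of AND, use OR: routeId = 1 OR routeId = 2, which is what IN (1, 2) expresses. A single RouteStation row holds only one routeId, so the AND version can never match. (This assumes the routes named "1" and "2" have the ids 1 and 2.)
After that, GROUP BY stationId lets us COUNT the routeIds per station and keep only the stations HAVING both routes. COUNT(DISTINCT ...) = 2 guards against duplicate rows in RouteStation; for N routes, compare against N.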
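For reference, here is a minimal, self-contained sketch of the same GROUP BY / HAVING idea that filters by route name rather than by hard-coded ids. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere). The sample stations and the helper name stations_serving_all are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Station (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE Route (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE RouteStation (stationId INTEGER NOT NULL, routeId INTEGER NOT NULL);

    -- Made-up sample data, not real subway data.
    INSERT INTO Station (id, name) VALUES (1, 'Alpha St'), (2, 'Bravo St'), (3, 'Charlie St');
    INSERT INTO Route (id, name) VALUES (1, '1'), (2, '2'), (3, '3');
    INSERT INTO RouteStation (stationId, routeId) VALUES
        (1, 1), (1, 2),          -- Alpha St: routes 1 and 2
        (2, 1),                  -- Bravo St: route 1 only
        (3, 1), (3, 2), (3, 3);  -- Charlie St: routes 1, 2 and 3
""")


def stations_serving_all(conn, route_names):
    """Return the names of stations served by every route in route_names."""
    placeholders = ", ".join("?" for _ in route_names)
    sql = f"""
        SELECT s.name
        FROM Station s
        JOIN RouteStation rs ON rs.stationId = s.id
        JOIN Route r ON r.id = rs.routeId
        WHERE r.name IN ({placeholders})
        GROUP BY s.id, s.name
        HAVING COUNT(DISTINCT r.name) = ?
        ORDER BY s.name
    """
    # The last parameter is the number of distinct routes that must all match.
    params = list(route_names) + [len(set(route_names))]
    return [row[0] for row in conn.execute(sql, params)]


common = stations_serving_all(conn, ["1", "2"])
print(common)
```

The SELECT itself should carry over to MySQL unchanged; only the connection code is specific to this sketch.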

Related

MySQL - UPDATE one column based on results of a SELECT when the SELECT returns multiple columns

I've read MySQL - UPDATE query based on SELECT Query and am trying to do something similar - i.e. run an UPDATE query on a table and populate it with the results from a SELECT.
In my case the table I want to update is called substances and has a column called cas_html which is supposed to store CAS Numbers (chemical codes) as an HTML string.
Due to the structure of the database I am running the following query, which gives me a result set of the substance ID and name (substances.id, substances.name) and the CAS numbers as an HTML string (cas_values, which comes from cas.value):
SELECT s.`id`,
       GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values,
       GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name
FROM substances s
LEFT JOIN cas_substances cs ON s.id = cs.substance_id
LEFT JOIN cas c ON cs.cas_id = c.id
GROUP BY s.id;
Sample output:
id   | cas_values     | substance_name
-----|----------------|-----------------
1    | 133-24<br>     | Chemical A
     | 455-213<br>    |
     | 21-234         |
-----|----------------|-----------------
2    | 999-23         | Chemical B
-----|----------------|-----------------
3    |                | Chemical C
-----|----------------|-----------------
As you can see the cas_values column contains the HTML string (which may also be an empty string as in the case of "Chemical C"). I want to write the data in the cas_values column into substances.cas_html. However I can't piece together how to do this because other posts I'm reading get the data for the UPDATE in one column - I have other columns returned by my SELECT query.
Essentially the problem is that in my "sample output" table above I have 3 columns being returned. Other SO posts seem to have just 1 column being returned which is the actual values that are used in the UPDATE query (in this case on the substances table).
Is this possible?
I am using MySQL 5.5.56-MariaDB
These are the structures of the tables, if this helps:
mysql> DESCRIBE substances;
+-------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| app_id | varchar(8) | NO | UNI | NULL | |
| name | varchar(1500) | NO | | NULL | |
| date | date | NO | | NULL | |
| cas_html | text | YES | | NULL | |
+-------------+-----------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> DESCRIBE cas;
+-------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| value | varchar(13) | NO | UNI | NULL | |
+-------+-----------------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
mysql> DESCRIBE cas_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| cas_id | mediumint(8) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.02 sec)
Try something like this:
UPDATE substances AS s,
(
SELECT s.`id`,
GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values,
GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name
FROM substances s
LEFT JOIN cas_substances cs ON s.id = cs.substance_id
LEFT JOIN cas c ON cs.cas_id = c.id
GROUP BY s.id
) AS t
SET s.cas_html=t.cas_values
WHERE s.id = t.id
If you don't want to modify every row, the safest way to test the update is to limit it with an extra condition in the WHERE clause, something like this:
...
WHERE s.id = t.id AND s.id = 1
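The extra columns returned by the SELECT do not matter to the UPDATE: only the key used for matching and the value being written are needed. As a hedged illustration, the sketch below uses a different technique from the multi-table UPDATE above, namely a correlated subquery, and runs it on Python's built-in sqlite3 module as a stand-in for MySQL/MariaDB (an assumption made for runnability). The sample rows mirror the sample output in the question. COALESCE is added here so that substances without CAS numbers get an empty string rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE substances (id INTEGER PRIMARY KEY, name TEXT NOT NULL, cas_html TEXT);
    CREATE TABLE cas (id INTEGER PRIMARY KEY, value TEXT NOT NULL UNIQUE);
    CREATE TABLE cas_substances (
        id INTEGER PRIMARY KEY,
        cas_id INTEGER NOT NULL,
        substance_id INTEGER NOT NULL
    );

    INSERT INTO substances (id, name) VALUES
        (1, 'Chemical A'), (2, 'Chemical B'), (3, 'Chemical C');
    INSERT INTO cas (id, value) VALUES
        (1, '133-24'), (2, '455-213'), (3, '21-234'), (4, '999-23');
    INSERT INTO cas_substances (cas_id, substance_id) VALUES
        (1, 1), (2, 1), (3, 1), (4, 2);
""")

# Only the aggregated value is written. The subquery is correlated on
# substances.id, so each row receives the CAS numbers linked to it.
# SQLite's GROUP_CONCAT(value, separator) does not guarantee an order.
conn.execute("""
    UPDATE substances
    SET cas_html = COALESCE((
        SELECT GROUP_CONCAT(c.value, '<br>')
        FROM cas_substances cs
        JOIN cas c ON c.id = cs.cas_id
        WHERE cs.substance_id = substances.id
    ), '')
""")

rows = conn.execute("SELECT id, cas_html FROM substances ORDER BY id").fetchall()
for row in rows:
    print(row)
```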

MySQL Performance and Memcache

I have the following (simplified) Mysql Tables:
Requests:
+----------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| UniqueIdentifier | varchar(255) | YES | MUL | NULL | |
| UniversalServiceId | bigint(20) | YES | MUL | NULL | |
+----------------------+--------------+------+-----+---------+-------+
Observations:
+---------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| Value | varchar(255) | NO | | NULL | |
| RequestId | bigint(20) | NO | MUL | NULL | |
+---------------------+--------------+------+-----+---------+-------+
I have indexed UniqueIdentifier, UniversalServiceId and RequestId.
The tables are queried on UniqueIdentifier and UniversalServiceId with a JOIN on RequestId.
The Observation table has many millions of records. The queries are painfully slow to return and I am wondering if there is anything that I can do to improve performance. I have just started reading about memcache but it seems that it may be useful only after the first query (which is often the only one) for a particular dataset.
This is the type of query that is being used:
select * from Observations where RequestId in (select ID from Requests where UniqueIdentifier = '123456' and UniversalServiceId = '1234')
Any advice / guidance appreciated!
I recommend you use a query using a JOIN operation, rather than an IN (subquery) predicate.
For example:
SELECT o.ID
, o.Value
, o.RequestId
FROM Observations o
JOIN Requests r
ON r.ID = o.RequestId
WHERE r.UniqueIdentifier = '123456'
AND r.UniversalServiceId = '1234'
For optimum performance, suitable indexes would be:
... ON Requests (UniversalServiceId, UniqueIdentifier, ID)
... ON Observations (RequestId, Value, ID)
(The choice of the leading column in the index on the Requests table would depend on the expected cardinality.)
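To make the suggestion concrete, here is a small runnable sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the query planner and real performance will differ). The index names ix_requests_lookup and ix_observations_request and the sample rows are hypothetical. The column order of the indexes follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Requests (
        ID INTEGER PRIMARY KEY,
        UniqueIdentifier TEXT,
        UniversalServiceId INTEGER
    );
    CREATE TABLE Observations (
        ID INTEGER PRIMARY KEY,
        Value TEXT NOT NULL,
        RequestId INTEGER NOT NULL
    );

    -- Hypothetical index names; the column order follows the answer above.
    CREATE INDEX ix_requests_lookup
        ON Requests (UniversalServiceId, UniqueIdentifier, ID);
    CREATE INDEX ix_observations_request
        ON Observations (RequestId, Value, ID);

    -- Made-up sample rows.
    INSERT INTO Requests VALUES
        (1, '123456', 1234), (2, '123456', 9999), (3, '654321', 1234);
    INSERT INTO Observations VALUES
        (10, 'a', 1), (11, 'b', 1), (12, 'c', 2), (13, 'd', 3);
""")

# Same JOIN as in the answer, with bound parameters instead of literals.
rows = conn.execute(
    """
    SELECT o.ID, o.Value, o.RequestId
    FROM Observations o
    JOIN Requests r ON r.ID = o.RequestId
    WHERE r.UniqueIdentifier = ?
      AND r.UniversalServiceId = ?
    ORDER BY o.ID
    """,
    ("123456", 1234),
).fetchall()
print(rows)
```

Because both indexes contain every column the query touches, the database can in principle answer it from the indexes alone. On the real tables, EXPLAIN in MySQL is the way to check whether that actually happens.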

MYSQL need AVERAGE from one table based on information from second table

I'm slowly teaching myself MySQL methods, and I'm having a tough time with this. I haven't even been able to figure out HOW to Google the question.
I have the following two tables (I think my data is normalized, but suggestions welcome):
Table 1
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| rate_id | int(11) | NO | PRI | NULL | auto_increment |
| rate | decimal(9,2) | YES | | NULL | |
| guess | decimal(9,2) | YES | | NULL | |
| date | date | YES | MUL | NULL | |
| house_id | int(11) | NO | MUL | NULL | |
| date_mod | date | YES | | NULL | |
+----------+--------------+------+-----+---------+----------------+
Table 2
+-----------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(64) | YES | | NULL | |
| beds | int(2) | YES | | NULL | |
| baths | int(2) | YES | | NULL | |
| pets | char(4) | YES | | NULL | |
| pool | char(4) | YES | | NULL | |
+-----------------+---------------+------+-----+---------+----------------+
I want to populate guess with the average rate of all properties during the same date period, based on similar properties in table 2. That is, for each id/house_id, I need all the houses that are similar (same beds, baths, view, etc.), and then the average of all the rates on the same dates.
My biggest issue is that I don't understand how to reference a field in a second table based on the id selected. This is what I'm starting with - just to see if I can get averages to return (I know this won't UPDATE the guess field).
SELECT AVG(t1.rate)
INNER JOIN t2 ON (t1.house_id = t2.id)
WHERE t2.beds = t2.beds
AND t2.baths = t2.baths
AND t2.pets = t2.pets
AND t1.date = t1.date
AND t1.house_id = 2;
Aside from the fact that the SQL command doesn't complete - I think it's obvious that my SQL knowledge is woefully inadequate - I think I'm just missing a more complex SQL method to identify the fields I'm looking for. Can anybody help?
@Strawberry - appreciate the comments. As it turns out, I already had AVG() working (I had already Googled it).
My problems were twofold. First, how does one get column values from one table to use as keys on another table? I didn't realize that you could use a nested SELECT statement to do this, although I still haven't convinced myself it is the most efficient way.
Second, the average function was grouping all rates together into one average. I was able to do a bit of manipulation to produce the output I was looking for, but ultimately GROUP BY provided most of the functionality I needed. Below is basically what I ended up with.
SELECT t1.date, AVG(t1.rate)
FROM rates t1
JOIN houses t2 ON (t1.house_id = t2.id)
WHERE t2.beds = (SELECT beds FROM houses WHERE id = 2)
AND t2.baths = (SELECT baths FROM houses WHERE id = 2)
AND t2.pets = (SELECT pets FROM houses WHERE id = 2)
GROUP BY t1.date;
Thanks for commenting.
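As a possible alternative to the three scalar subqueries, the lookup of the reference house can be written as a self-join on the houses table. The sketch below is an illustration only: it runs on Python's built-in sqlite3 module as a stand-in for MySQL, the table names rates and houses are taken from the query above, the sample data is made up, and excluding the reference house itself from the average is an assumption about what "similar properties" should mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE houses (
        id INTEGER PRIMARY KEY, name TEXT, beds INTEGER, baths INTEGER, pets TEXT
    );
    CREATE TABLE rates (
        rate_id INTEGER PRIMARY KEY,
        rate REAL,
        date TEXT,
        house_id INTEGER NOT NULL
    );

    -- Made-up sample data.
    INSERT INTO houses VALUES
        (1, 'A', 3, 2, 'yes'),
        (2, 'B', 3, 2, 'yes'),
        (3, 'C', 3, 2, 'yes'),
        (4, 'D', 1, 1, 'no');      -- not similar to house 2
    INSERT INTO rates (rate, date, house_id) VALUES
        (100.0, '2024-01-01', 1),
        (200.0, '2024-01-01', 3),
        (500.0, '2024-01-01', 4),
        (150.0, '2024-01-02', 1),
        (999.0, '2024-01-01', 2);  -- the reference house, excluded below
""")


def average_rates_for_similar(conn, house_id):
    """Average rate per date over houses with the same beds, baths and pets."""
    sql = """
        SELECT r.date, AVG(r.rate)
        FROM houses t                    -- t: the reference house
        JOIN houses s                    -- s: houses similar to it
          ON s.beds = t.beds
         AND s.baths = t.baths
         AND s.pets = t.pets
         AND s.id <> t.id                -- assumption: leave out the house itself
        JOIN rates r ON r.house_id = s.id
        WHERE t.id = ?
        GROUP BY r.date
        ORDER BY r.date
    """
    return conn.execute(sql, (house_id,)).fetchall()


averages = average_rates_for_similar(conn, 2)
print(averages)
```

Dropping the s.id <> t.id condition gives the same behaviour as the subquery version, which includes the reference house in its own average.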

Alter one Table on Insertion into another

Let me first explain my situation. I have a table called users which stores the user information.
+----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| user_id | varchar(10) | NO | PRI | NULL | |
| username | text | NO | | NULL | |
| password | text | NO | | NULL | |
| name | text | NO | | NULL | |
| email | text | NO | | NULL | |
| status | varchar(15) | NO | | active | |
+----------+-------------+------+-----+---------+-------+
And a table called country
+--------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| country_id | varchar(10) | NO | PRI | NULL | |
| country_name | text | NO | | NULL | |
| country_rate | double | NO | | 0.2 | |
+--------------+-------------+------+-----+---------+-------+
Now, I need a way to map the countries assigned to a particular user.
For example:
user_001 is allowed to use country_001, country_002, country_003
user_002 is allowed to use country_003, country_008
and so on.
What is the best approach to achieve the above?
What I thought is to have a table called say assignment and it will have the following fields:
assignment_id (primary key)
user_id (foreign Key)
country_001 (bool)
country_002 (bool)
...
...
country_010 (bool)
I am not sure if this is the best approach, but even if I go for it, I am stuck on how to alter the structure of assignment on every insertion into the country table (adding a BOOL field to assignment with the newly created country_id as the column name).
I hope I was able to explain my situation. I know I can achieve this in application code (PHP, C++, etc.), but I was wondering whether it could be done with some kind of TRIGGER so that I don't have to handle it in the code.
Thanks a Lot.
It would be a better option to define a user_countries table like this:
user_id fk on users(user_id)
country_id fk on country(country_id)
unique key on ( user_id, country_id ) -- composite unique key
I am not sure why you want to define 10 columns in the user-country relation table.
Instead of 10 columns, define a single country_id column with a foreign key, and store one row per country assigned to a user. A user can then have many countries (and a country many users), and adding a country never requires a schema change. The unique key on the combination speeds up lookups and prevents duplicate assignments.
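A minimal sketch of that layout follows, run on Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so it runs anywhere). The trimmed-down columns, sample rows and the helper countries_for are made up for illustration. It shows that adding a country is an ordinary INSERT and that the composite unique key rejects a duplicate assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript("""
    CREATE TABLE users (user_id TEXT PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE country (country_id TEXT PRIMARY KEY, country_name TEXT NOT NULL);
    CREATE TABLE user_countries (
        user_id TEXT NOT NULL REFERENCES users (user_id),
        country_id TEXT NOT NULL REFERENCES country (country_id),
        UNIQUE (user_id, country_id)      -- composite unique key
    );

    -- Made-up sample data.
    INSERT INTO users VALUES ('user_001', 'first'), ('user_002', 'second');
    INSERT INTO country VALUES
        ('country_001', 'Country One'),
        ('country_003', 'Country Three'),
        ('country_008', 'Country Eight');
    INSERT INTO user_countries VALUES
        ('user_001', 'country_001'), ('user_001', 'country_003'),
        ('user_002', 'country_003'), ('user_002', 'country_008');
""")


def countries_for(conn, user_id):
    """Return the country ids assigned to one user."""
    rows = conn.execute(
        "SELECT country_id FROM user_countries WHERE user_id = ? ORDER BY country_id",
        (user_id,),
    )
    return [row[0] for row in rows]


# A new country is just a new row; no ALTER TABLE and no trigger is involved.
conn.execute("INSERT INTO country VALUES ('country_011', 'Country Eleven')")
conn.execute("INSERT INTO user_countries VALUES ('user_001', 'country_011')")

# Assigning the same country twice violates the composite unique key.
try:
    conn.execute("INSERT INTO user_countries VALUES ('user_001', 'country_001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(countries_for(conn, "user_001"), duplicate_rejected)
```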

MYSQL query return empty result

I need to look up email preferences for users.
This table contains the types of email a user can receive, broken down by category.
email_preferences_categories
+----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | text | YES | | NULL | |
| overview | text | YES | | NULL | |
+----------+------------------+------+-----+---------+----------------+
This table contains their preference for receiving various types. If they haven't set their preferences, this table won't have any rows for them.
email_preferences
+------------+---------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| user_id | int(10) unsigned | NO | | NULL | |
| name | text | YES | | NULL | |
| frequency | enum('Daily','Monthly','None') | YES | | Daily | |
+------------+---------------------------------+------+-----+---------+----------------+
I need to construct a MYSQL query that returns the name and frequency corresponding to the email preferences for a given user.
SELECT name, frequency
FROM email_preferences
LEFT JOIN email_preferences_categories using (name)
WHERE user_id = 42
Where I'm having trouble: If the user hasn't set their preferences, this query doesn't return any rows. I would like it to return the default of 'Daily' for email categories that are missing.
Change LEFT JOIN to RIGHT JOIN.
...
FROM email_preferences
RIGHT JOIN email_preferences_categories
...
Or alternatively you can swap the tables around:
...
FROM email_preferences_categories
LEFT JOIN email_preferences
...
These two options both do the same thing - ensure that you get all rows from email_preferences_categories even if there is no matching row in email_preferences.
You also need to change the join condition as you already noticed.
As for "I would like it to return the default of 'Daily' for email categories that are missing":
You can use IFNULL:
SELECT name, IFNULL(frequency, 'Daily') AS frequency
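The user_id filter should not go in a WHERE clause: there it discards the rows where the email_preferences columns are NULL, which are exactly the missing categories. It belongs in the JOIN condition instead. Here is the full query, combined with Mark Byers's answer above.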
SELECT email_preferences_categories.name, IFNULL(frequency, 'Daily') AS frequency
FROM email_preferences_categories
LEFT JOIN email_preferences
ON email_preferences.name = email_preferences_categories.name
AND user_id = 42;
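The difference between filtering in the ON clause and filtering in WHERE can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; IFNULL behaves the same way in both). The category names and sample rows are made up, and the enum column is simplified to plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email_preferences_categories (
        id INTEGER PRIMARY KEY, name TEXT, overview TEXT
    );
    CREATE TABLE email_preferences (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        name TEXT,
        frequency TEXT DEFAULT 'Daily'
    );

    -- Made-up sample data: user 42 has set a preference for 'offers' only.
    INSERT INTO email_preferences_categories (name) VALUES ('news'), ('offers');
    INSERT INTO email_preferences (user_id, name, frequency) VALUES
        (42, 'offers', 'None'),
        (7, 'news', 'Monthly');
""")

# User filter inside the join condition: every category survives the join.
FILTER_IN_ON = """
    SELECT c.name, IFNULL(p.frequency, 'Daily') AS frequency
    FROM email_preferences_categories c
    LEFT JOIN email_preferences p
      ON p.name = c.name AND p.user_id = ?
    ORDER BY c.name
"""

# User filter in WHERE: rows without a matching preference are filtered out.
FILTER_IN_WHERE = """
    SELECT c.name, IFNULL(p.frequency, 'Daily') AS frequency
    FROM email_preferences_categories c
    LEFT JOIN email_preferences p ON p.name = c.name
    WHERE p.user_id = ?
    ORDER BY c.name
"""

in_on = conn.execute(FILTER_IN_ON, (42,)).fetchall()
in_where = conn.execute(FILTER_IN_WHERE, (42,)).fetchall()
no_prefs = conn.execute(FILTER_IN_ON, (99,)).fetchall()  # user with no rows at all

print(in_on)
print(in_where)
print(no_prefs)
```

With the filter in ON, user 42 gets a row for every category and user 99 gets the default everywhere. With the filter in WHERE, the 'news' row for user 42 disappears.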