Here is an abstraction of the problem:
I have one table with a column called 'country'. The values stored are country names, e.g. US, UK.
I have another table with a column called 'country_code'. The values stored are numerical representations of the country, e.g. 12, 17.
How can I perform a join (e.g. an inner join) between these two tables? The difficulty is that country and country_code have a one-to-one mapping but are not directly equal to each other.
You could create a mapping table containing the country and the country_code.
I assume you cannot change the table containing the country_code to use the string representation from country, or add an int column to your countries table?
Something like
country_mappings
country varchar column
country_code int column
PRIMARY KEY (country, country_code)
SELECT *
FROM countries c INNER JOIN
country_mappings cm ON c.country = cm.country inner join
your_other_table yot ON cm.country_code = yot.country_code
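To make the mapping-table idea concrete, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for your real server. The pairs US = 12 and UK = 17 and the `note` column are made up for illustration; the table names are the ones used in the query above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country TEXT PRIMARY KEY);
    CREATE TABLE your_other_table (country_code INTEGER, note TEXT);
    CREATE TABLE country_mappings (
        country      TEXT    NOT NULL,
        country_code INTEGER NOT NULL,
        PRIMARY KEY (country, country_code)
    );
    -- Hypothetical sample data: the code values are invented.
    INSERT INTO countries VALUES ('US'), ('UK');
    INSERT INTO your_other_table VALUES (12, 'row for code 12'), (17, 'row for code 17');
    INSERT INTO country_mappings VALUES ('US', 12), ('UK', 17);
""")

# Join name -> mapping -> code, exactly as in the query above.
mapped_rows = conn.execute("""
    SELECT c.country, yot.country_code, yot.note
    FROM countries c
    INNER JOIN country_mappings cm ON c.country = cm.country
    INNER JOIN your_other_table yot ON cm.country_code = yot.country_code
    ORDER BY c.country
""").fetchall()
print(mapped_rows)
```

Each country name is matched to its code through country_mappings, so neither of the two original tables has to change.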
I have a contacts table with more than 1,000,000 records and a cities table with about 20,000 records. I need to fetch all cities that are used in the contacts table.
The contacts table has the following columns:
Id, name, phone, email, city, state, country, postal, address, manager_Id
The cities table has:
Id, city
I used an inner join for this, but it takes a long time: the query needs more than 2 minutes to execute.
I used this query:
SELECT cities.* FROM cities
INNER JOIN contacts ON contacts.City = cities.city
WHERE contacts.manager_Id= 1
I created an index on manager_Id as well, but it is still very slow.
For better performance you could add these indexes:
on table cities, column city
on table contacts, a composite index on columns (manager_id, city)
Filter contacts first and then join to cities:
SELECT ct.*
FROM cities ct INNER JOIN (
SELECT city FROM contacts
WHERE manager_Id = 1
) cn ON cn.city = ct.city
You need indexes for city in both tables and for manager_id in contacts.
As others have pointed out about having a proper index, I'll take it a bit further for clarification. You are specifically looking for contacts where the MANAGER ID = 1. That is not expected to match one person, but possibly many. So having the MANAGER ID in the first position of the index optimizes "get me all people for that manager". By making the city part of the index via (manager_id, city), both data elements the query needs are in the index itself. This way the engine does not have to go to the raw data pages to get the other part of interest.
Now, from that, you want all the city information (hence the join to the cities table on the city name).
Since you are only querying the CITIES and not the actual contact information, you probably want DISTINCT cities. Let's say a manager is responsible for 50 people and most of them live in the same or neighboring cities. You may have only 5 distinct cities. That, too, will limit the result set being joined.
Having said that, I would do as follows. With MySQL, using STRAIGHT_JOIN can help optimize by telling the engine "do the query as I wrote it, don't think for me".
select STRAIGHT_JOIN
cty.*
from
( select distinct c.City
from Contacts c
where c.Manager_ID = 1 ) PQ
JOIN Cities cty
on PQ.City = Cty.City
The "PQ" is an alias representing my "pre-query" of just DISTINCT cities for a given manager.
Again, have one index on the Contacts table on (manager_id, city). On the cities table, I would expect an index on (city).
You need two indexes, one on each table.
On the contacts table, first index manager_Id, then City
CREATE INDEX idx_contacts_mgr_city ON contacts(manager_Id, City);
On the cities table, just index City.
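If you want to try the two indexes together with the DISTINCT pre-query on a small scale, here is a rough sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the index name idx_cities_city are invented, only a few of the contacts columns are included, and STRAIGHT_JOIN is left out because it is MySQL-specific. The query plan printed at the end is SQLite's, so treat it only as a hint and run EXPLAIN on your own server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (Id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE contacts (
        Id INTEGER PRIMARY KEY, name TEXT, city TEXT, manager_Id INTEGER
    );
    -- The two indexes suggested above.
    CREATE INDEX idx_cities_city ON cities(city);
    CREATE INDEX idx_contacts_mgr_city ON contacts(manager_Id, city);

    -- Hypothetical sample data.
    INSERT INTO cities (city) VALUES ('Austin'), ('Boston'), ('Chicago');
    INSERT INTO contacts (name, city, manager_Id) VALUES
        ('Ann', 'Austin', 1), ('Bob', 'Austin', 1),
        ('Cid', 'Boston', 1), ('Dee', 'Chicago', 2);
""")

# Reduce contacts to the distinct cities of one manager first, then join.
query = """
    SELECT cty.*
    FROM (SELECT DISTINCT c.city FROM contacts c WHERE c.manager_Id = 1) pq
    JOIN cities cty ON pq.city = cty.city
    ORDER BY cty.city
"""
manager_cities = conn.execute(query).fetchall()
print(manager_cities)

# Show how SQLite plans the query (the output format varies by version).
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(plan_row)
```

Two contacts of manager 1 share a city here, yet that city comes back only once, which is the point of the DISTINCT pre-query.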
Is the 'City' field in the 'Contacts' table a VARCHAR?
If that's the case, I see multiple things here.
First of all, since you already have the 'Id' for the corresponding city in your 'cities' table, I don't see why you wouldn't use that 'Id' in the 'Contacts' table.
You can add an 'IdCity' field to the 'Contacts' table so you don't have to modify your existing records.
You'll have to populate 'IdCity' for each of your records, though, either manually or with a query that matches the 'city' (city name) in 'Contacts' against the 'cities' table.
Returning to your query:
Then join on the INT column instead of the VARCHAR column. With this many records, that can make a significant difference in performance.
It looks like you need to add two indexes, one on cities.city and one on (contacts.manager_Id, contacts.city). That should speed things up significantly.
I have 4 tables containing ids and names for different fields, and a master table that contains only ids. I need to create a query that returns the names.
This is the structure (simplified)
table region = columns id, name
table country = columns id, name
table ethnics = columns id, name
table religion = columns id, name
table master = columns region, country,ethnics, religion
The master table contains ONLY ids in each column, and I need to return the names that match those ids, but I can't work out the proper JOIN syntax.
Any hints?
Try this:
select region.name, country.name, ethnics.name, religion.name
from master
join region on (region.id = master.region)
join country on (country.id = master.country)
join ethnics on (ethnics.id = master.ethnics)
join religion on (religion.id = master.religion)
Then you can add any where clauses that you might need to filter the results.
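Since all four lookup tables have a column called name, it probably helps to alias the output columns. Here is a small sketch of the same query, run through Python's sqlite3 module as a stand-in database. The sample values ('region A', 'country A', ...) and the aliases are placeholders I made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by alias
conn.executescript("""
    CREATE TABLE region   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE country  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ethnics  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE religion (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE master (
        region INTEGER, country INTEGER, ethnics INTEGER, religion INTEGER
    );
    -- Placeholder data.
    INSERT INTO region   VALUES (1, 'region A'),   (2, 'region B');
    INSERT INTO country  VALUES (1, 'country A'),  (2, 'country B');
    INSERT INTO ethnics  VALUES (1, 'ethnics A'),  (2, 'ethnics B');
    INSERT INTO religion VALUES (1, 'religion A'), (2, 'religion B');
    INSERT INTO master   VALUES (1, 2, 1, 2);
""")

# One join per lookup table; aliases keep the four "name" columns apart.
names = [dict(row) for row in conn.execute("""
    SELECT region.name   AS region,
           country.name  AS country,
           ethnics.name  AS ethnics,
           religion.name AS religion
    FROM master
    JOIN region   ON region.id   = master.region
    JOIN country  ON country.id  = master.country
    JOIN ethnics  ON ethnics.id  = master.ethnics
    JOIN religion ON religion.id = master.religion
""")]
print(names)
```

If some ids in master can be NULL or have no match, a LEFT JOIN on that table would keep the row instead of dropping it.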
I have two tables in MySQL like:
table 1: city
id name transport
1 new-york 1,3,4
2 dallas 3,4
3 la 1,2,4
4 california 3,4
table 2: transport
id name
1 bus
2 trolleybus
3 train
4 metro
Can I get a result like the example below with one query?
result:
id name transport
1 new-york bus,train,metro
2 dallas train,metro
3 la bus,trolleybus,metro
4 california train,metro
You should change your database structure and normalize it. Never store data as a comma-separated list; it is a bad way to store data. However, until you fix the database design, the following query should do what you are looking for.
select
id,
name,
group_concat(transport)
from
(
select
c.id,
c.name,
t.transport as transport
from city c
join transport t on find_in_set(t.id,c.transport)
)x
group by id ;
If you need to order the transport values then you can use
group_concat(transport ORDER BY transport)
Why is comma-separation bad practice?
You can read the following on why it should be avoided:
Is storing a delimited list in a database column really that bad?
To normalize the database you will need to create another table:
city_transport (cid int, tid int);
cid = city id
tid = transport id
For each city you will have multiple entries in this table. So the tables should look like
create table city (id int , name varchar(100));
insert into city values
(1,'new-york'),(2,'dallas'),(3,'la'),(4,'california');
create table transport (id int ,transport varchar(100));
insert into transport values
(1,'bus'),(2,'trolleybus'),(3,'train'),(4,'metro');
create table city_transport (cid int ,tid int);
insert into city_transport values
(1,1),(1,3),(1,4),(2,3),(2,4),(3,1),(3,2),(3,4),(4,3),(4,4);
And the query to get the same result is as
select
c.id,
c.name,
group_concat(t.transport order by t.transport) as transport
from city_transport ct
join city c on c.id = ct.cid
join transport t on t.id = ct.tid
group by c.id ;
When you have a large amount of data you will need indexes, and a join on indexed columns will perform far better than find_in_set over a comma-separated list.
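What the answer above does not show is how to get from the existing comma-separated column to city_transport. One possible way, sketched here with Python's sqlite3 module as a stand-in for MySQL, is to split each list in application code and insert one row per pair. This sketch uses the column names from the question (city.transport holds the list, transport.name holds the label); the composite primary key on (cid, tid) is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, transport TEXT);
    CREATE TABLE transport (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city_transport (
        cid INTEGER NOT NULL REFERENCES city(id),
        tid INTEGER NOT NULL REFERENCES transport(id),
        PRIMARY KEY (cid, tid)  -- assumption: also serves as the join index
    );
    INSERT INTO city VALUES
        (1, 'new-york', '1,3,4'), (2, 'dallas', '3,4'),
        (3, 'la', '1,2,4'), (4, 'california', '3,4');
    INSERT INTO transport VALUES
        (1, 'bus'), (2, 'trolleybus'), (3, 'train'), (4, 'metro');
""")

# Split each comma-separated list into one (cid, tid) pair per transport.
pairs = [
    (city_id, int(tid))
    for city_id, csv in conn.execute("SELECT id, transport FROM city").fetchall()
    for tid in csv.split(",")
]
conn.executemany("INSERT INTO city_transport (cid, tid) VALUES (?, ?)", pairs)

# Plain joins on the junction table replace find_in_set.
rows = conn.execute("""
    SELECT c.id, c.name, t.name
    FROM city_transport ct
    JOIN city c ON c.id = ct.cid
    JOIN transport t ON t.id = ct.tid
    ORDER BY c.id, t.id
""").fetchall()

# Group the transport names per city in Python for display.
result = {}
for city_id, city_name, transport_name in rows:
    result.setdefault((city_id, city_name), []).append(transport_name)
for (city_id, city_name), transport_names in result.items():
    print(city_id, city_name, ",".join(transport_names))
```

Once city_transport is filled and verified, the old comma-separated column can be dropped.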
You should work with a table between city and transport to be correct. That being said, you could fix this using REPLACE() and subqueries but the performance will be horrible.
I have a table with a bunch of orders... one of the columns is order_status. The data in that column ranges from 1 to 5. Each number relates to a name, which is stored in another table that relates that number to the respective name.
SELECT order_id , order_status FROM tablename1
The above would just return the numbers 1,2,3,4,5 for order status. How can I query within the query on the fly to replace these numbers with their respective names?
Also, what's the term used to describe this? I'd Google it if I knew the appropriate term.
Each number relates to a name, which is stored in another table that
relates that number to the respective name.
JOIN it with the other table:
SELECT
t.order_id,
s.StatusName
FROM tablename1 AS t
INNER JOIN statusesTable AS s ON t.order_status = s.status_id;
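Here is a tiny end-to-end sketch of that join, using Python's sqlite3 module as a stand-in database. The status names ('pending', 'shipped') and the order ids are invented; statusesTable, status_id and StatusName are just the placeholder names from the query above, so substitute your real table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: one row per status number (names are hypothetical).
    CREATE TABLE statusesTable (status_id INTEGER PRIMARY KEY, StatusName TEXT);
    CREATE TABLE tablename1 (
        order_id     INTEGER PRIMARY KEY,
        order_status INTEGER REFERENCES statusesTable(status_id)
    );
    INSERT INTO statusesTable VALUES (1, 'pending'), (2, 'shipped');
    INSERT INTO tablename1 VALUES (100, 1), (101, 2), (102, 1);
""")

# The join swaps each status number for its name.
orders = conn.execute("""
    SELECT t.order_id, s.StatusName
    FROM tablename1 AS t
    INNER JOIN statusesTable AS s ON t.order_status = s.status_id
    ORDER BY t.order_id
""").fetchall()
print(orders)
```

As for the term: this is a join against a lookup table, so "SQL join" or "lookup table" are the phrases to search for.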
I have rows of entities stored in a MySQL table:
entity table
order
id
name
type (refers to Type(id))
and rows of types stored in another table
Type table
id
name
order
The ORDER column in the Type table specifies how the entities should be sorted: some should be ordered by name and some by id.
How do I create a MySQL query that gets the ORDER BY clause from the Type table and sorts the entities in the entity table by the ORDER stored for that entity's type?
For example, let us say I have the following rows:
Entity table:
row 1:
id = 1
name = Virginia
type = 1
row 2:
id = 2
name = Virginia
type = 1
row 3:
id = 3
name = Canada
type = 2
types (rows in Type table)
row 1
id = 1
name = states
order = "name"
row 2:
id = 2
name = countries
order = id
I want to run the following query:
SELECT entities.id, entities.name FROM entities INNER JOIN type ON entities.type = type.id ORDER BY ....
In the ORDER BY I want to order the entities based on what is stored in the ORDER column of the type table. So countries should be sorted by Entity(ID) and states should be sorted by Entity(name). How can I do that?
This doesn't seem like a very good design for a database. For your example, I would suggest something more similar to this:
CREATE TABLE countries (
countryID INT NOT NULL,
countryName VARCHAR(30)
);
CREATE TABLE states (
stateID INT NOT NULL,
countryID INT,
stateName VARCHAR(30)
);
Then you can perform queries like:
SELECT c.countryName, s.stateName
FROM countries c LEFT JOIN states s ON c.countryID = s.countryID
ORDER BY countryName, stateName;
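To see what that query returns, here is a small sketch of the suggested schema, run through Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, loosely following the Virginia/Canada example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (
        countryID   INT NOT NULL,
        countryName VARCHAR(30)
    );
    CREATE TABLE states (
        stateID   INT NOT NULL,
        countryID INT,
        stateName VARCHAR(30)
    );
    -- Hypothetical sample data; Canada has no states listed.
    INSERT INTO countries VALUES (1, 'Canada'), (2, 'United States');
    INSERT INTO states VALUES (1, 2, 'Virginia'), (2, 2, 'Ohio');
""")

# LEFT JOIN keeps countries that have no matching state rows.
listing = conn.execute("""
    SELECT c.countryName, s.stateName
    FROM countries c LEFT JOIN states s ON c.countryID = s.countryID
    ORDER BY countryName, stateName
""").fetchall()
print(listing)
```

Canada still appears, with an empty state, because of the LEFT JOIN; an INNER JOIN would drop it.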
At the very least, I would suggest using more obvious names for your columns, like in your entity table, you have a column named 'type' which refers to the 'id' field in the type table. Perhaps name them both typeID or something more obvious.
I also think it's a bad idea to create a column that stores information about which column to order by. Not only does that mean that you'll have to execute two queries every time (one to fetch the order by, and one to fetch the actual data), but you will also be storing a lot of extra data unnecessarily.