Inner join in MySQL takes a long time - mysql

I have a table contacts with more than 1,000,000 records and another table cities with about 20,000 records. I need to fetch all cities that are used in the contacts table.
The contacts table has the following columns:
Id, name, phone, email, city, state, country, postal, address, manager_Id
The cities table has:
Id, city
I used an inner join for this, but the query takes more than 2 minutes to execute.
I used this query:
SELECT cities.* FROM cities
INNER JOIN contacts ON contacts.City = cities.city
WHERE contacts.manager_Id= 1
I created an index on manager_Id as well, but it is still very slow.

For better performance you could add these indexes:
on table cities, an index on column city
on table contacts, a composite index on columns (manager_id, city)
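A minimal sketch of those two indexes, followed by an EXPLAIN of the original query to check whether MySQL picks them up. The index names are hypothetical; the table and column names are taken from the question.

```sql
-- Index names are illustrative; pick whatever fits your naming scheme.
CREATE INDEX idx_cities_city ON cities (city);
CREATE INDEX idx_contacts_mgr_city ON contacts (manager_Id, city);

-- Inspect the plan: the "key" column shows which index each table uses.
EXPLAIN
SELECT cities.*
FROM cities
INNER JOIN contacts ON contacts.city = cities.city
WHERE contacts.manager_Id = 1;
```

If the row for contacts shows "Using index" in the Extra column, the composite index covers both the filter and the join column, so the contact rows themselves are not read.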

Filter contacts first and then join to cities:
SELECT ct.*
FROM cities ct INNER JOIN (
SELECT city FROM contacts
WHERE manager_Id = 1
) cn ON cn.city = ct.city
You need indexes for city in both tables and for manager_id in contacts.

As others have pointed out about having proper indexes, I am taking it a bit further for clarification. You are specifically looking for contacts where the MANAGER ID = 1. This is not expected to be one person, but could be many people. So having the MANAGER ID in the first position will optimize "get me all people for that manager". By having the city as part of the index via (manager_id, city), you are pulling the two data elements you need as part of the index. This way the engine does not have to go to the raw data pages to get the other part of interest.
Now, from that, you want all the city information (hence the join to the cities table on the city).
Since you are only querying the CITIES and not the actual contact information, you probably want DISTINCT cities. Let's say a manager is responsible for 50 people and most of them live in the same city or neighboring ones. You may have 5 distinct cities? That too will limit the result set of the join.
Having said that, I would do as follows, and with MySQL, using STRAIGHT_JOIN can help optimize by "do the query as I wrote it, don't think for me".
select STRAIGHT_JOIN
cty.*
from
( select distinct c.City
from Contacts c
where c.Manager_ID = 1 ) PQ
JOIN Cities cty
on PQ.City = Cty.City
The "PQ" is an alias representing my "pre-query" of just DISTINCT cities for a given manager.
Again, have one index on the Contacts table on (manager_id, city). On the Cities table, I would expect an index on (city).

You need two indexes, one on each table.
On the contacts table, index manager_Id first, then City:
CREATE INDEX idx_contacts_mgr_city ON contacts(manager_Id, City);
On the cities table, just index City.

Is the 'City' field in the 'Contacts' table a VARCHAR?
If that's the case, I see multiple things here.
First of all, since you already have the 'Id' for the corresponding city in your 'cities' table, I don't see why you would not use that same 'Id' in the 'Contacts' table.
You can add an 'IdCity' field to the 'Contacts' table so you don't have to modify your existing records.
You'll have to fill in 'IdCity' for each of your records, though, either manually or with a query that matches 'Contacts' against the 'cities' table on the city name and stores the matching 'Id' in 'IdCity'.
Returning to your query:
Then, join on the INT column instead of the VARCHAR column. Since you have many records, this can make a significant difference in performance.
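A rough sketch of that migration, assuming cities.Id is an INT primary key and that each city name appears only once in cities. The column name IdCity comes from the text above; the index name is hypothetical.

```sql
-- Add the new integer reference column (nullable until it is backfilled).
ALTER TABLE contacts ADD COLUMN IdCity INT NULL;

-- Backfill: match on the city name once, store the numeric Id.
UPDATE contacts c
INNER JOIN cities ci ON ci.city = c.City
SET c.IdCity = ci.Id;

-- Composite index so the manager filter and the join column are both covered.
CREATE INDEX idx_contacts_mgr_idcity ON contacts (manager_Id, IdCity);

-- The original query, now joining on integers.
SELECT DISTINCT ci.*
FROM contacts c
INNER JOIN cities ci ON ci.Id = c.IdCity
WHERE c.manager_Id = 1;
```

Contacts whose city name has no exact match in cities keep a NULL IdCity and drop out of the inner join, so it is worth checking for those before relying on the new column.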

It looks like you need to add two indexes, one on cities.city and one on (contacts.manager_Id, contacts.city). That should speed things up significantly.

Related

Fulltext search on a column with data from 2 tables

I have two tables in a MySQL DB named 'Patients' and 'Country'.
The Patients table contains
'name', 'dob', 'postcode', 'address', 'country_id' etc.
The Country table has
'id' and 'country_name' columns.
Now, I want the user to enter anything from a patient's name, postcode or country and get the matching patient's data.
To achieve this, one way that I can think of is to perform the query using joins.
The other way I wanted to ask about: would it be a good approach to store the search variables, i.e. name, postcode and country, in a single full-text column like 'name_postcode_country', and run the full-text search on that new column when a user enters a search term?
Or is there any other better approach that I should be considering?
It's not a good idea to hold all that info in a single column. You may instead use a SELECT that JOINs the mentioned tables:
select p.name, p.dob, p.postcode, p.address,
c.country_name
from Patients p
inner join Country c
on ( p.country_id = c.id )
where ( upper(p.name) like upper('%my_name_string%') )
or ( upper(p.postcode) like upper('%my_postcode_string%') )
or ( upper(c.country_name) like upper('%my_country_string%') );
You need to use the upper or lower functions to guard against case-sensitivity problems.
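Since the question asks about full-text search, here is a hedged sketch of that alternative without merging everything into one column. It assumes the name and postcode columns are CHAR, VARCHAR or TEXT and that the table engine supports FULLTEXT indexes; the index name and the search string are placeholders.

```sql
-- One FULLTEXT index over the two searchable Patients columns.
ALTER TABLE Patients
    ADD FULLTEXT INDEX ft_patients_name_postcode (name, postcode);

-- MATCH must list exactly the columns of the FULLTEXT index.
-- The country name lives in another table, so it is matched separately.
SELECT p.name, p.dob, p.postcode, p.address, c.country_name
FROM Patients p
INNER JOIN Country c ON p.country_id = c.id
WHERE MATCH(p.name, p.postcode) AGAINST ('my_search_string' IN NATURAL LANGUAGE MODE)
   OR c.country_name LIKE 'my_search_string%';
```

Full-text matching works on whole words, so it behaves differently from LIKE '%...%': very short tokens may be ignored depending on the server's minimum token size setting, and substrings in the middle of a word are not found.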

sql table design to fetch records with multiple inclusion and exclusion conditions

We want to select customers based on the following parameters, i.e. the customer should:
be in a specific city, i.e. cityId=1,2,3...
not be one of the excluded customers, i.e. customerId=33,2323,34534...
have a specific age, i.e. 5 years, 7 years, 72 years...
These inclusion & exclusion lists can be arbitrarily long.
How should we design the database for this:
Create a separate table 'customerInclusionCities' for the inclusion cities and query like:
select * from customers where cityId in (select cityId from customerInclusionCities)
We do the same for age, creating a table 'customerEligibleAge' with all eligible ages:
i.e. select * from customers where age in (select age from customerEligibleAge)
and create a separate table 'customerIdToBeExcluded' for the excluded customers:
i.e. select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
OR
Create one table with Category and Ids,
i.e. Category1 for cities, Category2 for CustomerIds to be excluded.
Which approach is better: creating one table for these parameters, or creating separate tables for each list, i.e. age, customerId, city?
IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:
SELECT ...
FROM tbl
WHERE cityId IN (1,2,3...)
AND customerId NOT IN (33,2323,34534...)
AND age IN (5, 7, 72)
Have (at least):
INDEX(cityId),
INDEX(age)
(Negated things are unlikely to be able to use an index.)
The query will use one of the indexes; having both will give the Optimizer a choice of which it thinks is better.
Or...
SELECT c.*
FROM customers AS c
JOIN customerInclusionCities AS ci ON ci.cityId = c.cityId
JOIN customerEligibleAge AS ce ON ce.age = c.age
LEFT JOIN customerIdToBeExcluded AS ex ON ex.customerId = c.customerId
WHERE ex.customerId IS NULL
Suggested indexes (probably as PRIMARY KEY):
customerInclusionCities: (cityId)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
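For illustration, the three list tables from the question could be declared roughly like this, with the listed column as PRIMARY KEY. The INT column types are an assumption, and the sample values are the ones given in the question.

```sql
-- One narrow table per list; the single column doubles as the primary key.
CREATE TABLE customerInclusionCities (
    cityId INT NOT NULL,
    PRIMARY KEY (cityId)
);

CREATE TABLE customerEligibleAge (
    age INT NOT NULL,
    PRIMARY KEY (age)
);

CREATE TABLE customerIdToBeExcluded (
    customerId INT NOT NULL,
    PRIMARY KEY (customerId)
);

-- Sample rows matching the values in the question.
INSERT INTO customerInclusionCities (cityId) VALUES (1), (2), (3);
INSERT INTO customerEligibleAge (age) VALUES (5), (7), (72);
INSERT INTO customerIdToBeExcluded (customerId) VALUES (33), (2323), (34534);
```

With primary keys in place, each join against a list table is a unique-key lookup, and duplicate list entries are rejected at insert time.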
In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
If you use the database only for that operation, I recommend the first solution. The first solution is also very simple to deploy.
The second solution fills the DB with junk.

Deep joins performance

I have three tables named users, cities and countries and these two scenarios:
1) User belongs to city, city belongs to country (deep join)
Table users has 2 fields: id (PK) and city_id (FK).
Table cities has 2 fields: id (PK) and country_id (FK).
Table countries has 2 fields: id (PK) and name.
Get any user's country:
SELECT countries.name
FROM users
LEFT JOIN cities ON users.city_id = cities.id
LEFT JOIN countries ON cities.country_id = countries.id
WHERE users.id = 1;
2) User belongs to city and country, city belongs to country (one join)
Table users has 3 fields: id (PK), city_id (FK) and country_id (FK).
Table cities has 2 fields: id (PK) and country_id (FK).
Table countries has 2 fields: id (PK) and name.
Get any user's country:
SELECT countries.name
FROM users
LEFT JOIN countries ON users.country_id = countries.id
WHERE users.id = 1;
At first glance, scenario 2 seems faster, but is it a good idea to have a country_id FK in the users table to save one join? Or should I take advantage of the relationships and make a deep join? Which of these two scenarios actually performs faster?
One join is almost always faster than 2 joins, but the question here shouldn't be which is faster but which is more maintainable (also look at When to optimize).
Are you actually having a performance problem? Even though in this case the data probably never changes (at least, cities usually don't change country) there is still a risk that the data between the tables gets out of date. So the question here is, is it fast enough?
These types of optimisations generally give very little benefit in terms of performance but bring in risks that the data will be out of date and it makes things more complex.
In the first situation you are doing primary-key-based lookups on three tables, and the second reduces that to only two tables. That is what I would consider a micro-optimization. You won't see significant performance returns unless the tables are enormous (millions of rows) or writes are happening quickly enough to cause lock contention.

Fetch data from table in mysql that must have corresponding records in another table

I want to show records from the station table where the station has at least one song in the song table.
Table structure
station
station_id
stration_name
station_description
song
song_id
station_id
song_location
Please suggest a way to form a query that shows station data for stations which have songs in the song table. Please specify a way that does not return stations whose song count is zero.
What you're looking for is an INNER JOIN. You could join your station table with your song table on station.station_id and song.station_id. This will work because INNER JOIN only returns rows for which the join predicate is satisfied.
I've made an example available at SQL Fiddle, but I do recommend spending a few minutes understanding mechanics of JOIN.
SELECT DISTINCT something
FROM somewhere
JOIN somewhere_else
ON somewhere_else.other_thing = somewhere.thing;
You could join the tables together on station_id.
It looks like each song is linked to a specific station, meaning that where the IDs (station_id) are equal, the station has this song...
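Applied to the station and song tables from the question, the join could look like the sketch below. The aliases are arbitrary, and the EXISTS variant is an alternative not mentioned in the answers; it should return the same stations as long as station_id is unique in station.

```sql
-- Stations that have at least one song; DISTINCT collapses the
-- one-row-per-song duplicates produced by the join.
SELECT DISTINCT s.*
FROM station AS s
INNER JOIN song AS sg ON sg.station_id = s.station_id;

-- Alternative: no duplicates to remove, each station is checked once.
SELECT s.*
FROM station AS s
WHERE EXISTS (
    SELECT 1
    FROM song AS sg
    WHERE sg.station_id = s.station_id
);
```

Either way, an index on song (station_id) helps the lookup from station to song.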

Normalizing MySQL table with records of another table

I have 2 tables. The city table is not normalized because the country information is in plain text. I have added the id_country column to the 'city' table (that column is empty).
I need to check for matches between city.country and country.country and then update the matching city records with the id_country from the country table. At the end I will be able to delete the 'country' column from the city table.
City table
id_city (1, 2, 3...)
city (Washington, Guayaquil, Bonn...)
country (Germany, Ecuador, USA...)
id_country (currently empty)
Country table
id_country (1, 2, 3...)
code (GE, EC, US...)
country (Germany, Ecuador, USA...)
I have no idea where to start, or whether it can be done with a SQL query. My original idea was to search for matches in a PHP loop, but that seems to be a much harder implementation.
You can do this with a JOIN on an UPDATE statement.
UPDATE city c1 INNER JOIN country c2 ON c1.country=c2.country
SET c1.id_country=c2.id_country;
Using an INNER JOIN will make sure that updates only occur for cities that have a matching country value.
Once you've run it, you'll be able to select all the cities that still have a null id_country, in case some of them didn't match. Conversely, once you've determined that all your cities have an id_country, you can drop the country column from the city table.
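The follow-up steps described above could be sketched as follows, using the table and column names from the question.

```sql
-- Cities whose country text did not match any row in country.
SELECT id_city, city, country
FROM city
WHERE id_country IS NULL;

-- Only once the query above returns no rows:
-- drop the now-redundant text column.
ALTER TABLE city DROP COLUMN country;
```

Dropping the column is not reversible without a backup, so it is worth fixing or reviewing any unmatched rows first.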
"The city table is not normalized because the country information is in plain text."
Nonsense. Normalization doesn't mean "replace plain text with id numbers". Find whoever taught you that and poke him in the eye with a sharp stick.
Your real problem is that "city" plus "country" isn't sufficient to identify cities, at least in the USA. I think there are at least a dozen different cities named "Washington" in the USA.
Instead of replacing the country name with an id number, you'd be far better off replacing it with the two-letter country code. The codes are human-readable; the id numbers will require an additional JOIN in every query that uses your table of cities.
Something like this should work:
UPDATE city set id_country = (SELECT country.id_country from country WHERE country.country = city.country)