MySQL: translate IDs to values - mysql

I have a table that contains a lot of columns with IDs (keys) corresponding to other tables.
For example, I have a table of cars that were sold:
[table of cars that were sold]
(
car_make_id
, car_engine_id
, car_model_id
, car_radio_id
, buyer_id
, seller_id
, car_title_id
, sale_price
)
Each of the ID fields has another table containing the ID and the name, like:
[another table]
(
car_make_id
, car_engine_id
, car_model_id
, car_radio_id
, buyer_id
, seller_id
, car_title_id
, sale_price
)
[and another table]
(
car_make
, car_make_id
)
[and another table]
(
car_title
, car_title_id
)
etc., with each table named car_lookup, car_model_lookup, ...
Is there any way to join all of these simply, without writing a million subqueries? There are millions of entries in this table, and each additional join costs a lot of time. I am looking for a fast and efficient way to compare this data against another table that doesn't have IDs, just the names. Let's say I have a list of compatible radios with (make, model, engine, radio), and I want a list of the names of all sellers who sold cars with incompatible radios, along with how many incompatible sales each one made.
I have been doing stuff like this in Perl, but it can take hours to run, so I am looking for something that can be done in MySQL.
PS: the car stuff is just an example. I don't actually work with cars, but it illustrates the problem I am having. I cannot change the way the database is set up either, because a large amount of existing code already queries the data.
Thanks

You need some way of telling the database which tables to pull names from for each ID.
If this kind of query is too slow, perhaps you can tune your database or MySQL server to execute these JOINs faster. Try increasing cache sizes (especially if your server has plenty of RAM) and make sure the ID columns of those lookup tables are indexed.
SELECT car_make, car_engine, car_model, car_radio,
       buyer, seller, car_title, sale_price
FROM cars_sold
JOIN car_make_lookup USING (car_make_id)
JOIN car_engine_lookup USING (car_engine_id)
JOIN car_model_lookup USING (car_model_id)
JOIN car_radio_lookup USING (car_radio_id)
JOIN buyer_lookup USING (buyer_id)
JOIN seller_lookup USING (seller_id)
JOIN car_title_lookup USING (car_title_id);
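The "sellers with incompatible radios" report from the question could be sketched along the same lines. This is a minimal sketch, not a tuned solution: it uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, only the lookup tables the report needs, made-up sample rows, and a hypothetical compatible_radios table holding the name-only list. A LEFT JOIN that finds no match (IS NULL) marks a sale as incompatible.

```python
import sqlite3

# In-memory SQLite stands in for MySQL. Lookup table names follow the answer
# above; compatible_radios and all sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car_make_lookup   (car_make_id   INTEGER PRIMARY KEY, car_make   TEXT);
CREATE TABLE car_model_lookup  (car_model_id  INTEGER PRIMARY KEY, car_model  TEXT);
CREATE TABLE car_engine_lookup (car_engine_id INTEGER PRIMARY KEY, car_engine TEXT);
CREATE TABLE car_radio_lookup  (car_radio_id  INTEGER PRIMARY KEY, car_radio  TEXT);
CREATE TABLE seller_lookup     (seller_id     INTEGER PRIMARY KEY, seller     TEXT);
CREATE TABLE cars_sold (
    car_make_id INTEGER, car_engine_id INTEGER, car_model_id INTEGER,
    car_radio_id INTEGER, seller_id INTEGER, sale_price REAL
);
CREATE TABLE compatible_radios (
    car_make TEXT, car_model TEXT, car_engine TEXT, car_radio TEXT
);

INSERT INTO car_make_lookup   VALUES (1, 'Acme');
INSERT INTO car_model_lookup  VALUES (1, 'Roadster');
INSERT INTO car_engine_lookup VALUES (1, 'V6');
INSERT INTO car_radio_lookup  VALUES (1, 'AM/FM'), (2, 'Satellite');
INSERT INTO seller_lookup     VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO compatible_radios VALUES ('Acme', 'Roadster', 'V6', 'AM/FM');
INSERT INTO cars_sold VALUES
    (1, 1, 1, 1, 1, 9000),   -- Alice, compatible radio
    (1, 1, 1, 2, 1, 9500),   -- Alice, incompatible radio
    (1, 1, 1, 2, 2, 8000),   -- Bob, incompatible radio
    (1, 1, 1, 2, 2, 8200);   -- Bob, incompatible radio
""")

# Translate IDs to names, then anti-join against the name-only list:
# rows with no match in compatible_radios are the incompatible sales.
INCOMPATIBLE_SALES_SQL = """
SELECT s.seller, COUNT(*) AS incompatible_sales
FROM cars_sold cs
JOIN car_make_lookup   mk USING (car_make_id)
JOIN car_model_lookup  md USING (car_model_id)
JOIN car_engine_lookup en USING (car_engine_id)
JOIN car_radio_lookup  ra USING (car_radio_id)
JOIN seller_lookup     s  USING (seller_id)
LEFT JOIN compatible_radios cr
       ON cr.car_make   = mk.car_make
      AND cr.car_model  = md.car_model
      AND cr.car_engine = en.car_engine
      AND cr.car_radio  = ra.car_radio
WHERE cr.car_radio IS NULL
GROUP BY s.seller
ORDER BY s.seller
"""

incompatible = conn.execute(INCOMPATIBLE_SALES_SQL).fetchall()
print(incompatible)  # [('Alice', 1), ('Bob', 2)]
```

On millions of rows, a composite index on the four name columns of the compatibility list would probably matter more than anything else here, but that is a guess until you look at EXPLAIN output on the real data.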

Related

Fulltext search on a column with data from 2 tables

I have two tables in a MySQL DB named 'Patients' and 'Country'.
Patient table contains
'name', 'dob', 'postcode', 'address', 'country_id' etc.
Country table has
'id' and 'country_name' columns.
Now, I want the user to enter anything from a patient's name, postcode or country and get the required patient's result/data.
To achieve this, one way that I can think of is to perform the query using joins.
The other way I wanted to ask about: would it be a good approach to store the search fields (name, postcode and country) together in a single full-text column, something like 'name_postcode_country', and run a full-text search on that column when a user enters a search term?
Or is there a better approach that I should be considering?
It's not a good idea to hold all that information in a single column. Instead, use a SELECT that JOINs the two tables:
select p.name, p.dob, p.postcode, p.address,
c.country_name
from Patients p
inner join Country c
on ( p.country_id = c.id )
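where ( upper(p.name) like upper('%my_name_string%') )
or ( upper(p.postcode) like upper('%my_postcode_string%') )
or ( upper(c.country_name) like upper('%my_country_string%') );
Apply the upper or lower function to both sides to avoid case-sensitivity problems. (With MySQL's default case-insensitive collations LIKE already ignores case, so this mainly matters for case-sensitive or binary collations.)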
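For the single search box described in the question, the same join can be driven by one bound parameter. A minimal sketch, assuming Python's sqlite3 with an in-memory database as a stand-in for MySQL; the helper name search_patients and the sample rows are made up for illustration. Note that % and _ typed by the user still act as LIKE wildcards here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Country (id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE Patients (
    name TEXT, dob TEXT, postcode TEXT, address TEXT,
    country_id INTEGER REFERENCES Country(id)
);
INSERT INTO Country VALUES (1, 'France'), (2, 'Poland');
INSERT INTO Patients VALUES
    ('Ann Lee',     '1980-01-01', 'AB1 2CD', '1 High St', 1),
    ('Bob Ray',     '1975-05-05', 'ZZ9 9ZZ', '2 Low St',  2),
    ('Cara Poland', '1990-09-09', 'XY1 0AA', '3 Mid St',  1);
""")

def search_patients(conn, term):
    """Hypothetical helper: match one search term against name, postcode or country."""
    pattern = f"%{term.upper()}%"  # bound as a parameter, never pasted into the SQL
    return conn.execute(
        """
        SELECT p.name, p.postcode, c.country_name
        FROM Patients p
        INNER JOIN Country c ON p.country_id = c.id
        WHERE upper(p.name) LIKE :pat
           OR upper(p.postcode) LIKE :pat
           OR upper(c.country_name) LIKE :pat
        ORDER BY p.name
        """,
        {"pat": pattern},
    ).fetchall()

print(search_patients(conn, "poland"))
print(search_patients(conn, "ab1"))
```

A leading % prevents an ordinary index from being used, so on a large Patients table this scans every row. That is the point at which a FULLTEXT index on the real columns becomes worth measuring, rather than a concatenated column.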

Specific MySQL Normalization for many same inputs

Thank you for opening this post. I need your help with MySQL DB Normalization.
I have 10 salespeople selling stuff, and a few people on the support team who do telemarketing sales for the salesmen. I was wondering what the best way to normalize the table is, taking into consideration that salesmen often quit, so I have to transfer their data to another sales agent.
Currently I have only one table. Is it better to put everything into that one table, or to separate it into several smaller tables? Currently my table looks like this:
id , mb , company_name , city , company_owner , phone_no1 , phone_no2 , app_status , sales_agent , cc_agent , three_options , exp_date , cc_comment , sales_comment , input_date , call_made , status
id = AI PK UQ
mb = is unique key that I provide
company_name, city, company_owner, cc_comment, sales_comment, input_date, call_made, phone_no1 and phone_no2 should be in one table since it's all different. Right?
sales_agent = I have 10 people
cc_agent = I have 3 people
app_status = I have 3 statuses to select, so it has to be one out of three
status = I have 15 statuses to pick from, it has to be only one of those
So, maybe one table for all changeable stuff, another for sales_agent, another for cc_agent, app_status and status?
Thank you in advance.
Lots of smaller tables is better. The goal of normalization is to eliminate repeating data, and increase referential integrity. You achieve that with lots of small tables and use of foreign keys.
When creating your table structure, you want to look for nouns. When you use a noun-phrase approach, identifying possible tables is easier. In your case, I see columns called company_name, company_owner, etc. Create a table called 'Company' and give it columns id, name, owner, city, state, zip etc.
You have sales people, agents, and support people. But they are all employees. So another possible table is 'Employee' and job_title is a column.
Really spend the time and think out the database structure, and try to get the database into 3rd normal form.
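To make that concrete for the "salesman quits" case, here is a minimal sketch of the Company/Employee split. It assumes Python's sqlite3 with an in-memory database as a stand-in for MySQL; the sales_agent_id column, the helper reassign_accounts and the sample rows are my own assumptions, not part of your schema. Once the agent is a foreign key instead of repeated text, handing over accounts is a single UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Employee (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    job_title TEXT NOT NULL            -- e.g. 'sales' or 'support'
);
CREATE TABLE Company (
    id             INTEGER PRIMARY KEY,
    mb             TEXT NOT NULL UNIQUE,   -- the unique key from the question
    name           TEXT NOT NULL,
    owner          TEXT,
    city           TEXT,
    sales_agent_id INTEGER REFERENCES Employee(id)   -- assumed column name
);
INSERT INTO Employee VALUES (1, 'Dana', 'sales'), (2, 'Eli', 'sales'), (3, 'Fay', 'support');
INSERT INTO Company VALUES
    (1, 'MB-001', 'Acme',    'A. Owner', 'Springfield', 1),
    (2, 'MB-002', 'Globex',  'G. Owner', 'Shelbyville', 1),
    (3, 'MB-003', 'Initech', 'I. Owner', 'Springfield', 2);
""")

def reassign_accounts(conn, from_agent_id, to_agent_id):
    """Hypothetical helper: move every company from one agent to another."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE Company SET sales_agent_id = ? WHERE sales_agent_id = ?",
            (to_agent_id, from_agent_id),
        )
    return cur.rowcount

moved = reassign_accounts(conn, from_agent_id=1, to_agent_id=2)
print(moved)  # 2
```

The statuses (app_status, status) could get the same treatment: a small lookup table each, referenced by ID from the main table.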

sql table design to fetch records with multiple inclusion and exclusion conditions

We want to select customers based on following parameters i.e. customer should be in:
specific city i.e. cityId=1,2,3...
specific customerId should be excluded i.e. customerId=33,2323,34534...
specific age i.e. 5 years, 7 years, 72 years...
These inclusion and exclusion lists can be arbitrarily long.
How should we design the database for this?
Create separate table 'customerInclusionCities' for these inclusion cities and do like:
select * from customers where cityId in (select cityId from customerInclusionCities)
Do the same for age: create a table 'customerEligibleAge' with an entry for each eligible age:
i.e. select * from customers where age in (select age from customerEligibleAge)
and Create separate table 'customerIdToBeExcluded' for excluding customers:
i.e. select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
OR
Create One table with Category and Ids.
i.e. Category1 for cities, Category2 for CustomerIds to be excluded.
Which approach is better, creating one table for these parameters OR creating separate tables for each list i.e. age, customerId, city?
IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:
SELECT ...
FROM tbl
WHERE cityId IN (1,2,3...)
AND customerId NOT IN (33,2323,34534...)
AND age IN (5, 7, 72)
Have (at least):
INDEX(cityId),
INDEX(age)
(Negated things are unlikely to be able to use an index.)
The query will use one of the indexes; having both will give the Optimizer a choice of which it thinks is better.
Or...
SELECT c.*
FROM customers AS c
JOIN customerInclusionCities AS ci ON ci.cityId = c.cityId
JOIN customerEligibleAge AS ce ON ce.age = c.age
LEFT JOIN customerIdToBeExcluded AS ex ON ex.customerId = c.customerId
WHERE ex.customerId IS NULL
Suggested indexes (probably as PRIMARY KEY):
customerInclusionCities: (cityId)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
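A runnable sketch of the separate-tables variant, for anyone who wants to try it. It assumes Python's sqlite3 with an in-memory database as a stand-in for MySQL and uses the table names from the question; the sample rows are made up. Each list table uses its single column as the primary key, and the exclusion list is applied as a LEFT JOIN ... IS NULL anti-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);
CREATE TABLE customerInclusionCities (cityId     INTEGER PRIMARY KEY);
CREATE TABLE customerEligibleAge     (age        INTEGER PRIMARY KEY);
CREATE TABLE customerIdToBeExcluded  (customerId INTEGER PRIMARY KEY);

INSERT INTO customers VALUES
    (1,  1, 5),    -- eligible city and age
    (2,  1, 6),    -- age not eligible
    (33, 2, 7),    -- eligible, but explicitly excluded
    (4,  9, 5),    -- city not eligible
    (5,  3, 72);   -- eligible city and age
INSERT INTO customerInclusionCities VALUES (1), (2), (3);
INSERT INTO customerEligibleAge     VALUES (5), (7), (72);
INSERT INTO customerIdToBeExcluded  VALUES (33);
""")

ELIGIBLE_SQL = """
SELECT c.customerId
FROM customers AS c
JOIN customerInclusionCities AS ci ON ci.cityId = c.cityId
JOIN customerEligibleAge     AS ce ON ce.age = c.age
LEFT JOIN customerIdToBeExcluded AS ex ON ex.customerId = c.customerId
WHERE ex.customerId IS NULL          -- keep only customers with no exclusion row
ORDER BY c.customerId
"""

eligible_ids = [row[0] for row in conn.execute(ELIGIBLE_SQL)]
print(eligible_ids)  # [1, 5]
```

Whether this beats the IN (constants) form on real data depends on list sizes and indexes, so compare the EXPLAIN output of both before committing to a design.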
If you use the database only for that operation, I recommend the first solution. It is also very simple to deploy.
The second solution fills the DB with junk.

How do I make the rows of a lookup table into the columns of a query?

I have three tables: students, interests, and interest_lookup.
Students has the cols student_id and name.
Interests has the cols interest_id and interest_name.
Interest_lookup has the cols student_id and interest_id.
To find out what interests a student has I do
select interests.interest_name from `students`
inner join `interest_lookup`
on interest_lookup.student_id = students.student_id
inner join `interests`
on interests.interest_id = interest_lookup.interest_id
What I want to do is get a result set like
student_id | students.name | interest_a | interest_b | ...
where the column name 'interest_a' is a value in interests.interest_name and
the interest_ columns are 0 or 1 such that the value is 1 when
there is a record in interest_lookup for the given
student_id and interest_id and 0 when there is not.
Each entry in the interests table must appear as a column name.
I can do this with subselects (which is super slow) or by making a bunch of joins, but both of these really require that I first select all the records from interests and write out a dynamic query.
You're doing an operation called a pivot. @Slider345 linked to (prior to editing his answer) another SO post about doing it in Microsoft SQL Server. Microsoft has its own special syntax to do this, but MySQL does not.
You can do something like this:
SELECT s.student_id, s.name,
SUM(i.interest_name = 'a') AS interest_a,
SUM(i.interest_name = 'b') AS interest_b,
SUM(i.interest_name = 'c') AS interest_c
FROM students s
INNER JOIN interest_lookup l USING (student_id)
INNER JOIN interests i USING (interest_id)
GROUP BY s.student_id;
What you cannot do, in MySQL or Microsoft or anything else, is automatically populate columns so that the presence of data expands the number of columns.
Columns of an SQL query must be fixed and hard-coded at the time you prepare the query.
If you don't know the list of interests at the time you code the query, or you need it to adapt to changing lists of interest, you'll have to fetch the interests as rows and post-process these rows in your application.
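The two-step approach (read the interests first, then generate the pivot query) might look like the sketch below. It is only an illustration under stated assumptions: Python's sqlite3 with an in-memory database stands in for MySQL, the helpers quote_ident and pivot_interests are made-up names, and the sample rows are invented. It uses a LEFT JOIN so that students with no interests still appear with all zeros.

```python
import sqlite3

def quote_ident(name):
    # SQL-standard identifier quoting; MySQL would use backticks by default.
    return '"' + name.replace('"', '""') + '"'

def pivot_interests(conn):
    """Hypothetical helper: one 0/1 column per row of the interests table."""
    # Step 1: fetch the current list of interests as rows.
    interests = conn.execute(
        "SELECT interest_id, interest_name FROM interests ORDER BY interest_id"
    ).fetchall()

    # Step 2: generate one aggregate column per interest. IDs are bound as
    # parameters; names only appear as quoted column aliases.
    columns = ["s.student_id AS student_id", "s.name AS name"]
    for _, interest_name in interests:
        alias = quote_ident("interest_" + interest_name)
        columns.append(
            f"SUM(CASE WHEN l.interest_id = ? THEN 1 ELSE 0 END) AS {alias}"
        )
    sql = f"""
        SELECT {", ".join(columns)}
        FROM students s
        LEFT JOIN interest_lookup l ON l.student_id = s.student_id
        GROUP BY s.student_id, s.name
        ORDER BY s.student_id
    """
    cur = conn.execute(sql, [interest_id for interest_id, _ in interests])
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students  (student_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE interests (interest_id INTEGER PRIMARY KEY, interest_name TEXT);
CREATE TABLE interest_lookup (
    student_id INTEGER, interest_id INTEGER,
    PRIMARY KEY (student_id, interest_id)
);
INSERT INTO students  VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO interests VALUES (1, 'chess'), (2, 'music');
INSERT INTO interest_lookup VALUES (1, 1), (1, 2), (2, 2);
""")

header, rows = pivot_interests(conn)
print(header)
for row in rows:
    print(row)
```

The column list is still fixed at the moment each query is prepared; the application just regenerates the query whenever the interests table changes.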
What you're trying to do sounds like a pivot.
Most solutions seem to revolve around one of the following approaches:
Creating a dynamic query, as in Is there a way to pivot rows to columns in MySQL without using CASE?
Selecting all the attribute columns, as in How to pivot a MySQL entity-attribute-value schema
Or, identifying the columns and using either a CASE statement or a user defined function as in pivot in mysql queries
I don't think this is possible. Actually, I think this is just a matter of data representation. I would try to display the data with a component that lets me pivot it (for instance, the same way you do in Excel, OpenOffice Calc, etc.).
To take it one step further, you should think again why you need this and probably try to solve it in the application not in the database.
I know this doesn't help much but it's the best I can think of :(

Database better inner join for performance in this case

I have 2 tables that manages the time spent on doing various things:
#times(id, time_in_minutes)
#times_intervals(id, times_id, time_in_minutes, start, end)
Then the #times might relate to different things:
#tasks(id, description)
#products(id, description, serial_number, year)
What is the best practice in order to reuse the same #times and #times_intervals for #task and #products?
I would think about:
#times(+task_id, +product_id)
// add task_id and product_id to the original #times table
But if I do so, joining the #times table with the #tasks and #products tables would be slower, since the join has to choose between the two (task_id or product_id): when task_id is not null, join on #tasks, and vice versa.
(I'm using MySQL6)
Thanks a lot
I would drop the time_in_minutes column from the times table. This information is redundant if it is just the sum of the detail and is a premature optimization.
I would add a product_time table containing product_id and times_id, and a task_time table containing task_id and times_id.
Then to get the total time with a product:
SELECT *
FROM products p
INNER JOIN product_time pt
ON pt.product_id = p.id
INNER JOIN (
SELECT times_id, SUM(time_in_minutes) as time_in_minutes
FROM times_intervals
GROUP BY times_id
) AS t
ON t.times_id = pt.times_id
Typically, to make this perform, you would have a non-clustered covering index on times_intervals with the columns times_id and time_in_minutes. Note that the times table is simply a data-less header table at this point: its only purpose is to group the times_intervals, and it's only necessary because you have this very similar arrangement for tasks.
If there were not two (or more) entities using the times_intervals, you might simply put product_id in the times_intervals and treat it as your header/master id.
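Put together, the link-table layout could be sketched like this. Assumptions: Python's sqlite3 with an in-memory database stands in for MySQL, the start/end columns of times_intervals are left out for brevity, and the index name and sample rows are made up. The query sums the intervals per times_id in a derived table and then totals them per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE times    (id INTEGER PRIMARY KEY);   -- data-less header table
CREATE TABLE times_intervals (
    id INTEGER PRIMARY KEY,
    times_id INTEGER NOT NULL REFERENCES times(id),
    time_in_minutes INTEGER NOT NULL              -- start/end omitted in this sketch
);
CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE tasks    (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE product_time (
    product_id INTEGER REFERENCES products(id),
    times_id   INTEGER REFERENCES times(id),
    PRIMARY KEY (product_id, times_id)
);
CREATE TABLE task_time (
    task_id  INTEGER REFERENCES tasks(id),
    times_id INTEGER REFERENCES times(id),
    PRIMARY KEY (task_id, times_id)
);
-- Covering index for the per-times_id SUM (index name is an assumption).
CREATE INDEX idx_intervals_times ON times_intervals (times_id, time_in_minutes);

INSERT INTO times VALUES (1), (2), (3);
INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO tasks VALUES (1, 'Assemble');
INSERT INTO product_time VALUES (1, 1), (2, 2);
INSERT INTO task_time VALUES (1, 3);
INSERT INTO times_intervals (times_id, time_in_minutes) VALUES
    (1, 30), (1, 15), (2, 60), (3, 20);
""")

PRODUCT_TOTALS_SQL = """
SELECT p.id, p.description, SUM(t.time_in_minutes) AS total_minutes
FROM products p
INNER JOIN product_time pt ON pt.product_id = p.id
INNER JOIN (
    SELECT times_id, SUM(time_in_minutes) AS time_in_minutes
    FROM times_intervals
    GROUP BY times_id
) AS t ON t.times_id = pt.times_id
GROUP BY p.id, p.description
ORDER BY p.id
"""

product_totals = conn.execute(PRODUCT_TOTALS_SQL).fetchall()
print(product_totals)  # [(1, 'Widget', 45), (2, 'Gadget', 60)]
```

The same query shape works for tasks by swapping in task_time and tasks, with no nullable task_id/product_id columns on times.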
I would suggest against adding an id column to times for every table you might join it to. It would break normalization and make joins much more complicated.
If you only have one time (or time interval) for a task or a product, make a column in that table that references the times table. Otherwise, you could make a separate table like
#multitimes(multi_id, time_id)
where the two columns together are a primary key, and then have products and tasks reference multi_id. Then each record in each of those tables can be related to any number of times without any conflicts.