I'm wondering if this is even possible.
I want to join 2 tables based on the data of table 1.
Example: table 1 has a column food whose value is "hotdog".
And I have a table called hotdog.
Is it possible to do a JOIN like this?
SELECT * FROM table1 t join t.food on id = foodid
I know it doesn't work as written, but is it even possible? Is there a workaround?
Thanks in advance.
No, you can't join to a different table per row in table1, not even with dynamic SQL as @Cade Roux suggests.
You could join to the hotdog table for rows where food is 'hotdog' and join to other tables for other specific values of food.
SELECT * FROM table1 JOIN hotdog ON id = foodid WHERE food = 'hotdog'
UNION
SELECT * FROM table1 JOIN apples ON id = foodid WHERE food = 'apples'
UNION
SELECT * FROM table1 JOIN soups ON id = foodid WHERE food = 'soup'
UNION
...
This requires that you know all the distinct values of food, and that all the respective food tables have compatible columns so you can UNION them together.
What you're doing is called polymorphic associations. That is, the foreign key in table1 references rows in multiple "parent" tables, depending on the value in another column of table1. This is a common design mistake of relational database programmers.
For alternative solutions, see my answers to:
Possible to do a MySQL foreign key to one of two possible tables?
Why can you not have a foreign key in a polymorphic association?
I also cover solutions for polymorphic associations in my presentation Practical Object Oriented Models In SQL, and in my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
Only with dynamic SQL. It is also possible to left join many different tables and use CASE based on type, but the tables would all have to be known in advance.
It would be easier to recommend an appropriate design if we knew more about what you are trying to achieve, what your design currently looks like and why you've chosen that particular table design in the first place.
-- Say you have a table of foods:
id INT
foodtype VARCHAR(50) (right now it just contains 'hotdog' or 'hamburger')
name VARCHAR(50)
-- Then hotdogs:
id INT
length INT
width INT
-- Then hamburgers:
id INT
radius INT
thickness INT
Normally I would recommend some system for constraining only one auxiliary table to exist, but for simplicity, I'm leaving that out.
SELECT f.*, hd.length, hd.width, hb.radius, hb.thickness
FROM foods f
LEFT JOIN hotdogs hd
ON hd.id = f.id
AND f.foodtype = 'hotdog'
LEFT JOIN hamburgers hb
ON hb.id = f.id
AND f.foodtype = 'hamburger'
As you can see, such a query can be code-generated (or, for a very slow prototype, built as dynamic SQL on the fly) from SELECT DISTINCT foodtype FROM foods, given certain assumptions about table names and access to the table metadata.
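A minimal sketch of that code generation, written in Python against SQLite instead of MySQL so it can run standalone. The naming rule (foodtype 'hotdog' maps to table 'hotdogs'), the aliases t0, t1, ... and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foods (id INTEGER PRIMARY KEY, foodtype TEXT, name TEXT);
    CREATE TABLE hotdogs (id INTEGER PRIMARY KEY, length INTEGER, width INTEGER);
    CREATE TABLE hamburgers (id INTEGER PRIMARY KEY, radius INTEGER, thickness INTEGER);
    -- Hypothetical sample rows
    INSERT INTO foods VALUES (1, 'hotdog', 'Classic'), (2, 'hamburger', 'Cheese');
    INSERT INTO hotdogs VALUES (1, 15, 3);
    INSERT INTO hamburgers VALUES (2, 5, 2);
""")

def build_query(conn):
    """Generate one LEFT JOIN per distinct foodtype found in foods."""
    foodtypes = [row[0] for row in conn.execute(
        "SELECT DISTINCT foodtype FROM foods ORDER BY foodtype")]
    select_parts = ["f.*"]
    join_parts = []
    for i, foodtype in enumerate(foodtypes):
        # The value ends up inside SQL text, so only accept plain identifiers.
        if not foodtype.isidentifier():
            raise ValueError(f"unsafe foodtype: {foodtype!r}")
        table = foodtype + "s"  # assumed naming convention
        alias = f"t{i}"
        # Read the auxiliary table's columns from metadata, skipping the shared id.
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")
                   if row[1] != "id"]
        select_parts += [f"{alias}.{col}" for col in columns]
        join_parts.append(
            f"LEFT JOIN {table} {alias} "
            f"ON {alias}.id = f.id AND f.foodtype = '{foodtype}'")
    return ("SELECT " + ", ".join(select_parts)
            + " FROM foods f " + " ".join(join_parts)
            + " ORDER BY f.id")

sql = build_query(conn)
rows = conn.execute(sql).fetchall()
```

Each row carries NULLs for the columns of the types it does not belong to, which is the consumer-side cost discussed in this answer.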
The problem is that ultimately whoever consumes the result of this query will have to be aware of new columns showing up whenever a new table is added.
So the question moves back to your client/consumer of the data - how is it going to handle the different types? And what does it mean for different types to be in the same set? And if it needs to be aware of the different types, what's the drawback of just writing different queries for each type or changing a manual query when new types are added given the relative impact of such a change anyway?
Currently, this is what my SELECT code looks like:
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code ?????;
Basically, to elaborate the student table inherits from user table, therefore I had user_id = stu_code. What I'm confused about is how to join course table with student table.
Let's say that the course table has a course code (PK), a few other attributes and a stu_code column, however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Example: Student table has stu_code string value of '123' and course table has a stu_code with string value of '123, 246, 369'.
How would I go about joining these two tables together and separating the stu_code in the course table so that it represents 3 separate stu_code values -> i.e. '123', '246', '369'.
Any help is greatly appreciated!
however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Your data model is broken. Put your effort into fixing the data model. You want a junction/association table courseStudents or perhaps enrolled, with columns like:
stu_code (foreign key to students)
course_code (foreign key to courses)
enrollment_date
and so on
What is wrong with your data model? Here are a few things:
You are storing numbers as a string.
You are putting multiple values into a string column.
You cannot define foreign key relationships.
SQL has poor string handling capabilities.
SQL has a great way to store lists of things. It is not called "string". It is called "table".
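One possible way to move from the comma-separated column to a junction table is to split the strings once in application code. This is only a sketch in Python with SQLite standing in for MySQL; the table name enrolled comes from the suggestion above, while the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_code TEXT PRIMARY KEY, stu_code TEXT);
    CREATE TABLE enrolled (
        stu_code    INTEGER NOT NULL,
        course_code TEXT    NOT NULL REFERENCES course(course_code),
        PRIMARY KEY (stu_code, course_code)
    );
    -- Hypothetical rows in the old, comma-separated format
    INSERT INTO course VALUES ('ABC', '123, 246, 369'), ('XYZ', '123');
""")

def migrate(conn):
    """Copy every code in course.stu_code into one enrolled row per student."""
    rows = conn.execute("SELECT course_code, stu_code FROM course").fetchall()
    pairs = [
        (int(code), course_code)      # int() tolerates surrounding spaces
        for course_code, codes in rows
        for code in codes.split(",")
        if code.strip()               # skip empty fragments such as a trailing comma
    ]
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.executemany(
        "INSERT OR IGNORE INTO enrolled (stu_code, course_code) VALUES (?, ?)",
        pairs)
    conn.commit()

migrate(conn)
enrolled = conn.execute(
    "SELECT stu_code, course_code FROM enrolled ORDER BY course_code, stu_code"
).fetchall()
```

Once the junction table is populated and verified, the old course.stu_code column can be dropped.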
Your data model is ~broken~ hindering you from elegant solutions.
You cannot join your two tables efficiently. While both columns contain strings, they do not hold data that follows the same rules, so you must transform the data in order to join them. There are a few ways to do this; one is to use a regular expression function.
You can use it to evaluate a test on whether the stu_code matches the list of codes. Further, you can do this dynamically ... constructing the test string itself based upon values from the left and right
join based on REGEXP
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code REGEXP CONCAT('[[:<:]]',student.stu_code,'[[:>:]]')
Assuming tables and data:
Student
- - - -
stu_code
123
Course
- - - -
stu_code
'123, 246, 369'
Example:
http://sqlfiddle.com/#!9/672b57f/4
about the regular expression
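In MySQL the regex syntax can be a little different. [[:<:]] and [[:>:]] are the Henry Spencer notation for the start and the end of a word.
If you have a new enough version of MySQL/MariaDB you can use the more typical ICU notation \b instead. MySQL 8.0.4 and later use ICU, which does not support the Spencer word-boundary markers, so there you need \b.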
more about that here : https://dev.mysql.com/doc/refman/8.0/en/regexp.html
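The word-boundary test can be tried outside MySQL. The sketch below uses Python with SQLite, which has no built-in REGEXP, so a REGEXP function backed by Python's re module is registered by hand; that function, the \b pattern and the sample rows are assumptions for illustration. It also applies a cheap LIKE test before the regular expression.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

def regexp(pattern, text):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): pattern first, then text.
    if text is None:
        return 0
    return int(re.search(pattern, text) is not None)

conn.create_function("REGEXP", 2, regexp)

conn.executescript("""
    CREATE TABLE student (stu_code TEXT PRIMARY KEY);
    CREATE TABLE course (course_code TEXT PRIMARY KEY, stu_code TEXT);
    -- Hypothetical rows; '23' is a substring of '123' but not a listed code
    INSERT INTO student VALUES ('123'), ('23'), ('246');
    INSERT INTO course VALUES ('ABC', '123, 246, 369');
""")

# The pattern is built from the single student code and tested against the list.
# Codes are assumed to be digits only, so no regex escaping is applied.
matches = conn.execute(r"""
    SELECT s.stu_code, c.course_code
    FROM student s
    INNER JOIN course c
        ON c.stu_code LIKE '%' || s.stu_code || '%'          -- cheap pre-filter
       AND c.stu_code REGEXP ('\b' || s.stu_code || '\b')    -- exact word match
    ORDER BY s.stu_code
""").fetchall()
```

Student '23' passes the LIKE pre-filter but is rejected by the word-boundary test, which is why a plain substring match alone is not enough.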
about efficiency
In large datasets the performance will be awful. You will have to scan all records and evaluate the function on every one of them. In a large set you might get some gains by joining on LIKE first (which is faster than REGEXP). LIKE is much faster at filtering out rows that cannot match, and the REGEXP then only has to confirm the remaining candidates.
Perhaps your model was based upon an assumption of having a courses table with very few rows?
It is ironic, because you have made your course table unnecessarily large. You would actually be better off with an intermediary table that represents the many-to-many nature (the fact that students can take many courses and courses can have many students) with 1 row per unique relationship. While this table would be an order of magnitude "longer", it would be leaner, it could be indexed, and query performance would be faster.
The courses table does not need to have any awareness of the student list and thus you can alter courses by removing courses.stu_code once you change the model (aside: It might be useful if courses cached a hint of the expected student count for that course)
possible link table
would be a new table like this (note how it only ever needs these 2 columns)
stu_course_lnk
- - - - - - - -
stu_code course_id
123 ABC
124 ABC
...
123 XYZ
...
124 LMN
then you add joins of
...
student.stu_code = stu_course_lnk.stu_code
and
stu_course_lnk.course_id = course.id
...
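Put together, the link table and the two join conditions could look like the following sketch. It runs in Python against SQLite rather than MySQL; the user names and course ids are made-up sample data, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, f_name TEXT, l_name TEXT);
    CREATE TABLE student (stu_code INTEGER PRIMARY KEY REFERENCES "user"(user_id));
    CREATE TABLE course (id TEXT PRIMARY KEY);
    CREATE TABLE stu_course_lnk (
        stu_code  INTEGER NOT NULL REFERENCES student(stu_code),
        course_id TEXT    NOT NULL REFERENCES course(id),
        PRIMARY KEY (stu_code, course_id)   -- one row per unique relationship
    );
    -- Hypothetical sample data
    INSERT INTO "user" VALUES (123, 'Ada', 'Lovelace'), (124, 'Alan', 'Turing');
    INSERT INTO student VALUES (123), (124);
    INSERT INTO course VALUES ('ABC'), ('XYZ'), ('LMN');
    INSERT INTO stu_course_lnk VALUES
        (123, 'ABC'), (124, 'ABC'), (123, 'XYZ'), (124, 'LMN');
""")

rows = conn.execute("""
    SELECT student.stu_code, "user".f_name, "user".l_name, course.id
    FROM "user"
    INNER JOIN student        ON student.stu_code = "user".user_id
    INNER JOIN stu_course_lnk ON student.stu_code = stu_course_lnk.stu_code
    INNER JOIN course         ON stu_course_lnk.course_id = course.id
    ORDER BY student.stu_code, course.id
""").fetchall()
```

Both join conditions are plain equality tests on key columns, so the database can use the primary-key indexes instead of scanning and parsing strings.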
I am applying a group of data mining algorithms to a dataset comprised of a set of customers along with a large number of descriptive attributes that summarize various aspects of their past behavior. There are more than 10,000 attributes, each stored as a column in a table with the customer id as the primary key. For several reasons, it is necessary to pre-compute these attributes rather than calculating them on the fly. I generally try to select customer with a specified attribute set. The algorithms can combine any arbitrary number of these attributes together in a single SELECT statement and join the required tables. All the tables have the same number of rows (one per customer).
I am wondering what's the best way to structure these tables of attributes. Is it better to group the attributes into tables of 20-30 columns, requiring more joins on average but fewer columns per SELECT, or have tables with the maximum number of columns to minimize the number of joins, but having potentially all 10K columns joined at once?
I also thought of using one giant 3-column customerID-attribute-value table and storing all the info there, but it would be harder to structure a "select all customers with these attributes-type query that I need."
I'm using MySQL 5.0+, but I assume this is a general SQL-ish question.
In my experience, using tables with 10,000 columns is a very, very bad idea. What if this number grows in the future?
If there are a lot of attributes, you shouldn't use horizontally scaled tables (with a large number of columns). You should create a new attributes table and store all the attribute values in it, then connect this table to the main entry table with a many-to-one relationship.
A second option might be to use a NoSQL system (like MongoDB).
As @odiszapc said, you have to use a meta-model structure, like for instance:
CREATE TABLE customer(ID INT NOT NULL PRIMARY KEY, NAME VARCHAR(64));
CREATE TABLE customer_attribute(ID INT NOT NULL, ID_CUSTOMER INT NOT NULL, NAME VARCHAR(64), VALUE VARCHAR(1024));
Return basic information about a given customer:
SELECT * FROM customer WHERE name='John';
Return customer(s) matching certain attributes:
SELECT c.*
FROM customer c
INNER JOIN customer_attribute a1 ON a1.id_customer = c.id
AND a1.name = 'address'
AND a1.value = '1078, c/ los gatos madrileños'
INNER JOIN customer_attribute a2 ON a2.id_customer = c.id
AND a2.name = 'age'
AND a2.value = '27'
Your generator should generate the inner joins on the fly.
Proper indexes on the tables should allow all this engine to go relatively fast (if we assume 10k attributes per customer, and 10k customers, that's actually pretty much a challenge...)
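A minimal sketch of such a generator, in Python with SQLite standing in for MySQL. The function name find_customers, the index definition and the sample rows are assumptions for illustration; the attribute names and values are passed as bound parameters rather than concatenated into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_attribute (
        id INTEGER PRIMARY KEY,
        id_customer INTEGER NOT NULL REFERENCES customer(id),
        name TEXT,
        value TEXT
    );
    -- Assumed index so each generated join can be resolved by lookup
    CREATE INDEX idx_attr ON customer_attribute (name, value, id_customer);
    -- Hypothetical sample data
    INSERT INTO customer VALUES (1, 'John'), (2, 'Jane');
    INSERT INTO customer_attribute (id_customer, name, value) VALUES
        (1, 'age', '27'), (1, 'city', 'Madrid'),
        (2, 'age', '27'), (2, 'city', 'Lisbon');
""")

def find_customers(conn, wanted):
    """Return customers having every (attribute name -> value) pair in wanted."""
    joins, params = [], []
    for i, (name, value) in enumerate(wanted.items()):
        alias = f"a{i}"  # generated alias, never user input
        joins.append(
            f"INNER JOIN customer_attribute {alias} "
            f"ON {alias}.id_customer = c.id "
            f"AND {alias}.name = ? AND {alias}.value = ?")
        params += [name, value]
    sql = ("SELECT c.id, c.name FROM customer c "
           + " ".join(joins) + " ORDER BY c.id")
    return conn.execute(sql, params).fetchall()

both = find_customers(conn, {"age": "27", "city": "Madrid"})
age_only = find_customers(conn, {"age": "27"})
```

Each requested attribute adds one self-join, so the number of joins grows with the size of the attribute set being tested, not with the total number of attributes stored.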
10,000 columns is a lot. The SELECT statement will be very long and messy if you don't use *. I think you can narrow the attributes down to the most useful and meaningful ones and eliminate the others.
Some background: I have a set of data that represents the alchemy ingredients and their effects from Skyrim. If you're unfamiliar with this, you can combine 2-4 ingredients to make a potion. Each ingredient has 4 effects. If any effects between ingredients are the same, it will make that type of potion. I've identified this as a many-to-many relationship and I set up my tables like so:
ingredients: ing_id (key), ing_name, (other supplemental info)
effects: eff_id (key), eff_name
ing_eff_xref: eff_id, ing_id
I would like to input 2 or more available ingredients and return possible combinations without knowing what the effects are. My sql experience is pretty much limited to phpmyadmin and simple select queries. I guess my questions are: is this the right way to structure the tables for this type of relationship, do I need to set foreign keys if I don't plan on updating the tables, and is there a query that can take a set of ing_names and return only eff_names that intersect?
Here is the mysqldump of the db if anyone is interested: http://dl.dropbox.com/u/59699040/alchemy_db.sql
is this the right way to structure the tables for this type of relationship?
Yes, but then you don't need to have effect1 through effect4 on the ingredient table.
do I need to set foreign keys if I don't plan on updating the tables?
Yes. The only way for you to get the data that you're after is by JOINing three tables together. Without foreign keys (or more specifically, appropriate indexes), that may not perform well on queries. Of course you do have a small number of rows overall, but using foreign keys is a good practice to follow in this type of scenario.
is there a query that can take a set of ing_names and return only
eff_names that intersect?
I think you're after something like this:
SELECT e.eff_name
FROM ingredients i
INNER JOIN ing_eff_xref ie ON ie.ing_id = i.ing_id
INNER JOIN effects e ON e.eff_id = ie.eff_id
WHERE i.ing_name = 'Abecean Longfin ';
If you need to see effects for multiple ingredients, you could adjust your WHERE clause, like this:
WHERE i.ing_name IN ('Abecean Longfin ','Eye of Sabre Cat ','Bear Claws ');
You'll probably not want duplicate effects, so you could do a SELECT DISTINCT to eliminate those.
Can potion effects stack in Skyrim? If they can stack, then you can do a GROUP BY query with a COUNT to get the stacked value of each effect:
SELECT e.eff_name, count(*) as value
FROM ingredients i
INNER JOIN ing_eff_xref ie ON ie.ing_id = i.ing_id
INNER JOIN effects e ON e.eff_id = ie.eff_id
WHERE i.ing_name IN ('Eye of Sabre Cat ','Bear Claws ')
GROUP BY e.eff_name;
This query will list 6 effects with a value of 1, and "Restore Stamina" will have a value of 2. Not sure if Skyrim potions work this way or not, but it was just an extra thought.
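If only the intersecting effects are wanted (those shared by at least two of the chosen ingredients), one option is to add a HAVING clause to the grouped query. The sketch below runs in Python against SQLite; the ingredient and effect names are placeholders, not real Skyrim data, and shared_effects is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredients (ing_id INTEGER PRIMARY KEY, ing_name TEXT);
    CREATE TABLE effects (eff_id INTEGER PRIMARY KEY, eff_name TEXT);
    CREATE TABLE ing_eff_xref (
        eff_id INTEGER REFERENCES effects(eff_id),
        ing_id INTEGER REFERENCES ingredients(ing_id),
        PRIMARY KEY (eff_id, ing_id)
    );
    -- Placeholder data: a has x,y; b has y,z; c has z,w
    INSERT INTO ingredients VALUES
        (1, 'ingredient_a'), (2, 'ingredient_b'), (3, 'ingredient_c');
    INSERT INTO effects VALUES
        (1, 'effect_x'), (2, 'effect_y'), (3, 'effect_z'), (4, 'effect_w');
    INSERT INTO ing_eff_xref VALUES
        (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3);
""")

def shared_effects(conn, ing_names):
    """Effects that appear on two or more of the given ingredients."""
    placeholders = ",".join("?" * len(ing_names))  # e.g. "?,?,?"
    sql = f"""
        SELECT e.eff_name
        FROM ingredients i
        INNER JOIN ing_eff_xref ie ON ie.ing_id = i.ing_id
        INNER JOIN effects e ON e.eff_id = ie.eff_id
        WHERE i.ing_name IN ({placeholders})
        GROUP BY e.eff_name
        HAVING COUNT(DISTINCT i.ing_id) >= 2
        ORDER BY e.eff_name
    """
    return [row[0] for row in conn.execute(sql, list(ing_names))]

pair = shared_effects(conn, ["ingredient_a", "ingredient_b"])
triple = shared_effects(conn, ["ingredient_a", "ingredient_b", "ingredient_c"])
```

An empty result means the chosen ingredients have no effect in common.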
I have three tables: students, interests, and interest_lookup.
Students has the cols student_id and name.
Interests has the cols interest_id and interest_name.
Interest_lookup has the cols student_id and interest_id.
To find out what interests a student has I do
select interests.interest_name from `students`
inner join `interest_lookup`
on interest_lookup.student_id = students.student_id
inner join `interests`
on interests.interest_id = interest_lookup.interest_id
What I want to do is get a result set like
student_id | students.name | interest_a | interest_b | ...
where the column name 'interest_a' is a value in interests.interest_name and
the interest_ columns are 0 or 1 such that the value is 1 when
there is a record in interest_lookup for the given
student_id and interest_id and 0 when there is not.
Each entry in the interests table must appear as a column name.
I can do this with subselects (which is super slow) or by making a bunch of joins, but both of these really require that I first select all the records from interests and write out a dynamic query.
You're doing an operation called a pivot. @Slider345 linked to (prior to editing his answer) another SO post about doing it in Microsoft SQL Server. Microsoft has its own special syntax to do this, but MySQL does not.
You can do something like this:
SELECT s.student_id, s.name,
SUM(i.interest_name = 'a') AS interest_a,
SUM(i.interest_name = 'b') AS interest_b,
SUM(i.interest_name = 'c') AS interest_c
FROM students s
INNER JOIN interest_lookup l USING (student_id)
INNER JOIN interests i USING (interest_id)
GROUP BY s.student_id;
What you cannot do, in MySQL or Microsoft or anything else, is automatically populate columns so that the presence of data expands the number of columns.
Columns of an SQL query must be fixed and hard-coded at the time you prepare the query.
If you don't know the list of interests at the time you code the query, or you need it to adapt to changing lists of interest, you'll have to fetch the interests as rows and post-process these rows in your application.
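A sketch of that post-processing step, using Python with SQLite in place of MySQL. The function name pivot_interests and the sample rows are assumptions; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interests (interest_id INTEGER PRIMARY KEY, interest_name TEXT);
    CREATE TABLE interest_lookup (student_id INTEGER, interest_id INTEGER);
    -- Hypothetical sample data
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO interests VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO interest_lookup VALUES (1, 1), (1, 3), (2, 2);
""")

def pivot_interests(conn):
    """Build one dict per student with a 0/1 flag for every interest."""
    interest_names = [row[0] for row in conn.execute(
        "SELECT interest_name FROM interests ORDER BY interest_id")]
    table = {}
    # Start every student with all flags at 0, so students without
    # any interest still appear in the result.
    for student_id, name in conn.execute(
            "SELECT student_id, name FROM students ORDER BY student_id"):
        row = {"student_id": student_id, "name": name}
        row.update({f"interest_{n}": 0 for n in interest_names})
        table[student_id] = row
    # Fetch the relationships as ordinary rows and flip the matching flags.
    for student_id, interest_name in conn.execute("""
            SELECT l.student_id, i.interest_name
            FROM interest_lookup l
            INNER JOIN interests i ON i.interest_id = l.interest_id"""):
        table[student_id][f"interest_{interest_name}"] = 1
    return list(table.values())

pivoted = pivot_interests(conn)
```

The SQL stays fixed no matter how many interests exist; only the application-side dictionaries grow new keys.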
What you're trying to do sounds like a pivot.
Most solutions seem to revolve around one of the following approaches:
Creating a dynamic query, as in Is there a way to pivot rows to columns in MySQL without using CASE?
Selecting all the attribute columns, as in How to pivot a MySQL entity-attribute-value schema
Or, identifying the columns and using either a CASE statement or a user defined function as in pivot in mysql queries
I don't think this is possible. Actually, I think this is just a matter of data representation. I would try to use a component to display the data that allows me to pivot it (for instance, the same way you do in Excel, OpenOffice Calc, etc.).
To take it one step further, you should think again why you need this and probably try to solve it in the application not in the database.
I know this doesn't help much but it's the best I can think of :(
I currently have 3 tables,
Users (Id, PositionId)
MonsterInstances (Id, PositionId)
TreasureInstances (Id, PositionId)
and 1 position table.
Positions (Id, Coordinate, TypeId)
PositionId, in my 3 tables, are foreign keys into my Position table.
I want to use a single Positions table, as shown above, to normalize all of my position data. The problem I am facing is that I must identify a type so that when my query executes, it knows which table to query.
e.g.
SP -- GetObjectByPosition (positionId)
IF TypeId = 1
SELECT * FROM Users JOIN... WHERE PositionId = positionId
ELSE IF TypeId = 2
SELECT * FROM MonsterInstances JOIN...
This seems like bad design to me. The only way around it I can see would be to have 3 separate tables.
UserPositions
MonsterInstancePositions
TreasureInstancePositions
However, I'm not always interested in extracting user, monster, or treasure data. Sometimes I only want the position Id and location -- which would mean with three tables, I would have to do a union.
Is there a better way to do this?
Users, MonsterInstances, TreasureInstances could be rewritten as a single "ObjectInstances" table that includes a type column. Then queries that would work against those 3 tables separately would instead work against ObjectInstances and a TypeId, referencing a new ObjectTypes table. Make sense?
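A rough sketch of what that could look like, in Python with SQLite. The column lists, the coordinate format, the sample rows and the helper get_objects_by_position are all assumptions for illustration; only the table names ObjectInstances, ObjectTypes and Positions come from the discussion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ObjectTypes (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Positions (Id INTEGER PRIMARY KEY, Coordinate TEXT NOT NULL);
    CREATE TABLE ObjectInstances (
        Id         INTEGER PRIMARY KEY,
        TypeId     INTEGER NOT NULL REFERENCES ObjectTypes(Id),
        PositionId INTEGER NOT NULL REFERENCES Positions(Id)
    );
    -- Hypothetical sample data
    INSERT INTO ObjectTypes VALUES (1, 'User'), (2, 'Monster'), (3, 'Treasure');
    INSERT INTO Positions VALUES (10, '0,0'), (11, '5,7');
    INSERT INTO ObjectInstances VALUES (100, 1, 10), (101, 2, 10), (102, 3, 11);
""")

def get_objects_by_position(conn, position_id):
    """One query for every object type; no IF/ELSE on TypeId is needed."""
    return conn.execute("""
        SELECT o.Id, t.Name, p.Coordinate
        FROM ObjectInstances o
        JOIN ObjectTypes t ON t.Id = o.TypeId
        JOIN Positions p ON p.Id = o.PositionId
        WHERE o.PositionId = ?
        ORDER BY o.Id
    """, (position_id,)).fetchall()

at_origin = get_objects_by_position(conn, 10)
```

In this layout the type lives on the object rather than on the position, so the Positions table can be read on its own when only the id and location are needed.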