Many-to-many multi level queries - mysql

I have a table called "obj_rels" where I have have fields such as:
pri_type
pri_type_id
sec_type
sec_type_id
The type represents the table the relationship is in. For example COM for comment, FIL for file, SBM for submission or ENT for entity.
I want to know the entity for each object type. For example,
Let's say I have a comment (#4) on a file (#3) uploaded to respond to a submission (#2) for entity (#1). The obj_rels table would have:
+----------+-------------+----------+-------------+
| pri_type | pri_type_id | sec_type | sec_type_id |
+----------+-------------+----------+-------------+
| COM | 4 | FIL | 3 |
| SBM | 2 | FIL | 3 |
| SBM | 2 | ENT | 1 |
+----------+-------------+----------+-------------+
There is no logic to Primary and Secondary entry, so it needs to look at both sides to find a match. How can I write a query that would dig back into the relationships to find the "ENT" associated with the "COM"?
The one I've written repeats a union query five times, thinking I probably won't have more than 5 levels ever, but it is extremely slow. I only have about 9k records in the table, and it takes over 30 seconds to run the query.
What is best practice for this type of relationship search?
Thanks!

Related

How to structure a MySQL query to join with an exclusion

In short; we are trying to return certain results from one table based on second level criteria of another table.
I have a number of source data tables,
So:
Table DataA:
data_id | columns | stuff....
-----------------------------
1 | here | etc.
2 | here | poop
3 | here | etc.
Table DataB:
data_id | columnz | various....
-----------------------------
1 | there | you
2 | there | get
3 | there | the
4 | there | idea.
Table DataC:
data_id | column_s | others....
-----------------------------
1 | where | you
2 | where | get
3 | where | the
4 | where | idea.
Table DataD: etc. There are more and more will be added ongoing
And a relational table of visits, where there are "visits" to some of these other data rows in these other tables above.
Each of the above tables holds very different sets of data.
The way this is currently structured is like this:
Visits Table:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 2 on table DataC
2 | DataC | 3 | some data | etc. | ...
3 | DataB | 4 | more data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 4 on table DataB
4 | DataA | 1 | more data | etc. | etc. etc.
5 | DataA | 2 | more data | etc. | you get the idea
Now we currently list the visits by various user given criteria, such as visit date.
however the user can also choose which tables (ie data types) they want to view, so a user has to tick a box to show they want data from DataA table, and DataC table but not DataB, for example.
The SQL we currently have works like this; the column list in the IN conditional is dynamically generated from user choices:
SELECT visit_id,columns, visit_data, notes
FROM visits
WHERE visit_date < :maxDate AND visits.reference IN ('DataA','DataC')
The Issue:
Now, we need to go a step beyond this and list the visits by a sub-criteria of one of the "Data" tables,
So for example, DataA table has a reference to something else, so now the client wants to list all visits to numerous reference types, and IF the type is DataA then to only count the visits if the data in that table fits a value.
For example:
List all visits to DataB and all visits to DataA where DataA.stuff = poop
The way we currently work this is a secondary SQL on the results of the first visit listing, exampled above. This works but is always returning the full table of DataA when we only want to return a subset of DataA but we can't be exclusive about it outside of DataA.
We can't use LEFT JOIN because that doesn't trim the results as needed, we can't use exclusionary joins (RIGHT / INNER) because that then removes anything from DataC or any other table,
We can't find a way to add queries to the WHERE because again, that would loose any data from any other table that is not DataA.
What we kind of need is a JOIN within an IF/CASE clause.
Pseudo SQL:
SELECT visit_id,columns, visit_data, notes
FROM visits
IF(visits.reference = 'DataA')
INNER JOIN DataA ON visits.ref_id = DataA.id AND DataA.stuff = 'poop'
ENDIF
WHERE visit_date < 2020-12-06 AND visits.reference IN ('DataA','DataC')
All criteria in the WHERE clause are set by the user, none are static (This includes the DataA.stuff criteria too).
So with the above example the output would be:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. |
2 | DataC | 3 | some data | etc. |
5 | DataA | 1 | more data | etc. |
We can't use Union because the different Data tables contain lots of different details.
Questions:
There may be a very straightforward answer to this but I can't see it,
How can we approach trying to achieve this sort of partial exclusivity?
I suspect that our overarching architecture structure here could be improved (the system complexity has grown organically over a number of years). If so, what could be a better way of building this?
What we kind of need is a JOIN within an IF/CASE clause.
Well, you should know that's not possible in SQL.
Think of this analogy to function calls in a conventional programming language. You're essentially asking for something like:
What we need is a function call that calls a different function depending on the value you pass as a parameter.
As if you could do this:
call $somefunction(argument);
And which $somefunction you call would be determined by the function called, depending on the value of argument. This doesn't make any sense in any programming language.
It is similar in SQL — the tables and columns are fixed at the time the query is parsed. Rows of data are not read until the query is executed. Therefore one can't change the tables depending on the rows executed.
The simplest answer would be that you must run more than one query:
SELECT visit_id,columns, visit_data, notes
FROM visits
INNER JOIN DataA ON visits.ref_id = DataA.id AND DataA.stuff = 'poop'
WHERE visit_date < 2020-12-06 AND visits.reference = 'DataA';
SELECT visit_id,columns, visit_data, notes
FROM visits
WHERE visit_date < 2020-12-06 AND visits.reference = 'DataC';
Not every task must be done in one SQL query. If it's too complex or difficult to combine two tasks into one query, then leave them separate and write code in the client application to combine the results.

MS Access help needed forming a specific report

I have a table with a column for agent names and a column for each of the skills those agents could possibly have. Each skill the agent is assigned shows a 1 in the field under that skill.
Columns look like this:
+---------+----------+----------+----------+
| Name | 'Skill1' | 'Skill2' | 'Skill3' |
+---------+----------+----------+----------+
| John | 1 | | 1 |
| Sam | 1 | 1 | |
| Roberta | 1 | | 1 |
+---------+----------+----------+----------+
I would like to make a query that returns a list of all agent names that have a 1 for each particular skill. The query would return something like this:
+-----------+
| Skill 1 |
+-----------+
| John |
| Sam |
| Roberta |
+-----------+
Additionally I would like to be able to query a single name and retrieve all skills that agent has (all rows the Name column has a 1 in) like this:
+-----------+
| John |
+-----------+
| Skill 1 |
| Skill 3 |
+-----------+
I've done this in Excel using an index but I'm new to Access and not sure how to complete this task.
Thanks in advance.
One of the reasons that you are finding this task difficult is because your database is not normalised and so due to the way that your database is structured, you are working against MS Access, not with it.
Consequently, whilst a solution is still possible with the current data, the resulting queries will be painful to construct and will either be full of multiple messy iif statements, or several union queries performing the same operations over & over again, one for each 'skill'.
Then, if you every wish to add another Skill to the database, all of your queries have to be rewritten!
Whereas, if your database was normalised (as Gustav has suggested in the comments), the task would be a simple one-liner; and what's more, if you add a new skill later on, your queries will automatically output the results as if the skill had always been there.
Your data has a many-to-many relationship: an agent may have many skills, and a skill may be known by many agents.
As such, the most appropriate way to represent this relationship is using a junction table.
Hence, you would have a table of Agents such as:
tblAgents
+-----+-----------+----------+------------+
| ID | FirstName | LastName | DOB |
+-----+-----------+----------+------------+
| 1 | John | Smith | 1970-01-01 |
| ... | ... | ... | ... |
+-----+-----------+----------+------------+
This would only contain information unique to each agent, i.e. minimising the repeated information between records in the table.
You would then have a table of possible Skills, such as:
tblSkills
+-----+---------+---------------------+
| ID | Name | Description |
+-----+---------+---------------------+
| 1 | Skill 1 | Skill 1 Description |
| 2 | Skill 2 | Skill 2 Description |
| ... | ... | ... |
+-----+---------+---------------------+
Finally, you would have a junction table linking Agents to Skills, e.g.:
tblAgentSkills
+----+----------+----------+
| ID | Agent_ID | Skill_ID |
+----+----------+----------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 3 | 2 |
+----+----------+----------+
Now, say you want to find out which agents have Skill 1, the query is simple:
select Agent_ID from tblAgentSkills where Skill_ID = 1
What if you want to find out the skills known by an agent? Equally as simple:
select Skill_ID from tblAgentSkills where Agent_ID = 1
Of course, these queries will merely return the ID fields as present in the junction table - but since the ID uniquely identifies a record in the tblAgents or tblSkills tables, such ID is all you need to retrieve any other required information:
select
tblAgents.FirstName,
tblAgents.LastName
from
tblAgentSkills inner join tblAgents on
tblAgentSkills.AgentID = tblAgents.ID
where
tblAgentSkills.Skill_ID = 1
To get all agents with skill1, open the query designer and create the following query:
this will generate the following sql
SELECT Skills.AgentName
FROM Skills
WHERE (((Skills.Skill1)=1));
If you adjust the names you can also paste this query into the sql pane of the designer to get the query you want.
To get all the skills an agent has I chose a parameterized query. Open the query designer and create a new query:
When you run this query it will ask you for the name of the agent. Make sure to type the agent name exactly. Here is the resulting sql:
SELECT Skills.AgentName, Skills.Skill1, Skills.Skill2, Skills.Skill3
FROM Skills
WHERE (((Skills.AgentName)=[Agent]));
If you continue working with this query I would improve the table design by breaking your table into a skills table, agents table, skills&agents table. Then link the skills and agents tables to the skills&agents table in a many to many relationship. The query to get all an agents skills would then look like this in the designer:

How do I handle linking a record to another table?

I'm very new to Access and my teacher is... hard to follow. So I feel like there's something pretty basic I'm probably missing here. I think the biggest problem I'm having with this question is that I'm struggling to find the words to communicate what I actually need to do, which is really putting a damper on my google-fu.
In terms of what I think I want to do, I want to make a record reference another table in its entirety.
Main
+----+-------+--------+-------+----------------------------+
| PK | Name | Phone# | [...] | Cards |
+----+-------+--------+-------+----------------------------+
| 1 | Bob | [...] | [...] | < Reference to 2nd table > |
| 2 | Harry | [...] | [...] | [...] |
| 3 | Ted | [...] | [...] | [...] |
+----+-------+--------+-------+----------------------------+
Bob's Cards
+----+-------------+-----------+-------+-------+-------+
| PK | Card Name | Condition | Year | Price | [...] |
+----+-------------+-----------+-------+-------+-------+
| 1 | Big Slugger | Mint | 1987 | .20 | [...] |
| 2 | Quick Pete | [...] | [...] | [...] | [...] |
| 3 | Mac Donald | [...] | [...] | [...] | [...] |
+----+-------------+-----------+-------+-------+-------+
This would necessitate an entire new table for each record in the main table though, if it's even possible.
But the only alternative solution I can think of is to add 'Card1, Condition1, [...], Card2, Condition2, [...], Card3, [...]' fields to the main table and having to add another set of fields any time someone increases the maximum number of cards stored.
So I'm sort of left believing there is some other approach I should be taking that our teacher has failed to properly explain. We haven't even touched on forms and reports yet so I don't need to worry about working them in.
Any pointers?
(Also, the entirety of this data and structure is only a rough facsimile of my own, as I'd rather learn how to do it and apply it myself than be like 'here's my data, pls fix.')
Third option successfully found in comments by the helpful Minty.
This depends on a number of things, however to keep it simple you
would normally add one field to the cards table, with an number data
type called CardOwnerID. In your example it would be 1 indicating Bob.
This is known as a foreign key. (FK) - However if you have a table of
cards and multiple possible owners then you need a third table - a
Junction table. This would consist of the Main Person ID and the Card
ID. – Minty

SQL - database design, need suggestion

I'm building website where each user has different classes
+----+-----------+---------+
| id | subject | user_id |
+----+-----------+---------+
| 1 | Math 140 | 2 |
| 2 | ART 240 | 2 |
+----+-----------+---------+
Each class then will have bunch of Homework files, Class-Papers files and so on.
And here I need your help. What will be the better approach: Build one table like that:
+----+-----------+--------------------------------------------------+--------------+
| id | subject | Homework | Class-Papers |
+----+-----------+-----------------------------------------------------------------+
| 1 | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla |
| 2 | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla |
| 3 | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla |
| 4 | ART 240 | www.example.com/subjects/ART +240/file_name.pdf | bla-bla |
| 5 | ART 240 | www.example.com/subjects/ART +240/file_name.pdf | bla-bla |
+----+-----------+--------------------------------------------------+--------------+
And than just separate the content when I want to display it,
OR build a table for every single subject and than just load necessary table?
Or if you can suggest something better or more common/useful/efficient please go ahead.
You should read about normalization and relational design before attempting this.
This is a one-to-many relationship - model it as such.
A table for every subject is crazy. You'll have to add a new table for every subject.
A better solution will make it possible to add new subjects simply by adding data. That's what the relational model is all about.
Don't worry about tables; think about it in natural language first.
A SUBJECT(calculus) can have many COURSES(differential, integral, multi-variable).
A COURSE(differential calculus) can have many SECTIONs (Mon 9-10 am in room 2 of the math building).
A STUDENT(first name, last name, student id) can sign up for zero or more SECTIONs. The list of SECTIONs for a given STUDENT is a TRANSCRIPT.
Each STUDENT has one TRANSCRIPT per semester (fall 2012).
A SECTION can have zero or more ASSIGNMENTs.
These are the tables you'll need for this simple problem. Worry about the names and how they relate before you start writing SQL. You'll be glad you did.
I most definitely woudl NOT create a separate table for each subject. If you did, then if you wanted a query like "list all the homework for student X", you would have to access different tables depending on which subjects that student was enrolled in. Worse, anytime someone added a new subject, you would have to create a new table. If down the road you decide you need a new attribute of homework, instead of updating one table, you would have to update every one of these subject tables. It's just bad news all around.

Database modeling for international and multilingual purposes

I need to create a large scale DB Model for a web application that will be multilingual.
One doubt that I've every time I think on how to do it is how I can resolve having multiple translations for a field. A case example.
The table for language levels, that administrators can edit from the backend, can have multiple items like: basic, advance, fluent, mattern... In the near future probably it will be one more type. The admin goes to the backend and add a new level, it will sort it in the right position.. but how I handle all the translations for the final users?
Another problem with internationalization of a database is that probably for user studies can differ from USA to UK to DE... in every country they will have their levels (that probably it will be equivalent to another but finally, different). And what about billing?
How you model this in a big scale?
Here is the way I would design the database:
Visualization by DB Designer Fork
The i18n table only contains a PK, so that any table just has to reference this PK to internationalize a field. The table translation is then in charge of linking this generic ID with the correct list of translations.
locale.id_locale is a VARCHAR(5) to manage both of en and en_US ISO syntaxes.
currency.id_currency is a CHAR(3) to manage the ISO 4217 syntax.
You can find two examples: page and newsletter. Both of these admin-managed entites need to internationalize their fields, respectively title/description and subject/content.
Here is an example query:
select
t_subject.tx_translation as subject,
t_content.tx_translation as content
from newsletter n
-- join for subject
inner join translation t_subject
on t_subject.id_i18n = n.i18n_subject
-- join for content
inner join translation t_content
on t_content.id_i18n = n.i18n_content
inner join locale l
-- condition for subject
on l.id_locale = t_subject.id_locale
-- condition for content
and l.id_locale = t_content.id_locale
-- locale condition
where l.id_locale = 'en_GB'
-- other conditions
and n.id_newsletter = 1
Note that this is a normalized data model. If you have a huge dataset, maybe you could think about denormalizing it to optimize your queries. You can also play with indexes to improve the queries performance (in some DB, foreign keys are automatically indexed, e.g. MySQL/InnoDB).
Some previous StackOverflow questions on this topic:
What are best practices for multi-language database design?
What's the best database structure to keep multilingual data?
Schema for a multilanguage database
How to use multilanguage database schema with ORM?
Some useful external resources:
Creating multilingual websites: Database Design
Multilanguage database design approach
Propel Gets I18n Behavior, And Why It Matters
The best approach often is, for every existing table, create a new table into which text items are moved; the PK of the new table is the PK of the old table together with the language.
In your case:
The table for language levels, that administrators can edit from the backend, can have multiple items like: basic, advance, fluent, mattern... In the near future probably it will be one more type. The admin goes to the backend and add a new level, it will sort it in the right position.. but how I handle all the translations for the final users?
Your existing table probably looks something like this:
+----+-------+---------+
| id | price | type |
+----+-------+---------+
| 1 | 299 | basic |
| 2 | 299 | advance |
| 3 | 399 | fluent |
| 4 | 0 | mattern |
+----+-------+---------+
It then becomes two tables:
+----+-------+ +----+------+-------------+
| id | price | | id | lang | type |
+----+-------+ +----+------+-------------+
| 1 | 299 | | 1 | en | basic |
| 2 | 299 | | 2 | en | advance |
| 3 | 399 | | 3 | en | fluent |
| 4 | 0 | | 4 | en | mattern |
+----+-------+ | 1 | fr | élémentaire |
| 2 | fr | avance |
| 3 | fr | couramment |
: : : :
+----+------+-------------+
Another problem with internationalitzation of a database is that probably for user studies can differ from USA to UK to DE... in every country they will have their levels (that probably it will be equivalent to another but finally, different). And what about billing?
All localisation can occur through a similar approach. Instead of just moving text fields to the new table, you could move any localisable fields - only those which are common to all locales will remain in the original table.