My question is the following:
As asked in the question "How to count amount of rows referring to a particular row foreign key in MySql?", I want to count table references involving multiple tables referring to the table I'm interested about. However here we want the specific number of references per row for the resourced table.
In addition, what about the variant where the tables do reference eachother, but the foreign key does not exist?
Let's setup some minimal examples;
We have three tables, here called A, B, and C. B and C refer rows in A. I want to count the total amount of references for each row in A.
Contents of the first table (A), and expected query results in the column 'Count':;
+----+------------+-------+
| ID | Name | Count |
+----+------------+-------+
| 1 | First row | 0 |
| 2 | Second row | 5 |
| 3 | Third row | 2 |
| 4 | Fourth row | 1 |
+----+------------+-------+
Contents of the second table (B):
+----+------+
| ID | A_ID |
+----+------+
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
+----+------+
Contents of the third table (C):
+----+------+
| ID | A_ID |
+----+------+
| 1 | 2 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
| 5 | 4 |
+----+------+
Important restrictions for a solution
The solution should work with n tables, for reasonable values of n. The example has n=2.
The solution should not involve a subset of the product set of all the tables. As some rows from A may be referenced a bunch of times in all the other tables the size of the product set may well be stupidly large (e.g. 10*10*10*... becomes big quickly). E.g. it may not be O(q^n) where n is the number of tables and q is the amount of occurrences.
This is a partial solution, which I believe still suffers from performance problems related to condition [2]
I'm adding this as an answer as it may be useful for those working towards a better solution
Apply the following query. Extend as necessary with additional tables, adding additional lines to both the sum and the set of JOINs. This particular solution will work as long as you have less than about 90 tables. With more than that, you will have to run multiple queries like it and cache the results (for example by creating a column in the 'A' table), then sum all these later on.
SELECT
COUNT(DISTINCT B.ID) +
COUNT(DISTINCT C.ID) -- + .....
AS `Count`
FROM A
LEFT JOIN B ON A.ID = B.A_ID
LEFT JOIN C ON A.ID = C.A_ID
Unfortunately, if you have often referenced rows, the query will have a massive intermediate result, run out of memory, and thus never complete.
Related
In short; we are trying to return certain results from one table based on second level criteria of another table.
I have a number of source data tables,
So:
Table DataA:
data_id | columns | stuff....
-----------------------------
1 | here | etc.
2 | here | poop
3 | here | etc.
Table DataB:
data_id | columnz | various....
-----------------------------
1 | there | you
2 | there | get
3 | there | the
4 | there | idea.
Table DataC:
data_id | column_s | others....
-----------------------------
1 | where | you
2 | where | get
3 | where | the
4 | where | idea.
Table DataD: etc. There are more and more will be added ongoing
And a relational table of visits, where there are "visits" to some of these other data rows in these other tables above.
Each of the above tables holds very different sets of data.
The way this is currently structured is like this:
Visits Table:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 2 on table DataC
2 | DataC | 3 | some data | etc. | ...
3 | DataB | 4 | more data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 4 on table DataB
4 | DataA | 1 | more data | etc. | etc. etc.
5 | DataA | 2 | more data | etc. | you get the idea
Now we currently list the visits by various user given criteria, such as visit date.
however the user can also choose which tables (ie data types) they want to view, so a user has to tick a box to show they want data from DataA table, and DataC table but not DataB, for example.
The SQL we currently have works like this; the column list in the IN conditional is dynamically generated from user choices:
SELECT visit_id,columns, visit_data, notes
FROM visits
WHERE visit_date < :maxDate AND visits.reference IN ('DataA','DataC')
The Issue:
Now, we need to go a step beyond this and list the visits by a sub-criteria of one of the "Data" tables,
So for example, DataA table has a reference to something else, so now the client wants to list all visits to numerous reference types, and IF the type is DataA then to only count the visits if the data in that table fits a value.
For example:
List all visits to DataB and all visits to DataA where DataA.stuff = poop
The way we currently work this is a secondary SQL on the results of the first visit listing, exampled above. This works but is always returning the full table of DataA when we only want to return a subset of DataA but we can't be exclusive about it outside of DataA.
We can't use LEFT JOIN because that doesn't trim the results as needed, we can't use exclusionary joins (RIGHT / INNER) because that then removes anything from DataC or any other table,
We can't find a way to add queries to the WHERE because again, that would loose any data from any other table that is not DataA.
What we kind of need is a JOIN within an IF/CASE clause.
Pseudo SQL:
SELECT visit_id,columns, visit_data, notes
FROM visits
IF(visits.reference = 'DataA')
INNER JOIN DataA ON visits.ref_id = DataA.id AND DataA.stuff = 'poop'
ENDIF
WHERE visit_date < 2020-12-06 AND visits.reference IN ('DataA','DataC')
All criteria in the WHERE clause are set by the user, none are static (This includes the DataA.stuff criteria too).
So with the above example the output would be:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. |
2 | DataC | 3 | some data | etc. |
5 | DataA | 1 | more data | etc. |
We can't use Union because the different Data tables contain lots of different details.
Questions:
There may be a very straightforward answer to this but I can't see it,
How can we approach trying to achieve this sort of partial exclusivity?
I suspect that our overarching architecture structure here could be improved (the system complexity has grown organically over a number of years). If so, what could be a better way of building this?
What we kind of need is a JOIN within an IF/CASE clause.
Well, you should know that's not possible in SQL.
Think of this analogy to function calls in a conventional programming language. You're essentially asking for something like:
What we need is a function call that calls a different function depending on the value you pass as a parameter.
As if you could do this:
call $somefunction(argument);
And which $somefunction you call would be determined by the function called, depending on the value of argument. This doesn't make any sense in any programming language.
It is similar in SQL — the tables and columns are fixed at the time the query is parsed. Rows of data are not read until the query is executed. Therefore one can't change the tables depending on the rows executed.
The simplest answer would be that you must run more than one query:
SELECT visit_id,columns, visit_data, notes
FROM visits
INNER JOIN DataA ON visits.ref_id = DataA.id AND DataA.stuff = 'poop'
WHERE visit_date < 2020-12-06 AND visits.reference = 'DataA';
SELECT visit_id,columns, visit_data, notes
FROM visits
WHERE visit_date < 2020-12-06 AND visits.reference = 'DataC';
Not every task must be done in one SQL query. If it's too complex or difficult to combine two tasks into one query, then leave them separate and write code in the client application to combine the results.
I previously got a great answer (thank you #Paul Spiegel) on removing records from a table whose string was contained at the end of another record. For example, removing 'Farm' when 'Animal Farm' existed) and grouped by a Client Field.
The problem is, in fact, a little more complex and spans three tables, I'd hoped I could extend the logic easily but it turns out to also be challenging (for me). Instead of one table with Client and Term, I have three tables:
Terms
Clients
Look-up-Table (LUT) where I store pairs of TermID and ClientID
I have made some progress since initially posting this question so where I stand is I made the Joins and resultant Select return the fields I want to delete from the Look-up-Table (LUT):
http://sqlfiddle.com/#!9/479c72/45
The final select being:
Select Distinct(C.Title),T2.Term From LUT L
Inner Join Terms T
On L.TermID=T.ID
Inner Join Terms T2
On T.Term Like Concat('% ', T2.Term)
Inner Join Clients C
On C.ID=L.ClientID;
I am in the process of trying to turn this into a Delete with little success.
Append this to your query:
Inner Join LUT L2
On L2.ClientID = L.ClientID
And L2.TermID = T2.ID
That will ensure, that the clients do match and you will get the following result:
| ClientID | TermID | ID | Term | ID | Term | ID | Title | ClientID | TermID |
|----------|--------|----|---------------|----|-----------|----|-------|----------|--------|
| 1 | 2 | 2 | Small Dog | 1 | Dog | 1 | Bob | 1 | 1 |
| 2 | 5 | 5 | Big Black Dog | 3 | Black Dog | 2 | Alice | 2 | 3 |
To delete the corresponding rows from the LUT table, replace Select * with Delete L2.
But deleting the terms is more tricky. Since it's a many-to-many relation, the term may belong to multiple clients. So you can't just delete them. You will need to cleanup up the table in a second statement. That can be done with the following statement:
Delete T
From Terms T
Left Join LUT L
On L.TermID = T.ID
Where L.TermID Is Null
Demo: http://sqlfiddle.com/#!9/b17659/1
Note that in this case the term Medium Dog will also be deleted, since it doesn't belong to any client.
I have two tables
one as td_job which has these structure
|---------|-----------|---------------|----------------|
| job_id | job_title | job_skill | job_desc |
|------------------------------------------------------|
| 1 | Job 1 | 1,2 | |
|------------------------------------------------------|
| 2 | Job 2 | 1,3 | |
|------------------------------------------------------|
The other Table is td_skill which is this one
|---------|-----------|--------------|
|skill_id |skill_title| skill_slug |
|---------------------|--------------|
| 1 | PHP | 1-PHP |
|---------------------|--------------|
| 2 | JQuery | 2-JQuery |
|---------------------|--------------|
now the job_skill in td_job is actualy the list of skill_id from td_skill
that means the job_id 1 has two skills associated with it, skill_id 1 and skill_id 2
Now I am writing a query which is this one
SELECT * FROM td_job,td_skill
WHERE td_skill.skill_id IN (SELECT td_job.job_skill FROM td_job)
AND td_skill.skill_slug LIKE '%$job_param%'
Now when the $job_param is PHP it returns one row, but if $job_param is JQuery it returns empty row.
I want to know where is the error.
The error is that you are storing a list of id's in a column rather than in an association/junction table. You should have another table, JobSkills with one row per job/skill combination.
The second and third problems are that you don't seem to understand how joins work nor how in with a subquery works. In any case, the query that you seem to want is more like:
SELECT *
FROM td_job j join
td_skill s
on find_in_set(s.skill_id, j.job_skill) > 0 and
s.skill_slug LIKE '%$job_param%';
Very bad database design. You should fix that if you can.
I read somewhere that column order in mysql is important. I believe they were referring to the indexed columns.
QUESTION: If column order is important, when and why is it important?
The reason I ask is because I have a table in mysql similar to the one below.
The primary index is on the left and I have an index on the far right. Is this bad?
It is a MyISAM table and will be used predominantly for selects (no inserts, deletes or updates).
-----------------------------------------------
| Primary index | data1| data2 | d3| Index |
-----------------------------------------------
| 1 | A | cat | 1 | A |
| 2 | B | toads | 3 | A |
| 3 | A | yabby | 7 | B |
| 4 | B | rabbits | 1 | B |
-----------------------------------------------
Column order is only important when defining indexes, as this affects whether an index is suitable to use in executing a query. (This is true of all RBDMS's, not just MySQL)
e.g.
Index defined on columns MyIndex(a, b, c) in that order.
A query such as
select a from mytable
where c = somevalue
probably won't use that index to execute the query (depends on several factors such as row count, column selectivity etc)
Whereas, it will most likely choose to use an index defined as MyIndex2(c,a,b)
Update: see use-the-index-luke.com (thanks Greg).
I have two tables, that relate via a one-to-many relationship i.e
tableOne (1)----------(*) tableTwo
Given the basic schema below
tableOne {
groupID int PK,
groupTitle varchar
}
and
tableTwo {
bidID int PK,
groupID int FK
}
Consider the two tables yield the following record-set based on joining the tables on the tableOne.groupID = tableTwo.groupID,
tableOne.groupID | tableOne.groupTitle | tableTwo.bidID | tableTwo.groupID
________________________________________________________________________________
1 | Physics Group | 1 | 1
2 | Chemistry Group | 2 | 2
2 | Chemistry Group | 3 | 2
1 | Physics Group | 4 | 1
I would like to list such a record-set in an HTML table as follows:
tableOne.groupID | tableOne.groupTitle | tableTwo.bidID | tableTwo.groupID
________________________________________________________________________________
1 | Physics Group | 1 | 1
| Physics Group | 4 | 1
2 | Chemistry Group | 2 | 2
| Chemistry Group | 3 | 2
I'm interested in finding out if this can be done in SQL, or alternatively finding out ways of listing such a record-set in HTML using good standards.
The solution that comes to mind is simply iterating through the record-set and leveraging a sentinel to list all records with the same tableOne.groupID grouped in a single row <tr> - and also listing tableOne.groupIDs once as a unique identifier of that record-group. However I don't want to go down that path as I would like to avoid mixing code with HTML if possible.
You can order the sql results using the ORDER BY clause.
So if you add
ORDER BY tableOne.groupID ASC, tableTwo.bidID ASC
in your query, you are half-way there.
Next step is to loop and print the recordset from your asp page, but also check if the last groupID is different than the current, in order to decide whether to show it or not..