I am moving an old Mantis table that had a varchar(64) category_id column to a new Mantis table that has an int(10) category_id column.
The simplified structure is as follows:
bug_table (Old DB)
+----+-------------+-------------+--------+
| id | project_id | category_id | report |
+----+-------------+-------------+--------+
| 1 | 0 | Server | crash |
| 2 | 0 | Database | error |
| 3 | 1 | Server | bug |
| 4 | 1 | Server | crash |
+----+-------------+-------------+--------+
category_table (New DB)
+----+------------+----------+
| id | project_id | name |
+----+------------+----------+
| 0 | 1 | Server |
| 1 | 1 | Database |
| 2 | 2 | Server |
| 3 | 2 | Database |
+----+------------+----------+
I need a magical query that will replace category_id in the bug_table with the numerical category_id in the category_table. Thankfully I am able to match rows by project_id and categories by name.
Here is the query I am working on, but I have gotten stuck on the complexity:
UPDATE bug_table b SET b.category_id = c.id USING category_table WHERE b.category_id = c.name
I like to approach such a task a little differently than you do for a new lookup/reference table.
1. To me, the new category table would only have id and name columns. There are only two rows based on the sample data: Server and Database. Yes, I realize there could be other names, but those can easily be added, and should be added before proceeding, to maximize the id matching that follows.
2. Next I would add a new column to the bug table, called for example 'category_new', with the data type that will store the new category id. Alternatively, you could rename the existing category_id column to category, and the new column for the ids could then be category_id.
3. After all that is done, you can update the new column by joining the category table on names and setting the id that matches (note this assumes the non-alternative approach mentioned in step 2):
UPDATE bug_table JOIN category_table ON bug_table.category_id = category_table.name
SET bug_table.category_new = category_table.id
4. After that runs, check the new column to verify the updated ids.
5. Finally, after a successful update, the old category_id column (with the names) can be dropped from bug_table, and the category_new column can be renamed to category_id.
=====
Note that if you go with the alternative column approach mentioned in step 2, the query will be similar but differ slightly, and only a column drop is needed at the end.
If there are other tables to apply the same category changes, the operation (basically steps 2 through 5) would be similar for those tables too.
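The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the category ids 1 and 2 are made up, and the update is written as a correlated subquery (which both SQLite and MySQL accept) rather than the MySQL-specific UPDATE ... JOIN shown in step 3.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Step 1: a lookup table with only id and name (ids here are illustrative).
    CREATE TABLE category_table (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    INSERT INTO category_table (id, name) VALUES (1, 'Server'), (2, 'Database');

    CREATE TABLE bug_table (id INTEGER PRIMARY KEY, project_id INTEGER,
                            category_id TEXT, report TEXT);
    INSERT INTO bug_table VALUES
        (1, 0, 'Server', 'crash'), (2, 0, 'Database', 'error'),
        (3, 1, 'Server', 'bug'),   (4, 1, 'Server', 'crash');

    -- Step 2: add the column that will hold the numeric id.
    ALTER TABLE bug_table ADD COLUMN category_new INTEGER;

    -- Step 3: copy the matching id across, matching on the category name.
    UPDATE bug_table
    SET category_new = (SELECT c.id FROM category_table c
                        WHERE c.name = bug_table.category_id);
""")

# Step 4: verify before dropping or renaming anything.
unmatched = conn.execute(
    "SELECT COUNT(*) FROM bug_table WHERE category_new IS NULL").fetchone()[0]
mapping = conn.execute(
    "SELECT id, category_id, category_new FROM bug_table ORDER BY id").fetchall()
print(unmatched, mapping)
```

A non-zero unmatched count would mean some category names are missing from the lookup table and should be added before step 5.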
In short: we are trying to return certain results from one table based on second-level criteria in another table.
I have a number of source data tables:
Table DataA:
data_id | columns | stuff....
-----------------------------
1 | here | etc.
2 | here | poop
3 | here | etc.
Table DataB:
data_id | columnz | various....
-----------------------------
1 | there | you
2 | there | get
3 | there | the
4 | there | idea.
Table DataC:
data_id | column_s | others....
-----------------------------
1 | where | you
2 | where | get
3 | where | the
4 | where | idea.
Table DataD: etc. There are more tables, and more will be added over time.
There is also a relational table of visits, which records "visits" to some of the rows in the data tables above.
Each of the above tables holds very different sets of data.
The way this is currently structured is like this:
Visits Table:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 2 on table DataC
2 | DataC | 3 | some data | etc. | ...
3 | DataB | 4 | more data | etc. | so this is a reference
| | | | | to a visit to row id
| | | | | 4 on table DataB
4 | DataA | 1 | more data | etc. | etc. etc.
5 | DataA | 2 | more data | etc. | you get the idea
We currently list the visits by various user-given criteria, such as visit date.
However, the user can also choose which tables (i.e. data types) they want to view. For example, a user ticks boxes to show they want data from the DataA and DataC tables but not from DataB.
The SQL we currently have works like this; the list of table names in the IN condition is dynamically generated from the user's choices:
SELECT visit_id,columns, visit_data, notes
FROM visits
WHERE visit_date < :maxDate AND visits.reference IN ('DataA','DataC')
The Issue:
Now we need to go a step beyond this and list the visits by a sub-criterion of one of the "Data" tables.
For example, the DataA table has a reference to something else, so the client now wants to list all visits to numerous reference types and, IF the type is DataA, to only count the visit if the data in that table matches a value.
For example:
List all visits to DataB and all visits to DataA where DataA.stuff = poop
The way we currently handle this is with a secondary SQL query on the results of the first visit listing shown above. This works, but it always returns the full DataA table when we only want a subset of DataA, and we can't be exclusive about it outside of DataA.
We can't use LEFT JOIN because that doesn't trim the results as needed, and we can't use exclusionary joins (RIGHT / INNER) because that removes anything from DataC or any other table.
We can't find a way to add conditions to the WHERE clause because, again, that would lose any data from any table that is not DataA.
What we kind of need is a JOIN within an IF/CASE clause.
Pseudo SQL:
SELECT visit_id,columns, visit_data, notes
FROM visits
IF(visits.reference = 'DataA')
INNER JOIN DataA ON visits.ref_id = DataA.data_id AND DataA.stuff = 'poop'
ENDIF
WHERE visit_date < '2020-12-06' AND visits.reference IN ('DataA','DataC')
All criteria in the WHERE clause are set by the user, none are static (This includes the DataA.stuff criteria too).
So with the above example the output would be:
visit_id | reference | ref_id | visit_data | columns | notes
-------------------------------------------------------------
1 | DataC | 2 | some data | etc. |
2 | DataC | 3 | some data | etc. |
5 | DataA | 2 | more data | etc. |
We can't use UNION because the different Data tables contain lots of different details.
Questions:
There may be a very straightforward answer to this, but I can't see it.
How can we approach trying to achieve this sort of partial exclusivity?
I suspect that our overarching architecture structure here could be improved (the system complexity has grown organically over a number of years). If so, what could be a better way of building this?
"What we kind of need is a JOIN within an IF/CASE clause."
Well, you should know that's not possible in SQL.
Think of this analogy to function calls in a conventional programming language. You're essentially asking for something like:
What we need is a function call that calls a different function depending on the value you pass as a parameter.
As if you could do this:
call $somefunction(argument);
And which $somefunction you call would be determined by the function called, depending on the value of argument. This doesn't make any sense in any programming language.
It is similar in SQL: the tables and columns are fixed at the time the query is parsed. Rows of data are not read until the query is executed. Therefore one can't change which tables are joined depending on the values in the rows being read.
The simplest answer would be that you must run more than one query:
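-- Visits to DataA, kept only when the referenced DataA row matches the filter
SELECT visits.visit_id, visits.columns, visits.visit_data, visits.notes
FROM visits
INNER JOIN DataA ON visits.ref_id = DataA.data_id AND DataA.stuff = 'poop'
WHERE visits.visit_date < '2020-12-06' AND visits.reference = 'DataA';
-- Visits to the other selected tables, with no extra filter
SELECT visit_id, columns, visit_data, notes
FROM visits
WHERE visit_date < '2020-12-06' AND visits.reference = 'DataC';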
Not every task must be done in one SQL query. If it's too complex or difficult to combine two tasks into one query, then leave them separate and write code in the client application to combine the results.
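As a rough illustration of combining the two result sets in the client, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory database in place of MySQL, a trimmed-down visits table, and invented visit_date values; the helper name list_visits is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; visit dates are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DataA (data_id INTEGER PRIMARY KEY, stuff TEXT);
    INSERT INTO DataA VALUES (1, 'etc.'), (2, 'poop'), (3, 'etc.');
    CREATE TABLE visits (visit_id INTEGER PRIMARY KEY, reference TEXT,
                         ref_id INTEGER, visit_date TEXT);
    INSERT INTO visits VALUES
        (1, 'DataC', 2, '2020-11-01'), (2, 'DataC', 3, '2020-11-02'),
        (3, 'DataB', 4, '2020-11-03'), (4, 'DataA', 1, '2020-11-04'),
        (5, 'DataA', 2, '2020-11-05');
""")

def list_visits(conn, max_date, plain_tables, stuff):
    """Visits to DataA filtered on DataA.stuff, plus unfiltered visits
    to every table named in plain_tables, merged in the client."""
    # Query 1: DataA visits, kept only when the referenced row matches.
    filtered = conn.execute(
        """SELECT v.visit_id, v.reference, v.ref_id
           FROM visits v
           INNER JOIN DataA a ON v.ref_id = a.data_id AND a.stuff = ?
           WHERE v.visit_date < ? AND v.reference = 'DataA'""",
        (stuff, max_date)).fetchall()
    # Query 2: visits to the other selected tables, no extra filter.
    plain = []
    if plain_tables:
        marks = ",".join("?" * len(plain_tables))
        plain = conn.execute(
            "SELECT visit_id, reference, ref_id FROM visits "
            f"WHERE visit_date < ? AND reference IN ({marks})",
            (max_date, *plain_tables)).fetchall()
    # Combine in application code and restore visit_id order.
    return sorted(filtered + plain)

rows = list_visits(conn, "2020-12-06", ["DataC"], "poop")
print(rows)
```

Only the placeholders in the IN list are generated dynamically; the user-supplied values themselves are always passed as bound parameters.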
I have two tables. The first one (item) is listing apartments. The second (feature) is a list of features that an apartment could have. Currently we list about 25 different features.
As every apartment can have a different set of features, I think it makes sense to have a 1:1 relationship between items and features table.
If the value of one of the features in the feature table is '1', the linked apartment has that feature.
+-------------+------------+--------------+-------------+------------+
| table: item | | | | |
+-------------+------------+--------------+-------------+------------+
| id | created_by | titel | description | address |
+-------------+------------+--------------+-------------+------------+
| 10 | user.id | Nice Flat | text | address.id |
+-------------+------------+--------------+-------------+------------+
| 20 | user.id | Another Flat | text | address.id |
+-------------+------------+--------------+-------------+------------+
| 30 | user.id | Bungalow | text | address.id |
+-------------+------------+--------------+-------------+------------+
| 40 | user.id | Apartment | text | address.id |
+-------------+------------+--------------+-------------+------------+
+----------------+---------+--------------+----------------+--------------+------+
| table: feature | | | | | |
+----------------+---------+--------------+----------------+--------------+------+
| id | item_id | key_provided | security_alarm | water_supply | lift |
+----------------+---------+--------------+----------------+--------------+------+
| 1 | 10 | 1 | 0 | 0 | 1 |
+----------------+---------+--------------+----------------+--------------+------+
| 2 | 20 | 0 | 1 | 1 | 0 |
+----------------+---------+--------------+----------------+--------------+------+
| 3 | 30 | 1 | 1 | 0 | 1 |
+----------------+---------+--------------+----------------+--------------+------+
| 4 | 40 | 1 | 1 | 1 | 1 |
+----------------+---------+--------------+----------------+--------------+------+
I want to build a filter so that users can choose to show only apartments with certain features.
e.g.:
$key_provided = 1;
$security_alarm = 1;
$water_supply = 0;
Does this database approach sound reasonable to you?
What's the best way to build a MySQL query that retrieves only the apartments matching the filter criteria, keeping in mind that the number of features can grow in the future?
A better approach is to have a features table. In your case, they all seem to be binary -- yes or no -- so you can get away with:
create table item_features (
item_feature_id int auto_increment primary key,
item_id int not null,
feature varchar(255),
foreign key (item_id) references items(item_id)
);
The data would then have the positive features, so the first item would be:
insert into item_features (item_id, feature)
values (10, 'key_provided'), (10, 'lift');
This makes it easy to manage the features, particularly adding new ones. You might want to use a trigger, check constraint, or reference table to validate the feature names themselves, but I don't want to stray too far from your question.
Then checking for features is a little more complicated, but not that much more so. One method is explicitly using exists and not exists for each desired/undesired one:
select i.*
from items i
where exists (select 1
from item_features itf
where itf.item_id = i.item_id and
itf.feature = 'key_provided'
) and
exists (select 1
from item_features itf
where itf.item_id = i.item_id and
itf.feature = 'security_alarm'
) and
not exists (select 1
from item_features itf
where itf.item_id = i.item_id and
itf.feature = 'water_supply'
);
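Since the number of features can grow, the exists / not exists clauses can be generated from the user's selection instead of being written by hand. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database in place of MySQL; the helper name find_items and the shape of its wanted argument are hypothetical, and the sample rows mirror the feature table from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows mirror the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, titel TEXT);
    INSERT INTO items VALUES (10, 'Nice Flat'), (20, 'Another Flat'),
                             (30, 'Bungalow'), (40, 'Apartment');
    CREATE TABLE item_features (item_id INTEGER REFERENCES items(item_id),
                                feature TEXT);
    INSERT INTO item_features VALUES
        (10, 'key_provided'), (10, 'lift'),
        (20, 'security_alarm'), (20, 'water_supply'),
        (30, 'key_provided'), (30, 'security_alarm'), (30, 'lift'),
        (40, 'key_provided'), (40, 'security_alarm'),
        (40, 'water_supply'), (40, 'lift');
""")

def find_items(conn, wanted):
    """wanted maps a feature name to True (must have) or False (must not have)."""
    clauses, params = [], []
    for feature, required in wanted.items():
        prefix = "EXISTS" if required else "NOT EXISTS"
        clauses.append(
            f"{prefix} (SELECT 1 FROM item_features itf "
            "WHERE itf.item_id = i.item_id AND itf.feature = ?)")
        params.append(feature)  # feature names are bound, never interpolated
    where = " AND ".join(clauses) or "1 = 1"  # no filter selected: return all
    sql = f"SELECT i.item_id FROM items i WHERE {where} ORDER BY i.item_id"
    return [row[0] for row in conn.execute(sql, params)]

matches = find_items(conn, {"key_provided": True,
                            "security_alarm": True,
                            "water_supply": False})
print(matches)
```

Features the user did not tick either way are simply left out of wanted, so adding a new feature needs no change to the query-building code.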
For your existing data structure, you can filter as follows:
select i.*
from item i
inner join feature f
on f.item_id = i.id
and f.key_provided = 1
and f.security_alarm = 1
and f.water_supply = 0
This will give you all the apartments that satisfy the given criteria. For more criteria, you can just add more conditions to the on part of the join.
As a general comment about your design:
Since you are creating a 1-1 relationship between apartments and features, you might as well store them in a single table (spreading the information over two tables does not have any obvious advantage).
Your design is OK as long as features do not change too often, since every time a new feature is created you need to add another column to your table. If features are added (or removed) frequently, this can become heavy to manage; in that case, you could consider a separate table where each (item, feature) tuple is stored in its own row, which makes this kind of change easier (with the downside that queries get more complicated to write).
I have two tables.
One is td_job, which has this structure:
|---------|-----------|---------------|----------------|
| job_id | job_title | job_skill | job_desc |
|------------------------------------------------------|
| 1 | Job 1 | 1,2 | |
|------------------------------------------------------|
| 2 | Job 2 | 1,3 | |
|------------------------------------------------------|
The other table is td_skill:
|---------|-----------|--------------|
|skill_id |skill_title| skill_slug |
|---------------------|--------------|
| 1 | PHP | 1-PHP |
|---------------------|--------------|
| 2 | JQuery | 2-JQuery |
|---------------------|--------------|
The job_skill column in td_job is actually a list of skill_id values from td_skill.
That means job_id 1 has two skills associated with it, skill_id 1 and skill_id 2.
I am writing this query:
SELECT * FROM td_job,td_skill
WHERE td_skill.skill_id IN (SELECT td_job.job_skill FROM td_job)
AND td_skill.skill_slug LIKE '%$job_param%'
When $job_param is PHP it returns one row, but when $job_param is JQuery it returns no rows.
I want to know where the error is.
The error is that you are storing a list of ids in a column rather than in an association/junction table. You should have another table, JobSkills, with one row per job/skill combination.
The second and third problems are that you don't seem to understand how joins work, nor how IN with a subquery works. In any case, the query that you seem to want is more like:
SELECT *
FROM td_job j join
td_skill s
on find_in_set(s.skill_id, j.job_skill) > 0 and
s.skill_slug LIKE '%$job_param%';
Very bad database design. You should fix that if you can.
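For comparison, here is a minimal sketch of the junction-table design. It assumes Python's built-in sqlite3 module with an in-memory database in place of MySQL; the JobSkills column layout and the helper name jobs_for_slug are hypothetical, and the rows are derived from the comma-separated lists in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; JobSkills layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE td_job (job_id INTEGER PRIMARY KEY, job_title TEXT);
    INSERT INTO td_job VALUES (1, 'Job 1'), (2, 'Job 2');
    CREATE TABLE td_skill (skill_id INTEGER PRIMARY KEY, skill_title TEXT,
                           skill_slug TEXT);
    INSERT INTO td_skill VALUES (1, 'PHP', '1-PHP'), (2, 'JQuery', '2-JQuery');
    -- One row per job/skill pair replaces the '1,2' style list column.
    CREATE TABLE JobSkills (job_id INTEGER, skill_id INTEGER,
                            PRIMARY KEY (job_id, skill_id));
    INSERT INTO JobSkills VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")

def jobs_for_slug(conn, job_param):
    """Jobs that have a skill whose slug contains job_param."""
    sql = """SELECT j.job_id, j.job_title, s.skill_title
             FROM td_job j
             JOIN JobSkills js ON js.job_id = j.job_id
             JOIN td_skill s ON s.skill_id = js.skill_id
             WHERE s.skill_slug LIKE ?
             ORDER BY j.job_id"""
    # The search term is bound as a parameter instead of being interpolated.
    return conn.execute(sql, (f"%{job_param}%",)).fetchall()

php_jobs = jobs_for_slug(conn, "PHP")
jquery_jobs = jobs_for_slug(conn, "JQuery")
print(php_jobs, jquery_jobs)
```

With this layout the join is an ordinary equality join that can use indexes, which find_in_set on a list column cannot.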
I'm not very good at joining tables in MySQL and I'm still learning, so I wanted to ask about joining two tables.
I have 2 tables.
I want to join two columns of the first table (id & path) to the second table.
But the second table has no columns named id and path; it has columns named pathid & value. The values in the pathid column are the same as the ids in the first table.
It looks like this:
first table
| id | path |
---------------------
| 1 | country/usa |
| 2 | country/jpn |
| 3 | country/kor |
second table
| pathid | value |
-------------------
| 3 | 500 |
| 1 | 10000 |
| 2 | 2000 |
So on the first table, it indicates that for usa the id is 1, japan is 2, korea is 3.
And the second table says that for pathid 3 (which is the id for korea) the value is 500, and so on for the others.
I want it to look like this. So then the path will be joined on the second table on its corresponding value. How can I do this on mysql? Thank You
Desired Result
| id | path | value |
------------------------------
| 1 | country/usa | 10000 |
| 2 | country/jpn | 2000 |
| 3 | country/kor | 500 |
You can join on the columns irrespective of the column names as long as the data types match.
SELECT id, path, value
FROM firstTable, secondTable
WHERE id = pathid
If a column name exists in both tables, you need to qualify it with the table name or an alias. Say both tables had a column named id: then whenever you use id you must state which table you are referring to, otherwise MySQL will complain about the ambiguity.
SELECT f.id, path, value
FROM firstTable f, secondTable s
WHERE f.id = s.pathid
Note that I omitted the alias on the other columns in the select; that works as long as each of those column names exists in only one of the tables.
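The same join can also be written with the explicit INNER JOIN ... ON syntax, which keeps the join condition separate from any filtering in WHERE. A minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database in place of MySQL and the sample rows from the question:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE firstTable (id INTEGER PRIMARY KEY, path TEXT);
    INSERT INTO firstTable VALUES
        (1, 'country/usa'), (2, 'country/jpn'), (3, 'country/kor');
    CREATE TABLE secondTable (pathid INTEGER, value INTEGER);
    INSERT INTO secondTable VALUES (3, 500), (1, 10000), (2, 2000);
""")

# Every column is qualified with its alias, so the query keeps working
# even if a column with the same name is later added to the other table.
rows = conn.execute("""
    SELECT f.id, f.path, s.value
    FROM firstTable f
    INNER JOIN secondTable s ON s.pathid = f.id
    ORDER BY f.id
""").fetchall()
print(rows)
```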
I'm trying to create a separate table that tracks read/unread posts. Using MySQLi, I will have two tables, items and items_tracking. When the page is rendered, it will join the tables and check whether the user has read the posts or not.
items
+-------+------------+----+
| id | created_by | .. |
+-------+------------+----+
| item1 | id12 | .. |
| item2 | id433 | .. |
+-------+------------+----+
items_tracking
+---------+---------+------+
| user_id | item_id | read |
+---------+---------+------+
| id1 | item1 | 0 |
| id2 | item2 | 0 |
| id94 | item1 | 1 |
+---------+---------+------+
Now the idea was that whenever a new item/post is created in the items table, it will also create rows in the items_tracking table for all users and with column read = 0. Problem is, I have no idea how to work around this since the foreign key I would use in items_tracking is still pretty much undetermined.
Any ideas on how to approach inserting in both tables at the same time, while the second table references the first?
You don't need records with read=0 in the tracking table.
SELECT ..., t.read FROM items i LEFT JOIN items_tracking t ON (t.item_id = i.id)
This query will work even if there is no corresponding record in items_tracking; in that case, t.read in the result will be NULL. You only need to insert the records with read = 1. In fact you don't even need this flag: you can test for t.item_id IS NOT NULL to see whether there is a record in items_tracking.
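A minimal sketch of this idea follows. It makes a few assumptions that are not in the answer: it uses Python's built-in sqlite3 module with an in-memory database in place of MySQL, it drops the read column entirely, it restricts the join to one user by adding user_id to the ON clause, and the helper names items_with_read_flag and mark_read are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; a tracking row means "read".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id TEXT PRIMARY KEY, created_by TEXT);
    INSERT INTO items VALUES ('item1', 'id12'), ('item2', 'id433');
    CREATE TABLE items_tracking (user_id TEXT,
                                 item_id TEXT REFERENCES items(id),
                                 PRIMARY KEY (user_id, item_id));
    INSERT INTO items_tracking VALUES ('id94', 'item1');
""")

def items_with_read_flag(conn, user_id):
    """Every item, with 1 if this user has a tracking row for it, else 0."""
    sql = """SELECT i.id, t.item_id IS NOT NULL AS is_read
             FROM items i
             LEFT JOIN items_tracking t
                    ON t.item_id = i.id AND t.user_id = ?
             ORDER BY i.id"""
    return conn.execute(sql, (user_id,)).fetchall()

def mark_read(conn, user_id, item_id):
    # SQLite spelling; the MySQL equivalent is INSERT IGNORE INTO ...
    conn.execute(
        "INSERT OR IGNORE INTO items_tracking (user_id, item_id) VALUES (?, ?)",
        (user_id, item_id))

print(items_with_read_flag(conn, "id94"))
print(items_with_read_flag(conn, "id1"))
```

Because rows are only written when a user actually reads an item, creating a new item needs a single insert into items and nothing in items_tracking, which sidesteps the original problem of inserting into both tables at once.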