So, I have two tables.
On the first table, let's call it products, let's say I have
product_id, company_id (this is a FK), product_name.
On the second table, let's call it deals, I have
deal_id, company_id (same one as the first table), deal_title.
I need to add products to the deals. If I added a product_id field to the deals table, I would have multiple rows and ids for each deal, which is completely wrong. What is the correct way to do it?
You should add a table to manage the relation between products and deals,
e.g.:
table products_deal
product_id
deal_id
What you want is a pivot table between the two tables you have, with a structure like:
|-deal_id----|-product_id----|
| 10 | 23 |
| 10 | 24 |
| 10 | 32 |
| ...
| ...
If you need to find all products associated with deal #10, you can just use a query like SELECT * FROM pivot_table WHERE deal_id = 10
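For anyone who wants to try the pivot-table idea end to end, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the real database, and the product names, deal title and the products_for_deal helper are made up for illustration. The table name products_deal follows the first answer.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, company_id INTEGER, product_name TEXT);
CREATE TABLE deals (deal_id INTEGER PRIMARY KEY, company_id INTEGER, deal_title TEXT);
-- Pivot (junction) table: one row per (deal, product) pair.
CREATE TABLE products_deal (
    deal_id INTEGER NOT NULL REFERENCES deals(deal_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    PRIMARY KEY (deal_id, product_id)
);
-- Sample rows are invented for illustration.
INSERT INTO products VALUES (23, 1, 'widget'), (24, 1, 'gadget'), (32, 1, 'gizmo');
INSERT INTO deals VALUES (10, 1, 'spring sale');
INSERT INTO products_deal VALUES (10, 23), (10, 24), (10, 32);
""")

def products_for_deal(deal_id):
    """Return the product names attached to one deal, ordered by product_id."""
    rows = conn.execute(
        """SELECT p.product_name
           FROM products_deal pd
           JOIN products p ON p.product_id = pd.product_id
           WHERE pd.deal_id = ?
           ORDER BY p.product_id""",
        (deal_id,),
    ).fetchall()
    return [r[0] for r in rows]

deal_10_products = products_for_deal(10)
print(deal_10_products)
```

The composite primary key stops the same product from being attached to a deal twice, and the deals table itself keeps exactly one row per deal.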
I want to fetch one record at a time from a join of two tables, but I don't want the tables to be joined in full every time.
The actual tables are as follow.
table contents -- stores content information.
+----+-----------+--------+----------+---------------------+
| id | name      | status | priority | last_registered_day |
+----+-----------+--------+----------+---------------------+
| 1  | content_1 | 0      | 1        | 2020/10/10 11:20:20 |
| 2  | content_2 | 2      | 1        | 2020/10/10 11:21:20 |
| 3  | content_3 | 2      | 2        | 2020/10/10 11:22:20 |
+----+-----------+--------+----------+---------------------+
table clusters -- stores cluster information
+----+----------+
| id | name |
+----+----------+
| 1 | cluster_1|
| 2 | cluster_2|
+----+----------+
table content_cluster -- each record indicates that one content is on one cluster
+----------+----------+-------------------+
|content_id|cluster_id| last_update_date|
+----------+----------+-------------------+
| 1 | 1 |2020-10-01T11:30:00|
| 2 | 2 |2020-10-01T11:30:00|
| 3 | 1 |2020-10-01T10:30:00|
| 3 | 2 |2020-10-01T10:30:00|
+----------+----------+-------------------+
By specifying a cluster_id, I want to get one content name at a time where contents.status=2 and the (content id, cluster_id) pair is in content_cluster. The SQL query is something like the following.
SELECT contents.name
FROM contents
JOIN content_cluster
ON contents.id = content_cluster.content_id
WHERE contents.status = 2
AND content_cluster.cluster_id = <cluster_id>
ORDER
BY contents.priority
, contents.last_registered_day
, contents.name
LIMIT 1;
However, I don't want the tables to be joined as a whole every time as I have to do it frequently and the tables are large. Is there any efficient way to do this? I can add some indices to the tables. What should I do?
I would try writing the query like this:
SELECT c.name
FROM contents c
WHERE EXISTS (SELECT 1
FROM content_cluster cc
WHERE cc.content_id = c.id AND
cc.cluster_id = <cluster_id>
) AND
c.status = 2
ORDER BY c.priority, c.last_registered_day, c.name
LIMIT 1;
Then create the following indexes:
contents(status, priority, last_registered_day, name, id)
content_cluster(content_id, cluster_id).
The goal is for the execution plan to scan the index for contents and, for each row, look up whether there is a match in content_cluster. The query stops at the first match.
I can't guarantee that this will generate that plan (avoiding the sort), but it is worth a try.
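As a rough way to try this out, the sketch below loads the sample rows from the question into an in-memory SQLite database (standing in for MySQL, so the planner behaviour will not be the same) and runs the EXISTS query with the two suggested indexes. The index names and the next_content helper are made up, and contents.id is assumed to be the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in; MySQL may plan differently
conn.executescript("""
CREATE TABLE contents (id INTEGER PRIMARY KEY, name TEXT, status INTEGER,
                       priority INTEGER, last_registered_day TEXT);
CREATE TABLE content_cluster (content_id INTEGER, cluster_id INTEGER, last_update_date TEXT);
INSERT INTO contents VALUES
  (1, 'content_1', 0, 1, '2020/10/10 11:20:20'),
  (2, 'content_2', 2, 1, '2020/10/10 11:21:20'),
  (3, 'content_3', 2, 2, '2020/10/10 11:22:20');
INSERT INTO content_cluster VALUES
  (1, 1, '2020-10-01T11:30:00'), (2, 2, '2020-10-01T11:30:00'),
  (3, 1, '2020-10-01T10:30:00'), (3, 2, '2020-10-01T10:30:00');
-- Index ordered like the WHERE + ORDER BY columns, so rows can be read pre-sorted.
CREATE INDEX idx_contents_order ON contents (status, priority, last_registered_day, name);
-- Index used for the per-row existence check.
CREATE INDEX idx_cc_lookup ON content_cluster (content_id, cluster_id);
""")

def next_content(cluster_id):
    """Return the first matching content name for a cluster, or None."""
    row = conn.execute(
        """SELECT c.name
           FROM contents c
           WHERE c.status = 2
             AND EXISTS (SELECT 1 FROM content_cluster cc
                         WHERE cc.content_id = c.id AND cc.cluster_id = ?)
           ORDER BY c.priority, c.last_registered_day, c.name
           LIMIT 1""",
        (cluster_id,),
    ).fetchone()
    return row[0] if row else None

print(next_content(1), next_content(2))
```

On the real server it is worth running EXPLAIN on the query to check whether the sort is actually avoided.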
This query can easily be optimized by adding the correct indexes. Apply the ALTER statements below and let me know whether the performance has improved considerably:
alter table contents
add index idx_1 (id),
add index idx_2(status);
alter table content_cluster
add index idx_1 (content_id),
add index idx_2(cluster_id);
If a content can be in multiple clusters and the number of clusters can change, I think that doing a join like this is the best solution.
You could try splitting your contents table into different tables each containing the contents of a specific cluster, but it would need to be updated frequently.
I want to aggregate data from tables A and B into one output. The problem I am facing is that table A holds single records, each with an ID in a column that links it to table B, which contains multiple records with the same ID. I want to sum the data in one column of table B and add this value to the related ID from table A.
I have already tried joins, but it is still not working for me. Can you please take a look at this code?
select r.variable, s.variable_1, s.variable_2, sum(r.sum)
from table1 r
join table2 s on r.variable = s.variable
where some_circumstances
group by r.variable, s.variable_1, s.variable_2
order by r.variable, s.variable_1;
Regards
Edit: please keep an eye on this translation
Here is an example of the results I want:
Data in table A:
ID | variable_1 | variable_2 | Description
There are a lot of unique rows, and I want to combine them with data from table B, which looks like:
Data in table B:
Week_1 | ID | other_variable_1 | our_variable
Week_1 | ID | other_variable_1 | our_variable
Week_1 | ID | other_variable_1 | our_variable
ID is the connector between the two tables, but I don't know how to combine the data. We can have multiple rows for one ID, and the sum of a column per ID is needed.
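One common pattern for this kind of "one row in A, many rows in B" case is to sum table B per ID in a subquery first and then join that result to table A. Below is a minimal sketch of that pattern, using in-memory SQLite as a stand-in. The table names table_a and table_b, the column name week and all the sample values are assumptions, since the question only gives placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical shapes based on the column lists in the question.
CREATE TABLE table_a (id INTEGER PRIMARY KEY, variable_1 TEXT, variable_2 TEXT, description TEXT);
CREATE TABLE table_b (week TEXT, id INTEGER, other_variable_1 TEXT, our_variable INTEGER);
INSERT INTO table_a VALUES (1, 'a1', 'a2', 'first'), (2, 'b1', 'b2', 'second');
INSERT INTO table_b VALUES ('Week_1', 1, 'x', 5), ('Week_1', 1, 'y', 7), ('Week_1', 2, 'z', 3);
""")

# Aggregate table_b down to one row per id, then attach the total to table_a.
totals = conn.execute(
    """SELECT a.id, a.variable_1, a.variable_2, t.total
       FROM table_a a
       JOIN (SELECT id, SUM(our_variable) AS total
             FROM table_b
             GROUP BY id) t ON t.id = a.id
       ORDER BY a.id"""
).fetchall()
print(totals)
```

Because the subquery already returns a single row per ID, the outer query needs no GROUP BY and the rows of table A are not multiplied by the join.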
Sorry if my question seems unclear, I'll try to explain.
I have a column in a row, for example /1/3/5/8/42/239/, let's say I would like to find a similar one where there is as many corresponding "ids" as possible.
Example:
| My Column |
#1 | /1/3/7/2/4/ |
#2 | /1/5/7/2/4/ |
#3 | /1/3/6/8/4/ |
Now, by running the query on #1 I would like to get row #2, as it's the most similar. Is there any way to do it, or is it just my fantasy? Thanks for your time.
EDIT:
As suggested, I'm expanding my question. This column represents the favourite artists of a user from a music site. I'm searching them like this: MyColumn LIKE '%/ID/%', and I remove an artist by replacing /ID/ with /
Since you did not provide much info about your data, I have to fill the gaps with my guesses.
So you have a users table
users table
-----------
id
name
other_stuff
And you would like to store which artists are favorites of a user. So you must have an artists table
artists table
-------------
id
name
other_stuff
And to relate the two you can add another table called favorites
favorites table
---------------
user_id
artist_id
In that table you add a record for every artist that a user likes.
Example data
users
id | name
1 | tom
2 | john
artists
id | name
1 | michael jackson
2 | madonna
3 | deep purple
favorites
user_id | artist_id
1 | 1
1 | 3
2 | 2
To select the favorites of user tom for instance you can do
select a.name
from artists a
join favorites f on f.artist_id = a.id
join users u on f.user_id = u.id
where u.name = 'tom'
And if you add proper indexing to your table then this is really fast!
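With a favorites table like this, the "most similar row" from the question can be answered by counting shared artists. Here is a minimal sketch of that idea, using in-memory SQLite as a stand-in. The most_similar_user helper and the third user are made up for illustration; users 1 and 2 carry the example data from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE favorites (
    user_id INTEGER NOT NULL,
    artist_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, artist_id)
);
-- Users 1 (tom) and 2 (john) as in the example data; user 3 is hypothetical.
INSERT INTO favorites VALUES (1, 1), (1, 3), (2, 2), (3, 1), (3, 2), (3, 3);
""")

def most_similar_user(user_id):
    """Return (other_user_id, shared_artist_count) for the closest match, or None."""
    return conn.execute(
        """SELECT other.user_id, COUNT(*) AS shared
           FROM favorites mine
           JOIN favorites other
             ON other.artist_id = mine.artist_id
            AND other.user_id <> mine.user_id
           WHERE mine.user_id = ?
           GROUP BY other.user_id
           ORDER BY shared DESC, other.user_id
           LIMIT 1""",
        (user_id,),
    ).fetchone()

print(most_similar_user(1))
```

Ties are broken by the lower user_id here, which is an arbitrary choice. Users who share no artist with anyone simply produce no result.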
Problem is you're storing this in a really, really awkward way.
I'm guessing you have to deal with an arbitrary number of values. You have two options:
1) Store the multiple IDs in a blob object in JSON format. While MySQL doesn't have JSON functions built in, there are user-defined functions that will extract values for you, etc.
See: http://blog.ulf-wendel.de/2013/mysql-5-7-sql-functions-for-json-udf/
Alternatively, switch to PostgreSQL.
2) Add as many columns to your table as the maximum number of IDs you expect to have. So if /1/3/7/2/4/8/ is the longest entry, have 6 columns in your table. Reason this is bad: you'll have sparse columns that'll unnecessarily slow your tables.
I'm sure you could write some horrific regex to accomplish the task, but I caution against using complex regexes on enormous tables.
I am working on a social network website, so I hope there will be a lot of users.
I need to save tags (key | counter) for every user, and I wonder whether it's better to use 1) many tables (one per user) vs 2) one really large table vs 3) several split big tables.
1) this is an example for many tables implementation
table userid_tags (every user has its own table)
key | counter
----- ---------
tag1 | 3
tag2 | 1
tag3 | 10
Query 1: SELECT * FROM userid_tags WHERE key='tag1'
Query 2: SELECT * FROM userid_tags
2) single table implementation:
table tags
key | counter | user_id
----- ------------------
tag1 | 3 | 20022
tag2 | 1 | 20022
tag2 | 10 | 31234
Query 1: SELECT * FROM tags WHERE key='tag1' AND user_id='20022'
Query 2: SELECT * FROM tags WHERE user_id='20022'
3) split tables implementation
table 1000_tags (user_id from 1 to 1000)
key | counter | user_id
----- ------------------
tag1 | 3 | 122
tag2 | 1 | 122
tag2 | 10 | 734
table 21000_tags (user_id from 20000 to 21000)
key | counter | user_id
----- ------------------
tag1 | 3 | 20022
tag2 | 1 | 20022
tag2 | 10 | 20234
Query 1: SELECT * FROM 21000_tags WHERE key='tag1' AND user_id='20022'
Query 2: SELECT * FROM 21000_tags WHERE user_id='20022'
Question for 3): what's a good split size? I used 1000 (users), following my instinct.
2 is the right answer. Think about how you are going to maintain one table per user, or one table per 1000 users. How will you create/update/delete the tables? What if you have to make mass changes? How will you be able to figure out which table you need to select from? Even if you can, what if you need to select from more than one of those tables simultaneously (e.g. get the tags for two users)?
Having the tables split up won't give you much of a performance benefit as it is. It's true that if the tables grow very large, inserts may become slower because MySQL has to maintain the keys, but as long as you have the appropriate keys, lookups should be very fast.
Another similar solution would be to have a table for tags, a table for users, and a table that maps both of them. This will keep the tag cardinality small and if you're using an auto_increment surrogate key for both tables, the key length for both will be small which should make look ups as fast as possible with no restrictions on the relation (i.e. having to figure out other tables to join on for other users).
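A minimal sketch of that three-table layout, using in-memory SQLite as a stand-in, could look like the following. The table name user_tags, the column name label (chosen to avoid the reserved word key), the user names and the tags_for_users helper are all assumptions for illustration. The counters reuse the sample values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT UNIQUE);
-- Mapping table: one row per (user, tag) pair, carrying the counter.
CREATE TABLE user_tags (
    user_id INTEGER NOT NULL REFERENCES users(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    counter INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, tag_id)
);
INSERT INTO users VALUES (20022, 'user_a'), (31234, 'user_b');
INSERT INTO tags VALUES (1, 'tag1'), (2, 'tag2');
INSERT INTO user_tags VALUES (20022, 1, 3), (20022, 2, 1), (31234, 2, 10);
""")

def tags_for_users(user_ids):
    """Return (user_id, label, counter) rows for any number of users at once."""
    placeholders = ",".join("?" for _ in user_ids)
    return conn.execute(
        f"""SELECT ut.user_id, t.label, ut.counter
            FROM user_tags ut
            JOIN tags t ON t.id = ut.tag_id
            WHERE ut.user_id IN ({placeholders})
            ORDER BY ut.user_id, t.label""",
        list(user_ids),
    ).fetchall()

print(tags_for_users([20022, 31234]))
```

Fetching the tags for two users is a single query here, which is exactly the case that gets awkward with one table per user or per block of users.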
Using option 2 is the correct way to handle this. You can still use partitions within the table, though. All the information about partitioning can be found in the MySQL documentation.
Splitting the table in partitions for every thousand users would look something like:
CREATE TABLE tags (`key` VARCHAR(50), counter INT, user_id INT)
PARTITION BY KEY(user_id) partitions 1000;
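If the user_id were 21001, you could search only the relevant partition with something like:
SELECT * FROM tags PARTITION (p22);
This assumes the id 21001 falls in the 22nd partition. With PARTITION BY KEY the server picks the partition by hashing user_id, so it is usually simpler to filter with WHERE user_id = 21001 and let partition pruning choose it. Check the link for more information.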
Lookup table - unique row identity
The other lookup-table examples I have seen just do not make sense to me: they give a row an ID, then put that ID in another table which also has an ID, then add these IDs to some more tables which may reference them, and still create lookup tables with more IDs (this is how all the examples I can find seem to work). What I have done is this:
product_item - table
------------------------------------------
id | title | supplier | price
1 | title1 | supplier1 | price1
etc.
it then goes on to include more items (sure you get it)
product_feature - table
--------------------------
id | title | iskeyfeature
1 | feature1 | true
feature_desc - table
-----------------------------
id | title | desc
1 | desc1 | text description
product_lookup - table
item_id | feature_id | feature_desc
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
1 |64 | 15
(as these only need to be referenced in the lookup the id's can be multiples per item or multiple items per feature)
What I want to do, without adding item_id to every feature row or description row, is retrieve only the columns from the multiple tables where their id is referenced in the same row of the lookup table. I want to know if it is possible to select all the referenced columns from the lookup row if I only know the item_id, e.g. item_id = 1 returns all rows where item_id = 1 with the columns referenced in the same row. Every item can have multiple features, and every feature could be attached to multiple items; this will not matter if I can just get the pattern right for constructing this query from a single known value.
Any assistance or just some direction will be greatly appreciated. I'm using phpMyAdmin, and I'm sure this would be easier with some PHP voodoo, but I am learning MySQL from tutorials etc. and would like to know how to do it with SQL directly.
Having a NULL value in a column is not the major concern that would lead to this design - it's the problem with adding new attribute columns in the future, at which MySQL is disgracefully bad.
If you want to make a query that returns everything about an item in one row, you need to LEFT OUTER JOIN back to the product_lookup table for each feature_id. This is about every 10th MySQL question on Stack Overflow, so you should be able to find tons of examples.
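For the simpler variant the question describes (one result row per lookup row for a known item_id), a minimal sketch looks like this. It uses in-memory SQLite as a stand-in, the sample values are invented, and the description column is renamed from desc because DESC is a reserved word. The item_details helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_item (id INTEGER PRIMARY KEY, title TEXT, supplier TEXT, price TEXT);
CREATE TABLE product_feature (id INTEGER PRIMARY KEY, title TEXT, iskeyfeature INTEGER);
CREATE TABLE feature_desc (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
CREATE TABLE product_lookup (item_id INTEGER, feature_id INTEGER, feature_desc INTEGER);
INSERT INTO product_item VALUES (1, 'title1', 'supplier1', 'price1');
INSERT INTO product_feature VALUES (1, 'feature1', 1), (2, 'feature2', 0);
INSERT INTO feature_desc VALUES (1, 'desc1', 'text 1'), (2, 'desc2', 'text 2');
-- The last lookup row points at ids that do not exist, to show what LEFT JOIN does.
INSERT INTO product_lookup VALUES (1, 1, 1), (1, 2, 2), (1, 64, 15);
""")

def item_details(item_id):
    """Return one row per lookup row, pulling columns from the referenced tables."""
    return conn.execute(
        """SELECT i.title, f.title, f.iskeyfeature, d.description
           FROM product_lookup l
           JOIN product_item i ON i.id = l.item_id
           LEFT JOIN product_feature f ON f.id = l.feature_id
           LEFT JOIN feature_desc d ON d.id = l.feature_desc
           WHERE l.item_id = ?
           ORDER BY l.feature_id""",
        (item_id,),
    ).fetchall()

print(item_details(1))
```

The LEFT JOINs keep a lookup row even when the feature or description it references is missing, so those columns come back as NULL instead of the row disappearing.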