mysql lookup table - mysql

Lookup table - unique row identity
The lookup table examples I have seen just do not make sense to me: they give a row an ID, put that id in another table which also has an id, add these id's to some more tables which may reference them, and still create a lookup table with more id's (this is how all the examples I can find seem to work). What I have done is this:
product_item - table
------------------------------------------
id | title | supplier | price
1 | title1 | supplier1 | price1
etc.
It then goes on to include more items (I'm sure you get it).
product_feature - table
--------------------------
id | title | iskeyfeature
1 | feature1 | true
feature_desc - table
-----------------------------
id | title | desc
1 | desc1 | text description
product_lookup - table
item_id | feature_id | feature_desc
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
1 | 64 | 15
(as these only need to be referenced in the lookup, the id's can repeat: multiple features per item or multiple items per feature)
What I want to do, without adding item_id to every feature row or description row, is retrieve only the columns from the multiple tables whose id is referenced in the same row of the lookup table. I want to know if it is possible to select all the referenced columns from the lookup row if I only know the item_id, e.g. item_id = 1 returns all lookup rows where item_id = 1 together with the columns referenced in the same row. Every item can have multiple features and every feature could be attached to multiple items, but this will not matter if I can just get the pattern right for constructing this query from a single known value.
Any assistance or just some direction will be greatly appreciated. I'm using phpMyAdmin, and I'm sure this would be easier with some PHP voodoo, but I am learning MySQL from tutorials etc. and would like to know how to do it with SQL directly.

Having a NULL value in a column is not the major concern that would lead to this design - it's the problem with adding new attribute columns in the future, at which MySQL is disgracefully bad.
If you want to make a query that returns everything about an item in one row, you need to LEFT OUTER JOIN back to the product_lookup table for each feature_id. This is about every 10th mysql question on Stack Overflow, so you should be able to find tons of examples.
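For the pattern asked about (all features for a known item_id, one row per feature), the basic shape is to start from the lookup row and join each referenced table on its id. A minimal sketch using the table and column names from the question (note that desc is a reserved word in MySQL, so it needs backticks):
SELECT i.title AS item_title,
       f.title AS feature_title,
       f.iskeyfeature,
       d.`desc` AS feature_description
FROM product_lookup l
JOIN product_item i    ON i.id = l.item_id
JOIN product_feature f ON f.id = l.feature_id
JOIN feature_desc d    ON d.id = l.feature_desc
WHERE l.item_id = 1;
Pivoting those rows into a single wide row per item is where the repeated LEFT OUTER JOINs mentioned above come in.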

Related

Delete partial duplicate rows

I have a Dataverse table that has a few columns. One of those columns is an Order Number column. There should only be one row per order number. If there is more than 1, only the first one should be kept. How can I do this in Power Automate?
What I have tried so far: First, I created an array of all the order numbers. From there, I feel stuck. I started to add an Apply to Each action, loop through the table, count how many of each order number there are, but then I confused myself and didn't think that was the right way to go.
Or...is there a way to keep the "duplicate" rows from getting added to the Dataverse table in the first place? The data is getting loaded into the table via a JSON load. Is there a way to delete the "duplicate" items from the JSON?
Here's an example of the situation:
| OrderNumber | OrderDate | CustomerName |
| 450123      | 2-24-22   | Business A   |
| 450123      | 2-25-22   | Business A   |
| 383238      | 2-24-22   | Business B   |

MySQL full text search matching similar results

I'll try to explain my situation: I'm trying to create a search engine for products on my website, so when the user needs to find a product I need to show similar ones, here's an example.
User searches:
assassins creed OR assassinscreed OR aSsAssIn's CreeD, assuming there is no letter/number misspelling (those 3 queries should produce the same result)
Expected results:
Assassin's Creed AND Assassin's Creed: Unity AND Assassin's Creed: Special Edition
What have I tried so far
I have created a MySQL field for the search engine which contains a parsed name of the product (Assassin's Creed: Unity -> assassinscreedunity)
I parse the search query
I search using MySQL's INSTR()
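Roughly, that INSTR() query looks like this (products and parsed_name stand in for my actual table and column names):
SELECT *
FROM products
WHERE INSTR(parsed_name, 'assassinscreed') > 0;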
My problem
I'm fine with using this, but I've heard it can be slow when the number of rows increases. I've created a full-text index on my table, but I don't think it will help, so I need another solution.
Thanks for any answer, and ask me anything before downvoting.
First of all, you should keep track of performance issues in your queries more precisely than 'heard it can be slow' and 'think it would help'. One starting point may be the Slow Query Log.
If you have a table which contains the same parsed name in more than one row, consider normalizing your database. In the specific case, store unique parsed names in one table, and only the id of the corresponding parsed name in the table you described in your question. This way, you only need to check the smaller table with unique names and can then quickly find all matching entries in the main table by id.
Example:
Consider the following table with your structure
id | product_name | rating
-----------------------------------
1 | assassinscreedunity | 5
2 | assassinscreedunity | 2
3 | monkeyisland | 3
4 | monkeyisland | 5
5 | assassinscreedunity | 4
6 | monkeyisland | 4
you would have to scan all six entries to find relevant rows.
In contrast, consider two tables like this:
id | p_id | rating
--------------------
1 | 1 | 5
2 | 1 | 2
3 | 2 | 3
4 | 2 | 5
5 | 1 | 4
6 | 2 | 4
id | name
--------------------------
1 | assassinscreedunity
2 | monkeyisland
In this case, you only have to scan two entries (compared to six) and can then efficiently look up relevant rows using the integer id.
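A minimal sketch of that two-table layout and lookup in MySQL (the table names parsed_names and products are just illustrative):
CREATE TABLE parsed_names (
    id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    UNIQUE KEY uq_name (name)
);

CREATE TABLE products (
    id     INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    p_id   INT UNSIGNED NOT NULL,  -- references parsed_names.id
    rating TINYINT,
    KEY idx_p_id (p_id)
);

-- Scan only the small table of unique names, then fetch matching rows by id:
SELECT p.*
FROM parsed_names n
JOIN products p ON p.p_id = n.id
WHERE n.name = 'assassinscreedunity';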
To further enhance the performance, you could extend the concept of a parsed name and use hashes. For example, you could calculate the SHA1-hash of your parsed name, which is a 160 bit value. You can find entries in your database for this value very efficiently. To match substrings, you can add them to the second table as well. Since the hash only needs to be computed once, you still can use the database to match by an integer. Another thing worth looking into might be fuzzy hashing.
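A sketch of the hashing idea on top of the previous sketch (the column name name_sha1 is made up):
-- Add a 20-byte column for the packed SHA1 of the parsed name:
ALTER TABLE parsed_names ADD COLUMN name_sha1 BINARY(20) NULL;

-- SHA1() returns a 40-character hex string; UNHEX() packs it into 20 bytes.
UPDATE parsed_names SET name_sha1 = UNHEX(SHA1(name));

ALTER TABLE parsed_names ADD UNIQUE KEY uq_name_sha1 (name_sha1);

-- Look up by the hash of the parsed search term:
SELECT id
FROM parsed_names
WHERE name_sha1 = UNHEX(SHA1('assassinscreedunity'));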
In addition, you should read up on the Rabin–Karp algorithm or string searching in general.

Make UNIQUE KEY relational to query

That might sound weird, I know. This is an explanation:
1- I have the following table - items (which gets updated because users can update the amount of items as well as the content inside of them):
| id | content | item_id | Order (Unique Index) |
|:-----------|------------:|:------------:|:--------------------:|
| 1 | This | 1 | 1
| 2 | is | 1 | 2
| 3 | content | 1 | 3
| 4 | Some | 2 | 1
| 5 | More | 2 | 2
| 6 | More! | 3 | 1
2- On my server, I am running a query that will iterate through my POSTed content and check every item based on its item_id, as well as check if the order in that row is set. If the order is set, then update the content, else insert new content
Let's say that my content is POSTing 4 items and the item_id = 1. Preferably, what I would want it to do is this:
| id | content | item_id | Order (Unique Index) |
|:-----------|------------:|:------------:|:--------------------:|
| 1 | This | 1 | 1
| 2 | updated | 1 | 2
| 3 | content | 1 | 3
| 4 | Some | 2 | 1
| 5 | More | 2 | 2
| 6 | More! | 3 | 1
> 7 | added | 1 | 4
Note what happened: it added a new row because my POSTed content had four items in it. It iterated through every single one and checked if the order existed; if the order existed, it updated the value, else it created a new row and inserted the value as well as the order (key). The order is pretty much the key. That's how I am setting it in there.
It doesn't work when I do this:
// Start loop - for (key in content) {
INSERT INTO items (item_id, content, content_order) VALUES (?, content[key], ?) WHERE item_id = ? ON DUPLICATE KEY UPDATE content = ?
// End loop
What the loop does is iterate through every piece of content POSTed and insert it into the database; if there is a duplicate key (in this case, the Unique Index is the Order column) it only updates the content inside of it.
The problem with this is, it will only work on the first three items. Why? Because the first three items are the first ones with those unique indexes. If I were to update the item whose item_id is 2, it would give me an error because I cannot update items that have the same unique key. I cannot even INSERT anything because it violates the Unique Index constraints!
So how can I do this?
Is there a way to make the Unique Index scoped to the query - meaning that it will only consider the Unique Index within the query's specified item_id? (I doubt it)
How can I make it so that it checks if the order is set and update the content or insert a new row without using unique keys?
Is there an alternate way to write this?
If elaboration is needed, please let me know. Thanks.
A straightforward design for your needs would probably have no problems, although your question is unclear, especially about the new content.
Order is not a key of items, because column order is not unique on its own. The key you want is (item_id, order).
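A minimal sketch of that composite key, using the table and column names from the question (the name of the existing single-column index is a guess; check SHOW INDEX FROM items):
ALTER TABLE items
    DROP INDEX content_order,
    ADD UNIQUE KEY uq_item_order (item_id, content_order);

-- With that key in place, the question's upsert only collides within the same item_id:
INSERT INTO items (item_id, content, content_order)
VALUES (1, 'added', 4)
ON DUPLICATE KEY UPDATE content = VALUES(content);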
Do you need items' id? I'm going to ignore it. I'm going to treat the new content as if it were in a table; you should probably build a constant subquery from it. Do all the new content item_ids appear in items?
1. No NULLs.
A simple design is to have a version of items called content that holds the rows that make the following fill-in-the-blanks statement true. I'll assume items order is consecutive within item_id.
// item [item_id]'s [order]th content is [content]
content(item_id,order,content)
primary key (item_id,order)
I will guess at your new content format and effect. I'll assume order is consecutive from 1 within item_id. I'll replace all content info for a new content item_id.
// item [item_id]'s [order]th content is [content]
more_content(item_id,order,content)
primary key (item_id,order)
delete from content
where item_id in (select item_id from more_content);
insert into content
select * from more_content;
2. NULLs
If NULL order indicates that there is no content then you can instead have content NULL and order=1. (You can also have another table and no NULLs.) If NULL order indicates that there is an unchanged default then just have another table:
// item [item_id]'s content is a default
has_default(item_id)
primary key (item_id)
delete from has_default
where item_id in (select item_id from more_content);
delete from content
where item_id in (select item_id from more_content);
insert into content
select * from more_content;
If you want items readable the way it is, make a view:
// [order] is null AND item [item_id]'s default content is [content]
// OR [order] is not null AND item [item_id]'s [order]th content is [content]
create view items as (
select c.item_id, c.content, if(d.item_id is null, c.`order`, NULL) as `order`
from content c left join has_default d on c.item_id=d.item_id
)
It's hard to make out much more about your design.
It may be difficult to implement constraints for any design for your needs in SQL. But you should start with a straightforward design.

Constrain database to hold one value or the other never both

Is it possible to add a database constraint to limit a row to have a single value in one of two columns, never more and never less? Let me illustrate:
Sales Order Table
---------------------------------
id | person_id | company_id |
Rows for this would look like:
id | person_id | company_id |
---|-----------|------------|
1 | 1 | null |
2 | 2 | null |
3 | null | 1 |
4 | null | 2 |
In this illustration, the source of the sales order is either a person or a company. It is one or the other, no more or less. My question is: is there a way to constrain the database so that 1) both fields can't be null and 2) both fields can't be not-null? i.e., one has to be null and one has to be not-null...
I know the initial reaction from some may be to combine the two tables (person, company) into one customer table. But, the example I'm giving is just a very simple example. In my application the two fields I'm working with cannot be combined into one.
The DBMS I'm working with is MySQL.
I hope the question makes sense. Thank you in advance for your help!
This may come as a shock...
MySQL doesn't support CHECK constraints (at least prior to version 8.0.16). It allows you to define them, but it totally ignores them.
They are allowed in the syntax only to provide compatibility with other databases' syntax.
You could use a trigger on update/insert, and use SIGNAL to raise an exception.
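A sketch of such a trigger, assuming the table is named sales_order and using the columns from the question (a matching BEFORE UPDATE trigger is needed as well):
DELIMITER $$
CREATE TRIGGER sales_order_xor_check
BEFORE INSERT ON sales_order
FOR EACH ROW
BEGIN
    -- Error out unless exactly one of the two columns is non-NULL.
    IF (NEW.person_id IS NULL) = (NEW.company_id IS NULL) THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Exactly one of person_id or company_id must be set';
    END IF;
END$$
DELIMITER ;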

How to get the right "version" of a database entry?

Update: Question refined, I still need help!
I have the following table structure:
table reports:
ID | time | title | (extra columns)
1 | 1364762762 | xxx | ...
Multiple object tables that have the following structure
ID | objectID | time | title | (extra columns)
1 | 1 | 1222222222 | ... | ...
2 | 2 | 1333333333 | ... | ...
3 | 3 | 1444444444 | ... | ...
4 | 1 | 1555555555 | ... | ...
In the object tables, on an object update a new version with the same objectID is inserted, so that the old versions are still available. For example see the entries with objectID = 1
In the reports table, a report is inserted but never updated/edited.
What I want to be able to do is the following:
For each entry in my reports table, I want to be able to query the state of all objects, like they were, when the report was created.
For example, let's look at the sample report above with ID 1. At the time it was created (see the time column), the current version of objectID 1 was the entry with ID 1 (entry ID 4 did not exist at that point).
ObjectID 2 also existed, with its current version being the entry with ID 2.
I am not sure how to achieve this.
I could use a query that selects the object versions by the time column:
SELECT *
FROM (
    SELECT *
    FROM objects
    WHERE time < [reportTime]
    ORDER BY time DESC
) AS versions
GROUP BY objectID
Let's not talk about the performance of this query; it is just to make clear what I want to do. My problem is the comparison of the time columns. I think this is not a good way to make sure I get the right object versions, because the system time may change "for any reason" and the time column would then contain wrong data, which would lead to wrong results.
What would be another way to do so?
I thought about not using a time column for this, but instead a GLOBAL incremental value so that I know the insertion order across the database tables.
If you are inserting new versions of the object, and your problem is the time column (I assume you are using this column to sort which one is newer), I suggest you use an auto-incremental ID column for the versions. Even if the time value is not reliable for you, the ID will be, since it is always increasing. So: higher ID, newer version.
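A sketch of that idea, keeping the question's [bracket] placeholder style: per objectID, take the row with the highest id that existed at the report's creation, where [reportBoundaryId] stands for whatever boundary value (the asker's global incremental value) is recorded when the report is written:
SELECT o.*
FROM objects o
JOIN (
    -- Highest (= newest) version id per object, up to the report's boundary:
    SELECT objectID, MAX(id) AS max_id
    FROM objects
    WHERE id <= [reportBoundaryId]
    GROUP BY objectID
) latest ON latest.max_id = o.id;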