How to get rid of arrays in mysql database? - mysql

A dining room specializes in complex dinners. It has a collection of recipes (each of them lists the amounts of the products it uses). Every product has a changeable price.
Is it the best design?
Recipe(r_id, r_title, r_category, r_price)
Product(p_id, p_title, p_price)
UsingProducts(r_id, p_id, amount)
I am just not sure about UsingProducts..

The design looks quite okay.
As zerkms mentioned, you're lacking units. That doesn't have to be a problem, as your product can be "100 g flour" so the unit is implicit. However, when printing the recipe, you would print "5 x 100 g flour" instead of "500 g flour". It would also print "10 x 100 g flour" instead of "1 kilo flour".
Just think about whether this is an issue for you, and whether you even need unit conversion such as 1000 g = 1 kilo.
Another point is your category. As designed, a recipe can belong to only one category. So you won't have overlapping categories like "vegetarian" and "soups" (with the problem of where to place a vegetarian soup), but distinct categories instead. Okay. However, don't you want a table for them, so you can easily select them? If you want to stay with this design, you should at least make the column an enum (something special in MySQL), so you don't mistakenly have recipes in "soups" and others in "suops".
Lastly: what is r_price for? Shouldn't that be the sum of all sub-prices (product price x amount)? Don't store data redundantly; otherwise inconsistencies can occur (e.g. 10$ + 10$ = 30$). Remove r_price from the recipe table to get a normalized database.
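To illustrate deriving the recipe price instead of storing it, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The sample rows and the helper name recipe_price are assumptions made up for the example; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (p_id INTEGER PRIMARY KEY, p_title TEXT, p_price REAL);
CREATE TABLE Recipe (r_id INTEGER PRIMARY KEY, r_title TEXT, r_category TEXT);
CREATE TABLE UsingProducts (
    r_id INTEGER REFERENCES Recipe(r_id),
    p_id INTEGER REFERENCES Product(p_id),
    amount REAL,
    PRIMARY KEY (r_id, p_id)
);
-- Illustrative sample data (not from the original question)
INSERT INTO Product VALUES (1, '100 g flour', 0.5), (2, '1 egg', 0.25);
INSERT INTO Recipe VALUES (1, 'Pancakes', 'desserts');
INSERT INTO UsingProducts VALUES (1, 1, 5), (1, 2, 2);
""")

def recipe_price(conn, r_id):
    """Compute the recipe price as the sum of product price x amount."""
    row = conn.execute(
        """SELECT SUM(p.p_price * u.amount)
           FROM UsingProducts u
           JOIN Product p ON p.p_id = u.p_id
           WHERE u.r_id = ?""",
        (r_id,),
    ).fetchone()
    return row[0]

print(recipe_price(conn, 1))
```

Because the price is computed on demand, changing a product's price is automatically reflected in every recipe that uses it.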

Related

Find the table entry with the largest number, only if it matches a condition

I have a table like below and I want to return the name of the item with the greatest effect of a particular type. For example, I want the name of the ring with the best 'Shield' enchantment, in this case 'Brusef Amelion's Ring'.
Description | Apparel slot | Effect Type | Effect Value
Apron of Adroitness | Chest | Fortify Agility | 5 pts
Brusef Amelion's Ring | Ring | Shield | 18%
Cuirass of the Herald | Chest | Fortify Health | 15 pts
Fortify Magicka Pants | Legs | Fortify Magicka | 20 pts
Grand ring of Aegis | Ring | Shield | 6%
I've tried using a MAXIFS statement:
=maxifs(Effect Value, Apparel Slot, "=Ring", Effect Type, "=Shield")
and that returns 0.18, as I'd expect. But I want it to return the name of this item, 'Brusef Amelion's Ring'. So I then tried using a vlookup on this value, but there doesn't seem to be the option to only lookup a value if ('Apparel Slot'='Ring' && 'Effect Type'='Shield'), for example.
I feel like there must be a way of nesting some specific functions here, but I can't quite figure it out.
Is there any way to do this while avoiding manually sorting and filtering my data before running each query?
Is this what you are looking for?
=DGET(A1:D6,"Description",{"Apparel slot","Effect Value";"Ring",MAXIFS(D2:D6,B2:B6,"Ring",C2:C6,"Shield")})
DGET function needs column headers to work with as you can see. So we want the value from the column Description. "Originally" this is a DGET function:
DGET(A2:F20,"price",{"Ticker";"Google"})
This is saying: Find price where ticker = google.
We can extend this with a second criterion. The columns of the criteria array are separated by , or \ (depending on your locale settings), like this:
DGET(A2:F20,"price",{"Ticker","Year";"Google",2020})
DGET(A2:F20;"price";{"Ticker"\"Year";"Google"\2020})
Find price where ticker = google and year = 2020.
Of course, you can replace "Google" with a cell reference. Hope this helps.
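For readers who want to see the underlying logic outside a spreadsheet, the same "filter first, then take the maximum, then return the name" idea can be sketched in Python. The tuple layout and the function name best_item are assumptions for illustration; the data is the table from the question, with the effect values reduced to plain numbers.

```python
# (description, apparel slot, effect type, effect value)
items = [
    ("Apron of Adroitness", "Chest", "Fortify Agility", 5),
    ("Brusef Amelion's Ring", "Ring", "Shield", 18),
    ("Cuirass of the Herald", "Chest", "Fortify Health", 15),
    ("Fortify Magicka Pants", "Legs", "Fortify Magicka", 20),
    ("Grand ring of Aegis", "Ring", "Shield", 6),
]

def best_item(items, slot, effect):
    """Return the description of the matching item with the largest value."""
    matches = [it for it in items if it[1] == slot and it[2] == effect]
    if not matches:
        return None  # nothing satisfies both conditions
    return max(matches, key=lambda it: it[3])[0]

print(best_item(items, "Ring", "Shield"))
```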

Comparison of data in huge databases

I have a database in mysql which has a collection of attributes (ex. 'weight', 'height', 'no of pages' etc) and attribute values (ex. '30 tons', '12 inches', '2 pgs' etc) and mapped with the respective product ids.
The data has been collected from different sites and hence the attribute values have different formats (ex. '222 pgs' or '222 pages' or '222') (ex2. '12 inches', '12 meters', '12 cms').
What I need to do is that I have to compare the values of same attributes of different products. So I have to compare '222 pgs' with '222 pages' for all the attributes which differ in formats.
There are around 4000 attributes and the number will increase further. Is there any way to compare these without having to assign each attribute a specific type individually? Or what is the fastest way to compare these?
Well, until they invent a clairvoyant computer, a human being will have to tell it that pgs and pages mean the same thing and that inches and meters are convertible.
You'll have to sanitize the data one way or another. I'd probably start by identifying units that measure the same dimension [1] and common aliases [2] for each unit, then parse the data to split the quantity from the unit and normalize [3] the unit. Once you have done that, the data becomes directly comparable.
But all this is really just a remedy for the problem that should not have been there in the first place, were the database designed properly.
[1] A "mass" is a dimension measured by units such as kg, t, lb etc. A "length" is a dimension measured by m, km, in etc.
[2] E.g. in and inch denote exactly the same unit, pgs and pages are the same etc.
[3] I.e. make sure a particular dimension is always represented by the same unit: for example convert all lengths to m, all masses to kg, all pages to pages etc.
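The split-and-normalize step described above could look roughly like the following Python sketch. The alias table, the unit table and the function name normalize are hypothetical and deliberately tiny; a real data set with thousands of attributes would need a much larger, human-curated mapping.

```python
import re

# Hypothetical alias table: every known spelling maps to one canonical unit.
ALIASES = {
    "pg": "pages", "pgs": "pages", "page": "pages", "pages": "pages",
    "in": "inch", "inch": "inch", "inches": "inch",
    "cm": "cm", "cms": "cm",
    "m": "m", "meter": "m", "meters": "m",
}
# Canonical unit -> (dimension, factor that converts to the base unit).
UNITS = {
    "pages": ("page count", 1.0),
    "inch": ("length", 0.0254),
    "cm": ("length", 0.01),
    "m": ("length", 1.0),
}
VALUE_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([A-Za-z]*)\s*$")

def normalize(raw, default_unit=None):
    """Split '222 pgs' into quantity and unit; return (dimension, base value)."""
    match = VALUE_RE.match(raw)
    if match is None:
        raise ValueError(f"cannot parse {raw!r}")
    quantity = float(match.group(1))
    unit_text = match.group(2).lower()
    if unit_text:
        unit = ALIASES.get(unit_text)
    else:
        unit = default_unit  # a bare number needs a per-attribute default
    if unit is None:
        raise ValueError(f"unknown or missing unit in {raw!r}")
    dimension, factor = UNITS[unit]
    return dimension, quantity * factor

print(normalize("222 pgs") == normalize("222 pages"))
```

Two values are comparable only when their dimensions match; values that fail to parse are better collected for human review than guessed at.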
You haven't explained what you want to do after you find out that attributes for a pair of products differ (while still meaning the same thing).
I.e.: if I see that Instance A has field Length set to "12 pgs" and Instance B has Length reporting "12 pages", what do you do?
List this? Autocorrect? Drop one of the two values? Open a window for a human user to correct?
Personally I'd go for a "select attribute,count(*) from X group by attribute" so that you can find out the most common spelling of the unit, and then you can also write corrective scripts that may automatically convert ".. pgs" to " pages" as soon as you have decided the correct representation.
Of course this will not help at all unless you enforce correct spelling of the units, and this requires for sure better input-output filters, including the main UI, but also any kind of bulk uploader utility you may use to create or update products.
A redesign of the DB to add "Unit" as an extra, categorized attribute for each measure would also help a lot.
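The "count the spellings first" idea can also be tried on exported values before touching the database. This is a small Python sketch under the assumption that each value is a number followed by an optional unit; the function name unit_spellings is made up for the example.

```python
import re
from collections import Counter

def unit_spellings(values):
    """Count how often each unit spelling occurs in a list of raw values."""
    counts = Counter()
    for value in values:
        match = re.match(r"^\s*[0-9.]+\s*(.*?)\s*$", value)
        # Values without a leading number are counted under their own text.
        unit = match.group(1) if match else value
        counts[unit.lower()] += 1
    return counts

counts = unit_spellings(["222 pgs", "10 pages", "5 Pages", "222"])
print(counts.most_common())
```

The most common spelling is a reasonable candidate for the canonical form; the empty string in the result shows how many values carry no unit at all.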

mysql best way to manage product sizes

I am developing a product database. For sizes, I created a separate table PRODUCT_SIZE(id, sizetext),
e.g. (1,'Small'), (2,'Large'), (3,'Extra Large'), ...
I present this list of sizes as checkboxes; when a product is added, all possible sizes can be selected for that product.
E.g. for a T-shirt, SMALL and LARGE are selected.
These 2 sizes are then available for each new stock purchase entry.
Now I have learned that there can be different size units: some items can be in inches, some in kg, and some in meters.
I have an altered solution in mind:
to alter the table to
PRODUCT_SIZE(id, sizetext, UnitType);
Now it can be: (1,'5','KG'), (2,'10','KG'), (3,'2.5','Inches'), ...
Is there any better approach or suggestion?
It seems like you're forcing 'clothing size', 'weight' and 'length' into one 'size' attribute.
Try these tables:
product (product_id, name)
"Nike t-shirt"
attribute_group (attribute_group_id, name)
"Shirt size", "Weight", "Length", etc.
attribute_value (attribute_value_id, attribute_group_id, name)
"Shirt size" would have rows for "Small", "Large", etc.
product_attribute (product_id, attribute_value)
"Nike t-shirt" is "Large"
Add a "display order" to attribute_value, too (so "Small" can be displayed before "Large").
Do this for your other attributes, too.
I've done this for a production site, and I think it worked well.
Good luck.
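As a rough sketch of the tables suggested above, the following uses Python's built-in sqlite3 module in place of MySQL. The display_order column, the attribute_value_id name in product_attribute, the sample rows and the helper sizes_for are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE attribute_group (
    attribute_group_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE attribute_value (
    attribute_value_id INTEGER PRIMARY KEY,
    attribute_group_id INTEGER NOT NULL
        REFERENCES attribute_group(attribute_group_id),
    name TEXT NOT NULL,
    display_order INTEGER NOT NULL
);
CREATE TABLE product_attribute (
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    attribute_value_id INTEGER NOT NULL
        REFERENCES attribute_value(attribute_value_id),
    PRIMARY KEY (product_id, attribute_value_id)
);
INSERT INTO product VALUES (1, 'Nike t-shirt');
INSERT INTO attribute_group VALUES (1, 'Shirt size'), (2, 'Weight');
INSERT INTO attribute_value VALUES
    (1, 1, 'Small', 1), (2, 1, 'Large', 2), (3, 1, 'Extra Large', 3),
    (4, 2, '5 kg', 1), (5, 2, '10 kg', 2);
-- The t-shirt comes in Large and Small (inserted out of order on purpose)
INSERT INTO product_attribute VALUES (1, 2), (1, 1);
""")

def sizes_for(conn, product_name, group_name):
    """List a product's values within one attribute group, in display order."""
    rows = conn.execute(
        """SELECT v.name
           FROM product p
           JOIN product_attribute pa ON pa.product_id = p.product_id
           JOIN attribute_value v
             ON v.attribute_value_id = pa.attribute_value_id
           JOIN attribute_group g
             ON g.attribute_group_id = v.attribute_group_id
           WHERE p.name = ? AND g.name = ?
           ORDER BY v.display_order""",
        (product_name, group_name),
    ).fetchall()
    return [row[0] for row in rows]

print(sizes_for(conn, "Nike t-shirt", "Shirt size"))
```

Adding a new kind of measurement then means adding rows, not altering the schema.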
Instead of making a separate table for this, why don't you just put all of your dropdown options in an application-scoped variable? Then you can just add that data right into a field in product as a string and deal with the different options/units programmatically.
I made a database storing clothes where the sizes came in a few types. One article has sizes like XS, S, M, L; another has 26, 27, 28, 29, 30 and so on.
I decided to do this:
# on one side, in the script, I define size types and names;
$sizeTypes[1] = ['XS', 'S', 'M', 'L'];
$sizeTypes[2] = [29, 30, 31, 32];
# ...and so on
#and on the other side in the database, there are just two columns
size_type_id(int) | size_qty |
# so if I have one article with 3 pieces of size S 2 pieces of size M and 5 pieces of size L the database will store:
size_type_id| size_qty |
1 |0:0;1:3;2:2;3:5|
then in the script I just translate it so that 0 of type 1 is XS 1 of type 1 is S 2 of type 2 is 31 and so on
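The translation step could be sketched as follows, shown here in Python rather than the PHP of the answer. The dictionary SIZE_TYPES mirrors $sizeTypes, and the function name decode_sizes is made up for the example.

```python
# Mirrors the $sizeTypes array from the script side.
SIZE_TYPES = {
    1: ["XS", "S", "M", "L"],
    2: ["29", "30", "31", "32"],
}

def decode_sizes(size_type_id, size_qty):
    """Turn a packed 'index:quantity;...' string into {size name: quantity}."""
    names = SIZE_TYPES[size_type_id]
    result = {}
    for pair in size_qty.split(";"):
        index, quantity = pair.split(":")
        result[names[int(index)]] = int(quantity)
    return result

print(decode_sizes(1, "0:0;1:3;2:2;3:5"))
```

Note that a packed string like this cannot be filtered or summed by the database itself, which is the trade-off against the separate-table designs in the other answers.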

Stock management of assemblies and its sub parts (relationships)

I have to track the stock of individual parts and kits (assemblies) and can't find a satisfactory way of doing this.
Sample bogus and hyper simplified database:
Table prod:
prodID 1
prodName Flux capacitor
prodCost 900
prodPrice 1350 (900*1.5)
prodStock 3
-
prodID 2
prodName Mr Fusion
prodCost 300
prodPrice 600 (300*2)
prodStock 2
-
prodID 3
prodName Time travel kit
prodCost 1200 (900+300)
prodPrice 1560 (1200*1.3)
prodStock 2
Table rels
relID 1
relSrc 1 (Flux capacitor)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
-
relID 2
relSrc 2 (Mr Fusion)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
prodPrice: it's calculated based on the cost, but not in a linear way. In this example, for costs of 500 or less the markup is 200%. For costs of 500-1000 the markup is 150%. For costs of 1000+ the markup is 130%
That's why the time travel kit is much cheaper than the individual parts
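The tiered markup can be written down as a small function, sketched here in Python. The function name sale_price is hypothetical, and treating a cost of exactly 1000 as part of the 150% tier is an assumption, since the question does not say which tier that boundary falls into.

```python
def sale_price(cost):
    """Apply the tiered markup from the example (percent of cost)."""
    if cost <= 500:
        percent = 200
    elif cost <= 1000:  # assumption: exactly 1000 still gets 150%
        percent = 150
    else:
        percent = 130
    return cost * percent / 100

flux, fusion, kit = sale_price(900), sale_price(300), sale_price(1200)
print(flux, fusion, kit)
```

With the sample data this gives 1350 and 600 for the parts and 1560 for the kit, so the kit costs 390 less than buying the parts separately.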
prodStock: here is my problem. I can sell kits or the individual parts, So the stock of the kits is virtual.
The problem when I buy:
Some providers sell me the Time Travel kit as a whole (with one barcode) and some sell me the individual parts (with a different barcode)
So when I load the stock I don't know how to impute it.
The problem when I sell:
If I only sold kits, calculating the stock would be easy: "I have 3 Flux capacitors and 2 Mr Fusions, so I have 2 Time travel kits and a Flux Capacitor"
But I can sell kits or individual parts. So I have to track the stock of the individual parts and the possible kits at the same time (and I have to compensate for the sell price)
Probably this is really simple, but I can't see a simple solution.
To summarize: I have to find a way of tracking the stock, and the database/program is the one that has to do it (I can't ask the clerk to correct the stock)
I'm using PHP + MySQL. But this is more a logical problem than a programming one
Update: Sadly, Eagle's solution won't work.
The relationships can be, and are, recursive (one kit uses another kit)
There are kits that use more than one of the same part (2 flux capacitors + 1 Mr Fusion)
I really need to store a value for the stock of the kit. The same database is used for the web page where users want to buy the parts, and I should show the available stock (otherwise they won't even try to buy). And I can't afford to calculate the stock on every user search on the web page
But I liked the idea of a boolean marking the stock as virtual
Okay, well first of all since the prodStock for the Time travel kit is virtual, you cannot store it in the database, it will essentially be a calculated field. It would probably help if you had a boolean on the table which says if the prodStock is calculated or not. I'll pretend as though you had this field in the table and I'll call it isKit for now (where TRUE implies it's a kit and the prodStock should be calculated).
Now to calculate the amount of each item that is in stock:
select p.prodID, p.prodName, p.prodCost, p.prodPrice, p.prodStock
from prod p
where not p.isKit
union all
select p.prodID, p.prodName, p.prodCost, p.prodPrice, min(c.prodStock) as prodStock
from prod p
  inner join rels r on (p.prodID = r.relDst and r.relType = 4)
  inner join prod c on (r.relSrc = c.prodID and not c.isKit)
where p.isKit
group by p.prodID, p.prodName, p.prodCost, p.prodPrice
I used the alias c for the second prod to stand for 'component'. I explicitly wrote not c.isKit since this won't work recursively. union all is used rather than union for efficiency reasons, since both will return the same results here.
Caveats:
This won't work recursively (e.g. if a kit requires components from another kit).
This only works on kits that require only one of a particular item (e.g. if a time travel kit were to require 2 flux capacitors and 1 Mr. Fusion, this wouldn't work).
I didn't test this so there may be minor syntax errors.
This only calculates the prodStock field; to do the other fields you would need similar logic.
If your query is much more complicated than what I assumed, I apologize, but I hope that this can help you find a solution that will work.
As for how to handle the data when you buy a kit, this assumes you would store the prodStock in only the component parts. So for example if you purchase a time machine from a supplier, instead of increasing the prodStock on the time machine product, you would increase it on the flux capacitor and the Mr. Fusion.
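One way around the two caveats (nested kits and repeated parts) is to do the calculation in application code: flatten a kit into its total requirement of basic parts, then divide the stock of each part by that requirement. The Python sketch below assumes a per-relationship quantity, which the rels table in the question does not have, and the names flatten, virtual_stock and the "deluxe kit" with ID 4 are made up for illustration. It also assumes the relationships contain no cycles.

```python
from collections import Counter

# Stock is stored only for basic parts: 1 = Flux capacitor, 2 = Mr Fusion.
stock = {1: 3, 2: 2}
# kit ID -> {component ID: quantity}; the quantity is an assumed extra column.
components = {
    3: {1: 1, 2: 1},  # Time travel kit
    4: {3: 1, 1: 1},  # hypothetical deluxe kit: one kit plus a spare capacitor
}

def flatten(prod_id, components, multiplier=1, totals=None):
    """Expand a product into total quantities of basic (non-kit) parts."""
    if totals is None:
        totals = Counter()
    parts = components.get(prod_id)
    if not parts:
        totals[prod_id] += multiplier  # a basic part
    else:
        for part_id, quantity in parts.items():
            flatten(part_id, components, multiplier * quantity, totals)
    return totals

def virtual_stock(prod_id, stock, components):
    """How many units could be sold given the current stock of basic parts."""
    needed = flatten(prod_id, components)
    return min(stock.get(part, 0) // qty for part, qty in needed.items())

print(virtual_stock(3, stock, components))
```

If computing this on every page view is too expensive, the result could be written back to a cached column whenever the stock of a basic part changes; that refresh logic is not shown here.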

Human name comparison: ways to approach this task

I'm not a Natural Language Processing student, yet I know it's not a trivial strcmp(n1,n2).
Here's what I've learned so far:
comparing Personal Names can't be solved 100%
there are ways to achieve certain degree of accuracy.
the answer will be locale-specific, that's OK.
I'm not looking for spelling alternatives! The assumption is that the input's spelling is correct.
For example, all the names below can refer to the same person:
Berry Tsakala
Bernard Tsakala
Berry J. Tsakala
Tsakala, Berry
I'm trying to:
build (or copy) an algorithm which grades the relationship between 2 input names
find an indexing method (for names in my database, for hash tables, etc.)
note:
My task isn't about finding names in text, but to compare 2 names. e.g.
name_compare( "James Brown", "Brown, James", "en-US" ) ---> 99.0%
I used Tanimoto Coefficient for a quick (but not super) solution, in Python:
"""
Formula:
    Na = number of set A elements
    Nb = number of set B elements
    Nc = number of common items
    T = Nc / (Na + Nb - Nc)
"""
def tanimoto(a, b):
    c = [v for v in a if v in b]
    return float(len(c)) / (len(a)+len(b)-len(c))

def name_compare(name1, name2):
    return tanimoto(name1, name2)

>>> name_compare("James Brown", "Brown, James")
0.91666666666666663
>>> name_compare("Berry Tsakala", "Bernard Tsakala")
0.75
Edit: A link to a good and useful book.
Soundex is sometimes used to compare similar names. It doesn't deal with first name/last name ordering, but you could probably just have your code look for the comma to solve that problem.
We've just been doing this sort of work non-stop lately and the approach we've taken is to have a look-up table or alias list. If you can discount misspellings/misheard/non-english names then the difficult part is taken away. In your examples we would assume that the first word and the last word are the forename and the surname. Anything in between would be discarded (middle names, initials). Berry and Bernard would be in the alias list - and when Tsakala did not match to Berry we would flip the word order around and then get the match.
One thing you need to understand is the database/people lists you are dealing with. In the English speaking world middle names are inconsistently recorded. So you can't make or deny a match based on the middle name or middle initial. Soundex will not help you with common name aliases such as "Dick" and "Richard", "Berry" and "Bernard" and possibly "Steve" and "Stephen". In some communities it is quite common for people to live at the same address and have 2 or 3 generations living at that address with the same name. The only way you can separate them is by date of birth. Date of birth may or may not be recorded. If you have the clout then you should probably make the recording of date of birth mandatory. A lot of "people databases" either don't record date of birth or won't give them away due to privacy reasons.
Effectively people name matching is not that complicated. It's entirely based on the quality of the data supplied. What happens in practice is that a lot of records remain unmatched - and even a human looking at them can't resolve the mismatch. A human may notice name aliases not recorded in the aliases list or may be able to look up details of the person on the internet - but you can't really expect your programme to do that.
Banks, credit rating organisations and the government have a lot of detailed information about us. Previous addresses, date of birth etc. And that helps them join up names. But for us normal programmers there is no magic bullet.
Analyzing name order and the existence of middle names/initials is trivial, of course, so it looks like the real challenge is knowing common name alternatives. I doubt this can be done without using some sort of nickname lookup table. This list is a good starting point. It doesn't map Bernard to Berry, but it would probably catch the most common cases. Perhaps an even more exhaustive list can be found elsewhere, but I definitely think that a locale-specific lookup table is the way to go.
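The lookup-table approach described in these answers (take the first and last word, drop anything in between, flip "Surname, Forename", map nicknames to one canonical forename) can be sketched in Python as below. The three-entry alias table and the function names name_key and same_person are placeholders; a real system would load a much larger, locale-specific list.

```python
# Tiny placeholder alias table: nickname -> canonical forename.
ALIASES = {"berry": "bernard", "dick": "richard", "steve": "stephen"}

def name_key(name):
    """Reduce a full name to a (canonical forename, surname) pair."""
    name = name.strip()
    if "," in name:
        # "Tsakala, Berry" -> forename words first, surname last
        surname, _, rest = name.partition(",")
        words = rest.split() + surname.split()
    else:
        words = name.split()
    words = [word.strip(".").lower() for word in words]
    # Middle names and initials are ignored, as suggested above.
    forename, surname = words[0], words[-1]
    return ALIASES.get(forename, forename), surname

def same_person(name1, name2):
    """True if both names reduce to the same key (a candidate match only)."""
    return name_key(name1) == name_key(name2)

for candidate in ["Bernard Tsakala", "Berry J. Tsakala", "Tsakala, Berry"]:
    print(candidate, same_person("Berry Tsakala", candidate))
```

The same key also works as an index or hash-table key for the names in a database. A match here only means "possibly the same person"; as noted above, extra data such as date of birth is needed to tell relatives with identical names apart.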
I had real problems with the Tanimoto using utf-8.
What works for languages that use diacritical signs is difflib.SequenceMatcher()
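A minimal sketch of that suggestion, using difflib.SequenceMatcher from the Python standard library. The Unicode normalization step and the function name name_similarity are additions for the example, so that composed and decomposed forms of the same accented letter compare as equal.

```python
import difflib
import unicodedata

def name_similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two names."""
    # Normalize so that 'é' and 'e' + combining accent are the same string.
    a = unicodedata.normalize("NFC", a).casefold()
    b = unicodedata.normalize("NFC", b).casefold()
    return difflib.SequenceMatcher(None, a, b).ratio()

print(name_similarity("Berry Tsakala", "Bernard Tsakala"))
print(name_similarity("Jose\u0301", "Jos\u00e9"))
```

Unlike the Tanimoto version, SequenceMatcher is sensitive to character order, so "Brown, James" should be reordered before it is compared with "James Brown".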