Saving user's height and weight - mysql

How should I store a user's height and weight in a MySQL database such that I can use the information to find users within a certain height or weight? Also, I will need to be able to display this information in either English or metric system.
My idea is to store the information for height in centimeters and weight in kilograms (I prefer metric over English). I can even let the user enter their information in the English system, but do the conversion to metric before saving. I think converting kilograms to pounds might be easy to do in SQL, but I'm not sure how easy it would be to convert 178 centimeters to 5'10" (rounded slightly down).
Should I be saving English and metric values in the database so that I don't need to do conversions when I do my queries? Sounds like a bad idea to store derived/computed values.

There are several ways... one is to just have two numeric columns, one for height, one for weight, then do the conversions (if necessary) at display time. Another is to create a "height" table and a "weight" table, each with a primary key that is linked from another table. Then you can store both English and metric values in these tables (along with any other meta info you want):
CREATE TABLE height (
id SERIAL PRIMARY KEY,
english VARCHAR(32),
inches INT,
cm INT,
hands INT -- As in, the height of a horse
);
INSERT INTO height VALUES
(1,'4 feet', 48, 122, 12),
(2,'4 feet, 1 inch', 49, 124, 12),
(3,'4 feet, 2 inches', 50, 127, 12),
(4,'4 feet, 3 inches', 51, 130, 12),
....
You get the idea...
Then your users table will reference the height and weight tables--and possibly many other dimension tables--astrological sign, marital status, etc.
CREATE TABLE users (
uid SERIAL PRIMARY KEY,
height INT REFERENCES height(id),
weight INT references weight(id),
sign INT references sign(id),
...
);
Then to do a search for users between 4 and 5 feet:
SELECT *
FROM users
JOIN height ON users.height = height.id
WHERE height.inches >= 48 AND height.inches <= 60;
Several advantages to this method:
You don't have to duplicate the "effort" (as if it were any real work) to do the conversion on display--just select the format you wish to display!
It makes populating drop-down boxes in an HTML select super easy--just SELECT english FROM height ORDER BY inches, for instance.
It makes your logic for various dimensions--including non-numerical ones (like astrological signs) obviously similar--you don't have special case code all over the place for each data type.
It scales really well
It makes it easy to add new representations of your data (for instance, to add the 'hands' column to the height table)

I would do it the way that you have said you would like to do it, but on the converting part, you would not convert 178 centimeters to 5'10", you would convert it to 70", then if need be, convert that into 5'10".

Think of 5'10" as either 70" or 5.8333333'. In that case, converting between centimeters and 70" or 5.83333' is just a multiplication, so it's easy to store the height in the db as centimeters if you so choose.
The issue of what the user sees is a presentation issue and nothing to do with the database.
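To make the two-step conversion concrete, here is a minimal sketch of doing it in application code rather than in SQL. The function names are made up for illustration, and rounding to the nearest whole inch is an assumption about how you want to display heights.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch


def cm_to_feet_inches(cm):
    """Return (feet, inches) for a height in cm, rounded to the nearest inch."""
    total_inches = round(cm / CM_PER_INCH)  # step 1: centimeters -> whole inches
    return divmod(total_inches, 12)         # step 2: inches -> feet and inches


def format_imperial(cm):
    """Format a height stored in centimeters as feet'inches"."""
    feet, inches = cm_to_feet_inches(cm)
    return f"{feet}'{inches}\""


def feet_inches_to_cm(feet, inches):
    """Convert user input in feet and inches to whole centimeters for storage."""
    return round((feet * 12 + inches) * CM_PER_INCH)


print(format_imperial(178))
```

With this approach the database only ever holds centimeters, so a range search stays a plain numeric comparison on one column.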

I agree that storing computed values in this case is not ok. Your choices are perfect.
However, I would do the computations at the application level and query the DB with those values. Depending on the language your application is written in, I am sure there are plenty of libraries/modules that can do those conversions.
Edit - to address the issue of storing computed values in DB:
While this is considered to be a bad practice in working with DBs, I usually am not 100% against this practice - just 90%.
I tend to store computed values in DB only when the computations are complex and would take enormous resources to get to the result wanted - this is clearly not the case.
If you stored computed values here, you would get only the disadvantages of this technique: when modifying a record, you would have to modify the data in multiple places to keep your DB consistent.

Related

Why is MS Access returning some results in scientific notation?

I have two fields, both have the size set to double in the table properties. When I subtract one field from the other some of the results are displayed as scientific notation when I click in the cell and others just show regular standard format to decimal places.
The data in both fields was updated with Round([Field01],2) and Round([Filed2],2) so the numbers in the fields should not be any longer than 2 decimal places.
Here's an example:
Field1 = 7.01
Field2 = 7.00
But when I subtract Field1 from Field2, the Access display shows 0.01, but when I click on the result it displays -9.99999999999979E-03. So of course, when I try to filter on all results that have 0.01, the query comes back empty because it thinks the result is -9.99999999999979E-03.
Even stranger is if Field1 = 1.02 and Field2 = 1.00, the result is 0.02 and when I click on the result the display still shows 0.02 and I can filter on all results that equal 0.02.
Why would MS Access treat numbers in the same query differently? Why is it displaying in Scientific Notation and not filtering?
Thanks for any support.
Take this simple code in Access (or even Excel) and run it!
Public Sub TestAdd()
Dim MyNumber As Single
Dim I As Integer
For I = 1 To 10
MyNumber = MyNumber + 1.01
Debug.Print MyNumber
Next I
End Sub
Here is the output of the above:
1.01
2.02
3.03
4.04
5.05
6.06
7.070001
8.080001
9.090001
10.1
Note how after JUST 7 simple little additions, rounding errors show up and Access is now spitting out wrong numbers!
More amazing? The above code runs the SAME in Excel!
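The same effect can be reproduced outside Office. The sketch below is an approximation of the VBA loop, not a port of it: Python floats are 64-bit doubles, so the hypothetical helper to_single rounds each running total to 32 bits to mimic a Single variable, and the ".7g" format is an assumption about how Debug.Print shows Single values.

```python
import struct


def to_single(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count):
    """Add `step` to a running total `count` times, storing the total as a single."""
    total = 0.0
    results = []
    for _ in range(count):
        total = to_single(total + step)
        results.append(total)
    return results


values = accumulate(1.01, 10)
for v in values:
    print(format(v, ".7g"))  # seven significant digits, like a Single
```

The seventh total is no longer the single closest to 7.07, which is why it prints as 7.070001.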
Ok, I am sure I have your attention now!
If I recall, the FIRST day and first class in computing science? Computers don't store exact numbers when using floating point numbers.
So, then how is it possible that the WHOLE business community using Excel, or Access, or in fact your desktop calculator not come crashing down?
You mean Access cannot add up 7 simple little numbers without having errors?
How can I even do payroll then?
The basic concept and ALL you need to know here is that computers store real (floating) numbers only as approximations.
And integer values are stored exactly.
So, there are several approaches here. In fact, if you are writing ANY business software that needs to work with money values and not suffer rounding errors?
Then you are better off choosing what we call some kind of "scaled" integer. Behind the scenes, the computer does NOT use floating point numbers, but uses an integer value, and then also has a "decimal" position.
In fact, in a lot of older business BASIC languages, or others? We often had to do the scaling on our own. (So, we would choose a large integer format.) In fact, this "scaling" feature still exists in Access!!! (And you see it in the format options.)
So, two choices here. If you don't want "tiny" rounding errors, then use the "currency" data type. This may, or may not, be sufficient for you, since it only allows a max of 4 decimal places. But in most cases, it should suffice. And if you need "more" decimal places, then you can multiply the values by 1000, and then divide by 1000 when done with the calculations.
However, try changing the column type to currency and that should work. (This type of data is how your desktop calculator also works, and thus you do not see funny rounding errors as a result, in most cases.)
But the FIRST rule of the day? First computer course?
Computers do not store exact numbers for floating point numbers - they are approximations, and are subject to rounding errors. Using double for the table makes these errors much smaller, since you have "so many decimal places" available, but it does not remove them, as your -9.99999999999979E-03 result shows.
But, I would try using the currency data type - it is a scaled integer, or so-called packed decimal.
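As a rough illustration of the "scaled integer" idea (a sketch, not how Access implements currency internally), the snippet below keeps amounts as whole counts of 1/10,000, matching the four decimal places mentioned above. The names SCALE, to_scaled and from_scaled are invented for this example.

```python
from decimal import Decimal

SCALE = 10_000  # four implied decimal places (assumption, modeled on currency)


def to_scaled(text):
    """Parse a decimal string into an integer count of 1/10,000ths."""
    return int(Decimal(text) * SCALE)


def from_scaled(units):
    """Turn the scaled integer back into an exact decimal value."""
    return Decimal(units) / SCALE


float_diff = 7.01 - 7.00                              # binary floating point
scaled_diff = to_scaled("7.01") - to_scaled("7.00")   # plain integer arithmetic

print(float_diff == 0.01)        # the float result is only close to 0.01
print(from_scaled(scaled_diff))  # the scaled result is exactly 0.01
```

Because the subtraction happens on integers, a filter on exactly 0.01 finds the row.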
You can ALSO choose to use a packed decimal in Access, and it supports out to 28 digits, and you can set the "scale" (the decimal point location). However, since you can't declare a decimal type in VBA, I would suggest using the currency data type both in the table and in VBA code.
If you need more than 4 decimal places, then consider scaling the currency in your code, or perhaps at that point consider using a packed decimal type in the table. Values in VBA will then have to use the "variant" type, and they will correctly take on the data column setting if used in code and assigned a value from the table(s) in question.
Needless to say, the first day you start dealing with computers as ANYTHING beyond an "end user"? Well, this is your first lesson of the day!
"The data in both fields was updated with Round([Field01],2) and Round([Filed2],2) so the numbers in the fields should not be any longer than 2 decimal places." Instead of rounding (which I think is the reason for the scientific notation), you can use Number as the data type, then under Field Size choose Double, then under Decimal Places choose 2.

Is decimal number constraint possible in SQLite?

Issue
I'm using SQLite and I've got a bunch of fields representing measures in millimeters that I'd like to limit to 1 number after decimal point (e.g. 1.2 ; 12.2 ; 122.2 and so on).
I've seen such things as putting DECIMAL(n,1) as the type, for example, and I tried it, but it doesn't seem to constrain the value (I suppose it's because it's not an actual SQLite type).
Do I need to migrate to MySQL for it to work?
EDIT (solution found)
I used Dan04's answer : it's simple and it works really fine :
► Table is as follow :
CREATE TABLE demo(
a REAL CHECK(a = ROUND(a,1)),
b REAL CHECK(b = ROUND(b,1)),
c REAL GENERATED ALWAYS AS (a+b)
)
► Insert correct data : INSERT INTO demo (a,b) values (41.4,22.6)
► Insert bad data : INSERT INTO demo (a,b) values (1.45,22.68) outputs :
Execution finished with errors.
Result: CHECK constraint failed: a = ROUND(a,1)
At line 1:
insert into demo (a,b) values (1.45,22.68)
You can make a CHECK constraint using the ROUND function. Declare the column as:
mm REAL CHECK(mm = ROUND(mm, 1))
But note that the underlying representation is still a binary floating-point number, with the usual caveats about accuracy.
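For anyone who wants to try the constraint without setting up a database file, here is a small sketch using Python's built-in sqlite3 module and an in-memory database. The table name demo and the helper try_insert are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (mm REAL CHECK(mm = ROUND(mm, 1)))")


def try_insert(value):
    """Return True if the row was accepted, False if the CHECK rejected it."""
    try:
        conn.execute("INSERT INTO demo (mm) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        # SQLite reports a failed CHECK constraint as an integrity error.
        return False


results = {value: try_insert(value) for value in (41.4, 22.6, 1.45, 22.68)}
print(results)
```

Values with one decimal place are stored; values with two are refused, so only the accepted rows end up in the table.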
MySQL's DECIMAL(nn,1) will round to 1 decimal place for storing. That's not the same as a constraint.
When displaying data, your app should round the result to a meaningful precision. (One decimal place is arguably over-kill for weather readings.)
In general, measurements (not money) should be stored in FLOAT. This datatype (in MySQL and many other products) provides 7 "significant digits" and a reasonably high range of values.
FLOAT has sufficient precision when used for latitude and longitude to distinguish two vehicles, but not enough precision to distinguish two people embracing.
(Sorry, I can't speak for SQLite. If FLOAT is available then I recommend you use it and round on output.)

roundings with Access

With Microsoft Access 2010, I have two Single fields:
A = 1.1
B = 2.1
I create a query where I have defined C=A*B
Microsoft Access says that C = 2.30999994277954
but, in reality, C =2.31
How can I get the right result (2.31)?
Slightly off results from operations performed on decimal values can happen if your numeric field size is single or double rather than decimal. Single and double (or floating point) numbers are very close approximations of the "true" numbers, but should not be relied upon if accuracy in operations is required. A related Stack Overflow question has more information about this issue: Access comparing floating-point numbers "incorrectly"
If it's possible to modify the underlying table's design, you should change the field size property for the "A" and "B" fields from single to decimal. After changing the field size BUT BEFORE saving the table, you will also need to adjust the Scale property for "A" and "B" from 0 to whatever number of places to the right of the decimal point you might require. You will likely still have a notice about losing data, but if you adjust the field properties correctly before saving the table, this shouldn't be a problem. You should probably make a copy of the table before doing this so that you can verify that there was no data loss. After saving your table and verifying the changes did not result in data loss, your query should represent A * B accurately.
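To see the difference between the two kinds of field outside Access, here is a hedged sketch in Python. The helper to_single is hypothetical and only mimics a Single field by rounding to 32 bits; Python's Decimal stands in for the Access decimal field size.

```python
import struct
from decimal import Decimal


def to_single(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit single."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Single-style arithmetic: both inputs and the product are rounded to 32 bits.
a = to_single(1.1)
b = to_single(2.1)
single_product = to_single(a * b)

# Decimal arithmetic: the inputs are exact, and so is the product.
decimal_product = Decimal("1.1") * Decimal("2.1")

print(single_product)
print(decimal_product)
```

The single-style product lands just below 2.31, in line with the value Access reports, while the decimal product is exactly 2.31.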

Comparison of data in huge databases

I have a database in mysql which has a collection of attributes (ex. 'weight', 'height', 'no of pages' etc) and attribute values (ex. '30 tons', '12 inches', '2 pgs' etc) and mapped with the respective product ids.
The data has been collected from different sites and hence the attribute values have different formats (ex. '222 pgs' or '222 pages' or '222') (ex2. '12 inches', '12 meters', '12 cms').
What I need to do is that I have to compare the values of same attributes of different products. So I have to compare '222 pgs' with '222 pages' for all the attributes which differ in formats.
There are around 4000 attributes and the number will increase further. Is there any way to compare these without having to assign each attribute a specific type individually? Or what is the fastest way to compare these?
Well, until they invent a clairvoyant computer, a human being will have to tell it that pgs and pages mean the same thing and that inches and meters are convertible.
You'll have to sanitize the data one way or another. I'd probably start by identifying units that measure the same dimension [1] and common aliases [2] for each unit, then parse the data to split the quantity from the unit and normalize [3] the unit. Once you have done that, the data becomes directly comparable.
But all this is really just a remedy for a problem that should not have been there in the first place, had the database been designed properly.
[1] A "mass" is a dimension measured by units such as kg, t, lb etc. A "length" is a dimension measured by m, km, in etc.
[2] E.g. in and inch denote exactly the same unit, pgs and pages are the same etc.
[3] I.e. make sure a particular dimension is always represented by the same unit: for example convert all lengths to m, all masses to kg, all pages to pages etc.
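The parse, alias and normalize steps could look roughly like the sketch below. The alias and conversion tables are hypothetical and deliberately tiny; a human still has to fill them in for the real attribute set, and a value with no unit at all needs a per-attribute default that this sketch does not try to guess.

```python
import re
from decimal import Decimal

# Hypothetical alias table: every known spelling maps to one canonical unit.
ALIASES = {
    "pg": "pages", "pgs": "pages", "page": "pages", "pages": "pages",
    "in": "in", "inch": "in", "inches": "in",
    "cm": "cm", "cms": "cm",
    "m": "m", "meter": "m", "meters": "m",
}
# Canonical unit -> (base unit of its dimension, factor to that base unit).
TO_BASE = {
    "pages": ("pages", Decimal("1")),
    "in": ("m", Decimal("0.0254")),
    "cm": ("m", Decimal("0.01")),
    "m": ("m", Decimal("1")),
}
PATTERN = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([A-Za-z]*)\s*$")


def normalize(raw):
    """Split '12 inches' into quantity and unit, then convert to the base unit."""
    match = PATTERN.match(raw)
    if match is None:
        raise ValueError(f"cannot parse {raw!r}")
    quantity = Decimal(match.group(1))
    unit = match.group(2).lower()
    if unit == "":
        return quantity, None  # unit missing: needs a per-attribute default
    canonical = ALIASES[unit]  # a KeyError flags a spelling nobody has mapped yet
    base_unit, factor = TO_BASE[canonical]
    return quantity * factor, base_unit


print(normalize("222 pgs") == normalize("222 pages"))
print(normalize("12 inches"), normalize("12 cms"))
```

Once every value has been reduced to a (quantity, base unit) pair, two products can be compared with an ordinary equality or range check.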
You haven't explained what you want to do after you find out that attributes for a pair of products differ (while still meaning the same thing).
I.e.: if I see that Instance A has field Length set to "12 pgs" and Instance B has Length reporting "12 pages", what do you do?
List this? Autocorrect? Drop one of the two values? Open a window for a human user to correct?
Personally I'd go for a "select attribute,count(*) from X group by attribute" so that you can find out the most common spelling of the unit, and then you can also write corrective scripts that may automatically convert ".. pgs" to " pages" as soon as you have decided the correct representation.
Of course this will not help at all unless you enforce correct spelling of the units, and this requires for sure better input-output filters, including the main UI, but also any kind of bulk uploader utility you may use to create or update products.
A redesign of the DB to add "Unit" as an extra, categorized attribute for each measure would also help a lot.

MySQL: what would be the best approach to ranking from highest to lowest possible match?

I have a MySQL database I'm searching through. Lets say this is a database of people. When querying for a specific record, it is possible to find a match 100% on each attribute. But querying the database to find closest match on probability (closest matches on table attributes) is more of the strategy.
In this scenario, does it make sense to create a temporary table (much like a tally-sheet) to indicate what attributes match/what attributes are present? What is the typical approach to doing advanced searches on database like this?
Example (below) of a hypothetical stored Procedure
*parameters are just to exemplify how I would search. I'm not concerned how to perform my selects. Question is about approach, strategy, technique *
call FindPerson ("Brown Eyes", "Brown hair", "Height:6'1", "white", "Name:Joe", "weight180", "Age 34", "sex m");
RESULT TABLE
NAME   AGE  HEIGHT  WEIGHT  HAIR   SKIN   SEX  RANK_MATCH
Joe    32   6'1     180     Brown  white  m    1
Mike   33   6'1     179     Brown  white  m    2
James  31   6'0     179     Brown  black  m    3
Just off the top of my head: you can create your own score and sort by it. Something like
SELECT `id`,
(IF(`age`=32,1,0)+IF(`height`="6'1",1,0)+...) as `score`
FROM `people`
HAVING `score` > 0
ORDER BY `score` DESC
LIMIT 10;
With this, you can handle every field with its own comparison, and also weight the individual attributes by adding not just 1 but 2 or more.
But I'm not quite sure how performant this is.
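To show the weighting idea end to end, here is a small sketch of the same tally done in application code. The rows are taken from the example result table in the question; the weights and the function names are assumptions made up for this illustration.

```python
# Rows borrowed from the question's example result table.
PEOPLE = [
    {"name": "Joe", "age": 32, "height": "6'1", "hair": "Brown"},
    {"name": "Mike", "age": 33, "height": "6'1", "hair": "Brown"},
    {"name": "James", "age": 31, "height": "6'0", "hair": "Brown"},
]
# Assumed weights: an exact age match counts twice as much as the others.
WEIGHTS = {"age": 2, "height": 1, "hair": 1}


def score(person, wanted):
    """Sum the weights of every attribute that matches the wanted value."""
    return sum(WEIGHTS[field] for field, value in wanted.items()
               if person[field] == value)


def rank(people, wanted, limit=10):
    """Return (score, name) pairs with score > 0, best match first."""
    scored = [(score(person, wanted), person["name"]) for person in people]
    matches = [pair for pair in scored if pair[0] > 0]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)[:limit]


print(rank(PEOPLE, {"age": 32, "height": "6'1", "hair": "Brown"}))
```

This mirrors the HAVING, ORDER BY and LIMIT clauses of the query; whether to do the tally in SQL or in the application is a separate performance question.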
The approach I would use would be to create a scoring function (your stored proc) that would evaluate the given input's standard distance from the mean.
In the proc, you would judge each criteria in a fashion similar to:
INPUT AGE: 32
calculate MEAN of AGE WHERE (sex = m): 34.5
calculate STANDARD DEVIATION of AGE WHERE (sex = m): 2.5
calculate how many STDEVs 32 is from the 34.5 (also known as z-score): 1
Repeat this process for all numeric datatypes, summing them and ORDER BY the sum.
In doing so, the following schema change would be required: height changed from foot/inch form to strictly inches.
Depending on your needs, you may also consider coming up with an arbitrary scale for sex and skin color/hair color. Of course, you may think that measures like these should NOT be factored in because of how drastically they would change the scoring function. If you chose to include them, you'd have to find some number that would be added to the above SUM... but it's hard because nominal variables don't translate easily into these kinds of things.
If you find that hair color/skin color can be usefully mapped onto, say, the continuous color spectrum, your scoring tidbit would be the same... color value of input vs color value of means and standard deviations.
The query that would find your matches would be something to the effect of:
SELECT
ABS(INPUT_AGE - AVG(AGE)) / STD(AGE) AS age_z,
ABS(INPUT_WT - AVG(WT)) / STD(WT) AS wt_z,
...
(age_z + wt_z + ...) AS score
FROM `table`
ORDER BY score ASC
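Here is one possible reading of that idea as a sketch in application code. It is an interpretation, not the query above: each row is scored by how many standard deviations it sits from the input on every numeric attribute, and the lowest total wins. The rows come from the question's example table with height already converted to inches, the field names are invented, and pstdev is used because MySQL's STD() is a population standard deviation.

```python
from statistics import pstdev

# Rows from the question's example table; height is stored strictly in inches.
PEOPLE = [
    {"name": "Joe", "age": 32, "height_in": 73, "weight": 180},
    {"name": "Mike", "age": 33, "height_in": 73, "weight": 179},
    {"name": "James", "age": 31, "height_in": 72, "weight": 179},
]


def rank_by_distance(people, wanted):
    """Sort people by summed distance from the input, in standard deviations."""
    # Spread of each attribute over the whole table.
    # Note: an attribute with zero spread would need special handling.
    spread = {field: pstdev([person[field] for person in people])
              for field in wanted}

    def score(person):
        return sum(abs(person[field] - wanted[field]) / spread[field]
                   for field in wanted)

    return sorted(people, key=score)


wanted = {"age": 34, "height_in": 73, "weight": 180}
ranked = [person["name"] for person in rank_by_distance(PEOPLE, wanted)]
print(ranked)
```

Dividing by the standard deviation puts age, height and weight on a comparable scale, so no single attribute dominates just because its numbers are larger.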