MySQL table design advice - mysql

I have a general question about MySQL database table design. I have a table that contains ~ 650 thousand records, with approximately 100 thousand added per year. The data is requested quite frequently, 1.6 times per second on average.
It has the following structure right now:
id  port_id  date        product1_price  product2_price  product3_price
1   1        2012-01-01  100.00          200.00          155.00
2   2        2012-01-01  NULL            150.00          255.00
3   3        2012-01-01  300.00          NULL            355.00
4   1        2012-01-02  200.00          250.00          355.00
5   2        2012-01-02  400.00          230.00          255.00
Wouldn't it be better to store the data in this manner?
id  port_id  date        product  price
1   1        2012-01-01  1        100
1   2        2012-01-01  1        200
1   3        2012-01-01  1        300
1   1        2012-01-02  1        240
Advantages of the alternative design:
With the second design we don't have to store NULL values (when a port does not have a given product).
We can add new products easily, whereas in the first design each new product requires a new column.
Disadvantages of the alternative design:
The number of records will increase from 650 000 to 650 000 * number_of_products minus all NULL records, which comes to approximately 2.1 million records.
In both cases the id column is the PRIMARY KEY and there is a UNIQUE key on the combination of port_id and date.
So the question is: which way to go? Disk space does not matter, the speed of the queries is the most important aspect.
Thank you for your attention.

It seems that this depends on how the products are defined.
If the set of products is fixed and will never have more than three members, then changing the current design won't help much.
The current design does smell bad, but whether to change it is a business-dependent decision.
BTW, any change must be made with caution, watching for side effects on the product table and everything that uses it.
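For comparison, here is a minimal sketch of the alternative (one row per product) layout. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name port_prices and the helper prices_for are hypothetical. The rows are the first three rows of the wide table in the question. Note one assumption that goes beyond the question: in this layout the UNIQUE key would presumably have to cover (port_id, date, product), since (port_id, date) alone is no longer unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Long layout: one row per (port, date, product). Missing products are
# simply absent, so no NULL prices are stored.
conn.execute("""
    CREATE TABLE port_prices (
        id      INTEGER PRIMARY KEY,
        port_id INTEGER NOT NULL,
        date    TEXT    NOT NULL,
        product INTEGER NOT NULL,
        price   REAL    NOT NULL,
        UNIQUE (port_id, date, product)  -- assumption: key extended by product
    )
""")

# First three rows of the wide table, with the NULL prices dropped.
rows = [
    (1, "2012-01-01", 1, 100.00), (1, "2012-01-01", 2, 200.00), (1, "2012-01-01", 3, 155.00),
    (2, "2012-01-01", 2, 150.00), (2, "2012-01-01", 3, 255.00),
    (3, "2012-01-01", 1, 300.00), (3, "2012-01-01", 3, 355.00),
]
conn.executemany(
    "INSERT INTO port_prices (port_id, date, product, price) VALUES (?, ?, ?, ?)",
    rows,
)


def prices_for(port_id, date):
    """Return {product: price} for one port on one date."""
    cur = conn.execute(
        "SELECT product, price FROM port_prices "
        "WHERE port_id = ? AND date = ? ORDER BY product",
        (port_id, date),
    )
    return dict(cur.fetchall())


print(prices_for(2, "2012-01-01"))
```

The lookup by port_id and date can be served by the leading columns of the unique key, so the extra rows do not by themselves force a full scan.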

Related

Exclude the combination of 2 columns only from my query

Sounds simple but I couldn't find the solution for it.
I have a table with 3 columns. Account, Amount, Date.
I want to get all entries except the ones of one specific account with negative amount. But I still want to get the entries of this account if amount value is positive.
So with this query I'm also not getting the entries from account1 with a positive amount.
select * from table where (account!='account1' AND amount<='0') AND date='2020-05-01'
You can do this using WHERE NOT in your statement.
Example schema:
   Account  Amount  Date
=====================================
1  Ben      200     2020-10-10
2  Frank    200     2020-10-10
3  Ben      -300    2020-10-12
4  Ben      10      2020-10-16
5  Mary     2000    2020-10-16
6  Frank    -200    2020-10-18
7  Ben      -10     2020-10-18
8  Ben      0       2020-10-20
Now if you build your query like this
SELECT * FROM t1 WHERE NOT (account='Ben' AND amount<0);
you should get what you want (all records except the 3rd and 7th).
Edit: if you really only want to exclude records with negative amounts, you need to use < rather than the <= from your example above. It depends on whether you want row 8 (amount 0) to be included in the result or not.
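To check the behaviour on the example rows, here is a small sketch that runs the query through Python's built-in sqlite3 module (an assumption made only so the example is runnable; the WHERE NOT clause itself is plain SQL). The id column is added so the row numbers from the example can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (id INTEGER PRIMARY KEY, account TEXT, amount INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [
        (1, "Ben", 200, "2020-10-10"),
        (2, "Frank", 200, "2020-10-10"),
        (3, "Ben", -300, "2020-10-12"),
        (4, "Ben", 10, "2020-10-16"),
        (5, "Mary", 2000, "2020-10-16"),
        (6, "Frank", -200, "2020-10-18"),
        (7, "Ben", -10, "2020-10-18"),
        (8, "Ben", 0, "2020-10-20"),
    ],
)

# Exclude only Ben's negative amounts; every other row is kept.
kept = [row[0] for row in conn.execute(
    "SELECT id FROM t1 WHERE NOT (account = 'Ben' AND amount < 0) ORDER BY id"
)]

# With <= the zero-amount row 8 is excluded as well.
kept_le = [row[0] for row in conn.execute(
    "SELECT id FROM t1 WHERE NOT (account = 'Ben' AND amount <= 0) ORDER BY id"
)]

print(kept)     # [1, 2, 4, 5, 6, 8]
print(kept_le)  # [1, 2, 4, 5, 6]
```

Frank's negative row (6) stays in both results because the exclusion only applies when both conditions hold.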

Best way to select n-th rows based on data in a field for mySQL table

The final result of this will be used for a graphing application where sometimes we would not want the detailed granularity of data at the level it is stored in the table. This may be hard to phrase in a single question so I will give an example:
Example table:
DateTime AddressID Amount
1/1/2015 10:00:00 1 10
1/1/2015 10:00:00 2 8
1/1/2015 10:01:00 1 7
1/1/2015 10:01:00 2 12
1/1/2015 10:02:00 1 21
1/1/2015 10:02:00 2 15
etc...
Note: The times will always have 00 for the seconds - if that helps.
Note: There may NOT always be an entry for every minute, although there generally should be, so it is possible that some times are skipped. But whenever a time is present, there will always be an entry for both addressIDs (1 & 2) without fail.
I need to return the above 3 fields, in a period of time requested (for example past 24 hours), but only for certain increments of time FOR EACH OF THE ADDRESS ID's. For example, records for every 5 minutes, or every 10 minutes.
so in the case of 5 minutes it would return:
DateTime           AddressID  Amount
1/1/2015 10:00:00  1          10
1/1/2015 10:00:00  2          8
1/1/2015 10:05:00  1          11
1/1/2015 10:05:00  2          17
1/1/2015 10:10:00  1          28
1/1/2015 10:10:00  2          5
etc...
Performance is very important. I hope I explained that well enough for someone to get the idea of what I need and I thank you in advance for your suggestions.
EDIT: For clarification, the 5 minutes in the above example should be the minimum time BETWEEN each row. So, if in the above example, on the rare chance that there was a missing time entry for 10:05:00 it should not simply select the 10:10:00 row, it should select the 10:06:00 record and then the next row selected would be 10:11:00, etc.
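The rule in the EDIT (keep a row only if at least the requested interval has passed since the last kept row for the same AddressID) can be sketched as a single pass over rows sorted by time. This is only an illustration in application code, not a SQL solution; the function name downsample and the amounts in the sample data are made up for the example.

```python
from datetime import datetime, timedelta


def downsample(rows, min_gap):
    """Keep, per address, only rows at least min_gap after the last kept row.

    rows: iterable of (datetime, address_id, amount) tuples in any order.
    """
    last_kept = {}  # address_id -> timestamp of the last row kept for it
    result = []
    for ts, address_id, amount in sorted(rows):
        previous = last_kept.get(address_id)
        if previous is None or ts - previous >= min_gap:
            result.append((ts, address_id, amount))
            last_kept[address_id] = ts
    return result


# Sample data: one row per minute from 10:00 to 10:12 for both addresses,
# with the 10:05 entries missing. The amount is a placeholder value.
start = datetime(2015, 1, 1, 10, 0)
rows = [
    (start + timedelta(minutes=m), address_id, m)
    for m in range(13) if m != 5
    for address_id in (1, 2)
]

sampled = downsample(rows, timedelta(minutes=5))
kept_minutes = [ts.minute for ts, address_id, _ in sampled if address_id == 1]
print(kept_minutes)  # [0, 6, 11]
```

Because 10:05 is missing, the 10:06 row is kept and the next kept row is 10:11, which matches the behaviour described in the EDIT.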

Mysql query contains

Table
id name(varchar)
2 15
3 15,23
4 1315,424
5 1512,2323
6 23,15,345
7 253,234,15
I need to find the rows whose list contains the value 15, which means I need ids 2, 3, 6 and 7, but not 4 and 5.
The above is sample data; in practice it can be any number.
Can anyone please help me?
If your database is small, consider using find_in_set function:
select * from your_table
where find_in_set('15',name);
If you have a big table, consider changing the model to a master-detail pair of tables to increase the speed.
This is the kind of relational model you could adopt to make this an easy problem to solve:
TABLE: records
id
2
3
4
5
6
7
TABLE: values
record_id value
2 15
3 15
3 23
4 1315
4 424
5 1512
5 2323
6 23
6 15
6 345
7 253
7 234
7 15
Then you can query:
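-- values is a reserved word in MySQL, so the table name must be quoted with backticks
SELECT DISTINCT id FROM records
INNER JOIN `values` ON records.id = `values`.record_id AND `values`.value = 15;
With this layout MySQL's query optimizer can use an index on the value column, which is not possible when the numbers are packed into a comma-separated string.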
Not that it's impossible to do what you're trying to do, but it kind of misses the point.
If you're already storing data in this format, you should write a one-time migration to transfer it to this "normalized" format in the programming language of your choice, using something like Java's split or PHP's explode.
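A one-time migration of that kind could look roughly like the sketch below. It uses Python's built-in sqlite3 module in place of MySQL so that it runs as-is, and the detail table is called record_values here (a hypothetical name, chosen to avoid the reserved word values). The source table and column names follow the find_in_set example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing comma-separated data, as in the question.
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(2, "15"), (3, "15,23"), (4, "1315,424"),
     (5, "1512,2323"), (6, "23,15,345"), (7, "253,234,15")],
)

# Normalized detail table: one row per (record, value), indexed by value.
conn.execute(
    "CREATE TABLE record_values (record_id INTEGER NOT NULL, value INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_record_values_value ON record_values (value)")

# One-time migration: split each list and insert one row per element.
for record_id, name in conn.execute("SELECT id, name FROM your_table").fetchall():
    conn.executemany(
        "INSERT INTO record_values VALUES (?, ?)",
        [(record_id, int(part)) for part in name.split(",")],
    )

matching_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT record_id FROM record_values WHERE value = 15 ORDER BY record_id"
)]
print(matching_ids)  # [2, 3, 6, 7]
```

After the migration the lookup is an exact comparison on an integer column, so 1315 and 1512 no longer need any special handling.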

SQL Server 2008: Creating dynamic column names

I have a problem that I cannot solve. I work on Microsoft SQL Server 2008 and I have a table with four columns
Id
Date (2013-07, 2013-08, 2011-03, etc)
Amount 1 (100, 150, etc.)
Amount 2 (100, 80, etc.)
If Amount 1 > 150 then I need to create new columns with the values in Date as column names and distribute Amount 2 into 6 (date) periods starting one month after the Date value.
It should look like this:
Id  Date     Amount 1  Amount 2
----------------------------------
1   2013-07  160       60
2   2013-10  180       80
Id  Date     Amount 1  2013-08  2013-09  2013-10  2013-11  2013-12  2014-01  ...
--------------------------------------------------------------------------------
1   2013-07  160       10       10       10       10       10       10
2   2013-10  180                                  20       20       20...
I don't know how to do this and any help is highly appreciated! Thank you!
The table itself should not have these additional columns because that would be a denormalized table structure. That's a poor way to store data in many cases. But you can easily do a query against your existing table that will return the additional columns in the form you want, so that you can display it this way. Check out PIVOT and UNPIVOT.
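Before pivoting, each qualifying row has to be expanded into six (period, share) pairs. The sketch below shows only that expansion step, in Python rather than T-SQL, to make the arithmetic concrete; the function names next_months and spread and the dictionary keys are hypothetical, and an even split of Amount 2 is assumed, as in the first row of the question's example.

```python
def next_months(year_month, count):
    """Return the `count` months following a 'YYYY-MM' string."""
    year, month = map(int, year_month.split("-"))
    months = []
    for _ in range(count):
        month += 1
        if month > 12:
            year, month = year + 1, 1
        months.append(f"{year:04d}-{month:02d}")
    return months


def spread(row, periods=6, threshold=150):
    """Split amount2 evenly over the following periods if amount1 > threshold."""
    if row["amount1"] <= threshold:
        return {}
    share = row["amount2"] / periods
    return {month: share for month in next_months(row["date"], periods)}


shares = spread({"id": 1, "date": "2013-07", "amount1": 160, "amount2": 60})
print(shares)
```

For the first example row this yields 10.0 for each month from 2013-08 through 2014-01; a query-side PIVOT would then turn those period values into columns for display.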

How to setup MySQL table to follow a variable over time?

Say I have several registered users in my website.
Users are saved on a single table 'users' that assigns a unique id for each one of them.
I want to allow my users to track their expenses, miles driven, temperature, etc.
I can't be sure each user will always enter a value for all trackable variables when they login -- so an example of what could happen would be:
'example data'
user date amount miles temp etc
1 3/1/2010 $10.00 5 54
2 3/1/2010 $20.00 15
1 3/12/2010 5 55
1 3/15/2010 $10.00 25 51
3 3/20/2010 45
3 4/12/2010 $20.00 10 54
What is the best way to set up my tables for this situation?
Should I create a table exclusive to each user when they register? (could end up with thousands of user-exclusive tables)
'user-1 table'
date amount miles temp etc
3/1/2010 $10.00 5 54
3/12/2010 5 55
3/15/2010 $10.00 25 51
'user-3 table'
date amount miles temp etc
3/20/2010 45
4/12/2010 $20.00 10 54
and so on...
Should I create a single table that is essentially the same as the example data above? (could end up with a gigantic table that needs to be combed to find rows with requested user id's).
'user data table'
user date amount miles temp etc
1 3/1/2010 $10.00 5 54
2 3/1/2010 $20.00 15
1 3/12/2010 5 55
1 3/15/2010 $10.00 25 51
3 3/20/2010 45
3 4/12/2010 $20.00 10 54
Any suggestions?
Databases are built to handle similar data as a set together.
What you want is a single user-data-table, with multiple users in the same table split by user_id. You might want to further normalize that though, so that it stores:
user date type units
1 3/1/2010 dollars 10.00
1 3/1/2010 miles 5
1 3/1/2010 temp 54
2 3/1/2010 dollars 20.00
2 3/1/2010 miles 15
1 3/12/2010 miles 5
1 3/12/2010 temp 55
etc
Or even further if the user+date makes a specific trip
trip-table
tripid user date
========= ======== =========
1 1 3/1/2010
type-table
typeid description
========= ============
1 dollars
2 miles
etc
trip-data
tripid type units
========= ======== =======
1 1 10.00
1 2 5
etc
However, if you will always (or almost always) show your data in the form as entered, with the data pivoted on all the input columns (like a spreadsheet), then you would be better off sticking to the un-normalised form for brevity, programmability and performance.
"could end up with a gigantic table that needs to be combed to find rows with requested user id's"
Assuming you employ indexes properly and judiciously, modern RDBMSs are built to handle gigantic amounts of data. The indexes allow queries to seek only the data they need, so there is normally little penalty in keeping it all in one table.
No, just create one table with all the possible fields as nullable columns. If a user hasn't filled in a parameter, just keep a NULL value there.
"could end up with a gigantic table that needs to be combed to find rows with requested user id's"
Yes, and the query will be fast enough if you add an index on the user_id field (for queries like WHERE user_id = 42).
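As a rough illustration of the single-table-plus-index advice, the sketch below builds such a table with Python's built-in sqlite3 module (used here in place of MySQL so the example is runnable) and asks the engine how it would execute a per-user lookup. The table and index names are hypothetical, and the three rows are taken from the example data above with dates rewritten in ISO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for all users; untracked values are simply left NULL.
conn.execute("""
    CREATE TABLE user_data (
        user_id INTEGER NOT NULL,
        date    TEXT    NOT NULL,
        amount  REAL,
        miles   REAL,
        temp    REAL
    )
""")
conn.execute("CREATE INDEX idx_user_data_user ON user_data (user_id)")

conn.executemany(
    "INSERT INTO user_data VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2010-03-01", 10.00, 5, 54),
        (2, "2010-03-01", 20.00, 15, None),
        (1, "2010-03-12", None, 5, 55),
    ],
)

# Ask the engine for its plan; the last column holds the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM user_data WHERE user_id = ?", (1,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

rows_for_user_1 = conn.execute(
    "SELECT date, amount, miles, temp FROM user_data WHERE user_id = ? ORDER BY date",
    (1,),
).fetchall()
print(rows_for_user_1)
```

The plan text should name idx_user_data_user, meaning the lookup goes through the index instead of combing the whole table. In MySQL the equivalent check would be an EXPLAIN on the same SELECT.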