I added a new column in one of my database tables. I'd like to try to populate that column for previous records. New records will be validated through forms. Here is an example.
| Quantity | Description | Price | Amount | Type |
----------------------------------------------------------
| 3 | Storage for Pallets | 3.99 | 11.97 | NULL |
| 3 | Handling for Pallets| 3.99 | 11.97 | NULL |
| 3 | Misc expense | 3.99 | 11.97 | NULL |
----------------------------------------------------------
I'd like to replace those NULL values based on keywords in the description. For example, the updated table would look like the following.
| Quantity | Description | Price | Amount | Type |
--------------------------------------------------------------
| 3 | Storage for Pallets | 3.99 | 11.97 | Storage |
| 3 | Handling for Pallets| 3.99 | 11.97 | Handling |
| 3 | Misc expense | 3.99 | 11.97 | Misc |
--------------------------------------------------------------
Any ideas on an update statement that will accomplish this?
Because there is no primary key, you have to put every column's value into the WHERE clause, or at least enough of them that, based on your knowledge of the data, there is no chance of the update hitting two or more rows.
Here is the first update as an example:
update tbl
set "Type" = 'Storage'
where "Quantity" = 3
and "Description" = 'Storage for Pallets'
and "Price" = 3.99
and "Amount" = 11.97;
Fiddle:
http://sqlfiddle.com/#!12/9fa64/1/0
If there were a primary key, your WHERE clause would be a lot simpler:
where pk_id_field = x
(Because you could rest assured knowing you're about to update the exact row needed, and that no other rows have that value)
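A minimal runnable sketch of the approach above, using SQLite via Python (table and data mirror the question; only the dialect differs from the fiddle):

```python
import sqlite3

# With no primary key, the WHERE clause repeats every column's value
# to pin down exactly one row before updating it.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tbl (
    Quantity INTEGER, Description TEXT, Price REAL, Amount REAL, Type TEXT)""")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", [
    (3, 'Storage for Pallets', 3.99, 11.97, None),
    (3, 'Handling for Pallets', 3.99, 11.97, None),
    (3, 'Misc expense', 3.99, 11.97, None),
])

conn.execute("""UPDATE tbl
                SET Type = 'Storage'
                WHERE Quantity = 3
                  AND Description = 'Storage for Pallets'
                  AND Price = 3.99
                  AND Amount = 11.97""")

print(conn.execute(
    "SELECT Type FROM tbl WHERE Description = 'Storage for Pallets'").fetchone()[0])
# -> Storage
```

The other rows keep their NULL Type, since the full-column WHERE only matches the one intended row.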
It was simpler than I thought
UPDATE
invoice_line_items
SET
line_item_type = 'Storage'
WHERE
description LIKE '%storage%';
.....and so on
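A quick runnable sketch of that LIKE-based update, here in SQLite via Python (note single quotes for string literals, which is the portable form; MySQL's default mode also accepts double quotes):

```python
import sqlite3

# One UPDATE per keyword: rows whose description contains 'storage'
# (LIKE is case-insensitive for ASCII in SQLite, as in MySQL) get typed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice_line_items (description TEXT, line_item_type TEXT)")
conn.executemany("INSERT INTO invoice_line_items (description) VALUES (?)",
                 [("Storage for Pallets",), ("Misc expense",)])

conn.execute("""UPDATE invoice_line_items
                SET line_item_type = 'Storage'
                WHERE description LIKE '%storage%'""")

print(conn.execute(
    "SELECT line_item_type FROM invoice_line_items ORDER BY description").fetchall())
# -> [(None,), ('Storage',)]
```

Rows with no matching keyword are left untouched, so each keyword's UPDATE can be run independently.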
Actually: If you always want the description's first word for the type, you can go with
UPDATE Invoice_Line_Items
SET line_item_type = SUBSTRING_INDEX(description, ' ', 1);
See SQL Fiddle. You could, of course, add a WHERE clause, if it is not just as straightforward for the whole table… And you are not limited to the first word either…
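SUBSTRING_INDEX is MySQL-specific; this sketch emulates the same first-word extraction in SQLite (via Python) with substr()/instr(), assuming every description contains at least one space:

```python
import sqlite3

# Emulate MySQL's SUBSTRING_INDEX(description, ' ', 1) in SQLite:
# substr(d, 1, instr(d, ' ') - 1) is everything before the first space.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice_line_items (description TEXT, line_item_type TEXT)")
conn.executemany("INSERT INTO invoice_line_items (description) VALUES (?)",
                 [("Storage for Pallets",), ("Handling for Pallets",), ("Misc expense",)])

conn.execute("""UPDATE invoice_line_items
                SET line_item_type = substr(description, 1, instr(description, ' ') - 1)""")

for desc, typ in conn.execute(
        "SELECT description, line_item_type FROM invoice_line_items ORDER BY rowid"):
    print(desc, '->', typ)
# Storage for Pallets -> Storage
# Handling for Pallets -> Handling
# Misc expense -> Misc
```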
I created a table (t_subject) like this
| id | description | enabled |
|----|-------------|---------|
| 1 | a | 1 |
| 2 | b | 1 |
| 3 | c | 1 |
And another table (t_place) like this
| id | description | enabled |
|----|-------------|---------|
| 1 | d | 1 |
| 2 | e | 1 |
| 3 | f | 1 |
Right now data from t_subject is used for each of t_place's records, to show HTML dropdowns with all the results from t_subject.
So I simply do
SELECT * FROM t_subject WHERE enabled = 1
Now, for just one of the t_place records, one record from t_subject should be hidden.
I don't want to simply delete it with JavaScript, since I want to be able to customize all of the dropdowns if anything changes.
So the first thing I thought of was to add a place_id column to t_subject.
But this means I would have to duplicate all of t_subject's records: I would have 3 of each, except one that would have 2.
Is there any way to avoid this?
I thought of adding an id_exclusion column to t_subject, so I would only duplicate a record whenever it is excluded for another id from t_place.
How bad would that be? This way I would have no duplicates, so far.
Hope all of this makes sense.
While you only need to exclude one course, I would still recommend setting up a full 'place-course' association. You essentially have a many-to-many relationship, despite not explicitly linking your tables.
I would recommend an additional 'bridging' or 'associative entity' table to represent which courses are offered at which places. This new table would have two columns - one foreign key for the ID of t_subject, and one for the ID of t_place.
For example (t_place_course):
| place_id | course_id |
|----------|-----------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 3 |
As you can see in my example above, place 3 doesn't offer course 2.
From here, you can simply query all of the courses available for a place by querying the place_id:
SELECT * from t_place_course WHERE place_id = 3
The above will return both courses 1 and 3.
You can optionally use a JOIN to get the other information about the course or place, such as the description:
SELECT `t_course`.`description`
FROM `t_course`
INNER JOIN `t_place_course`
ON `t_course`.`id` = `t_place_course`.`course_id`
INNER JOIN `t_place`
ON `t_place`.`id` = `t_place_course`.`place_id`;
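The whole associative-table idea can be exercised end to end; here is a sketch in SQLite via Python, with the sample data from the answer:

```python
import sqlite3

# t_place_course links places to courses; a JOIN pulls course
# descriptions for one place. Names follow the answer; data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_course (id INTEGER PRIMARY KEY, description TEXT, enabled INTEGER);
CREATE TABLE t_place  (id INTEGER PRIMARY KEY, description TEXT, enabled INTEGER);
CREATE TABLE t_place_course (place_id INTEGER, course_id INTEGER);
INSERT INTO t_course VALUES (1,'a',1),(2,'b',1),(3,'c',1);
INSERT INTO t_place  VALUES (1,'d',1),(2,'e',1),(3,'f',1);
INSERT INTO t_place_course VALUES (1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,3);
""")

# Courses offered at place 3 (place 3 doesn't offer course 2):
offered = conn.execute("""
    SELECT t_course.description
    FROM t_course
    JOIN t_place_course ON t_course.id = t_place_course.course_id
    WHERE t_place_course.place_id = 3
""").fetchall()
print(sorted(d for (d,) in offered))
# -> ['a', 'c']
```

Hiding a course from one dropdown is then just deleting one row from t_place_course, with no duplication in t_course.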
I need some help with a MySQL database design. The database should handle about 150 million records a year. I want to use the MyISAM engine.
The data structure:
Car brand (>500 brands)
Every car brand has 30+ car models
Every car model has the same 5 values, some model have additional values
Every value has exactly 3 fields:
timestamp
quality
actual value
The car brand itself can also have some values with the same fields
The values are tracked every 5 minutes -> 105120 records a year
About the data:
The field quality should always be 'good'; when it is not, I need to know.
The field timestamp is usually the same across values, but at least one value has a different timestamp.
Deviation: 1-60 seconds
If a value has a different timestamp, it always has a different timestamp.
Sometimes I don't get data at all because the source server is down.
How I want to use the data:
Visualisations in charts (time and actual value) with a selection of values
Aggregation of some values for every brand
My Questions:
I thought it was a good idea to split the data into different tables, so I put every brand in a separate table. To find the table by car brand name I created an index table. Is this good practice?
Would it be better to create tables for every car model (about 1,500 tables)?
Should I store the quality (if it is not 'good') and the deviation of the timestamp in a separate table?
Any other suggestions?
Example:
Table: car_brand
| car_brand | tablename | Address |
|-----------|-----------|-------------|
| BMW | bmw_table | the address |
| ... | ... | ... |
Table: bmw_table (105,120 records x 30+ car models = more than 3.2 million records per year)
| car_model | timestamp_usage | quality_usage | usage | timestamp_fuel_consumed | quality_fuel_consumed | fuel_consumed | timestamp_kilometer | quality_kilometer | kilometer | timestamp_revenue | quality_revenue | revenue | ... |
|-------------|---------------------|---------------|-------|-------------------------|-----------------------|---------------|---------------------|-------------------|-----------|---------------------|-----------------|---------|-----|
| Z4 | 2015-12-12 12:12:12 | good | 5% | 2015-12-12 12:12:12 | good | 10.6 | 2015-12-12 12:11:54 | good | 120 | null | null | null | ... |
| Z4 | 2015-12-12 12:17:12 | good | 6% | 2015-12-12 12:17:12 | good | 12.6 | 2015-12-12 12:16:54 | good | 125 | null | null | null | ... |
| brand_value | null |null | null | null | null | null | null | null | null | 2015-12-12 12:17:12 | good | 1000 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
And the other brand tables..
Edit: Queries and quality added
Possible Queries
Note: I assume that the table bmw_table has an extra column that is called car_brand and the table name is simple_table instead of bmw_table to reduce complexity.
SELECT car_brand, sum(revenue), avg(usage)
FROM simple_table
WHERE timestamp_usage >= '2015-10-01 00:00:00' AND timestamp_usage <= '2015-10-31 23:59:59'
GROUP BY car_brand;
SELECT timestamp_usage, usage, revenue, fuel_consumed, kilometer
FROM simple_table
WHERE timestamp_usage >= '2015-10-01 00:00:00' AND timestamp_usage <= '2015-10-31 23:59:59';
Quality Values
I collect the data from an OPC server, so the quality field contains one of the following values:
bad
badConfigurationError
badNotConnected
badDeviceFailure
badSensorFailure
badLastKnownValue
badCommFailure
badOutOfService
badWaitingForInitialData
uncertain
uncertainLastUsableValue
uncertainSensorNotAccurate
uncertainEUExceeded
uncertainSubNormal
good
goodLocalOverride
Thanks in advance!
Droider
Do not have a separate table per brand. There is no advantage, only unnecessary complexity. Nor one table per model. In general, if two tables look the same, the data should be combined into a single table. In your example, that one table would have brand and model as columns.
Indexes are your friend for performance. Let's see the queries you will perform, so we can discuss the optimal indexes.
What will you do if the data quality is not 'good'? Simply display "good" or "not good"?
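One possible single-table shape (an assumption for illustration, not the only layout): brand, model, and metric name become columns, so every reading is one row, and a composite index supports the time-range queries from the question. A sketch in SQLite via Python:

```python
import sqlite3

# Long/narrow layout: one row per reading, indexed for range scans.
# Table and column names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (
    brand   TEXT NOT NULL,
    model   TEXT NOT NULL,
    metric  TEXT NOT NULL,      -- 'usage', 'fuel_consumed', 'kilometer', ...
    ts      TEXT NOT NULL,      -- timestamp of the reading
    quality TEXT NOT NULL,      -- OPC quality code
    value   REAL
);
CREATE INDEX idx_readings_metric_ts ON readings (metric, ts);
""")
conn.executemany("INSERT INTO readings VALUES (?,?,?,?,?,?)", [
    ('BMW', 'Z4', 'usage', '2015-12-12 12:12:12', 'good', 5.0),
    ('BMW', 'Z4', 'usage', '2015-12-12 12:17:12', 'good', 6.0),
])

# Per-brand aggregation over a time range, as in the example queries:
avg_usage = conn.execute("""
    SELECT brand, AVG(value) FROM readings
    WHERE metric = 'usage'
      AND ts BETWEEN '2015-12-01 00:00:00' AND '2015-12-31 23:59:59'
    GROUP BY brand
""").fetchall()
print(avg_usage)
# -> [('BMW', 5.5)]
```

Adding a new brand, model, or metric then requires no new table, only new rows.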
I currently have a database set up where I store CSVs. I have been reading that storing CSV data in a single column is bad practice in MySQL.
My current setup is this:
----------------------------------------------
| id | Exercise | Set | Reps | date |
----------------------------------------------
| 1 | Value 1, | Value 1, | Value 1, | 01/01/16 |
| | Value 2, | Value 2, | Value 2, | |
| | Value 3, | Value 3, | Value 3, | |
----------------------------------------------
When a user submits data they can add as many new 'Exercise' inputs (and in turn values) as they want (there could be up to 50), but only 10 'Set' and 'Reps'. For example:
<input name="exercise1[]">
<input name="set[]"><input name="reps[]">
<input name="set[]"><input name="reps[]">
<input name="set[]"><input name="reps[]">
<input name="exercise2[]">
<input name="set1[]"><input name="reps1[]">
<input name="set1[]"><input name="reps1[]">
<input name="set1[]"><input name="reps1[]">
This is the way it currently works and is working fine but I want to know:
Should I change the way I am storing this data?
If so, I'm unsure how I should save it. Is storing multiple rows for one form submission a bad idea as well?
The only way I can see to allow 'UNLIMITED' exercise values without CSVs is the one below (which uses multiple rows per form submission) and sets up 10 columns in my database for each 'set' and 'reps' (as I mentioned earlier there can only be 10 'Set' and 'Reps' values):
---------------------------------------------------------------------
| Exercise | Set | Reps | Set2 | Reps2 | date |
---------------------------------------------------------------------
| Value 1 | Value 1 | Value 2 | Value 1 | Value 2 | 01/01/16 |
---------------------------------------------------------------------
| Value 2 | Value 2 | Value 2 | Value 1 | Value 2 | 01/01/16 |
---------------------------------------------------------------------
| Value 3 | Value 3 | Value 3 | Value 1 | Value 2 | 01/01/16 |
---------------------------------------------------------------------
Please help me setup correctly before I go even further through my developing!
There are always different designs. Pick the one that best suits your situation.
1 - Design in one table, as per your example, when you are pretty sure the structure is not changing, e.g. exactly x sets and y reps:
id, exercise, set1, set2, set3, rep1, rep2, ...
2 - If you are unsure how the data will grow, you can design a meta table:
id exercise_id meta value
-- ----------- ---- ---------
1 1 set1 value1
2 1 set2 value2
3 1 rep1 value3
4 2 set1 value5
....
3 - One-Many relationship tables - the traditional method:
exercises (id, date, ...)
exercise_sets(id, exercise_id, value)
exercise_reps(id, exercise_id, value)
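Option 3 above can be sketched end to end; here in SQLite via Python, with made-up data (each form submission becomes one exercises row and any number of child set/rep rows):

```python
import sqlite3

# Traditional one-to-many layout: child tables reference the parent
# exercise by foreign key, so the number of sets/reps is unlimited.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exercises     (id INTEGER PRIMARY KEY, name TEXT, date TEXT);
CREATE TABLE exercise_sets (id INTEGER PRIMARY KEY, exercise_id INTEGER, value INTEGER);
CREATE TABLE exercise_reps (id INTEGER PRIMARY KEY, exercise_id INTEGER, value INTEGER);
""")

cur = conn.execute("INSERT INTO exercises (name, date) VALUES ('Squat', '01/01/16')")
ex_id = cur.lastrowid

# Any number of set/rep pairs hang off the one exercise row:
for s, r in [(1, 10), (2, 8), (3, 6)]:
    conn.execute("INSERT INTO exercise_sets (exercise_id, value) VALUES (?, ?)", (ex_id, s))
    conn.execute("INSERT INTO exercise_reps (exercise_id, value) VALUES (?, ?)", (ex_id, r))

n_sets = conn.execute("SELECT COUNT(*) FROM exercise_sets WHERE exercise_id = ?",
                      (ex_id,)).fetchone()[0]
print(n_sets)
# -> 3
```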
I have table:
+----+--------+----------+
| id | doc_id | next_req |
+----+--------+----------+
| 1 | 1 | 4 |
| 2 | 1 | 3 |
| 3 | 1 | 0 |
| 4 | 1 | 2 |
+----+--------+----------+
id - auto-increment primary key.
next_req - represents the order of records (next_req = id of the next record).
How can I build a SQL query get records in this order:
+----+--------+----------+
| id | doc_id | next_req |
+----+--------+----------+
| 1 | 1 | 4 |
| 4 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 1 | 0 |
+----+--------+----------+
Explains:
record1 with id=1 and next_req=4 means: next must be record4 with id=4 and next_req=2
record4 with id=4 and next_req=2 means: next must be record2 with id=2 and next_req=3
record2 with id=2 and next_req=3 means: next must be record3 with id=3 and next_req=0
record3 with id=3 and next_req=0 means: this is the last record
I need to store an order of records in the table. It's important for me.
If you can, change your table format. Rather than naming the next record, mark the records in order so you can use a natural SQL sort:
+----+--------+------+
| id | doc_id | sort |
+----+--------+------+
| 1 | 1 | 1 |
| 4 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
+----+--------+------+
Then you can even cluster-index on (doc_id, sort) if you need to for performance. And honestly, if you ever need to re-order rows, it is no more work than maintaining the linked list you were working with.
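A quick sketch of the sort-column idea, in SQLite via Python: instead of next_req pointers, each row stores its position, and ORDER BY recovers the sequence directly.

```python
import sqlite3

# The sort column replaces the linked list: a plain ORDER BY yields
# the desired sequence 1 -> 4 -> 2 -> 3 with no recursive traversal.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, doc_id INTEGER, sort INTEGER)")
conn.executemany("INSERT INTO docs VALUES (?,?,?)",
                 [(1, 1, 1), (4, 1, 2), (2, 1, 3), (3, 1, 4)])

ordered = conn.execute(
    "SELECT id FROM docs WHERE doc_id = 1 ORDER BY sort").fetchall()
print([i for (i,) in ordered])
# -> [1, 4, 2, 3]
```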
I can give you a solution in Oracle:
select id,doc_id,next_req from table2
start with id =
(select id from table2 where rowid=(select min(rowid) from table2))
connect by prior next_req=id
fiddle_demo
I'd suggest modifying your table and adding another column, OrderNumber, so that eventually it would be easy to order by that column.
Though there may be problems with this approach:
1) You have an existing table and need to set the OrderNumber column values. I guess this part is easy: you can set initial zero values and then, with a CURSOR for example, move through your records and increment the order number value.
2) When a new row appears in your table, you have to set its OrderNumber, and here it depends on your particular situation. If you only need to add items to the end of the list, you can set the new value as MAX + 1. Otherwise, you may try writing a TRIGGER on insert that performs steps similar to point 1). This may hurt performance badly, so you have to carefully investigate your architecture and maybe rework this unusual construction.
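The append-at-the-end case from point 2) can be sketched directly; here in SQLite via Python (COALESCE handles the empty-table case):

```python
import sqlite3

# Append rows with OrderNumber = MAX + 1, done inside the INSERT itself
# so no separate read-then-write round trip is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, OrderNumber INTEGER)")

for _ in range(3):
    conn.execute("""INSERT INTO docs (OrderNumber)
                    SELECT COALESCE(MAX(OrderNumber), 0) + 1 FROM docs""")

print(conn.execute("SELECT OrderNumber FROM docs ORDER BY id").fetchall())
# -> [(1,), (2,), (3,)]
```

Note that under concurrent writers this pattern needs a transaction or lock around each insert, or two sessions could compute the same MAX.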
I have a table with pairs of matching records that I query like this:
select id,name,amount,type from accounting_entries
where name like "%05" and amount != 0 order by name limit 10;
Results:
+------+----------------------+--------+-------+
| id | name | amount | type |
+------+----------------------+--------+-------+
| 786 | D-1194-838HELLUJP-05 | -5800 | DEBIT |
| 785 | D-1194-838HELLUJP-05 | -5800 | DEBIT |
| 5060 | D-1195-UOK4HS5POF-05 | -5000 | DEBIT |
| 5059 | D-1195-UOK4HS5POF-05 | -5000 | DEBIT |
| 246 | D-1196-0FUCJI66BX-05 | -7000 | DEBIT |
| 245 | D-1196-0FUCJI66BX-05 | -7000 | DEBIT |
| 9720 | D-1197-W2J0EC1BOB-05 | -6500 | DEBIT |
| 9719 | D-1197-W2J0EC1BOB-05 | -6500 | DEBIT |
| 2694 | D-1198-MFKIKHGW0S-05 | -5500 | DEBIT |
| 2693 | D-1198-MFKIKHGW0S-05 | -5500 | DEBIT |
+------+----------------------+--------+-------+
10 rows in set (0.01 sec)
I need to perform an update so that the resulting data will look like this:
+------+----------------------+--------+--------+
| id | name | amount | type |
+------+----------------------+--------+--------+
| 786 | D-1194-838HELLUJP-05 | -5800 | DEBIT |
| 785 | C-1194-838HELLUJP-05 | 5800 | CREDIT |
| 5060 | D-1195-UOK4HS5POF-05 | -5000 | DEBIT |
| 5059 | C-1195-UOK4HS5POF-05 | 5000 | CREDIT |
| 246 | D-1196-0FUCJI66BX-05 | -7000 | DEBIT |
| 245 | C-1196-0FUCJI66BX-05 | 7000 | CREDIT |
| 9720 | D-1197-W2J0EC1BOB-05 | -6500 | DEBIT |
| 9719 | C-1197-W2J0EC1BOB-05 | 6500 | CREDIT |
| 2694 | D-1198-MFKIKHGW0S-05 | -5500 | DEBIT |
| 2693 | C-1198-MFKIKHGW0S-05 | 5500 | CREDIT |
+------+----------------------+--------+--------+
10 rows in set (0.01 sec)
One entry should negate the other entry. It doesn't matter if I update the first or second matching record, what matters is that one has a positive amount and the other has a negative amount. And the type and name need to be updated.
Any clues on how to do this? What would the update command look like? Maybe using a group by clause? I have some ideas on how to do it with a stored procedure, but can I do it with a simple update?
Try this:
UPDATE accounting_entries
SET name = CONCAT('C', SUBSTRING(name, 2)),
    amount = amount * -1,
    type = 'CREDIT'
WHERE id IN
    (SELECT MIN(id) FROM
        (SELECT * FROM accounting_entries) as temp
     GROUP BY name);
The key is the subquery in the WHERE clause, which limits the updates to the lowest ID for each name value. The assumption is that the lower ID is the one you always want to update; if that is not correct, adjust the subquery to whatever rule you would use.
Edit: Updated the subquery based on the technique found here, due to the limitation in MySQL described here.
This query gives a method for updating all records at once, as that seemed to be what the OP was looking for. However, the most efficient way may be to enumerate the records in application code (PHP, ASP.NET, etc.) and update the rows that need to change through code. That would eliminate the performance issues inherent in running updates off subqueries in MySQL.
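A runnable sketch of the pair update, using SQLite via Python: string concatenation is || or substr() composition here where MySQL uses CONCAT, and SQLite allows the UPDATE's subquery to reference the same table directly, so the derived-table workaround is not needed.

```python
import sqlite3

# The lower id of each name pair becomes the CREDIT side: flip the
# leading 'D' to 'C', negate the amount, and change the type.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE accounting_entries
                (id INTEGER, name TEXT, amount INTEGER, type TEXT)""")
conn.executemany("INSERT INTO accounting_entries VALUES (?,?,?,?)", [
    (786,  'D-1194-838HELLUJP-05', -5800, 'DEBIT'),
    (785,  'D-1194-838HELLUJP-05', -5800, 'DEBIT'),
    (5060, 'D-1195-UOK4HS5POF-05', -5000, 'DEBIT'),
    (5059, 'D-1195-UOK4HS5POF-05', -5000, 'DEBIT'),
])

conn.execute("""
    UPDATE accounting_entries
    SET name   = 'C' || substr(name, 2),
        amount = -amount,
        type   = 'CREDIT'
    WHERE id IN (SELECT MIN(id) FROM accounting_entries GROUP BY name)
""")

print(conn.execute(
    "SELECT id, name, amount, type FROM accounting_entries WHERE id = 785").fetchone())
# -> (785, 'C-1194-838HELLUJP-05', 5800, 'CREDIT')
```

The higher id of each pair (786, 5060) is untouched and stays a DEBIT, so each pair now negates itself as requested.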
If the IDs for a pair always match the formula x and x+1, you could say something like
WHERE MOD(`id`, 2) = 1
EDIT: I haven't tested this code, so I can't guarantee that it's possible to put a column name into a MOD like this, but it might be worth a try, and/or further investigation.
Does this constraint hold true all the time (D == -C) ?
If so, you do not need to keep redundant data in your table; store only one "amount" value (for example the debit):
786 | 1194-838HELLUJP-05 | -5800
and then, at the application level, prepend a D- to the name and use the raw amount, or prepend a C- and use the negated amount.
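A minimal sketch of that application-level idea (function name and shapes are illustrative assumptions): only the debit is stored, and the credit view is derived in code by flipping the sign and the prefix.

```python
def entry_views(name_suffix, debit_amount):
    """Return (name, amount) for both sides of one stored entry.

    Only the debit is persisted; the credit is derived by negating
    the amount and swapping the D-/C- prefix, per the question's format.
    """
    debit = ('D-' + name_suffix, debit_amount)
    credit = ('C-' + name_suffix, -debit_amount)
    return debit, credit

d, c = entry_views('1194-838HELLUJP-05', -5800)
print(d)  # -> ('D-1194-838HELLUJP-05', -5800)
print(c)  # -> ('C-1194-838HELLUJP-05', 5800)
```

The trade-off: the table shrinks by half and the D == -C invariant can never be violated, but any query that needs both sides in SQL has to reconstruct the credit row.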