I am facing a problem regarding a string comparison in MySQL.
I have the following table,
res_id | image_min_allowed_dimension | canvas_dimension
-------+-----------------------------+-----------------
1      | 400x500                     | 8x10
2      | 800x600                     | 11x14
As you can see in this table,
the image_min_allowed_dimension column has 2 records, and canvas_dimension also has 2.
Now, my goal is to get those 2 records from canvas_dimension given a value for image_min_allowed_dimension.
Say, if I pass 1024x768 for image_min_allowed_dimension from the PHP script, it should give me the 2 records from the canvas_dimension field.
The rough algorithm would be:
Fetch all records' canvas_dimension
IF image_min_allowed_dimension is less than or equal to the given value (e.g. 1024x768), return the row's canvas_dimension
ELSE return nothing
But as the fields are VARCHAR, how can I achieve that?
Please help.
Refactor your schema to store your resolutions in a sane manner.
res_id | image_min_allowed_width | image_min_allowed_height | canvas_width | canvas_height
Your future self will thank you for the extra effort.
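For illustration, a minimal sketch of what the refactored table and lookup could look like; the table name and types here are assumptions, not part of the original question:

-- Hypothetical refactored table; name and types are illustrative.
CREATE TABLE resolutions (
    res_id                   INT PRIMARY KEY,
    image_min_allowed_width  INT NOT NULL,
    image_min_allowed_height INT NOT NULL,
    canvas_width             INT NOT NULL,
    canvas_height            INT NOT NULL
);

-- "Which canvases allow an image of at least 1024x768?" becomes
-- a plain numeric comparison instead of string parsing:
SELECT canvas_width, canvas_height
FROM resolutions
WHERE image_min_allowed_width <= 1024
  AND image_min_allowed_height <= 768;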
Suppose we have two 3-bit numbers concatenated together like '101100', which basically represents 5 and 4 combined. I want to be able to perform aggregate functions like SUM() or AVG() on this column separately for each individual 3-bit part.
For instance:
'101100'
'001001'
sum(first three bits) = 6
sum(last three bits) = 5
I have already tried the SUBSTRING() function; however, speed is an issue there, as this query will run on millions of rows regularly, and string matching will slow it down.
I am also open to any new databases or technologies that may support this functionality.
You can use the function conv() to convert any part of the string to a decimal number:
select
  sum(conv(left(number, 3), 2, 10)) firstpart,
  sum(conv(right(number, 3), 2, 10)) secondpart
from tablename
See the demo.
Results:
| firstpart | secondpart |
| --------- | ---------- |
| 6 | 5 |
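To reproduce this locally, a minimal setup could look like the following; the table and column names are taken from the query above, everything else is assumed:

-- Demo data: two rows of packed 3-bit values.
CREATE TABLE tablename (number VARCHAR(6));
INSERT INTO tablename (number) VALUES
  ('101100'),  -- 5 and 4
  ('001001');  -- 1 and 1

CONV(string, 2, 10) parses the substring as base-2 and returns its base-10 value, so LEFT()/RIGHT() only split the string once per row rather than doing per-character matching.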
With the current understanding I have of your schema (which is next to none), the best solution would be to restructure your schema so that each data point is its own record instead of all the data points being in the same record. Doing this allows you to have a dynamic number of data points per entry. Your resulting table would look something like this:
id | data_type | value
ID is used to tie all of your data points together. If you look at your current table, this would be whatever you are using for the primary key. For this answer, I am assuming id INT NOT NULL but yours may have additional columns.
Data Type indicates what type of data is stored in that record. This would be the current table's column name. I will be using data_type_N as my values, but yours should be a more easily understood value (e.g. sensor_5).
Value is exactly what it says it is, the value of the data type for the given id. Your values appear to be all numbers under 8, so you could use a TINYINT type. If you have different storage types (VARCHAR, INT, FLOAT), I would create a separate column per type (val_varchar, val_int, val_float).
The primary key for this table now becomes a composite: PRIMARY KEY (id, data_type). Since your previously single record will become N records, the primary key will need to adjust to accommodate that.
You will also want to ensure that you have indexes that are usable by your queries.
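Under the assumptions above (an INT id and TINYINT values), a sketch of the DDL could be:

-- Sketch only; the VARCHAR length and the extra index are assumptions.
CREATE TABLE my_table (
    id        INT         NOT NULL,
    data_type VARCHAR(32) NOT NULL,
    value     TINYINT     NOT NULL,
    PRIMARY KEY (id, data_type)
);

-- Helps queries that filter or group by data_type alone.
CREATE INDEX idx_data_type ON my_table (data_type);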
Some sample values (using what you placed in your question) would look like:
1 | data_type_1 | 5
1 | data_type_2 | 4
2 | data_type_1 | 1
2 | data_type_2 | 1
Doing this, summing the values now becomes trivial. You would only need to ensure that data_type_N is summed with data_type_N. As an example, this would be used to sum your example values:
SELECT data_type,
SUM(value)
FROM my_table
WHERE id IN (1,2)
GROUP BY data_type
Here is an SQL Fiddle showing how it can be used.
We have a table for which we have to present many, many counts for different combinations of fields.
This takes quite a while to do on the fly and doesn't provide historical data, so I'm thinking about the best way to store those counts in another table, with a timestamp, so we can query them fast and get historical trends.
For each count we need 4 pieces of information to identify it, and there are about 1000 different metrics we would like to store.
I'm considering three different strategies, each having a count and a timestamp but varying in how the count is identified for retrieval:
1. One table with 4 fields to identify the count; the 4 fields wouldn't be normalized, as they contain data from different external tables.
2. One table with a single "tag" field, which would contain the 4 pieces of information as a tag. These tags could be enriched and kept in another table, perhaps with a field for each tag part, linking them to the external tables.
3. Different tables for the different groups of counts, to be able to normalize on one or more fields; but this would need anywhere from 6 to tens of tables.
I'm going with the first one, not normalized at all, but I'm wondering if anyone has a better or simpler way to store all these counts.
Sample of a value:
status,installed,all,virtual,1234,01/05/2015
The first field, status, can have up to 10 values.
The second field, installed, can have up to 10 values per distinct value of the first field.
The third field, all, can have up to 10 different values, which are the same for all categories.
The fourth field, virtual, can have up to 30 values, which are also the same for all previous categories.
The last two fields are a number and a timestamp.
Thanks,
Isaac
When you have a lot of metrics and you don't need to use them for intra-metric calculations, you can go for the first solution.
I would probably build a table like this:
Status_id | Installed_id | All_id | Virtual_id | Date | Value
Or, if the combinations of the first four columns have proper names, I would create two tables (I think this is what you refer to as the second solution):
Metric Table
Status_id | Installed_id | All_id | Virtual_id | Metric_id | Metric_Name
Values Table
Metric_id | Date | Value
This is good if you have names for your metrics or other details that you would otherwise need to duplicate for each combination under the first approach.
In both cases it will be a bit complicated to do intra-row operations across different metrics; for this reason this approach is suggested only for high-level KPIs.
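As a rough sketch, the two tables could be declared like this; all names, types, and keys are assumptions:

-- Sketch only; names, types, and keys are assumptions.
CREATE TABLE metric (
    metric_id    INT PRIMARY KEY,
    status_id    INT NOT NULL,
    installed_id INT NOT NULL,
    all_id       INT NOT NULL,
    virtual_id   INT NOT NULL,
    metric_name  VARCHAR(100) NOT NULL,
    UNIQUE (status_id, installed_id, all_id, virtual_id)
);

CREATE TABLE metric_value (
    metric_id INT NOT NULL,
    date      TIMESTAMP NOT NULL,
    value     BIGINT NOT NULL,
    PRIMARY KEY (metric_id, date),
    FOREIGN KEY (metric_id) REFERENCES metric (metric_id)
);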
Finally, because all possible combinations of the last two fields are always present in your table, you could consider converting them to columns:
Status_id | Installed_id | Date | All1_Virtual1 | All1_Virtual2 | ... | All10_Virtual30
With 10 values for All and 30 for Virtual you would have 300 columns, not very easy to handle, but they can be worth having if you need to do something like:
(All1_Virtual2 - All5_Virtual23) * All6_Virtual12
But in that case I would prefer (if possible) to do the calculation in advance to reduce the number of columns.
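If the data is first stored one row per combination, the wide form can be produced with conditional aggregation. A sketch, where the source table name and columns are assumed:

-- Sketch: pivot row-per-combination data into wide columns.
SELECT
    status_id,
    installed_id,
    date,
    SUM(CASE WHEN all_id = 1 AND virtual_id = 1 THEN value END) AS All1_Virtual1,
    SUM(CASE WHEN all_id = 1 AND virtual_id = 2 THEN value END) AS All1_Virtual2
    -- ... one CASE expression per (All, Virtual) pair, up to All10_Virtual30
FROM counts
GROUP BY status_id, installed_id, date;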
I'm trying to find a way to compare two DNA-like strings with MySQL; stored functions are no problem. Also, the string may be changed, but it needs to have the following format: [code][id]-[value], like C1-4. (The - may be changed as well.)
Example of the string:
C1-4,C2-5,C3-9,S5-2,S8-3,L2-4
If a value does not exist in the other string, for example S3-1, it will score 10 (the max value). If the asked string has C1-4 and the given string has C1-5, the score has to be 4 - 5 = -1; and if the asked string has C1-4 and the given string has C1-2, the score has to be 4 - 2 = 2.
The reason for this is that my real-time algorithm is getting slow with 10,000 results (already optimized with stored functions, indexes, and query optimizations), because 10,000 small and quick queries add up.
And the score has to be calculated before I can order my query and apply the right limit.
Thanks and if you have any questions let me know by comment.
** EDIT **
I'm thinking that it may also be possible to not use a string, but a table where the DNA bits are stored in a 1-n relation:
ID | CODE | ID | VALUE
---+------+----+------
1  | C... | 2  | 4...
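Following that idea, a scoring query over such a normalized table might look like the sketch below; the table dna(profile_id, code, code_id, value) and the asked profile id 1 are assumptions:

-- Sketch: score every candidate profile against the asked profile.
-- Matching bits score asked.value - candidate.value;
-- bits missing from the candidate score the maximum of 10.
SELECT p.profile_id,
       SUM(COALESCE(a.value - c.value, 10)) AS score
FROM (SELECT DISTINCT profile_id FROM dna WHERE profile_id <> 1) p
CROSS JOIN dna a
LEFT JOIN dna c
       ON c.profile_id = p.profile_id
      AND c.code = a.code
      AND c.code_id = a.code_id
WHERE a.profile_id = 1   -- the "asked" profile
GROUP BY p.profile_id
ORDER BY score
LIMIT 10;

This lets the database compute and order by the score in one pass instead of issuing 10,000 small queries.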
I have a PostgreSQL 9.2 DB which automatically collects data from various machines.
The DB stores all the data, including the machine id, the firmware, the manufacturer id, etc., as well as the actual result data. In one stored field (varchar) there are 5 sub-fields separated by the ^ character.
ACT18!!!8246-EN-2.00013151!1^7.00^F5260046959^H1P1O1R1C1Q1L1^1 (Machine 1)
The order of this data seems to vary from one machine to another, e.g. machines 1, 2, and 3. The string above shows the firmware version, in this case "7.0", which appears in sub-field 2. However, another machine sends the data in a different sub-field; in this case it is sub-field 3 and the value is "1":
BACT/ALERT^A.00^1^^ (Machine 2)
I want to store the values "7.0" and "1" in a different field in a separate table, using a CREATE TRIGGER t_machine_id AFTER INSERT trigger where I can choose which sub-field is used depending on which machine the data has come from.
Is split_part the best function to do this? Can anyone supply example code that will do this? I can't find anything in the documentation.
You need to (a) split the data using something like regexp_split_to_table then (b) match which parts are which using some criteria, since you don't have field position-order to rely on. Right now I don't see any reliable rule to decide what's the firmware version and what's the machine number; you can't really say where field <> machine_number because if machine 1 had firmware version 1 you'd get no results.
Given dummy data:
CREATE TABLE machine_info(data text, machine_no integer);
INSERT INTO machine_info(data,machine_no) (VALUES
('ACT18!!!8246-EN-2.00013151!1^7.00^F5260046959^H1P1O1R1C1Q1L1^1',1),
('BACT/ALERT^A.00^1^^',2)
);
Something like:
SELECT machine_no, regexp_split_to_table(data,'\^')
FROM machine_info;
will give you a table of split data elements with machine number, but then you need to decide which fields are which:
machine_no | regexp_split_to_table
------------+------------------------------
1 | ACT18!!!8246-EN-2.00013151!1
1 | 7.00
1 | F5260046959
1 | H1P1O1R1C1Q1L1
1 | 1
2 | BACT/ALERT
2 | A.00
2 | 1
2 |
2 |
(10 rows)
You may find the output of substituting regexp_split_to_array more useful, depending on whether you can get any useful info from field order and how you intend to process the data.
regress=# SELECT machine_no, regexp_split_to_array(data,'\^')
FROM machine_info;
machine_no | regexp_split_to_array
------------+------------------------------------------------------------------
1 | {ACT18!!!8246-EN-2.00013151!1,7.00,F5260046959,H1P1O1R1C1Q1L1,1}
2 | {BACT/ALERT,A.00,1,"",""}
(2 rows)
Say there are two firmware versions; version 1 sends code^blah^fwvers^^ and version 2 and higher sends code^fwvers^blah^blah2^machineno. You can then differentiate between the two because you know that version 1 leaves the last two fields blank:
SELECT
machine_no,
CASE WHEN info_arr[4:5] = ARRAY['',''] THEN info_arr[3] ELSE info_arr[2] END AS fw_vers
FROM (
SELECT machine_no, regexp_split_to_array(data,'\^')
FROM machine_info
) string_parts(machine_no, info_arr);
results:
machine_no | fw_vers
------------+---------
1 | 7.00
2 | 1
(2 rows)
Of course, you've only provided two data samples, so the real matching rules are likely to be more complex. Consider writing an SQL function that extracts the desired field(s) from the array passed to it and returns them.
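Since the question asks about split_part specifically: it returns a single, 1-indexed sub-field, so it is a good fit once you know which sub-field holds the firmware for each machine. A sketch of such a function, where the machine-to-field mapping is an assumption based on the two samples:

-- Sketch: pick the firmware sub-field per machine with split_part
-- (fields are 1-indexed). The CASE mapping is assumed.
CREATE OR REPLACE FUNCTION extract_fw_version(data text, machine_no integer)
RETURNS text AS $$
    SELECT CASE machine_no
        WHEN 1 THEN split_part(data, '^', 2)  -- machine 1: firmware in field 2
        WHEN 2 THEN split_part(data, '^', 3)  -- machine 2: firmware in field 3
    END;
$$ LANGUAGE sql IMMUTABLE;

SELECT machine_no, extract_fw_version(data, machine_no) AS fw_vers
FROM machine_info;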
I need to store some flags for user records in a MySQL table (I'm using InnoDB):
-------------------------
| UserId | Mask         |
|--------|--------------|
| 1      | 00000...001  |
| 2      | 00000...010  |
-------------------------
The number of flags is bigger than 64, so I can't use a BIGINT or BIT type to store the value.
I don't want to use many-to-many association tables, because each user can have more than one profile, each with its own set of flags, and it would grow too big very quickly.
So, my question is, is it possible to store these flags in a VARCHAR, BLOB or TEXT type column and still do bitwise operations on them? If yes, how?
For now I just need one operation: given a mask A with X bits set to 1, find which users have at least those X bits set to 1.
Thanks!
EDIT
To anyone reading this, I've found a solution (for me, at least). I'm using a VARCHAR for the mask field and when searching for a specific mask I use this query:
select * from my_table where mask like '__1__1'
Every record that has the 3rd and last bits set to 1 will be returned. The "_" symbol is the SQL LIKE placeholder for "any single character" (standard SQL, not MySQL-specific).
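The pattern itself can be derived from a mask A inside SQL by replacing every 0 with the wildcard; a small sketch, assuming masks are fixed-length strings of 0s and 1s and using an illustrative table name:

-- Sketch: turn a 0/1 mask into a LIKE pattern, then find users
-- that have at least those bits set.
SET @mask = '001001';
SELECT *
FROM my_table
WHERE mask LIKE REPLACE(@mask, '0', '_');

REPLACE('001001', '0', '_') yields '__1__1', the same pattern as above.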
In terms of speed it's doing fine right now; I will have to check later when my user base grows.
Anyway, thanks for your input. Other ideas welcomed, of course.