MySQL query to calculate values local to Cartesian products of logical groups of rows

I'm trying to write a query to process a single table that looks like this:
record_id  item_id  part_id  part_length
---------  -------  -------  -----------
1          0        0        123.12
2          0        0        123.09
3          0        1        231.24
4          0        1        239.14
5          1        0        45.91
6          1        0        46.12
7          1        1        62.24
8          1        1        59.40
This is basically a table of inaccurate length measurements of some parts of some items, each recorded multiple times (not just twice; each part actually has hundreds of measurements). With a single SELECT, I want to get a result like this:
record_id  item_id  part_id  unit  part_length_ratio
---------  -------  -------  ----  -----------------
1          0        0        1     123.12 / 231.24
2          0        0        1     123.09 / 239.14
3          0        1        0     231.24 / 123.12
4          0        1        0     239.14 / 123.09
5          1        0        1     45.91 / 62.24
6          1        0        1     46.12 / 59.40
7          1        1        0     62.24 / 45.91
8          1        1        0     59.40 / 46.12
That is, each part of an item is selected in turn as the unit, and the ratio of the length of each other part of the same item to this unit is calculated, matching measurements by the order in which they were taken. I wrote a script that computes this kind of table, but I would like to do it in SQL. I can understand if you fail to understand the question :)
for each item i
    for each part unit of i
        for each part other of i
            if unit != other
                print i.id other.part_id unit.part_id other.length / unit.length

As I said in a comment, tables are unordered sets: there is no first or second row...
... unless you use the id column to explicitly order the rows.
However, can you guarantee that there will always be exactly two samples for each case, and that the lower ID always matches the first sample? This appears quite fragile: in real life, there will probably be cases where a test is performed twice, is missing, or is done "late", not to mention concurrent access to your DB.
Can't you simply add a "sample number" column?
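The "sample number" suggestion can be sketched as a self-join. This is only an illustration under assumptions: the table name measurements and the column sample_no are hypothetical, the sample numbers are filled in by hand, and Python's sqlite3 module stands in for MySQL (the SELECT itself is plain SQL).

```python
import sqlite3

# Hypothetical table `measurements`; `sample_no` is the extra column
# suggested above, assigned by hand for this example.
rows = [
    # record_id, item_id, part_id, sample_no, part_length
    (1, 0, 0, 1, 123.12),
    (2, 0, 0, 2, 123.09),
    (3, 0, 1, 1, 231.24),
    (4, 0, 1, 2, 239.14),
    (5, 1, 0, 1, 45.91),
    (6, 1, 0, 2, 46.12),
    (7, 1, 1, 1, 62.24),
    (8, 1, 1, 2, 59.40),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements ("
    "record_id INTEGER PRIMARY KEY, item_id INTEGER, part_id INTEGER, "
    "sample_no INTEGER, part_length REAL)"
)
conn.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?, ?)", rows)

# Pair each measurement (o) with the measurement of every *other* part (u)
# of the same item that carries the same sample number.
query = """
SELECT o.record_id, o.item_id, o.part_id, u.part_id AS unit,
       o.part_length / u.part_length AS part_length_ratio
FROM measurements AS o
JOIN measurements AS u
  ON u.item_id = o.item_id
 AND u.sample_no = o.sample_no
 AND u.part_id <> o.part_id
ORDER BY o.record_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With an explicit sample number, the pairing no longer depends on the order of record_id values, which is the fragility raised above.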

Related

Filtering duplicate rows based on multiple columns (QueryBuilder 4.2)

I've run into a little difficulty when trying to filter the top N results for a table.
Assume the following table:
ID  X  Y  Result0  Result1
--  -  -  -------  -------
0   0  0  1        4
1   0  1  2        5
2   0  1  1        4
3   0  2  2        5
4   0  3  0        1
5   1  3  3        4
6   1  3  2        5
7   1  3  4        6
So, let's say I want to get the top 2 results by highest Result0 value, using Result1 as a tie-breaker when the Result0 values are equal, and keeping only distinct values of (X,Y).
If I run the following query:
$result = DB::table('table')
    ->orderBy('Result0', 'desc') // primary sort: highest Result0 first
    ->orderBy('Result1', 'desc') // tie-breaker
    ->take(300)
    ->get();
This code returns IDs 5 and 7, because they have the highest Result0 values, but the (X,Y) values for these rows are identical, and I'd like to get only the top result for each distinct (X,Y) pair.
I tried adding a
->groupBy('X','Y')
But it grouped the entries based on the database order of the entries (i.e. the ID) rather than my sorting of that table.
Does anyone have any idea how I can achieve my goal?

Algorithm for selecting tiles outwards from the center point in a 512*512 map

I've got a specific problem. My data (map) in mysql is as follows
id table_row table_col tile_type
1 1 1 0
2 2 1 0
3 3 1 0
... ... ... 0
512 512 1 0
513 1 2 0
514 2 2 0
515 3 2 0
... ... ... 0
... 512 2 0
... 1 3 0
... 2 3 0
... 3 3 0
... ... ... 0
... 512 3 0
... 1 4 0
The map is 512*512. I need to come up with an algorithm that selects tiles starting from the centre (or near-centre, 256*256) point. So it should look something like
256*256 first - once selected we can update tile_type to 1
255*256 second - update tile_type to 1
256*255 third - update tile_type to 1
257*256 fourth - update tile_type to 1
256*257 fifth - update tile_type to 1
etc. or similar, but it has to start filling in tiles from the centre outwards in all directions (the order can be random). Any ideas appreciated.
Your question lacks a few details, but I am assuming you are asking for a means of generating an id that is close to the center of your 512x512 grid.
It appears your grid is enumerated in a particular manner: each column is enumerated in increasing order of table_row values, and the columns themselves are enumerated in increasing order of table_col values.
Consequently, we already know the id of the cell whose table_row and table_col values are both 256: it is 255 x 512 + 256 = 130816. That is correct because 255 full columns were enumerated before enumeration started for table_col value 256, and each of those columns has 512 rows. Finally, within this column, we are interested in row #256.
A more general version would look like this (the divisions are integer divisions):
((num_cols + 1) / 2 - 1) * num_rows + (num_rows + 1) / 2
You don't need to care all that much about the +1s and -1s: they are just a numerical hack to handle odd num_rows and num_cols values.
Anyway, to introduce a proximity measure, you can use two random variables. A random variable P can represent the distance to the center in terms of columns (i.e. how far the table_col of the point with the generated id will be from the table_col value of the center of the grid). Another random variable Q can represent the distance to the center in terms of rows.
((num_cols + 1) / 2 - 1 + P) * num_rows + ((num_rows + 1) / 2 + Q)
Then you can generate values for P and Q based on your needs, and get the id of a cell that is P columns and Q rows away from the center of the grid.
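As a quick check of the formula, here is a minimal Python sketch. The function name cell_id and the parameters p and q are made up for illustration, and the divisions are assumed to be integer divisions.

```python
def cell_id(num_rows, num_cols, p=0, q=0):
    """Id of the cell p columns and q rows away from the grid center.

    Assumes ids are assigned column by column, rows increasing within each
    column, starting at 1 (as in the table in the question).
    """
    center_col = (num_cols + 1) // 2  # integer division, as assumed above
    center_row = (num_rows + 1) // 2
    return (center_col - 1 + p) * num_rows + (center_row + q)


print(cell_id(512, 512))        # center cell: 255 * 512 + 256 = 130816
print(cell_id(512, 512, p=1))   # one column further along
print(cell_id(512, 512, q=-1))  # one row earlier in the same column
```

Values of p and q that push the row or column outside 1..512 are not checked here; a real version would need to clamp or reject them.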
Try the query below.
SELECT (MAX(t.`row`+1)/2), (MAX(t.`column`+1)/2) INTO @max_row, @max_col
FROM tiles t;
SELECT t.`row`, t.`column`, ceil(IF(ABS(@max_row - t.`row`) < ABS(@max_col - t.`column`), ABS(@max_col - t.`column`), ABS(@max_row - t.`row`))) as tbl_order
FROM tiles t
ORDER BY 3;
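The ordering computed by this query takes the larger of the row distance and the column distance to the center, so tiles come out in square rings around the center. A small Python sketch of the same idea, on a 5x5 grid to keep it readable (the function name ring_order is hypothetical):

```python
import math


def ring_order(num_rows, num_cols):
    """Return (ring, row, col) tuples sorted from the center outwards."""
    center_row = (num_rows + 1) / 2  # may be x.5 for even sizes
    center_col = (num_cols + 1) / 2
    tiles = []
    for row in range(1, num_rows + 1):
        for col in range(1, num_cols + 1):
            # Larger of the two axis distances, rounded up: the ring number.
            ring = math.ceil(max(abs(center_row - row), abs(center_col - col)))
            tiles.append((ring, row, col))
    tiles.sort()
    return tiles


ordered = ring_order(5, 5)
for ring, row, col in ordered[:9]:
    print(ring, row, col)
```

Tiles within the same ring are tied, so their relative order can be shuffled if a random fill order within each ring is wanted.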

What are the differences in these SQL closure table examples?

I am having some difficulty wrapping my mind around SQL closure tables, and would like some assistance in understanding some of the examples I have found.
Let's say I have a table called sample_items with the following hierarchical data:
id name parent_id
1 'Root level item #1' 0
2 'Child of ID 1' 1
3 'Child of ID 2' 2
4 'Root level item #2' 0
The tree structure should effectively be this:
id
| - 1
| | - 2
| | | - 3
| - 4
For ease of querying trees (such as finding all of the descendants of a specific id), I have a table called sample_items_closure using the method described by Bill Karwin in this excellent SO post. I also use an optional path_length column for querying the immediate child or parent when needed. If I understand this method correctly, my closure table data would look like this:
ancestor_id descendant_id path_length
1 1 0
2 2 0
1 2 1
3 3 0
2 3 1
1 3 2
4 4 0
Every row in sample_items now has an entry in the sample_items_closure table for both itself and all of its ancestors. Everything makes sense so far.
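To make the relationship between the two tables concrete, here is a minimal Python sketch that derives the closure rows from the parent_id values, treating parent_id 0 as "no parent". The function name build_closure is made up for illustration, and this is just one way of generating the rows.

```python
def build_closure(parents):
    """Build (ancestor_id, descendant_id, path_length) rows.

    `parents` maps id -> parent_id, with 0 meaning "no parent".
    """
    closure = []
    for item_id in parents:
        ancestor, depth = item_id, 0
        # Walk up the tree from the item itself (depth 0) to its root.
        while ancestor != 0:
            closure.append((ancestor, item_id, depth))
            ancestor = parents[ancestor]
            depth += 1
    return closure


# id -> parent_id, taken from the sample_items table above
sample_items = {1: 0, 2: 1, 3: 2, 4: 0}
closure = build_closure(sample_items)
for row in closure:
    print(row)
```

For the four sample rows this yields the same seven closure rows listed in the table above.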
While studying other closure table examples, however, I came across one that adds an additional ancestor for each row that links to the root level (ancestor_id 0) and has a path length of 0. Using the same data I have above, this is what the closure table would look like:
ancestor_id descendant_id path_length
1 1 0
0 1 0
2 2 0
1 2 1
0 2 0
3 3 0
2 3 1
1 3 2
0 3 0
4 4 0
0 4 0
To give better context, here is a select query used on that site, modified to fit my examples:
SELECT `id`,`parent_id` FROM `sample_items` `items`
JOIN `sample_items_closure` `closure`
ON `items`.`id` = `closure`.`descendant_id`
WHERE `closure`.`ancestor_id` = 2
I have two questions related to this method:
Question #1:
Why would an additional row be added linking each descendant to the root level (id 0)?
Question #2:
Why would path_length be 0 for these entries, instead of being the previous ancestor's path_length+1? For example:
ancestor_id descendant_id path_length
1 1 0
0 1 1
2 2 0
1 2 1
0 2 2
3 3 0
2 3 1
1 3 2
0 3 3
4 4 0
0 4 1
Bonus Question:
Why do some examples still include an adjacency list (the parent_id column for sample_items in my example) when the full structure of the tree is already expressed in the closure table?
You could use CTEs. They are made for exactly these use cases, and there are many good examples that are close to your case.

MySQL - Pivot columns into rows

I'm trying to sum columns in a table and then return each column and value as a separate row, but only if the sum is > 0. Here's an example table
… stuff widget toast …
- ----- ------ ----- -
… 0 0 1 …
… 1 0 3 …
… 2 0 1 …
… 0 0 1 …
… 0 0 0 …
Summing the columns is easy enough
select sum(stuff) as stuff, sum(widget) as widget, sum(toast) as toast from table
Which produces this output
stuff widget toast
----- ------ -----
3 0 6
But what I actually want to end up with is this (notice that widget is missing since the sum is 0)
thing count
---- -----
stuff 3
toast 6
Any advice?
Here you go:
SELECT 'stuff' as thing, sum(stuff) `count` FROM foo HAVING `count` > 0
UNION
SELECT 'widget', sum(widget) s FROM foo HAVING s > 0
UNION
SELECT 'toast', sum(TOAST) t FROM foo HAVING t > 0
http://sqlfiddle.com/#!2/b586f/7
The aliases in the first SELECT determine the column names of the result; aliases in the following SELECTs don't matter, so I used short ones.
HAVING the respective sum > 0 eliminates widget from the result, as requested.
This is only feasible for a small number of original columns, though; for a larger number of columns you probably wouldn't want to do this.
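For more columns, one option is to generate the UNION from a list of column names instead of typing it out. The sketch below is a variant of the query above, not the same statement: it filters with WHERE on a derived table rather than HAVING, uses UNION ALL, and runs against SQLite through Python's sqlite3 module. The alias names total and sums are made up, and the column list is assumed to be trusted, hard-coded input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (stuff INTEGER, widget INTEGER, toast INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(0, 0, 1), (1, 0, 3), (2, 0, 1), (0, 0, 1), (0, 0, 0)],
)

# Trusted, hard-coded column names: never build SQL like this from user input.
columns = ["stuff", "widget", "toast"]
parts = [f"SELECT '{c}' AS thing, SUM({c}) AS total FROM foo" for c in columns]

# One row per column, then drop the rows whose sum is not positive.
query = (
    "SELECT thing, total FROM ("
    + " UNION ALL ".join(parts)
    + ") AS sums WHERE total > 0 ORDER BY thing"
)
result = conn.execute(query).fetchall()
print(result)
```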

SQL - counting rows with specific value

I have a table that looks somewhat like this:
id value
1 0
1 1
1 2
1 0
1 1
2 2
2 1
2 1
2 0
3 0
3 2
3 0
Now for each id, I want to count the number of occurrences of 0 and 1 and the total number of rows for that id (the value can be any integer), so the end result should look something like this:
id n0 n1 total
1 2 2 5
2 1 2 4
3 2 0 3
I managed to get the first and last columns with this statement:
SELECT id, COUNT(*) FROM mytable GROUP BY id;
But I'm sort of lost from here. Any pointers on how to achieve this without a huge statement?
With MySQL, you can use SUM(condition):
SELECT id, SUM(value=0) AS n0, SUM(value=1) AS n1, COUNT(*) AS total
FROM mytable
GROUP BY id
See it on sqlfiddle.
As @Zane commented above, the typical method is to use CASE expressions to perform the pivot.
SQL Server now has a PIVOT operator that you might see. DECODE() and IIF() were older approaches on Oracle and Access that you might still find lying around.
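A sketch of the CASE-based version, run here against SQLite through Python's sqlite3 module purely for illustration; the CASE expressions themselves are standard SQL and are not specific to any one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, 0), (1, 1), (1, 2), (1, 0), (1, 1),
        (2, 2), (2, 1), (2, 1), (2, 0),
        (3, 0), (3, 2), (3, 0),
    ],
)

# Each CASE yields 1 for a matching row and 0 otherwise, so SUM counts matches.
query = """
SELECT id,
       SUM(CASE WHEN value = 0 THEN 1 ELSE 0 END) AS n0,
       SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS n1,
       COUNT(*) AS total
FROM mytable
GROUP BY id
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```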