Reorganizing mysql aggregate row into single piece rows - mysql

Consider the following mysql table:
ID WeightS AmountS WeightM AmountM WeightL AmountL Someothercolumnshere
1 6 3 10 2 18 2 ...
I need to reorganize this data into a pivot-friendly table, where each piece in the amount columns should be one result row. E.g. from the first two columns, WeightS and AmountS, the SELECT should produce 3 result rows, each having a weight of 2 kgs (=6 kgs total). So the full result table should be like this:
Weight Someothercolumnshere
2 ...
2 ...
2 ...
5 ...
5 ...
9 ...
9 ...
I don't even know if there's a SQL syntax which is able to do this kind of operation? I've never had a request like this before. Worst case scenario, I have to do it in php instead, but I think MYSQL is a lot more fun :p
I've built the schema on sqlfiddle, but I'm afraid that's all I've got.

You need a Tally table for the task like this. Create as much rows as needed in it.
Create table Tally(`N` int);
insert into Tally( `N`) values(1),(2),(3),(4),(5);
Then
(select `ID`, `WeightS`/`AmountS`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountS` >= Tally.`N`
)
UNION ALL
(select `ID`, `WeightL`/`AmountL`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountL` >= Tally.`N`
)
UNION ALL
(select `ID`, `WeightM`/`AmountM`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountM` >= Tally.`N`
)

Related

Insert unique records into table based on calculation of other table column MYSQL

I have two tables(ProductionDetails, Production), I need to populate Production based on calculated values of the ProductionDetails. ProductionDetails has the number of buckets gathered, Production needs to be populated with BINS. For example if I have 100 buckets, and I divide those by 30 that equals 3.33 bins. Now I need to create 3 BINS.
select Round( sum(ep.Buckets)/30),ep.lot,ep.crewid
from On_EmpProdDetails ep
group by ep.buckets, ep.lot,ep.crewid
Results in
Bins
Lot
Crew
3
556790186SOCC1
SOCC1
Now I want to insert 3 unique rows into the Production Table
BinID is auto increment
BinID
LOT
1
556790186SOCC1
2
556790186SOCC1
3
556790186SOCC1
You can use a recursive CTE to generate the rows:
insert into production (lot)
with recursive cte as (
select Round( sum(ep.Buckets)/30) as num_bins, ep.lot, ep.crewid, 1 as bin
from On_EmpProdDetails ep
group by ep.buckets, ep.lot,ep.crewid
union all
select num_bins, lot, crewid, bin + 1
from cte
where bin < num_bins
)
select lot
from cte;
Here is a db<>fiddle.

Recursively running a MySQL function

I have a function in MySQL that needs to be run about 50 times (not a set value) in a query. the inputs are currently stored in an array such as
[1,2,3,4,5,6,7,8,9,10]
when executing the MySQL query individually it's working fine, please see below
column_name denotes the column it's getting the data for, in this case, it's a DOUBLE in the database
The second value in the MOD() function is the input I'm supplying MySQL from the aforementioned array
SELECT id, MOD(column_name, 4) AS mod_output
FROM table
HAVING mod_output > 10
To achieve the output I require* the following code works
SELECT id, MOD(column_name, 4) AS mod_output1, MOD(column_name, 5) AS mod_output2, MOD(column_name, 6) AS mod_output3
FROM table
HAVING mod_output1 > 10 AND mod_output2 > 10 AND mod_output3 > 10
However this obviously is extremely dirty, and when having not 3 inputs, but over 50, this will become highly inefficient.
Appart from calling over 50 individual querys, is there a better way to acchieve the same sort (see below) of output?
In escennce i need to supply MySQL with a list of values and have it run MOD() over all of them on a specified column.
The only data I need returned is the id's of the rows that match the MOD() functions output with the specified input (see value 2 of the MOD() function) where the output is less than 10
Please note, MOD() has been used as an example function, however, the final function required *should* be a drop in replacement
example table layout
id | column_name
1 | 0.234977
2 | 0.957739
3 | 2.499387
4 | 48.395777
5 | 9.943782
6 | -39.234894
7 | 23.49859
.....
(The title may be worded wrong, I'm not quite sure how else you'd explain what I'm trying to do here)
Use a join and derived table or temporary table:
SELECT n.n, t.id, MOD(t.column_name, n.n) AS mod_output
FROM table t CROSS JOIN
(SELECT 4 as n UNION ALL SELECT 5 UNION ALL SELECT 6 . . .
) n
WHERE MOD(t.column_name, n.n) > 10;
If you want the results as columns, you can use conditional aggregation afterwards.

How to delete a row where there are only one of the kind in MySql?

I have following data in MySQL table named info:
chapter | section
3 | 0
3 | 1
3 | 2
3 | 3
4 | 0
5 | 0
I would like to delete a row for chapter = n, but only when there is no section>0 for same chapter. So chapter 3 can't be deleted while chapter 4 and 5 can. I know the following doesn't work:
DELETE info WHERE chapter = 3 AND NOT EXISTS (SELECT * FROM info WHERE chapter = 3 AND section>0);
The same table is used twice in the statement. So what is the easiest way to achieve my goal?
You've got the idea right. Here is the syntax:
DELETE
FROM mytable
WHERE chapter NOT IN (
SELECT * FROM (
select tt.chapter
from mytable tt
where tt.section <> 0
group by tt.chapter
) tmp
)
The nested select is a workaround a bug in MySQL.
Demo.
You can run a sub query to return the rows that have sections of more then one and then delete the rows returned from the sub query.
DELETE FROM table1 WHERE table1.chapter Not IN (select chapter from
(SELECT table1.chapter FROM table1 WHERE Table1.section >=1 ) Results);
Example Fiddle based on your question
You could also supply the chapter as well in the sub query where clause if you only want to delete a specfic chapter. If it does not meet the where clause then no records will be deleted.
This should do it.
DELETE FROM Table t
WHERE NOT EXISTS(
SELECT 1
FROM Table t2
Where t2.chapter = t.chapter
And t2.section > 0
)
In my experience Exists generally performs better than In. If you are storing a large amount of records you should take this into consideration.

Unable to apply WHERE/AND on MySQL table with 2 columns on MAMP

I thought I had a very simple query to perform, but I can't seem to make it work.
I have this table with 2 columns:
version_id trim_id
1 15
1 25
1 28
1 30
1 35
2 12
2 25
2 33
2 48
3 11
3 25
3 30
3 32
I am trying to get any version-id's that have say a sub-set of trim_id's. Let's say all version_id's that have trim_id's 25 and 30. My obvious attempt was :
SELECT * FROM table WHERE trim_id=25 AND trim_id=30
I was expecting to have version_id 1 and 3 as a result, but instead I get nothing.
I am working with the latest version of MAMP, which has some odd behavior, like in this case it just tells me its 'LOADING' and never gives me an error message or something. But that's normally the case when there is no data to return.
This is InnoDB, if that helps.
Thanks for your input.
Your query does not work because you are using AND and the trim_id cannot have two different values at the same time, so you need to apply Relational Division to get the result.
You will need to use something similar to the following:
SELECT version_id
FROM yourtable
WHERE trim_id in (25, 30)
group by version_id
having count(distinct trim_id) = 2
See SQL Fiddle with Demo.
This will return the version_id values that have both 25 and 30. Then if you wanted to include additional columns in the final result, you can expand the query to:
select t1.version_id, t1.trim_id
from yourtable t1
where exists (SELECT t2.version_id
FROM yourtable t2
WHERE t2.trim_id in (25, 30)
and t1.version_id = t2.version_id
group by t2.version_id
having count(distinct t2.trim_id) = 2);
See SQL Fiddle with Demo
SELECT *
FROM table
WHERE trim_id IN(25,30)

Need Help streamlining a SQL query to avoid redundant math operations in the WHERE and SELECT

*Hey everyone, I am working on a query and am unsure how to make it process as quickly as possible and with as little redundancy as possible. I am really hoping someone there can help me come up with a good way of doing this.
Thanks in advance for the help!*
Okay, so here is what I have as best I can explain it. I have simplified the tables and math to just get across what I am trying to understand.
Basically I have a smallish table that never changes and will always only have 50k records like this:
Values_Table
ID Value1 Value2
1 2 7
2 2 7.2
3 3 7.5
4 33 10
….50000 44 17.2
And a couple tables that constantly change and are rather large, eg a potential of up to 5 million records:
Flags_Table
Index Flag1 Type
1 0 0
2 0 1
3 1 0
4 1 1
….5,000,000 1 1
Users_Table
Index Name ASSOCIATED_ID
1 John 1
2 John 1
3 Paul 3
4 Paul 3
….5,000,000 Richard 2
I need to tie all 3 tables together. The most results that are likely to ever be returned from the small table is somewhere in the neighborhood of 100 results. The large tables are joined on the index and these are then joined to the Values_Table ON Values_Table.ID = Users_Table.ASSOCIATED_ID …. That part is easy enough.
Where it gets tricky for me is that I need to return, as quickly as possible, a list limited to 10 results where value1 and value2 are mathematically operated on to return a new_ value where that new_value is less than 10 and the result is sorted by that new_value and any other where statements I need can be applied to the flags. I do need to be able to move along the limit. EG LIMIT 0,10 / 11,10 / 21,10 etc...
In a subsequent (or the same if possible) query I need to get the top 10 count of all types that matched that criteria before the limit was applied.
So for example I want to join all of these and return anything where Value1 + Value2 < 10 AND I also need the count.
So what I want is:
Index Name Flag1 New_Value
1 John 0 9
2 John 0 9
5000000 Richard 1 9.2
The second response would be:
ID (not index) Count
1 2
2 1
I tried this a few ways and ultimately came up with the following somewhat ugly query:
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value1) < 10
ORDER BY New_Value
LIMIT 0,10
And then for the count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value1) < 10
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
Being able to filter on the different flags and such in my WHERE clause is important; that may sound stupid to comment on but I mention that because from what I could see a quicker method would have been to use the HAVING statement but I don't believe that will work in certain instance depending on what I want to use my WHERE clause to filter against.
And when filtering using the flags table :
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value1) < 10 AND Flag1 = 0
ORDER BY New_Value
LIMIT 0,10
...filtered count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value1) < 10 AND Flag1 = 0
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
That works fine but has to run the math multiple times for each row, and I get the nagging feeling that it is also running the math multiple times on the same row in the Values_table table. My thought was that I should just get only the valid responses from the Values_table first and then join those to the other tables to cut down on the processing; with how SQL optimizes things though I wasn't sure if it might not already be doing that. I know I could use a HAVING clause to only run the math once if I did it that way but I am uncertain how I would then best join things.
My questions are:
Can I avoid running that math twice and still make the query work
(or I suppose if there is a good way
to make the first one work as well
that would be great)
What is the fastest way to do this
as this is something that will
be running very often.
It seems like this should be painfully simple but I am just missing something stupid.
I contemplated pulling into a temp table then joining that table to itself but that seems like I would trade math for iterations against the table and still end up slow.
Thank you all for your help in this and please let me know if I need to clarify anything here!
** To clarify on a question, I can't use a 3rd column with the values pre-calculated because in reality the math is much more complex then addition, I just simplified it for illustration's sake.
Do you have a benchmark query to compare against? Usually it doesn't work to try to outsmart the optimizer. If you have acceptable performance from a starting query, then you can see where extra work is being expended (indicated by disk reads, cache consumption, etc.) and focus on that.
Avoid the temptation to break it into pieces and solve those. That's an antipattern. That includes temp tables especially.
Redundant math is usually ok - what hurts is disk activity. I've never seen a query that needed CPU work reduction on pure calculations.
Gather your results and put them in a temp table
SELECT * into TempTable FROM (SELECT INDEX, NAME, Type, ID, Flag1, (Value1 + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE New_Value < 10)
ORDER BY New_Value
LIMIT 0,10
Return Result for First Query
SELECT INDEX, NAME, Flag1, New_Value
FROM TempTable
Return Results for count of Types
Select ID, Count(Type)
FROM TempTable
GROUP BY TYPE
Is there any chance that you can add a third column to the values_table with the pre-calculated value? Even if the result of your calculation is dependent on other variables, you could run the calculation for the whole table but only when those variables change.