Octave: Split a table into subtables after summing rows - octave

I have a table, e.g. table(12,4), filled with random numbers from 0 to 1. I want to sum each group of 4 rows in every column of this table and store the results in a new 3x4 table. After the first 4 rows, I move on to the next 4 rows of each column, and so on.
To be more clear, here's an example:
0 0.087913 0 0
0.27561 0.17959 0.24402 0.20616
0.040698 0 0.056478 0.0039007
0.10768 0.25992 0.25992 0.25992
0 0.086466 0 0
0.27469 0.18798 0.25542 0.21579
0.04021 0 0.059588 0.0041156
0.092386 0.22962 0.22962 0.22962
0 0.087532 0 0
0.26506 0.18139 0.24646 0.20822
0.037734 0 0.055918 0.0038621
0.099674 0.24774 0.24774 0.24774
Sum of 4 rows for each column, for example:
The first result is 0+0.27561+0.040698+0.10768.
The second result is 0.087913+0.17959+0+0.25992.
These results will be stored in a table like this:
first result, second result, third result, fourth result
fifth result, sixth result, seventh result, eighth result
ninth result, tenth result, eleventh result, twelfth result
I tried doing this for the first 4 results like this, but I am not quite sure how to do it dynamically for each group of 4 rows:
kkx = rand(12,4);              % 12x4 table of random numbers in [0,1]
[nxx, nyy] = size(kkx);
nxxnew = nxx/4;                % number of 4-row groups
newtable = zeros(nxxnew, nyy);
for ii = 1:nyy
  newtable(1,ii) = sum(kkx(1:4,ii));
endfor

You just have to add a second (outer) for-loop that steps through the rows in increments of 4, plus 2 additional variables.
rows_ind = 0;
for jj = 1:4:nxx
  jj_plus_three = jj + 3;   % last row of the current 4-row group
  rows_ind += 1;            % row of newtable to fill next
  for ii = 1:nyy
    newtable(rows_ind,ii) = sum(kkx(jj:jj_plus_three,ii));
  endfor
endfor
rows_ind indicates the row in newtable in which the next sums will be written. jj and jj_plus_three will be the range for the rows that will be summed in column ii.
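For comparison, the same block-wise summation can be sketched in plain Python. This is only an illustration of the idea, not Octave code: the function name block_column_sums and its block parameter are made up for this example, and the input is the sample table from the question.

```python
def block_column_sums(table, block=4):
    """Sum every `block` consecutive rows of `table`, column by column."""
    n_rows = len(table)
    if n_rows % block != 0:
        raise ValueError("row count must be a multiple of the block size")
    result = []
    for start in range(0, n_rows, block):      # same role as jj = 1:4:nxx
        chunk = table[start:start + block]     # rows jj .. jj+3
        # zip(*chunk) walks the chunk column by column
        result.append([sum(column) for column in zip(*chunk)])
    return result


# Sample data from the question (12 rows, 4 columns).
kkx = [
    [0, 0.087913, 0, 0],
    [0.27561, 0.17959, 0.24402, 0.20616],
    [0.040698, 0, 0.056478, 0.0039007],
    [0.10768, 0.25992, 0.25992, 0.25992],
    [0, 0.086466, 0, 0],
    [0.27469, 0.18798, 0.25542, 0.21579],
    [0.04021, 0, 0.059588, 0.0041156],
    [0.092386, 0.22962, 0.22962, 0.22962],
    [0, 0.087532, 0, 0],
    [0.26506, 0.18139, 0.24646, 0.20822],
    [0.037734, 0, 0.055918, 0.0038621],
    [0.099674, 0.24774, 0.24774, 0.24774],
]

newtable = block_column_sums(kkx)   # 3 rows of 4 sums each
```

The outer loop plays the role of jj and the list comprehension replaces the inner loop over ii.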

Related

SSRS Conditional Max

I would like my expression to return the max value of a column where another column equals "0".
example :
0 12
0 11
0 7
1 3
1 40
1 1
This should return 12.
I tried several things but can't make it work.
Any ideas?
Try something like:
=MAX(IIF(Fields!ColumnA.Value = 0, Fields!ColumnB.Value, -99999))
Column A and B refer to the unnamed columns in your sample data. The -99999 should be a value lower than the lowest Column B value you will ever get. If Column B is always positive, then any negative value or even 0 will suffice here.
The expression reads:
"for each row, look in Column A. If Column A is zero then return Column B's value, if Column A is not zero, return -99999. Now get the MAX value from these values"

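The sentinel trick in the SSRS answer above is not specific to SSRS. As a rough illustration in Python (the names conditional_max and SENTINEL are invented for this sketch, and the rows are the sample data from the question), every non-matching row is replaced by a value that can never win the MAX:

```python
SENTINEL = -99999  # must be lower than any real Column B value


def conditional_max(rows, sentinel=SENTINEL):
    """Max of column B over rows where column A equals 0."""
    # Rows where A != 0 contribute the sentinel, mirroring the IIF(...) branch.
    return max(b if a == 0 else sentinel for a, b in rows)


rows = [(0, 12), (0, 11), (0, 7), (1, 3), (1, 40), (1, 1)]
result = conditional_max(rows)  # 12
```

If no row has A equal to 0, the result is the sentinel itself, which is the same behaviour the SSRS expression would show.
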
Replace custom function in Google Sheets with standard functions

I have a range of cells, and I want to count the number of times each column has the max value for its given row.
Sample:
headers -> a b c d e f g h
0 0 12 18* 1 0 0 0
30* 0 15 25 0 0 0 0
35 0 19 31 0 0 31 50*
40 10 19 31 0 2 5 55*
expected:
#max val per row-> 1 0 0 1 0 0 0 2
The maximum values are marked with an asterisk. The column a scores 1 because it has the maximum value in the second data row, the column d scores 1 as well because it has the maximum value in the first data row and the column h scores 2 because it has the maximum value in the third and fourth data rows. The rest of columns don't have the maximum value in any row, so they get a 0.
For just one row, I can copy this formula to each column and it would do the job, but I need something that applies the max row-wise: COUNTIF(B2:B10, MAX($B2:$B10)).
I have written this Google Apps Script, but I don't like its responsiveness (seeing the "Loading..." in the cell for almost a second is kind of exasperating compared with the snappiness you get with native functions):
function countMaxInRange(input) {
  return [input.map(function(row) {
    var m = Math.max.apply(null, row);
    return row.map(function(x) { return x === m && 1 || 0; });
  }).reduce(function(a, b) {
    var s = Array(a.length);
    for (var i = 0; i < a.length; i++) {
      s[i] = (a[i] + b[i]) || 0;
    }
    return s;
  })];
}
Any ideas on how I could replace that code with built-in functions? I don't mind adding auxiliary rows or columns, as long as it is a constant number of them (that is, if I extend my dataset I don't want to manually add more helper rows or columns for each new data row or column).
I think I could add an extra column that collects the header of the column with the max value for each row, and then for each data column count how many times its header appears in that auxiliary column, but that does not seem very clean.
FORMULA
=ArrayFormula(TRANSPOSE(
MMULT(
N(TRANSPOSE(NamedRange1)
=
INDEX(
QUERY(TRANSPOSE(NamedRange1),
"SELECT "&JOIN(",",("MAX(Col"&TRANSPOSE(ROW(NamedRange1))-INDEX(ROW(NamedRange1),1)+1)&")"
)),
2)
),
SIGN(ROW(NamedRange1))
)
))
where NamedRange1 is a named range that refers to the data range.
Conditional formatting:
Apply to range: A1:H4
Custom formula: =A1=MAX($A1:$H1)
Explanation
Summary
The above formula requires no extra columns; you only need to define the range as a named range. NamedRange1 was used in the formula, but you can choose any name you prefer.
The result is a 1 x n array, where n is the number of columns of NamedRange1. Each column holds the number of rows in which the corresponding column of NamedRange1 contains the row maximum.
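To make the expected 1 x n result concrete, here is a small Python sketch of the same computation (compare each cell with its row maximum, then add up the matches per column). It is only a reference for checking the formula's output, not a replacement for it; the function name count_row_maxima is invented here and the data is the sample from the question.

```python
def count_row_maxima(rows):
    """For each column, count the rows in which it holds the row maximum."""
    counts = [0] * len(rows[0])
    for row in rows:
        row_max = max(row)
        for i, value in enumerate(row):
            if value == row_max:   # the N(... = ...) comparison in the formula
                counts[i] += 1     # the MMULT step adds these up per column
    return counts


data = [
    [0, 0, 12, 18, 1, 0, 0, 0],
    [30, 0, 15, 25, 0, 0, 0, 0],
    [35, 0, 19, 31, 0, 0, 31, 50],
    [40, 10, 19, 31, 0, 2, 5, 55],
]
counts = count_row_maxima(data)  # [1, 0, 0, 1, 0, 0, 0, 2]
```

As with the equality test in the formula, a row with tied maxima scores one point for every column that holds the tied value.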
Featured "hacks"
ARRAYFORMULA returns an array of values.
Ranges greater than 1 x 1 are handled as arrays.
Using an array as an argument with some functions and operators works much like a loop. Here, this feature is used to build a SQL statement that gets the maximum value of each column of the input data. Note that the input data for QUERY is the transpose of NamedRange1.
N coerces TRUE/FALSE values to 1/0 respectively.
MMULT is used to compute the row sums.
Note: the +0 shown in the image was inserted to force Google Sheets to keep the line breaks added while editing a formula that originally had none. Without a significant change to the formula, the line breaks are automatically removed because of the formula/result caching feature of Google Sheets.
Reference
MMULT Usage by Adam Lusk.

Check column if has the same value on all the rows

How do I check whether a column has the same value in every row?
I don't think this will work.
SELECT column FROM table WHERE value = 1
What I want: over time, each row will turn from 0 to 1 until every row has the value 1. Once all the values are 1, I want to reset them all to 0.
id value
1 1
2 1
3 1
4 1
5 1
6 1
7 1
You can try to use distinct:
select count(distinct column) FROM table
If the result is 1, the column contains a single value in every row; otherwise it contains different values.
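A minimal sketch of this check, using Python's built-in sqlite3 module with an in-memory database. The table name items and the helper names all_same and reset_if_all_ones are assumptions made for this example; the question does not say which database or table is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the id/value layout from the question.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO items (value) VALUES (?)", [(1,)] * 7)


def all_same(conn):
    """True when the value column holds exactly one distinct value."""
    (distinct_count,) = conn.execute(
        "SELECT COUNT(DISTINCT value) FROM items"
    ).fetchone()
    return distinct_count == 1


def reset_if_all_ones(conn):
    """Set every value back to 0, but only if no row differs from 1."""
    (not_one,) = conn.execute(
        "SELECT COUNT(*) FROM items WHERE value <> 1"
    ).fetchone()
    if not_one == 0:
        conn.execute("UPDATE items SET value = 0")
        return True
    return False


result = all_same(conn)  # True: all seven rows hold 1
```

Note that COUNT(DISTINCT ...) ignores NULLs and returns 0 for an empty table, so both cases report "not all the same" here.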
Try this query:
select count(value) from table where value = 0;
If the returned count is zero, there are no zeroes in that column.
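Use count with distinct:
SELECT count(DISTINCT value) as total FROM table
If total > 1, the column holds more than one value. (A plain count(value) would count rows, not different values.)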

SQLite SELECT statement for column name where column has a value

What SQLite statement do I need to get the column name WHERE there is a value?
COLUMN NAME: ALPHA BRAVO CHARLIE DELTA ECHO
ROW VALUE: 0 1 0 1 1
All I want in my return is: Bravo, Delta, Echo.
Your request is not entirely clear, but you appear to be asking for a SELECT statement that returns not data but column names, and not a predictable number of values but a number of values that depends on the data in the table.
For instance,
A B C D E
0 1 0 1 1
would return (B,D,E) whereas
A B C D E
1 0 1 0 0
would return (A, C).
If that's what you're asking, this is not something that SQL does. SQL retrieves data from the table and an SQL result set always has the same number of columns per row.
To accomplish your goal, you would have to retrieve all columns that might have a value in the table and then, in your program code, check for the value in each column and accrue a list of column names that had values.
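As a rough sketch of that approach with Python's built-in sqlite3 module: select every column, then pick out the names in program code. The table name flags and the helper columns_with_value are hypothetical, and the row is the example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical wide table: one column per flag, as in the question.
conn.execute(
    "CREATE TABLE flags "
    "(ALPHA INTEGER, BRAVO INTEGER, CHARLIE INTEGER, DELTA INTEGER, ECHO INTEGER)"
)
conn.execute("INSERT INTO flags VALUES (0, 1, 0, 1, 1)")


def columns_with_value(conn, wanted=1):
    """For each row, list the names of the columns whose value equals `wanted`."""
    cur = conn.execute("SELECT * FROM flags")
    names = [col[0] for col in cur.description]  # column names of the result set
    return [
        [name for name, value in zip(names, row) if value == wanted]
        for row in cur
    ]


result = columns_with_value(conn)  # [['BRAVO', 'DELTA', 'ECHO']]
```

The function returns one list per row, which makes the multi-row question raised below explicit: each row can yield a different set of names.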
Also, consider what happens when there is more than one row to examine and the distribution of values differ. In other words, what's the expected result if the data looks like this:
A B C D E
- - - - -
0 1 0 1 1
1 0 1 0 0
[Also, note that all the columns in your example have values, some 0, some 1. What you really want is a list of column names where the column contains a value of 1.]
Finally, consider that your inability to easily get the results you need from your data might indicate a flaw in the data model you're using. For instance, if you were to structure your data like this:
TagName TagValue
------- --------
Alpha 0
Bravo 1
Charlie 0
Delta 1
Echo 1
you could then obtain your results with SELECT TagName FROM Tags WHERE TagValue = 1.
Furthermore, if 0 and 1 are really the only two possible values (indicating boolean "presence" or "absence" of the tag), then you could remove the TagValue column and the rows for Alpha and Charlie entirely (you'd INSERT a row into the table to add a tag and DELETE a row to remove it).
A design along these lines seems to model your data more accurately and allows you to add new tags to the system without having to issue an ALTER TABLE command.
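The restructured design can be tried out quickly with Python's built-in sqlite3 module. This is a small sketch assuming the Tags table and column names used above; the column types and the ORDER BY are choices made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per tag instead of one column per tag.
conn.execute(
    "CREATE TABLE Tags (TagName TEXT PRIMARY KEY, TagValue INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO Tags (TagName, TagValue) VALUES (?, ?)",
    [("Alpha", 0), ("Bravo", 1), ("Charlie", 0), ("Delta", 1), ("Echo", 1)],
)

# The query from the answer, with an ORDER BY added for a stable result order.
active = [
    name
    for (name,) in conn.execute(
        "SELECT TagName FROM Tags WHERE TagValue = 1 ORDER BY TagName"
    )
]
# active == ['Bravo', 'Delta', 'Echo']

# Adding a new tag is a plain INSERT; no ALTER TABLE is needed.
conn.execute("INSERT INTO Tags (TagName, TagValue) VALUES ('Foxtrot', 1)")
```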
http://sqlfiddle.com/#!9/1407e/1
SELECT CONCAT(IF(ALPHA,'ALPHA,',''),
IF(BRAVO,'BRAVO,',''),
IF(CHARLIE,'CHARLIE,',''),
IF(DELTA,'DELTA,',''),
IF(ECHO,'ECHO',''))
FROM table1

SAS : Eliminate duplicates if a condition is satisfied

I want to eliminate duplicates from a dataset, based on an identifier, an order and a condition.
More precisely, I have data with several observations. Sometimes a condition holds that makes me want to keep that observation anyway (call it condition=1), and in that case I also want to keep the other observations with the same identifier, even if the condition does not hold for them (condition=0).
But if one identifier has several observations and all of them have condition=0, then I want to eliminate duplicates, keeping the observation with the greatest order.
Without the condition I can do this:
proc sort data=have;
by identifier descending order;
run;
proc sort nodupkey data=have;
by identifier;
run;
But how do I incorporate my condition into this?
Edit 1: added an example dataset:
data Test;
input identifier $ order condition;
datalines;
1023 1 0
1023 2 0
1064 2 0
1064 1 0
1098 1 0
1098 1 1
;
Then I want to keep
1023 2 0
1064 2 0
1098 1 0
1098 1 1
Edit 2: tried to clarify my conditions.
I presume you want to eliminate duplicates only when the condition for all records for an identifier is set to 0. In that case you want to keep the record with the maximum order and eliminate all other records with the same identifier.
proc sql;
  create table want as
  select *
  from test
  group by identifier
  having max(condition) ne 0
      or order eq max(order)
  ;
quit;
This will keep all rows for an Identifier where the maximum condition = 1,
or in the case of those where maximum condition = 0, select the row with the maximum order.
Is that what you want?
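To check the rule against the example data without running SAS, here is a short Python sketch of the same logic: keep the whole group when any record has condition=1, otherwise keep only the record(s) with the greatest order. The function name dedupe and the tuple layout (identifier, order, condition) are assumptions for this illustration.

```python
from collections import defaultdict


def dedupe(records):
    """records: iterable of (identifier, order, condition) tuples."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[0]].append(rec)

    kept = []
    for identifier in sorted(groups):
        rows = groups[identifier]
        if max(r[2] for r in rows) != 0:
            # max(condition) ne 0: keep every record of this identifier
            kept.extend(rows)
        else:
            # otherwise keep only rows where order eq max(order)
            top = max(r[1] for r in rows)
            kept.extend(r for r in rows if r[1] == top)
    return kept


test = [
    ("1023", 1, 0), ("1023", 2, 0),
    ("1064", 2, 0), ("1064", 1, 0),
    ("1098", 1, 0), ("1098", 1, 1),
]
want = dedupe(test)
# [('1023', 2, 0), ('1064', 2, 0), ('1098', 1, 0), ('1098', 1, 1)]
```

Like the HAVING clause, this keeps all tied records when several rows in a condition=0 group share the maximum order.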
Some of this depends on how you define 'condition'. Is your condition easily verifiable on every record for that identifier? Then you can do something like this.
Evaluate the condition.
For records where it is true (you want to remove the duplicate), set flag=0. For records where it is not true, increment the condition flag by one.
If the condition is true for all records in that ID, all will have the same value (flag=0) and nodupkey on by identifier flag; will remove extras. If the condition is false for all records, those will not be removed. If it's true for some and false for some, and you want to remove only some of the records with that identifier (only the duplicates where it is true), then you have to make sure that either it's sorted to have all of the condition=true records at top, or have a separate flag counter that determines what value the flag will be (since it sometimes will go to 0 in the middle, so 0 0 0 1 2 3 0 4 5 6 is what you want, not 0 0 0 1 2 3 0 0 1 2 ).
Perhaps easier to see is to do it within a datastep. After sorting by identifier descending order:
data want;
set have;
by identifier descending order;
if (condition=true) and not (first.identifier) then delete;
run;
This will, again, work if either condition=true is always at the top, or if it's always consistent within one ID group. If it's inconsistent and mixed, then you need to keep track of whether you've kept one where it was true (assuming you want to), or it might delete all records where it is true; use a separate variable to keep track of how many you've kept. first.identifier will be 1/TRUE for the first record for that identifier only, not taking into account the condition. You could also create the flag, then sort by identifier flag descending order; and guarantee the condition=true are at the top (either by making flag=0 for true, or sorting by descending flag.)