Finding the maximum value between certain rows and columns in pandas df - csv

Suppose I have the dataframe below:
df = pd.DataFrame({'group1': ['x','xincr','xmin','xzero','yzero','ymin','s','0','1','2','3','4','5'],
'value1': [1.1,2,3,4,5,6,7,8,9,1,2,3,4]})
I want to find the maximum value in column 'value1' within rows 7-12. Is there a way to specify that?
Furthermore, can the output be just the value (i.e. 9)?
Thank you.

This is an example of mixed indexing, meaning you want to use labels for the columns and positions for the rows. There are a few ways to do this.
Option 1
Use .value1 to select the column, then iloc with 6:12 to select rows 7 through 12 by position (positions are zero-based and the end of the slice is excluded).
df.value1.iloc[6:12].max()
9.0
Option 2
df.iloc[6:12, df.columns.get_loc('value1')].max()
Option 3
df.value1.values[6:12].max()
Any more options and I'll feel silly. This should do.
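For reference, here is a minimal, self-contained sketch that runs the three options side by side on the dataframe from the question. The names opt1, opt2, opt3 and result are only illustrative, and the final float() conversion is an optional extra step, assuming a plain Python number is wanted rather than a NumPy scalar.

```python
import pandas as pd

df = pd.DataFrame({
    'group1': ['x', 'xincr', 'xmin', 'xzero', 'yzero', 'ymin', 's',
               '0', '1', '2', '3', '4', '5'],
    'value1': [1.1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4],
})

# Option 1: pick the column by label, then slice the rows by position.
opt1 = df.value1.iloc[6:12].max()

# Option 2: purely positional; get_loc turns the column label into a position.
opt2 = df.iloc[6:12, df.columns.get_loc('value1')].max()

# Option 3: drop down to the underlying NumPy array and slice that.
opt3 = df.value1.values[6:12].max()

# Each result is a NumPy float scalar; float() yields a plain Python float.
result = float(opt1)
print(result)  # 9.0
```

All three expressions look at positions 6 through 11, which are rows 7 through 12 when counting from one.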

Related

Column mean function with many columns

This seems so simple...but I am lost.
I have a data frame (over 2500 columns) and want to add new columns that are the means of certain existing columns. The column names were numeric, so I added an X to each column header. I would like to get the mean of 500 of these columns, and I am looking for code that calculates the means of columns with character headings.
Each row is a sample, and for each sample I would like to get the mean of columns X350 to X420.
For example: DF$blue <- mean(DF$X350: DF$X355), i.e., the mean of columns X350 to X355 in a new column labeled blue.
Advice greatly appreciated, Thanks!
I think I first have to make the data numeric. I have tried to unlist, but I am lost as to what to do with the list.
x <- list(DF,4:2504)
x_num <- as.numeric(unlist(x))

SPSS equivalent to Excel IFS statement

I am wondering how I can set things up so that if the value in one column is 1, the value in this column is 5, and if the value is 0, the value in this column is 10.
From what I've seen so far, I am only able to make it so that if the value in one column is 1, the value in this column is 5, and if the value is anything else, the cell in this column is blank.
This will do it (and yes, so much easier than ifs :)
recode thiscolumn (1=5)(0=10) into thatcolumn.
Now this solves your example, but recode can also solve more complex scenarios. Here I combine a few examples:
recode thiscolumn (lo thr 0=-1)
(0 3=sysmis)
(1 2 4=1)
(5 thr 12=12)
(22 thr hi=22)
(miss=99)
(else=copy)
into thatcolumn.

How do I replace values in a column in KNIME?

I have a column of countries with 50 different values that I want to reduce to United States and Other.
Can someone help me with that?
Another example is Age which has 48 values that I'd like to reduce to only 4 like 1 to 18 = youth, 18-27 = starting, etc.
I've actually got about 5 columns that I want to reduce the values of. So would I need to repeat the process multiple times in KNIME or can I accomplish multiple column value replacements at once?
The latter one can easily be achieved with the Rule Engine:
$Col0$ >= 1 AND $Col0$ < 18 => "youth"
For the first problem I'd use a String Replace (Dictionary).
I don't think you can replace everything at once, but you can loop over the columns.
For the second case I would use Numeric Binner:
For each column a number of intervals - known as bins - can be
defined. Each of these bins is given a unique name (for this column),
a defined range, and open or closed interval borders. They
automatically ensure that the ranges are defined in descending order
and that interval borders are consistent. In addition, each column is
either replaced with the binned, string-type column, or a new binned,
string-type column is appended.

finding the max value of every 24 rows in a matrix of a certain column

I have imported a huge Excel file into MATLAB. The file is a database with 5 columns and 175000 rows. I want the maximum value of every 24 rows of the third column.
Can anyone help me, please?
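I hope I understood what you want.
I believe you can do something like this
(forgive me, I don't normally write MATLAB code, so treat this as a sketch; data stands for your imported matrix):
col = 3;
blockSize = 24;
numBlocks = floor(size(data, 1) / blockSize);  % a trailing incomplete block is ignored
maxima = zeros(numBlocks, 1);
for k = 1:numBlocks
    rows = (k-1)*blockSize + 1 : k*blockSize;  % rows belonging to block k
    maxima(k) = max(data(rows, col));          % maximum of column col within block k
end
maxima then holds the maximum of the third column for every 24 rows. Hope this helps.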

How do I retrieve only the top x rows from a flatfile in SSIS

I have a flatfile connection and I'm only interested in the first 10 rows of data. How can I just import the first 10 rows?
Row sampling is random so I can't use that. Is there some way I can have some sort of derived column which is an automatic row number or something and then data-split to only keep rows with that id <= 10?
Any help much appreciated!
I've used this component --> http://www.sqlis.com/post/Row-Number-Transformation.aspx
The component creates a new variable with a row number. You can use a conditional split to take the first 10 records based on the variable the component creates.
One catch is that you will need to read in the entire file. Depending on your file size you may want to seek another solution.
There isn't a direct way of doing that. You can try a workaround using the "Data rows to skip" property:
You can "invert" your file and then skip all rows except the last 10 (that is, the total row count minus 10).
Just use a lineCount component with a user variable and a conditional split based on the value of that variable.