Count cells with same string in dynamic range - google-apps-script

I've read many articles on Google and StackOverflow, but haven't found any that explain how to count cells (in the same column) containing the same string value. The count should only consider part of the sheet: cells are added and removed frequently, so the range keeps changing length. The same sheet holds several ranges, separated by a blank row.
Each counter should refer to a single range (counter_1 --> range_1; counter_2 --> range_2, etc.).
E.g.: if cells can show 4 different options AND there are 5 dynamic ranges in the sheet --> there will be 4 counters for each range (4*5).
Following several websites (like this, this and this), I attempted to implement this check function directly from the sheet, without involving AppsScript.
E.g.: if I add these functions in D2, E2, F2, G2 (see the table below for reference):
COUNTIF(B2:B2,"1st option") in D2 ; COUNTIF(B2:B2,"2nd option") in E2 ; COUNTIF(B2:B2,"3rd option") in F2 ; COUNTIF(B2:B2,"4th option") in G2
Each counter will check its condition and update its cell value. This will be done only for cells grouped under "1st department".
The problem is that I have to add 16 counters manually (4 options for 4 departments) and, if an item is added or removed, all the counters throw an error. I can't split the departments across different sheets as a workaround.
My sheet is as follows:
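| Department | Option       | 1st option counter | 2nd option counter | 3rd option counter | 4th option counter |
|------------|--------------|--------------------|--------------------|--------------------|--------------------|
| 1st        | "2nd option" | 0                  | 1                  | 0                  | 0                  |
| 2nd        | "1st option" | 1                  | 1                  | 0                  | 0                  |
| 2nd        | "2nd option" |                    |                    |                    |                    |
| 3rd        | "4th option" | 0                  | 1                  | 0                  | 1                  |
| 3rd        | "2nd option" |                    |                    |                    |                    |
| 4th        | "3rd option" | 0                  | 2                  | 1                  | 0                  |
| 4th        | "2nd option" |                    |                    |                    |                    |
| 4th        | "2nd option" |                    |                    |                    |                    |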
After some items were added/removed:
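| Department | Option       | 1st option counter | 2nd option counter | 3rd option counter | 4th option counter |
|------------|--------------|--------------------|--------------------|--------------------|--------------------|
| 1st        | "2nd option" | 0                  | 2                  | 0                  | 0                  |
| 1st        | "2nd option" |                    |                    |                    |                    |
| 2nd        | "1st option" | 2                  | 1                  | 0                  | 0                  |
| 2nd        | "2nd option" |                    |                    |                    |                    |
| 2nd        | "1st option" |                    |                    |                    |                    |
| 3rd        | "4th option" | 0                  | 1                  | 1                  | 1                  |
| 3rd        | "2nd option" |                    |                    |                    |                    |
| 3rd        | "3rd option" |                    |                    |                    |                    |
| 4th        | "3rd option" | 0                  | 0                  | 2                  | 0                  |
| 4th        | "3rd option" |                    |                    |                    |                    |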
Any help would be appreciated.

Here's a non-array answer. The only reason it might be preferable to the other two answers is if you have a lot of calculations going on and begin to hit performance issues. The obvious drawback of the formula below is that you have to reapply it by dragging it down after changes are made. You could use Apps Script to reapply the formula in R1C1 notation during an onEdit event.
Put this in every cell of columns D:G, assuming D1:G1 hold the option labels to match (e.g. D1 = 1st option):
=if(And($A2<>"",OR(Row($A2)=2,$A1="")),SUMPRODUCT((--($A:$A=$A2))*(--(D$1=$B:$B))),)
Again the first two answers offer a dynamic solution, which is probably better, but I figured I'd add this just for illustration or maybe to ignite some other ideas.
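For anyone who would rather compute the counters in a script than in formulas, the per-department counting can be sketched in plain JavaScript. This is only an illustration under assumptions: `countOptionsPerGroup` is a made-up name, `rows` stands for the values of A2:B, and `options` for the labels in D1:G1. It puts the counts on the first row of each department block and blanks everywhere else, like the tables in the question.

```javascript
// Hypothetical helper (not an Apps Script API).
// rows:    2-D array of [department, option] pairs, e.g. the values of A2:B
// options: the option labels to count, e.g. the values of D1:G1
function countOptionsPerGroup(rows, options) {
  // First pass: tally how often each option occurs per department.
  const tallies = new Map();
  for (const [dept, option] of rows) {
    if (dept === "") continue; // blank separator row
    if (!tallies.has(dept)) tallies.set(dept, new Map());
    const tally = tallies.get(dept);
    tally.set(option, (tally.get(option) || 0) + 1);
  }
  // Second pass: counts on the first row of each block, blanks elsewhere.
  let previous = "";
  return rows.map(([dept]) => {
    const isFirst = dept !== "" && dept !== previous;
    previous = dept;
    return options.map((o) => (isFirst ? tallies.get(dept).get(o) || 0 : ""));
  });
}
```

In Apps Script, the returned 2-D array could be written back with setValues on a range of the same size, for example from an onEdit trigger.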

You can try with this formula in D2:
=MAKEARRAY(ROWS(A2:A);4;LAMBDA(r;c;IF(AND(INDEX(A2:A;r)<>"";INDEX(A1:A;r)<>INDEX(A2:A;r));COUNTIFS(A2:A;INDEX(A2:A;r);B2:B;INDEX(D1:1;c));"")))
You can see it working here

function countcellswithsamestring() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const osh = ss.getSheetByName("Sheet1");
  const sr = 2; // data start row
  // The range width must come from getLastColumn(). The original passed getLastRow() here,
  // which also scanned blank columns K and L; that is where the empty-string row
  // in the sample output below comes from.
  const rg = sh.getRange(sr, 1, sh.getLastRow() - sr + 1, sh.getLastColumn());
  const row = rg.getRow();
  const col = rg.getColumn();
  const vs = rg.getDisplayValues();
  // co maps each string to its count and cell locations; co.pA keeps first-seen order
  let co = { pA: [] };
  vs.forEach((r, i) => {
    r.forEach((c, j) => {
      if (!co.hasOwnProperty(c)) {
        co[c] = { count: 1, loc: [sh.getRange(row + i, col + j).getA1Notation()] };
        co.pA.push(c);
      } else {
        co[c].count++;
        co[c].loc.push(sh.getRange(row + i, col + j).getA1Notation());
      }
    });
  });
  let o = co.pA.map(c => [c, co[c].count, co[c].loc.join(',')]);
  osh.clearContents();
  o.unshift(["String", "Count", "Locations"]);
  osh.getRange(1, 1, o.length, o[0].length).setValues(o);
}
Data:
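| COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 | COL8 | COL9 | COL10 |
|------|------|------|------|------|------|------|------|------|-------|
| 6    | 10   | 0    | 5    | 1    | 2    | 4    | 5    | 2    | 3     |
| 5    | 7    | 1    | 5    | 8    | 0    | 9    | 8    | 3    | 8     |
| 5    | 1    | 5    | 5    | 0    | 4    | 8    | 6    | 0    | 3     |
| 7    | 4    | 0    | 6    | 3    | 8    | 9    | 8    | 3    | 5     |
| 4    | 7    | 5    | 1    | 7    | 9    | 4    | 6    | 3    | 9     |
| 0    | 0    | 0    | 7    | 4    | 7    | 9    | 2    | 6    | 1     |
| 4    | 2    | 10   | 10   | 4    | 4    | 6    | 6    | 6    | 9     |
| 7    | 0    | 10   | 0    | 2    | 10   | 8    | 0    | 8    | 1     |
| 0    | 0    | 0    | 0    | 6    | 9    | 1    | 4    | 7    | 8     |
| 8    | 9    | 5    | 3    | 5    | 8    | 1    | 4    | 1    | 6     |
| 9    | 5    | 6    | 7    | 1    | 4    | 2    | 5    | 8    | 7     |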
Output:
| String  | Count | Locations |
|---------|-------|-----------|
| 6       | 11    | A2,H4,D5,H6,I7,G8,H8,I8,E10,J11,C12 |
| 10      | 5     | B2,C8,D8,C9,F9 |
| 0       | 15    | C2,F3,E4,I4,C5,A7,B7,C7,B9,D9,H9,A10,B10,C10,D10 |
| 5       | 13    | D2,H2,A3,D3,A4,C4,D4,J5,C6,C11,E11,B12,H12 |
| 1       | 10    | E2,C3,B4,D6,J7,J9,G10,G11,I11,E12 |
| 2       | 6     | F2,I2,H7,B8,E9,G12 |
| 4       | 12    | G2,F4,B5,A6,G6,E7,A8,E8,F8,H10,H11,F12 |
| 3       | 7     | J2,I3,J4,E5,I5,I6,D11 |
| (blank) | 22    | K2,L2,K3,L3,K4,L4,K5,L5,K6,L6,K7,L7,K8,L8,K9,L9,K10,L10,K11,L11,K12,L12 |
| 7       | 10    | B3,A5,B6,E6,D7,F7,A9,I10,D12,J12 |
| 8       | 12    | E3,H3,J3,G4,F5,H5,G9,I9,J10,A11,F11,I12 |
| 9       | 9     | G3,G5,F6,J6,G7,J8,F10,B11,A12 |

Google Sheets - Script to move columns given the column header

Given this table schema:
| Col_France | Col_Argentina | Col_Croatia | Col_Morocco |
|------------|---------------|-------------|-------------|
| x          | x             | x           | x           |
| x          | x             | x           | x           |
I want to create a Google Script that rearranges the columns so the order is always:
Col_Argentina -> Column 1
Col_France -> Column 2
Col_Croatia -> Column 3
Col_Morocco -> Column 4
Because the original column order of the given table is not always as described above, I cannot simply use:
var sheet = SpreadsheetApp.getActiveSheet();
// Selects Col_France.
var columnSpec = sheet.getRange("A1");
sheet.moveColumns(columnSpec, 2);
and so on... In other words, the table schema can possibly be:
| Col_Morocco | Col_Croatia | Col_France | Col_Argentina |
|-------------|-------------|------------|---------------|
| x           | x           | x          | x             |
| x           | x           | x          | x             |
but the desired outcome should always be the one defined above. The script should be scalable: in the future, more than 4 columns may need to be rearranged.
My approach would be:
Define the range of columns to rearrange (they are all together)
For the first column, get the value of the column header
Depending on the value, move the column to a predefined index
Move to the next column and repeat
Iterate until end of range
Can somebody please point me to the required functions?
In your situation, when moveColumns is used, how about the following sample script?
Sample script:
function myFunction() {
  var order = ["Col_Argentina", "Col_France", "Col_Croatia", "Col_Morocco"]; // This is from your question.
  var sheet = SpreadsheetApp.getActiveSheet();
  // For each header, record its current column (from) and its target column (to), sorted by target.
  var obj = sheet.getRange(1, 1, 1, sheet.getLastColumn()).getValues()[0]
    .reduce((ar, h, i) => [...ar, { from: i + 1, to: order.indexOf(h) + 1 }], [])
    .sort((a, b) => a.to > b.to ? 1 : -1);
  for (var i = 0; i < obj.length; i++) {
    if (obj[i].from != obj[i].to) {
      sheet.moveColumns(sheet.getRange(1, obj[i].from), obj[i].to);
      // Columns that were left of the moved one have shifted right by one.
      obj.forEach((e, j) => {
        if (e.from < obj[i].from) obj[j].from += 1;
      });
    }
  }
}
When this script is run, the columns are rearranged into the order you give. The text and the cell formats are moved as well.
Note that moveColumns(columnSpec, destinationIndex) changes the column indexes every time it runs, so please be careful about this. The script above takes the shifted indexes into account.
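To see the index shifting without touching a sheet, the moves can be simulated on a plain array of headers. This is a sketch under assumptions, not the script above: `planColumnMoves` is a hypothetical helper, headers are assumed to be unique, and instead of adjusting stored offsets it simply looks each header up again after every move.

```javascript
// Hypothetical helper, not part of the Apps Script API.
// headers: current header row; order: desired header order.
// Returns the 1-based { from, to } moves plus the header row after all moves.
function planColumnMoves(headers, order) {
  const current = headers.slice();
  // Ignore wanted headers that are not present in the sheet.
  const wanted = order.filter((h) => current.includes(h));
  const moves = [];
  wanted.forEach((header, target) => {
    // Look the column up again after every move, because earlier moves
    // shift every column that sat between the target and the source.
    const from = current.indexOf(header);
    if (from !== target) {
      current.splice(from, 1);           // take the column out
      current.splice(target, 0, header); // put it back at its target slot
      moves.push({ from: from + 1, to: target + 1 });
    }
  });
  return { moves, result: current };
}
```

Every planned move goes leftwards, so each `{ from, to }` pair should correspond to a call like sheet.moveColumns(sheet.getRange(1, from), to); that mapping is my assumption and worth checking on a copy of the sheet first.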
References:
moveColumns(columnSpec, destinationIndex)
reduce()
forEach()
Order Columns:
function ordercols() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const [h, ...vs] = sh.getDataRange().getValues();
  // map each header name to its column index
  const idx = {};
  h.forEach((h, i) => idx[h] = i);
  const o = vs.map(r => [r[idx['COL4']], r[idx['COL3']], r[idx['COL2']], r[idx['COL1']]]);
  sh.clearContents();
  o.unshift(['COL4', 'COL3', 'COL2', 'COL1']);
  sh.getRange(1, 1, o.length, o[0].length).setValues(o);
}
Data:
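| COL1 | COL2 | COL3 | COL4 |
|------|------|------|------|
| 24   | 5    | 2    | 9    |
| 16   | 0    | 13   | 18   |
| 22   | 24   | 23   | 16   |
| 12   | 12   | 4    | 17   |
| 6    | 20   | 17   | 14   |
| 7    | 13   | 4    | 2    |
| 2    | 20   | 4    | 22   |
| 3    | 5    | 3    | 4    |
| 16   | 5    | 7    | 23   |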
ReOrdered:
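| COL4 | COL3 | COL2 | COL1 |
|------|------|------|------|
| 9    | 2    | 5    | 24   |
| 18   | 13   | 0    | 16   |
| 16   | 23   | 24   | 22   |
| 17   | 4    | 12   | 12   |
| 14   | 17   | 20   | 6    |
| 2    | 4    | 13   | 7    |
| 22   | 4    | 20   | 2    |
| 4    | 3    | 5    | 3    |
| 23   | 7    | 5    | 16   |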

How to check for the rows continuity of an event in Google Sheet

I need help in Google Sheets to check for the continuity of an event which is a cell value say "3". The continuity needs to be checked for the last 7 cells. If the condition is satisfied (doesn't matter how many times) the value in the Result column is 1.
Please help in solving the problem.
Refer to the attached image for illustration.
Try in P2
=--ARRAYFORMULA(MAX((IFERROR({SPLIT(TEXTJOIN("|",TRUE,IF(A2:O2="",COLUMN(A2:N2),"")),"|"),16}-IFERROR({0,SPLIT(TEXTJOIN("|",TRUE,IF(A2:O2="",COLUMN(A2:N2),"")),"|")})-1)))>=7)
Drag it down to the bottom.
I compare each cell with the next one and, where they differ, retrieve the column number. Then I build two arrays, the second shifted one position to the right. Their difference gives the lengths of the intervals of "3"; MAX checks whether any interval is long enough, which returns TRUE or FALSE. The double minus turns that into 1 or 0, the value the OP asked for.
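The interval idea can be sketched outside the sheet as well. Below is a plain JavaScript illustration under assumptions: `hasRunByGaps` is a made-up name, `row` stands for the 15 values of A2:O2, and blank cells are represented as empty strings. It records the positions that break the run, adds a sentinel at each end (0 and 16 for 15 columns), and checks whether any gap between neighbouring breaks leaves room for the required run.

```javascript
// Hypothetical helper: returns 1 when `row` contains at least `minRun`
// consecutive cells equal to `target`, otherwise 0.
function hasRunByGaps(row, target, minRun) {
  // 1-based positions of cells that break the run, with sentinels at both ends.
  const breaks = [0];
  row.forEach((cell, i) => {
    if (cell !== target) breaks.push(i + 1);
  });
  breaks.push(row.length + 1);
  // The gap between two neighbouring breaks, minus one, is a run length.
  let longest = 0;
  for (let i = 1; i < breaks.length; i++) {
    longest = Math.max(longest, breaks[i] - breaks[i - 1] - 1);
  }
  return longest >= minRun ? 1 : 0;
}
```

For a row of fifteen 3s the only breaks are the two sentinels, so the longest run is 15 and the result is 1.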
Continuity Checker
function continuityChecker() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const vs = sh.getRange(2, 1, sh.getLastRow() - 1, 15).getValues();
  const o = vs.reduce((a, r, j) => {
    r.forEach((c, i) => {
      // start of a new row: reset the running count and default the result to 0
      if (i == 0) { a.sum = 0; a.o.push([0]); }
      // any non-empty cell extends the run; an empty cell resets it
      if (c) { a.sum++; } else { a.sum = 0; }
      if (a.sum == 7) a.o[a.o.length - 1] = [1];
    });
    return a;
  }, { o: [], sum: 0, output: function () { return this.o; } }).output();
  sh.getRange(2, 16, o.length, o[0].length).setValues(o);
}
Output:
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Result
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
1
3
3
3
3
3
3
3
3
3
3
3
3
3
0
3
3
3
3
3
3
3
3
3
3
3
3
3
1
3
3
3
3
3
3
3
3
3
3
3
3
3
3
1

Printing rows with certain text in column 2 with app script

I need to print off certain rows in a google sheet depending on what is in column 2 of that row. I know how to find the rows with a for loop but the rest eludes me. Perhaps my googling skills are rusty.
This is what I have.
var app = SpreadsheetApp;
var rows = app.getActive().getSheetByName("Sheet").getMaxRows().toString();
var rows = rows.replace(".0","");
function findRows(){
  for (var counter = 1; counter <= rows; counter = counter+1){
    if(app.getActive().getSheetByName("Sheet").getRange(counter, 2) == "example" || "example2"){
    }
  }
}
Find the correct rows
function findrows() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const osh = ss.getSheetByName("Sheet1"); // getSheetByName is a Spreadsheet method, so call it on ss, not sh
  const vs = sh.getDataRange().getValues();
  // keep only the rows whose second column matches one of the target strings
  const s = vs.filter(r => r[1] == "Example" || r[1] == "example2");
  Logger.log(JSON.stringify(s));
  // you can output to a sheet with something like
  // sheet.getRange(1,1,s.length,s[0].length).setValues(s);
  osh.getRange(1, 1, s.length, s[0].length).setValues(s); // put on another sheet
}
Execution log
4:56:34 PM Notice Execution started
4:56:35 PM Info [[2,"Example",4,5],[5,"Example",7,8],[9,"Example",11,12],[12,"Example",14,15]]
4:56:35 PM Notice Execution completed
Data:
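| COL1 | COL2    | COL3 | COL4 |
|------|---------|------|------|
| 1    | 2       | 3    | 4    |
| 2    | Example | 4    | 5    |
| 3    | 4       | 5    | 6    |
| 4    | 5       | 6    | 7    |
| 5    | Example | 7    | 8    |
| 6    | 7       | 8    | 9    |
| 7    | 8       | 9    | 10   |
| 8    | 9       | 10   | 11   |
| 9    | Example | 11   | 12   |
| 10   | 11      | 12   | 13   |
| 11   | 12      | 13   | 14   |
| 12   | Example | 14   | 15   |
| 13   | 14      | 15   | 16   |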
BTW Printing is not easily done from Javascript or Google Apps Script
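As a side note on the condition in the question: `x == "example" || "example2"` is parsed as `(x == "example") || "example2"`, so it is truthy for every row, and getRange(counter, 2) returns a Range object, so its value has to be read with getValue() before comparing. A minimal sketch of the row filter on a plain 2-D array, assuming a made-up helper name `findRows` and a 0-based column index:

```javascript
// Why the original test always passes: for a non-matching value the
// expression evaluates to the string "example2", which is truthy.
const pitfall = ("other" == "example" || "example2");

// Hypothetical helper: keep the rows whose cell at columnIndex (0-based)
// equals one of the wanted strings.
function findRows(rows, columnIndex, wanted) {
  const wantedSet = new Set(wanted);
  return rows.filter((row) => wantedSet.has(row[columnIndex]));
}
```

With the sample data above, findRows(values, 1, ["Example", "example2"]) keeps the same four rows that appear in the execution log.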

Iterating through CSV reader to slice data frame

I have a data frame that contains 508383 rows. I am only showing the first 10 rows.
0 1 2
0 chr3R 4174822 4174922
1 chr3R 4175400 4175500
2 chr3R 4175466 4175566
3 chr3R 4175521 4175621
4 chr3R 4175603 4175703
5 chr3R 4175619 4175719
6 chr3R 4175692 4175792
7 chr3R 4175889 4175989
8 chr3R 4175966 4176066
9 chr3R 4176044 4176144
I want to iterate through each row and check the value of column #2 of the first row to the value of the next row. I want to check if the difference between these values is less than 5000. If the difference is greater than 5000 then I want to slice the data frame from the first row to the previous row and have this be a subset data frame.
I then want to repeat this process and create a second subset data frame. I've only managed to get this done by using the CSV reader in combination with Pandas.
Here is my code:
#!/usr/bin/env python
import pandas as pd
data = pd.read_csv('sort_cov_emb_sg.bed', sep='\t', header=None, index_col=None)
import csv
file = open('sort_cov_emb_sg.bed')
readCSV = csv.reader(file, delimiter="\t")
first_row = readCSV.next()
print first_row
count_1 = 0
while count_1 < 100000:
    next_row = readCSV.next()
    value_1 = int(next_row[1]) - int(first_row[1])
    count_1 = count_1 + 1
    if value_1 < 5000:
        continue
    else:
        break
print next_row
print count_1
print value_1
window_1 = data[0:63]
print window_1
first_row = readCSV.next()
print first_row
count_2 = 0
while count_2 < 100000:
    next_row = readCSV.next()
    value_2 = int(next_row[1]) - int(first_row[1])
    count_2 = count_2 + 1
    if value_2 < 5000:
        continue
    else:
        break
print next_row
print count_2
print value_2
window_2 = data[0:74]
print window_2
I wanted to know if there is a better way to do this (without repeating the code every time) and get all the subset data frames I need.
Thanks.
Rodrigo
This is yet another example of the compare-cumsum-groupby pattern. Using only rows you showed (and so changing the diff to 100 instead of 5000):
jumps = df[2] > df[2].shift() + 100
grouped = df.groupby(jumps.cumsum())
for k, group in grouped:
    print(k)
    print(group)
produces
0
0 1 2
0 chr3R 4174822 4174922
1
0 1 2
1 chr3R 4175400 4175500
2 chr3R 4175466 4175566
3 chr3R 4175521 4175621
4 chr3R 4175603 4175703
5 chr3R 4175619 4175719
6 chr3R 4175692 4175792
2
0 1 2
7 chr3R 4175889 4175989
8 chr3R 4175966 4176066
9 chr3R 4176044 4176144
This works because the comparison gives us a new True every time a new group starts, and when we take the cumulative sum of that, we get what is effectively a group id, which we can group on:
>>> jumps
0 False
1 True
2 False
3 False
4 False
5 False
6 False
7 True
8 False
9 False
Name: 2, dtype: bool
>>> jumps.cumsum()
0 0
1 1
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 2
Name: 2, dtype: int32

how to select/add a column to pandas dataframe based on a non trivial function of other columns

This is a followup question for this one: how to select/add a column to pandas dataframe based on a function of other columns?
I have a data frame and I want to select the rows that match some criteria. The criteria are a function of the values of other columns and some additional values.
Here is a toy example:
>> df = pd.DataFrame({'A': [1,2,3,4,5,6,7,8,9],
'B': [randint(1,9) for x in xrange(9)],
'C': [4,10,3,5,4,5,3,7,1]})
>>
A B C
0 1 6 4
1 2 8 10
2 3 8 3
3 4 4 5
4 5 2 4
5 6 1 5
6 7 1 3
7 8 2 7
8 9 8 1
I want to select all rows for which some non-trivial function returns True, e.g. f(a,c,L), where L is a list of lists and f returns True iff a and c are not part of the same sublist.
That is, if L = [[1,2,3],[4,2,10],[8,7,5,6,9]] I want to get:
A B C
0 1 6 4
3 4 4 5
4 5 2 4
6 7 1 3
8 9 8 1
Thanks!
Here is a VERY VERY hacky and non-elegant solution. As another disclaimer, since your question doesn't state what you want to do if a number in the column is in none of the sub lists this code doesn't handle that in any real way besides any default functionality within isin().
import pandas as pd
df = pd.DataFrame({'A': [1,2,3,4,5,6,7,8,9],
'B': [6,8,8,4,2,1,1,2,8],
'C': [4,10,3,5,4,5,3,7,1]})
L = [[1,2,3],[4,2,10],[8,7,5,6,9]]
df['passed1'] = df['A'].isin(L[0])
df['passed2'] = df['C'].isin(L[0])
df['1&2'] = (df['passed1'] ^ df['passed2'])
df['passed4'] = df['A'].isin(L[1])
df['passed5'] = df['C'].isin(L[1])
df['4&5'] = (df['passed4'] ^ df['passed5'])
df['passed7'] = df['A'].isin(L[2])
df['passed8'] = df['C'].isin(L[2])
df['7&8'] = (df['passed7'] ^ df['passed8'])
df['PASSED'] = df['1&2'] & df['4&5'] ^ df['7&8']
del df['passed1'], df['passed2'], df['1&2'], df['passed4'], df['passed5'], df['4&5'], df['passed7'], df['passed8'], df['7&8']
df = df[df['PASSED'] == True]
del df['PASSED']
With an output that looks like:
A B C
0 1 6 4
3 4 4 5
4 5 2 4
6 7 1 3
8 9 8 1
I implemented this rather quickly hence the utter and complete ugliness of this code, but I believe you can refactor it any way you would like (e.g. iterate over the original set of lists with for sub_list in L, improve variable names, come up with a better solution, etc).
Hope this helps. Oh, and did I mention this was hacky and not very good code? Because it is.