Merge duplicate cells - Google Sheets - google-apps-script

The intended operation is to search column A for duplicate values (column is already sorted). Each duplicate value in A should be merged into 1 cell. Also, merge the same rows in B,C,D,E,F,G,H (take the top value if different, but safe to assume they are the same).
THANK YOU!
From this:
https://imgur.com/a/WBZEB4M
To this:
https://imgur.com/a/4rkusg4
I'm doing that manually for each order that is created and it's a huge waste of time.

Try Data > Data cleanup > Remove duplicates.
Alternatively, you can Insert > New sheet and put this formula in cell A2 of the new sheet:
=arrayformula(
iferror(
vlookup(
unique(Sheet1!A2:A),
Sheet1!A2:H,
column(Sheet1!A2:H),
false
)
)
)
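If you would rather do this in Apps Script, the same "keep the top row for each value in column A" idea can be sketched in plain JavaScript. This is only a sketch: the helper name keepFirstPerKey and the sample rows are made up for illustration, and it assumes the rows have already been read into a 2D array (for example with getValues()).

```javascript
// Hypothetical helper: keep only the first row for each distinct value in column A.
// `rows` is a 2D array such as the one returned by Range.getValues().
function keepFirstPerKey(rows) {
  var seen = new Set();
  var result = [];
  rows.forEach(function (row) {
    var key = String(row[0]);
    if (key === '') return;   // ignore rows with a blank key
    if (!seen.has(key)) {     // the first (top) row for this key wins
      seen.add(key);
      result.push(row);
    }
  });
  return result;
}

// Made-up sample data: two rows share the key 'A-100'.
var sample = [
  ['A-100', 'Widget', 3],
  ['A-100', 'Widget', 3],
  ['A-101', 'Gadget', 1]
];
var deduped = keepFirstPerKey(sample);
```

Like the formula, this produces a deduplicated copy rather than merged cells.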

Related

Preserve associated row data while using importRange in the destination sheet

For our construction company, we have a sheet that has all of the bills listed, along with relevant data that our accounting person would add to the master sheet.
I then have another sheet that pulls this data for the relevant people in the accounts for them to complete those steps. It filters to only the relevant columns (specifically, based on Column H - either "Yes" or "No") using query and importRange.
query(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1pY53-XaGnUQ3BPmLh90mLSqIwSo7S2_QOPbD6JBQHOA/edit#gid=0","Master!A3:G"), "Select Col1, Col2, Col4, Col5, Col6, Col7 where Col6 is not null")
I want to include a few details in the destination sheet, which I have done.
The problem is typically associated with column H in the master sheet (Work Done required or not). For most cases, it is either a yes or no. However, in some cases, the accounting person doesn't know for sure whether it is a yes or no. But he wants to keep on adding other bill details.
When he fills the empty column later, the entered data on this second sheet doesn't dynamically shift with the imported data, thus causing the rows to misalign.
Unfortunately, as I mentioned, the rows don't stick together so as the dynamic order of the imported columns changes, the static order of the manual columns causes a mismatch.
Is there a way to make this work?
A solution would be to add an onEdit trigger in order to check if any changes have been made to the H column and later add the corresponding rows to the destination sheet, something similar to this:
function onEdit(e) {
  let destinationSheet = SpreadsheetApp.openById("SS_ID").getSheetByName('Input Sheet');
  let sourceSheet = SpreadsheetApp.getActiveSheet();
  let lastRow = destinationSheet.getLastRow();
  // getValue() returns an empty string (not null) for a blank cell
  if (e.range.getColumn() == 8 && e.range.getValue() !== "") {
    let billNo = sourceSheet.getRange(e.range.getRow(), 1).getValue();
    // retrieve all the other values needed from the current row of the edit
    // current row of the edit > e.range.getRow()
    destinationSheet.getRange(lastRow + 1, 1).setValue(billNo);
    // copying the values to the destination sheet by using setValue()
  }
}
The above code uses the e event object to check whether the edit was made in column H and whether the new value is non-empty. If so, the values to be copied are read from the edited row (here I have illustrated how to retrieve billNo) and written to the first free row of the destination sheet, which is found with getLastRow(). Because the script opens another spreadsheet with openById, it has to run from an installable edit trigger; a simple onEdit trigger is not authorized to open other files.
Note
You can also add a for loop to make copying the data easier, and add more conditions so that the new entry is pasted after a certain Actual Date, or simply sort the values after the entry is pasted.
Reference
Apps Script Triggers;
Apps Script Event Objects;
Apps Script Range Class.
I tested the following formula and found no issue: even when I remove the value from certain cells in column H, the result is still correct. Am I missing some information? I only changed the filter criterion so that it is based on column H:
=query(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1pY53-XaGnUQ3BPmLh90mLSqIwSo7S2_QOPbD6JBQHOA/edit#gid=0","Master!A3:H"),"Select Col1, Col2, Col4, Col5, Col6, Col7 where Col8 is not null")

Tabulate JSON into Sheets

I've been trying to get a readable database of a JSON file from a URL.
I've used fastfedora's script on Github, https://github.com/fastfedora/google-docs/blob/master/scripts/ImportJSON/Code.gs, to import JSON from the URL to Sheets. I'm using the basic:
=TRANSPOSE(ImportJSON("https://rsbuddy.com/exchange/summary.json"))
I used transpose as it was easier to work with two long columns rather than two long rows.
The data that's been imported, however, is very messy: https://docs.google.com/spreadsheets/d/1mKnRQmshbi1YFG9HHg7-mKlZZzpgDME6-eGjDJKzbRY/edit?usp=sharing. It's basically one long column of descriptive labels (name, id, price etc.) and another column of the values (the actual name of the item and its price in digits).
Is it possible to manipulate the resultant Sheets page so that the common factors in the first column can be lined up with the pseudo-table beside two initial columns? E.g. for the first item, the ID will be '2', the name will be 'Cannonball', the Sp will be '5' etc.
Thanks in advance. Do forgive me for my ignorance.
Example
Simple formula
I think this is a faster way to get the IDs:
=QUERY(QUERY(A2:B,"select B where A <> '' offset 4"),"skipping 7")
and if you want Names:
=QUERY(QUERY(A2:B,"select B where A <> '' offset 1"),"skipping 7")
Changing the offset from 0 to 6 gives you each of the other columns.
7 is the number of columns in the data.
The result is a column that fills in automatically with the data.
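To make the "offset k ... skipping n" combination concrete, here is a rough JavaScript equivalent. The helper name everyNth and the sample list are made up for illustration (with 3 fields per item instead of 7 to keep it short); it is a sketch of the idea, not of how QUERY is implemented.

```javascript
// Hypothetical helper mirroring "offset k" followed by "skipping n":
// start at index `offset`, then take every `step`-th value.
function everyNth(values, offset, step) {
  var out = [];
  for (var i = offset; i < values.length; i += step) {
    out.push(values[i]);
  }
  return out;
}

// Made-up flat list: name, id, price repeated for each item (3 fields per item).
var flat = ['Cannonball', 2, 5, 'Cannon base', 6, 187500];
var names = everyNth(flat, 0, 3);
var ids = everyNth(flat, 1, 3);
```

Here names holds the first field of each item and ids the second, just as changing the offset in the formula selects a different field.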
Hard formula
Also possible to get the whole result with one formula:
paste =COUNTA(A:A) in cell E2
paste 7 in E3, this is the number of columns in Data
=E2/E3 in E4
And then in cell G2 or somewhere on right paste the formula:
=ArrayFormula(vlookup(if(COLUMN(OFFSET(A1,,,1,E3)),
(row(OFFSET(A1,,,E4))-1)*E3+COLUMN(OFFSET(A1,,,1,E3))),
{row(OFFSET(A1,,,E2)),OFFSET(B2,,,E2)},2,0))
It is slow, but it gives the whole table.
or Script?
I've also tried using a script with a user-defined function (UDF). Here's the test formula:
=ConvertTo2D(TRANSPOSE(R3:R16),7)
where R3:R16 is a small range that is split into a table with 7 columns. The script is pretty short:
function ConvertTo2D(Arr, index) {
var newArr = [];
while(Arr[0].length) newArr.push(Arr[0].splice(0,index));
return newArr;
}
Sounds good. But! It is ve-e-e-e-ery slow, so this solution is only good for a quick test.
If the data is structured and every object will always have the same structure you can use a simple offset to do this:
=OFFSET($B$2,
(ROW($B2) - 2) * 7 +
COLUMN(D$1) - 4,
0)
Put that in D2 and drag to the right and down.
It is possible to immediately return the data in this fashion but for that you need to meddle with the script.
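One way to read "meddle with the script" is to reshape the parsed JSON into rows before it reaches the sheet. The sketch below is not a change to ImportJSON itself: the helper name toTable is hypothetical, the sample records are made up, and it assumes the JSON is an object of records that all share the same fields.

```javascript
// Hypothetical helper: turn an object of item records into a header row
// followed by one row per item. Assumes every record has the same fields.
function toTable(items) {
  var keys = Object.keys(items);
  if (keys.length === 0) return [];
  var headers = Object.keys(items[keys[0]]);
  var rows = keys.map(function (k) {
    return headers.map(function (h) { return items[k][h]; });
  });
  return [headers].concat(rows);
}

// Made-up sample shaped like the data described in the question.
var parsed = {
  '2': { id: 2, name: 'Cannonball', sp: 5 },
  '6': { id: 6, name: 'Cannon base', sp: 187500 }
};
var table = toTable(parsed);
```

A custom function that returns a 2D array like table fills a block of cells, one item per row.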

Update script cell references when columns are moved

We're migrating a lot of our business logic to scripts behind the scenes, but I'm worried that they'll be much more fragile when columns move.
On Sheet Updates Automagically
For example, If I have a formula on a spreadsheet like this:
=If(A1=5,"Yes","No")
And then I Insert 1 Column Left of A, the formula will be automatically updated like this:
=If(B1=5,"Yes","No")
Apps Script doesn't update
For example, if I have the formula in the script section:
function myFunction() {
var value = SpreadsheetApp.getActiveSheet().getRange("A1").getValue();
var output = (value == 5) ? 'Yes' : 'No';
Logger.log(output);
}
It will not update when the sheet changes.
Q: How can I get stable references in the code behind for columns that could potentially move?
This is a general problem when hardcoding strings or numbers in code.
In general, the JavaScript parser can't tell which strings might be used in a call to a sheet function. It's sometimes not trivial to solve.
Two approaches are:
If the columns/cells/ranges are known beforehand, use named ranges:
Define a named range and use NamedRange in code. Use the range to write to it directly or to query its row/column position.
Another approach, for column-based ranges like yours, is to have your code do this naming itself by treating the column headers as the column names. The code refers to columns by those names and reads the header row to build the mapping from name to position.
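The header-based approach can be sketched as follows. The helper name buildColumnMap and the header labels are made up for illustration; in a real script the header row would come from something like sheet.getRange(1, 1, 1, sheet.getLastColumn()).getValues()[0].

```javascript
// Hypothetical helper: map each header label to its 1-based column number.
// `headerRow` is the first row of the sheet as a plain array of labels.
function buildColumnMap(headerRow) {
  var map = {};
  headerRow.forEach(function (label, i) {
    var name = String(label).trim();
    if (name !== '') map[name] = i + 1; // columns are 1-based in getRange()
  });
  return map;
}

// Made-up headers: the same lookup keeps working after a column is inserted on the left.
var before = buildColumnMap(['Amount', 'Status']);
var after = buildColumnMap(['Notes', 'Amount', 'Status']);
```

The script then asks for the column named 'Amount' instead of hardcoding "A", so moving columns only changes the map, not the code.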

Copy data from one file to another for rows with a same index value

I tried to look for previous answers but none of the scripts I found can suit my needs.
I have two Google spreadsheets:
The source file, file2, has 8 columns. The first column is offernumber.
The other spreadsheet, file1, has a corresponding ordernumber column.
Where there is a match on ordernumber to offernumber, I need an automatic way to copy the data in file 2 into the corresponding row in file 1.
I managed to copy values manually when I enter a query into the spreadsheet, such as:
=QUERY( ImportRange( "keyspreadsheet" ;"Sheet1!A2:H1000"),
"select Col2,Col3,Col4,Col5,Col6,Col7,Col8 where Col1='6/2012' ";0)
But am not able to figure out how to script this. Any help would be appreciated!
You can do this entirely with spreadsheet functions, no need for a script. (Although I think a script would handle very large spreadsheets more efficiently.)
In file1, put this function in cell AG2, replace both instances of "keyspreadsheet" with the ID of file2, then copy to the rest of the rows in column AG:
=index(ImportRange( "keyspreadsheet" ;"Sheet1!$B$2:$H"),match(AF2,ImportRange( "keyspreadsheet" ;"Sheet1!$A$2:$A"),0))
This finds the row in file2 whose value in column A matches the ordernumber in column AF of file1, and copies the values from that row, starting with column B (thus skipping 'Offer N.').
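Since the question asked for a script, here is a minimal sketch of the same match-and-copy logic in plain JavaScript. The helper name lookupRows and the sample rows are made up, and it assumes both sheets have already been read into arrays (for example with getValues()). Unlike INDEX/MATCH, which shows an error when nothing matches, this sketch returns blanks.

```javascript
// Hypothetical helper: for each order number, return the remaining columns of the
// matching offer row, or blanks when there is no match.
function lookupRows(orderNumbers, offerRows) {
  var byOffer = new Map();
  offerRows.forEach(function (row) {
    // the first match wins, like MATCH(..., 0)
    if (!byOffer.has(row[0])) byOffer.set(row[0], row.slice(1));
  });
  var width = offerRows.length ? offerRows[0].length - 1 : 0;
  return orderNumbers.map(function (n) {
    return byOffer.has(n) ? byOffer.get(n) : new Array(width).fill('');
  });
}

// Made-up sample: column 1 is the offer number, the rest is the data to copy.
var offers = [['6/2012', 'Acme', 120], ['7/2012', 'Globex', 80]];
var matched = lookupRows(['7/2012', '9/2012'], offers);
```

The resulting 2D array has one row per order number, so it could be written next to the order numbers in a single setValues() call.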

Selecting the last value of a column

I have a spreadsheet with some values in column G. Some cells are empty in between, and I need to get the last value from that column into another cell.
Something like:
=LAST(G2:G9999)
except that LAST isn't a function.
Similar answer to caligari's answer, but we can tidy it up by just specifying the full column range:
=INDEX(G2:G, COUNT(G2:G))
So this solution takes a string as its parameter. It finds how many rows are in the sheet. It gets all the values in the column specified. It loops through the values from the end to the beginning until it finds a value that is not an empty string. Finally it returns the value.
Script:
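function lastValue(column) {
  var sheet = SpreadsheetApp.getActiveSheet();
  var lastRow = sheet.getMaxRows();
  // getValues() returns a 2D array, so each row is a one-element array.
  var values = sheet.getRange(column + "1:" + column + lastRow).getValues();
  // Walk up from the bottom until a non-empty cell is found.
  while (lastRow > 0 && values[lastRow - 1][0] === "") lastRow--;
  return lastRow > 0 ? values[lastRow - 1][0] : "";
}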
Usage:
=lastValue("G")
EDIT:
In response to the comment asking for the function to update automatically:
The best way I could find is to use this with the code above:
function onEdit(event) {
SpreadsheetApp.getActiveSheet().getRange("A1").setValue(lastValue("G"));
}
It would no longer be required to use the function in a cell as the Usage section states. Instead you are hard coding the cell you would like to update and the column you would like to track. It is possible that there is a more elegant way to implement this (hopefully one that is not hard coded), but this is the best I could find for now.
Note that if you use the function in cell like stated earlier, it will update upon reload. Maybe there is a way to hook into onEdit() and force in cell functions to update. I just can't find it in the documentation.
Actually I found a simpler solution here:
http://www.google.com/support/forum/p/Google+Docs/thread?tid=20f1741a2e663bca&hl=en
It looks like this:
=FILTER( A10:A100 , ROW(A10:A100) =MAX( FILTER( ArrayFormula(ROW(A10:A100)) , NOT(ISBLANK(A10:A100)))))
A LAST() function for selecting the last cell in a range is not implemented at the moment. However, following your example:
=LAST(G2:G9999)
we can obtain the last cell using the pair of functions INDEX() and COUNT() in this way:
=INDEX(G2:G; COUNT(G2:G))
There is a live example in the spreadsheet where I found (and solved) the same problem (sheet Orzamentos, cell I5). Note that it works perfectly even when referring to other sheets within the document.
Summary:
=INDEX( FILTER( G2:G , NOT(ISBLANK(G2:G))) , COUNTA(G2:G) )
Details:
I've looked through and tried several answers, and here's what I've found:
The simplest solution (see Dohmoose's answer) works if there are no blanks:
=INDEX(G2:G; COUNT(G2:G))
If you have blanks, it fails.
You can handle one blank by just changing from COUNT to COUNTA (See user3280071's answer):
=INDEX(G2:G; COUNTA(G2:G))
However, this will fail for some combinations of blanks. (1 blank 1 blank 1 fails for me.)
The following code works (See Nader's answer and jason's comment):
=INDEX( FILTER( G2:G , NOT(ISBLANK(G2:G))) , ROWS( FILTER( G2:G , NOT(ISBLANK(G2:G)) ) ) )
but it requires thinking about whether you want to use COLUMNS or ROWS for a given range.
However, if ROWS (or COLUMNS) is replaced with COUNT I seem to get a reliable, blank-proof implementation of LAST:
=INDEX( FILTER( G2:G , NOT(ISBLANK(G2:G))) , COUNT( FILTER( G2:G , NOT(ISBLANK(G2:G)) ) ) )
And since COUNTA has the filter built in, we can simplify further using
=INDEX( FILTER( G2:G , NOT(ISBLANK(G2:G))) , COUNTA(G2:G) )
This is somewhat simple, and correct. And you don't have to worry about whether to count rows or columns. And unlike script solutions, it automatically updates with changes to the spreadsheet.
And if you want to get the last value in a row, just change the data range:
=INDEX( FILTER( A2:2 , NOT(ISBLANK(A2:2))) , COUNTA(A2:2) )
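The difference between indexing the raw range and indexing the filtered range can be traced on a small made-up column. This is plain JavaScript standing in for the spreadsheet functions, not the formulas themselves.

```javascript
// Made-up column with blanks in between: value, blank, value, blank, value.
var g = [10, '', 20, '', 30];

// COUNTA(G2:G): the number of non-blank cells.
var countA = g.filter(function (v) { return v !== ''; }).length;

// INDEX(G2:G; COUNTA(G2:G)): counts 3 values but indexes into the range WITH blanks.
var naive = g[countA - 1];

// INDEX(FILTER(G2:G, NOT(ISBLANK(G2:G))), COUNTA(G2:G)): index into the range WITHOUT blanks.
var filtered = g.filter(function (v) { return v !== ''; });
var last = filtered[countA - 1];
```

Here naive lands on 20, because the blanks shift the positions, while last is 30, the actual last value.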
In order to return the last value from a column of text values you need to use COUNTA, so you would need this formula:
=INDEX(G2:G; COUNTA(G2:G))
try this:
=INDIRECT("B"&arrayformula(max((B3:B<>"")*row(B3:B))))
Suppose the column in which you are looking for the last value is B.
And yes, it works with blanks.
This one works for me:
=INDEX(I:I;MAX((I:I<>"")*(ROW(I:I))))
It looks like Google Apps Script now supports ranges as function parameters. This solution accepts a range:
// Returns row number with the last non-blank value in a column, or the first row
// number if all are blank.
// Example: =rowWithLastValue(a2:a, 2)
// Arguments
// range: Spreadsheet range.
// firstRow: Row number of first row. It would be nice to pull this out of
// the range parameter, but the information is not available.
function rowWithLastValue(range, firstRow) {
// range is passed as an array of values from the indicated spreadsheet cells.
for (var i = range.length - 1; i >= 0; -- i) {
if (range[i] != "") return i + firstRow;
}
return firstRow;
}
Also see discussion in Google Apps Script help forum: How do I force formulas to recalculate?
I looked at the previous answers and they seem like they're working too hard. Maybe scripting support has simply improved. I think the function is expressed like this:
function lastValue(myRange) {
  // declare lastRow locally instead of leaking an implicit global
  var lastRow = myRange.length;
  for (; myRange[lastRow - 1] == "" && lastRow > 0; lastRow--)
  { /*nothing to do*/ }
  return myRange[lastRow - 1];
}
In my spreadsheet I then use:
= lastValue(E17:E999)
In the function, I get an array of values with one per referenced cell, and this just iterates from the end of the array backwards until it finds a non-empty value or runs out of elements. Sheet references should be interpreted before the data is passed to the function. It is not fancy enough to handle multiple dimensions, either. The question did ask for the last cell in a single column, so it seems to fit. It will probably fail if there is no data at all, too.
Your mileage may vary, but this works for me.
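function lastRow(column){
  var sheet = SpreadsheetApp.getActiveSpreadsheet();
  // getLastRow() is the last row with content anywhere in the sheet, not just in this column
  var lastRow = sheet.getLastRow();
  // the original referenced an undefined startRow here; lastRow is what was meant
  var lastRowRange = sheet.getRange(column + lastRow);
  return lastRowRange.getValue();
}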
no hard coding.
In a column with blanks, you can get the last value with
=+sort(G:G,row(G:G)*(G:G<>""),)
This gets the last value and handles empty values:
=INDEX( FILTER( H:H ; NOT(ISBLANK(H:H))) ; ROWS( FILTER( H:H ; NOT(ISBLANK(H:H)) ) ) )
The answer
=INDEX(G2:G; COUNT(G2:G))
doesn't work correctly in LibreOffice. However, with a small change, it works perfectly.
=INDEX(G2:G100000; COUNT(G2:G100000))
It only works if the actual data range is smaller than G2:G100000.
Is it acceptable to answer the original question with a strictly off topic answer:)
You can write a formula in the spreadsheet to do this. Ugly perhaps? but effective in the normal operating of a spreadsheet.
=indirect("R"&ArrayFormula(max((G:G<>"")*row(G:G)))&"C"&7,false)
(G:G<>"") gives an array of true false values representing non-empty/empty cells
(G:G<>"")*row(G:G) gives an array of row numbers with zeros where cell is empty
max((G:G<>"")*row(G:G)) is the row number of the last non-empty cell in G
This is offered as a thought for a range of questions in the script area that could be delivered reliably with array formulas which have the advantage of often working in similar fashion in excel and openoffice.
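The three steps of the array formula above can be traced in JavaScript on a made-up column. This only mirrors the arithmetic; the variable names are illustrative.

```javascript
// Made-up column; positions are treated as 1-based row numbers, as in a sheet.
var column = ['a', '', 'b', '', ''];

// (G:G<>""): 1 for non-empty cells, 0 for empty ones.
var nonEmpty = column.map(function (v) { return v !== '' ? 1 : 0; });

// (G:G<>"")*row(G:G): the row number where the cell is non-empty, 0 elsewhere.
var rowNumbers = nonEmpty.map(function (flag, i) { return flag * (i + 1); });

// max(...): the row number of the last non-empty cell.
var lastRowNumber = Math.max.apply(null, rowNumbers);

// indirect(...): fetch the value at that row.
var lastCellValue = column[lastRowNumber - 1];
```

For this column, rowNumbers is [1, 0, 3, 0, 0], so the maximum is 3 and the value fetched is 'b'.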
function getDashboardSheet(spreadsheet) {
var sheetName = 'Name';
return spreadsheet.getSheetByName(sheetName);
}
var spreadsheet = SpreadsheetApp.openByUrl(SPREADSHEET_URL);
var dashboardSheet = getDashboardSheet(spreadsheet);
Logger.log('see:'+dashboardSheet.getLastRow());
I was playing with the code given by #tinfini, and thought people might benefit from what I think is a slightly more elegant solution (note I don't think scripts worked quite the same way when he created the original answer)...
//Note that this function assumes a single column of values, it will
//not function properly if given a multi-dimensional array (if the
//cells that are captured are not in a single column).
function LastInRange(values)
{
  for (var index = values.length - 1; values[index] == "" && index > 0; index--) {}
  return String(values[index]);
}
In usage it would look like this:
=LastInRange(D2:D)
Regarding #Jon_Schneider's comment, if the column has blank cells just use COUNTA()
=INDEX(G2:G; COUNTA(G2:G))
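I found another way; maybe it will help you:
=INDEX( SORT( A5:D ; 1 ; FALSE) ; 1 )
This returns the row with the largest value in the first column, which is the last row when the data is sorted in ascending order by that column.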
More info from anab here:
https://groups.google.com/forum/?fromgroups=#!topic/How-to-Documents/if0_fGVINmI
Found a slight variation that worked to eliminate blanks from the bottom of the table.
=index(G2:G,COUNTIF(G2:G,"<>"))
I'm surprised no one had ever given this answer before. But this should be the shortest and it even works in excel :
=ARRAYFORMULA(LOOKUP(2,1/(G2:G<>""),G2:G))
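G2:G<>"" is TRUE for non-blank cells and FALSE for blank ones, so 1/(G2:G<>"") gives 1 for every non-blank cell and a division-by-zero error for every blank one. LOOKUP searches from the top down for 2; since it never finds 2, it skips the errors, settles on the last 1, and returns the value from G2:G at that position.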
The other way to do this, as others might've mentioned, is:
=INDEX(G2:G,MAX((ISBLANK(G2:G)-1)*-ROW(G2:G))-1)
Finding the MAXimum ROW of the non blank row and feeding it to INDEX
In a zero blank interruption array, Using INDIRECT RC notation with COUNTBLANK is another option. If V4:V6 is occupied with entries, then,
V18:
=INDIRECT("R[-"&COUNTBLANK(V4:V17)+1&"]C",0)
will give the value of V6.
To get the last value from a column you can also use the MAX function with the IF function:
=ARRAYFORMULA(INDIRECT("G"&MAX(IF(G:G<>"", ROW(G:G), )), 4))
I have gone through way too many of these implementations of last-row for a specific column. Many solutions work but are slow for large or multiple datasets. One of my use cases requires me to check the last row in specific columns across multiple spreadsheets. What I have found is that taking the whole column as a range and then iterating through it is too slow, and adding a few of these together makes the script sluggish.
My "hack" has been this formula:
=ROW(index(sheet!A2:A,max(row(sheet!A2:A)*(sheet!A2:A<>""))))-1
Example: add this to cell A1 to find the last row in column A. It can be added anywhere; just make sure to manage the "-1" at the end depending on which row the formula is placed in. You can also place this in another column, rather than the one you're trying to count, and then you don't need to manage the -1. You could also count from a starting row, like "C16:C", which will count values from C16 onwards.
This formula is reliably giving me the last row, including blanks in the middle of the dataset
To use this value in my GS code, I am simply reading the cell value from A1. I understand that Google is clear that spreadsheet functions like read/write are heavy (time-consuming), but this is much faster than column count last-row methods in my experience (for large datasets)
To make this efficient, I am getting the last row in a col once, then saving it as a global variable and incrementing in my code to track which rows I should be updating. Reading the cell every-time your loop needs to make an update will be too inefficient. Read once, iterate the value, and the A1 cell formula (above) is "storing" the updated value for the next time your function runs
Please let me know if this was helpful to you! If I encounter any issues I will comment on this answer.
=QUERY({G2:G9999,ARRAYFORMULA(ROW(G2:G9999))},"Select Col1 where Col1 is not null Order By Col2 desc limit 1",0)
In the query, Col1 refers to column G, and Col2 refers to a virtual column, populated with the row numbers returned by ARRAYFORMULA(ROW(G2:G9999)).
I haven't evaluated the other answers, so I can't say if this is the best way, but it worked for me.
Bonus: to return the first non-empty cell:
=QUERY({G2:G9999},"Select Col1 where Col1 is not null limit 1",0)
Refs: QUERY, ARRAYFORMULA, ROW.