I wrote a simple script to read a thousand xlsx files, each with 400~500 sheets whose names are longer than 50 characters. After obtaining the sheet names, the script saves them to CSV files that will eventually be uploaded to a DB. Here is the script:
import glob
import openpyxl as op

extension = 'XLSX'
xlsxfiles = glob.glob('*.{}'.format(extension))
for xlsxfile in xlsxfiles:
    # read_only=True avoids loading every cell just to list the sheets
    fins = op.load_workbook(xlsxfile, read_only=True)
    sheetnames = fins.sheetnames
    with open('test_xlsx-' + xlsxfile + '.csv', 'w', newline='') as fout:
        fout.write(str(xlsxfile))
I have two issues that need help:
openpyxl's load_workbook returns only 31 characters of each sheet name. Longer names are truncated to something like "Sheetname something something_4", when the name should be "Sheetname something something Real".
I tried Pandas.ExcelFile.sheet_names but got the same issue.
The CSV file stored all the sheet names on one line, as a single list:
['Cover Page' 'Sheetname something something_4' 'Sheetname other']
But I need one sheet name per row, without the "[" or "'" characters:
Cover Page
Sheetname something something Real
Sheetname other
I am a novice in Python. All ideas and comments are welcome.
I still have not figured out how to fix the first issue (the 31-character truncation).
For the second issue, I added a for loop that goes through the sheet names and writes each one as a single-item list. Here is the code:
import csv
import glob
import openpyxl as op

extension = 'XLSX'
xlsxfiles = glob.glob('*.{}'.format(extension))
for xlsxfile in xlsxfiles:
    fins = op.load_workbook(xlsxfile, read_only=True)
    sheetnames = fins.sheetnames
    with open('test_xlsx-' + xlsxfile + '.csv', 'w', newline='') as fout:
        sheetnameout = csv.writer(fout)
        for name in sheetnames:
            # writerow() expects a sequence of fields; a bare string would be
            # split into one field per character.
            sheetnameout.writerow([name])  # That "[]" took me 8 hours.
    # No fout.close() needed: the with block closes the file.
Again, I am a novice in Python. All ideas and comments are welcome.
For the most part, using copyValuesToRange works very well (99.9% of the time) in a function with the following statements:
let source = ss.getSheetByName("Update List");
let destination = ss.getSheetByName("Power Level");
source.getRange('TIER1DescImport').copyValuesToRange(destination, 12, 12, 6, 29); // 'PowerLevel'!L6:L29
The source is on a sheet "Update List" with a named data range "TIER1DescImport". The imported data is from a separate Google Sheet file using IMPORTRANGE.
ONCE, the destination range was entirely overwritten with empty or blank cells. How do I prevent this from happening again? I don't know why this occurred. The source range seemed fine. After closing the Google sheet and opening it 30 minutes later, everything was working properly again.
Is there a way of determining if the source range is okay, and if so, proceed with copyValuesToRange?
I'm a novice using Apps Script. I didn't know how to debug this issue. The only thing that worked was closing the files and coming back to it later. After that, it all seemed to be working again. I don't know why the destination was overwritten with blank or empty data but I'd like to prevent it from happening again.
You can test whether the source range has a meaningful amount of data with something like this:
const targetSheet = ss.getSheetByName('Power Level');
const sourceSheet = ss.getSheetByName('Update List');
const sourceRange = sourceSheet.getRange('TIER1DescImport');
if (sourceRange.getDisplayValues().flat().join('').length > 10) {
  sourceRange.copyValuesToRange(targetSheet, 12, 12, 6, 29);
  // ...or: sourceRange.copyTo(targetSheet.getRange('L6'));
}
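One hedged variation on the check above is to pull the test into a small helper that can be exercised without a spreadsheet. The name hasMeaningfulData and the minChars parameter are illustrative assumptions, not part of the Apps Script API; the helper only expects the 2D array of strings that getDisplayValues() returns, and the threshold of 10 characters is as arbitrary here as in the snippet above.

```javascript
// Hypothetical helper: decides whether a 2D array of display values
// (the shape returned by Range.getDisplayValues()) holds enough text
// to be worth copying.
function hasMeaningfulData(values, minChars = 10) {
  const totalChars = values
    .flat()                               // 2D rows -> flat list of cells
    .map((cell) => String(cell).trim())   // ignore whitespace-only cells
    .join('')
    .length;
  return totalChars > minChars;
}

// In Apps Script this would be used roughly as:
//   if (hasMeaningfulData(sourceRange.getDisplayValues())) {
//     sourceRange.copyValuesToRange(targetSheet, 12, 12, 6, 29);
//   }
```

Because the helper is a pure function, the threshold can be tuned and tested separately from the copy itself.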
I set up routines with the following code to parse CSVs into specific spreadsheets:
function updateGmvAndNmv() {
  const threads = GmailApp.search("from:(sender#xxx.de) subject:(uniqueHeader)");
  const messages = threads[0].getMessages();
  const length = messages.length;
  const lastMessage = messages[length - 1];
  const attachment = lastMessage.getAttachments()[0];
  const csvData = Utilities.parseCsv(attachment.getDataAsString(), ",");
  const ss = SpreadsheetApp.openById("spreadsheetID").getSheetByName("sheetName");
  const ssOriginalRange = ss.getRange("A:E");
  const ssToPaste = ss.getRange(1, 1, csvData.length, csvData[0].length);
  ssOriginalRange.clear();
  ssToPaste.setValues(csvData);
}
With the latest CSV that I want to parse, I ran into an issue and am stuck. I tried playing around with the settings in the app that sends me the report, but I cannot change the way the CSV is constructed. When I look at the CSV in a text editor, I see something like this:
GMV and NMV per partner
"Merchant",,"NMV","GMV bef Cancellation","GMV bef Return"
When I let the above code run, it gets the file and outputs the following in my spreadsheet:
[Screenshot: Spreadsheet Example]
Which brings up the following questions:
Why do I have "" (double quotes) in row 5? I assumed the parseCsv-function removes those.
With my other CSVs I did not have any issues, but there I did not have any double quotes. Can someone explain the difference in CSVs, once with double quotes and once without?
How can I treat this data correctly, in order to get the data without the "" into the spreadsheet?
Why do I see some ? symbols (please look at the fx input field, row 1 and 7) and how do I get rid of them? The export should be without any format (CSV) and in a text editor I do see all values normally - without any ?.
The issue was the encoding. The correct encoding of the file is UTF-16, while the standard encoding of .getDataAsString() is UTF-8.
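To illustrate why the wrong charset produces stray symbols and leaves the quotes in place, the sketch below decodes the same bytes both ways. It runs in Node.js rather than Apps Script (Buffer is a Node API), and it assumes the file is UTF-16 little-endian with a byte order mark, which the answer does not state. In Apps Script itself, the fix would presumably be to pass the charset explicitly, for example getDataAsString('UTF-16'), since that method accepts an optional charset argument.

```javascript
// Sample header line taken from the question.
const line = '"Merchant",,"NMV","GMV bef Cancellation","GMV bef Return"';

// Assumption: the attachment is UTF-16LE preceded by a byte order mark (FF FE).
const utf16Bytes = Buffer.concat([
  Buffer.from([0xff, 0xfe]),
  Buffer.from(line, 'utf16le'),
]);

// Decoding as UTF-8 turns the BOM into replacement characters and leaves a
// NUL character after every ASCII character, so quotes are no longer adjacent
// to the delimiters the CSV parser looks for.
const asUtf8 = utf16Bytes.toString('utf8');

// Decoding with the matching charset recovers the original text
// (TextDecoder drops the BOM by default).
const asUtf16 = new TextDecoder('utf-16le').decode(utf16Bytes);

console.log(JSON.stringify(asUtf8.slice(0, 12)));
console.log(asUtf16);
```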
I'm looking for a way to convert my TXT file into a Google Sheet:
function convert_txt_gsheets() {
  var file = DriveApp.getFilesByName('file.txt').next();
  var body = file.getBlob().getDataAsString().split(/\n/);
  var result = body.map( r => r.split(/\t/));
  SpreadsheetApp.getActive().getSheets()[0].getRange(1, 1, result.length, result[0].length).setValues(result);
  return;
}
An error occurred: "The number of columns in the data does not match the number of columns in the range. The data has 1 but the range has 18."
Does someone have an idea?
If I import the txt file manually it works, but I need to do it through a Google Apps Script.
I only see typos in the method names getBlob and getFilesByName (you used getBlobl and getFileByName). Aside from that, the only likely cause is that the file contains a line that does not split into the expected number of columns.
Update:
Upon checking, your txt file ends with a blank line. Delete it and the write should succeed. That is the cause of the error: the range expects 18 columns, but the last row has only 1 because it contains no data.
You could also filter the data before writing. Removing rows that don't have 18 columns will fix the issue. See the code below:
Modification:
var result = body.map( r => r.split(/\t/)).filter( r => r.length == 18);
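A slightly more general sketch of the same idea drops blank lines and tolerates Windows line endings instead of hard-coding 18 columns, then pads short rows so the array is rectangular, which setValues() requires. parseTabSeparated and toRectangular are hypothetical helper names, written as plain JavaScript so they can be tried outside Apps Script. Note that a line consisting only of tabs is treated as blank and dropped.

```javascript
// Hypothetical helper: split tab-separated text into rows of cells,
// skipping lines that are empty or contain only whitespace.
function parseTabSeparated(text) {
  return text
    .split(/\r?\n/)                         // handle both \n and \r\n
    .filter((line) => line.trim() !== '')   // drop blank lines
    .map((line) => line.split('\t'));
}

// Hypothetical helper: pad every row with empty strings up to the
// width of the longest row.
function toRectangular(rows) {
  const width = Math.max(0, ...rows.map((row) => row.length));
  return rows.map((row) => row.concat(Array(width - row.length).fill('')));
}

// In Apps Script the result could then be written roughly as:
//   var result = toRectangular(parseTabSeparated(file.getBlob().getDataAsString()));
//   sheet.getRange(1, 1, result.length, result[0].length).setValues(result);
```

Padding keeps rows that are merely short, whereas the length filter above discards them; which behavior is right depends on the file.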
Could someone help me?
I have an xlsx file with 2 sheets.
The second sheet contains cells linked to the other (first) sheet.
When I save the sheet to a CSV file:
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
the values of the linked cells are not saved...
My code looks like this:
$objPHPExcel = new PHPExcel();
// Read your Excel workbook
try
{
    $inputFileType = PHPExcel_IOFactory::identify($excelFile);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setLoadSheetsOnly('list');
    $objReader->setLoadSheetsOnly('main');
    /* I also tried like this:
    $worksheetList = $objReader->listWorksheetNames($excelFile);
    $sheetname = $worksheetList[0];
    $sheetname2 = $worksheetList[1];
    $objReader->setLoadSheetsOnly($worksheetList[0]);
    $objReader->setLoadSheetsOnly($sheetname);
    */
    $objPHPExcel = $objReader->load($excelFile);
}
catch (Exception $e)
{
    // A try block needs a catch; stop if the workbook cannot be read
    die('Error loading file "' . $excelFile . '": ' . $e->getMessage());
}
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
$objWriter->save($filename);
I also tried saving to Excel files (xls and xlsx) and got the same problem: the linked cells are empty...
My linked cells look like this: "=list!C46"
After many hours of looking for an answer, I found a solution that is not great:
I removed all of these lines: $objReader->setLoadSheetsOnly('list');
and added:
$activeSheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
$objPHPExcel->getActiveSheet()->fromArray($activeSheetData, false);
just after:
$objPHPExcel = $objReader->load($excelFile);
and also added: $objWriter->setSheetIndex(1);
Now it works, but with problems...
One column in the original file has linked cells like this: "=list!$AV$46"
I mean with the $ symbol.
In more detail:
If I have: $objReader->setReadDataOnly(true);
and cells like "=list!$AV$46", then they are empty in the output.
But if I remove: $objReader->setReadDataOnly(true);
then those "=list!$AV$46" cells work and have a value, but in a format
like: 11/17/2017.
Since I removed $objReader->setReadDataOnly(true);, I can't apply
this code of mine:
$objPHPExcel->setActiveSheetIndex(1)->getStyle('AV1:AV'.$highestRow)->getNumberFormat()->setFormatCode(PHPExcel_Style_NumberFormat::FORMAT_DATE_YYYYMMDD);
So my second question: how do I set a new date format?
I should also mention that the initial date format was:
17.11.2017.
And I wanted 17-11-17 (FORMAT_DATE_YYYYMMDD).
And again, with cells that look like "=list!AV46", everything works fine.
UPDATE: Solution: $objReader->setLoadSheetsOnly(['list', 'main']); + $objPHPExcel->setActiveSheetIndex(1); before any of: $objPHPExcel->getActiveSheet()->...
Here's your first problem
$objReader->setLoadSheetsOnly('list');
$objReader->setLoadSheetsOnly('main');
You're only ever loading one sheet, the last one that you tell PHPExcel to load, which is main. Multiple calls to setLoadSheetsOnly() overwrite the setting of the previous call.
If you want to load both sheets, then you need to pass an array listing all the sheetnames that you want to load
$objReader->setLoadSheetsOnly(['list', 'main']);
This is explained in the PHPExcel Documentation
The documentation also says not to use $objReader->setReadDataOnly(true); unless you understand what it is doing. It tells PHPExcel to load only the raw data, not the formatting of the data, and it is the formatting that differentiates a number from a date in Excel.
I am creating a copy of a Spreadsheet. How do I write a script that also copies across the users who can view, edit, or own the file?
Currently I cannot change the owner or add editors or viewers that are part of a Google Group.
Thank you very much for your replies. GREATLY APPRECIATED.
I used this ...
var originalowner = originalSpreadsheet.getOwner();
var originaleditors = originalSpreadsheet.getEditors();
var originalviewers = originalSpreadsheet.getViewers();
// Logger.log(originalowner);
// Logger.log(originaleditors);
// Logger.log(originalviewers);
newSheet.addEditor(originaleditors);
newSheet.addViewer(originalviewers);
and get an error
Invalid email: xxxxxxx#googlegroups.com, yyyyyyyyy#googlegroups.com
If you read the documentation, you will see that the argument for addEditor() is a single email as a string. You are passing an array instead, and the error message returned is quite explicit: Invalid email: xxxxxxx#googlegroups.com, yyyyyyyyy#googlegroups.com
You should use a simple loop to add each viewer / editor one by one.
for (var n = 0; n < originaleditors.length; ++n) {
  newSheet.addEditor(originaleditors[n]);
}
and do the same for viewers...
EDIT: or, as mentioned in Fred's comment (good point!), use the addEditors() method (with an s), which takes an array of editor emails as its argument.
Note : as I said, (re)reading the documentation is always a good idea, that's true for me too ;-)
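Putting the two suggestions together, a minimal sketch of copying editors and viewers might look like the following. copySharing is a hypothetical helper, not an Apps Script function; it assumes the source exposes getEditors() and getViewers(), the target exposes addEditors() and addViewers(), and each user object has getEmail(), as Spreadsheet objects do. An email may come back blank when it is not visible to the script, so blanks are skipped. Transferring ownership is not attempted here.

```javascript
// Hypothetical helper: copy editor and viewer emails from one file-like
// object to another, using the plural add methods that accept arrays.
function copySharing(source, target) {
  // Convert user objects to email strings and drop blank addresses.
  const toEmails = (users) =>
    users.map((user) => user.getEmail()).filter((email) => email !== '');

  const editors = toEmails(source.getEditors());
  const viewers = toEmails(source.getViewers());

  if (editors.length > 0) target.addEditors(editors);
  if (viewers.length > 0) target.addViewers(viewers);

  return { editors, viewers };  // returned for logging or inspection
}

// In Apps Script this would be called roughly as:
//   copySharing(originalSpreadsheet, newSheet);
```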