I have used the "extract" command, but it has never been able to find as much information as FOCA finds in the Excel spreadsheets I am dealing with.
For example, I am using the FOCA application to harvest and download files from the web. Afterwards, it extracts metadata from all of the files.
With regard to Excel files, it appears that these files contain more metadata than the average PDF file. FOCA is able to detect printer names, email addresses, and a few other things that are stored within the spreadsheet file. However, I cannot find any way to get this same information in Linux using the "extract" command.
Does anyone know a way to examine files in Linux and grab ALL of their metadata? From what I understand, the extract command may be limited.
Thanks,
Excel files store a lot of metadata within the file, so you would have to parse the file itself to get at it. Since you're on Linux and can't use the Excel interop, you could try an Excel library like ExcelWriter or something similar. ExcelWriter is written for .NET, so you'd have to use Mono.
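As a rough illustration of "parsing the file itself": workbooks saved as .xlsx are ZIP archives of XML parts, so the document properties can be read with nothing but the Python standard library. This is only a sketch under that assumption; it does not handle the older binary .xls format, and the function name xlsx_metadata is made up for this example. Printer settings live in binary parts under xl/printerSettings/, which the sketch merely lists rather than decodes.

```python
import zipfile
import xml.etree.ElementTree as ET


def xlsx_metadata(path):
    """Collect document properties from an OOXML (.xlsx) workbook."""
    found = {}
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        # Standard OOXML property parts; either one may be absent.
        for part in ("docProps/core.xml", "docProps/app.xml"):
            if part not in names:
                continue
            root = ET.fromstring(archive.read(part))
            for element in root.iter():
                text = (element.text or "").strip()
                if text:
                    # Drop the "{namespace}" prefix ElementTree adds to tags.
                    tag = element.tag.rsplit("}", 1)[-1]
                    found[tag] = text
        # Printer settings are opaque binary parts; only list their names.
        found["printer_parts"] = [
            name for name in names if name.startswith("xl/printerSettings/")
        ]
    return found
```

Calling xlsx_metadata("report.xlsx") on a real workbook would typically return keys such as creator, lastModifiedBy, Application, or Company, depending on what the authoring program wrote.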
I have a folder with hundreds of files that were saved in the specific format of a given piece of software (in this case Qualisys Track Manager; the file format is .qtm).
This software has the option of exporting the files to another format such as TSV, MAT, C3D, ...
My problem: I want to export all my files to TSV format, but the only way I know is to open the software and go to File->Export->To TSV. Doing this for hundreds of files is time consuming, so I was thinking of writing a script that would go through my files, drive the software, and do the export automatically.
But I have no clue how to do this. I was thinking of writing a script in Notepad++ and running it from the command window, after which I would have all the files in TSV format.
[EDIT] After some research I think a batch script or a PowerShell script may help me, but I have no idea how to run the software's commands automatically, or if it is even possible... (I am using Windows 10)
It is highly likely that .qtm is a proprietary file format, so PowerShell or batch would not understand it. Unless the file can be read in a known way (text, XML, etc.), they would not be able to convert it.
I googled it and it seems QTM has a REST API. That would be the best chance you have. I'm not sure if the documentation is publicly available; I didn't find it. I'd recommend contacting their support to ask for the REST API documentation, whether the REST API can handle this task, and some sample code to get you started.
Then you can make REST API calls with Invoke-RestMethod in a loop from PowerShell.
Hello and thanks in advance for any help you can provide. I am a real newbie at all this.
I am trying to export my autocorrected words from Microsoft Word so I can use them in the Chrome autocorrect extension 'spelling bee'. The Chrome extension allows you to upload files of misspelled words and their autocorrections in csv format.
The problem I face is that my autocorrect list from Microsoft Word is an acl file. I have spent quite a while trying to figure out how to convert my acl file to a csv file, with no success. I thought that I could just format the file myself when I open the acl file as a text file, but the formatting and spacing are so off that it would take forever to do manually.
Is there a straightforward way to open the acl file in a simple delimited format? There are many posts online about how to transfer acl files from one system to another, but I could not figure out how to simply convert the acl file into a csv or other appropriately delimited file. If there is a thread out there that addresses this and that I may have overlooked, please let me know. Thanks again for your consideration.
example of what my acl file looks like when I open it in text editor:
must of had
must have hadmyseflmyselfmyumynaivenaÔvenecassarilynecessarily necassary necessaryneccessarilynecessarily
neccessary necessary
necesarilynecessarilynecesary necessary
negotiaingnegotiatingnkowknownothignnothingnvernevernwenewnwonowobediantobedientocasionoccasion occassionoccasionoccuredoccurred occurence
occurrence
At this link (Autocorrect utility) there is an 18 KB utility which will back up your autocorrect entries to a Word document; you can then copy that table into Excel and save it as a .csv. It's made for Word 97 and 2000, but I used it with no problems in Word 2007 and 2010.
I want to create .csv files with the Report Generation Toolkit in LabVIEW.
They must actually be .csv files which can be opened with Notepad or something similar.
Creating a .csv is not that hard; it's just a matter of adding the extension to the name of the file that's going to be created.
If I create a .csv file this way it opens nicely in Excel, just the way it should, but if I open it in Notepad it shows all kinds of characters and doesn't even come close to the data I wrote to the file.
I create the files with the LabVIEW code below:
Link to image (can't post the image yet because I've got too few points)
I know .csv files can be created with the Write to Spreadsheet VI but I would like to use the Report Generation Toolkit because it's pretty easy to add columns and rows to the file and that is something I really need.
You can use the Robust CSV package from the lavag.org forum to read and write 2D arrays to CSV files.
http://lavag.org/files/file/239-robust-csv/
Calling a file "csv" does not make it a CSV file. I have never used the toolkit to generate an Excel file, but I'm assuming it creates an XLS or XLSX file regardless of what extension you give it, which is why you're seeing gibberish. Either one looks like gibberish in Notepad: XLS is a binary format, and XLSX is a ZIP archive of XML files rather than plain text.
I'm not sure what your problem is with the write spreadsheet VI. It has an append input, so I assume you can use that to at least add rows directly to a file, although I can't say I ever tried it. I would prefer handling all the data in memory explicitly, where you can easily use the array functions to add rows or columns to the array and then overwrite the entire file.
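The in-memory approach can be sketched outside LabVIEW as well. The snippet below is not LabVIEW code, only a language-neutral illustration in Python of the same idea: keep the data as a 2D array, add rows or columns to it in memory, then overwrite the whole file as plain comma-separated text that Notepad can display. The helper names add_column and write_table are invented for this example.

```python
import csv


def add_column(table, values):
    """Return a copy of the 2D table with one value appended to each row."""
    return [row + [value] for row, value in zip(table, values)]


def write_table(path, table):
    """Overwrite path with the table as plain comma-separated text."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(table)
```

Adding a row is an ordinary append on the outer list, and every call to write_table replaces the previous contents of the file, which mirrors the "overwrite the entire file" suggestion above.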
My workplace has a whole bunch of unannotated .zip files that need to be uploaded to the new file server (Windows). I've used Perl to parse through the Excel files within the .zip files to create an annotation.txt file for each .zip file that contains information about the .zip file. I have thousands of zip files and do not want to manually enter information for each entry if there's a way to automate it. I am proficient in Perl and MySQL, and I am wondering if there is any way to use my skill set to port this information into the Microsoft SharePoint website.
Thank you in advance for any advice or suggestions.
There are many, many ways to meet your requirement.
You could write an event receiver to parse the files once uploaded and set the metadata.
A better approach for your use case might be to write a .NET-based console application that references Microsoft.SharePoint.Client, then upload your files using the client-side object model (CSOM) and set the metadata during that process, as outlined here: Upload a document to a SharePoint list from Client Side Object Model
There are also REST and ASMX web services that you could call from a non-.NET runtime process.
Plenty of options, pick the one that fits your needs and skills best.
I am trying to find the best way to import all of our Lighthouse data (which I exported as JSON) into JIRA, which wants a CSV file.
I have a main folder containing many subdirectories, JSON files and attachments. The total size is around 50MB. JIRA allows importing CSV data, so I was thinking of trying to convert the JSON data to CSV, but all the converters I have seen online will only handle a single file, rather than recursing through an entire folder structure and creating the CSV equivalent that can then be imported into JIRA.
Does anybody have any experience of doing this, or any recommendations?
Thanks, Jon
The JIRA CSV importer assumes a denormalized view of each issue, with all the fields available in one line per issue. I think the quickest way would be to write a small Python script to read the JSON and emit the minimum CSV. That should get you issues and comments. Keep track of which Lighthouse ID corresponds to each new issue key. Then write another script to add things like attachments using the JIRA SOAP API. For JIRA 5.0 the REST API is a better choice.
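A minimal sketch of such a first script might look like the following. It is written under assumptions that should be checked against the real export: that every .json file under the export folder holds one JSON object, and that the wanted fields sit at the top level of that object. The function name tickets_to_csv and the field names in the usage note are hypothetical, not taken from Lighthouse or JIRA.

```python
import csv
import json
from pathlib import Path


def tickets_to_csv(export_dir, csv_path, fields):
    """Write one CSV row per JSON file found anywhere under export_dir."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["source"] + list(fields))
        writer.writeheader()
        for json_file in sorted(Path(export_dir).rglob("*.json")):
            data = json.loads(json_file.read_text(encoding="utf-8"))
            if not isinstance(data, dict):
                continue  # skip files that are not a single JSON object
            # Assumption: the wanted fields sit at the top level of the object.
            row = {name: data.get(name, "") for name in fields}
            # Keep the relative path so new issue keys can be mapped back later.
            row["source"] = json_file.relative_to(export_dir).as_posix()
            writer.writerow(row)
            count += 1
    return count
```

For example, tickets_to_csv("lighthouse_export", "issues.csv", ["title", "state"]) would produce one line per JSON file. The extra source column is there to help with the "keep track of which Lighthouse ID corresponds to each new issue key" step when attachments are added afterwards.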
We just went through a Lighthouse to JIRA migration and ran into this. The best thing to do is to have your script start at the top-level export directory and loop through each ticket.json file. You can then build a master CSV or JSON file containing all tickets to import into JIRA.
In Ruby (which is what we used), it would look something like this:
require "json"

# Walk every exported ticket and parse its JSON
Dir.glob("path/to/lighthouse_export/tickets/*/ticket.json") do |ticket|
  JSON.parse(File.read(ticket)).each do |data|
    # access ticket data and add it to a CSV
  end
end