How to read contents of PC Sampling Continuous output (dat file)? - cuda

I am looking into the code samples provided in /usr/local/cuda-11.8/extras/CUPTI/samples/pc_sampling_continuous/. I am trying to get the PC stall reasons (from an executable) without modifying the source code using CUPTI. pc_sampling_continuous seems to be doing exactly the same.
When I am running the command :
./libpc_sampling_continuous.pl --collection-mode 1 --sampling-period 7 --file-name pcsampling.dat --verbose --app "./executable_name", it outputs a dat file.
To read this .dat file, /usr/local/cuda-11.8/extras/CUPTI/samples/pc_sampling_utility/ contains an utility file. This is run with the command:
./pc_sampling_utility --file-name 1_pcsampling.dat and it is able to output in a human readable format.
I have 2 problems regarding this:
It always says lineNumber:0, fileName: ERROR_NO_CUBIN, dirName: , in each line. However it is able to show me the stalls. But without correlating to the line number in the code and SASS (both) , it is of no use.
README file tells that I should be using the cubin file for the source correlation. I am able to generate the cubin file (as cuobjdump -xelf all ./executable_name and renaming it to 1.cubin). But I am not able to understand how to input this cubin file together with the .dat file to pc_sampling_utility.
Any help is appreciated.

Related

Getting Minizinc output as .txt from IDE

Imagine I ran a .mzn with .dzn and got an output in IDE as follows:
Supplier01,100,100,100,100,100,100,100,100,100,100,100,100
Supplier02,200,200,200,200,200,200,200,200,200,200,200,200
Supplier03,40,49,359,834,1067,1377,334,516,761,1001,1251,1583
Supplier04,500,500,500,500,500,500,500,500,500,500,500,500
Supplier05,161,200,200,200,200,200,200,200,200,200,200,200
Supplier06,500,500,500,500,500,500,500,500,500,500,500,500
----------
==========
Is there any way that I can generate this output in a .txt or .csv file in a preferred location on my computer? I know that we can perform this in command prompt, but is there any way we can do using the IDE it self?
The MiniZinc IDE currently does not include functionality to export solutions for other applications.
The current expectation is that if you want to integrate MiniZinc with other applications that you would use something like MiniZinc Python, iMiniZinc, or the command line tools, to facilitate the connection. In your case using MiniZinc Python or iMiniZinc might be a good solution since Python can generate csv files using the csv module. If you want to see and interact with the solution as well as outputting the csv file, then iMiniZinc can provide the right tooling in Jupyter Notebook to do both.
If you are very happy with the MiniZinc IDE and you want to continue using it, then the other option would to just minimize the inconvenience. Your output statement already provides the solution in csv style. So the only remaining part is making the file. The MiniZinc IDE can open .csv files. So my suggestion would in this case be to create an empty .csv file, open it in the IDE. Once you get the solution from your instance in the output window, then you copy directly into the file.

How should one zip a large folder in Windows 10, upload it to GDrive, then unzip it?

I have a directory consisting of 22 sub-directories. Altogether, the directory is about 750GB in size and I need this data on GDrive so that I can work with it in Google Colab. Obviously uploading this takes an absolute age (particularly with my slow connection) so I would like to zip it, upload it, then unzip it in the cloud.
I am using 7zip and zipping each subdirectory using the zip format and "normal" compression level. (EDIT: Can now confirm that I get the same error for 7z and tar format). Each subdirectory ends up between 14 and 20GB in size. I then upload this and attempt to unzip it in Google Colab using the following code:
drive.mount('/content/gdrive/')
!apt-get install p7zip-full
!7za x "/content/gdrive/My Drive/av_tfrecords/drumming_7zip.zip" -o"/content/gdrive/My Drive/unzipped_av_tfrecords/" -aos
This extracts some portion of the zip file before throwing an error. There are a variety of errors and sometimes the code will not even begin unzipping the file before throwing an error. This is the most common error:
Can not open the file as archive
ERROR: Unknown error -2147024891
Archives with Errors: 1
If I then attempt to rerun the !7za command, it may extract one or 2 files more from the zip file before throwing this error:
terminate called after throwing an instance of 'CInBufferException'
It may also complain about particular files within the zip archive:
ERROR: Headers Error : drumming/yt-g0fi0iLRJCE_23.tfrecords
I have also tried using:
!unzip -n "/content/gdrive/My Drive/av_tfrecords/drumming_7zip.zip" -d "/content/gdrive/My Drive/unzipped_av_tfrecords/"
But that just begins throwing errors:
file #254: bad zipfile offset (lseek): 8137146368
file #255: bad zipfile offset (lseek): 8168710144
file #256: bad zipfile offset (lseek): 8207515648
Although I would prefer a solution in Colab, I have also tried using an app available in GDrive named "Zip Extractor". But that too throws an error and has a dataquota.
This has now happened across 4 zip files and each time I try something new, it takes an a long time to try it out because of the upload speeds. Any explanations for why this is happening and how I can resolve the issue would be greatly appreciated. Also I understand there are probably alternatives to what I am trying to do and they would be appreciated also, even if they do not directly answer the question. Thank you!
I got same problem
Solve it by
new ProcessBuilder(new String[] {"7z", "x", fPath, "-o" + dir)
Use command line array, not just full line!
Luck!
Why does this command behave differently depending on whether it's called from terminal.app or a scala program?

Managing a large SPSS (*.sav) file (4.2 GB)

I have received an SPSS file from survey fielded by another company that allegedly only contains ~1500 respondents, but the file size somehow has ballooned 4.2GB. My hunch is that the reason for this is that the file was from a global survey and the 1500 records that have been selected are from the US only so there are a series of blank variables, metadata for those variables that are included in this file and may also be in multiple languages/alphabets.
I only need a subset of this data, and can likely work with it if I removed the metadata but my issue has been that I can't get the damn thing open to cut down on the number of variables. I have been using the tools at my disposal to try the following workarounds, though I'm sure there are better options:
Opening the file using PSPP (freeware SPSS) - this causes the PSPP to stop responding
Using the R command read.spss (from the foreign package) to write a .csv - this claims that the file has a duplicate variable name and won't proceed further
Using the R command spss.system.file to write a .csv - when I tried this, R has spend a lot of time thinking as it as it attempts to run this and has been running for a couple hours with no apparent success.
Using the PSPP text conversion tool (https://pspp.benpfaff.org/) to create either a dictionary or a .csv file - both of these options crash after the file has completed uploading.
I've gone back to the other company to try have them work on reducing the file size, however I wasn't sure if anyone else had any ideas to do either of the following:
Open the file using another program/converter that could turn it into a .csv or other similarly skinny file format
Use another program to at least read only the variable names included in the file so that I can provide the other company with the specific variables I need
The following command from PSPP should do what you need:
$ pspp-convert originalFile.sav output.csv
In case it doesn't, please provide terminal error message.

get data from .csv file, analyze, produce output - python3

I am trying to complete an assignment in Python3. It is very similar to the pdf found here
I have a few questions on both the execution of how to get the information I need, and if possible, some code that could move me along. I am new to python. As right now from the code I have, I keep getting the error "directory not found" after running a function to try and read the data. I know the .csv file should be in the directory where I save it to in WingIDE, but I can't get it to work correctly.
My first question is after getting each line of the .csv file to read from my get_file_list, what is the best way to take each category and throw it into an efficiency equation?
Here is my get_data_list function:
def get_data_list(filename):
data_file = open(filename, "r")
data_list = [ ]
for line_str in data_file:
data_list.append(line_str.strip().split(','))
return data_list
when I run get_data_list("player_regular_season.csv") I get the following error:
builtins.IOError: [Errno 2] No such file or directory:'player_regular_season.csv'
For the first try, you can put the data file to the same directory with the Python program and launch it from the directory.
Try also a single purpose script to learn how to work with directories. Learn the functions from the standard doc 15.1.5. Files and Directories, namely os.getcwd(), os.chdir(path), and then 10.1. os.path — Common pathname manipulations, namely os.path.isfile(path).
But read also the doc of other functions in the documents to learn what is available.
When knowing how to work with filenames and paths, have a look at the 13.1. csv — CSV File Reading and Writing. Not to be scared of all the stuff, start from the end -- 13.1.5. Examples of using the csv module.

tiffcp.exe merging a results file with a results file in a loop

I am building a web app that takes several tiff image files and merges them together into one single tiff image file using GNUWin32 tiffcp.exe from command line.
The way I was doing it was to loop through the file list and build a string of file names to merge into one single variable.
strfileList = "c:folder\folder\folder\aased98-def-wsdeff-434fsdsd-dvv.tif c:folder\folder\folder\aased98-def-wsdeff-434fsdsd-axs.tif c:folder\folder\folder\aased98-def-wsdeff-434fsdsd-dxzs.tif"
Then I would just write to the command line:
tiffcp.exe strFileList results.tif
The file names are guids and so the paths are fairly long and I do not have any control to shorten them. So if I have a bunch of these documents (over 20 files or so), the length of the string variable exceeds the limits for windows command line and the merge fails.
Since this process is just merging files, my next thought was instead of writing the file names to a string, just do the merge one file at a time. So the first time the loop runs the following type of code:
tiffcp.exe file1.tif results.tif
The result is a perfect 476k tif file. But the next iteration of the loop needs to merge the second file plus the contents of the first "results" tif file. So I do this:
tiffcp.exe results.tif file2.tiff results.tif
The results each time are a blank 1K tiff file?
All the examples I can find of tiffcp.exe say file1.tif file2.tif results.tif, none use the results file to write back to itself?
Any suggestions on how to do this?
Try the -a switch to tiffcp.exe
I'm doing something similar in Python and inside my file processing loop I'm issuing the command:
tiffcpp.exe -a temp.tif output.tif
works fine.
For an ASP.NET project you may want to try LibTiff.Net (free, open source, BSD license). That port of libtiff library contains tiffcp utility with source code. You may try to use it in your code.
Disclaimer: I am one of the maintainers of the library.
I believe your problem is caused by the use of results.tif as both input as output. If you increment the file name (i.e. results1.tif to results2.tif etc.) I believe it should work.
This is a rather inefficient approach (tiff1 is copied 9 times if you have 10 files). Since you refer to libtiff, you may take a look at the source of libtiff cp and check if it is worthwhile to embed it.