CSV reader issue in Enthought Canopy - csv

I'm trying to read a CSV file. The issue is that the file is too large, so I have had to use an error handler. Within the error handler I have to call csv.field_size_limit(), which does not work even by itself: I keep receiving a 'limit must be an integer' error. From further research, I have found that this is probably an install error. I've installed all third-party tools using the Package Manager, so I am not sure what could be going wrong. Any ideas about how to correct this issue?
import sys
import csv

maxInt = sys.maxsize
decrement = True

while decrement:
    decrement = False
    try:
        csv.field_size_limit(maxInt)
    except OverflowError:
        maxInt = int(maxInt / 10)
        decrement = True

with open("Data.csv", 'rb') as textfile:
    text = csv.reader(textfile, delimiter=" ", quotechar='|')
    for line in text:
        print ' '.join(line)

Short answer: I am guessing that you are on 64-bit Windows. If so, then try using sys.maxint instead of sys.maxsize. Actually, you will probably still run into problems because I think that csv.field_size_limit() is going to try to preallocate memory of that size. You really want to estimate the actual field size that you need and maybe double it. Both sys.maxint and sys.maxsize are much too big for this.
Long explanation: Python int objects store C long integers. On all relevant 32-bit platforms, both the size of a pointer or memory offset and C long integers are 32 bits. On most UNIXy 64-bit platforms, both the size of a pointer or memory offset and C long integers are 64 bits. However, 64-bit Windows decided to keep C long integers 32 bits while bumping up the pointer size to 64 bits. sys.maxint represents the biggest Python int (and thus C long) while sys.maxsize is the biggest memory offset. Consequently, on 64-bit Windows, sys.maxsize is a Python long integer because the Python int type cannot hold a number of that size. I suspect that csv.field_size_limit() actually requires a number that fits into a bona fide Python int object. That's why you get the OverflowError and the 'limit must be an integer' errors.
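As a minimal sketch of that suggestion (assuming Python 2 on 64-bit Windows; the field-size estimate is a made-up placeholder, so adjust it to your data):

import csv
import sys

# Hypothetical upper bound on the largest field in the file - adjust to your data.
estimated_field_size = 10 * 1024 * 1024  # 10 MB

# On Python 2 / 64-bit Windows, sys.maxint fits in a C long, unlike sys.maxsize.
platform_limit = getattr(sys, 'maxint', sys.maxsize)

# Double the estimate for headroom, but never exceed what the platform accepts.
csv.field_size_limit(min(2 * estimated_field_size, platform_limit))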

Related

How long is this memory section specified in this .dtb file?

I feel like I'm not understanding how to interpret the format of dtb/dts files, and was hoping you could help. After running these commands:
qemu-system-riscv64 -machine virt -machine dumpdtb=riscv64-virt.dtb
dtc -I dtb -O dts -o riscv64-virt.dts riscv64-virt.dtb
The resulting riscv64-virt.dts contains the definition of the memory for the machine:
/dts-v1/;

/ {
    #address-cells = <0x02>;
    #size-cells = <0x02>;
    compatible = "riscv-virtio";
    model = "riscv-virtio,qemu";

    ...other memory definitions...

    memory@80000000 {
        device_type = "memory";
        reg = <0x0 0x80000000 0x0 0x8000000>;
    };
};
I have a few questions:
Why are there multiple pairs of values in the reg definition? Based on this link, it appears the second pair, 0x0 0x8000000, overwrites what was just set by the previous pair, 0x0 0x80000000.
How long is this memory bank? Which value tells me this?
The first line says memory@80000000, but then the reg values start at 0x0. Does the memory start at 0x0 or 0x80000000?
Basically, I just feel like I don't understand how to interpret this. In plain English, what is being defined here?
This is partially covered in the devicetree specification, p. 13. reg is given in (address, length) pairs. In your case, address and length are each 64-bit values, built from two 32-bit cells.
Thus the address is 0x80000000, and the size 0x8000000.
Edit:
The properties #address-cells and #size-cells specify how many cells (32-bit values) are used for the address and the size, respectively. In an original dts they are always specified within the device's parent node. Maybe you can find them within your decompiled dts.
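To make the cell arithmetic concrete, here is a small illustrative Python sketch (not part of any devicetree tool) that combines the four cells of your reg property the way #address-cells = <0x02> and #size-cells = <0x02> dictate:

# reg = <0x0 0x80000000 0x0 0x8000000> parsed as four 32-bit cells
cells = [0x0, 0x80000000, 0x0, 0x8000000]

# Two cells per address and two per size: high word first, then low word.
address = (cells[0] << 32) | cells[1]
size = (cells[2] << 32) | cells[3]

print(hex(address))  # 0x80000000 - where the memory bank starts
print(hex(size))     # 0x8000000 = 134217728 bytes = 128 MiB

So the bank starts at 0x80000000 and is 128 MiB long; the leading 0x0 cells are just the high 32 bits of each 64-bit value, not separate definitions that overwrite each other.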

CUDA out of memory message after using just ~2.2GB of memory on a GTX1080

I'm doing matrix multiplication on a GTX1080 GPU using JCuda, version 0.8.0RC with CUDA 8.0. I load two matrices A and B into the device in row-major vector form, and read the product matrix from the device. But I'm finding that I run out of device memory earlier than I would expect. For example, if matrix A is dimensioned 100000 * 5000 = 500 million entries = 2GB worth of float values, then:
cuMemAlloc(MatrixA, 100000 * 5000 * Sizeof.FLOAT);
works fine. But if I increase the number of rows to 110000 from 100000, I get the following error on this call (which is made before the memory allocations for matrices B and C, so those are not part of the problem):
Exception in thread "main" jcuda.CudaException: CUDA_ERROR_OUT_OF_MEMORY
at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:344)
at jcuda.driver.JCudaDriver.cuMemAlloc(JCudaDriver.java:3714)
at JCudaMatrixMultiply.main(JCudaMatrixMultiply.java:84) (my code)
The issue is that allocating a matrix of this size on the device should take only about 2.2GB, and the GTX1080 has 8GB of memory, so I don't see why I'm running out of memory. Does anyone have any thoughts on this? It's true that I'm using JCuda 0.8.0RC with the release version of CUDA 8, but I tried downloading the RC version of CUDA 8 (8.0.27) to use with JCuda 0.8.0RC and had some problems getting it to work. If version compatibility is likely to be the issue, however, I can try again.
Matrices of 100000 * 5000 are pretty big, of course, and I won't need to work with larger matrices for a while on my neural network project, but I would like to be confident that I can use all 8GB of memory on this new card. Thanks for any help.
tl;dr:
When calling
cuMemAlloc(MatrixA, (long)110000 * 5000 * Sizeof.FLOAT);
// ^ cast to long here
or alternatively
cuMemAlloc(MatrixA, 110000L * 5000 * Sizeof.FLOAT);
// ^ use the "long" literal suffix here
it should work.
The last argument to cuMemAlloc is of type size_t. This is an implementation-specific unsigned integer type for "arbitrary" sizes. The closest possible primitive type in Java for this is long. And in general, every size_t in CUDA is mapped to long in JCuda. In this case, the Java long is passed as a jlong into the JNI layer, and this is simply cast to size_t for the actual native call.
(The lack of unsigned types in Java and the odd plethora of integer types in C can still cause problems. Sometimes, the C types and the Java types just don't match. But as long as the allocation is not larger than 9 Million Terabytes (!), a long should be fine here...)
But the comment by havogt led to the right track. What happens here is indeed an integer overflow: the computation of the actual value
110000 * 5000 * Sizeof.FLOAT = 2200000000
is by default done using the int type in Java, and this is where the overflow happens: 2200000000 is larger than Integer.MAX_VALUE (2147483647). The result is a negative value. When this is cast to the (unsigned) size_t value in the JNI layer, it becomes a ridiculously large positive value, which clearly causes the error.
When doing the computation using long values, either by explicitly casting to long or by appending the L suffix to one of the literals, the value is passed to CUDA as the proper long value of 2200000000.
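To see the overflow concretely, here is a short Python sketch that simulates Java's 32-bit int arithmetic (Python ints are arbitrary-precision, so the wraparound has to be applied by hand; SIZEOF_FLOAT stands in for JCuda's Sizeof.FLOAT):

def java_int(n):
    # Wrap an arbitrary integer into Java's signed 32-bit int range.
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n & 0x80000000 else n

SIZEOF_FLOAT = 4

print(java_int(110000 * 5000 * SIZEOF_FLOAT))  # -2094967296, not 2200000000

# Reinterpreted as an unsigned 64-bit size_t in the JNI layer:
print((-2094967296) % (1 << 64))  # 18446744071614584320 - hence the OOM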

idl making big numbers = 0.0

I'm trying to find the mass of the black hole at the center of this galaxy. I have the mass in solar masses, but need it in kg. However, when I try to convert (1 Msolar = 1.989*10^30 kg), IDL just gives me 0.0000. I have no idea what I'm doing wrong, and I tried just telling IDL to print both 1.989*10^30 and 1989000000000000000000000000000; the outputs are 0.00000 and -1 respectively. Can someone please explain why this is happening?
This is a type conversion/overflow issue. When you use large integer numbers, you need to explicitly define them as long or long64 (i.e., 64-bit long integer). For real numbers, you can use float or double; the easiest way is the following:
msun = 1.989d30
which is equivalent to 1.989 x 10^30 as a double-precision floating-point number. If you want single precision, then just do the following:
msun = 1.989e30
To make a 32-bit long integer, just use:
msun = 1989L * 10L^(27)
or for a 64-bit long integer:
msun = 1989LL * 10LL^(27)
(Beware, though: 10^27 exceeds even the 64-bit integer range, whose maximum is about 9.2 x 10^18, so these forms only illustrate the L and LL syntax; a value as large as the solar mass in kg has to be stored as a float or double.)
I agree with @honeste_vivere's answer about overflow and data types, but I would add that I often change units to avoid this. I frequently have densities on the order of 1e19/m^3, so I cast density in units of 1e19/m^3 and then deal with numbers that are of order 1. This prevents math errors during least-squares fits and other operations that might do things like squaring my data.
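As a quick illustration of that rescaling trick (a Python/NumPy sketch rather than IDL, but single precision behaves the same way; the density value is made up):

import numpy as np

n = np.float32(2e19)        # density in m^-3, single precision
print(n * n)                # inf: 4e38 exceeds the float32 maximum (~3.4e38)

n_scaled = np.float32(2.0)  # the same density in units of 1e19 m^-3
print(n_scaled * n_scaled)  # 4.0 - squaring stays comfortably in range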

Efficiently read some bytes from DataReader?

I have a stream containing an ANSI string, prefixed with its length in bytes. How can I read it into a std::string?
Something like:
short len = reader.readInt16();
char* result = reader.readBytes(len); // ???
std::string str(result, result + len);
but there is no method readBytes(int).
Side question: is it slow to read with readByte() from DataReader one byte at a time?
According to MSDN, DataReader::ReadBytes exists and is what you are looking for: http://msdn.microsoft.com/en-us/library/windows/apps/windows.storage.streams.datareader.readbytes
It takes a Platform::Array<unsigned char> as an argument, which presumably you'll initialize using the prefixed length, and which on return will contain your bytes. From there it's a tedious-but-straightforward process to construct the desired std::string.
The basic usage will look something like this (apologies, on a Mac at the moment, so precise syntax might be a little off):
auto len = reader->ReadInt16();
auto data = ref new Platform::Array<uint8>(len);
reader->ReadBytes(data);
// now data has the bytes you need, and you can make a string with it,
// e.g. via the Array's Data pointer and Length property:
std::string str(reinterpret_cast<const char*>(data->Data), data->Length);
Note that the above code is not production-ready - it's definitely possible that reader does not have enough data buffered, so you'll need to call reader->LoadAsync(len) and create a continuation to process the data when it becomes available. Despite that, hopefully this is enough to get you going.
EDIT:
Just noticed your side question. The short answer is: yes, it is much slower to read one byte at a time, because it is much more work.
The long answer: consider what goes into reading each byte:
A function call happens - a stack frame is allocated
Some logic for reading a byte from the buffer runs
The function returns - the stack frame is popped, the result is pushed, control returns
You take the byte and push it into a std::string, occasionally causing a dynamic reallocation (unless you've already called str.resize(len), that is)
Of all the things that happen, the dynamic reallocation is the real performance killer. That said, if you have lots of bytes, the work of function calling will dominate the work of actually reading each byte.
Now, consider what happens when you read all the bytes at once:
A function call happens - stack frame, push the result array
(in the happy path where all requested data is there) memcpy from the internal buffer to your pre-allocated array
return
memcpy into the string
This is of course quite a bit faster - your allocations are constant with respect to the number of bytes read, as is the number of function calls. The difference is easy to measure, as the sketch below shows.
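For instance, here is a rough measurement in Python (an in-memory stream stands in for any buffered reader; the absolute numbers will vary, but the ratio carries over):

import io
import timeit

buf = b"x" * (1 << 20)  # 1 MiB of sample data

def one_byte_at_a_time():
    reader = io.BytesIO(buf)
    out = bytearray()
    while True:
        b = reader.read(1)  # one call (and one tiny allocation) per byte
        if not b:
            break
        out += b
    return bytes(out)

def all_at_once():
    return io.BytesIO(buf).read(len(buf))  # one call, one bulk copy

print(timeit.timeit(one_byte_at_a_time, number=3))  # much slower
print(timeit.timeit(all_at_once, number=3))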

Why do I have an "unaligned memory accesses not supported" error?

I got an "unaligned memory accesses not supported error" and did a Google search for that
but there were no clear explanations.
The whole error message is:
/c:\cuda\include\math_functions_dbl_ptx1.h(1308): Error: Unaligned memory accesses not supported
The following code caused the error:
for (j = low; j <= high; j++)
The variables j and high are declared as int.
I have encountered this kind of error before, but it resolved itself (I did nothing).
Can anybody explain what is going on here?
Theory
On many machines - but not Intel IA32 ones or their relatives - if you want to access a 2-byte quantity (integer), the address must be even, not odd; if you want to access a 4-byte quantity (integer or float) the address must be a multiple of 4 bytes; if you want to access an 8-byte quantity (integer or double), the address must be a multiple of 8 bytes; and so on.
Superficially, then, your code has somehow ended up dereferencing a pointer that has bits set in its low-order parts when it should not. For example, one way of forcing the issue (in C) would be:
long l = 0x12345678;
void *v = (char *)&l + 1;  /* one byte past a properly aligned address */
long *lp = v;              /* lp is now misaligned for a long */
l = *lp;                   /* unaligned access - faults on strict platforms */
By the time you've gone through the pointer arithmetic, the address in lp is not 4-byte (or 8-byte) aligned; it is off by one because of the +1. The last line then dereferences that unaligned pointer, which is what triggers the unaligned memory access.
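If you want to see the alignment requirements your own platform imposes, here is a small illustrative Python sketch using ctypes (the printed values vary by platform and compiler; this is not part of CUDA):

import ctypes

# Required alignment, in bytes, of some common C types on this platform.
for name, ctype in [("short", ctypes.c_short),
                    ("int", ctypes.c_int),
                    ("long", ctypes.c_long),
                    ("double", ctypes.c_double)]:
    print(name, ctypes.alignment(ctype))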
Practice
Since you have not shown the declarations for your code, we can't be sure what causes the trouble (though you do say j and high are int variables; there is no comment about low). Indeed, almost independent of the declarations, it seems unlikely that the quoted for loop is the source of the trouble. It may be code close to it, but it is probably not that line.
There is a chance that you have a buffer over-writing problem somewhere and are modifying a pointer by accident, and that modified pointer generates the error. But, since that line does not seem to include any pointers, it is unlikely to be that line that is actually triggering the trouble.