In VHDL is there a way using std.textio to read elements of a .csv file and store them in a variable to then use? - csv

I can currently use the readline and read procedures to read a line from the file and store the characters in variables, but the amount read is governed by the size of the variable I am reading into. For example, if the first line of the file was
hello,world
and I wanted to store the two words in different variables, I would do something along the lines of
file in_file : text open READ_MODE is "hello_world.csv";
variable in_line : line;
variable first_word : string(1 to 5);
variable second_word : string(1 to 5);
begin
readline(in_file, in_line);
read(in_line, first_word);
read(in_line, second_word);
However, that is dependent upon the size of the elements. I want to be able to generically read the first element before the comma and assign it to a variable, then look for the next element up to the next comma and store that in a different variable, if that makes sense.
Many Thanks

The open source VUnit test framework has a standalone string operations package containing a split function
impure function split (
constant s : string;
constant sep : string;
constant max_split : integer := -1)
return lines_t;
that you can use to split a string (s) into its parts which are separated by sep. For example
parts := split("hello,world",",");
parts is a vector of elements of type line, so parts(0).all would in this case equal "hello". Have a look at the testbench for the package and look for the "Test split" test case to see the details on how the function handles various normal inputs and corner cases.
Since we use the line type rather than string we don't have to know the length of individual elements.
I'm one of the authors for VUnit.
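For readers who just want to see the splitting semantics, here is a minimal sketch in Python (an illustration only, not VUnit code — the names are made up): the point is that each part's length is discovered at runtime, so no fixed-size string declarations are needed.

```python
def split(s, sep):
    # Split a string into the parts separated by sep;
    # the length of each part is determined at runtime.
    return s.split(sep)

parts = split("hello,world", ",")
print(parts[0])  # hello
```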

Related

How do I load just one non-header record, using SSIS, to a table?

I have a flat file with several hundred thousand rows. This file has no header rows. I need to load just the first row into a hold table and then read the last field into a variable. This hold table has just two columns, first one for most of the row, second for the field I need to move into the variable. Optionally, how can I read this one field, from the flat file, into a variable?
I should note that I am currently loading the entire file, then reading just the first row to get the FILE_NBR into a variable. I would like to speed it up a bit by only loading that first row, instead of the entire file.
My source is a fixed position file, so I am putting all fields except for the last 6 bytes into one field and then the last 6 bytes into the FILE_NBR field.
I am looking to only load one record, instead of the entire file, as I only need that field from one record (the number is the same on every record in the file), for comparison to another table.
For the use case you're describing, I would likely use a Data Flow Task with a Script Component (acting as a source) feeding an OLE DB/ADO destination.
Assumptions
A variable named @[User::CurrentFileName] exists, is of type String, and is populated with the fully qualified path to the source file.
The Script Component, acting as a source, has two output columns defined: ROR (a string of appropriate length, not to exceed 4000 characters) and FILENBR (length 6). The output buffer is left at its default name of Output0.
Approximate source component code (ensure you set CurrentFileName as a ReadOnly variable in the component)
// A variable for holding our data
string inputRow = "";
// Convert the SSIS variable into a C# variable.
// In a Script Component, read-only variables are exposed on this.Variables
string sourceFile = this.Variables.CurrentFileName;
// Read the first line from the source file
// (I was lazy, feel free to improve this)
foreach (string line in System.IO.File.ReadLines(sourceFile))
{
    inputRow = line;
    // We have the one row we want, let's blow this popsicle stand
    break;
}
int lineLen = inputRow.Length;
// Everything up to the final 6 characters
string ror = inputRow.Substring(0, lineLen - 6);
// The final 6 characters
string fileNumber = inputRow.Substring(lineLen - 6);
// Now that we have the two pieces we need, let's do the SSIS specific thing:
// create a row in our output buffer and assign values
Output0Buffer.AddRow();
Output0Buffer.ROR = ror;
Output0Buffer.FILENBR = fileNumber;
Ref: Is File.ReadLines buffering read lines?
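The same first-row-plus-slice idea can be sketched in Python (the function name and the 6-byte suffix width are taken from the question; the file name in the usage comment is hypothetical):

```python
def read_file_nbr(path, suffix_len=6):
    # Read only the first line of the file, then slice off the
    # final suffix_len characters as the FILE_NBR field.
    with open(path) as f:
        first = f.readline().rstrip("\r\n")
    return first[:-suffix_len], first[-suffix_len:]

# ror, file_nbr = read_file_nbr("source.txt")
```

Because only the first line is read, the several-hundred-thousand remaining rows are never touched.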

Provide mean pixel values to Caffe's python classify.py

I'd like to test a Caffe model with the Python wrapper:
python classify.py --model_def ./deploy.prototxt --pretrained_model ./mymodel.caffemodel input.png output
Is there a simple way to give mean_pixel values to the python wrapper? It seems to only support a mean_file argument?
The code uses the args.mean_file variable to read numpy-format data into a variable mean. The easiest method is to add a new parser argument named mean_pixel which takes a single mean value, store it in a mean_pixel variable, then create an array called mean with the same dimensions as the input data and copy the mean_pixel value to all of its elements. The rest of the code will function as normal.
parser.add_argument(
    "--mean_pixel",
    type=float,
    default=128.0,
    help="Enter the mean pixel value to be subtracted."
)
The above code segment will try to take a command line argument called mean_pixel.
Replace the code segment:
if args.mean_file:
    mean = np.load(args.mean_file)
with:
if args.mean_file:
    mean = np.load(args.mean_file)
elif args.mean_pixel:
    mean_pixel = args.mean_pixel
    # channels is the number of channels of the image
    mean = np.empty((image_dims[0], image_dims[1], channels))
    mean.fill(mean_pixel)
This makes the code pick up the mean_pixel value passed as an argument when mean_file is not given. The code above creates an array with the same dimensions as the image and fills it with the mean_pixel value.
The rest of the code needn't be changed.
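As a quick sanity check of the array construction, here is a standalone sketch (the dimensions and channel count are made-up stand-ins for image_dims and channels; np.full is an equivalent one-liner for the empty-then-fill pattern):

```python
import numpy as np

image_dims = (4, 4)   # hypothetical image height/width
channels = 3          # hypothetical channel count
mean_pixel = 128.0

# Build an array shaped like the input image, with every
# element set to the single mean value.
mean = np.full((image_dims[0], image_dims[1], channels), mean_pixel)
print(mean.shape)  # (4, 4, 3)
```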

Can I read the rest of the line after a positive value of IOSTAT?

I have a file with 13 columns and 41 lines consisting of the coefficients for the Joback Method for 41 different groups. Some of the values are non-existing, though, and the table lists them as "X". I saved the table as a .csv and in my code read the file to an array. An excerpt of two lines from the .csv (the second one contains non-existing coefficients) looks like this:
48.84,11.74,0.0169,0.0074,9.0,123.34,163.16,453.0,1124.0,-31.1,0.227,-0.00032,0.000000146
X,74.6,0.0255,-0.0099,X,23.61,X,797.0,X,X,X,X,X
What I've tried doing was to read and define an array to hold each IOSTAT value so I can know if an "X" was read (that is, IOSTAT would be positive):
DO I = 1, 41
  READ(25,*,IOSTAT=ReadStatus(I)) (JobackCoeff(I,J), J = 1, 13)
END DO
The problem, I've found, is that if the first value of the line to be read is "X", producing a positive value of ReadStatus, then the rest of the values of that line are not read correctly.
My intent was to use the ReadStatus array to produce an error message if JobackCoeff(I,J) caused a read error, therefore pinpointing the "X"s.
Can I force the program to keep reading a line after there is a reading error? Or is there a better way of doing this?
As soon as an error occurs during the input execution then processing of the input list terminates. Further, all variables specified in the input list become undefined. The short answer to your first question is: no, there is no way to keep reading a line after a reading error.
We come, then, to the usual answer when more complicated input processing is required: read the line into a character variable and process that. I won't write complete code for you (mostly because it isn't clear exactly what is required), but when you have a character variable you may find the index intrinsic useful. With this you can locate Xs (with repeated calls on substrings to find all of them on a line).
Alternatively, if you provide an explicit format (rather than relying on list-directed (fmt=*) input) you may be able to do something with non-advancing input (advance='no' in the read statement). However, as soon as an error condition comes about then the position of the file becomes indeterminate: you'll also have to handle this. It's probably much simpler to process the line-as-a-character-variable.
An outline of the concept (without declarations, robustness) is given below.
read(iunit, '(A)') line
idx = 1
do i=1, 13
  read(line(idx:), *, iostat=iostat) x(i)
  if (iostat.gt.0) then
    print '("Column ",I0," has an X")', i
    x(i) = -HUGE(0.)  ! Recall x(i) was left undefined
  end if
  idx = idx + INDEX(line(idx:), ',')
end do
An alternative, long used by many many Fortran programmers, and programmers in other languages, would be to use an editor of some sort (I like sed) and modify the file by changing all the Xs to NANs. Your compiler has to provide support for IEEE NaNs for this to work (most of the current crop in widespread use do), and the read will then correctly interpret NAN in the input file as a real number with value NaN.
This approach has the benefit, compared with the already accepted (and perfectly good) answer, of not requiring clever programming in Fortran to parse input lines containing mixed entries. Use an editor for string processing, use Fortran for reading numbers.
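The substitution idea is easy to see outside Fortran; here is a Python sketch of the same concept (the sample row is the second line from the question's .csv): once X becomes NAN, an ordinary numeric parse succeeds and the missing entries can be detected afterwards.

```python
import math

row = "X,74.6,0.0255,-0.0099,X,23.61,X,797.0,X,X,X,X,X"
# Replace each X with NAN, then parse every token as a float
values = [float(tok.replace("X", "NAN")) for tok in row.split(",")]
# 1-based column numbers that held an X
missing = [i + 1 for i, v in enumerate(values) if math.isnan(v)]
print(missing)
```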

X++ container to CSV string

I was wondering whether a container with values such as ["abc", 50, myDate, myRealNumber] can be converted to "abc","50","1/1/1900","-50.34" using a single function.
The con2Str global function fails if the input type is anything other than str.
I tried creating my own version of con2str function to use an "anyType" instead of str, but it fails because anyType cannot be assigned a different type after the first assignment.
Such a function does not exist; it would have to deal with strings containing quotes etc.
This is all handled in class CommaIo method writeExp.
But it writes to a file, of course.
Regarding your problem with anytype you could use the class SysAnyType which wraps your value in another object so that multiple assignments are possible.
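To see why the quoting concern matters, here is a sketch in Python's csv module of the kind of escaping a writeExp-style function has to perform (the values mirror the question's example; the date format is Python's, not X++'s):

```python
import csv
import datetime
import io

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
# Mixed types are stringified and quoted
writer.writerow(["abc", 50, datetime.date(1900, 1, 1), -50.34])
# Embedded quotes are doubled, which naive string joining would get wrong
writer.writerow(['say "hi"', 1])
print(buf.getvalue())
```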

Multiple row delimiters

How to define multiple row delimiters for a Flat File Connection in SSIS?
for example for a text file containing this string:
Civility is required at all times; rudeness will not be tolerated.
I want to have these two rows after using ';' and '.' as row delimiters:
Civility is required at all times
rudeness will not be tolerated
For source data, I created a 3 line file
Civility is required at all times; rudeness will not be tolerated.
The quick brown fox jumped over the lazy dogs.
I am but a single row with no delimiter beyond the carriage return
The general approach I have taken below is to use a flat file connection manager with a format of Ragged Right and a header row delimiter of {CR}{LF}. I defined one column, InputRow, as String 8000. YMMV
In my data flow, after the flat file source, I add a script component as a data transformation called Split Rows.
On the Input Columns tab, check the InputRow and leave it as ReadOnly so the script can access the value. It'd be nice if you could switch it to ReadWrite and modify the outgoing values but that's not applicable for this type of operation.
By default, a script task is a synchronous component, meaning there's a 1:1 relationship between rows in and rows out. This will not suit your needs so you will need to switch it over to Asynchronous mode. I renamed the Output 0 to OutputSplit and changed the value of SynchronousInput from "Input 0 (16)" to None. Your value for 16 may vary.
On your Output Columns for OutputSplit, Add a Column with a name of SplitRow DT_STR 8000.
Within your script transformation, you only need to be concerned with the ProcessInputRow method. The string class offers a split method that takes an array of character values that will work as the splitters. Currently, it is hard coded below in the array initializer but it could just as easily be defined as a variable and passed into the script. That is left as an exercise to the poster.
/// <summary>
/// We have to make this an async script as 1 input row can be many output rows
/// </summary>
/// <param name="Row"></param>
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    string[] results = Row.InputRow.Split(new char[] { ';', '.' });
    foreach (string line in results)
    {
        // Remove this check if it is desirable to have empty strings in the buffer
        if (!string.IsNullOrEmpty(line))
        {
            OutputSplitBuffer.AddRow();
            // You might want to call trim operations on the line
            OutputSplitBuffer.SplitRow = line;
        }
    }
}
With all of this done, I hit F5 and voila, the rows come out split as expected.
This is going to be a fairly memory intensive package depending on how much data you run through it. I am certain there are optimizations one could make but this should be sufficient to get you going.
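The splitting logic itself is easy to prototype outside SSIS before committing to a package; here is a Python sketch of the same multi-delimiter split with empty entries filtered out (the sample text is the question's first line):

```python
import re

text = "Civility is required at all times; rudeness will not be tolerated."
# Split on either ';' or '.', then strip whitespace and drop empties
rows = [part.strip() for part in re.split(r"[;.]", text) if part.strip()]
print(rows)
```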