I'm trying to load a CSV containing kNN data (3 columns, no header names), e.g.:
4 3 a
1 3 a
3 3 a
4 5 b
I have been able to load the file into a string. When I try to move that into a table I get no errors, but when I print the table to screen every value is nil. I tried changing the contents of the file, which gives the same result, and if I change (contents) to (knn_data) I get the path of the CSV file in every key.
I'm trying to get the CSV data to appear within the indexed table, split into its 3 columns.
Here is the code:
--load kNN file.
local knn_data = system.pathForFile("knn.csv", system.ResourceDirectory)
local file, errorString = io.open(knn_data, "r")

if not file then
    print("File Error: File Unavailable")
else
    local contents = file:read("*a")
    print(contents)
    io.close(file)
end

file = nil

-- load data into table
dataset = {}
for line in io.lines(knn_data) do
    dataset[#dataset+1] = (contents)
end
From the code you previously attached as a screenshot:
...
else
    local contents = file:read("*a")
    print(contents)
    --io.close(file)
end
contents is a local variable in your else statement.
Outside of it, contents is nil.
dataset = {}
for line in io.lines(knn_data) do
    dataset[#dataset+1] = (contents)
So dataset[#dataset+1] = (contents) is equivalent to dataset[#dataset+1] = nil, which stores nothing.
Within that generic for loop, line contains the line read from the file, so that is the value you should actually work with.
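For example, here is a minimal sketch of that fix, assuming the three columns are whitespace-separated as in your sample data (the names x, y, and label are mine, purely for illustration):

dataset = {}
for line in io.lines(knn_data) do
    -- split e.g. "4 3 a" into its three whitespace-separated fields
    local x, y, label = line:match("(%S+)%s+(%S+)%s+(%S+)")
    if x then
        dataset[#dataset+1] = { tonumber(x), tonumber(y), label }
    end
end
print(dataset[1][1], dataset[1][2], dataset[1][3]) -- prints: 4 3 a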
I have been using the following Lua code for a while to do simple CSV-to-array conversions. Previously, everything had a value in every column, but this time, on a CSV-formatted bank statement, there are empty values, which this code does not handle.
Here's an example CSV, with debits and credits.
Transaction Date,Transaction Type,Sort Code,Account Number,Transaction Description,Debit Amount,Credit Amount,Balance
05/04/2022,DD,'11-70-79,6033606,Refund,,10.00,159.57
05/04/2022,DEB,'11-70-79,6033606,Henry Ltd,30.00,,149.57
05/04/2022,SO,'11-70-79,6033606,NEIL PARKS,20.00,,179.57
01/04/2022,FPO,'11-70-79,6033606,MORTON GREEN,336.00,,199.57
01/04/2022,DD,'11-70-79,6033606,WORK SALARY,,100.00,435.57
01/04/2022,DD,'11-70-79,6033606,MERE BC,183.63,,535.57
01/04/2022,DD,'11-70-79,6033606,ABC LIFE,54.39,,719.20
I've tried different patterns (https://www.lua.org/pil/20.2.html), but none seem to work, and I'm beginning to think I can't fix this via the pattern alone, as changing it would break how the code works for the rest of the fields. I'd appreciate it if anyone could share how they would approach this…
local csvfilename = "/mnt/nas/Fireflyiii.csv"
local MATCH_PATTERN = "[^,]+"

local function create_array_from_file(csvfilename)
    local file = assert(io.open(csvfilename, "r"))
    local arr = {}
    for line in file:lines() do
        local row = {}
        for match in string.gmatch(line, MATCH_PATTERN) do
            table.insert(row, match)
        end
        table.insert(arr, row)
    end
    return arr
end
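For what it's worth, a common way to keep empty fields is to append a trailing comma to the line and capture everything up to each comma, so empty fields come through as empty strings. A minimal sketch of that idea (split_csv_line is a hypothetical helper; the pattern change is the only real difference from MATCH_PATTERN above):

local function split_csv_line(line)
    local row = {}
    -- "([^,]*)," matches every field, including empty ones;
    -- appending a comma ensures the last field is matched too
    for match in string.gmatch(line .. ",", "([^,]*),") do
        table.insert(row, match)
    end
    return row
end

Note this still does not handle quoted fields that contain commas, but none appear in the statement sample above.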
Using Lua, I'm downloading a .csv file and then taking the first line and last line to help me validate the time period visually, via the start and end date/times provided.
I'd also like to scan through the values and create a variety of variables, e.g. the highest, lowest and average value reported during that period.
The .csv is formatted in the following way:
created_at,entry_id,field1,field2,field3,field4,field5,field6,field7,field8
2021-04-16 20:18:11 UTC,6097,17.5,21.1,20,20,19.5,16.1,6.7,15.10
2021-04-16 20:48:11 UTC,6098,17.5,21.1,20,20,19.5,16.3,6.1,14.30
2021-04-16 21:18:11 UTC,6099,17.5,21.1,20,20,19.6,17.2,5.5,14.30
2021-04-16 21:48:11 UTC,6100,17.5,21,20,20,19.4,17.9,4.9,13.40
2021-04-16 22:18:11 UTC,6101,17.5,20.8,20,20,19.1,18.5,4.4,13.40
2021-04-16 22:48:11 UTC,6102,17.5,20.6,20,20,18.7,18.9,3.9,12.40
2021-04-16 23:18:11 UTC,6103,17.5,20.4,19.5,20,18.4,19.2,3.5,12.40
And my code to get the first and last line is as follows:
print("Part 1")
print("Start : check 2nd and last row of csv")
local ctr = 0
local i = 0
local csvfilename = "/home/pi/shared/feed12hr.csv"
local hFile = io.open(csvfilename, "r")
for _ in io.lines(csvfilename) do ctr = ctr + 1 end
print("...... Count : Number of lines downloaded = " ..ctr)
local linenumbera = 2
local linenumberb = ctr
for line in io.lines(csvfilename) do i = i + 1
if i == linenumbera then
secondline = line
print("...... 2nd Line is = " ..secondline) end
if i == linenumberb then
lastline = line
print("...... Last line is = " ..lastline)
-- return line
end
end
print("End : Extracted 2nd and last row of csv")
But I now plan to pick a column, ideally by name (as I'd like to be able to use this against other .csv exports that have a similar structure), and get the .csv into a table/array.
I've found an option for that here - Csv file to a Lua table and access the lines as new table or function(). See below:
#!/usr/bin/lua
print("Part 2")
print("Start : Convert .csv to table")
local csvfilename = "/home/pi/shared/feed12hr.csv"
local csv = io.open(csvfilename, "r")
local items = {} -- store our values here
local headers = {}
local first = true
for line in csv:gmatch("[^\n]+") do
    if first then -- this is to handle the first line and capture our headers
        local count = 1
        for header in line:gmatch("[^,]+") do
            headers[count] = header
            count = count + 1
        end
        first = false -- set first to false to switch off the header block
    else
        local name
        local i = 2 -- we start at 2 because we won't increment for the header
        for field in line:gmatch("[^,]+") do
            name = name or field -- check if we know the name of our row
            if items[name] then -- if the name is already in the items table then this is a field
                items[name][headers[i]] = field -- assign our value at the header in the table with the given name
                i = i + 1
            else -- if the name is not in the table we create a new index for it
                items[name] = {}
            end
        end
    end
end
print("End : .csv now in table/array structure")
But I'm getting the following error:
pi@raspberrypi:/ $ lua home/pi/Documents/csv_to_table.lua
Part 2
Start : Convert .csv to table
lua: home/pi/Documents/csv_to_table.lua:12: attempt to call method 'gmatch' (a nil value)
stack traceback:
        home/pi/Documents/csv_to_table.lua:12: in main chunk
        [C]: ?
pi@raspberrypi:/ $
Any ideas on that? I can confirm that the .csv file is there.
Once everything (hopefully) is in a table, I then want to be able to generate a list of variables based on the information in a chosen column, which I can then use and send within a push notification or email (I already have the code for that).
The following is what I've been able to create so far, but I would appreciate any/all help to do more analysis of the values within the chosen column, so I can get things like the highest, lowest, average, etc.
print("Part 3")
print("Start : Create .csv analysis values/variables")
local total = 0
local count = 0
for name, item in pairs(items) do
for field, value in pairs(item) do
if field == "cabin" then
print(field .. " = ".. value)
total = total + value
count = count + 1
end
end
end
local average = tonumber(total/count)
local roundupdown = math.floor(average * 100)/100
print(count)
print(total)
print(total/count)
print(rounddown)
print("End : analysis values/variables created")
io.open returns a file handle on success, not a string. Hence
local csv = io.open(csvfilename, "r")
--...
for line in csv:gmatch("[^\n]+") do
--...
will raise an error.
You need to read the file into a string first.
Alternatively, you can iterate over the lines of a file using file:lines(...) or io.lines, as you already do in your code.
local csv = io.open(csvfilename, "r")
if csv then
    for line in csv:lines() do
        -- ...
    end
end
You're iterating over the file more often than you need to.
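For instance, a single pass can count the lines and capture both the 2nd and the last line at once. A minimal sketch, reusing csvfilename and the secondline/lastline names from your Part 1:

local ctr = 0
local secondline, lastline
for line in io.lines(csvfilename) do
    ctr = ctr + 1
    if ctr == 2 then secondline = line end
    lastline = line -- overwritten each iteration; holds the final line after the loop
end
print(ctr, secondline, lastline)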
Edit:
This is how you could fill a data table while calculating the maxima for each data column on the fly. This assumes you always have valid lines! A proper solution should verify the data.
-- prepare a table to store the minima and maxima in
local colExtrema = {min = {}, max = {}}
local rows = {}
-- go over the file linewise
for line in csvFile:lines() do
    -- split the line into 3 parts
    local timeStamp, id, dataStr = line:match("([^,]+),(%d+),(.*)")
    -- create a row container
    local row = {timeStamp = timeStamp, id = id, data = {}}
    -- fill the row data
    for val in dataStr:gmatch("[%d%.]+") do
        local n = tonumber(val)
        table.insert(row.data, n)
        -- find the biggest value so far
        -- our initial value is the smallest number possible
        local oldMax = colExtrema.max[#row.data] or -math.huge
        -- store the bigger value as the new maximum
        colExtrema.max[#row.data] = math.max(n, oldMax)
    end
    -- insert row data
    table.insert(rows, row)
end
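The prepared min table can be filled the same way; a small sketch of the additions, following the variable names above:

-- inside the gmatch loop, next to the maximum logic:
local oldMin = colExtrema.min[#row.data] or math.huge
colExtrema.min[#row.data] = math.min(n, oldMin)

-- and after the file loop, e.g. for the first data column:
print("min:", colExtrema.min[1], "max:", colExtrema.max[1])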
I received a flat file that cannot be generated any other way. The delimiter is a comma and the text qualifier is a double quote. The problem is that sometimes I have a double quote inside the value. For example:
"0","12345", "Centre d"edu et de recherche", "B8E7"
Because of the double quote in the value, I received this error:
[Flat File Source [58]] Error: The column delimiter for column "XYZ" was not found.
[Flat File Source [58]] Error: An error occurred while processing file "C:\somefile.csv" on data row 296.
What can I do to process this file?
I use SSIS 2016 with Visual Studio 2015
You can use the Flat File Source error output to redirect bad rows to another flat file and correct values manually while all valid rows will be processed.
There are many links online to learn more about Flat File Source Error Output:
Flat File source Error Output connection in SSIS
How to Avoid Package Design Flaws When Sourcing Data From Flat Files
Flat File Source Editor (Error Output Page)
Update 1 - Workaround using Script Component and conditional split
Since the Flat File error output is not working, you can use a Script Component with a Conditional Split to filter out bad rows. The following update is a step-by-step guide to implement that:
Add a Flat File connection manager, go to the Advanced tab, delete all columns except one, and change its length to 4000.
Add a Script Component, go to the Input and Output Columns tab, add the desired output columns (in this example, 4 columns), and add a Flag column of type DT_BOOL.
Inside the Script Component, write the following script to check whether the number of columns is 4: if so, set Flag to true (a valid row), else set Flag to false (a bad row):
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if (!Row.Column0_IsNull && !String.IsNullOrWhiteSpace(Row.Column0))
        {
            // split on the "," sequences that separate qualified fields
            string[] cells = Row.Column0.Split(new string[] { "\",\"" }, StringSplitOptions.None);
            if (cells.Length == 4)
            {
                // strip the leading/trailing qualifier from the outer fields
                Row.Col1 = cells[0].TrimStart('\"');
                Row.Col2 = cells[1];
                Row.Col3 = cells[2];
                Row.Col4 = cells[3].TrimEnd('\"');
                Row.Flag = true;
            }
            else
            {
                // wrong column count: an embedded quote broke the delimiters
                Row.Flag = false;
            }
        }
        else
        {
            Row.Col1_IsNull = true;
            Row.Col2_IsNull = true;
            Row.Col3_IsNull = true;
            Row.Col4_IsNull = true;
            Row.Flag = true;
        }
    }
}
Add a Conditional Split to separate rows based on the Flag column.
Map the valid-rows output to the OLE DB destination, and the bad-rows output to another flat file where you only map Column0.
Hi, good day. I'm very new to PowerBuilder and I'm using PB 11.5.
Does anyone know how to import a comma-delimited text file into a DataWindow?
Example text file:
"1234","20141011","Juan, Delacruz","Usa","001992345456"...
"12345","20141011","Arc, Ino","Newyork","005765753256"...
How can I import the third column, which is the full name, and the last column, which is the account number? I want to transfer the name and account number into my external DataWindow. I've tried ImportString(), but all the rows end up in one column only. I have three fields in my external DataWindow, including the Name and Account Number.
Here's the code:
ls_File = dw_2.Object.file_name[1]
li_FileHandle = FileOpen(ls_File)
li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
    li_Count ++
    li_FileRead = FileRead(li_FileHandle, ls_Text)
    ll_row = dw_1.ImportString(ls_Text, 1)
LOOP
Please help me with the code! Thank you.
It seems that PB expects a tab-separated file by default (while the 'c' in 'csv' stands for 'comma'...).
Add the csv! enumerated value to the arguments of ImportString() and it should fix the problem (it does in my test box).
Also, the columns defined in your DataObject must match the columns in the csv file (at least for the first columns you are interested in). If there are more columns in the csv file, they will be ignored. But if you want to get the 1st (or 2nd) and 3rd columns, you need to define the first 3 columns. You can always hide #1 or #2 if you do not need them.
BTW, your code has some issues:
you should always test the return values of function calls like FileOpen(), so you can stop processing in case of a non-existent / non-readable file
you are reading the first line of the text file twice: once before the loop and once again inside it. Or maybe that is intended, to skip a first line with column headers?
FWIW, here is a working version based on your code:
string ls_file = "c:\dev\powerbuilder\experiment\data.csv"
string ls_text
int li_FileHandle, li_fileread, li_count
long ll_row

li_FileHandle = FileOpen(ls_File)
if li_FileHandle < 1 then
    return
end if

li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
    li_Count ++
    ll_row = dw_1.ImportString(csv!, ls_Text, 1)
    li_FileRead = FileRead(li_FileHandle, ls_Text) // read next line
LOOP
fileclose(li_fileHandle)
Alternatively, use the datawindow_name.ImportFile(CSV!, file_path) method.
Within JMeter, I am running a script which uses a .CSV file to enter data as well as verify results. It is working correctly, but I cannot figure out how to tell which row/line of the .CSV caused the individual failures. Is there a way to do this?
Somewhat of an example scenario (not specific to what I'm doing, but similar):
Each row of the .CSV file contains a mathematical equation as well as the expected result.
On page 1, enter the equation (2+2)
Then on Page 2, you get the response: 3.
That test would obviously be a failure.
Say there are 1,000 tests being run, some that pass and some that do not. How can I tell which .CSV row/line didn't pass?
Do you have any columns in your CSV file which help you to uniquely identify a row?
Let me assume you have a column called 'TestCaseNo' which has values TC001, TC002, TC003... etc.
Add a BeanShell PostProcessor to store the result for each iteration.
Add the code below. I assume you have the PASS or FAIL result stored in the 'Result' variable.
import java.io.FileOutputStream;
import java.io.PrintStream;

f = new FileOutputStream("somepath/tcstatus.csv", true); // true = append
p = new PrintStream(f);
p.println(vars.get("TestCaseNo") + "," + vars.get("Result"));
p.close();
f.close();
The above code creates a CSV file with the results for each testcase.
EDIT:
Do the assertion yourself in the BeanShell PostProcessor.
import java.io.FileOutputStream;
import java.io.PrintStream;

Result = "FAIL";
Response = prev.getResponseDataAsString();
if (Response.contains("value")) { // replace "value" with the expected text
    Result = "PASS";
}
f = new FileOutputStream("somepath/tcstatus.csv", true); // true = append
p = new PrintStream(f);
p.println(vars.get("TestCaseNo") + "," + Result);
p.close();
f.close();
I would use the following approach:
__CSVRead() function - to get data from the .csv file.
__counter() function - to track the CSV file position. You can include the counter variable name in the Sampler's label so the current .csv line number is reported with each result; see the sketch below.
For more information on the aforementioned and other useful JMeter functions, see the How to Use JMeter Functions posts series.
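For illustration, a hedged sketch of how those two functions can be combined (the file path and the lineNo variable name are examples, not from the original):

In the Sampler's parameters:
${__CSVRead(/path/to/data.csv,0)} - reads column 0 of the current line
${__CSVRead(/path/to/data.csv,next)} - advances to the next line
In the Sampler's label:
My Sampler - line ${__counter(FALSE,lineNo)}

With FALSE the counter is shared across all threads, so the label of each result carries the line number that produced it.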