Lua - Match pattern for CSV import to array, that factors in empty values (two commas next to each other) - csv

I have been using the following Lua code for a while to do simply csv to array conversions, but everything previously had a value in every column, but this time on a csv formatted bank statement there are empty values, which this does not handle.
Here’s an example csv, with debit and credits.
Transaction Date,Transaction Type,Sort Code,Account Number,Transaction Description,Debit Amount,Credit Amount,Balance
05/04/2022,DD,'11-70-79,6033606,Refund,,10.00,159.57
05/04/2022,DEB,'11-70-79,6033606,Henry Ltd,30.00,,149.57
05/04/2022,SO,'11-70-79,6033606,NEIL PARKS,20.00,,179.57
01/04/2022,FPO,'11-70-79,6033606,MORTON GREEN,336.00,,199.57
01/04/2022,DD,'11-70-79,6033606,WORK SALARY,,100.00,435.57
01/04/2022,DD,'11-70-79,6033606,MERE BC,183.63,,535.57
01/04/2022,DD,'11-70-79,6033606,ABC LIFE,54.39,,719.20
I’ve tried different patterns (https://www.lua.org/pil/20.2.html), but none seem to work, I’m beginning to think I can’t fix this via the pattern as it’ll break how it works for the rest? I appreciate it if anyone can share how they would approach this…
local csvfilename = "/mnt/nas/Fireflyiii.csv"
local MATCH_PATTERN = "[^,]+"
local function create_array_from_file(csvfilename)
local file = assert(io.open(csvfilename, "r"))
local arr = {}
for line in file:lines() do
local row = {}
for match in string.gmatch(line, MATCH_PATTERN) do
table.insert(row, match)
end
table.insert(arr, row)
end
return arr
end

Related

Save decoded JSON values in Lua Variables

Following script describes the decoding of a JSON Object, that is received via MQTT. In this case, we shall take following JSON Object as an example:
{"00-06-77-2f-37-94":{"publish_topic":"/stations/test","sample_rate":5000}}
After being received and decoded in the handleOnReceive function, the local function saveTable is called up with the decoded object which looks like:
["00-06-77-2f-37-94"] = {
publish_topic = "/stations/test",
sample_rate = 5000
}
The goal of the saveTable function is to go through the table above and assign the values "/stations/test" and 5000 respectively to the variables pubtop and rate. When I however print each of both variables, nil is returned in both cases.
How can I extract the values of this table and save them in mentioned variables?
If i can only save the values "publish_topic = "/stations/test"" and "sample_rate = 5000" at first, would I need to parse these to get the values above and save them, or is there another way?
local pubtop
local rate
local function saveTable(t)
local conversionTable = {}
for k,v in pairs(t) do
if type(v) == "table" then
conversionTable [k] = string.format("%q: {", k)
printTable(v)
print("}")
else
print(string.format("%q:", k) .. v .. ",")
end
end
pubtop = conversionTable[0]
rate = conversionTable[1]
end
local lua_value
local function handleOnReceive(topic, data, _, _)
print("handleOnReceive: topic '" .. topic .. "' message '" .. data .. "'")
print(data)
lua_value = JSON:decode(data)
saveTable(lua_value)
print(pubtop)
print(rate)
end
client:register('OnReceive', handleOnReceive)
previous question to thread: Decode and Parse JSON to Lua
The function I gave you was to recursively print table contents. It was not ment to be modified to get some specific values.
Your modifications do not make any sense. Why would you store that string in conversionTable[k]? You obviously have no idea what you're doing here. No offense but you should learn some basics befor you continue.
I gave you that function so you can print whatever is the result of your json decode.
If you know you get what you expect there is no point in recursively iterating through that table.
Just do it like that
for k,v in pairs(lua_value) do
print(k)
print(v.publish_topic)
print(v.sample_rate)
end
Now read the Lua reference manual and do some beginners tutorials please.
You're wasting a lot of time and resources if you're trying to implement things like that if you do not know how to access the elements of a table. This is like the most basic and important operation in Lua.

Need help rounding Mysql results that are returned from a python function

I am relatively new to python and I am working on creating a program in my fun time to automatically generate a sales sheet. It has several functions that pull the necessary data from a database, and reportlab and a few other tools to place the results onto the generated pdf. I am trying to round the results coming from the Mysql server. However, I have hit a point where I am stuck and all the ways I have tried to round the results throw an error code and do not work. I need a few examples to look at so I can see how this would work and any relevant feedback that would help me learn.
I have tried to use the mysql round function to round the results but that failed. I have also tried to round the results as part of the function that generates the unit cost itself. However, that has failed as well.
A large amount of the code has been deleted due to the security hole it would generate. Code provided is to show what I have done so far. Print result line is to verify that the code is working during development. It is not throwing any erroneous results and will be removed during the last stage of the project.
def upcpsfunc(self, upc):
mycursor = self.mydb.cursor()
command = "Select Packsize from name"" where UPC = %(Upc)s"
mycursor.execute(command, {'Upc': upc})
result = mycursor.fetchone()
print(result[0])
return result[0]
def unitcost(self,upc):
#function to generate unit cost
mycursor = self.mydb.cursor()
command = "Select Concat((Cost - Allow)/Packsize) as total from name
where UPC = %(Upc)s"
mycursor.execute(command, {'Upc': upc})
result = mycursor.fetchone()
print (result[0])
return result[0]
As for the expected results, I would prefer the mysql command round the results before it sends it to Reportlab for placement. So far the results are 4 or 5 digits, which is not ideal. I want the results to have two decimal places, since it would be money. The desired output is 7.50 instead 7.5025
The round function can be used to round numbers:
>>> round(7.5025, 2)
7.5
To get the extra 0 on the end, you can use the following code:
>>> def round_money(n):
s = str(round(n, 2))
if len(s) == 1: # exact dollar
return s + ".00"
elif len(s) == 3: # exact x10 cents
return s + "0"
return s
>>> round_money(6)
'6.00'
>>> round_money(7.5025)
'7.50'
Note that this function returns a string, because 7.50 cannot be represented by an integer in python.
Just as an alternative way to the one already provided, you can do the same thing with string formatting (it'll truncate the decimals though, so you can still round beforehand):
>>> '{:,.2f}'.format(0)
'0.00'
>>> '{:,.2f}'.format(15342.62412)
'15,342.62'

Unable to Extract simple Csv file using U-SQL

I have this csv file,
Almost all the records are getting processed fine, however there are two cases in which i am experiencing an issue.
Case 1:
A record containing quotes within quotes:
"some data "some data" some data"
Case 2:
A record containing comma within quotes:
"some data, some data some data"
i have looked into this issue, and got my way around looking into quoting parameter of the extractor, but i have observed that setting (quoting:false) solves case 1 and fails for case 2 and setting (quoting:true) solves case 2 but fails for case 1.
constraints: There is no room for changing the data file, the future data will be tailored accordingly but for this existing data i have to resolve this.
Try this, import records as one row and fix the row text using double quotes (do the same for the commas):
DECLARE #input string = #"/Samples/Data/Sample1.csv";
DECLARE #output string = #"/Output/Sample1.txt";
// Import records as one row
#data =
EXTRACT rowastext string
FROM #input
USING Extractors.Text('\n', quoting: false );
// Fix the row text using double quotes
#query =
SELECT Regex.Replace(rowastext, "([^,])\"([^,])", "$1\"\"$2") AS rowascsv
FROM #data;
OUTPUT #query
TO #output
USING Outputters.Csv(quoting : false);

EOF Error During Dict Slice

I am trying to compile monthly data in to an existing JSON file that I loaded via import json. Initially, my json data just had one property which is 'name':
json_data['features'][1]['properties']
>>{'name':'John'}
But the end result with the monthly data I want is like this:
json_data['features'][1]['properties']
>>{'name':'John',
'2016-01': {'x1':0, 'x2':0, 'x3':1, 'x4':0},
'2016-02': {'x1':1, 'x2':0, 'x3':1, 'x4':0}, ... }
My monthly data are on separate tsv files. They have this format:
John 0 0 1 0
Jane 1 1 1 0
so I loaded them via import csv and parsed through a list of urls and set about placing them in a collective dictionary like so:
file_strings = ['2016-01.tsv', '2016-02.tsv', ... ]
collective_dict = {}
for i in strings:
with open(i) as f:
tsv_object = csv.reader(f, delimiter='\t')
collective_dict[i[:-4]] = rows[0]:rows[1:5] for rows in tsv_object
I checked how things turned out by slicing collective_dict like so:
collective_dict['2016-01']['John'][0]
>>'0'
Which is correct; it just needs to be cast into an integer.
For my next feat, I attempted to assign all of the monthly data to the respective json members as part of their external properties:
for i in file_strings:
for j in range(len(json_data['features'])):
json_data['features'][j]['properties'][i[:-4]] = {}
json_data['features'][j]['properties'][i[:-4]]['x1'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][0])
json_data['features'][j]['properties'][i[:-4]]['x2'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][1])
json_data['features'][j]['properties'][i[:-4]]['x3'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][2])
json_data['features'][j]['properties'][i[:-4]]['x4'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][3])
Here I got an arrow pointing at the last few characters:
Syntax Error: unexpected EOF while parsing
It is a pretty complicated slice, I suppose user error is not to be ruled out. However, I did double and triple check things. I also looked up this error. It seems to come up with input() related calls. I'm left a bit confused, I don't see how I made a mistake (although I'm already mentally prepared to accept that).
My only guess was that something somewhere was not a string. When I checked collective_dict and json_data, everything that was supposed to be a string was a string ('John', 'Jane' et all). So, I guess it's something else.
I made the problem as simple as I could while keeping the original structure of the data and for loops and so forth. I'm using Python 3.6.
Question
Why am I getting the EOF error? How can I build my external properties data without encountering such an error?
Here I have rewritten your last code block to:
for i in file_strings:
file_name = i[:-4]
for j in range(len(json_data['features'])):
name = json_data['features'][j]['properties']['name']
file_dict = json_data['features'][j]['properties'][file_name] = {}
for x in range(4):
x_string = 'x{}'.format(x+1)
file_dict[x_string] = int(collective_dict[file_name][name][x])
from:
for i in file_strings:
for j in range(len(json_data['features'])):
json_data['features'][j]['properties'][i[:-4]] = {}
json_data['features'][j]['properties'][i[:-4]]['x1'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][0])
json_data['features'][j]['properties'][i[:-4]]['x2'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][1])
json_data['features'][j]['properties'][i[:-4]]['x3'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][2])
json_data['features'][j]['properties'][i[:-4]]['x4'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][3])
That is just to make it a bit more readable, but that shouldn't change anything.
A thing I noticed in your other part of code is the following:
collective_dict[i[:-4]] = rows[0]:rows[1:5] for rows in tsv_object
The thing I refer to is the = rows[0]:rows[1:5] for rows in tsv_object part. In my IDE, that does not work, and I'm not sure if that is a typo in your question or of that is actually in your code, but I imagine you want it to actually be
collective_dict[i[:-4]] = {rows[0]:rows[1:5] for rows in tsv_object}
or something like that. I'm not sure if that could confuse the parser think that there is an error at the end of the file.
The ValueError: Invalid literal for int()
If your tsv-data is
John 0 0 1 0
Jane 1 1 1 0
Then it should be no problem to do int() of the string value. E.g.: int('42') will become an int with value 42. However, if you have an error in one, or several, lines of your files, then use something like this block of code to figure out which file and line it is:
file_strings = ['2016-01.tsv', '2016-02.tsv', ... ]
collective_dict = {}
for file_name in file_strings:
print('Reading {}'.format(file_name))
with open(file_name) as f:
tsv_object = csv.reader(f, delimiter='\t')
for line_no, (name, *x_values) in enumerate(tsv_object):
if len(x_values) != 4:
print('On line {}, there is only {} values!'.format(line_no, len(x_values)))
try:
intx = [int(x) for x in x_values]
except ValueError as e:
# Catch "Invalid literal for int()"
print('Line {}: {}'.format(line_no, e))

How to import comma delimited text file into datawindow (powerbuilder 11.5)

Hi good day I'm very new to powerbuilder and I'm using PB 11.5
Can someone know how to import comma delimited text file into datawindow.
Example Text file
"1234","20141011","Juan, Delacruz","Usa","001992345456"...
"12345","20141011","Arc, Ino","Newyork","005765753256"...
How can I import the third column which is the full name and the last column which is the account number. I want to transfer the name and account number into my external data window. I've tried to use the ImportString(all the rows are being transferred in one column only). I have three fields in my external data window.the Name and Account number.
Here's the code
ls_File = dw_2.Object.file_name[1]
li_FileHandle = FileOpen(ls_File)
li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
li_Count ++
li_FileRead = FileRead(li_FileHandle, ls_Text)
ll_row = dw_1.ImportString(ls_Text,1)
Loop.
Please help me with the code! Thank You
It seems that PB expects by default a tab-separated csv file (while the 'c' from 'csv' stands for 'coma'...).
Add the csv! enumerated value in the arguments of ImportString() and it should fix the point (it does in my test box).
Also, the columns defined in your dataobject must match the columns in the csv file (at least for the the first columns your are interested in). If there are mode columns in the csv file, they will be ignored. But if you want to get the 1st (or 2nd) and 3rd columns, you need to define the first 3 columns. You can always hide the #1 or #2 if you do not need it.
BTW, your code has some issues :
you should always test the return values of function calls like FileOpen() for stopping processing in case of non-existent / non-readable file
You are reading the text file twice for the first row: once before the while and another inside of the loop. Or maybe it is intended to ignore a first line with column headers ?
FWIF, here is a working code based on yours:
string ls_file = "c:\dev\powerbuilder\experiment\data.csv"
string ls_text
int li_FileHandle, li_fileread, li_count
long ll_row
li_FileHandle = FileOpen(ls_File)
if li_FileHandle < 1 then
return
end if
li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
li_Count ++
ll_row = dw_1.ImportString(csv!,ls_Text,1)
li_FileRead = FileRead(li_FileHandle, ls_Text)//read next line
Loop
fileclose(li_fileHandle)
use datawindow_name.importfile(CSV!,file_path) method.