Retrieve data by ignoring null values and header row from CSV file

Working on a Groovy script in SoapUI 5.3.0 and facing the below issue while extracting the values from a file into a list.
The purpose of the code below is that the retrieved list has to be compared with another list containing valid values only.
Attaching the code snippet and the sample CSV file for reference.
Code to retrieve the values:
def DBvalue = context["csvfile"]    //CSV file containing the data
def count = context["dbrowcount"]   //here the row count is 23
for (i = 0; i < count; i++) {
    def lines = DBvalue.text.split('\n')
    List<String[]> rows = lines.collect { it.split(';') }
    log.info "list is " + rows
}
The sample CSV file I am working on contains 600 columns of data and 23 rows:
abc;null;1;2;3;5;8;null
cdf;null;2;3;6;null;5;6
hgf;null;null;null;jr;null;II
Currently my code is fetching the output below:
[[abc,null,1,2,3,5,8,null]]
[[abc,null,1,2,3,5,8,null]]
[[abc,null,1,2,3,5,8,null]]
Desired output:
[1,2,3,5,8]
[2,3,6,5,6]
[jr,II]

You should be able to achieve it with the code below; follow the in-line comments.
//Provide your file path; change if needed
def file = new File('/tmp/test.csv')
//To hold all the rows
def list = []
//Change delimiter if needed
def delimiter = ';'
file.readLines().eachWithIndex { line, index ->
    //Skip the header row (index 0 is falsy)
    if (index) {
        //Get the row data by split, filtering out 'null' and empty values
        def lineData = line.split(delimiter).findAll { 'null' != it && it }
        log.info lineData
        list << lineData
    }
}
//Print all the row data
log.info list
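If, as in the desired output above, the leading label column (abc, cdf, hgf) should be excluded as well, dropping the first field before the filter is one option; a minimal sketch:
//Variation: also drop the first (label) column to match the desired output
def line = 'abc;null;1;2;3;5;8;null'
def lineData = line.split(';').drop(1).findAll { 'null' != it && it }
assert lineData == ['1', '2', '3', '5', '8']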

Related

How to check for specific field values based on some condition while converting a CSV file to JSON format

Below is the code to convert a CSV file to JSON format in Python.
I have two fields, 'recommendation' and 'rating'. Based on the recommendation value I need to set the value of the rating field: if recommendation is 1 then rating = 1, and vice versa. With the answer I got, I'm getting output for only one record entry instead of all the records; I think it's overriding. Do I need to create a separate list and append each record entry to it to get the output for all records?
Here's the updated code:
def main(input_file):
    csv_rows = []
    with open(input_file, 'r') as csvfile:
        reader = csv.DictReader(csvfile, delimiter='|')
        title = reader.fieldnames
        for row in reader:
            entry = OrderedDict()
            for field in title:
                entry[field] = row[field]
            [c.update({'RATING': c['RECOMMENDATIONS']}) for c in reader]
            csv_rows.append(entry)
    with open(json_file, 'w') as f:
        json.dump(csv_rows, f, sort_keys=True, indent=4, ensure_ascii=False)
        f.write('\n')
I want to create a nested format like the one below:
"rating": {
    "user_rating": {
        "rating": 1
    },
    "recommended": {
        "rating": 1
    }
}
After you've read the file in using csv.DictReader, you'll have a list of dicts. Since you want to set the values now, it's a simple dict manipulation. There are several ways, one of which is:
[c.update({'rating': c['recommendation']}) for c in read_csvDictReader]
Hope that helps.

Groovy CSV to string

I am using Dell Boomi to map data from one system to another. I can use groovy in the maps but have no experience with it. I tried to do this with the other Boomi tools, but have been told that I'll need to use groovy in a script. My inbound data is:
132265,Brown
132265,Gold
132265,Gray
132265,Green
I would like to output:
132265,"Brown,Gold,Gray,Green"
Hopefully this makes sense! Any ideas on the groovy code to make this work?
It can be elegantly solved with groupBy and the spread operator:
@Grapes(
    @Grab(group='org.apache.commons', module='commons-csv', version='1.2')
)
import org.apache.commons.csv.*
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def parsed = CSVParser.parse(csv, CSVFormat.DEFAULT.withHeader('code', 'color'))
parsed.records.groupBy({ it.code }).each { k,v -> println "$k,\"${v*.color.join(',')}\"" }
The above prints:
132265,"Brown,Gold,Gray,Green"
Well, I don't know how you are getting your data, but here is a general way to achieve your goal. You can use a library, such as the one below, to parse the CSV.
https://github.com/xlson/groovycsv
The example for your data would be:
@Grab('com.xlson.groovycsv:groovycsv:1.1')
import static com.xlson.groovycsv.CsvParser.parseCsv
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def data = parseCsv(csv.trim(), readFirstLine: true)
I believe you want to associate the number with its various colors. Since the data has no header row, readFirstLine: true treats the first line as data, and fields can be accessed by position (line[0], line[1]). For each line, add the colour to a map entry keyed by the number:
map = [:]
for (line in data) {
    number = line[0]
    colour = line[1]
    if (!map[number])
        map[number] = []
    map[number].add(colour)
}
println map
So map should contain:
[132265:["Brown","Gold","Gray","Green"]]
Well, if that is not what you want, you can extract the general idea from it.
Assuming your data is coming in as a comma separated string of data like this:
"132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
The following Groovy script code should do the trick.
def csvString = "132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
LinkedHashMap.metaClass.multiPut << { key, value ->
    delegate[key] = delegate[key] ?: []
    delegate[key] += value
}
def map = [:]
def csv = csvString.split().collect{ entry -> entry.split(",") }
csv.each{ entry -> map.multiPut(entry[0], entry[1]) }
def result = map.collect{ k, v -> k + ',"' + v.join(",") + '"'}.join("\n")
println result
Would print:
132265,"Brown,Gold,Gray,Green"
122222,"Red,White"
Do you HAVE to use scripting for some reason? This can be easily accomplished with out-of-the-box Boomi functionality.
Create a map function that prepends the ID field to a string of your choice (e.g. 222_concat_fields), then use that value to set a dynamic process property.
The value of the process property will contain the result of concatenating the name fields. Simply adding this function to your map should take care of it; then use the final value to populate your result.
Well, it depends on how the data is coming in.
If the data you posted in the question arrives in a single document, then you can easily handle this in a map with Groovy scripting.
If the data arrives as multiple documents, i.e.
doc1: 132265,Brown
doc2: 132265,Gold
doc3: 132265,Gray
doc4: 132265,Green
then it cannot be handled in a map; you will need to use a Data Process step with custom scripting.
The Groovy code you are asking for depends on the input profile in which you are receiving the data. Please provide more information, i.e. input profile, fields, etc.
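As a rough starting point, here is a minimal, untested sketch of such a Data Process custom script, assuming Boomi's standard dataContext scripting API and one code,colour CSV line per input document (all names are illustrative):
import java.util.Properties
import java.io.InputStream

def map = [:]
// Read every incoming document and group colours by code
for (int i = 0; i < dataContext.getDataCount(); i++) {
    InputStream is = dataContext.getStream(i)
    is.text.trim().split('\n').each { line ->
        def parts = line.tokenize(',')        // [code, colour]
        map.get(parts[0], []) << parts[1]     // get-with-default inserts an empty list on first sight
    }
}
// Emit one combined document: code,"colour1,colour2,..."
def out = map.collect { k, v -> "${k},\"${v.join(',')}\"" }.join('\n')
dataContext.storeStream(new ByteArrayInputStream(out.getBytes('UTF-8')), new Properties())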

Undefined variable error, though it's defined in function

def append_customer(file_variable, fields):
"""
-------------------------------------------------------
Appends a customer record to a sequential file.
-------------------------------------------------------
Preconditions:
file_variable - the file to search (file)
fields - data to append to the file (list)
Postconditions:
The data in fields are appended to file_variable.
-------------------------------------------------------
"""
if os.path.exists(file_variable) == True:
file = open(file_variable, "w", encoding="utf-8")
file.seek(0)
print("{0}".format(fields), file=file_variable)
file.close()
return
This is a function I wrote to append data to a file; the only error is that apparently "file_variable is undefined", even though it's clearly defined by the function. I have no clue as to why this error is occurring.
This seems closer to what you want:
import os

def append_customer(file_variable, fields):
    """
    -------------------------------------------------------
    Appends a customer record to a sequential file.
    -------------------------------------------------------
    Preconditions:
    file_variable - the file to search (file)
    fields - data to append to the file (list)
    Postconditions:
    The data in fields are appended to file_variable.
    -------------------------------------------------------
    """
    if os.path.exists(file_variable):
        with open(file_variable, "a", encoding="utf-8") as fobj:
            fobj.write("{0}\n".format(fields))
You need to indent the function body.
If you want to open an existing file for appending, you need to use 'a', which means "open for writing, appending to the end of the file if it exists". The with statement opens the file and closes it when leaving the indented block.

I'm trying to read 3 or more columns from a CSV file, and it's giving me an index error. Anyone have an idea?

//Create a new filereader object, using the context variable so it can be used between test components
context.fileReader = new BufferedReader(new FileReader('C:/data.csv'))
//Read in the first line of the data file
//This is the code for the test case
firstLine = context.fileReader.readLine()
//Split the first line into a string array and assign the array elements to various test case properties
String[] propData = firstLine.split(",")
testCase.setPropertyValue("data1",propData[0])
testCase.setPropertyValue("data2",propData[1])
testCase.setPropertyValue("data3",propData[2])
//Rename request test steps for readability in the log; append the element name to the test step names
testCase.getTestStepAt(0).setName("data1-" + propData[0])
testCase.getTestStepAt(1).setName("data2-" + propData[1])
testCase.getTestStepAt(2).setName("data3-" + propData[2])
//This is the code that reads from the CSV file
context.fileReader = new BufferedReader(new FileReader('C:/data.csv'))
/*Read in the next line of the file
We can use the same fileReader created in the Setup script because it
was assigned to the context variable.*/
nextLine = context.fileReader.readLine()
/*If the end of the file hasn't been reached (nextLine does NOT equal null)
split the line and assign new property values, rename test request steps,
and go back to the first test request step*/
if (nextLine != null) {
    String[] propData = nextLine.split(",")
    curTC = testRunner.testCase
    curTC.setPropertyValue("data1", propData[0])
    curTC.setPropertyValue("data2", propData[1])
    curTC.setPropertyValue("data3", propData[2])
    curTC.getTestStepAt(0).setName("data1-" + propData[0])
    curTC.getTestStepAt(1).setName("data2-" + propData[1])
    curTC.getTestStepAt(2).setName("data3-" + propData[2])
    testRunner.gotoStep(0)
}
This is the error I'm getting. Does anyone have any idea? I'm trying to read more than 3 columns from the CSV file; please help.
TestCase failed [java.lang.IndexOutOfBoundsException: Index: 2, Size: 2:java.lang.IndexOutOfBoundsException: Index: 2, Size: 2], time taken = 0
Here is CSV file data:
Hydrogen,1,H,1.00797,20.4
Carbon,6,C,12.0115,5100
Oxygen,8,O,15.9994,90.2
Gold,79,Gd,196.967,3239
Uranium,92,U,238.03,4091
Use OpenCSV instead for parsing CSV files with Java or Groovy. You can add the jar to the Groovy classpath (and dynamically resolve its dependencies) by using Grapes like this:
@Grab(group='com.opencsv', module='opencsv', version='3.3')
You're in luck. It's no problem that you're new to SoapUI, because OpenCSV doesn't have anything to do with SoapUI :)
How to read a CSV file using OpenCSV and Groovy
@Grab('com.opencsv:opencsv:3.5')
import com.opencsv.CSVReader
/*
* Mock some CSV data
*/
def reader = new StringReader(
'''column1,column2,column3,column4,column5
Hydrogen,1,H,1.00797,20.4
Carbon,6,C,12.0115,5100
Oxygen,8,O,15.9994,90.2
Gold,79,Gd,196.967,3239
Uranium,92,U,238.03,4091''')
/*
* A nice mapping to give each field in the CSV file a name.
* Much better than a bunch of propData[n] all over the place.
*/
def field = [
ELEMENT: 0
]
reader.withReader {
    new CSVReader(it).eachWithIndex { list, index ->
        if (index == 0) {
            /*
             * Do whatever you need to do with the header of the CSV file.
             * Example:
             * testCase.setPropertyValue("data1", list[field.ELEMENT])
             */
        } else {
            /*
             * Do whatever you need to do with the remaining rows.
             * Example:
             * curTC.setPropertyValue("data1", list[field.ELEMENT])
             */
        }
    }
}
Header and Data
You'll notice that in the eachWithIndex() loop there's an if-else. This makes it possible to process the header and then proceed with the remaining rows without having to restart reading the file.
You should be able to plug your SoapUI-specific code into the appropriate section.
Varying number of fields
If for some reason your data rows don't all have the same number of fields, you can check how many fields there are like this: list.size()
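For example, a minimal defensive sketch (the field map mirrors the example above; assuming SoapUI's log variable) that skips rows too short to hold the wanted field:
def field = [ELEMENT: 0, LAST: 4]   // illustrative field positions
def row = ['Hydrogen', '1', 'H', '1.00797'] as String[]   // a short row with only 4 fields
if (row.size() > field.LAST) {
    log.info row[field.LAST]
} else {
    log.info "Row has only ${row.size()} fields; skipping field ${field.LAST}"
}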

Use JSON Input step to process uneven data

I'm trying to process the following with a JSON Input step:
{"address":[
{"AddressId":"1_1","Street":"A Street"},
{"AddressId":"1_101","Street":"Another Street"},
{"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
{"AddressId":"1_102","Locality":"New York"}
]}
However, this seems not to be possible:
Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) :
The data structure is not the same inside the resource!
We found 1 values for json path [$..Locality], which is different that the number retourned for path [$..Street] (3509 values).
We MUST have the same number of values for all paths.
The step provides an Ignore Missing Path flag, but it only works if all the rows miss the same path. In that case the step acts as expected and fills the missing values with null.
This limits the power of this step for reading uneven data, which was really one of my priorities.
Am I missing something? Is this the correct behavior?
What I have done is use JSON Input with $.address[*] to read the full map of each element into a jsonRow field, e.g.:
{"address":[
{"AddressId":"1_1","Street":"A Street"},
{"AddressId":"1_101","Street":"Another Street"},
{"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
{"AddressId":"1_102","Locality":"New York"}
]}
This results in 4 jsonRows, one for each element, e.g. jsonRow = {"AddressId":"1_101","Street":"Another Street"}. Then, using a JavaScript step, I map my values with this:
var AddressId = getFromMap('AddressId', jsonRow);
var Street = getFromMap('Street', jsonRow);
var Locality = getFromMap('Locality', jsonRow);
In a second script tab I inserted minified JSON parse code from https://github.com/douglascrockford/JSON-js and the getFromMap function:
function getFromMap(key, jsonRow) {
    try {
        var map = JSON.parse(jsonRow);
    }
    catch (e) {
        var message = "Unparsable JSON: " + jsonRow + " Desc: " + e.message;
        var nr_errors = 1;
        var field = "jsonRow";
        var errcode = "JSON_PARSE";
        _step_.putError(getInputRowMeta(), row, nr_errors, message, field, errcode);
        trans_Status = SKIP_TRANSFORMATION;
        return null;
    }
    if (map[key] == undefined) {
        return null;
    }
    trans_Status = CONTINUE_TRANSFORMATION;
    return map[key];
}
You can solve this by changing the JSONPath and splitting the work across two JSON Input steps. The following website explains a lot about JSONPath: http://goessner.net/articles/JsonPath/
$..AddressId
does in fact return all the AddressIds in the address array. BUT since Pentaho uses grid rows for input and output [4 rows x 3 columns], it can't handle a missing (null) value when you ask it to return all the Streets (3 values) and all the Localities (2 values): there are no null placeholders in the array itself, just as you can't drive out of your garage with 3 wheels on your car instead of the usual 4.
I guess your script returns null values (marked X below where a value is missing) like:
A S X
A S X
A S L
A X L
The scripting step can be avoided by changing the Fields path of the first JSON Input step to:
$.address[*]
This retrieves all 4 address lines. Then create a second JSON Input step based on the new source field, which contains the address line(s), to retrieve the address details per line:
$.AddressId
$.Street
$.Locality
This yields null values on the four address lines whenever the address details are not available in a line.
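On the four sample address lines above, the second step would then produce rows like this (null marking the missing details):
1_1,   A Street,        null
1_101, Another Street,  null
1_102, One more street, Buenos Aires
1_102, null,            New York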