I have a CSV file with 3 columns and no header row, but it follows a fixed pattern: the first column contains a URL, and the second and third columns contain checksums. To process the individual column values, I am using CsvBeanReader.
CsvBeanReader only starts reading values from the 2nd line with the code below:
ICsvBeanReader beanReader = new CsvBeanReader(new FileReader(path),
        CsvPreference.EXCEL_NORTH_EUROPE_PREFERENCE);
String[] header = beanReader.getHeader(true);
header = new String[] { "docURL", "shaCheckSum", null };
CellProcessor[] processors = getChecksumProcessors();
List<ValueObj> docRecordList = new ArrayList<>();
ValueObj docRecord;
while ((docRecord = beanReader.read(ValueObj.class, header, processors)) != null) {
    docRecordList.add(docRecord);
}

private static CellProcessor[] getChecksumProcessors() {
    return new CellProcessor[] { new NotNull(), new NotNull(), null };
}
How should I read the first line of the CSV file, which already contains data, using CsvBeanReader?
The CSV file contains data from the first line, like below:
ftp://folder_struc/filename.pdf;checksum1;checksum2
Please let me know.
You should omit the String[] header = beanReader.getHeader(true); line: getHeader(true) reads and consumes the first line of the file as a header, which is why your data only starts at line 2. Just declare the mapping directly: String[] header = new String[] { "docURL", "shaCheckSum", null };
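If you want to see the same first-line-inclusive behaviour without the library, here is a minimal plain-Java sketch. It assumes the semicolon delimiter implied by CsvPreference.EXCEL_NORTH_EUROPE_PREFERENCE, and DocRecord is a hypothetical stand-in for the ValueObj bean in the question:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-in for the ValueObj bean in the question.
class DocRecord {
    final String docURL;
    final String shaCheckSum;
    DocRecord(String docURL, String shaCheckSum) {
        this.docURL = docURL;
        this.shaCheckSum = shaCheckSum;
    }
}

class FirstLineCsv {
    // Parse semicolon-delimited lines starting from the very first line --
    // no header row is consumed, unlike getHeader(true).
    static List<DocRecord> parse(List<String> lines) {
        List<DocRecord> records = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.split(";", -1);
            // First column: URL, second: checksum; the third column is
            // ignored, mirroring the null mapping in the question.
            records.add(new DocRecord(cols[0], cols[1]));
        }
        return records;
    }

    public static void main(String[] args) {
        List<DocRecord> recs = parse(List.of(
                "ftp://folder_struc/filename.pdf;checksum1;checksum2"));
        System.out.println(recs.get(0).docURL); // the first line is data, not a header
    }
}
```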
I am new to SSIS, and I am facing the issue below while parsing a text file that contains the following sample data.
Below is the requirement:
-> Need to capture the number after IH1 (454756567) and insert it into one column as InvoiceNumber.
-> Need to insert the data between ABCD1234 and ABCD2345 into another column as TotalRecord.
Many thanks for the help.
ABCD1234
IH1 454756567 686575634
IP2 HJKY TXRT
IBG 23455GHK
ABCD2345
IH1 689343256 686575634
IP2 HJKY TXRT
IBG 23455GHK
ABCD5678
Here is a script component to process the entire file. You need to create your outputs; the fields are currently being processed as strings.
This assumes your file format is consistent. If you don't always have 2 data columns in the IH1 and IP2 rows, I would recommend a for loop from 1 to Length - 1 to process the columns, and sending the records to their own outputs.
public string recordID = String.Empty;

public override void CreateNewOutputRows()
{
    string filePath = ""; //put your filepath here
    using (System.IO.StreamReader sr = new System.IO.StreamReader(filePath))
    {
        while (!sr.EndOfStream)
        {
            string line = sr.ReadLine();
            // StartsWith avoids an exception on lines shorter than 4 characters.
            if (line.StartsWith("ABCD")) // Anything that identifies the start of a new record.
                                         // line.Split(' ').Length == 1 also meets your criteria.
            {
                recordID = line;
                Output0Buffer.AddRow();
                Output0Buffer.RecordID = line;
            }
            string[] cols = line.Split(' ');
            switch (cols[0])
            {
                case "IH1":
                    Output0Buffer.InvoiceNumber = cols[1];
                    Output0Buffer.WhatEverTheSecondColumnIs = cols[2];
                    break;
                case "IP2":
                    Output0Buffer.ThisRow = cols[1];
                    Output0Buffer.ThisRow2 = cols[2];
                    break;
                case "IBG":
                    Output0Buffer.Whatever = cols[1];
                    break;
            }
        }
    }
}
You'll need to do this in a script component.
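The grouping logic itself is independent of SSIS. Here is a sketch of the same record-boundary detection in plain Java (record starts detected by the ABCD prefix; the idea of mapping each record to its IH1 invoice number is an illustration, not the exact output-buffer layout above):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecordParser {
    // Map each record ID (the ABCD... line) to the invoice number found in
    // its IH1 line. Mirrors the switch in the script component above.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> invoiceByRecord = new LinkedHashMap<>();
        String recordID = null;
        for (String line : lines) {
            if (line.startsWith("ABCD")) { // start of a new record
                recordID = line;
                continue;
            }
            String[] cols = line.split(" ");
            if ("IH1".equals(cols[0]) && recordID != null) {
                invoiceByRecord.put(recordID, cols[1]); // the invoice number
            }
            // IP2 / IBG lines would be handled the same way.
        }
        return invoiceByRecord;
    }

    public static void main(String[] args) {
        Map<String, String> m = parse(List.of(
                "ABCD1234",
                "IH1 454756567 686575634",
                "IP2 HJKY TXRT",
                "ABCD2345",
                "IH1 689343256 686575634"));
        System.out.println(m); // {ABCD1234=454756567, ABCD2345=689343256}
    }
}
```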
I read some columns from a CSV file and then display them in a DataGridView. The column "Value" contains 3-digit integer values. I want these integer values shown in the DataGridView as doubles with one decimal place. The conversion doesn't seem to work. Also, a large CSV file (around 30k rows) loads immediately without the conversion, but with the conversion it takes far too long.
using (StreamReader str = new StreamReader(openFileDialog1.FileName)) {
    CsvReader csvReadFile = new CsvReader(str);
    dt = new DataTable();
    dt.Columns.Add("Value", typeof(double));
    dt.Columns.Add("Time Stamp", typeof(DateTime));
    while (csvReadFile.Read()) {
        var row = dt.NewRow();
        foreach (DataColumn column in dt.Columns) {
            row[column.ColumnName] = csvReadFile.GetField(column.DataType, column.ColumnName);
        }
        dt.Rows.Add(row);
        foreach (DataRow row1 in dt.Rows)
        {
            row1["Value"] = (Convert.ToDouble(row1["Value"]) / 10);
        }
    }
}
dataGridView1.DataSource = dt;
Sounds like you have two questions:
How to format the value to a single decimal place of scale.
Why does the convert section of code take so long?
Here are possibilities
See this answer, which shows one possibility. Note that for exactly one decimal place the format string is "{0:0.0}" (not "{0:0.##}", which trims trailing zeros), and a DataRow is indexed directly rather than through a Columns property:
String.Format("{0:0.0}", (double) myTable.Rows[rowIndex][columnIndex]);
You are iterating over every row of the DataTable every time you read a line. That means when you read line 10 of the CSV, you iterate over rows 1-9 of the DataTable again, and so on for each line you read! Refactor to pull that loop out of the read loop, something like this:
using (StreamReader str = new StreamReader(openFileDialog1.FileName)) {
    CsvReader csvReadFile = new CsvReader(str);
    dt = new DataTable();
    dt.Columns.Add("Value", typeof(double));
    dt.Columns.Add("Time Stamp", typeof(DateTime));
    while (csvReadFile.Read()) {
        var row = dt.NewRow();
        foreach (DataColumn column in dt.Columns) {
            row[column.ColumnName] = csvReadFile.GetField(column.DataType, column.ColumnName);
        }
        dt.Rows.Add(row);
    }
    foreach (DataRow row1 in dt.Rows)
    {
        row1["Value"] = (Convert.ToDouble(row1["Value"]) / 10);
    }
}
dataGridView1.DataSource = dt;
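On the formatting question, the divide-by-10 plus one-decimal display can be sketched on its own, outside the DataGridView. This is a plain-Java illustration of the arithmetic and format string, not the C# binding code above:

```java
import java.util.Locale;

class ValueFormat {
    // A 3-digit integer such as 123 becomes 12.3 after dividing by 10;
    // formatting with exactly one decimal place fixes the display.
    static String display(int raw) {
        double v = raw / 10.0;
        return String.format(Locale.ROOT, "%.1f", v);
    }

    public static void main(String[] args) {
        System.out.println(display(123)); // prints 12.3
        System.out.println(display(120)); // prints 12.0
    }
}
```

Note the division by 10.0, not 10: integer division would silently drop the decimal before formatting.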
I wish to take selected data from a collection of CSV files. I have written code but am confused by its behaviour: it reads all the lines, not just the first. What am I doing wrong, please?
string[] array1 = Directory.GetFiles(WorkingDirectory, "00 DEV1 2????????????????????.csv"); // excludes "repaired" files from array, and "Averaged" logs, if found; note: does not exclude duplicate files if they exist (yet)
Console.WriteLine(" Number of Files found with the filter applied = {0,6}", array1.Length);
int i = 1;
foreach (string name in array1)
{
    // sampling engine loop here; take first line only, first column DateTimeStamp and second is Voltage
    Console.Write("\r Number of File currently being processed = {0,6}", i);
    i++;
    var reader = new StreamReader(File.OpenRead(name)); // Static for testing only, to be replaced by file filter code
    reader.ReadLine();
    reader.ReadLine(); // skip headers: read and do nothing
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        var values = line.Split(',');
        using (StreamWriter outfile = new StreamWriter(@"C:\SampledFileResults.txt", true))
        {
            string content = "";
            content = content + values[0] + ",";
            content = content + values[9] + ",";
            outfile.WriteLine(content);
            Console.WriteLine(content);
        }
    }
}
Console.WriteLine("SAMPLING COMPLETED");
Console.ReadLine();
Console.WriteLine("Test ended on {0}", DateTime.Now);
Console.ReadLine();
You are using a while loop to read through all the lines of the file. If you only want a single line, you can remove this loop.
Just delete the line:
while (!reader.EndOfStream)
{
and the accompanying closing brace:
}
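With the loop gone, each file contributes exactly one sampled line. Here is a plain-Java sketch of that per-file logic (a file is modelled as a list of lines; the two header lines and columns 0 and 9 follow the question's format):

```java
import java.util.List;
import java.util.Optional;

class FirstLineSampler {
    // Given the lines of one file, skip the two header lines and sample
    // only the first data line, keeping columns 0 and 9 as in the question.
    static Optional<String> sample(List<String> lines) {
        if (lines.size() < 3) {
            return Optional.empty(); // nothing after the two headers
        }
        String[] values = lines.get(2).split(",");
        return Optional.of(values[0] + "," + values[9] + ",");
    }

    public static void main(String[] args) {
        System.out.println(sample(List.of(
                "header1", "header2",
                "a,b,c,d,e,f,g,h,i,j")).orElse("(empty file)"));
    }
}
```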
I'm trying to stop empty CSV files from causing errors in my simple sampling program, which just grabs 2 values from each .csv file in a folder.
I have a null check, which now catches the empty file, but I'm unsure how to restructure my code so it skips that file in the array and moves on to the next one. Any assistance greatly welcomed.
foreach (string name in array1)
{
    // sampling engine loop here; take first line only, first column DateTimeStamp and second is Voltage
    Console.Write("\r Number of File currently being processed = {0,6}", i);
    i++;
    var reader = new StreamReader(File.OpenRead(name)); // Static for testing only, to be replaced by file filter code
    var line = reader.ReadLine();
    if (line == null)
    {
        Console.WriteLine("Null value detected");
        Console.ReadKey();
        break;
    }
    var values = line.Split(',');
    reader.ReadLine();
    if (values.Length == 89)
    {
        using (StreamWriter outfile = new StreamWriter(@"C:\SampledFileResults.txt", true))
        {
            string content = "";
            content = content + values[0] + ",";
            content = content + values[9] + ",";
            outfile.WriteLine(content);
            Console.WriteLine(content);
        }
    }
}
Console.WriteLine("SAMPLING COMPLETED");
Console.WriteLine("SAMPLING COMPLETED");
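The key change is replacing break with continue: break abandons the whole foreach loop at the first empty file, while continue merely skips to the next file. A plain-Java sketch of that control flow (each "file" modelled as a list of lines, a hypothetical simplification of the StreamReader code above):

```java
import java.util.ArrayList;
import java.util.List;

class SkipEmptyFiles {
    // For each file (here a list of lines), skip empty ones with continue
    // and sample the first line of the rest. A break here would instead
    // abandon all remaining files, as in the question.
    static List<String> firstLines(List<List<String>> files) {
        List<String> sampled = new ArrayList<>();
        for (List<String> file : files) {
            if (file.isEmpty()) {
                continue; // skip this file, move on to the next one
            }
            sampled.add(file.get(0));
        }
        return sampled;
    }

    public static void main(String[] args) {
        System.out.println(firstLines(List.of(
                List.of("a,1"), List.of(), List.of("b,2"))));
        // the empty middle file is skipped, not fatal
    }
}
```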
I got into trouble trying to feed the jqGrid with the required information. I did everything the way it is supposed to be done, but apparently there is an issue.
Every second row is ordered differently. The first row is OK:
[{"id":"AA1","cell":["AA1","AD + DNS + WINS","dev"]},
but the next one is ordered like below:
{"id":"AA2","cell":["dev","AD + DNS + WINS","AA2"]}
The 3rd is OK, the 4th is disordered, and so on.
The code responsible for this process is below:
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows = (
from l in lst
select new
{
id = l.HostName,
cell = new List<string> {
l.HostName, l.Description, l.Type
}
}).ToArray()
};
return Json(jsonData, JsonRequestBehavior.AllowGet);
Why is it like that? I tried using a string[] instead of the List, but LINQ didn't like that and threw an error suggesting a List instead of a string array.
Is there any way to sustain the desired order?
What was your code to use string[]? I got this working without any trouble:
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows = (from l in lst
select new
{
id = l.HostName,
cell = new string[] {
l.HostName,
l.Description,
l.Type
}
}).ToArray()
};
You can find similar samples here (but remember that in general they are very old, and I would suggest looking at more up-to-date ones here or here).