How to load only 50 records at a time into a table using SSIS - ssis

I have an Excel sheet with 6471 rows of data, and I need to load only 50 records at a time. Can anyone help me out here?

I would use a Script Task to read the Excel sheet into a DataTable, then read the first 50 rows of that DataTable.
I have added some code snippets that may guide you.
Read the Excel sheet into a data table:
using (System.Data.OleDb.OleDbConnection xlConn = new System.Data.OleDb.OleDbConnection(connectionString))
{
xlConn.Open();
System.Data.OleDb.OleDbCommand xlCmd = xlConn.CreateCommand();
xlCmd.CommandText = "Select * from [" + ExcelTable + "$]";
xlCmd.CommandType = CommandType.Text;
using (System.Data.OleDb.OleDbDataReader rdr = xlCmd.ExecuteReader())
{
while (rdr.Read())
{
sRow = stagingTable.NewRow();// tAble.NewRow();
for (int i = 1; i < rdr.FieldCount; i++)
{
if (!HeadersCreated)
{
sCol = new DataColumn();
sCol.DataType = System.Type.GetType("System.String");
sCol.ColumnName = "Col" + i.ToString();
stagingTable.Columns.Add(sCol);
}
{
sRow["Col" + i.ToString()] = rdr[i].ToString();
}
}
stagingTable.Rows.Add(sRow);
HeadersCreated = true;
}
}
xlConn.Close();
}
Read the first 50 rows of the data table:
for (int i = 0; i < stagingTable.Rows.Count && i < 50; i++)
{
for (int j =0; j < stagingTable.Columns.Count; j++)
{
if (!string.IsNullOrWhiteSpace(stagingTable.Rows[0][j].ToString()))
{
sRow = tAble.NewRow();
sRow["col1"] = DepartmentName.ToString();
sRow["col2"] = stagingTable.Rows[i][0].ToString();
sRow["col3"] = stagingTable.Rows[i][j].ToString();
sRow["col4"] = stagingTable.Rows[0][j].ToString();
tAble.Rows.Add(sRow);
}
}
}
Do what you want with the first 50 rows.
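If the goal is to work through all 6471 rows in groups of 50 rather than only the first 50, the same idea extends to splitting the rows into batches. The sketch below is in JavaScript purely for illustration (the answer's own code is C#). `chunkRows` and the placeholder rows are hypothetical names, not part of SSIS or of the answer above.

```javascript
// Split an array of rows into consecutive batches of at most batchSize rows.
// chunkRows is a hypothetical helper used only for illustration.
function chunkRows(rows, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < rows.length; start += batchSize) {
    batches.push(rows.slice(start, start + batchSize));
  }
  return batches;
}

// Stand-in data: 6471 placeholder rows, matching the row count in the question.
const rows = Array.from({ length: 6471 }, (_, i) => ({ rowNumber: i + 1 }));
const batches = chunkRows(rows, 50);

console.log(batches.length);                     // 130
console.log(batches[batches.length - 1].length); // 21
```

With 6471 rows this gives 129 full batches of 50 and a final batch of 21, so the loading code has to allow a short last batch.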

I am not exactly sure why you want to reduce the rows per batch but you can define that in the destination component as follows:


Set XML .@<index> based on iterator AS3

In AS3 is it possible to export multiple iterations of the same variable as seen in the below example:
var item:String = "obj";
var child:XML = new XML(<{item}/>);
child.@x = String(object.x);
child.@y = String(object.y);
child.@n = String(object.name);
child.@w = String(object.width);
child.@h = String(object.height);
//...instead of:
child.@s = String(object.sprite);
//...is the below possible:
for (i = 0; i < <length>; ++i) {
child.@s[i] = String(object.get_sprite(i));
}
//...desired <filename>.xml output:
obj.s0 = "sprite_0"
obj.s1 = "sprite_1"
obj.s2 = "sprite_2"
obj.s3 = "sprite_3"
etc..
Haven't worked on AS3 for a while now, but if I remember correctly you should be able to do:
for (i = 0; i < <length>; ++i) {
var attrName = "s" + i.toString();
child.@[attrName] = String(object.get_sprite(i));
}
Sorry, I don't have the tools available to try it out myself, but it should work.

Google Script Reference Error

In Google Sheets, I have a long list of names in column A and tags in column B. What I want my function to do is return only those values that are stored in the function's list variable.
Currently I have a ReferenceError in my script project, and I don't know why.
Help would be appreciated.
function myFunction(cellInput) {
var input = cellInput;
var list = " word 1; word 2; word-3; word 4; word6; word-7;";
var splitted_input = input.split("; ");
var splitted_list = list.split("; ");
var result = "";
for (var inc1 = 0; inc1 < splitted_list.length; inc1++) {
for (var inc2 = 0; inc2 < splitted_input.length; inc2++) {
var resSet = new Set(result.split("; "));
if(splitted_list[inc1] == splitted_input[inc2] && !resSet.has(splitted_input[inc2])) {
result = result + splitted_input[inc2] + "; ";
}
}
}
return result;
}
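The post does not say which name the ReferenceError mentions. One possibility, and it is only an assumption, is that `Set` is not defined in the script's runtime (the older Rhino-based Apps Script runtime predates ES6 collections). Separately, splitting the list on "; " leaves a leading space on the first entry and a trailing ";" on the last, so those two can never match. The sketch below avoids both issues. `filterTags` and `allowedTags` are hypothetical names, and unlike the original it returns tags in input order without a trailing separator.

```javascript
// Hypothetical rewrite of myFunction that does not rely on Set.
// allowedTags mirrors the entries of the list string in the question.
function filterTags(cellInput) {
  var allowedTags = ["word 1", "word 2", "word-3", "word 4", "word6", "word-7"];
  var kept = [];
  // Split on ";" alone and trim each piece, so stray spaces do not matter.
  var parts = String(cellInput).split(";");
  for (var i = 0; i < parts.length; i++) {
    var tag = parts[i].trim();
    var isAllowed = allowedTags.indexOf(tag) !== -1;
    var isNew = kept.indexOf(tag) === -1;
    if (tag !== "" && isAllowed && isNew) {
      kept.push(tag);
    }
  }
  return kept.join("; ");
}

console.log(filterTags("word 2; other; word-7; word 2")); // word 2; word-7
```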

Export html to Excel format? [duplicate]

I want to extract some data, such as email addresses, from a table in a PDF file and then use those addresses to send email to the people listed.
What I have found so far by searching the web:
I have to convert the PDF file to Excel so I can read the data easily and use it as I want.
I found some free DLLs such as iTextSharp and PDFsharp.
But I didn't find any code snippet that shows how to do this in C#. Is there a solution?
You absolutely do not have to convert PDF to Excel.
First of all, please determine whether your PDF contains textual data or is a scanned image.
If it contains textual data, then you are right about using "some free dll". I recommend iTextSharp, as it is popular and easy to use.
Now the controversial part. If you don't need a rock-solid solution, it would be easiest to read the whole PDF into a string and then retrieve the emails using a regular expression.
Here is an example (not perfect) of reading a PDF with iTextSharp and extracting emails:
public string PdfToString(string fileName)
{
var sb = new StringBuilder();
var reader = new PdfReader(fileName);
for (int page = 1; page <= reader.NumberOfPages; page++)
{
var strategy = new SimpleTextExtractionStrategy();
string text = PdfTextExtractor.GetTextFromPage(reader, page, strategy);
text = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(text)));
sb.Append(text);
}
reader.Close();
return sb.ToString();
}
//adjust expression as needed
Regex emailRegex = new Regex("Email Address (?<email>.+?) Passport No");
public IEnumerable<string> ExtractEmails(string content)
{
var matches = emailRegex.Matches(content);
foreach (Match m in matches)
{
yield return m.Groups["email"].Value;
}
}
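The regular expression above depends on the labels "Email Address" and "Passport No" surrounding each address. When the layout is not that regular, a pattern that matches the address itself may be easier to adapt. The following is a minimal sketch in JavaScript for illustration only. `extractEmails`, the pattern, and the sample text are assumptions, and the pattern is deliberately simple rather than a full address validator.

```javascript
// Deliberately simple pattern: local part, "@", domain labels, then a
// top-level domain of at least two letters. It does not cover every
// address form that the mail standards allow.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return the distinct addresses found in the text, in order of first appearance.
function extractEmails(text) {
  const matches = text.match(EMAIL_PATTERN) || [];
  return Array.from(new Set(matches));
}

// Made-up sample imitating text extracted from a PDF table.
const sample =
  "Email Address jane.doe@example.com Passport No 123 " +
  "Email Address bob@example.org Passport No 456 " +
  "Email Address jane.doe@example.com Passport No 789";

console.log(extractEmails(sample));
// [ 'jane.doe@example.com', 'bob@example.org' ]
```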
Using the ByteScout PDF Extractor SDK, we can extract each page to CSV as shown below.
CSVExtractor extractor = new CSVExtractor();
extractor.RegistrationName = "demo";
extractor.RegistrationKey = "demo";
TableDetector tdetector = new TableDetector();
tdetector.RegistrationKey = "demo";
tdetector.RegistrationName = "demo";
// Load the document
extractor.LoadDocumentFromFile("C:\\sample.pdf");
tdetector.LoadDocumentFromFile("C:\\sample.pdf");
int pageCount = tdetector.GetPageCount();
for (int i = 1; i <= pageCount; i++)
{
int j = 1;
do
{
extractor.SetExtractionArea(tdetector.GetPageRect_Left(i),
tdetector.GetPageRect_Top(i),
tdetector.GetPageRect_Width(i),
tdetector.GetPageRect_Height(i)
);
// and finally save the table into CSV file
extractor.SavePageCSVToFile(i, "C:\\page-" + i + "-table-" + j + ".csv");
j++;
} while (tdetector.FindNextTable()); // search next table
}
public void Convert(string fileNames) {
int pageCount = 0;
iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(fileNames);
pageCount = reader.NumberOfPages;
string ext = System.IO.Path.GetExtension(fileNames);
//string[] outfiles = new string[pageCount];
//Excel.Application app = new Excel.Application();
//app.Workbooks.Add("");
CSVExtractor extractor = new CSVExtractor();
//string outfilePDF1 = fileNames.Replace((System.IO.Path.GetFileName(fileNames)), (System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "_rez" + ".csv"));
string outfilePDFExcel1 = fileNames.Replace((System.IO.Path.GetFileName(fileNames)),
(System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "_rez" + ".xls"));
extractor.RegistrationName = "demo";
extractor.RegistrationKey = "demo";
string folderName = @"C:\Users\Dafina\Desktop\PDF_EditProject\PDF_EditProject\PDFs";
string pathString = System.IO.Path.Combine(folderName, System.IO.Path.GetFileName(fileNames).Replace(".pdf", "")) + "-CSVs";
System.IO.Directory.CreateDirectory(pathString);
for (int i = 0; i < pageCount; i++)
{
string outfilePDF = fileNames.Replace((System.IO.Path.GetFileName(fileNames)),
(System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "_" + (i + 1).ToString()) + ext);
extractor.LoadDocumentFromFile(outfilePDF);
//string outfile = fileNames.Replace((System.IO.Path.GetFileName(fileNames)),
// (System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "_" + (i + 1).ToString()) + ".csv");
string outfile = fileNames.Replace((System.IO.Path.GetFileName(fileNames)),
(System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "-CSVs\\" + "Sheet_" + (i + 1).ToString()) + ".csv");
extractor.SaveCSVToFile(outfile);
}
Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
if (xlApp == null)
{
Console.WriteLine("Excel is not properly installed!!");
return;
}
Excel.Workbook xlWorkBook;
object misValue = System.Reflection.Missing.Value;
xlWorkBook = xlApp.Workbooks.Add(misValue);
string[] cvsFiles = Directory.GetFiles(pathString);
Array.Sort(cvsFiles, new AlphanumComparatorFast());
//string[] lista = new string[pageCount];
//for (int t = 0; t < pageCount; t++)
//{
// lista[t] = cvsFiles[t];
//}
//Array.Sort(lista, new AlphanumComparatorFast());
Microsoft.Office.Interop.Excel.Worksheet xlWorkSheet;
for (int i = 0; i < cvsFiles.Length; i++)
{
int sheet = i + 1;
xlWorkSheet = xlWorkBook.Sheets[sheet];
if (i < cvsFiles.Length - 1)
{
xlWorkBook.Worksheets.Add(Type.Missing, xlWorkSheet, Type.Missing, Type.Missing);
}
int sheetRow = 1;
Encoding objEncoding = Encoding.Default;
StreamReader readerd = new StreamReader(File.OpenRead(cvsFiles[i]));
int ColumLength = 0;
while (!readerd.EndOfStream)
{
string line = readerd.ReadLine();
Console.WriteLine(line);
try
{
string[] columns = line.Split((new char[] { '\"' }));
for (int col = 0; col < columns.Length; col++)
{
if (ColumLength < columns.Length)
{
ColumLength = columns.Length;
}
if (col % 2 == 0)
{
}
else if (columns[col] == "")
{
}
else
{
xlWorkSheet.Cells[sheetRow, col + 1] = columns[col].Replace("\"", "");
}
}
sheetRow++;
}
catch (Exception e)
{
string msg = e.Message;
}
}
int k = 1;
for (int s = 1; s <= ColumLength; s++)
{
xlWorkSheet.Columns[k].Delete();
k++;
}
releaseObject(xlWorkSheet);
readerd.Close();
}
xlWorkBook.SaveAs(outfilePDFExcel1, Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookNormal,
misValue, misValue, misValue, misValue, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlExclusive,
misValue, misValue, misValue, misValue, misValue);
xlWorkBook.Close(true, misValue, misValue);
xlApp.Quit();
releaseObject(xlWorkBook);
releaseObject(xlApp);
var dir = new DirectoryInfo(pathString);
dir.Attributes = dir.Attributes & ~FileAttributes.ReadOnly;
dir.Delete(true);
}
Probably the best approach would be to use a third-party DLL:
namespace ConsoleApp2
{
internal class Program
{
static void Main(string[] args)
{
string pathToPdf = @"D:\abc\abc.pdf";
string pathToExcel = Path.ChangeExtension(pathToPdf, ".xls");
SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
f.ExcelOptions.ConvertNonTabularDataToSpreadsheet = false;
// 'true' = Preserve original page layout.
// 'false' = Place tables before text.
f.ExcelOptions.PreservePageLayout = true;
// The information includes the names for the culture, the writing system,
// the calendar used, the sort order of strings, and formatting for dates and numbers.
System.Globalization.CultureInfo ci = new System.Globalization.CultureInfo("en-US");
ci.NumberFormat.NumberDecimalSeparator = ",";
ci.NumberFormat.NumberGroupSeparator = ".";
f.ExcelOptions.CultureInfo = ci;
f.OpenPdf(pathToPdf);
if (f.PageCount > 0)
{
int result = f.ToExcel(pathToExcel);
// Open the resulted Excel workbook.
if (result == 0)
{
System.Diagnostics.Process.Start(pathToExcel);
}
}
}
}
}

Check if current user is the administrator of a certain Google group

I want to go through all the Google groups I am a member of and get the list of all the users of each group :
var list = GroupsApp.getGroups();
for(var i = 0; i< list.length; i++){
for(var key in headerObj){
var text = "";
if(key == "users"){
var tab = list[i].getUsers();
if(tab.length > 0){
text = tab[0].getEmail();
for(var j = 1; j < tab.length; j++){
text += ", " + tab[j].getEmail();
}
}
headerObj[key].push(text);
}
}
}
But I always get this Exception :
You do not have permission to view the member list for the group: "group email"
Is there a way to go through all the Google groups of which, I am the administrator ?
Unfortunately, such a thing is not possible. There is, however, the workaround of a try/catch:
function myFunction() {
var allGroups = GroupsApp.getGroups();
for (var i in allGroups){
try {
var users = allGroups[i].getUsers();
for (var j in users){
Logger.log(users[j]);
}
}
catch (e) { }
}
}
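The empty catch block above hides which groups were skipped. A variation on the same workaround, sketched below, records them instead. `collectGroupMembers` is a hypothetical helper that relies only on `getEmail()` and `getUsers()`. The mock groups exist solely so the sketch can run outside Apps Script, where the argument would be `GroupsApp.getGroups()`.

```javascript
// Collect member emails for every group whose member list is readable,
// and record the groups that refuse access instead of silently ignoring them.
function collectGroupMembers(groups) {
  var readable = {};
  var skipped = [];
  groups.forEach(function (group) {
    var groupEmail = group.getEmail();
    try {
      readable[groupEmail] = group.getUsers().map(function (user) {
        return user.getEmail();
      });
    } catch (e) {
      // getUsers() threw, most likely because of missing permission.
      skipped.push(groupEmail);
    }
  });
  return { readable: readable, skipped: skipped };
}

// Mock objects imitating the two methods used above (illustration only).
function mockUser(email) {
  return { getEmail: function () { return email; } };
}
var mockGroups = [
  {
    getEmail: function () { return "team@example.com"; },
    getUsers: function () { return [mockUser("a@example.com"), mockUser("b@example.com")]; }
  },
  {
    getEmail: function () { return "locked@example.com"; },
    getUsers: function () { throw new Error("You do not have permission to view the member list"); }
  }
];

var summary = collectGroupMembers(mockGroups);
console.log(summary.readable["team@example.com"].join(", ")); // a@example.com, b@example.com
console.log(summary.skipped.join(", "));                      // locked@example.com
```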

tab escape sequence not working when exporting from JXTable to excel (.csv) file

I am trying to export data from a JXTable to a .csv file (for Excel). I am using the following code for this:
public void exportToExcel(NGAFStandardTable table, File file){
int i = 0;
int j = 0;
try{
TableModel model = table.getModel();
FileWriter excel = new FileWriter(file);
for (i = 0; i < model.getColumnCount(); i++) {
excel.write(model.getColumnName(i) + "\t");
}
excel.write("\n");
for (i = 0; i < model.getRowCount(); i++){
for (j = 0; j < (model.getColumnCount()); j++){
if(model.getValueAt(i,j) == null){
excel.write("" + "\t");
}
else {
excel.write(model.getValueAt(i,j).toString() + "\t");
}
}
excel.write("\n");
}
excel.close();
}
catch(IOException e) {
System.out.println(e);
}
}
The result is a .csv file that contains all the values from the table, but the values for each row end up in a single cell (e.g. A1, A2, A3).
That is, all the values for row 1 are in cell A1, and so on.
I am using the tab escape sequence ("\t") so that the data moves to the next column, but that is not happening. Kindly suggest a fix.
It worked when I used "," as the delimiter in place of "\t":
excel.write(model.getValueAt(i, j).toString() + ",");
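Switching to "," presumably works because Excel splits a .csv file on its list separator, which is typically a comma rather than a tab. Appending "," after every value has two side effects, though: each row ends with a trailing comma, and any value that itself contains a comma or a quote shifts the columns. The sketch below shows the usual quoting convention in JavaScript for illustration (the question's code is Java). `csvField` and `csvLine` are hypothetical helper names.

```javascript
// Quote a single value for CSV: wrap it in double quotes when it contains
// a comma, a double quote, or a line break, and double any embedded quotes.
function csvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Join the values of one row with commas (no trailing delimiter).
function csvLine(values) {
  return values.map(csvField).join(",");
}

console.log(csvLine(["Smith, John", null, 'say "hi"', 42]));
// "Smith, John",,"say ""hi""",42
```

The same logic carries over to the Java loop by joining the escaped values of each row and writing the line separator once per row.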