In what order does the Lookup transformation load data from multiple Excel sheets in SSIS?

I created a package that uses a Lookup transformation to loop through multiple Excel spreadsheets and load data into SQL.
The question is: if the Excel spreadsheets are not ordered in the source folder, in what order will the Lookup transformation loop through each of the sheets?
Based on last modified date? Based on name, in alphabetical order?
The spreadsheets have different dates; does that mean it loads data in date order?
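For context on the question above: file-system enumeration order is generally not guaranteed, so when load order matters, the usual fix is to list and sort the file names yourself before processing them. Below is a minimal Python sketch (not SSIS itself) of the two orderings the question asks about, alphabetical by name and by last-modified date; the folder is whatever you point it at.

```python
import os

def files_in_name_order(folder):
    """Return .xlsx file names sorted alphabetically (case-insensitive)."""
    names = [n for n in os.listdir(folder) if n.lower().endswith(".xlsx")]
    return sorted(names, key=str.lower)

def files_in_modified_order(folder):
    """Return .xlsx file names sorted by last-modified time, oldest first."""
    names = [n for n in os.listdir(folder) if n.lower().endswith(".xlsx")]
    return sorted(names, key=lambda n: os.path.getmtime(os.path.join(folder, n)))
```

Either ordering is deterministic, which is the real requirement; relying on whatever order the enumerator happens to return is not.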

Related

SSIS import from Excel files with multiple sheets and different numbers of columns

I have an SSIS package with two loops: one over the Excel files and one over the sheets. Inside the per-sheet loop I have a Data Flow Task that uses a variable for the sheet name, with an Excel source and an ODBC destination.
The table in the db has all the columns I need such as userid, username, productname, supportname.
However, some sheets have only the columns username and productname, while others have userid, username, productname, supportname.
How can I load the Excel files? Can I add a Derived Column task that checks whether a column exists and, if not, adds it with a default value, and then map it to the destination?
thanks
SSIS is not an "any format goes at run time" data loading engine. There was a conscious design decision to make the fastest possible ETL tool, and one of the requirements was a defined contract between the data source's shape and the destination. That's why you'll inevitably run into the VS_NEEDSNEWMETADATA error: something has altered the shape, and the package needs to be edited in designer mode to update the columns and sizes.
If you want to write the C# to make a generic Excel ingest engine, more power to you.
An alternative approach would be to have multiple data flows defined within your file and worksheet looping construct. The trick would be to conditionally enable them based on the available column set.
If the columns "username" and "productname" are detected, enable DFT UserName and ProductName; that DFT will supply default values, or a lookup, for UserId, SupportName, etc.
If all columns are present, enable DFT All.
Finally, Azure Data Factory can "slurp and burp" whatever source into whatever destination. Perhaps that might be a better fit for your problem.
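The conditional-data-flow idea above can also be prototyped outside SSIS. As a hedged sketch (not SSIS itself), the pandas snippet below normalizes each sheet to the full destination schema by adding any missing columns with a default value, which is the same contract repair the extra DFTs would perform. The column names come from the question; the default values are hypothetical.

```python
import pandas as pd

# Full destination schema, per the question; defaults are illustrative.
FULL_SCHEMA = ["userid", "username", "productname", "supportname"]
DEFAULTS = {"userid": -1, "supportname": "unknown"}

def normalize_sheet(df):
    """Add any missing schema columns with a default, then order the columns."""
    df = df.copy()
    for col in FULL_SCHEMA:
        if col not in df.columns:
            df[col] = DEFAULTS.get(col)
    return df[FULL_SCHEMA]
```

Each worksheet, however many columns it arrives with, comes out with the same shape and can be bulk-loaded into the one destination table.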

Read the last line of a CSV file and extract one value in KNIME

I am working on a workflow in KNIME, and I have an Excel Writer node as the final node of my workflow. I need to read this file and store the last value of one specific column (time). With this value, I need to feed another time node so I can update my API link and make a new request.
To summarize: I need to extract specific information from the last line of my Excel file in KNIME.
My question is: How can I read this file and get this value from my sheet? And then, how can I update a time loop to refresh the data, inserting the current day into my API link?
UPDATE: My question is how I can always filter for the last 90 days in my concatenated database. I have two columns with dates in this file, and I need to keep just the last 90 days counting back from the current day.
To read an Excel file, use the Excel Reader node.
The simplest way to get the last row of a table (assuming your date column has a value in every row of this table) is probably to use a Rule-based Row Filter with the expression
$$ROWINDEX$$ = $$ROWCOUNT$$ => TRUE
Now you have a one-row table with the values from the last line of the Excel sheet. To help further, we need to understand what you mean by "update a time loop to refresh the date for inserting the current day in my API link". Can you update your question with a screenshot of your current KNIME workflow?
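For readers outside KNIME, the same two steps — grab the last row's value, then keep only the last 90 days — can be sketched in pandas. The column name time comes from the question; the dates passed in would be your own.

```python
import pandas as pd

def last_value(df, column):
    """Return the value of `column` in the last row of the table."""
    return df[column].iloc[-1]

def last_90_days(df, column, today):
    """Keep only the rows whose date falls within the 90 days before `today`."""
    cutoff = today - pd.Timedelta(days=90)
    return df[df[column] >= cutoff]
```

In KNIME the second step corresponds to a Date&Time-based Row Filter with a relative 90-day window, but the arithmetic is the same: compute a cutoff from the current day and keep rows at or after it.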

How to work with Unstructured Excel Column/Header names which are spread across multiple rows using R

There is an Excel sheet at http://www.censusindia.gov.in/2011census/C-series/C08.html
Please refer to the Excel sheet, row "India", for column "C-08".
I want to analyse these data in R. However, the Excel headers (column names) are unstructured. Some headers are located in the first row; others are in the 2nd, 3rd, or 4th row. Beneath the 4th row is the first subset of data we want to generate graphs from, and there are multiple subsets as you go down the sheet, each separated by an empty row. The Excel sheet isn't in a format that can be analysed in R directly.
Please suggest a solution to the issue.
Thank you so much in advance!
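The question asks for R, where readxl and tidyr would be the natural tools; as a language-neutral sketch of the reshaping step, the pandas code below skips the multi-row header and splits the sheet into subsets wherever a fully empty row occurs. The layout assumptions (four header rows, blank-row separators) are taken from the question.

```python
import pandas as pd

def split_on_blank_rows(df, header_rows=4):
    """Drop the header rows, then split the body into blocks at fully-empty rows."""
    body = df.iloc[header_rows:].reset_index(drop=True)
    blank = body.isna().all(axis=1)  # True where the whole row is empty
    blocks, current = [], []
    for i, is_blank in blank.items():
        if is_blank:
            if current:
                blocks.append(body.loc[current].reset_index(drop=True))
                current = []
        else:
            current.append(i)
    if current:
        blocks.append(body.loc[current].reset_index(drop=True))
    return blocks
```

Each returned block is one of the subsets described in the question and can be graphed or cleaned independently; the same read-skip-split idea carries over directly to R.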

Extracting rows that meet certain criteria from a worksheet and populating another worksheet dynamically using functions

My first worksheet has smart markers. The column names can be different each time.
Can I populate a fresh worksheet with only those rows that meet a given criterion on one or more of the columns?
Example: if the columns are TYPE and VALUE, can I add only the rows with TYPE = A to another sheet?
I have tried AutoFilter, but it requires knowing the column index, and my column names change from time to time.
I am using Aspose.Cells for Java.
I think the best way to cope with your scenario is to filter and extract your desired rows/records in your query (e.g. SQL statement) or ResultSet from the data source yourself; once the Smart Markers are processed, the desired data will be filled into the cells.
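The answer above recommends filtering the records before the Smart Markers run. As a hedged sketch of that pre-filtering step, this pandas snippet selects rows by column name rather than by index, so a changing column position does not matter; the TYPE/VALUE names come from the question's example.

```python
import pandas as pd

def rows_where(df, column, value):
    """Return only the rows where `column` (looked up by name) equals `value`."""
    return df[df[column] == value].reset_index(drop=True)
```

The same name-based selection is what a WHERE clause gives you in SQL: filtering by "TYPE = 'A'" works no matter which position the column ends up in.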

Copy details from one sheet to another

Brand new to Google sheets but very familiar with Excel.
I have a spreadsheet with a summary page (named Summary) and several individual pages (named for each person, i.e. Fred, Joe, George).
Each individual page is the responsibility of one person who enters data once a week.
I want to copy that individual information, which is on a single row, to the summary page by having the individual start a function that copies the data from their sheet to the summary sheet.
One of the cells is =TODAY(), which needs to be converted to a fixed date; the other cells in the row are just numbers.
This is so easy in Excel (just write a macro), but I'm having trouble finding how to do it automatically in Google Sheets.
Can you try =TO_DATE(ABS(TODAY()))?