Pentaho Microsoft Access Input with multiple files - ms-access

I am using PDI 8.3 and have a collection of Microsoft Access files in a folder. I am using Microsoft Access Input and have a table name chosen. If the table does not actually exist in onw of the files, the transformation stops at that point without any errors and does not continue on to the next file. For example, if being output to another database (like MS Access Input to mysql):
file_a.accdb => has the table "dbo_foo"
file_b.accdb => has the table "dbo_foo"
file_c.accdb => DOES NOT have the table "dbo_foo"
file_d.accdb => has the table "dbo_foo"
In this scenario, the transformation will run and data from file_a and file_b will be inserted, but the transformation will stop because the table doesn't exist in file_c. While that might be ok, I still need the data in file_d inserted, and Pentaho is not showing errors for me to even detect.

I don't know how you call list of the access database. But you can use below PROCESS , where each database will process individually (even if any one has no table match).

Related

Read Access DB Extract in Talend

I have a requirement to read Access DB Extract in Talend. There is a component in Talend 'tAccessInput' which is used to read Access DB tables. But it requires a connection to Access Database.
However, for my requirement i am given an extract of Access db, say MasterTables.accdb and it is not a live database connection. I need to extract the tables that are present in this Access DB Extract using Talend.
Also, i know there is an option of exporting from Access DB Extract by opening the extract and exporting the required tables, but i do not wont to do it manually.
So, is there a component/steps that can help me achieve my requirement using Talend.
As guided by #iMezouar(Thanks for the inputs), i was able to use
tAccessInput component of Talend and achieve my requirement. Below are
the steps i followed:
Step 1- Configure tAccessInput component. Set the Database field with
the path of the .accdb extract. Leave the username and password blank
if the extract is not password protected
Step 2- In the Table Name field provide the name of the table that you
want to read from your access extract
Step 3- Go to edit schema section and add the column details of the
TableName provided in the above step
Step 4- Now go to Query Type and select Guess Query. Once you have
clicked this button, it will populate the Query section with the
relevant query
Step 5- Next connect the tAccessInput to tMap if you intend to do any
processing else connect it directly to an output component. In my case
its tFileOutputDelimited and your job is ready to run to extract data
from the access dump
Step 6- If you get a Warning issue in the Run Console i.e 'Error in
the metadata of the table: table's row count in the metadata is XXX
but XXY records have been found and loaded by UCanAccess. All will
work fine, but it's better to repair your database', just open the
access dump, go to Database Tools tab and select 'Compact and Repair
Database'. Then save the file. This will remove the warning issue also.

No columns returned SSIS

I am implementing a SSIS package and currently trying to do the following.
Truncate the destination table
Fetch the data by executing the stored procedure and insert it into the destination table.
I have created an Execute SQL task to address step 1 and dataflow with oledb source and oledb destination to address the second point. It been working successfully so far but isn't working for one my stored procedure that uses temp tables.
When I edit the oledb source and click the preview button, I get the error no column returned
I know that SSIS has an issue with generating column while executing stored procedures that depend on temp tables. I have converted the stored proc to use temporary table variables and its now able to return columns in SSIS when I do a preview. The only downside is that the stored procedure is taking longer time to execute. Its taking 1 hour 15 mins as compared to 15 mins while using temp tables.
I did see a suggestion to use SET FMTONLY before executing the stored procedure as an alternate solution to changing to temp table variables but that didn't seem to work as I am getting syntax or permission denied error.
Could somebody tell me a solution to my problem which does not compromise on the performance.
Sounds like you've already read all the approaches to using Temp tables in SSIS, including the IF 1=0... trick? If you haven't seen that one yet, google it.
You say that using Table Variables causes your stored procedure to take about 5 times longer than using Temp Tables. The most likely reason for that is that you are indexing your temp tables but not your table variables. If you didn't know that table variables can be indexed, they can. You might try that.
Finally, a solution that you haven't mentioned is that you can replace your temporary table with a real table that gets truncated when you're done using it.
Short comment:
Try EXEC WITH RESULT SETS and specify the metadata yourself for a proc with temp tables; or use the Script Component as a source and specify the Output columns yourself.
Long comment:
Technically speaking, it is the driver/database you are using in SSIS that would decide the behavior when working with temp tables.
Metadata is an important factor when using SSIS's pipeline components. By metadata, I mean the names of the columns, their data types etc that a pipeline component uses. When designing a data flow, someone/something should provide this metadata to the components that require it.
In most cases, SSIS automatically retreives the metadata. Components that do not connect to a external data source, like Conditional Split etc, get their metadata from the other components they are connected to. For the pipeline components that connect to a external data source (like Oledb source, oledb destination, Lookup etc.), SSIS provides a mechanism to get this metadata without human involvement. This mechanism involves the driver connecting to the database and retrieving the metadata of the output. If the driver/database is capable of returning the metadata, then that metadata is used. If the driver/database is incapable, then you get the errors you are seeing. The rest of my comments are based on the assumption that you are using a SQL Server database in your question.
When working with a SQL Server database in SSIS, typically, we use the native client drivers provided by Microsoft. When trying to get the metadata, these drivers try to get the metadata without actually executing the SQL Statement (actual execution can have side effects; and also, might take more than a few seconds/minutes/hours; and you dont want side effects and long wait times during package design time.) So to get the metadata, the driver relies on the metadata of the actual objects used in the sql command. If the command uses a physical table or view, SQL Server already has the metadata available and can supply it to the driver. If it is a temp table, SQL Server does not have the metadata until it can create the temp table. If using FMT ONLY option, you can use it in such a way to create the temp tables, but avoid any heavy processing/side affects and thus be able to retrieve metadata without penalties. Post 2012, these native client drivers rely on some newer functionality to retrieve metadata than the drivers before 2012. In 2012 and after, the driver uses the sp_describe_first_result_set proc to retrieve metadata. So, whether you can get metadata or not is determined by the ability of the sp_describe_first_result_set proc.
So while SSIS can automatically get the metadata (because of the driver/database), it does not automatically get the metadata in some cases (again because of the driver/database). In cases involving the second scenario, some other process (typically a human) can help the driver infer metadata or provide the metadata to the component directly.
To help the driver, in case of SQL Server 2012 and after, you can use the WITH RESULTSETS clause to specify the output metadata. When this clause is present, the driver will use it and doesnt try to query the metadata from system objects; and thus avoid the error which you would otherwise get. If you are using the drivers that came with SQL Server 2008, you can use FMT ONLY. This option is at the driver/database level.
Another option could be to use a Script Component as the Source and in the Output columns, you can specify the columns/metadata. SSIS would not try to retrieve metadata from the datasource in this case, but would rely on the definitions you provided in the Output section of the Script Component.
As you can see, both options involve a human (or some other process) specifying the metadata instead of SSIS trying to retrieve the metadata in an automated fashion. I would prefer the first option if working with SQL Server and the second option if working with databases like MySql.

Refresh Data in Access 2016 from Read-Only MySQL ODBC`

I am attempting to "sync" data from a read-only ODBC MySQL server to Access 2016. I need to move the data into Access so that I can more easily manipulate and create better customized reports.
I have linked the data tables between Access and MySQL, however I cannot get the data in these tables to automatically refresh. I must go into Access and hit "Refresh All".
What I'm looking to do is update all of my open tables in Access once nightly so that each morning the data used to build these reports is new. Currently if I leave these tables all evening, when I get in the next morning I must hit "Refresh-All" for Access to go retrieve the most recent data.
Any ideas?
The data in linked tables is automatically refreshed by access when you attempt to read them. You can do that by displaying a datasheet view of the database, or by a form where the linked table is the data source. Beware, we have had problems which tables with lots of records being the source for drop down lists, having the database locked.
Access only does this properly (and at speed) if either the underlying table has a unique clustered index, or after having linked the tables you create an index in access.
If you want to create a copy that you can manipulate (such as write to) and the underlying tables are read only, then you will have to create matching unlinked tables and execute some form of copy sql and appropriate points in your application.

How to run append query from data macro MS Access?

I have 2 table, one is local named 'Client' and other is a linked table to MySQL DB in my web-server named 'ClientSql'.
When I insert data into the table Client, I want it to be insert into clientSql too. I try it with data macro (after insert), but it shows me an error saying
It's not possible in linked tables.
I have tried to create an append query successfully, and it works, but it just works if I execute it manually. My question is:
Is it possible to call it from a data macro? if is possible, can you show me how? If not, can you point me to a solution?
I'm not sure why you should be able to do it manually, but not via a macro. According to this link
http://dev.mysql.com/doc/connector-odbc/en/connector-odbc-examples-tools-with-access-linked-tables.html
you should be able to do it either way.
Another thought is to eliminate the local access client table and have the access program update the mySql table directly. However, if another program is accessing the client table at the same time, this could become tricky due to multi-user and locking situations.

microsoft access automated copy from one table to another

I am new to access and have mostly worked with SQL Server. I am trying to accomplish a task for our users and am not sure how to approach it.
We have a situation where the users need to manipulate some data in various access tables, then put the final results into one of several 'linked tables' that are defined in SQL Server and linked to access. The sql server tables will defined very generically with column names like 'col1', 'col2' etc to allow for different types of data to be uploaded.
What I would like to do is have some kind of macro or module that does this:
1) Lets the user select the source (access table)
2) Lets the user select destination (linked sql server) table
3) Lets the user map the columns he would like to copy from the source to the destination table. (If this is too difficult then just something that would copy the first x number of columns would work).
4) Deletes all pre-existing data from the target table
5) Copies all data from the source table to the target table.
Could someone give me an idea as to what would be the best approach or maybe even an example of some code that does something similar? Thanks in advance. We are using Microsoft Access 2010.