Including an STM step in textrecipes - tidymodels

How can I use STM (a structural topic model) in textrecipes, within a tidymodels workflow?
I need this for my master's degree, for which I use textrecipes as the main package.
The link https://www.tidymodels.org/learn/develop/recipes/ shows how to create a new step for my recipe, but I can't get off the ground with it.
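For reference, the custom-step pattern from that guide boils down to a skeleton like the one below. This is only a minimal sketch of a hypothetical step_stm(): the constructor and prep()/bake() structure follow the guide, but the actual topic-model fitting and scoring bodies (e.g. via the stm package) are left as stubs to fill in.
library(recipes)
library(rlang)

# User-facing constructor: collects the selected columns and parameters.
step_stm <- function(recipe, ..., role = "predictor", trained = FALSE,
                     num_topics = 10, res = NULL,
                     skip = FALSE, id = rand_id("stm")) {
  add_step(
    recipe,
    step_stm_new(
      terms = enquos(...), role = role, trained = trained,
      num_topics = num_topics, res = res, skip = skip, id = id
    )
  )
}

# Internal constructor: wraps the fields in a step object.
step_stm_new <- function(terms, role, trained, num_topics, res, skip, id) {
  step(subclass = "stm", terms = terms, role = role, trained = trained,
       num_topics = num_topics, res = res, skip = skip, id = id)
}

# prep(): fit the topic model on the training data and store it in `res`.
prep.step_stm <- function(x, training, info = NULL, ...) {
  col_names <- recipes_eval_select(x$terms, training, info)
  # Stub: build documents from training[, col_names] and fit stm::stm() here.
  fitted <- NULL
  step_stm_new(terms = x$terms, role = x$role, trained = TRUE,
               num_topics = x$num_topics, res = fitted,
               skip = x$skip, id = x$id)
}

# bake(): replace the text column(s) with per-document topic proportions.
bake.step_stm <- function(object, new_data, ...) {
  # Stub: score new_data against object$res and bind the topic columns here.
  new_data
}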

Related

SSIS - Export multiple SQL Server tables to multiple text files

I have to move data between two SQL Server DBs. My task is to export the data as text (.dat) files, move the files, and import them into the destination. I have to migrate over 200 tables.
This is what I tried:
1) I used an Execute SQL Task to fetch my table names.
2) Used a For Each Loop to loop through the table names in the collection.
3) Used a Script Task inside the For Each Loop to build the text file destination path.
4) Called a DFT with the table name in a variable for the OLE DB source and the path name in a variable for the flat file destination.
The first table extracts fine, but the second table bombs with a synchronization error. I have seen this in numerous posts but could not find one that matches my scenario, hence posting here.
Even if I get the package to work with multiple DFTs, the second DFT does not export the columns, because the flat file connection manager still remembers the first table's columns. Is there a way to get it to forget the columns?
Any thoughts on how I can export multiple tables to multiple text files using one DFT with dynamic source and destination variables?
Thanks, and I appreciate your help.
Unfortunately, the Bulk Insert Task only enables us to use format files effectively to map the columns between source and destination. The Bulk Insert Task uses the BULK INSERT T-SQL command to import the data, and to execute it the user must have the BULKADMIN server privilege.
Most companies do not allow the BULKADMIN server privilege to be enabled, for security reasons.
Hence, using a Script Task to construct BCP statements is a good and simple option for the export.
You do not need to construct a .bat file, as the script itself can execute DOS commands, which run under the .NET security account.
I figured out a way to do this. I thought I will share if anybody is stuck in the same situation.
So, in summary, I needed to export and import data via files. I also wanted to use a format file if at all possible for various reasons.
What I did was
1) Construct a DFT which gets me the list of table names from the DB that I need to export. I used an OLE DB source and a Recordset Destination as the target, and stored the table names inside an object variable.
A DFT is not really necessary; you can get the list any other way. Also, in our application, we store the table names in a table.
2) Add a 'For each loop container' with a 'For Each ADO Enumerator' which takes my object variable from the previous step into the collection.
3) Parse the variable one table at a time and construct BCP statements like the ones below inside a Script Task. Create variables as necessary. The BCP statement is stored in a variable.
I loop through the tables and construct multiple BCP statements like this:
BCP "DBNAME.DBO.TABLENAME1" out "PATH\FILENAME1.dat" -S SERVERNAME -T -t"|" -r$\n -f "PATH\filename.fmt"
BCP "DBNAME.DBO.TABLENAME2" out "PATH\FILENAME2.dat" -S SERVERNAME -T -t"|" -r$\n -f "PATH\filename.fmt"
The statements are put inside a .bat file. This is also done inside the Script Task (a C# sketch follows these steps).
4) An Execute Process Task next executes the .bat file. I had to do this because I do not have the option to use the 'master..xp_cmdshell' command or the 'BULK INSERT' command in my company. If I had the option to execute cmdshell, I could have run the command directly from the package.
5) Again, add a 'For each loop container' with a 'For Each ADO Enumerator' which takes my object variable from step 1 into the collection.
6) Parse the variable one table at a time and construct BCP statements like the ones below inside a Script Task. Create variables as necessary. The BCP statement is stored in a variable.
I loop through the tables and construct multiple BCP statements like this:
BCP "DBNAME.DBO.TABLENAME1" in "PATH\FILENAME1.dat" -S SERVERNAME -T -t"|" -r$\n -b10000 -f "PATH\filename.fmt"
BCP "DBNAME.DBO.TABLENAME2" in "PATH\FILENAME2.dat" -S SERVERNAME -T -t"|" -r$\n -b10000 -f "PATH\filename.fmt"
The statements are put inside a .bat file. This is also done inside the Script Task.
The -b10000 switch was added so I can import in batches. Without it, many of my large tables could not be copied because there was not enough space in tempdb.
7) Run this second .bat file, again via an Execute Process Task, to do the import.
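For illustration, the core of the Script Task in steps 3 and 6 might look like the C# below. This is only a hedged sketch, not the poster's actual code: the SSIS variable names (User::TableName, User::BatFilePath) and the server/path placeholders inside the BCP string are assumptions to be replaced with your own.
// Body of Main() in an SSIS Script Task (the designer-generated
// boilerplate and ScriptResults enum are assumed). Reads one table name
// from a package variable, builds a BCP export line for it, and appends
// the line to the .bat file.
string table = Dts.Variables["User::TableName"].Value.ToString();
string batPath = Dts.Variables["User::BatFilePath"].Value.ToString();

string bcpLine = string.Format(
    "BCP \"DBNAME.DBO.{0}\" out \"PATH\\{0}.dat\" -S SERVERNAME -T -t\"|\" -r$\\n -f \"PATH\\filename.fmt\"",
    table);

System.IO.File.AppendAllText(batPath, bcpLine + System.Environment.NewLine);
Dts.TaskResult = (int)ScriptResults.Success;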
I am not sure if this is the best solution, but I thought I would share what satisfied my requirement. If my answer is not clear, I would be happy to explain further if you have any questions. We can also optimize this solution. The same can be done purely via VB scripts, but you have to write some code to do that.
I also created a package configuration file where I can change the DB name, server name, and the data and format file locations dynamically.
Thanks.

store mysql query information chef

I am trying to query my MySQL database. I am using the database cookbook and can set up a connection with my database. I am trying to query my database for information, so now the question is: how do I store that information so I can access it in another resource? Where do the results of the query get stored? This is my recipe:
mysql_database "Get admin users" do
connection mysql_connection_info
sql "Select * from #{table_name}"
action :query
end
Thanks in advance
If you don't have experience with Ruby, this might be really confusing. There's no way to "return" the result of a provider from a Chef resource. mysql_database is a Chef::Recipe DSL method that gets translated to Chef::Provider::Database::Mysql at runtime. This provider is defined in the cookbook.
If you take some time to dive into that provider, you can see how it executes queries using the db object. In order to get the results of a query, you'll need to create your own connection object in the recipe and execute a command against it. For example:
require 'mysql'
db = ::Mysql.new('host', 'username', 'password', nil, 'port', 'socket') # varies with setup
users = db.query('SELECT * FROM users')
#
# You might need to manipulate the result into a more manageable data
# structure by splitting on a carriage return, etc...
#
# Assume the new object is an Array where each entry is a username.
#
file '/etc/group' do
  content users.join("\n")
end
I find good old Chef::Mixin::ShellOut / shell_out() fairly sufficient for this job, and it's DB-agnostic (assuming you know your SQL :) ). It works particularly well if all you are querying is one value; for multiple rows/columns you will need to parse the SQL query results, hiding row counts and column headers, eating preceding whitespace, and so on, so that you get just the query results you want. For example, the below works on SQL Server:
Single item
so = shell_out!("sqlcmd ... -Q \"set nocount on; select file_name(1)\" -h-1 -W")
db_logical_name = so.stdout.chop
Multiple rows/columns (the 0-based position of a value within a row tells you which column it is):
so = shell_out!("sqlcmd ... -Q \"set nocount on; select * from my_table\" -h-1 -W")
rows_column_data = so.stdout.chop
# columns within rows are space separated, so can be easily parsed
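To make the "easily parsed" part concrete, here is one possible way to split that output into rows and columns; this sketch assumes the data values themselves contain no embedded spaces:
# Each line of stdout is one row; columns are separated by runs of
# whitespace (sqlcmd's -W switch trims the padding).
rows = rows_column_data.split("\n").map { |line| line.strip.split(/\s+/) }
first_value = rows[0][0] # the first column of the first row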

Backup database(s) using query without using mysqldump

I'd like to dump my databases to a file.
Certain website hosts don't allow remote or command line access, so I have to do this using a series of queries.
All of the related questions say "use mysqldump", which is a great tool, but I don't have command-line access to this database.
I'd like the CREATE and INSERT commands to be generated at the same time, basically the same output as mysqldump. Is SELECT ... INTO OUTFILE the right road to travel, or is there something else I'm overlooking, or maybe it's not possible?
Use mysqldump-php, a pure-PHP solution to replicate the function of the mysqldump executable for basic- to medium-complexity use cases. I understand you may not have remote CLI and/or direct MySQL access, but as long as you can execute via an HTTP request on a web server on the host, this will work:
So you should be able to just run the following pure-PHP script straight from a secure directory in /www/, have an output file written there, and grab it with wget.
mysqldump-php - Pure PHP mysqldump on GitHub
PHP example:
<?php
require('database_connection.php');
require('mysql-dump.php');

$dumpSettings = array(
    'include-tables' => array('table1', 'table2'),
    'exclude-tables' => array('table3', 'table4'),
    'compress' => CompressMethod::GZIP, /* CompressMethod::[GZIP, BZIP2, NONE] */
    'no-data' => false,
    'add-drop-table' => false,
    'single-transaction' => true,
    'lock-tables' => false,
    'add-locks' => true,
    'extended-insert' => true
);

$dump = new MySQLDump('database', 'database_user', 'database_pass', 'localhost', $dumpSettings);
$dump->start('forum_dump.sql.gz');
?>
With your hands tied by your host, you may have to take a rather extreme approach. Using any scripting option your host provides, you can achieve this with just a little difficulty. You can create a secure web page or straight text-dump link, known only to you and sufficiently secured to prevent all unauthorized access. The script to build the page/text contents could be written to follow these steps:
For each database you want to back up:
Step 1: Run SHOW TABLES.
Step 2: For each table name returned by the above query, run SHOW CREATE TABLE to get the CREATE statement that you could run on another server to recreate the table, and output the results to the web page. You may have to prepend "DROP TABLE IF EXISTS X;" before each CREATE statement generated from the results of these queries (not in your query input!).
Step 3: For each table name returned from step 1 again, run a SELECT * query and capture the full results. You will need to apply a bulk transformation to this query result before outputting it, converting each row into an INSERT INTO tblX statement, and output the final transformed results to the web page/text file download (see the sketch after these steps).
The final web page/text download would contain all of the CREATE statements with "DROP TABLE IF EXISTS" safeguards, plus the INSERT statements. Save the output to your own machine as a ".sql" file and execute it on any backup host as needed.
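A rough PHP sketch of those three steps, assuming the mysqli extension is available on the host; the credentials are placeholders, and real code would need more hardening (binary columns, character sets, memory limits) than shown here:
<?php
// Step 0: connect (placeholder credentials).
$db = new mysqli('localhost', 'db_user', 'db_pass', 'db_name');

// Step 1: list the tables.
$tables = $db->query('SHOW TABLES');
while ($row = $tables->fetch_row()) {
    $table = $row[0];

    // Step 2: DROP guard plus the CREATE statement.
    $create = $db->query("SHOW CREATE TABLE `$table`")->fetch_row();
    echo "DROP TABLE IF EXISTS `$table`;\n" . $create[1] . ";\n\n";

    // Step 3: one INSERT per row of data.
    $data = $db->query("SELECT * FROM `$table`");
    while ($r = $data->fetch_row()) {
        $vals = array_map(function ($v) use ($db) {
            return is_null($v) ? 'NULL' : "'" . $db->real_escape_string($v) . "'";
        }, $r);
        echo "INSERT INTO `$table` VALUES (" . implode(', ', $vals) . ");\n";
    }
    echo "\n";
}
?>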
I'm sorry you have to go through with this. Note that preserving mysql user accounts that you need is something else entirely.
Use / install phpMyAdmin on your web server and click Export. Many web hosts already offer this as a pre-configured service, and it's easy to install if you don't already have it (pure PHP): http://www.phpmyadmin.net/
This allows you to export your database(s), as well as perform other otherwise tedious database operations, very quickly and easily, and it works for older versions of PHP < 5.3 (unlike the mysqldump-php offered in another answer here).
I am aware that the question states 'using query', but I believe the point here is that any means necessary is sought when shell access is not available; that is how I landed on this page, and phpMyAdmin saved me!

What does #__ mean at the beginning of a table name in a Joomla mysql query?

I have a MySQL query that looks like this: $query = "SELECT * FROM #__content".
What does the #__ at the start of the table name mean?
#__ is the prefix of your tables.
#__ is simply the database table prefix and is defined in your configuration.php.
If it weren't defined, people would have to manually input their prefix into every extension that requires access to the database, which you can imagine would be annoying.
So, for example, if your database table prefix is j25, then:
#__content = j25_content
As others have said, the hash-underscore sequence '#_' is the prefix placeholder used in table names by Joomla!'s JDatabase class. (N.B. there is only one underscore in the placeholder; the second underscore is kept for readability in table names.)
When you first setup Joomla! you are given the option of setting a prefix or using the one randomly generated at the time. You can read about how to check the prefix here.
When you access the database using the JDatabase class it provides you with an abstraction mechanism so that you can interact with the database that Joomla is using without you having to code specifically for MySQL or MSSQL or PostgreSQL etc.
When JDatabase prepares a query prior to executing it, it replaces any occurrence of #_ in the FROM segment of the query with the prefix set up when Joomla! was installed, e.g.
// Get the global DB object.
$db = JFactory::getDBO();
// Create a new query object.
$query = $db->getQuery(true);
// Select some fields
$query->select('*');
// Set the FROM segment
$query->from('#__myComponents_Table');
Later, when you execute the query, JDatabase will change the FROM segment of the SQL from
from #__myComponents_Table to
from jp25_myComponents_Table (if the prefix is jp25) prior to executing it.

Robocopy, Multi-Line Execute Process SSIS Task, or Output Batch File Results to SSIS

I need to robocopy files from one location to another in an SSIS package. Since the folder is on another domain, I need to impersonate another account before I run the robocopy.exe command. I found I can execute a "net use" command to impersonate the necessary user account and then execute the robocopy command immediately afterwards. I don't see any way to do this directly in a single Execute Process Task, so I use an Execute Process Task to run a batch file that has these two commands as separate lines (a sketch of that file follows the questions below). The downside of this approach is that I cannot read the results of the Execute Process command. So this leads me to three questions:
Is there a way to execute a multi-line command in a single Execute Process task?
Is there a way to execute robocopy.exe while impersonating another account in one line?
Is there a way to write the results of a batch file back to either a variable in SSIS or to the SSIS database log?
If there is a positive answer for any of the above three questions, then I may be able to work out a way to add job success or failure rules based on the results of the robocopy command.
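For reference, the two-line batch file described above might look like the following; the share path, account, password, and log location are all placeholders:
rem Authenticate to the remote share first, then copy.
net use \\otherdomain\share P@ssw0rd /user:OTHERDOMAIN\svc_robocopy
robocopy \\otherdomain\share\source D:\local\dest /E /Z /LOG:D:\logs\robocopy.log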
This can easily be achieved if you have enabled the extended stored procedure xp_cmdshell (see Books Online under "Surface Area Configuration").
In the following sample I have built a .cmd file containing all my ROBOCOPY options and simply execute this command file using xp_cmdshell, grabbing the output into a table variable (it can be a persistent table instead).
Just add an Execute T-SQL Statement Task to your SSIS package with the following statement:
/** START **/
DECLARE @cmdfile nvarchar(255) = N'C:\myFolder\myCommandFile.cmd'
DECLARE @logtable table (
     [RowId] integer IDENTITY(1,1) NOT NULL
    ,[Message] nvarchar(1024) NULL
)
INSERT INTO @logtable ([Message])
EXEC xp_cmdshell @cmdfile
SELECT *
FROM @logtable
WHERE
    [Message] IS NOT NULL
/** END **/
Depending on the logging options set for the ROBOCOPY command, you can show progress, headers, a report, or more; see the ROBOCOPY documentation for those options. Also try the /TEE switch to grab any console output from the ROBOCOPY command.
One way I can think of doing this is to execute a batch file (so you can execute multiple commands) with RUNAS (to impersonate another user account).
You can capture the output of the batch file in a log file and read the contents of that log into SSIS using a Script Task, as sketched below.
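For instance, the body of a Script Task's Main() might look like this; the log path and the User::RobocopyLog variable name are assumptions, and the variable must be listed in the task's ReadWriteVariables:
// Read the robocopy log produced by the batch file into an SSIS
// string variable so later tasks can inspect it.
string logText = System.IO.File.ReadAllText(@"D:\logs\robocopy.log");
Dts.Variables["User::RobocopyLog"].Value = logText;
Dts.TaskResult = (int)ScriptResults.Success;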
I am not sure how to write to the SSIS log and am interested to see what other developers have to say about that.
I found two roundabout methods for getting data from a batch file execution into the database. One method is to add logging to the actual batch file; I haven't tried this yet, but it seems conceptually possible. The other option, which I have tried and which works, is to run the batch file as a SQL Server Agent job step with flat-file logging. After I did this, I created a small package that scrapes the file for specific messages. I would still prefer a better solution, but this works well enough for now.
Is there a way to execute a multi-line command in a single Execute Process task?
Yes, use "Cmd.exe" as the executable.
Create a string variable with the following expression (sample):
"/C \"start robocopy c:\\temp\\v10.2a c:\\temp\\v1 *.WRI /Z /S && start robocopy c:\\temp\\v10.2a c:\\temp\\v1 *.dll /Z /S\""
Then map it to the Arguments parameter via a component expression. You can then execute those two (or more) robocopy commands in parallel.