How do I customise the CREATE DATABASE statement in VSTS DB Edition Deploy? - sql-server-2008

I'm using VSTS Database Edition GDR Version 9.1.31024.02
I've got a project where we'll be creating multiple databases with an identical schema, on the fly, as customers are added to the system: one DB per customer. I thought I'd be able to use the deploy script to do this. Unfortunately, I always get fully specified file names in the CREATE DATABASE statement. For example:
CREATE DATABASE [$(DatabaseName)]
ON
PRIMARY(NAME = [targetDBName], FILENAME = N'$(DefaultDataPath)targetDBName.mdf')
LOG ON (NAME = [targetDBName_log], FILENAME = N'$(DefaultDataPath)targetDBName_log.ldf')
GO
I'd expected something more like this
CREATE DATABASE [$(DatabaseName)]
ON
PRIMARY(NAME = [targetDBName], FILENAME = N'$(DefaultDataPath)$(DatabaseName).mdf')
LOG ON (NAME = [targetDBName_log], FILENAME = N'$(DefaultDataPath)$(DatabaseName)_log.ldf')
GO
Or even
CREATE DATABASE [$(DatabaseName)]
I'm not going to be running this on an ongoing basis, so I'd like to make it as simple as possible for the next guy. There are a bunch of deployment options in the project properties, but I can't get this to work the way I'd like.
Anyone know how to set this up?

Better late than never: I know how to get the $(DefaultDataPath)$(DatabaseName) file names from your second example.
The SQL in your first snippet suggests that you don't have scripts for creating the database files in your VSTS:DB project, perhaps because you deliberately excluded them from any schema comparisons you've done. I found it a little counter-intuitive, but the solution is to let VSTS:DB script the MDF and LDF in your development environment, then edit those scripts to use the SQLCMD variables.
In your database project, go to the folder Schema Objects > Database Level Objects > Storage > Files. In there, add these two files:
Database.sqlfile.sql
ALTER DATABASE [$(DatabaseName)]
ADD FILE (NAME = [$(DatabaseName)],
FILENAME = '$(DefaultDataPath)$(DatabaseName).mdf',
SIZE = 2304 KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024 KB)
TO FILEGROUP [PRIMARY];
Database_log.sqlfile.sql
ALTER DATABASE [$(DatabaseName)]
ADD LOG FILE (NAME = [$(DatabaseName)_log],
FILENAME = '$(DefaultDataPath)$(DatabaseName)_log.ldf',
SIZE = 1024 KB, MAXSIZE = 2097152 MB, FILEGROWTH = 10 %);
The full database creation script that VSTS:DB (or, for that matter, VSDBCMD.exe) generates will now use the SQLCMD variables to name the MDF and LDF files, letting you specify them on the command line or in MSBuild.
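For example, deploying from the command line could then look something like the sketch below. This is from memory, not a verified invocation: the switches (/a, /dd, /manifest, /cs) and the /p:TargetDatabase property should be checked against vsdbcmd /? for your build, and all server and database names here are hypothetical:
vsdbcmd /a:Deploy /dd:+ /manifest:MyDatabase.deploymanifest /cs:"Data Source=MyServer;Integrated Security=True" /p:TargetDatabase=Customer1DB
Each customer then gets their own database, with MDF/LDF names derived from it, out of the single project.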

We do this using a template database that we back up, copy, and restore as new customers are brought online. We don't do any of the schema creation with scripts, but with a live, empty DB.
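For reference, a minimal T-SQL sketch of that approach (all database, file, and path names here are hypothetical, and the logical names in the MOVE clauses must match the template's):
-- Back up the empty template once...
BACKUP DATABASE TemplateDB TO DISK = N'C:\Backups\TemplateDB.bak';
-- ...then restore it under each new customer's name
RESTORE DATABASE Customer1DB FROM DISK = N'C:\Backups\TemplateDB.bak'
WITH MOVE 'TemplateDB' TO N'C:\Data\Customer1DB.mdf',
     MOVE 'TemplateDB_log' TO N'C:\Data\Customer1DB_log.ldf';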

Hmm, well, it seems that the best answer so far (given the overwhelming response) is to edit the file after the fact... Still looking.


Why does R upload data much faster than KNIME or Workbench?

What I want to know is: what the heck happens under the hood when I upload data through R that makes it so much faster than MySQL Workbench or KNIME?
I work with data and, every day, I upload data into a MySQL server. I used to upload data using KNIME, since it was much faster than uploading with MySQL Workbench (select the table -> "import data").
Some info: the CSV has 4000 rows and 15 columns. The library I used in R is RMySQL. The node I used in KNIME is Database Writer.
library('RMySQL')
df <- read.csv('C:/Users/my_user/Documents/file.csv', encoding = 'UTF-8', sep = ';')
connection <- dbConnect(
  RMySQL::MySQL(),
  dbname = "db_name",
  host = "yyy.xxxxxxx.com",
  user = "vitor",
  password = "****"
)
dbWriteTable(connection, "table_name", df, append = TRUE, row.names = FALSE)
So, to test, I did the exact same process, using the same file. It took 2 minutes in KNIME and only seconds in R.
Everything happens under the hood! Upload speed depends on several parameters: the interface between the DB and the tool, network connectivity, the batch size used, the memory available to the tool, the tool's own data-processing speed, and probably more. In your case, the RMySQL package uses a batch size of 500 by default, while KNIME uses only 1, so that is probably where the difference comes from. Try setting it to 500 in KNIME and then compare. I have no clue how MySQL Workbench works...
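To see why batch size dominates here, compare one network round trip per row with one round trip for many rows. This is schematic, with made-up table and column names; in practice the driver batches parameterized statements rather than building literal multi-row SQL:
-- Batch size 1: one round trip per row
INSERT INTO table_name (id, val) VALUES (1, 'a');
INSERT INTO table_name (id, val) VALUES (2, 'b');
-- Batch size 500: up to 500 rows per round trip, amortizing the latency
INSERT INTO table_name (id, val) VALUES (1, 'a'), (2, 'b'), (3, 'c');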

jooq specify database runtime

I have the exact same database definition for multiple databases (and database servers). How do I tell jOOQ to use the same database as the Connection I created to connect to the DB?
Example (for MySQL):
jdbc:mysql://localhost:3306/tsm - my development DB ( tsm ), used to generate code
jdbc:mysql://RemoteAmazonDBHost:3306/customer1 - one of my customers
jdbc:mysql://RemoteAmazonDBHost:3306/customer2 - Another customer
All 3 databases have the same definition, the same tables, indexes, etc. The TSM one is the standard our application uses.
Maybe I should be using DSL.using(Connection, Settings) instead of DSL.using(Connection)? Is that what the manual is implying here?
If I only have one "input" schema, do I have to specify it? In other words, can I do something like this:
Settings settings = new Settings()
    .withRenderMapping(new RenderMapping()
        .withSchemata(
            new MappedSchema().withOutput(
                databaseInfo.getProperties().getProperty("database.db"))));
Or do I have to do something like this:
Settings settings = new Settings()
    .withRenderMapping(new RenderMapping()
        .withSchemata(
            new MappedSchema().withInput("TSM")
                .withOutput(databaseInfo.getProperties().getProperty("database.db"))));
I'm assuming you're using code generation. In that case, it might be easiest not to generate the schema at all, but to use <outputSchemaToDefault> in the code generation configuration, e.g.
<configuration>
  <generator>
    <database>
      <inputSchema>your_codegen_input_schema_here</inputSchema>
      <outputSchemaToDefault>true</outputSchemaToDefault>
    </database>
  </generator>
</configuration>
See the manual for details: https://www.jooq.org/doc/latest/manual/code-generation/codegen-advanced/codegen-config-database/codegen-database-catalog-and-schema-mapping/
If you want to keep your generated code with schema qualification and map things at runtime, then your second attempt seems correct. Pass this to your Configuration (i.e. the DSL.using() call):
Settings settings = new Settings()
    .withRenderMapping(new RenderMapping()
        .withSchemata(new MappedSchema()
            .withInput("TSM")
            .withOutput(databaseInfo.getProperties().getProperty("database.db"))));
More details can be found here: https://www.jooq.org/doc/latest/manual/sql-building/dsl-context/custom-settings/settings-render-mapping
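For illustration, wiring those settings into a DSLContext could look like this. A minimal sketch: the MYSQL dialect is inferred from the question's JDBC URLs, and dslFor/customerSchema are hypothetical names, not jOOQ API:
import java.sql.Connection;
import org.jooq.DSLContext;
import org.jooq.SQLDialect;
import org.jooq.conf.MappedSchema;
import org.jooq.conf.RenderMapping;
import org.jooq.conf.Settings;
import org.jooq.impl.DSL;

public class PerCustomerDsl {
    // Render queries written against the generated "TSM" schema onto the customer's schema
    public static DSLContext dslFor(Connection connection, String customerSchema) {
        Settings settings = new Settings()
            .withRenderMapping(new RenderMapping()
                .withSchemata(new MappedSchema()
                    .withInput("TSM")
                    .withOutput(customerSchema)));
        return DSL.using(connection, SQLDialect.MYSQL, settings);
    }
}
Queries built from the returned DSLContext then render the mapped schema name (customer1, customer2, ...) instead of TSM.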

Shapefile to MDB with custom field structure [duplicate]

This question already has an answer here: How to create table in mdb from dbf query (1 answer). Closed 6 years ago.
I have a Shapefile with 80,000 polygons that are grouped by a specific field called "OTA".
I want to convert each Shapefile (its attribute table) to an mdb database (not a Personal Geodatabase) with one table in it, with the same name as the Shapefile and a given field structure.
In the code I used, I had to load two new modules in Python: pypyodbc and adodbapi.
The first module is used to create the mdb file for each shapefile, and the second to create the table in the mdb and fill it with the data from the shapefile's attribute table.
The code I came up with is the following:
import arcpy    # available inside the ArcGIS Python 2 environment
import pypyodbc
import adodbapi

Folder = ur'C:\TestPO'          # Folder to save the mdbs
FD = Folder + ur'\27ALLPO.shp'  # Shapefile
Map = u'PO'                     # Map type
N = u'27'                       # Prefecture

OTAList = sorted(set([row[0] for row in arcpy.da.SearchCursor(FD, ('OTA'))]))
fields = ['FID', 'AREA', 'PERIMETER', 'KA_PO', 'NOMOS', 'OTA', 'KATHGORPO',
          'KATHGORAL1', 'KATHGORAL2', 'LABEL_PO', 'PHOTO_45', 'PHOTO_60',
          'PHOTO_PO', 'POLY_X_CO', 'POLY_Y_CO', 'PINAKOKXE', 'LANDTYPE']
cnt = 0
for OTAvalue in OTAList:
    cnt += 1
    dbname = N + OTAvalue + Map
    pypyodbc.win_create_mdb(Folder + '\\' + dbname + '.mdb')
    conn_str = r"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Folder + "\\" + dbname + ur".mdb;"
    conn = adodbapi.connect(conn_str)
    crsr = conn.cursor()
    SQL = ("CREATE TABLE [" + dbname + "] ([FID] INT, [AREA] FLOAT, [PERIMETER] FLOAT, "
           "[KA_PO] VARCHAR(10), [NOMOS] VARCHAR(2), [OTA] VARCHAR(3), [KATHGORPO] VARCHAR(2), "
           "[KATHGORAL1] VARCHAR(2), [KATHGORAL2] VARCHAR(2), [LABEL_PO] VARCHAR(8), "
           "[PHOTO_45] VARCHAR(14), [PHOTO_60] VARCHAR(10), [PHOTO_PO] VARCHAR(8), "
           "[POLY_X_CO] DECIMAL(10,3), [POLY_Y_CO] DECIMAL(10,3), [PINAKOKXE] VARCHAR(11), "
           "[LANDTYPE] DECIMAL(2,0));")
    crsr.execute(SQL)
    conn.commit()
    with arcpy.da.SearchCursor(FD, fields, '"OTA" = \'{}\''.format(OTAvalue)) as cur:
        for row in cur:
            crsr.execute("INSERT INTO " + dbname + " VALUES ("
                         + str(row[0]) + "," + str(row[1]) + "," + str(row[2]) + ",'"
                         + row[3] + "','" + row[4] + "','" + row[5] + "','" + row[6] + "','"
                         + row[7] + "','" + row[8] + "','" + row[9] + "','" + row[10] + "','"
                         + row[11] + "','" + row[12] + "'," + str(row[13]) + "," + str(row[14])
                         + ",'" + row[15] + "'," + str(row[16]) + ");")
    conn.commit()
    crsr.close()
    conn.close()
    print(u'«' + OTAvalue + u'» (' + str(cnt) + u'/' + str(len(OTAList)) + u')')
Executing this code took about 5 minutes to complete the task for about 140 mdbs.
As you can see, I execute an INSERT INTO statement for each record of the shapefile.
Is this the correct (and probably fastest) way, or should I collect all the statements for each "OTA" and execute them together?
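For what it's worth, since adodbapi follows the DB-API 2.0 spec, one batched alternative would be a parameterized executemany per OTA instead of one concatenated INSERT per record. A sketch only, reusing dbname, fields, FD, OTAvalue, crsr, and conn from the code above, and assuming the driver accepts qmark placeholders for all 17 columns:
insert_sql = "INSERT INTO [" + dbname + "] VALUES (" + ",".join(["?"] * 17) + ");"
with arcpy.da.SearchCursor(FD, fields, '"OTA" = \'{}\''.format(OTAvalue)) as cur:
    rows = [tuple(row) for row in cur]
crsr.executemany(insert_sql, rows)  # one batched call instead of one execute per row
conn.commit()
This also sidesteps the quoting problems that string concatenation invites.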
I don't think anyone's going to write your code for you, but if you try some VBA yourself and tell us what happened, what worked, and where you're stuck, you'll get a great response.
That said: to start with, I don't see any reason to use VB6 when you can use VBA right inside your mdb file.
Use the Dir function (and possibly FileSystemObject) to loop through all the DBFs in a given folder, or use the FileDialog object to select multiple files in one go.
Then process each file with DoCmd.TransferDatabase:
DoCmd.TransferDatabase _
    TransferType:=acImport, _
    DatabaseType:="dBASE III", _
    DatabaseName:="your-dbf-filepath", _
    ObjectType:=acTable, _
    Source:="Source", _
    Destination:="your-newtbldbf"
Finally, process each dbf import with a make-table query.
Look at the results and see what might have to be changed based on field types before and after.
Then... edit your post and let us know how it went.
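To make the Dir suggestion concrete, here is a minimal sketch. The folder path and table-naming convention are hypothetical; the argument pattern follows the documented dBASE import usage, where DatabaseName is the folder and Source is the file:
Sub ImportAllDbfs()
    Dim f As String
    f = Dir("C:\data\*.dbf")    ' hypothetical folder of DBFs
    Do While Len(f) > 0
        DoCmd.TransferDatabase acImport, "dBASE III", "C:\data", _
            acTable, f, Left(f, Len(f) - 4)    ' name the table after the file
        f = Dir    ' next match
    Loop
End Sub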
In theory, you could do something like this by searching the directory the DBF files reside in, writing those filenames to a table, then looping through the table and, for each filename, scanning the DBF for tables and their field names/datatypes and creating those tables in your MDB. You could also bring in all the data from the tables, all within a series of loops.
In theory, you could.
In practice, you can't. And you can't because DBF and MDB support different data types that aren't compatible.
I suppose you could create a "crosswalk" table such that for each datatype in DBF there is a corresponding, hand-picked datatype in MDB and use that when you're creating the table, but it's probably going to either fail to import some of the data or import corrupted data. And that's assuming you can open a DBF for reading the same way you can open an MDB for reading. Can you run OpenDatabase on a DBF from inside Access? I don't even have the answer to that.
I wouldn't recommend that you do this. The reason that you're doing it is because you want to keep the structure as similar as possible when migrating from dBase/FoxBase to Access. However, the file structure is different between them.
As you are aware, each .DBF ("Database file") file is a table, and the folder or directory in which the .DBF files reside constitutes the "database". With Access, all the tables in one database are in one .MDB ("Microsoft Database") file.
If you try to put each .DBF file in a separate .MDB file, you will have no end of trouble getting the .MDB files to interact. Access treats different .MDB files as different databases, not different tables in the same database, and you will have to do strange things like link all the separate databases just to have basic relational functionality. (I tried this about 25 years ago with Paradox files, which are also a one-file-per-table structure. It didn't take me long to decide it was easier to get used to the one-file-per-database concept.) Do yourself a favor, and migrate all of the .DBF files in one folder into a single .MDB file.
As for what you ought to do with your code, I'd first suggest that you use ADO rather than DAO. But if you want to stick with DAO because you've been using it, then you need to have one connection to the dBase file and another to the Access database. As far as I can tell, you don't have the dBase connection. I've never tried what you're doing before, but I doubt you can use a SQL statement to select directly from a .dbf file in the way you're doing. (I could be wrong, though; Microsoft has come up with stranger things over the years.)

Couchbase lite rename database

How can I rename an existing Couchbase Lite database? I'd like to avoid creating a new db and having to copy the contents over, if possible. Is this the same as replicating to a local database that has the same name?
Since you map your local database to a remote Couchbase bucket using the CBLReplication classes, I can't see any reason why renaming the database shouldn't work.
Just make sure you close your CBLDatabase first:
database.close()
And use NSFileManager to move the database mydb.cblite within the CBLManager.sharedInstance().directory as you like:
let databaseManager = CBLManager.sharedInstance()
let fileManager = NSFileManager.defaultManager()
let sourceName = "\(database.name).cblite"
let targetName = "my-new-name.cblite"
let directory = NSURL(fileURLWithPath: databaseManager.directory, isDirectory: true)
if let sourcePath = directory.URLByAppendingPathComponent(sourceName).path,
   targetPath = directory.URLByAppendingPathComponent(targetName).path {
    do {
        try fileManager.moveItemAtPath(sourcePath, toPath: targetPath)
    } catch {
        print("Could not move database: \(error)")
    }
}
And afterwards reopen the connection to the database, with the same CBLReplication settings as before:
let database = databaseManager.databaseNamed("my-new-name")
Please keep in mind that you might also need to move the mydb attachments directory first, and, after reopening the connection, make sure all of your code is using the new reference for querying etc.
PS: The code used here is written from memory, so it may not work copied 1:1, but you should get the idea.
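Following the same pattern, moving the attachments directory could look roughly like this. The "<name> attachments" folder naming is an assumption based on Couchbase Lite 1.x's on-disk layout, so double-check it on your install:
let attachmentsSource = directory.URLByAppendingPathComponent("\(database.name) attachments")
let attachmentsTarget = directory.URLByAppendingPathComponent("my-new-name attachments")
if let src = attachmentsSource.path, dst = attachmentsTarget.path {
    do {
        try fileManager.moveItemAtPath(src, toPath: dst)    // assumed folder name
    } catch {
        print("Could not move attachments: \(error)")
    }
}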

Most effective way to push data from a SQL Server database into a Greenplum database?

Greenplum Database version:
PostgreSQL 8.2.15 (Greenplum Database 4.2.3.0 build 1)
SQL Server Database version:
Microsoft SQL Server 2008 R2 (SP1)
Our current approach:
1) Export each table to a flat file from SQL Server
2) Load the data into Greenplum with pgAdmin III using PSQL Console's psql.exe utility
Benefits...
Speed: OK, but is there anything faster? We load millions of rows of data in minutes
Automation: OK, we call this utility from an SSIS package using a Shell script in VB
Pitfalls...
Reliability: ETL is dependent on the file server to hold the flat files
Security: Lots of potentially sensitive data on the file server
Error handling: it's a problem. psql.exe never raises an error that we can catch, even when it errors out and loads no data or only a partial file
What else we have tried...
.Net Providers\Odbc Data Provider: we configured a System DSN using the DataDirect 6.0 Greenplum Wire Protocol driver. Good performance for a DELETE; dog-awful slow for an INSERT.
For reference, this is the aforementioned VB script in SSIS...
Public Sub Main()
    Dim v_shell
    Dim v_psql As String
    ' Embedded quotes must be doubled inside a VB string literal
    v_psql = """C:\Program Files\pgAdmin III\1.10\psql.exe"" -d ""MyGPDatabase"" -h ""MyGPHost"" -p ""5432"" -U ""MyServiceAccount"" -f \\MyFileLocation\SSIS_load\sql_files\load_MyTable.sql"
    v_shell = Shell(v_psql, AppWinStyle.NormalFocus, True)
End Sub
This is the contents of the "load_MyTable.sql" file...
\copy MyTable from '\\MyFileLocation\SSIS_load\txt_files\MyTable.txt' with delimiter as ';' csv header quote as '"'
If you're getting your data load done in minutes, then the current method is probably good enough. However, if you find yourself having to load larger volumes of data (terabyte scale, for instance), the usual preferred method for bulk loading into Greenplum is via gpfdist and corresponding EXTERNAL TABLE definitions. gpload is a decent wrapper that abstracts away much of this process and is driven by YAML control files.
The general idea is that gpfdist instance(s) are spun up at the location(s) where your data is staged, preferably as CSV text files, and then the EXTERNAL TABLE definition within Greenplum is made aware of the URIs of the gpfdist instances. From the admin guide, a sample definition of such an external table could look like this:
CREATE READABLE EXTERNAL TABLE students (
    name varchar(20), address varchar(30), age int)
LOCATION ('gpfdist://<host>:<portNum>/file/path/')
FORMAT 'CUSTOM' (formatter=fixedwidth_in,
                 name=20, address=30, age=4,
                 preserve_blanks='on', null='NULL');
The above example expects to read text files whose fields from left to right are a 20-character (at most) string, a 30-character string, and an integer. To actually load this data into a staging table inside GP:
CREATE TABLE staging_table AS SELECT * FROM students;
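Since gpload was mentioned above, a minimal control file might look roughly like this. The database, hosts, port, file path, and table names are all hypothetical, and the CSV options mirror the question's semicolon-delimited files with headers; check the gpload reference for the full option set:
VERSION: 1.0.0.1
DATABASE: mydb
USER: gpadmin
HOST: mdw
PORT: 5432
GPLOAD:
   INPUT:
    - SOURCE:
         LOCAL_HOSTNAME:
           - etl1
         PORT: 8081
         FILE:
           - /data/staging/MyTable.txt
    - FORMAT: csv
    - DELIMITER: ';'
    - HEADER: true
    - QUOTE: '"'
   OUTPUT:
    - TABLE: public.mytable
    - MODE: INSERT
Running gpload -f against a file like this spins up gpfdist, creates the external table, and performs the parallel load in one step.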
For large volumes of data, this should be the most efficient method, since all segment hosts are engaged in the parallel load. Do keep in mind that the simplistic approach above will probably result in a randomly distributed table, which may not be desirable. You'd have to customize your table definition to specify a distribution key.
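For instance, the staging-table step could name a distribution key directly (choosing name here is purely illustrative):
-- Distribute explicitly instead of accepting the random default
CREATE TABLE staging_table AS
SELECT * FROM students
DISTRIBUTED BY (name);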