I have an SSIS Transformation Task that I use as my final destination task to insert data into a SQL Server table. The reason I use the transformation task and not a SQL Server Destination task is that I do not know beforehand what columns will be in the table that we will be inserting into.
In a Foreach Loop container, I am looking for Access DBs (in 97 format). The rest of the control flow basically creates a new SQL database and also a table. The Access files are what we call "minute" databases: they contain minute information gathered by another process. I need to create a new SQL DB named after the 'minute' DB and a table called 'MINUTE' with the columns created based on certain info from the Access DB. For each of our clients, the number of parameters they have at their site determines the number of columns I need to create in the SQL MINUTE table.
In the data flow I have two key components: The OLE DB source component (Source - Minute Table) and the Script Transformation task (Destination - Minute Table).
The "Source - Minute Table" gets the data from the access database. The "Destination - Minute Table" transforms the data and inserts it into the appropriate DB and table.
Everything works as it should. I tested it on a DB with 491,000+ records and it took 1 minute. However, I'm testing with one of our larger customers that has over 50 parameters and the access database contains 2+ million records. The package flies until I reach around 477,000 records, and then it pretty much comes to a halt. I can wait 10 minutes, and even longer, until the record count updates, and then continue to wait again.
I've done a lot of research and followed all of the recommendations and guidelines that I have found. My data source is not sorted. I use a SQL command instead of a table in the OLE DB Source, etc. I've changed the values of DefaultBufferMaxRows and DefaultBufferSize many times and get the same results.
Code:
Public Class ScriptMain
Inherits UserComponent
Private conn As SqlConnection
Private cmd As SqlCommand
Private DBName As SqlParameter
Private columnsForInsert As SqlParameter
Private tableValues As SqlParameter
Private numberOfParams As Integer
Private db As String
Private folderPath As String
Private dbConn As String
Private folder As String
Private columnParamIndex As Integer
Private columnDate As DateTime
Private columnMinValue As Double
Private columnStatus As String
Private columnCnt1 As Int16
Private dateAdded As Boolean = False
Private columnStatusCnt As String
Private columnsConstructed As Boolean = False
Private buildValues As StringBuilder
Private columnValues As StringBuilder
Private i As Integer = 0
'This method is called once, before rows begin to be processed in the data flow.
'
'You can remove this method if you don't need to do anything here.
Public Overrides Sub PreExecute()
MyBase.PreExecute()
Try
'Dim dbConnection As String = "Server=(local)\SQLExpress;Database=DataConversion;User ID=sa;Password=sa123;"
'conn = New SqlConnection(dbConnection)
'conn.Open()
'cmd = New SqlCommand("dbo.InsertValues", conn) With {.CommandType = CommandType.StoredProcedure}
'columnsForInsert = New SqlParameter("@Columns", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
'cmd.Parameters.Add(columnsForInsert)
'DBName = New SqlParameter("@DBName", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
'cmd.Parameters.Add(DBName)
'tableValues = New SqlParameter("@Values", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
'cmd.Parameters.Add(tableValues)
db = Variables.varMinFileName.ToString
folder = Variables.varMinFolderName.ToString
folderPath = folder & "\" & db & ".mdb"
dbConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & folderPath
Using SourceDataAdapter As OleDbDataAdapter = New OleDbDataAdapter("SELECT DISTINCT PARAM_INDEX FROM [MINUTE];", dbConn)
Dim SourceDatatable As New DataTable
SourceDataAdapter.Fill(SourceDatatable)
numberOfParams = SourceDatatable.Rows.Count
End Using
'columnValues.Append("dtmTime, ")
buildValues = New StringBuilder
columnValues = New StringBuilder
columnValues.Append("dtmTime, ")
Catch ex As Exception
Dim writer As New StreamWriter("C:\MinuteLog.log", True, System.Text.Encoding.ASCII)
writer.WriteLine(ex.Message)
writer.Close()
writer.Dispose()
Finally
End Try
End Sub
' This method is called after all the rows have passed through this component.
'
' You can delete this method if you don't need to do anything here.
Public Overrides Sub PostExecute()
MyBase.PostExecute()
'
' Add your code here
'
buildValues = Nothing
columnValues = Nothing
End Sub
Public Overrides Sub Input0_ProcessInput(Buffer As Input0Buffer)
While Buffer.NextRow()
Input0_ProcessInputRow(Buffer)
End While
End Sub
'This method is called once for every row that passes through the component from Input0.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim column As IDTSInputColumn100
Dim rowType As Type = Row.GetType()
Dim columnValue As PropertyInfo
Dim result As Object
Dim rtnValue As String = Variables.varMinFileName.Replace("_", "")
Dim colName As String
Try
For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
columnValue = rowType.GetProperty(column.Name)
colName = column.Name.ToString
If Not colName.Contains("NULL") Then
'If Not columnValue Is Nothing Then
Select Case column.Name.ToString
Case "PARAM_INDEX"
'result = columnValue.GetValue(Row, Nothing)
result = Row.PARAMINDEX
columnParamIndex = CType(result, Byte)
If columnsConstructed = False And i <= numberOfParams - 1 Then
columnValues.Append(String.Format("VALUE_{0}, STATUS_{0}, ", columnParamIndex.ToString))
End If
Exit Select
Case "dtmTIME"
'result = columnValue.GetValue(Row, Nothing)
result = Row.dtmTIME
columnDate = CType(result, DateTime)
If dateAdded = False Then ' only need to add once since rows are vertical
buildValues.Append("'" & columnDate & "', ")
dateAdded = True
End If
Exit Select
Case "MIN_VALUE"
'result = columnValue.GetValue(Row, Nothing)
result = Row.MINVALUE
columnMinValue = CType(result, Double)
buildValues.Append(columnMinValue & ", ")
Exit Select
Case "MIN_STATUS"
'result = columnValue.GetValue(Row, Nothing)
result = Row.MINSTATUS
columnStatus = CType(result, String)
Exit Select
Case "MIN_CNT_1"
'result = columnValue.GetValue(Row, Nothing)
result = Row.MINCNT1
columnCnt1 = CType(result, Byte)
columnStatusCnt = columnStatus & "010" & columnCnt1.ToString.PadLeft(5, "0"c) & "-----"
buildValues.Append("'" & columnStatusCnt & "', ")
Case Else
Exit Select
End Select
'End If
End If
Next
If i = numberOfParams - 1 Then
If columnsConstructed = False Then
columnValues.Remove(columnValues.Length - 2, 1)
End If
buildValues.Remove(buildValues.Length - 2, 1)
Dim valueResult As String = buildValues.ToString()
SetStoredProc()
cmd.Parameters("@Columns").Value = columnValues.ToString
cmd.Parameters("@DBName").Value = "[" & rtnValue & "].[dbo].[MINUTE]"
cmd.Parameters("@Values").Value = valueResult
cmd.ExecuteNonQuery()
buildValues.Clear()
columnsConstructed = True
dateAdded = False
columnParamIndex = 0
columnMinValue = 0
columnStatus = String.Empty
columnCnt1 = 0
i = 0
conn.Close()
conn.Dispose()
Else
i += 1
End If
Catch ex As Exception
Dim writer As New StreamWriter("C:\MinuteLog.log", True, System.Text.Encoding.ASCII)
writer.WriteLine(ex.Message)
writer.Close()
writer.Dispose()
Finally
'buildValues = Nothing
'columnValues = Nothing
End Try
End Sub
Private Sub SetStoredProc()
Try
Dim dbConnection As String = "Server=(local)\SQLExpress;Database=DataConversion;User ID=sa;Password=sa123;"
conn = New SqlConnection(dbConnection)
conn.Open()
cmd = New SqlCommand("dbo.InsertValues", conn) With {.CommandType = CommandType.StoredProcedure}
columnsForInsert = New SqlParameter("@Columns", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
cmd.Parameters.Add(columnsForInsert)
DBName = New SqlParameter("@DBName", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
cmd.Parameters.Add(DBName)
tableValues = New SqlParameter("@Values", SqlDbType.VarChar, -1) With {.Direction = ParameterDirection.Input}
cmd.Parameters.Add(tableValues)
Catch ex As Exception
Dim writer As New StreamWriter("C:\MinuteLog.log", True, System.Text.Encoding.ASCII)
writer.WriteLine(ex.Message)
writer.Close()
writer.Dispose()
End Try
End Sub
End Class
Since I can't upload images yet, I've included a link to a blog post I created that includes ample screenshots to help understand the problem mentioned here:
SSIS slows down during transformation task
Any help in determining why my package slows after 400k records and doesn't process all 2+ million records in a reasonable time is much appreciated!
Thanks,
Jimmy
This probably isn't terribly helpful, but my guess is you are running out of memory. If SSIS has to page, you've had it, in my experience.
Can you batch up the work somehow in several smaller runs perhaps?
The full solution can be viewed here on my blog with screenshots - SSIS slowdown solved
To get around SSIS slowing down when a large number of records are transformed and inserted into SQL Server as the destination, I redesigned my SSIS package. Instead of doing an insert in a data transformation task for every record that comes through the buffer, I've eliminated that and used a stored procedure to do a bulk insert. To accomplish this, I read the data from each Access DB into a table called "MINUTE" in my SQL Server instance. This minute table has the same schema as the Access DBs, and I let SSIS do the heavy lifting of importing all the data into it. After the data is imported, I execute my stored procedure, which transforms the data in this minute table (horizontal records) and does a bulk insert into my new destination MINUTE SQL table (one vertical record).
The stored procedure that does the bulk insert and transforms the data looks like this:
CREATE PROCEDURE [dbo].[InsertMinuteBulk]
-- Add the parameters for the stored procedure here
(@Columns varchar(MAX), @DBName varchar(4000))
AS
BEGIN
DECLARE @SQL varchar(MAX)
SET @SQL = ';WITH Base AS (
SELECT dtmTime,
param_index,
CONVERT(nvarchar(16), MIN_VALUE) AS [VALUE_],
CONVERT(nvarchar(3), MIN_STATUS) + ''000'' + LEFT(replicate(''0'', 5) + CONVERT(nvarchar(5), MIN_CNT_1), 5) + ''-----'' AS [STATUS_]
FROM [DataConversion].[dbo].[MINUTE]
)
,norm AS (
SELECT dtmTime, ColName + CONVERT(varchar, param_index) AS ColName, ColValue
FROM Base
UNPIVOT (ColValue FOR ColName IN ([VALUE_], [STATUS_])) AS pvt
)
INSERT INTO ' + @DBName + '
SELECT *
FROM norm
PIVOT (MIN(ColValue) FOR ColName IN (' + @Columns + ')) AS pvt'
EXEC (@SQL);
END
In the Data Flow task, the "Minute Data Source" is an ADO.NET data source and feeds the data into my SQL Server destination, "Minute Data Destination".
In the Control Flow, the final task of "Bulk Insert Minute Data" executes the Bulk Insert stored procedure.
The package now runs uninterrupted and is pretty fast considering the size of the data that I'm reading, transforming and inserting.
I've run the package as an SSIS job and it took 38 minutes to complete converting 7 months' worth of minute data (7 'minute' Access DBs), with over 2 million rows in each Access DB.
Is there a way to make populating ListBox fast, because the UI is freezing on form load upon populating the ListBox?
This is my form load code:
Dim abc As String = itemCount()
Dim output = Account_Get(a)
For Each s In output
ListBox1.Items.Add(s)
count1 += 1
If count1 = abc Then
ListBox1.Visible = True
End If
Next
This is the query in module:
Public Function Account_Get(ByVal chk As String) As List(Of String)
Dim result = New List(Of String)()
Try
cn.Open()
sql = "select column_name as str from table where status = 'New' order by rand()"
cmd = New MySqlCommand(sql, cn)
dr = cmd.ExecuteReader
While dr.Read
result.Add(dr("str").ToString())
End While
Return result
Catch ex As Exception
MsgErr(ex.Message, "Error Encounter")
Return Nothing
Finally
cn.Close()
End Try
End Function
This is working fine, but it loads too much data and the UI freezes on load. Hoping someone could help me with this. Thanks!
Since you are incrementing count1 I assume it is some sort of number. However, you are then comparing it to a string in the If statement. Please use Option Strict.
I changed the function to return an array of String, took the random sort from the SQL statement, and moved it to a little LINQ at the end of the function.
You could add a Stopwatch to the data retrieval and the display sections to see where your bottleneck is. BeginUpdate and EndUpdate on the ListBox prevent repainting on every addition.
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
Dim output = Account_Get()
ListBox2.BeginUpdate()
ListBox2.Items.AddRange(output)
ListBox2.EndUpdate()
End Sub
Private Rand As New Random
Public Function Account_Get() As String()
Dim dt As New DataTable
Dim result As String()
Using cn As New MySqlConnection("Your connection string")
Dim Sql = "select column_name as str from table where status = 'New'" 'order by rand()"
Using cmd = New MySqlCommand(Sql, cn)
Try
cn.Open()
dt.Load(cmd.ExecuteReader)
Catch ex As Exception
MessageBox.Show(ex.Message, "Error Encounter")
Return Nothing
End Try
End Using
End Using
result = (From dRow In dt.AsEnumerable()
Let field = dRow("str").ToString
Order By Rand.Next
Select field).ToArray
Return result
End Function
The query you are using contains a random order. Ordering records randomly can be a huge performance issue within MySQL as it has to go through all records in the table and then sort them randomly. The more records in the table, the bigger the performance penalty. There is also no limitation on the number of records in your query. So if there are thousands of items in your table the listbox will also be thousands of items in size, which could also take a long time.
If you really require the random ordering, you could do something about it in your code. I'm assuming here that you are: 1) using identifiers in your table, and 2) actually wishing to limit the number of items in your listbox rather than display all of them.
Get a grasp of the total number of records in the table by a query
Pick a random number from the range of items in your table
Fetch the nearest record
Hope this helps you to get going to find a solution
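A minimal sketch of the steps above, assuming the table and column names from the question (`r` stands for the random offset you would compute in your application code; this is an illustration, not a verified implementation):

```sql
-- Step 1: get the total number of candidate rows
SELECT COUNT(*) FROM your_table WHERE status = 'New';

-- Step 2 (in application code): pick r = random integer in [0, total)

-- Step 3: fetch just that one row by offset, instead of
-- sorting the whole table with ORDER BY RAND()
SELECT column_name
FROM your_table
WHERE status = 'New'
LIMIT 1 OFFSET 123;   -- 123 = the random offset r from step 2
```

Repeating step 3 with different offsets gives you as many "random" rows as you want to show, without ever paying for a full random sort of the table.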
I've searched around the web looking for samples on how to use MySqlHelper.UpdateDataSet but all I've found is:
Public Shared Sub UpdateDataSet( _
ByVal connectionString As String, _
ByVal commandText As String, _
ByVal ds As DataSet, _
ByVal tablename As String _
)
I'll be grateful if someone will give me:
an example of commandText because I didn't understand what kind of command I need to give;
an explanation of tablename, because I need to know if it is the name of a table of the DB or of the DataSet or both (with the same name);
a vb.net code example (to start testing).
I tried to use the command this way:
Private Sub Btn_Mod_Dataset_Click(sender As Object, e As EventArgs) Handles Btn_Mod_Dataset.Click
Dim SqlStr$ = "SELECT * FROM MyTest.Users"
Using DS_Test As DataSet = DB_Functions.mQuery(SqlStr)
With DS_Test
.Tables(0).TableName = "Users"
Dim User$ = .Tables(0).Rows(0)("UserName").ToString
MsgBox(User)
.Tables(0).Rows(0)("User") = "Upd_Test"
User = .Tables(0).Rows(0)("UserName").ToString
MsgBox(User)
End With
Dim DB_Name = "MyTest"
Dim connectionString$ = "datasource=localhost;username=" + UserDB _
+ ";password=" + Password + ";database=" + DB_Name
MySqlHelper.UpdateDataSet(connectionString, _
"Update MyTest.Users Set UserName = 'Test_Ok' WHERE UserName = 'Steve'", _
DS_Test, "Users")
End Using
End Sub
This gives me
'System.NullReferenceException' in System.Data.dll
EDIT (to explain my code):
a) DB_Functions is a separate class where I've stored some functions to use on a MySQL database. mQuery is a function that extracts the query result into a dataset;
b) 'User' is a field name: I've changed it to 'UserName' but got the same result;
c) The code between With and End With is just a test to see what happens;
NOTE that the code gives an error but my DB is updated as specified in the commandText string. I don't understand what is happening.
This might get you part of the way.
First get rid of DB_Functions. MySQLHelper has a method to create the DataSet for you; in general, db Ops are so query-specific that there is very little that is generic and reusable. The exception to this is building the ConnectionString: MySQL has gobs of cool options you can enable/disable via the connection string. But for that you just need the standard MySqlConnectionStringBuilder.
Build a DataSet:
' form/class level vars
Private dsSample As DataSet
Private MySqlConnStr As String = "..."
...
Dim SQL = "SELECT Id, FirstName, Middle, LastName FROM Employee"
Using dbcon As New MySqlConnection(MySQLConnStr)
dsSample = MySqlHelper.ExecuteDataset(dbcon, SQL)
dsSample.Tables(0).TableName = "Emps"
End Using
There does not appear to be a way to specify a tablename when you build it, so that is a separate step.
Update a Record
To update a single row, you want ExecuteNonQuery; this will also allow you to use Parameters:
Dim uSQL = "UPDATE Employee SET Middle = @p1 WHERE Id = @p2"
Using dbcon As New MySqlConnection(MySQLConnStr)
Dim params(1) As MySqlParameter
params(0) = New MySqlParameter("@p1", MySqlDbType.String)
params(0).Value = "Q"
params(1) = New MySqlParameter("@p2", MySqlDbType.Int32)
params(1).Value = 4583
dbcon.Open()
Dim rows = MySqlHelper.ExecuteNonQuery(dbcon, uSQL, params)
End Using
Again, this is not really any simpler than using a fully configured DataAdapter, which would be simply:
dsSample.Tables("Emps").Rows(1).Item("Middle") = "X"
daSample.Update(dsSample.Tables("Emps"))
I am not exactly sure what value the UpdateDataSet method adds. I think it is the "helper" counterpart for the above, but since it doesn't provide for Parameters, I don't have much use for it. The docs for it are sketchy.
The commandText would appear to be the SQL for a single row. Note that the DataAdapter.Update method above would add any new rows, delete the deleted ones, and update values for any row with changed values - potentially dozens or even hundreds of db ops with one line of code.
My function takes a query string and returns a DataTable, so you can set dataset.Tables with it.
Public Function mysql(ByVal str_query As String) As DataTable
Dim adptr As New MySqlDataAdapter
Dim filltab As New DataTable
Try
Using cnn As New MySqlConnection("server=" & mysql_server & _
";user=" & mysql_user & ";password=" & mysql_password & _
";database=" & mysql_database & ";Allow User Variables=True")
Using cmd As New MySqlCommand(str_query, cnn)
cnn.Open()
adptr = New MySqlDataAdapter(cmd)
adptr.Fill(filltab)
cnn.Close()
End Using
End Using
Catch ex As Exception
'you can log mysql errors into a file here log(ex.ToString)
End Try
Return filltab
End Function
I am trying to convert a MySQL time to a string using VB.NET.
Dim adpt As New MySqlDataAdapter(dbcmdstring, connection)
Dim myDataTable As New DataTable()
adpt.Fill(myDataTable)
DataGridView1.DataSource = myDataTable
DataGridView1.Columns.Remove("ActivityID")
DataGridView1.Columns.Remove("ActivityDate")
DataGridView1.Columns.Remove("UserID")
DataGridView1.Columns(0).HeaderCell.Value = "Name"
DataGridView1.Columns(1).HeaderCell.Value = "Start Time"
DataGridView1.Columns(2).HeaderCell.Value = "End Time"
DataGridView1.Columns.Add("Duration", "Duration")
DataGridView1.RowHeadersVisible = False
Dim duration As New TimeSpan
Dim durationStr As String = ""
Dim i As Integer = 0
For Each row As DataGridViewRow In DataGridView1.Rows
duration = Date.Parse(row.Cells(2).Value.ToString).Subtract(Date.Parse(row.Cells(1).Value.ToString))
durationStr = Math.Round(duration.TotalMinutes).ToString & ":" & Math.Round(duration.TotalSeconds).ToString
row.Cells(3).Value = durationStr
Next
When the date is parsed during the construction of the duration variable, it throws an error:
An unhandled exception of type 'System.NullReferenceException'
occurred in WindowsApplicationSQL.exe
Additional information: Object reference not set to an instance of an
object.
I can successfully parse the date and show it in a message box, but not convert it to a usable string. I have also tried using just the .Value of the time.
Any help?
There is a much easier way to get a Duration, and a much, much easier way.
Part of the problem is this: I am trying to convert a MySQL time to a string using VB.NET. There is no need to convert to a string. If the column is, in fact, a Time() column in MySQL, it has a .NET counterpart: TimeSpan, and it can easily be used to calculate a difference using subtraction.
Much Easier Method
dtLog.Columns.Add(New DataColumn("Duration", GetType(TimeSpan)))
For Each r As DataRow In dtLog.Rows
r("Duration") = r.Field(Of TimeSpan)("EndTime") - r.Field(Of TimeSpan)("StartTime")
Next
If you are using a DataSource, it is rarely a good idea to manipulate the data thru the DataGridView. Note: no strings were needed.
Much, MUCH Easier Method
Perform the operation in SQL:
Dim sql = "SELECT ... StartTime, EndTime, TIMEDIFF(EndTime, StartTime) As Duration FROM ActivityLog"
This will create a Duration column in the DataTable containing the result. Similarly, if you don't want certain columns, you can omit them from the SQL to start with, rather than removing DGV columns.
Object reference not set to an instance of an object.
Finally, this error probably has nothing to do with the conversion of MySQL to VB, strings or TimeSpans. By default, the DGV has that extra row at the bottom - the NewRow for the user to start adding data - but all the cells are Nothing. So:
For Each row As DataGridViewRow In DataGridView1.Rows
This will try to process the NewRow, and poking around its cells will result in an NRE. When you switched to the DataTable, it went away because DataTables don't have the extra row. You still don't need all those gyrations, though.
I tried, instead, doing it via the DataTable rather than the DataGridView and it worked!
myDataTable.Columns.Add("Duration", GetType(String))
myDataTable.Columns.Remove("ActivityID")
myDataTable.Columns.Remove("ActivityDate")
myDataTable.Columns.Remove("UserID")
DataGridView1.DataSource = myDataTable
DataGridView1.RowHeadersVisible = False
DataGridView1.Columns(0).HeaderCell.Value = "Name"
DataGridView1.Columns(1).HeaderCell.Value = "Start Time"
DataGridView1.Columns(2).HeaderCell.Value = "End Time"
Dim duration As New TimeSpan
Dim durationStr As String = ""
Dim i As Integer = 0
For Each row As DataRow In myDataTable.Rows
duration = Date.Parse(row.Item("EndTime").ToString).Subtract(Date.Parse(row.Item("StartTime").ToString))
durationStr = Math.Round(duration.Minutes).ToString & ":" & Math.Round(duration.Seconds).ToString
row.Item("Duration") = durationStr
Next
I'm facing a problem when I want to update data from a local database to the server database, replacing everything that has been modified in the local database. I know it might be simple, but I have no idea how to do this, so any help will be appreciated.
In my situation, I want to use a button to upload all modified data to
the server database. Now I'm just using 2 databases at same server to do
testing.
Private Sub btnUp_Click(sender As System.Object, e As System.EventArgs) Handles btnUp.Click
localconn.ConnectionString = lctext
serverconn.ConnectionString = sctext
Try
localconn.Open()
serverconn.Open()
Dim localcmd As New OdbcCommand("select a.acc_id as localid, a.acc_brcid, a.smartcardid, a.acc_created, a.acc_modified as localmodified, b.acd_firstname, b.acd_ic, b.acd_oldic, b.acd_race, b.acd_dob, b.acd_rescity, b.acd_resaddr1, b.acd_telmobile, b.acd_email, b.acd_telwork, b.acd_modified, b.acd_accid from nsk_account a inner join nsk_accountdetail b on a.acc_id = b.acd_accid", localconn)
Dim servercmd As New OdbcCommand("select c.acc_id, c.acc_brcid, a.smartcardid, c.acc_created, c.acc_modified, d.acd_firstname, d.acd_ic, d.acd_oldic, d.acd_race, d.acd_dob, d.acd_rescity, d.acd_resaddr1, d.acd_telmobile, d.acd_email, d.acd_telwork, d.acd_modified, d.acd_accid from nsk_account c inner join nsk_accountdetail d on c.acc_id = d.acd_accid", serverconn)
localcmd.CommandType = CommandType.Text
Dim rdr As OdbcDataReader = localcmd.ExecuteReader()
Dim thedatatable As DataTable = rdr.GetSchemaTable()
'localcmd.Parameters.Add("@localid", OdbcType.Int, "a.acc_id")
'localcmd.Parameters.Add("@localmodified", OdbcType.DateTime, "b.acd_modified")
Dim localid As String
Dim localmodi As String
localcmd.Parameters.AddWithValue("localid", localid)
localcmd.Parameters.AddWithValue("localmodified", localmodi)
For Each localid In thedatatable.Rows
Dim calldata As New OdbcCommand("SELECT acc_modified from nsk_account where acc_id ='" + localid + "'", serverconn)
Dim reader As OdbcDataReader = calldata.ExecuteReader
txtSDate.Text = reader("acc_modified").ToString
If localmodi <= txtSDate.Text Then
'do nothing, proceed to next data
Else
Dim ACCoverwrite As New OdbcCommand("Update nsk_account SET smartcardid = @mykad, acc_created = @created, acc_modified = @modify WHERE acc_id ='" + localid + "'", serverconn)
Dim DEToverwrite As New OdbcCommand("Update nsk_accountdetail SET acd_firstname = @name, acd_ic = @newic, acd_oldic = @oldic, acd_race = @race, acd_dob = @dob, acd_rescity = @city, acd_resaddr1 = @address, acd_telmobile = @phone, acd_email = @email, acd_telwork = @language, acd_modified = @detmodify WHERE acd_accid ='" + localid + "'", serverconn)
ACCoverwrite.ExecuteNonQuery()
DEToverwrite.ExecuteNonQuery()
End If
Next
MessageBox.Show("Upload success", "Error", MessageBoxButtons.OK, MessageBoxIcon.Warning)
Catch ex As Exception
MessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Warning)
Finally
localconn.Close()
serverconn.Close()
End Try
End Sub
Any comment or suggestion will be appreciated.
I hope you mean table by table. I didn't read your code much, but you've got the idea - you need 2 connections, but there are 2 distinct ways of doing it.
Way #1 - you can use this when the amount of data is (how to say it better?) not huge. You can load a DataTable object with data from the server and update changed records. You can use a DataAdapter and call Update - all changed/new rows will be written to the server.
NOTE: you need a mechanism that can reliably tell which rows are new and which are modified in your local DB. Are you OK if your PK in the local DB is different than on the server? You need to answer these questions. Maybe you need a special mechanism for the PK locally. For example, add rows using negative PK integers, which will tell you that these rows are new. And use a "ModifiedDate", which together with the PK will tell you if the row needs updating.
Way #2 - use anytime, even with a larger amount of data. Take a local row and examine it. If it is new - insert; if it exists and "DateModified" has changed - do an update. There are variations of how to do it. You can use the SQL MERGE statement, etc.
But these are two major ways - direct row insert/update and disconnected update/mass commit.
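For the MERGE variant of Way #2, a rough T-SQL sketch might look like the following. This is an illustration only: the ServerDB/LocalDB names and the column list are assumptions based on the question's queries, and MERGE is T-SQL, so it applies only if the server side is SQL Server:

```sql
MERGE INTO ServerDB.dbo.nsk_account AS tgt
USING LocalDB.dbo.nsk_account AS src
      ON tgt.acc_id = src.acc_id
-- existing row whose local copy is newer: overwrite the server values
WHEN MATCHED AND src.acc_modified > tgt.acc_modified THEN
    UPDATE SET tgt.smartcardid  = src.smartcardid,
               tgt.acc_created  = src.acc_created,
               tgt.acc_modified = src.acc_modified
-- row exists locally but not on the server: insert it
WHEN NOT MATCHED BY TARGET THEN
    INSERT (acc_id, acc_brcid, smartcardid, acc_created, acc_modified)
    VALUES (src.acc_id, src.acc_brcid, src.smartcardid, src.acc_created, src.acc_modified);
```

One statement handles both the "new row" and "modified row" cases, which replaces the row-by-row reader loop entirely.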
Also, you can do it in bulk, using a transaction - update some rows, commit, and start a new transaction. This will help if the application is being used while you are updating it.
I hope these ideas help. If you do what you do, where you have
For Each localid In thedatatable.Rows
I am not sure what localid is. It should be
' prepare command before loop
sql = "Select * From Table where ID = @1"
' you will create parameter for @1 with value coming from
' row("ID")
Dim cmd As New .....
cmd.Parameters.Add(. . . . )
For Each row As DataRow In thedatatable.Rows
cmd.Parameters(0).Value = row("ID") ' prepare command upfront and only change the value
using reader as IDataReader = cmd.ExecuteReader(. . . . )
If Not reader.Read() Then
' This row is not found in DB - do appropriate action
Continue For
Else
' here check if the date matches and issue update
' Better yet - fill some object
End if
end using
' if you fill object with data from your row -here you can verify if
' update needed and issue it
. . . . . .
Next
I have a CSV file with 550,000+ rows.
I need to import this data into Access, but when I try, it throws an error that the file is too large (1.7 GB).
Can you recommend a way to get this file into Access?
Try linking instead of importing ("get external data" -> "link table" in 2003); that leaves the data in the CSV file and reads from the file directly and in place. It doesn't limit size (at least not anywhere near 1.7 GB). It may limit some of your read/update operations, but it will at least get you started.
I'd either try the CSV ODBC connector, or otherwise import it first in a less limited database (MySQL, SQL Server) and import it from there.
It seems that some versions of Access have a hard 2 GB limit on MDB files, so you might get into trouble with that anyway.
Good luck.
You can also use an ETL tool. Kettle is an open source one (http://kettle.pentaho.org/) and really quite easy to use. To import a file into a database requires a single transformation with 2 steps: CSV Text Input and Table Output.
Why are you using Access for huge files? Use SQL Express or Firebird instead.
I remember that Access has a size limitation of around 2 GB. Going to the free SQL Express (limited to 4 GB) or free MySQL (no size limitation) could be easier.
Another option would be to do away with the standard import functions and write your own. I have done this once before, when some specific logic needed to be applied to the data before import. The basic structure is:
Open the file
Get the first line
Loop through the line until the end
If we find a comma then move on to the next field
Put the record into the database
Get the next line and repeat, etc.
I wrapped it up in a transaction that committed every 100 rows, as I found that improved performance in my case, but whether that helps would depend on your data.
However, I would say that linking the data, as others have said, is the best solution; this is just an option if you absolutely have to have the data in Access.
Access creates a lot of overhead so even relatively small data sets can bloat the file to 2GB, and then it will shut down. Here are a couple of straightforward ways of doing the import. I didn't test this on huge files, but these concepts will definitely work on regular files.
Import data from a closed workbook (ADO)
If you want to import a lot of data from a closed workbook you can do this with ADO and the macro below. If you want to retrieve data from another worksheet than the first worksheet in the closed workbook, you have to refer to a user defined named range. The macro below can be used like this (in Excel 2000 or later):
GetDataFromClosedWorkbook "C:\FolderName\WorkbookName.xls", "A1:B21", ActiveCell, False
GetDataFromClosedWorkbook "C:\FolderName\WorkbookName.xls", "MyDataRange", Range("B3"), True
Sub GetDataFromClosedWorkbook(SourceFile As String, SourceRange As String, _
TargetRange As Range, IncludeFieldNames As Boolean)
' requires a reference to the Microsoft ActiveX Data Objects library
' if SourceRange is a range reference:
' this will return data from the first worksheet in SourceFile
' if SourceRange is a defined name reference:
' this will return data from any worksheet in SourceFile
' SourceRange must include the range headers
'
Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String
Dim TargetCell As Range, i As Integer
dbConnectionString = "DRIVER={Microsoft Excel Driver (*.xls)};" & _
"ReadOnly=1;DBQ=" & SourceFile
Set dbConnection = New ADODB.Connection
On Error GoTo InvalidInput
dbConnection.Open dbConnectionString ' open the database connection
Set rs = dbConnection.Execute("[" & SourceRange & "]")
Set TargetCell = TargetRange.Cells(1, 1)
If IncludeFieldNames Then
For i = 0 To rs.Fields.Count - 1
TargetCell.Offset(0, i).Formula = rs.Fields(i).Name
Next i
Set TargetCell = TargetCell.Offset(1, 0)
End If
TargetCell.CopyFromRecordset rs
rs.Close
dbConnection.Close ' close the database connection
Set TargetCell = Nothing
Set rs = Nothing
Set dbConnection = Nothing
On Error GoTo 0
Exit Sub
InvalidInput:
MsgBox "The source file or source range is invalid!", _
vbExclamation, "Get data from closed workbook"
End Sub
Another method, one that doesn't use the CopyFromRecordset method
With the macro below you can perform the import and have better control over the results returned from the RecordSet.
Sub TestReadDataFromWorkbook()
' fills data from a closed workbook in at the active cell
Dim tArray As Variant, r As Long, c As Long
tArray = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:B21")
' without using the transpose function
For r = LBound(tArray, 2) To UBound(tArray, 2)
For c = LBound(tArray, 1) To UBound(tArray, 1)
ActiveCell.Offset(r, c).Formula = tArray(c, r)
Next c
Next r
' using the transpose function (has limitations)
' tArray = Application.WorksheetFunction.Transpose(tArray)
' For r = LBound(tArray, 1) To UBound(tArray, 1)
' For c = LBound(tArray, 2) To UBound(tArray, 2)
' ActiveCell.Offset(r - 1, c - 1).Formula = tArray(r, c)
' Next c
' Next r
End Sub
Private Function ReadDataFromWorkbook(SourceFile As String, SourceRange As String) As Variant
' requires a reference to the Microsoft ActiveX Data Objects library
' if SourceRange is a range reference:
' this function can only return data from the first worksheet in SourceFile
' if SourceRange is a defined name reference:
' this function can return data from any worksheet in SourceFile
' SourceRange must include the range headers
' examples:
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:A21")
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:B21")
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "DefinedRangeName")
Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String
dbConnectionString = "DRIVER={Microsoft Excel Driver (*.xls)};ReadOnly=1;DBQ=" & SourceFile
Set dbConnection = New ADODB.Connection
On Error GoTo InvalidInput
dbConnection.Open dbConnectionString ' open the database connection
Set rs = dbConnection.Execute("[" & SourceRange & "]")
On Error GoTo 0
ReadDataFromWorkbook = rs.GetRows ' returns a two dim array with all records in rs
rs.Close
dbConnection.Close ' close the database connection
Set rs = Nothing
Set dbConnection = Nothing
On Error GoTo 0
Exit Function
InvalidInput:
MsgBox "The source file or source range is invalid!", vbExclamation, "Get data from closed workbook"
Set rs = Nothing
Set dbConnection = Nothing
End Function
For really large files, you can try something like this . . .
INSERT INTO [Table] (Column1, Column2)
SELECT *
FROM [Excel 12.0 Xml;HDR=No;Database=C:\your_path\excel.xlsx].[SHEET1$];
OR
SELECT * INTO [NewTable]
FROM [Excel 12.0 Xml;HDR=No;Database=C:\your_path\excel.xlsx].[SHEET1$];