Do a VLOOKUP of a database that is too large to open in Excel - ms-access

I am trying to use VLOOKUP to pull data into an Excel file (File 1) with about 500,000 rows from a CSV file (File 2) with about 4.5 million rows. File 2 is too large to load fully in Excel, so I am unsure how to proceed.
I want to import data from File 2 into File 1 by matching the unique PointID identifier in Column B of both files. I also have File 2 in an Access database, if that works better. I have tried pointing the 'table_array' argument in File 1 at File 2 without opening File 2, but I receive an error message.
Is there a way to iterate over File 2 like a VLOOKUP without opening it or receiving an error message?

If you've already got File 2 in Access I would import File 1 into Access as well. Make sure that File 1 has its PointID set as the Primary Key, then you should be able to use an Update query in Access to get the relevant values from File 2 into File 1. You would then export the updated File 1 data back to a new Excel file (if that's where you need it to be).
I can't think of an easy way to update the original File 1 directly. Adding File 1 as a linked table in Access doesn't work because, as far as I can tell, the linked data isn't updateable (I did try this, but I am working with older copies of Excel/Access, so newer versions may allow it).
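If Access isn't convenient, the same lookup can be sketched outside Excel. Below is a minimal sketch in Python using only the standard csv module. It assumes File 1 has been saved as CSV, that both files have a header row, and that the key column is named PointID. The function names and the Elevation column used in the test are hypothetical. File 2 is streamed row by row, so it is never loaded into memory in full. Only the 500,000 rows of File 1 and the matched values are kept.

```python
import csv

def build_lookup(big_lines, wanted_ids, key_col, value_col):
    """Stream the large file once, keeping values only for the wanted keys."""
    lookup = {}
    for row in csv.DictReader(big_lines):
        key = row[key_col]
        # Keep the first match only, which mirrors how VLOOKUP behaves.
        if key in wanted_ids and key not in lookup:
            lookup[key] = row[value_col]
    return lookup

def vlookup_rows(small_lines, big_lines, key_col, value_col, default="#N/A"):
    """Return the rows of the small file with value_col filled in from the big file."""
    small = list(csv.DictReader(small_lines))
    wanted = {row[key_col] for row in small}
    lookup = build_lookup(big_lines, wanted, key_col, value_col)
    for row in small:
        row[value_col] = lookup.get(row[key_col], default)
    return small

def vlookup_files(file1_path, file2_path, out_path, key_col, value_col):
    """File-based wrapper: read both CSVs and write the merged result to out_path."""
    with open(file1_path, newline="") as f1, open(file2_path, newline="") as f2:
        rows = vlookup_rows(f1, f2, key_col, value_col)
    if not rows:
        return
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

A call such as vlookup_files("file1.csv", "file2.csv", "file1_updated.csv", "PointID", "Elevation") would write a new CSV that Excel can open, since it has the same number of rows as File 1.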

Related

Importing specific columns from a CSV into Excel

I am trying to do what the title says, and also do it for new records. I cannot link the CSV file because it exceeds the 255-column limit, so I am attempting to split up the table.
I have the table below in Access:
DateOfTest | Time | PromptTime | TestSequence | PATResults | Logs | Serial Number
1 | 2 | 3 | 4 | 5 | 6 | 7
Obviously, where the numbers are, I want the data from the CSV to be inserted.
I have created a form with a button so I can run some VBA, but I cannot find the right information online, and as I am new to VBA it is also a bit confusing.
I have attempted some random code, but I was just spraying and praying at that point.
I am not sure I understood your question. The import tool lets you choose columns, but if you want to do it with a script, I would suggest a pre-processing step with simple Python and pandas: read the CSV file, remove any unwanted columns, and save the result to a new file that can be loaded directly into Excel.
Something like this:
import pandas as pd
# Read the source CSV, drop the unwanted column, and write the rest to an Excel workbook.
df = pd.read_csv('csvfile.csv')
df = df.drop(columns=['column_name'])
df.to_excel('filename.xlsx', index=False, header=True)
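When the CSV has more than 255 columns, it may be easier to name the columns you want to keep rather than the ones to drop. The following is a minimal sketch using only Python's standard csv module. The function names are hypothetical, and the column names in the usage note are taken from the table in the question. It streams the file, so the full CSV is never held in memory.

```python
import csv

def select_columns(lines, wanted):
    """Yield rows (as lists) containing only the wanted columns, header first."""
    reader = csv.reader(lines)
    header = next(reader)
    missing = [name for name in wanted if name not in header]
    if missing:
        raise KeyError(f"columns not found in CSV header: {missing}")
    positions = [header.index(name) for name in wanted]
    yield list(wanted)
    for row in reader:
        yield [row[i] for i in positions]

def write_subset(src_path, dst_path, wanted):
    """Copy only the wanted columns from src_path into a new CSV at dst_path."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(select_columns(src, wanted))
```

For example, write_subset("csvfile.csv", "subset.csv", ["DateOfTest", "Time", "Logs"]) would produce a narrower file that stays under the column limit and can then be imported or linked.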

Source File Connection (Flat File) - Not reading column metadata

When I create the SSIS package, it requires a file to be referenced so it can pick up the file's metadata. For example, the column headers will be ColumnA, ColumnB.
I have always assumed that these column names need to be present in the file for it to be loaded. Recently the business, for whatever reason, changed one of the column names in the file, so the file now contains ColumnA, NotColumnB. When the SSIS package runs, it ignores this and loads the file. I assumed that it would fail. Is my assumption correct and something weird is going on, or is my assumption incorrect? If so, please let me know why.
I have changed the column names in a few other packages that load data from a file, and they also don't care what the column names are.
Click on the flat file source and press F4 to show the Properties tab. There is a property called ValidateExternalMetadata; change it to True.
For more information check the following answer:
Detect new column in source not mapped to destination and fail in SSIS
Update 1
It looks like the flat file connection manager has no validation engine: the metadata defined is used at configuration time to configure the mappings between the data file and the database, and the column names in the file are not checked at run time.
Why Does't SSIS Flat File Data Check If Columns Names or Order Have Changed? What is best way to check?
Flat file destination columns data types validation
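Since the package does not check header names itself, one option is to check the header row before the load runs. The sketch below is in Python and is only an illustration of that idea. It is not an SSIS feature, the function name is hypothetical, and it assumes a comma-delimited file whose first line is the header. Inside SSIS itself the same comparison would have to be written as a pre-step (for example, in a Script Task).

```python
import csv

def header_mismatches(first_line, expected):
    """Return a list of (position, expected_name, actual_name) differences."""
    actual = next(csv.reader([first_line]))
    problems = []
    for i in range(max(len(expected), len(actual))):
        exp = expected[i] if i < len(expected) else None
        act = actual[i].strip() if i < len(actual) else None
        if exp != act:
            problems.append((i, exp, act))
    return problems
```

An empty list means the header matches the expected names and order. Anything else can be used to stop the load and report which column changed.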

Missing rows while exporting more than 1 million records into a CSV file via SSIS

Task: I need to export 1.1 million records to a CSV file.
I loaded it via an SSIS Data Flow.
There are 1,100,800 rows loaded from a table (the source) to the flat file location, which is a CSV file.
My flat file destination file name is Test.csv.
Now when I open the CSV file I get the error
"file not loaded completely"
When I look at the records at the very end of my CSV file (sorry, I cannot attach the CSV file due to data integrity),
I only see records up to 1048578, but I loaded 1,100,880 rows, so some rows are missing, and I cannot add them manually either. At the end of the CSV it does not let me type in the next row.
Any idea why?
As a workaround I loaded the data into separate CSV files: 1 million rows in one CSV and the rest in others.
But I really want to know why it is doing this.
Thank you in advance for looking at this.
It's Excel's fault. It only supports 1,048,576 rows.
https://support.office.com/en-us/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
The error you're getting is because you're trying to open a .csv with more rows than Excel can display; Excel stops loading at its row limit. Try opening the file in a different app, like Notepad++.
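To confirm that the exported file really contains every row, you can count the records without opening it in Excel. Here is a minimal sketch in Python using the standard csv module. The function names are hypothetical, and the header row (if any) is counted as a record.

```python
import csv

EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit documented for Excel

def count_csv_rows(lines):
    """Count CSV records; quoted fields that span several lines count once."""
    return sum(1 for _ in csv.reader(lines))

def fits_in_excel(row_count):
    """True if a file with this many records can be shown in one worksheet."""
    return row_count <= EXCEL_MAX_ROWS

def count_file_rows(path):
    """Count the records of a CSV file on disk by streaming it."""
    with open(path, newline="", encoding="utf-8") as f:
        return count_csv_rows(f)
```

If count_file_rows("Test.csv") returns the number of rows SSIS reported (plus one if there is a header), the export is complete and only the viewer is truncating it.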

MS Access 2013 saved exports not saving to MSysIMEXSpecs table

I am working on an Access 2013 database that someone else created. It has a module that exports several reports as PDF files to a specific folder. Some of the reports are exporting successfully but 3 of them aren't. An example of the code used is as follows:
DoCmd.RunSavedImportExport "Export-rptJobsToClose_FS2"
I receive an error saying that the database can't save the output data to the file I've selected. I realize that the path is saved in the "Export-rptJobsToClose_FS2" saved export. I would like to see the path, so I tried opening the MSysIMEXSpecs table, but it is completely empty, and so is the corresponding MSysIMEXColumns table. If I create a new saved export definition with the same name as the one in the code, I get a message that it already exists. How is it possible that it already exists when those system tables are empty? I have tried creating saved exports with new names, but if they don't work I can't reuse those names, because I again get the message that they already exist. So I have to keep thinking of new names, and I can't see any information about the saved exports I have already created. Thanks for any help.
The MSysIMEX* tables contain import specifications for correct data transfer. Saved imports/exports are stored elsewhere. You can see the names of all saved imports/exports using the menu External Data -> Saved Imports/Exports; there you can also see and edit the destination path and the import/export name.
Through VBA you can reach saved imports/exports via the CurrentProject.ImportExportSpecifications collection; the destination path is stored in the XML property of each item.
The code below prints the names of all existing import/export specifications:
Dim ie As ImportExportSpecification
For Each ie In CurrentProject.ImportExportSpecifications
    Debug.Print ie.Name
Next ie
Saved imports/exports in Access are not the same thing as import/export specifications. If you want to see a saved import/export definition, you can dump it by typing the following command into the Immediate window, replacing SpecificationName with the name of your saved export:
? CodeProject.ImportExportSpecifications("SpecificationName").XML

SSIS 2008 Excel Source - Problems loading Alphanumeric columns

I am using SSIS 2008 to load alphanumeric columns from Excel.
I have one column which starts off as integer
1
2
...
999
Then changes to AlphaNumeric
A1
A2
A999
When I try to load using an Excel Data Source, Excel will always say that the column is an integer, as it must only sample the top of the file.
(BTW - I know that I can re-order the file so that the alphas are at the top but I would rather not have to do this...)
Unfortunately, there doesn't seem to be a way to change its mind. This means that when it loads the data, it filters out the 'A', and the A999 record will update the 999 record. This is obviously not good...
I have tried to change the external and output columns to string under the advanced editing options, but I get errors and it won't run until I set the columns back to integer.
Does anyone have a solution?
SSIS uses Jet to access the Excel files. By default, Jet scans the first 8 rows of your data to determine the type of each column.
To fix it, you will need to edit the registry and increase the TypeGuessRows DWORD value, which controls how many rows are scanned, under one of the following registry keys.
Which key applies depends on your version of Windows and your version of Excel, as follows:
For 32-bit Windows
Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
Excel 2000 and later versions
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel
For 64-bit Windows
Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\3.5\Engines\Excel
Excel 2000 and later versions
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel
Then, specify IMEX=1 in the connection string as follows:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\abc.xls;
Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This information can be found in a more verbose form at: http://support.microsoft.com/kb/189897/
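To see why a column that starts with 1, 2, ... 999 is typed as numeric, here is a deliberately simplified Python illustration of sampling-based type guessing. It is not Jet's actual algorithm, and the function name and return values are hypothetical. The only thing taken from the answer above is the default sample size of 8 rows.

```python
def guess_column_type(values, sample_rows=8):
    """Guess 'numeric' or 'text' by inspecting only the first sample_rows values."""
    def is_number(value):
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    # Blank cells are ignored, so they do not influence the guess.
    sample = [v for v in values[:sample_rows] if v not in ("", None)]
    if sample and all(is_number(v) for v in sample):
        return "numeric"
    return "text"
```

With the column from the question, the first 8 values are all integers, so the guess is numeric and the later A1, A2, A999 values do not fit the chosen type. Scanning more rows changes the guess to text, which is the effect a larger TypeGuessRows is meant to achieve.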
Wow, that looks like a huge pain. I came across a couple of examples where you could alter the connection string and sometimes get better results but they don't seem to work for everyone.
Scripting an automatic conversion to a .csv file would be a good workaround. There are a number of suggestions in this thread:
converting an Excel (xls) file to a comma separated (csv) file without the GUI
including some code in C# that you may be able to easily plop in:
http://jarloo.com/code/api-code/excel-to-csv/
Here is a similar question where altering the connection string is discussed, if you want to look into it for yourself: SSIS Excel Import Forcing Incorrect Column Type
Good luck!