Querying subdirectories using drill - apache-drill

I have a parent folder called Job_1 that contains subdirectories named 0 to 10, and each of those folders contains different files, such as property and address.
SELECT
COUNT(LOCID)
FROM dfs.`/Desktop/DataValidation/outputfiles/Job_1/`
WHERE NUMBLDGS > 1
Job_1 has folders 0 to 10, and the file I want is property.txt, but each folder also has other text files, such as accounts. For a single folder the query looks like this:
SELECT
COUNT(LOCID)
FROM dfs.`/Desktop/DataValidation/outputfiles/Job_1/0/property.txt`
WHERE NUMBLDGS > 1
How can I run a single query against property.txt in all subdirectories 0 to 10?

You can use a wildcard in place of the subdirectory name:
SELECT COUNT(LOCID) FROM dfs.`/Desktop/DataValidation/outputfiles/Job_1/*/property.txt` WHERE NUMBLDGS > 1
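To check which files a single-level wildcard would pick up, the pattern can be tried outside Drill first. This is only an illustration with Python's glob module on a throwaway directory tree. The file contents are made up, and Drill resolves the path pattern itself rather than through this code.

```python
import glob
import os
import tempfile

# Build a temporary tree shaped like Job_1/0 .. Job_1/10, where each
# folder holds property.txt plus another text file (accounts.txt).
root = tempfile.mkdtemp()
job = os.path.join(root, "Job_1")
for i in range(11):
    sub = os.path.join(job, str(i))
    os.makedirs(sub)
    for name in ("property.txt", "accounts.txt"):
        with open(os.path.join(sub, name), "w") as fh:
            fh.write("LOCID,NUMBLDGS\n")  # placeholder content

# A single "*" stands in for exactly one directory level, so only the
# property.txt files directly under each numbered folder are selected.
matched = sorted(glob.glob(os.path.join(job, "*", "property.txt")))
print(len(matched))
```

The other text files in each folder are not selected, because the file name in the pattern is spelled out in full.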

Related

Do a VLOOKUP of a database that is too large to open in excel

I am trying to do a VLOOKUP query into an Excel file (File 1) with about 500,000 rows from another csv file (File 2) that has about 4.5 million rows. This second file is too large to fully load in Excel, and so I am unsure how to proceed.
I am attempting to import data from File 2 to File 1 based on matching the unique PointID identifier in Column B in both files. I also have File 2 in an Access database if that works better. I have tried indicating the 'table_array' index in File 1 without opening File 2, but am receiving an error message.
Is there a way I can iterate over File 2 like a VLOOKUP without opening it or receiving an error message?
If you've already got File 2 in Access I would import File 1 into Access as well. Make sure that File 1 has its PointID set as the Primary Key, then you should be able to use an Update query in Access to get the relevant values from File 2 into File 1. You would then export the updated File 1 data back to a new Excel file (if that's where you need it to be).
I can't think of an easy way to update the original File 1 directly. Adding File 1 as a linked table in Access doesn't work, because as far as I can tell the linked data isn't updateable. (I did try this, but I am working with older copies of Excel and Access, so newer versions may allow it.)
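The update-query idea can be sketched as follows. This uses SQLite through Python as a stand-in for Access, so the SQL dialect differs from what Access would accept; the table names file1 and file2, the Value column, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# file1 plays the role of the smaller Excel sheet, file2 the large CSV.
# PointID is the primary key in both, as the answer recommends.
conn.executescript("""
    CREATE TABLE file1 (PointID INTEGER PRIMARY KEY, Value TEXT);
    CREATE TABLE file2 (PointID INTEGER PRIMARY KEY, Value TEXT);
    INSERT INTO file1 (PointID) VALUES (1), (2), (3);
    INSERT INTO file2 VALUES (1, 'a'), (2, 'b'), (4, 'd');
""")

# Rough equivalent of VLOOKUP: copy Value from file2 into file1
# wherever the PointID matches; unmatched rows are left untouched.
conn.execute("""
    UPDATE file1
    SET Value = (SELECT f2.Value FROM file2 AS f2
                 WHERE f2.PointID = file1.PointID)
    WHERE EXISTS (SELECT 1 FROM file2 AS f2
                  WHERE f2.PointID = file1.PointID)
""")

rows = conn.execute(
    "SELECT PointID, Value FROM file1 ORDER BY PointID"
).fetchall()
print(rows)
```

In Access itself, an update query is normally written with a join between the two tables rather than a correlated subquery, but the matching logic is the same.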

SSIS_Get count of rows from source and load to destination

I have two OLE DB sources:
DB Source1 = select count(*) from A
DB Source2 = select count(*) from B
Now I need to get the count of records uploaded, which is
DB Source1 - DB Source2
For example, if
DB Source1 = 9 and DB Source2 = 1
then the records uploaded will be 9 - 1 = 8.
Finally, I need them to be loaded to a flat file destination with the following columns:
RecordsReceived  ErrorRecords  RecordsUploaded
9                1             8
How do I achieve this?
TIA :)
You should look into the Row Count transformation. It counts the rows that flow through it in a data flow and stores the result in a variable you have declared. You can then use those variables later in the package, for example in a script, to write the values to a flat file.
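The arithmetic itself can also be pushed into one query when both tables live in the same database, which is a different technique from the Row Count transformation described above. The sketch below uses SQLite through Python rather than SSIS; the table contents are invented to reproduce the 9 and 1 from the question, and the tab-delimited layout of the flat file is an assumption.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
""")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(9)])  # 9 rows
conn.executemany("INSERT INTO B VALUES (?)", [(0,)])                    # 1 row

# One statement returns both counts and their difference as a single row.
received, errors, uploaded = conn.execute("""
    SELECT a.n AS RecordsReceived,
           b.n AS ErrorRecords,
           a.n - b.n AS RecordsUploaded
    FROM (SELECT COUNT(*) AS n FROM A) AS a
    CROSS JOIN (SELECT COUNT(*) AS n FROM B) AS b
""").fetchone()

# Write the header and the single data row as tab-delimited text.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["RecordsReceived", "ErrorRecords", "RecordsUploaded"])
writer.writerow([received, errors, uploaded])
flat_file = out.getvalue()
print(flat_file)
```

A query of this shape could serve as a single source feeding the flat file destination, provided A and B are reachable over the same connection.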

SQL - Recursive delete in one query?

Assuming I have a DB like this:
Folders (with "parent folder" column)
Files (with "folder" column)
Is there a way to delete all files in a folder and in all of its subfolders with a single query?
Example:
Folders:
id,name,parent
1, folder1, 0
2, folder2, 1
3, folder3, 2
Files:
name, folder
file1, 2
Say I try to delete folder1. That single query should delete all files in folder2 and folder3, because folder2 is under folder1 and folder3 is under folder2.
** I know I can do this with a recursive script, but I want to educate myself more.
As suggested by @jarlh, a really nice solution is to define foreign keys with ON DELETE CASCADE, so that deleting a folder also removes its subfolders and their files.
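A minimal sketch of the cascade approach, using SQLite through Python since the question does not name a database. Two things are assumptions: the top-level folder stores NULL as its parent instead of the 0 shown in the question (a foreign key cannot point at a folder that does not exist), and foreign key enforcement has to be switched on explicitly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    CREATE TABLE folders (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        parent INTEGER REFERENCES folders(id) ON DELETE CASCADE
    );
    CREATE TABLE files (
        name   TEXT,
        folder INTEGER REFERENCES folders(id) ON DELETE CASCADE
    );
    INSERT INTO folders VALUES
        (1, 'folder1', NULL), (2, 'folder2', 1), (3, 'folder3', 2);
    INSERT INTO files VALUES ('file1', 2);
""")

# Deleting the top folder cascades to folder2, then to folder3 and file1.
conn.execute("DELETE FROM folders WHERE id = 1")

folders_left = conn.execute("SELECT COUNT(*) FROM folders").fetchone()[0]
files_left = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
print(folders_left, files_left)
```

With the constraints in place, the only statement the application issues is the single DELETE on the folders table.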

Count of related items in a 2nd table with zero results needed (query check please)

This MySQL statement is a bit over my head. I pieced it together through a lot of Google searches. It seems to work right, but I just wanted to see if I could get a thumbs up. I'm paranoid that I did something a bit off and that an issue I don't understand could come up.
I have a 'directories' table, 'folders' table and 'documents' table. (directories have many folders, folders have many documents).
On a web page, I have a select where a user can choose a directory (which has many folders). This query is for an AJAX call that loads a second select with the list of all folders belonging to the directory (getting the id's and names to load the 'folders' select).
So, this query will be made against one directory to get a list of folder ids and folder names for that directory. I also needed the folder name to include a count of how many documents each folder contains. I originally had just "join", which did not return folders with zero documents, but changing it to "left join" listed folders with 0 documents (I don't have an understanding of the different types of joins yet).
MY FRANKEN-QUERY:
SELECT f.id, CONCAT(f.folder_name , ' (', COUNT(DISTINCT d.id), ' documents)') AS folder_name
FROM folders f
LEFT JOIN documents d ON d.folder_id = f.id
WHERE f.directory_id = '2'
GROUP BY f.id
ORDER BY f.folder_name
RESULTS (seems to work fine):
id folder_name
1 MAIN (2 documents)
8 test1 (2 documents)
9 test2 (3 documents)
50 test3 (0 documents)
Thanks - much appreciated!
It looks fine offhand, but run a couple of tests on your data and make sure you get consistent (correct) results.
Assuming documents.id is a primary key, you can remove the DISTINCT keyword from the count.
For more on the various join types, see
http://en.wikipedia.org/wiki/Join_%28SQL%29
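The difference between the two join types can be seen on a tiny data set. The sketch below uses SQLite through Python instead of MySQL, so string concatenation is written with || rather than CONCAT; the two sample folders and their documents are made up, and DISTINCT is dropped as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folders (
        id INTEGER PRIMARY KEY, folder_name TEXT, directory_id INTEGER
    );
    CREATE TABLE documents (id INTEGER PRIMARY KEY, folder_id INTEGER);
    INSERT INTO folders VALUES (1, 'MAIN', 2), (50, 'test3', 2);
    INSERT INTO documents (folder_id) VALUES (1), (1);  -- test3 has none
""")

# Same query twice; only the join keyword changes.
query = """
    SELECT f.id,
           f.folder_name || ' (' || COUNT(d.id) || ' documents)' AS folder_name
    FROM folders f
    {join} documents d ON d.folder_id = f.id
    WHERE f.directory_id = 2
    GROUP BY f.id
    ORDER BY f.folder_name
"""
inner = conn.execute(query.format(join="JOIN")).fetchall()
left = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner)  # folders with no documents drop out of an inner join
print(left)   # the left join keeps them, and COUNT(d.id) counts the NULL as 0
```

COUNT(d.id) is what makes the zero come out right: it ignores the NULL produced for a folder with no matching document, whereas COUNT(*) would count that row as 1.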

Pulling data from a text file to generate a report

I have a program in MS Access that uses VBA. I need to come up with an If statement to pull data from a text file. The data is a list of procedures and prices. I have to pull the prices from the text file so that a report can show how much each procedure costs.
ID PID M1 M2 M3 Total
1 11120390(procedure)
2 180(price) 360 180 540 1080(total Price)
3 2 1 3 6(Units sold)
4
5 200(Price) 200 600 800 1600(total price)
6 1 3 4 8(Units Sold)
7 11120390(procedure)
The table in the text file is set up like this, and I need to pull the procedure number and the price of each procedure from the text file.
This is a general answer to a vaguely-presented question. You typically have to go through these steps:
Make a connection to the file
Open the file
Parse the file (as Simon was saying): go through it as a series of strings, find an orientation point, get to the relevant parts
Import the relevant parts, perhaps in a holding table
Present the data in typical Access fashion (query, report)
And if the file isn't well structured or correctly generated, you'll need extra parsing code and perhaps error handling to deal with aberrations.
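The parsing step can be sketched as follows, in Python rather than VBA, to show the shape of the logic. The layout is guessed from the sample in the question, and every part of that guess is an assumption: the parenthesised words are taken to be the poster's annotations, a row with one value after the ID is treated as a procedure number (the orientation point), a row with five values is treated as a price row whose first value is the price, and a price row is attributed to the most recent procedure row above it.

```python
import re

SAMPLE = """\
1 11120390(procedure)
2 180(price) 360 180 540 1080(total Price)
3 2 1 3 6(Units sold)
4
5 200(Price) 200 600 800 1600(total price)
6 1 3 4 8(Units Sold)
7 11120390(procedure)
"""


def parse_prices(text):
    """Return (procedure, price) pairs under the layout assumptions above."""
    pairs = []
    procedure = None
    for line in text.splitlines():
        # Drop "(...)" annotations, then split the row into its fields.
        fields = re.sub(r"\([^)]*\)", "", line).split()
        values = fields[1:]  # fields[0] is the ID column
        if len(values) == 1:
            procedure = values[0]  # orientation point: a procedure row
        elif len(values) == 5 and procedure is not None:
            pairs.append((procedure, int(values[0])))  # first value = price
        # Rows with four values (units sold) and empty rows are skipped.
    return pairs


pairs = parse_prices(SAMPLE)
print(pairs)
```

Each pair would then go into a holding table for the report. If the real file attaches prices to the procedure row that follows them instead, the bookkeeping around `procedure` would need to be reversed.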