How to delete all files in a folder containing multiple subfolders, but only the files whose names don't contain (2)? - duplicates

I converted thousands of sound files to a lower quality to reduce my project's file size, but instead of overwriting the original files, the converter created duplicates of them with (2) at the end of each file name.
Now I want to delete the original files that don't have (2) at the end of the file name and then rename the duplicates to the original names by removing the (2) from each one. I don't want to do it one by one because it would take a very long time and my final defense is next week. Can anyone help?
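One possible way to script this is sketched below in Python. It assumes the duplicates are named like "name(2).ext" or "name (2).ext"; the function name replace_originals and the dry_run flag are made up for this illustration. Rather than deleting every file without (2), it only replaces an original when a matching (2) copy exists, so a file whose conversion failed is left alone.

```python
import re
from pathlib import Path

# Assumed naming pattern: the stem ends in "(2)", optionally preceded by spaces.
DUPLICATE_RE = re.compile(r"^(?P<stem>.*?)\s*\(2\)$")


def replace_originals(root, dry_run=True):
    """Move each "(2)" copy over its original; return the (source, target) pairs."""
    actions = []
    # sorted() builds the full listing first, so renaming does not disturb the walk.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        match = DUPLICATE_RE.match(path.stem)
        if not match or not match.group("stem"):
            continue
        target = path.with_name(match.group("stem") + path.suffix)
        actions.append((path, target))
        if not dry_run:
            # Path.replace overwrites an existing target, which removes the
            # original and gives the converted copy its name in one step.
            path.replace(target)
    return actions
```

Running it first with dry_run=True and printing the returned pairs shows what would change. Work on a backup copy of the project folder before running it with dry_run=False.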

Related

Verifying and Formatting Text before importing it to MySQL

I have an issue importing some CSV/TXT files.
Here at the company we receive files from other sources (companies). Some of these files sometimes come partially broken.
For example, take a file containing 6 columns (id, name, city, state, zipCode, phone) and 2 million lines. The first 10,000 lines of that file are OK, but in the middle of the file some lines have 5 or even 7 columns instead of 6.
It seems like somebody "merged" several files into this one and did not pay attention to the number of columns. So when I import it into my MySQL database table, the data comes out very messy because the columns are broken: the zipCode values show up in the state field, and so on.
I was wondering how to scan such a file before importing it into my DB, with something like counting the ";" delimiters on each line. Should it be done with a regex, or what would be the best option for that?
My program is written in Lazarus/Pascal.
I would read the file line by line and check the columns.
If a line has the expected column count, copy it to another file (input_OK.csv).
If it doesn't, dump it into a broken-lines file (input_KO.csv).
Study the errors in input_KO.csv, correct them, then import the corrected file into the database.
IMO, a regex would take too long here.
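The line-by-line check described above can be sketched as follows. The program in question is written in Lazarus/Pascal, so this Python version only illustrates the logic. The function name and parameters are hypothetical, and it assumes no field contains a quoted ";" (if fields can contain the delimiter, a real CSV parser is needed instead of a plain count).

```python
def split_by_column_count(src, ok_path, ko_path, expected=6,
                          delimiter=";", encoding="utf-8"):
    """Copy lines with the expected column count to ok_path, the rest to ko_path."""
    ok = ko = 0
    # newline="" keeps each line's original terminator untouched.
    with open(src, encoding=encoding, newline="") as fin, \
         open(ok_path, "w", encoding=encoding, newline="") as fok, \
         open(ko_path, "w", encoding=encoding, newline="") as fko:
        for line in fin:
            # N columns means N - 1 delimiters on the line.
            if line.rstrip("\r\n").count(delimiter) == expected - 1:
                fok.write(line)
                ok += 1
            else:
                fko.write(line)
                ko += 1
    return ok, ko
```

The file is streamed one line at a time, so a 2-million-line input does not need to fit in memory.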

How to reference a document in MySQL?

My database has a table within it, let's call it 'doclist'. I want the table to display a document name, the date associated with it, and a link to download this file. After researching it seems that the file (looking to include a PDF or Word doc) needs to be stored somewhere within htdocs, and I will need to reference the file location within one of the table fields. What type of field would that need to be: varchar, blob, or text? The only other tutorials I have seen including files are just dumping data, where I want to reference a file and allow its download. Any direction would be very helpful at this point.

Insert missing rows in CSV of incrementally numbered files generated by directory listing?

I have created a CSV from a set of files in a directory that are numbered incrementally:
img1_1.jpg, img1_2.jpg ... img1_1999.jpg, img1_2000.jpg
The CSV output is like so:
filename, datetime
eg:
img1_1.JPG,2011-05-11 09:16:33.000000000
img1_3.jpg,2011-05-11 10:10:55.000000000
img1_4.jpg,2011-05-11 10:17:31.000000000
img1_6.jpg,2011-05-11 10:58:37.000000000
The problem is, there are a number of files missing in the listing, as some of the files don't exist. As a result, when imported, the actual row number does not match the file number.
Can anyone think of a reasonably efficient way to insert the missing rows so that the row number and filename match up, other than manually inserting rows for the missing ones? (There are over 800 missing rows.)
Background
A previous programmer developed an uploader script and did not save the creation time of the mysql record in the database. I figured the easiest way to find the creation time for the majority of the records would be to output a directory listing of all the files and combine them in a spreadsheet.
You need to do exactly what you wrote in your comment replying to @tadman.
A text parser script can inject the missing lines with, e.g., a date/time value that marks the record as an empty one, i.e. there is no real data behind it (e.g. date it to 1950-01-01 00:00:00). When that is done, bulk import the CSV. I think this is the best and most efficient solution.
Also, think about any future insert/delete/update events that might occur to your data.
Those could break the chain you initially had, so you might prefer instead to introduce a numeric field for the JPEG IDs (and index that field), and leave the PK as is (auto increment).
In this case you avoid the CSV manipulation as well as being chained to your auto-increment PK (meaning you will not get in trouble if a new JPEG arrives with an ID that was previously deleted, an ID that already exists, etc.).
So the solution really depends on how you want to use this table in the future. If you give more details, I am sure the community can come up with even more ideas.
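The text-parser approach described above could look roughly like this in Python. The names fill_missing_rows and fill_csv are invented for this sketch, and it assumes the listing has exactly two columns and that the file number is the digits right before the .jpg extension. The placeholder date is the 1950-01-01 value suggested above.

```python
import csv
import re

# Assumed filename shape: "<prefix>_<number>.jpg", in either letter case.
NUMBER_RE = re.compile(r"_(\d+)\.jpe?g$", re.IGNORECASE)
PLACEHOLDER = "1950-01-01 00:00:00"  # marks a row that has no real file behind it


def fill_missing_rows(rows, last_number, prefix="img1_", ext=".jpg"):
    """Return one row per number in 1..last_number, using placeholders for gaps."""
    by_number = {}
    for filename, stamp in rows:
        match = NUMBER_RE.search(filename)
        if match:  # rows without a file number (such as a header) are dropped
            by_number[int(match.group(1))] = (filename, stamp)
    return [by_number.get(n, (f"{prefix}{n}{ext}", PLACEHOLDER))
            for n in range(1, last_number + 1)]


def fill_csv(src, dst, last_number):
    """Read a two-column listing from src and write the gap-filled listing to dst."""
    with open(src, newline="") as f:
        rows = [tuple(r) for r in csv.reader(f) if len(r) == 2]
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(fill_missing_rows(rows, last_number))
```

With the four sample rows from the question and last_number=6, the result has six rows, with placeholder rows for img1_2.jpg and img1_5.jpg.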
If it's a one-time thing, it might be easiest to open up your csv in a spreadsheet.
If your table above is in Sheet1, you could put something like the following in Sheet2 (this is OpenOffice, but there are similar functions for Excel):
pre_filename | filename | datetime
img1_1 | = A2&".JPG" | =OFFSET(Sheet1.$B$1;MATCH(B2;Sheet1.$A$2:$A$4;0);0)
You should be able to select the three cells above and drag them down to however many you need.

Best way to rename a database field to a given pattern if it already exists?

I have a database table that contains id, filename, userId
id is unique identifier
filename should also be unique
table may contain >10000 records
When a user uploads a file, it should be entered in the database according to the following rules:
If there is no record with the same filename, it should be added as it is (e.g. foobar.pdf)
If there is a record with the same filename, it should be added as uploadedName(2).ext (foobar(2).pdf)
If there are n records with the same base filename (foobar), it should be added as uploadedName(n+1).ext (foobar(20).pdf)
Now if foobar(2).pdf is uploaded, it should be added as foobar(2)(2).pdf, and so on
This pattern needs to be followed because the file is already being uploaded at client side using ajax before sending the details to the server, and the file hosting service follows the above rules to name the files.
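To make the naming rules above concrete, here is a small Python sketch. The function next_available_name is hypothetical, and `existing` stands in for a lookup against the filename column. It probes name(2), name(3), ... until it finds a free one. Whether the hosting service reuses gaps in the same way is an assumption that the description above does not settle.

```python
import os


def next_available_name(uploaded, existing):
    """Return the name to store for `uploaded`, given the set of names in use."""
    if uploaded not in existing:
        return uploaded
    # splitext only splits off the last extension: "foobar(2).pdf" -> "foobar(2)", ".pdf"
    base, ext = os.path.splitext(uploaded)
    n = 2
    while f"{base}({n}){ext}" in existing:
        n += 1
    return f"{base}({n}){ext}"
```

In a real table, the check and the insert should happen inside one transaction, or the filename column should carry a unique constraint, so that two simultaneous uploads cannot both claim the same generated name.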
My solution:
Maintain a file that contains all the names and the number of times each has occurred.
If a filename that exists in the file is entered, increase its occurrence count and generate a new name; otherwise add it to the file.
If the newly generated name is already in the database, add it to the file and generate another new name.
Since this is an upload application, I would recommend the following:
Create a column in your files table that stores the original name of the file.
Create another column to store a newly generated name for the file. This name would be an MD5 or other hash of the original name plus the timestamp at the time of the upload; that way you won't have duplicate names.
Then, when you upload your files, save them on disk under this new name but show the original name to any application that requests it. If you need to download or stream a file, just get the saved hashed name from the database.
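A minimal Python sketch of generating such a stored name follows. The function stored_name is made up for illustration. It uses SHA-256 where the answer mentions MD5 (any hash works for this purpose), and keeping the original extension is an extra assumption.

```python
import hashlib
import os
import time


def stored_name(original, timestamp=None):
    """Build an on-disk name from the original filename plus the upload time."""
    if timestamp is None:
        timestamp = time.time_ns()  # nanoseconds since the epoch
    _, ext = os.path.splitext(original)
    digest = hashlib.sha256(f"{original}|{timestamp}".encode("utf-8")).hexdigest()
    # Keep the extension so the file type stays recognizable on disk.
    return digest + ext.lower()
```

Two uploads with the same original name and the same timestamp would still collide, so a unique constraint on the stored-name column remains a sensible safety net.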

Extracting column names from several CSV files programmatically

I have 40 CSV files. All the files have different column names. I want a list of the column names of each CSV in a table format (in CSV or in Excel). The new file should contain the list of column names from each CSV file and the corresponding CSV file name.
I am doing it manually for now but if the number of files increases, it will become problematic. I want to do it using code.
Note: It can be a very trivial thing but I am not a techie so please help.
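One way to do this is sketched below in Python. It assumes the first row of every file holds the column names and that the files are comma-separated. The function name collect_headers and the output column titles "file" and "column" are invented for this example.

```python
import csv
from pathlib import Path


def collect_headers(folder, out_path, encoding="utf-8"):
    """Write one (file name, column name) row per column found in folder/*.csv."""
    out_file = Path(out_path).resolve()
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        if path.resolve() == out_file:
            continue  # skip a summary file left over from a previous run
        with open(path, newline="", encoding=encoding) as f:
            # The first row is taken as the header; an empty file yields no columns.
            header = next(csv.reader(f), [])
        rows.extend((path.name, column.strip()) for column in header)
    with open(out_path, "w", newline="", encoding=encoding) as f:
        writer = csv.writer(f)
        writer.writerow(["file", "column"])
        writer.writerows(rows)
    return rows
```

The resulting file opens directly in Excel, and the script can simply be rerun whenever more CSV files are added to the folder.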