Unable to match csv column name using load records from file - csv

I have created a new screen where I'm uploading Amazon orders from a CSV file. The original CSV file downloaded needs to have the first few lines removed to have the column headings at the top.
Before
After
I have a data field on the screen called date/time, but when I upload the CSV the mapping finds some sort of hidden characters in the file so the mapping doesn't automatically map the field.
I have tried changing the encoding while uploading the file, as well as changing the display name to include the ??? and double quotes like the image below, but the field doesn't auto-map.
Is there some way to get the field to auto-map on the file upload so the user doesn't have to map the field manually?
Thanks,
Kurt Bauer
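One possible cause, offered as an assumption since the file itself isn't shown, is a UTF-8 byte order mark that survives in front of the first heading after the leading lines are removed, together with quoted header names. A minimal Python sketch of a pre-cleaning step that could run before the upload follows. The function name `clean_report` and the `header_marker` parameter are hypothetical, and the sketch assumes no field contains an embedded line break.

```python
import csv
import io


def clean_report(raw: bytes, header_marker: str = "date/time") -> str:
    """Return CSV text that starts at the header row, without BOM or preamble."""
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    text = raw.decode("utf-8-sig")
    lines = text.splitlines()

    # Skip preamble lines until the first line that mentions the header marker.
    for index, line in enumerate(lines):
        if header_marker.lower() in line.lower():
            start = index
            break
    else:
        raise ValueError(f"no header line containing {header_marker!r}")

    # Re-parse and re-write so header names come out unquoted where possible.
    rows = list(csv.reader(lines[start:]))
    rows[0] = [name.strip() for name in rows[0]]

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Whether the upload screen then auto-maps the column depends on the display name matching the cleaned heading exactly, which this sketch cannot guarantee.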

Related

String gets separated into different rows in SSIS

I am currently working on an input file with a column that contains 3 different values in a single cell. Although this data is not used in the transformation, I need to read it from the source and then ignore it when it is loaded into the staging table.
But the issue I face is that it gets loaded into separate rows rather than one cell.
This particular column is input as a string data type. What change do I need to make to resolve this issue? Please let me know if more details are needed to answer the question.
I have uploaded a sample file to Google Drive: https://drive.google.com/file/d/17hn8xmRd4CWsgKBzHgdwnR9W4jTJ9lTn/view?usp=sharing
The following is a screenshot of the csv data as opened in a text editor
Having downloaded sample.csv from your link, the first thing I did was open it in a text editor (Notepad++, TextPad, Visual Studio, etc) and just looked at what you have.
Row 1 is column headers
Encoded in UTF-8 with BOM (byte order mark)
Line Endings are CR/LF (Carriage Return & Line Feed)
Column delimiter appears to be a comma ,
Double Quote, ", is used as the text qualifier but only when needed
There are CR/LF characters in the actual data
I then define my flat file connection manager based on that data
Finally, I have a data flow with a Flat File Source to a Derived Column and drop a Data Viewer between them
As you can see, configuring your Flat File Connection Manager as I show will allow all the data to flow into your table as expected.
What is happening now is that the CRLF, which is our row delimiter, takes precedence over the CRLF embedded in the column data. With the double quote set as the Text Qualifier, the data reader correctly "skips" any embedded CRLF and only treats a CRLF found outside the quotes as the end of a row.
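The same effect can be reproduced outside SSIS. The sketch below uses Python's standard `csv` module on a made-up sample (the `SAMPLE` data and the `parse` helper are illustrative, not taken from the linked file) to compare parsing with and without a text qualifier.

```python
import csv
import io

# Hypothetical sample: the "notes" field of record 1 contains embedded CR/LF.
SAMPLE = 'id,notes,status\r\n1,"value A\r\nvalue B\r\nvalue C",ok\r\n2,plain,ok\r\n'


def parse(text: str, use_qualifier: bool) -> list:
    """Parse CSV text, optionally treating the double quote as a text qualifier."""
    # newline="" hands the raw CR/LF characters to the csv module untouched.
    stream = io.StringIO(text, newline="")
    if use_qualifier:
        reader = csv.reader(stream, delimiter=",", quotechar='"')
    else:
        # QUOTE_NONE makes the reader treat '"' as an ordinary character.
        reader = csv.reader(stream, delimiter=",", quoting=csv.QUOTE_NONE)
    return list(reader)


with_qualifier = parse(SAMPLE, use_qualifier=True)
without_qualifier = parse(SAMPLE, use_qualifier=False)
print(len(with_qualifier), len(without_qualifier))
```

With the qualifier, the header plus two records come back and the three values stay in one field. Without it, every embedded line break starts a new row, which matches the symptom in the question.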

Open csv without any data change/formatting

I have a CSV file that contains values like 0234.
When I open that file, LibreOffice automatically converts the value to 234 (the leading zero is removed).
LibreOffice also reformats some large numbers, so instead of the original values I get something like 13323+15.
Question: can I set up LibreOffice so that it never changes the original values and opens the file without any auto-formatting?

Metadata map for importing CSV data into IPTC XMP images using Bridge

Let's say I have 100 scanned Tif files. I also have a CSV of the metadata for those 100 Tif files. Each file is named with its unique identifier, which is also column 1 of the csv.
First: How do I find a map that tells me what columns should be named what, in order to stay within the IPTC standard using XMP? (I've googled for most of the day and have found nothing)
Second: How can I merge the metadata in the CSV to each corresponding image?
I'm basically creating a spreadsheet with all 50,000 images in an archival collection, and plan to use the CSV to create the metadata for the images once they're scanned.
Thanks!
To know where to put your metadata, I'd suggest looking at the IPTC Photo Metadata Standard page. Without knowing more about your data, it's hard for someone else to say what data should go where.
As for embedding your data into your files from a CSV file, I'd suggest exiftool. Change the header of each column to the name of the tag to write to, and make the first column (headed SourceFile) the path/filename of each file. Your command would then be as simple as
exiftool -csv=file.csv /path/to/files
See exiftool FAQ #26 for more details.
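To prepare such a file from an existing spreadsheet export, something like the following Python sketch could work. The column names, the `COLUMN_TO_TAG` mapping, the `.tif` extension and the `build_exiftool_csv` helper are all assumptions for illustration; the tags that actually fit the collection should be chosen from the IPTC standard.

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical mapping from spreadsheet column names to ExifTool tag names.
COLUMN_TO_TAG = {
    "title": "XMP-dc:Title",
    "description": "XMP-dc:Description",
    "creator": "XMP-dc:Creator",
}


def build_exiftool_csv(catalog_csv: str, image_dir: str,
                       id_column: str = "identifier") -> str:
    """Rewrite a catalog CSV so its headers are tag names and column 1 is the file path."""
    reader = csv.DictReader(io.StringIO(catalog_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # ExifTool expects the first column to be headed SourceFile.
    writer.writerow(["SourceFile", *COLUMN_TO_TAG.values()])
    for row in reader:
        # Assumes each image is named <identifier>.tif inside image_dir.
        source = PurePosixPath(image_dir) / f"{row[id_column]}.tif"
        writer.writerow([str(source), *(row[col] for col in COLUMN_TO_TAG)])
    return out.getvalue()
```

The returned text can be saved as file.csv and passed to the command above.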

NiFi : Regular Expression in ExtractText gets CSV header instead of data

I'm working on a flow where I get CSV files. I want to put the records into different directories based on the first field in the CSV record.
For ex, the CSV file would look like this
country,firstname,lastname,ssn,mob_num
US,xxxx,xxxxx,xxxxx,xxxx
UK,xxxx,xxxxx,xxxxx,xxxx
US,xxxx,xxxxx,xxxxx,xxxx
JP,xxxx,xxxxx,xxxxx,xxxx
JP,xxxx,xxxxx,xxxxx,xxxx
I want to get the value of the first field, i.e. country, and put those records into a particular directory. US records go to the US directory, UK records go to the UK directory, and so on.
The flow that I have right now is:
GetFile ----> SplitText(line split count = 1 & header line count = 1) ----> ExtractText (line = (.+)) ----> PutFile(Directory = \tmp\data\${line:getDelimitedField(1)}). I need the header line to be replicated across all the split files for a different purpose, so I have to keep it.
The incoming CSV file is split into multiple flow files, each with the header, as intended. However, the regex that I have given in the ExtractText processor is evaluated against the CSV header of each split flow file instead of the record. So instead of getting US or UK in the "line" attribute, I always get "country", and all the files go to \tmp\data\country. How can I resolve this?
I believe getDelimitedField will only work off a singular line and is likely not moving past the newline in your split file.
I would advocate for a slightly different approach in which you could alter your ExtractText to find the country code through a regular expression and avoid the need to include the contents of the file as an attribute.
A regex of ^.*\n+(\w+) will match the header line, skip the newline, and capture the first set of word characters on the next line (up to the comma). The captured value is placed under the attribute name you specify, as capture group 1 (e.g. country.1).
I have created a template that should get the value you are looking for available at https://github.com/apiri/nifi-review-collateral/blob/master/stackoverflow/42022249/Extract_Country_From_Splits.xml
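The behaviour of that expression can be checked outside NiFi. The sketch below uses Python's `re` module, which treats this particular pattern the same way as the Java regex engine NiFi uses; the `country_of` helper and the sample content are illustrative only.

```python
import re
from typing import Optional

# Same pattern as suggested for ExtractText: skip the header line, then
# capture the first run of word characters on the following line.
COUNTRY_AFTER_HEADER = re.compile(r"^.*\n+(\w+)")


def country_of(split_content: str) -> Optional[str]:
    """Return the first field of the record that follows the header line."""
    match = COUNTRY_AFTER_HEADER.match(split_content)
    return match.group(1) if match else None


# One split flow file: the replicated header plus a single record.
content = "country,firstname,lastname,ssn,mob_num\nUS,xxxx,xxxxx,xxxxx,xxxx\n"

# The original (.+) stops at the first newline, so it only ever sees the header.
naive = re.match(r"(.+)", content).group(1).split(",")[0]

print(naive, country_of(content))
```

This shows why the original flow always produced "country": without a flag that lets the dot match newlines, (.+) never gets past the first line.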

Populate form with file name from mysql database

I've created a series of forms for users to add and edit records in a database. The database includes audio and image file names (the actual files are moved to folders on submit).
My problem is that I can't get the file names to display on the edit forms. Which means that unless the user uploads the files again, those fields are blanked in the database! I understand the "type='file'" does not take a "value" attribute. I was able to get around that in my textareas by simply displaying the php variable in the textarea. I tried that with file names, and they do display, but outside of the input box, which means... see above, blanked fields within the database.
Here's the code I'm using:
<li>
<label for="se_ogg">Sound excerpt (upload .ogg file)</label>
<input id="se_ogg" type = "file"
name = "se_ogg">' . $row['se_ogg'] . '</input>
</li>
Any ideas? Can this be done?
The file input field doesn't allow you to define a value, for security reasons; otherwise you could hide a file field and use it to grab files from unsuspecting people's computers. If you just want to display the filename of the file just uploaded, display it as formatted text.
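Since the input cannot be pre-filled, the blanked database fields have to be prevented on the server side. One common approach, sketched here in Python rather than the PHP of the question and with a hypothetical helper name, is to overwrite the stored filename only when a new file actually arrived.

```python
from typing import Optional


def filename_to_store(existing: Optional[str],
                      uploaded: Optional[str]) -> Optional[str]:
    """Pick the filename to write back to the database on an edit."""
    # A file input submits no filename when the user does not choose a file,
    # so an empty upload should not overwrite the stored name.
    if uploaded:
        return uploaded
    return existing
```

The same check in the edit handler of the original application would keep the existing name whenever the upload field is left empty.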