Power BI - Excel data source contains JSON in a column

I have an Excel sheet that has three columns: A, B and C.
A and B contain regular text: a first name and last name, if you will. The third column, C, contains JSON data.
Is there a way I can read this file into Power BI and have it automatically parse the JSON into additional columns? In Power BI Desktop I can use an Excel sheet as the data source, and it loads my data into the client; however, it naturally treats column C as plain text. I've had a look at the Advanced Editor, and I'm thinking I might need to add something there to parse it out.
Any ideas?

I figured it out. In the Query Editor, right-click the column that contains the JSON, go to Transform and select JSON. It will parse out the data, allowing you to add the parsed fields as additional columns.
Extremely handy!
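If you would rather flatten the JSON before the file ever reaches Power BI, here is a minimal pandas sketch of the same idea (the file name source.xlsx and the column names A, B and C are assumptions taken from the question):

import json
import pandas as pd

# Read the workbook; column C holds the JSON text.
df = pd.read_excel("source.xlsx")

# Parse each JSON string and expand its keys into their own columns.
parsed = pd.json_normalize(df["C"].apply(json.loads).tolist())

# Stitch the new columns back alongside A and B and write a flat file.
flat = pd.concat([df.drop(columns=["C"]), parsed], axis=1)
flat.to_excel("flattened.xlsx", index=False)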

Related

How can I write a condition in SSIS to verify that a string column doesn't contain numbers?

I have an Excel source feeding a database, but before I transmit the information in the column product-name I have to verify that the value is a string without numbers in it (for example, "lion" is correct but "lion124" is wrong), using a Conditional Split.
After the verification I have to send a message in an Excel file telling the user that the column he wrote is not correct; if it is correct, I will send it to the database.
How can I check the column, and how can I send an Excel file?
I would run it through a Script transformation and use C#.
Add a Boolean column to your data and set it like this (the script needs using System.Linq; at the top):
Row.NewColForIntTest = Row.YourStringColumn.Any(char.IsDigit);
Then conditionally split on the new column.
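If you would rather validate the file before it enters SSIS at all, here is a rough pandas equivalent of the same rule (file and column names are hypothetical):

import pandas as pd

df = pd.read_excel("products.xlsx")

# True for any product name containing a digit, e.g. "lion124".
has_digit = df["product_name"].str.contains(r"\d", na=False)

# Send the offending rows back to the user as an Excel file; the clean
# rows would continue on to the database load.
df[has_digit].to_excel("invalid_product_names.xlsx", index=False)
clean = df[~has_digit]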

Advanced mapping of JSON in Azure Data Factory - some guidance requested

I'm trying to map a JSON document (sensor data) into a more meaningful representation using Mapping Data Flows. However, I'm having a hard time getting this to work and would really appreciate some insight/recommendations on how to solve the following.
The input is:
What I would like to end up with is the following:
Any pointers as to how this can be implemented are more than welcome.
This can be accomplished using the Copy activity followed by the split function in a Derived Column transformation in Azure Data Factory.
Use the Copy activity to read the JSON file as the source and, in the sink, use a SQL database to store the data as a table. In the Mapping tab, import the schema and map the JSON records to the corresponding column names. Refer to this third-party tutorial for guidance: https://sqlkover.com/dynamically-map-json-to-sql-in-azure-data-factory/
Then use the Data Flow activity and choose as its source the SQL table you used as the sink above.
Select the Derived Column transformation.
Use the split function.
Add the column that will receive the split values, as shown below.
Use split(<column_name_to_split>, '_') to split the column on the _ delimiter, changing <column_name_to_split> to the name of the column you want to split. Refer to the image below.
Preview the data to check the result.
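For intuition, here is the same split expressed in pandas; the column name reading and the three target names are purely illustrative:

import pandas as pd

df = pd.DataFrame({"reading": ["deviceA_temp_21.5"]})

# Split on the _ delimiter, one new column per part, mirroring
# split(<column_name_to_split>, '_') in the Derived Column expression.
parts = df["reading"].str.split("_", expand=True)
parts.columns = ["device", "metric", "value"]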

AWS Glue: crawler does not identify metadata when CSV contains string and timestamp/date values

I have come across one thing when using a CSV as input to a crawler: the crawler doesn't identify the column headers when all the data in the CSV is in string format.
#P1: Headers are displayed as col0, col1, ..., colN.
#P2: The actual column names are treated as data.
#P3: The metadata is wrong, i.e. the column datatype is shown as string even though the CSV dataset contains date/timestamp values.
If we use a custom (CSV) classifier, we have to specify the column headers manually.
#P2 is then covered, i.e. the column names are no longer treated as data; however,
#P1 remains the same: the column headers are still displayed as col0, col1, ..., colN.
There are three things I want to fix in order to achieve the expected result:
A CSV with strings only should show the actual column names instead of col0, col1, ..., colN.
The metadata of the generated table should be shown correctly (i.e. date/timestamp, string) once it is crawled by the crawler.
If a custom classifier is used, we need to specify the column header names manually in the classifier, yet the result is still not satisfactory.
I need a generic solution instead of manual intervention.
I have gone through this document: here
If anyone has already implemented a solution, please help.
I got a solution to one of the above points: the headers, i.e. the first line of the CSV, are displayed by enabling 'Has heading' in the CSV classifier.
However, solutions to the following are yet to be figured out:
The metadata of the CSV file is shown as string even if a column contains timestamp/date values; the crawler reads these datatypes as string.
The custom classifier needs manual intervention, since I have to list all column names in the classifier. Is there a generic solution?
If you are using pandas' to_csv to write the dataframe, then to avoid getting column names such as col1, col2 and so on, add the index_label parameter. Note that to_csv is a DataFrame method, not a top-level pandas function:
df.to_csv('output.csv', index_label='index')
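A small usage sketch (the dataframe contents are made up): writing an explicit header row and a consistent timestamp format should also give the crawler its best chance of classifying date/timestamp columns, though whether Glue picks them up still depends on its built-in classifiers.

import pandas as pd

df = pd.DataFrame({
    "name": ["lion"],
    "created_at": [pd.Timestamp("2021-01-15 10:30:00")],
})

# index_label names the index column; date_format keeps timestamps uniform.
df.to_csv("output.csv", index_label="index", date_format="%Y-%m-%d %H:%M:%S")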

Can we compare columns of multiple input files to derive a new column in SSIS?

I am trying to create a derived column based on columns provided in different input files, but unfortunately I keep getting an error when I try to map my Raw_File_1 to the Derived Column. The error looks like this:
Cannot create a connector.
The destination component does not have any available inputs for use in creating a path.
My goal is to be able to connect both Raw_File_1 and Map_File_1 to the Derived Column and generate a new column.
If anyone can provide any suggestions, that would be great!
I have a source file and a reference file; both are flat files. My source file has column a, column b and column c, and my reference file has column d, column e and column f.
If column a = column d and column b = column e, then I want to populate column c with the same value as column f. How can I do this kind of analysis or lookup in SSIS?
Based on your comments that I patched into the question, you're looking to augment the existing data based on matching data from your reference file.
The core of your SSIS package will look like this
In the first data flow, we will source from map_file_1 and load into a "raw" file.
I configure my raw file destination like this
When the package runs, it'll fill that special format file with the reference data. It's important, because you can either use a database or a raw file as your lookup source.
Finally, we get to work: a Flat File source into a Lookup component. On the first tab of that Lookup, be sure to change the Connection type from the default "OLE DB connection manager" to "Cache connection manager".
On the Connection tab, click to create a new CCM and use the raw file generated in the preceding step.
Map columns A to D and B to E (assuming the data types match). Tick the check box on column F and, in the Lookup Operation part, replace C with that value.
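For reference, the matching logic the Lookup performs is equivalent to this pandas sketch (the frames are made-up stand-ins for the source and reference files):

import pandas as pd

src = pd.DataFrame({"a": ["x1", "x2"], "b": ["y1", "y2"], "c": [None, None]})
ref = pd.DataFrame({"d": ["x1"], "e": ["y1"], "f": ["lookup value"]})

# Match a->d and b->e, then replace c with f where a match exists.
merged = src.merge(ref, left_on=["a", "b"], right_on=["d", "e"], how="left")
merged["c"] = merged["f"]
result = merged[["a", "b", "c"]]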
Final thoughts
This will be a case-sensitive lookup. If a row doesn't find a match in the reference file, the component is going to blow up. That's probably not what you want, so configure the Lookup transformation to not do that ;)
I blogged about using Excel to populate the cache if you want more words http://billfellows.blogspot.com/2011/11/using-excel-in-ssis-lookup.html
Your question is not clear; I will try to give some suggestions.
If you are looking to perform a lookup with a derived column, you can use the Cache Transform component and the Cache connection manager to achieve that:
SSIS - How To Use Flat File Or Excel File In Lookup Transformation [Cache Transformation]
If you are looking to merge both inputs, then you need to use the Merge Join or Union All components:
SSIS Union All Transformation
Learn SSIS : MERGE, MERGE JOIN and UNION ALL
SSIS Basics: Using the Merge Join Transformation
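To see how the two options differ, here is a toy pandas sketch (frame and column names are hypothetical): Union All stacks rows, while Merge Join matches rows on a key.

import pandas as pd

raw_file_1 = pd.DataFrame({"key": [1, 2], "value": ["a", "b"]})
map_file_1 = pd.DataFrame({"key": [2, 3], "value": ["b", "c"]})

# Union All: append the rows of one input to the other.
union_all = pd.concat([raw_file_1, map_file_1], ignore_index=True)

# Merge Join: combine columns from both inputs by matching on a key.
merge_join = raw_file_1.merge(map_file_1, on="key", how="inner")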

How to Get Data from a CSV File and Send It to Excel Using Pentaho?

I have a tabular CSV file with seven columns, containing the following data:
ID,Gender,PatientPrefix,PatientFirstName,PatientLastName,PatientSuffix,PatientPrefName
2 ,M ,Mr ,Lawrence ,Harry , ,Larry
I am new to Pentaho and I want to design a transformation that moves the data (the values of the seven columns) to an empty Excel sheet. The Excel sheet has different column names but should carry the same data, as shown:
prefix_name,first_name,middle_name,last_name,maiden_name,suffix_name,Gender,ID
I tried to design a transformation using the following series of steps, but it gives me errors at the end that I could not interpret.
What is the proper design to move the data from the CSV file to the Excel sheet in this case? Any ideas to solve this problem?
As Brian.D.Myers mentioned in the comments, you can use the Select values step. Here is a step-by-step explanation:
Select all the fields in the CSV file input step.
Configure the Select values step as follows.
In the Content tab of the Excel writer step, click the Get fields button and fill in the fields. Alternatively, you can use the Excel output step.
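For clarity, the column mapping itself (independent of Pentaho) can be sketched in pandas like this; middle_name and maiden_name have no source column in the CSV, so they are left empty, and PatientPrefName has no obvious target, so it is dropped (both are assumptions about the intended mapping):

import pandas as pd

df = pd.read_csv("patients.csv", skipinitialspace=True)

out = pd.DataFrame({
    "prefix_name": df["PatientPrefix"],
    "first_name": df["PatientFirstName"],
    "middle_name": "",
    "last_name": df["PatientLastName"],
    "maiden_name": "",
    "suffix_name": df["PatientSuffix"],
    "Gender": df["Gender"],
    "ID": df["ID"],
})
out.to_excel("patients.xlsx", index=False)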