MySQL: separate values in one column into many - mysql

I am retrieving data from a MySQL database. All the data is in one column, and I need to separate it into several columns. The structure of this column is as follows:
{{product ID=001 |Country=Netherlands |Repository Link=http://googt.com |Other Relevant Information=test }} ==Description== this are the below codes: code 1 code2 ==Case Study== case study 1 txt case study 2 txt ==Benefits== ben 1 ben 2 === Requirements === (empty col) === Architecture === *arch1 *arch2
So I want columns like: Product ID, Country, Repository Link, Architecture, etc.

If you simply want to parse the contents of your column, the details depend on the language you are using.
However, the general procedure is as follows.
1, pull the column value into a string
2, find a delimiter (in your case it appears '|' will do)
3, you have two options here (again depending on language)
A, Split the string into an array of segments
1, Loop over the array to print each section, OR use the array
to manipulate the data individually (your choice)
B, With a simple string method, either build a new string or replace all
instances of '|' with '\n' (newline character) so that you can display all the data.
I recommend the array conversion, as it lets you work with each piece of data individually.
Structured formats such as JSON are often stored in single fields like this, for various reasons.
Here is an example in PHP using explode():
$unparsed = "this | is | a | string that is | not: parsed";
// Split on '|' and trim the whitespace left around each piece
$parsed = array_map('trim', explode("|", $unparsed));
echo $parsed[2]; // prints: a
echo $parsed[4]; // prints: not: parsed
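The same split-on-delimiter idea can be applied to the column format shown in the question. The following is a minimal sketch in Python, not a definitive parser: parse_record is a hypothetical helper, and it assumes that every value starts with one {{ ... }} block of key=value pairs separated by '|', followed by sections headed by ==Name== or === Name ===, as in the sample.

```python
import re


def parse_record(text):
    """Split one stored column value into named fields (hypothetical helper)."""
    fields = {}

    # 1. key=value pairs inside the leading {{ ... }} block, separated by '|'.
    header = re.search(r"\{\{(.*?)\}\}", text, flags=re.DOTALL)
    if header:
        for part in header.group(1).split("|"):
            # Split on the first '=' only, so values may contain '='.
            key, sep, value = part.partition("=")
            if sep:
                fields[key.strip()] = value.strip()

    # 2. Sections introduced by ==Heading== or === Heading ===.
    body = text[header.end():] if header else text
    pieces = re.split(r"={2,3}\s*(.+?)\s*={2,3}", body)
    # With one capture group, re.split returns
    # [before, name1, content1, name2, content2, ...].
    for name, content in zip(pieces[1::2], pieces[2::2]):
        fields[name] = content.strip()

    return fields
```

Each key of the returned dictionary can then be written to its own column. A value that itself contains '|' or '==' would need a stricter parser than this sketch.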

Related

Extract string from csv file after reading in Prolog

Good evening,
I am trying to read a csv file in Prolog containing all the countries in the world. Executing this code:
read_KB(R) :- csv_read_file("countries.csv",R).
I get a list of Terms of this type:
R = [row('Afghanistan;'), row('Albania;'), row('Algeria;'), row('Andorra;'), row('Angola;'), row('Antigua and Barbuda;'), row('Argentina;'), row('Armenia;'), row(...)|...].
I would like to extract only the names of each country in form of a String and put all of them into a list of Strings.
I tried this way with only the first row executing this:
read_KB(L) :- csv_read_file("/Users/dylan/Desktop/country.csv",R),
give(R,L).
give([X|T],X).
I obtain only a Term of type row('Afghanistan;')
You can use maplist/3:
read_KB(Names) :-
csv_read_file('countries.csv', Rows, [separator(0';)]),
maplist([row(Name,_), Name] >> true, Rows, Names).
The answer given by @slago can be simplified, using arg/3 instead of a lambda expression, making it slightly more efficient:
read_KB(Names) :-
csv_read_file('countries.csv', Rows, [separator(0';)]),
maplist(arg(1), Rows, Names).

How to remove empty cells from csv while parsing csv using PapaParse?

Or to put the question another way: Why is PapaParse's ParseResult.data an empty array when trimming all leading and trailing empty cells during Papa.step() function? EDIT: Please note I can achieve what I'm wanting by mapping over the parsed results and trimming, but I don't want to parse and then map, I'd rather do it all in one go.
Example CSV:
Col 1,Col 2,Col 3
1-1,1-2,
,2-2,2-3
3-1,3-2,3-3
Note that row 1 contains headers (Col 1, Col 2, etc). Row 2 col 3 is empty, and
row 3 col 1 is empty.
Given that CSV, I want to present this back to the user (as a nicely-formatted
table):
| | | |
|-----|-----|-----|
| 1-1 | 1-2 | |
| 2-2 | 2-3 | |
| 3-1 | 3-2 | 3-3 |
I want to push all rows as far to the left as they can go, and remove all empty
cells from the end of each row.
In other words, I want to trim all empty cells from both the beginning and the
end of each row. Below is the code I'm using. I have put debuggers inside of
trimEmptyCells and it is doing exactly as expected. However, the ParseResult
that parseAndTrim returns contains an empty data array.
export const parseAndTrim = (csv: string): Papa.ParseResult => {
return Papa.parse(csv, {
skipEmptyLines: true,
step: trimEmptyCells,
})
};
const trimEmptyCells = (results: Papa.ParseResult) => {
// Note that `_.dropWhile` and `_.dropRightWhile` are [lodash
// functions](https://lodash.com/docs/4.17.15#dropRight).
const leftTrimmed = _.dropWhile(results.data, (r) => r === "");
return _.dropRightWhile(leftTrimmed, (r) => r === "");
};
My first guess was
that PapaParse was experiencing errors with arrays with different lengths, but
the errors array is also empty. So I tested what I could (no step function)
at https://www.papaparse.com/demo using the example below and simply having
missing cells (not merely empty) throws no errors and returns a proper data
array.
Example test input at https://www.papaparse.com/demo
Col 1,Col 2,Col 3
1-1,1-2
,2-2,2-3
Based on this comment from pokoli (the #2 contributor to PapaParse and the #1 contributor since early 2017), I believe this is impossible. pokoli's proposed solution is
You should use Papa.parse to read records as array, filter them and then use Papa.Unparse to write the second file.
I wish I could mutate data while parsing to be faster, but PapaParse is very fast. I was able to parse a 36,000-line csv in under 300ms, and unparse in twice the time. Parsing a 2,000-line csv took under 30ms and unparse again took twice the time. My use case will involve CSVs under 2,000 lines 99% of the time so parsing into 2d array, filtering, unparsing back into csv, then parsing again into json won't take too long.
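The parse-first, trim-afterwards approach described above can be sketched as follows. This is only an illustration in Python using the standard csv module rather than PapaParse, and parse_and_trim is a hypothetical helper name.

```python
import csv
import io


def parse_and_trim(text):
    """Parse CSV text, then strip empty cells from both ends of every row."""
    trimmed = []
    for row in csv.reader(io.StringIO(text)):
        start, end = 0, len(row)
        # Skip leading empty cells.
        while start < end and row[start] == "":
            start += 1
        # Skip trailing empty cells.
        while end > start and row[end - 1] == "":
            end -= 1
        if start < end:  # drop rows that contain no non-empty cell
            trimmed.append(row[start:end])
    return trimmed
```

Only cells at the ends of a row are removed; an empty cell between two non-empty cells is kept, so the remaining values stay in their original order.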

Kusto KQL reference first object in a JSON array

I need to grab the value of the first entry in a json array with Kusto KQL in Microsoft Defender ATP.
The data format looks like this (anonymized), and I want the value of "UserName":
[{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}]
How do I split or in any other way get the "UserName" value?
In WDATP/MSTAP, for the "LoggedOnUsers" type of arrays, you want "mv-expand" (multi-value expand) in conjunction with "parsejson".
"parsejson" will turn the string into JSON, and mv-expand will expand it into LoggedOnUsers.UserName, LoggedOnUsers.DomainName, and LoggedOnUsers.Sid:
DeviceInfo
| mv-expand parsejson(LoggedOnUsers)
| project DeviceName, LoggedOnUsers.UserName, LoggedOnUsers.DomainName
Keep in mind that if the packed field has multiple entries (like DeviceNetworkInfo's IPAddresses field often does), the entire row will be expanded once per entry. A row for a machine with 3 entries in "IPAddresses" will therefore appear 3 times, each with a different expansion of IPAddresses:
DeviceNetworkInfo
| where Timestamp > ago(1h)
| mv-expand parsejson(IPAddresses)
| project DeviceName, IPAddresses.IPAddress
To access the first entry's UserName property, you can do the following:
print d = dynamic([{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}])
| extend result = d[0].UserName
To get the UserName for all entries, you can use mv-expand/mv-apply:
print d = dynamic([{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}])
| mv-apply d on (
project d.UserName
)
Thanks for the reply, but the proposed solution didn't work for me. Instead, I found the following solution:
project substring(split(split(LoggedOnUsers,',',0),'"',4),2,9)
The output of this is: UserName

CSV Parser through angularJS

I am building a CSV file parser with Node and Angular. A user uploads a CSV file, and on the server side (Node) the file is traversed and parsed using node-csv. This works fine and returns an array of objects based on the CSV file given as input. On the Angular end I need to display two tables: one is the CSV file data itself, and the other is a cross-tabulation analysis. I am facing a problem while rendering the data, so for a table like
I am getting the parse response as
For cross tabulation we need data in a tabular form as
I have an object array which I need to manipulate in the best possible way so that it renders easily on an HTML page. I cannot work out how to do the calculation on the data I get so as to store the cross-tabulation result. Any idea how I should approach this?
The data JSON is:
[{"Sample #":"1","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"2","Gender":"Male","Handedness;":"Left-handed;"},{"Sample #":"3","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"4","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":"5","Gender":"Male","Handedness;":"Left-handed;"},{"Sample #":"6","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":"7","Gender":"Female","Handedness;":"Right-handed;"},{"Sample #":"8","Gender":"Female","Handedness;":"Left-handed;"},{"Sample #":"9","Gender":"Male","Handedness;":"Right-handed;"},{"Sample #":";"}
There are many ways you can do this and since you have not been very specific on the usage, I will go with the simplest one.
Assuming you have an object structure such as this:
[
{gender: 'female', handedness: 'lefthanded', id: 1},
{gender: 'male', handedness: 'lefthanded', id: 2},
{gender: 'female', handedness: 'righthanded', id: 3},
{gender: 'female', handedness: 'lefthanded', id: 4},
{gender: 'female', handedness: 'righthanded', id: 5}
]
and in your controller you have exposed this with something like:
$scope.members = [the above array of objects];
and you want to display the total of female members of this object, you could filter this in your html
{{(members | filter:{gender:'female'}).length}}
Now, if you are going to make this a table, putting these expressions inline will make for ugly and unreadable HTML. Especially if you are going to reuse it, this is a good case for a directive that you can repeat anywhere, with the prerequisite of providing a scope object named tabData (or whatever you wish) in your parent scope
.directive('tabbed', function () {
return {
restrict: 'E',
template: '<table><tr><td>{{(tabData | filter:{gender:"female"}).length}}</td></tr><tr><td>{{(tabData | filter:{handedness:"lefthanded"}).length}}</td></tr></table>'
}
});
You would use this in your html like so:
<tabbed></tabbed>
And there are of course many ways to improve this as you wish.
This is more of a general data structure/JS question than Angular related.
Functional helpers from Lo-dash come in very handy here:
_(data) // Create a chainable object from the data to execute functions with
.groupBy('Gender') // Group the data by its `Gender` attribute
// map these groups, using `mapValues` so the named `Gender` keys persist
.mapValues(function(gender) {
// Create named count objects for all handednesses
var counts = _.countBy(gender, 'Handedness');
// Calculate the total of all handednesses by summing
// all the values of this named object
counts.Total = _(counts)
.values()
.reduce(function(sum, num) { return sum + num });
// Return this named count object -- this is what each gender will map to
return counts;
}).value(); // get the value of the chain
No need to worry about for-loops or anything of the sort, and this code also works without any changes for more than two genders (even for more than two handednesses - think of the aliens and the ambidextrous). If you aren't sure exactly what's happening, it should be easy enough to pick apart the single steps and their result values of this code example.
Calculating the total row for all genders will work in a similar manner.
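For readers without Lo-dash at hand, the same group-then-count idea can be sketched with plain dictionaries. This is a minimal sketch in Python, not the answer's code: cross_tabulate is a hypothetical helper, and the key names assume the CSV headers were parsed cleanly as 'Gender' and 'Handedness', without the stray ';' seen in the posted JSON.

```python
from collections import Counter, defaultdict


def cross_tabulate(rows, row_key, col_key):
    """Count rows per (row_key, col_key) pair and add a Total per row_key value."""
    table = defaultdict(Counter)
    for row in rows:
        # Skip incomplete records, such as a trailing partial row.
        if row_key in row and col_key in row:
            table[row[row_key]][row[col_key]] += 1

    result = {}
    for group, counts in table.items():
        result[group] = dict(counts)
        result[group]["Total"] = sum(counts.values())
    return result


rows = [
    {"Gender": "Female", "Handedness": "Right-handed"},
    {"Gender": "Male", "Handedness": "Left-handed"},
    {"Gender": "Female", "Handedness": "Right-handed"},
    {"Gender": "Male", "Handedness": "Right-handed"},
]
print(cross_tabulate(rows, "Gender", "Handedness"))
# {'Female': {'Right-handed': 2, 'Total': 2},
#  'Male': {'Left-handed': 1, 'Right-handed': 1, 'Total': 2}}
```

The result has one entry per gender, which maps directly onto one table row per gender when rendering.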

Parsing numerical data using Prolog?

I am new to prolog and am considering using it for a small data analysis application. Here is what I am seeking to accomplish:
I have a CSV file with some data of the following form:
a,b,c
d,e,f
g,h,i
...
The data is purely numerical and I need to do the following: 1st, I need to group rows according to the following scheme:
So what's going on above?
I start at the 1st row, which has value 'a' in column one. Then, I keep going down the rows until I hit a row whose value in column one differs from 'a' by a certain amount, 'z'. The process is then repeated, and many "groups" are formed after the process is complete.
For each of these groups, I want to find the mean of columns two and three (as an example, for the 1st group in the picture above, the mean of column two would be: (b+e+h)/3).
I am pretty sure this can be done in prolog. However, I have 50,000+ rows of data and since prolog is declarative, I am not sure how efficient prolog would be at accomplishing the above task?
Is it feasible to work out a prolog program to accomplish the above task, so that efficiency of the program is not significantly lower than a procedural analog?
This snippet could be a starting point for your task:
:- [library(dcg/basics)].
rownum(Z, AveList) :- phrase_from_file(row_scan(Z, [], [], AveList), 'numbers.txt').
row_scan(Z, Group, AveSoFar, AveList) -->
number(A),",",number(B),",",number(C),"\n",
{ row_match(Z, A,B,C, Group,AveSoFar, Group1,AveUpdated) },
row_scan(Z, Group1, AveUpdated, AveList).
row_scan(_Z, _Group, AveList, AveList) --> "\n";[].
% row_match(Z, A,B,C, Group,Ave, Group1,Ave1)
row_match(_, A,B,C, [],Ave, [(A,B,C)],Ave).
row_match(Z, A,B,C, [H|T],Ave, Group1,Ave1) :-
H = (F,_,_),
( A - F =:= Z
-> aggregate_all(agg(count,sum(C2),sum(C3)),
member((_,C2,C3), [(A,B,C), H|T]), agg(Count,T2,T3)),
A2 is T2/Count, A3 is T3/Count,
Group1 = [], Ave1 = [(A2,A3)|Ave]
; Group1 = [H,(A,B,C)|T], Ave1 = Ave
).
with this input
1,2,3
4,5,6
7,8,9
10,2,3
40,5,6
70,8,9
16,0,0
yields
?- rownum(6,L).
L = [ (3.75, 4.5), (5, 6)]
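For comparison with a procedural version, the same grouping rule can be sketched in Python. This is a minimal sketch that mirrors the Prolog snippet's behaviour rather than a definitive reading of the question: group_means is a hypothetical helper, and a group is closed only when column one is exactly z above the first row of the group. If "differs by z" should mean "at least z", the == comparison would become >=.

```python
def group_means(rows, z):
    """Return (mean of column two, mean of column three) for each closed group."""
    means = []
    group = []
    for row in rows:
        group.append(row)
        # Close the group when this row is exactly z above the group's first row.
        if len(group) > 1 and row[0] - group[0][0] == z:
            n = len(group)
            means.append((sum(r[1] for r in group) / n,
                          sum(r[2] for r in group) / n))
            group = []
    # Rows left in an unfinished group are ignored, as in the Prolog version.
    return means


rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 2, 3),
        (40, 5, 6), (70, 8, 9), (16, 0, 0)]
print(group_means(rows, 6))  # [(5.0, 6.0), (3.75, 4.5)]
```

The means come out in input order, whereas the Prolog version builds its list by prepending and so returns the same pairs reversed. It is a single pass over the rows, so 50,000 rows should not be a problem in either language.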