EDIT: I know similar questions have been asked on SO, but none are as simple as what I need :)
I have a classic ASP web page that reads a CSV file and spits it onto the page as HTML. The content in the CSV file, however, contains paragraphs with properly formed sentences.
In short, I need to display these paragraphs verbatim, punctuation included, and that punctuation includes commas.
This is a snippet of what my parsing looks like:
' Read one line and split it on commas
sRows = oInStream.ReadLine
arrRows = Split(sRows, ",")

If arrRows(0) = aspFileName And arrRows(1) = "minCamSys" Then
    minCamSys1 = arrRows(2)
    minCamSys2 = arrRows(3)
    minCamSys3 = arrRows(4)
End If
How can I alter my Split() call so that commas inside fields are displayed without breaking the CSV format?
I would prefer to use double quotes around the data that contains a comma (as is usually the CSV standard when importing to Excel). For example:
Peter,Jeff,"Jim was from Ontario, Canada",Scott
I would like to avoid the use of a library as this is a simple in-house application.
Thank you!
Well folks, the answer was right in front of my face. It's kind of silly, really, but for this application it will suffice.
I swapped out the , delimiter for a |. The new code looks like this:
sRows = oInStream.readLine
arrRows = Split(sRows,"|")
This may not be a great solution but for this simple application it is all that is necessary.
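For what it's worth, the quoted-field format mentioned in the question is exactly what off-the-shelf CSV parsers handle. A quick illustration in Python (not classic ASP, purely to show how a standard parser treats the example line):

```python
import csv
import io

# The example line from the question, with a quoted field containing a comma.
line = 'Peter,Jeff,"Jim was from Ontario, Canada",Scott'

# csv.reader honors the double quotes, so the embedded comma survives.
row = next(csv.reader(io.StringIO(line)))
print(row)  # ['Peter', 'Jeff', 'Jim was from Ontario, Canada', 'Scott']
```

Swapping the delimiter, as the accepted workaround does, only works as long as the new character can never appear in the data; a quoting-aware parser has no such restriction.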
I have a slew of JSON files I'm getting dumps of, with data from the day/period it was pulled. Most of the JSON files I'm dealing with are a lot larger than this, but I figured a smaller one would be easier to work with.
{"playlists":[{"uri":"spotify:user:11130196075:playlist:1Ov4b3NkyzIMwfY9E8ixpE","listeners":366,"streams":386,"dateAdded":"2016-02-24","newListeners":327,"title":"#Covers","owner":"Saga Prommeedet"},{"uri":"spotify:user:mickeyrose30:playlist:2Ov4b3NkyzIMwfY9E8ixpE","listeners":229,"streams":263,"dateAdded":"removed","newListeners":154,"title":"bestcovers2016","owner":"Mickey Rose"}],"top":2,"total":53820}
What I'm essentially trying to do is add a date attribute to each line of data, so that when I combine multiple JSON files to put through an analytical tool, the right row of data is associated with the correct date. My first thought was to write it as such:
{"playlists":[{"uri":"spotify:user:11130196075:playlist:1Ov4b3NkyzIMwfY9E8ixpE","listeners":366,"streams":386,"dateAdded":"2016-02-24","newListeners":327,"title":"#Covers","owner":"Saga Prommeedet"},{"uri":"spotify:user:mickeyrose30:playlist:2Ov4b3NkyzIMwfY9E8ixpE","listeners":229,"streams":263,"dateAdded":"removed","newListeners":154,"title":"bestcovers2016","owner":"Mickey Rose"}],"top":2,"total":53820,"date":072617}
since the "top" and "total" attributes are showing up on each row of data (with the associated values also showing up on each row) when I put it through an analytical tool like Tableau.
Also, I have been editing and saving the files in Brackets and testing things through this converter (https://konklone.io/json/)
In JavaScript:
var m = JSON.parse(json_string);
m["date"] = "20170804";          // add the date attribute
json_string = JSON.stringify(m); // serialize back to a string
This is very simple and will work for you.
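A slightly fuller sketch of the same idea (the embedded json_string and the date value here are made-up stand-ins for one of the dump files):

```javascript
// A trimmed stand-in for one day's dump file.
var json_string = '{"playlists":[{"title":"#Covers"}],"top":2,"total":53820}';

var m = JSON.parse(json_string);    // string -> object
m.date = "2017-08-04";              // hypothetical date this dump was pulled
var tagged = JSON.stringify(m);     // object -> string, now carrying the date

console.log(tagged);
```

Note that the date is stored as a string rather than a bare number: "date":072617 from the question is not valid JSON, because JSON numbers cannot have a leading zero.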
I have a CSV file which I want to convert to Parquet for further processing. Using
sqlContext.read()
.format("com.databricks.spark.csv")
.schema(schema)
.option("delimiter",";")
.(other options...)
.load(...)
.write()
.parquet(...)
works fine when my schema contains only Strings. However, some of the fields are numbers that I'd like to be able to store as numbers.
The problem is that the file arrives not as an actual "csv" but as a semicolon-delimited file, and the numbers are formatted with German notation, i.e. a comma is used as the decimal separator.
For example, what in the US would be 123.01 is stored in this file as 123,01.
Is there a way to force reading the numbers under a different Locale, or some other workaround that would allow me to convert this file without first converting the CSV to a different format? I looked in the Spark code, and one nasty thing that seems to be causing the issue is in CSVInferSchema.scala line 268 (Spark 2.1.0): the parser enforces US formatting rather than, for example, relying on the Locale set for the JVM or allowing this to be configured.
I thought of using a UDT but got nowhere with that; I can't work out how to get it to let me handle the parsing myself (I couldn't really find a good example of using UDTs...).
Any suggestions on a way of achieving this directly, i.e. on parsing step, or will I be forced to do intermediate conversion and only then convert it into parquet?
For anybody else who might be looking for an answer, the workaround I went with (in Java) for now is:
JavaRDD<Row> convertedRDD = sqlContext.read()
.format("com.databricks.spark.csv")
.schema(stringOnlySchema)
.option("delimiter",";")
.(other options...)
.load(...)
.javaRDD()
.map(this::conversionFunction);
sqlContext.createDataFrame(convertedRDD, schemaWithNumbers).write().parquet(...);
The conversion function takes a Row and needs to return a new Row with fields converted to numerical values as appropriate (or, in fact, this could perform any conversion). Rows in Java can be created by RowFactory.create(newFields).
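The numeric part of such a conversion function can lean on java.text.NumberFormat with a German locale; a minimal sketch (the class and method names here are illustrative, not from the original code):

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class GermanNumberParser {

    // Parse a German-formatted number ("123,01", "1.234,5") into a double.
    static double parseGerman(String s) {
        NumberFormat nf = NumberFormat.getInstance(Locale.GERMANY);
        try {
            return nf.parse(s).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a German-formatted number: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseGerman("123,01")); // prints 123.01
    }
}
```

Inside the row-mapping function, each string field that the target schema declares as numeric would be passed through a parser like this before being handed to RowFactory.create.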
I'd be happy to hear any other suggestions on how to approach this, but for now this works. :)
I want to add a link using an existing string in my wiki page.
This string will be appended to a url to form a complete URL.
This string consists of many words, for example "Crisis Management in International Computing"
I want to split it on the space character (" ") and then construct this string: "Crisis+Management+in+International+Computing"
Here is the String variable I have in my wiki page:
{{SUBJECTPAGENAME}}
Note: I have to check first whether the string consists of more than one word; if it is just a single word like "Crisis", I won't perform the split.
I searched the web but did not find a clear way to accomplish this.
Has anyone dealt with something like this?
If I understand correctly from the comments, you want to replace all occurrences of the space character in your string with +. That can be done with the string functions of the ParserFunctions extension.
If you are running a fairly recent version of MediaWiki (>1.18, check by going to Special:Version), the ParserFunctions extension is bundled with the software. You just need to enable it by adding the following to LocalSettings.php:
require_once "$IP/extensions/ParserFunctions/ParserFunctions.php";
$wgPFEnableStringFunctions = true;
Then you will be able to write e.g.
{{#replace: {{SUBJECTPAGENAME}} |<nowiki> </nowiki>|+}}
Note however that if all you really want is a url version of a page name, you can just use {{SUBJECTPAGENAMEE}} instead of {{SUBJECTPAGENAME}}.
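For illustration, on a page whose subject page name is the title from the question, the two variables would expand roughly as follows (note the E variant uses underscores rather than +, but underscores are what MediaWiki page URLs expect for spaces):

```
{{SUBJECTPAGENAME}}  → Crisis Management in International Computing
{{SUBJECTPAGENAMEE}} → Crisis_Management_in_International_Computing
```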
I would recommend going with a custom parser function.
Or, as a hack, try splitting the string using the arraymaptemplate parser function that comes as part of Semantic Forms.
URL : arraymaptemplate parser function.
You can use an intro template to create the link and use an array template to split and add the words to the intro template.
I have not tried it with a space as the delimiter character, but from the documentation it seems it should work using the HTML encoding for a space.
I have three files: config.txt, Temp1.txt, and Temp2.txt. I have used a regex to fetch some values from config.txt. I want to substitute those values (which appear under the same names in Temp1.txt and Temp2.txt) and create another two files, say Temp1_new.txt and Temp2_new.txt.
For example: config.txt has a value named IP1, and the same name appears in Temp1.txt and Temp2.txt. I want to create Temp1_new.txt and Temp2_new.txt, replacing IP1 with, say, 192.X.X.X.
I would appreciate it if someone could help me with Tcl code to do this.
Judging from the information provided, there basically are two ways to do what you want:
File-semantics-aware;
Brute-force.
The first way is to read the source file, parse it to produce certain structured in-memory representation of its content, then serialize this content to the new file after replacing the relevant value(s) in the produced representation.
The brute-force method means treating the contents of the source file as plain text (or a series of text strings) and running something like regsub or string replace on this text to produce the new text, which you then save to the new file.
The first way should generally be favoured, especially for complex cases, as it removes any chance of replacing irrelevant bits of text. The brute-force way may be simpler to code (if there's no handy library to do this, see below) and is therefore good for throw-away scripts.
Note that for certain file formats there are ready-made libraries which can be used to automate what you need. For instance, the XSLT facilities of the tdom package can be used to manipulate XML files, INI-style files can be modified using the appropriate library, and so on.
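Applied to the question above, the brute-force route can be sketched in a few lines of Tcl (the file names come from the question; the IP1 value and the idea of holding the regex results as a key/value list are assumptions):

```tcl
# Replacements fetched from config.txt, e.g. via regexp; values assumed here.
set mapping {IP1 192.0.2.1}

# Rewrite each template file into its _new counterpart.
foreach {in out} {Temp1.txt Temp1_new.txt Temp2.txt Temp2_new.txt} {
    set f [open $in r]
    set text [read $f]
    close $f
    set f [open $out w]
    puts -nonewline $f [string map $mapping $text]
    close $f
}
```

string map does plain substring replacement, so unlike a regsub pattern it needs no escaping of special characters in the keys.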
I'm looking for the easiest way to turn a CSV file (of floats) into a float list. I'm not well acquainted with reading files in OCaml generally, so I'm not sure what this sort of function entails.
Any help or direction is appreciated :)
EDIT: I'd prefer not to use a third party CSV library unless I absolutely have to.
https://forge.ocamlcore.org/projects/csv/
If you don't want to include a third-party library, and your CSV files are simply formatted with no quotes or embedded commas, you can parse them easily with standard library functions. Use read_line in a loop or in a recursive function to read each line in turn. To split each line, call Str.split_delim (link your program with str.cma or str.cmxa). Call float_of_string to parse each column into a float.
let comma = Str.regexp ","
let parse_line line = List.map float_of_string (Str.split_delim comma line)
Note that this will break if your fields contain quotes. It would be easy to strip quotes at the beginning and at the end of each element of the list returned by split_delim. However, if there are embedded commas, you need a proper CSV parser. You may have embedded commas if your data was produced by a localized program in a French locale — French uses commas as the decimal separator (e.g. English 3.14159, French 3,14159). Writing floating point data with commas instead of dots isn't a good idea, but it's something you might encounter (some spreadsheet CSV exports, for example). If your data comes out of a Fortran program, you should be fine.
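Putting those pieces together, a complete reader might look like this (read_csv is a made-up name; link with str.cma / str.cmxa as noted above):

```ocaml
let comma = Str.regexp ","

let parse_line line =
  List.map float_of_string (Str.split_delim comma line)

(* Read a whole file into a float list list, one inner list per CSV row. *)
let read_csv filename =
  let ic = open_in filename in
  let rec loop acc =
    match input_line ic with
    | line -> loop (parse_line line :: acc)
    | exception End_of_file -> close_in ic; List.rev acc
  in
  loop []
```

The accumulator keeps the loop tail-recursive, so arbitrarily long files won't overflow the stack; List.rev at the end restores the original row order.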