How to find all .csv files in a directory using Pharo?

How can I find all files ending in .csv in a given directory using Pharo?

This will work too:
'G:\My Drive\Data Mining' asFileReference allChildrenMatching: '*.csv'

Use basename and endsWith: for the children of the directory (FileReference). From http://pharobooks.gforge.inria.fr/PharoByExampleTwo-Eng/latest/FileSystem.pdf:
working := 'G:\My Drive\Data Mining' asFileReference.
working allChildren select: [ :each | each basename endsWith: '.csv' ]
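If you only need the files directly inside the directory (without descending into subfolders), the same selection works on the immediate children; a minimal variation of the snippet above, assuming the standard FileReference protocol:

working := 'G:\My Drive\Data Mining' asFileReference.
working children select: [ :each | each basename endsWith: '.csv' ]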

Related

Paths from json file don't expand in Snakemake

I have a Snakemake pipeline in which I read the input/output folder paths from a JSON file and use the expand function to build the file paths.
import json

with open('config.json', 'r') as f:
    config = json.load(f)

wildcard = ["1234", "5678"]

rule them_all:
    input:
        expand('config["data_input"]/data_{wc}.tab', wc = wildcard)
    output:
        expand('config["data_output"]/output_{wc}.rda', wc = wildcard)
    shell:
        "Rscript ./my_script.R"
My config.json is:
{
    "data_input": "/very/long/path",
    "data_output": "/slightly/different/long/path"
}
While trying to make a dry run, though, I get the following error:
$ snakemake -np
Building DAG of jobs...
MissingInputException in line 12 of /path/to/Snakefile:
Missing input files for rule them_all:
config["data_input"]/data_1234.tab
config["data_input"]/data_5678.tab
The files are there and their path is /very/long/path/data_1234.tab.
This is probably low-hanging fruit, but what am I doing wrong in the syntax of the expansion? Or is it the way I load the JSON file?
expand() does not evaluate dictionary accesses inside its quoted first argument: the string config["data_input"] is taken literally rather than substituted. Instead, pass the dictionary value to expand() as an additional placeholder, just like a wildcard.
The correct syntax, in this case, would be e.g.
expand('{input_folder}/data_{wc}.tab', wc = wildcard, input_folder = config["data_input"])
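Putting it together, the whole rule could look roughly like this (a sketch, assuming the same config keys and wildcard list as in the question):

rule them_all:
    input:
        expand('{input_folder}/data_{wc}.tab', wc = wildcard, input_folder = config["data_input"])
    output:
        expand('{output_folder}/output_{wc}.rda', wc = wildcard, output_folder = config["data_output"])
    shell:
        "Rscript ./my_script.R"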

Append new element to JSON object in specific format using bash and jq

I would like to append an element to an existing JSON file, with the working directory name as the key and, as the value, a string array whose entries are the working directory plus each file name.
Let's say I have the following structure:
Docs (Directory)
|
+-- RandomFile.json
|
+-- Readme (Working Directory)
| |
| +-- Readme.md
| +-- Readyou.md
What I would like to achieve is the structure below with the working directory as the prefix for every element in the array.
"Readme": ["Readme/Readme.md", "Readme/Readyou.md"]
I would like to append that output to the contents of RandomFile.json, which currently looks like this:
{
    "docs": {
        "Doc": ["doc1"]
    }
}
to this:
{
    "docs": {
        "Doc": ["doc1"],
        "Readme": ["Readme/Readme.md", "Readme/Readyou.md"]
    }
}
Is this something that can be managed in a straightforward way using bash and jq?
This requires jq 1.6 in order to use the --args option.
$ jq --arg wd "$(basename "$PWD")" '.docs+={($wd): $ARGS.positional | map("\($wd)/\(.)")}' ../RandomFile.json --args *
{
  "docs": {
    "Doc": [
      "doc1"
    ],
    "Readme": [
      "Readme/Readme.md",
      "Readme/Readyou.md"
    ]
  }
}
The shell is used to pass the base name of the current working directory as the variable $wd.
The shell is also used to pass the names of all the files in the current working directory as separate arguments.
The file to edit is assumed to be ../RandomFile.json; if you only know that there is a JSON file in the parent, you can use ../*.json instead.
Use += to update the .docs object of the original with a new key (the working directory) and list of file names. map prefixes each element of $ARGS.positional with $wd.
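As a usage example (not part of the original answer): since jq cannot edit a file in place, one way to persist the result is to run the command from inside Docs/Readme, write to a temporary file, and move it back:

$ cd Docs/Readme
$ jq --arg wd "$(basename "$PWD")" \
    '.docs += {($wd): $ARGS.positional | map("\($wd)/\(.)")}' \
    ../RandomFile.json --args * > ../RandomFile.json.tmp \
  && mv ../RandomFile.json.tmp ../RandomFile.json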

replace comma in json file's field with jq-win

I have a problem working with a JSON file. I launch curl from an AutoIt script to download a JSON file from the web and then convert it to CSV format with jq-win:
jq-win32 -r ".[]" -c class.json>class.txt
and the JSON is in the following format:
[
{
"id":"1083",
"name":"AAAAA",
"channelNumber":8,
"channelImage":""},
{
"id":"1084",
"name":"bbbbb",
"channelNumber":7,
"channelImage":""},
{
"id":"1088",
"name":"CCCCCC",
"channelNumber":131,
"channelImage":""},
{
"id":"1089",
"name":"DDD,DDD",
"channelNumber":132,
"channelImage":""},
]
After jq-win, the file should become:
{"id":"1083","name":"AAAAA","channelNumber":8,"channelImage":""}
{"id":"1084","name":"bbbbb","channelNumber":7,"channelImage":""}
{"id":"1088","name":"CCCCCC","channelNumber":131,"channelImage":""}
{"id":"1089","name":"DDD,DDD","channelNumber":132,"channelImage":""}
and then the CSV file will be further processed by the AutoIt script and become:
AAAAA,1083
bbbbb,1084
CCCCCC,1088
DDD,DDD,1089
The JSON has around 300 records, and among them 5~6 records have a comma in a field, e.g. DDD,DDD,
so when I try to read in the CSV file with _FileReadToArray, the comma in DDD,DDD causes trouble.
My question is: can I replace the comma in the field using jq-win?
(I tried fart.exe, but it replaces all commas in the JSON file, which is not suitable for me.)
Thanks a lot.
Regds
LAM Chi-fung
can I replace comma in the field using jq-win ?
Yes. For example, use gsub, pretty much as you’d use awk’s gsub, e.g.
gsub(","; "|")
If you want more details, please provide more details as per [mcve].
Example
With the given JSON input, the jq program:
.[]
| .name |= gsub(",";";")
| [.[]]
| map(tostring)
| join(",")
yields:
1083,AAAAA,8,
1084,bbbbb,7,
1088,CCCCCC,131,
1089,DDD;DDD,132,
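On Windows, the whole program can be passed to jq-win32 on one line; a possible invocation, assuming cmd.exe-style quoting where the inner double quotes are escaped with backslashes:

jq-win32 -r ".[] | .name |= gsub(\",\";\";\") | [.[]] | map(tostring) | join(\",\")" class.json > class.txt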

BitBake: How to use shell script content as body of pkg_postinst or pkg_preinst functions?

I want to add the contents of a shell script into the body of the pkg_preinst_${PN} or pkg_postinst_${PN} function in a BitBake recipe for a software package.
For example, let's consider this "PREINST" shell script:
$ cat PREINST
#! /bin/sh
chmod +x /usr/bin/mybin
Executing a simple 'cat' command inside the pkg_preinst function doesn't work:
pkg_preinst_${PN}() {
    cat ${S}/path/to/PREINST
}
This way, the contents of the .spec file for the generated RPM package are not as expected:
%pre
cat /Full/Path/To/Variable/S/path/to/PREINST
As you can see, the %pre section doesn't include the real contents of the PREINST file; it just includes the 'cat' command.
Is it possible to include the contents of PREINST file into the generated .spec file in some way?
Thank you in advance!
Finally I solved this issue by prepending this code to the do_package task:
do_package_prepend() {
    PREINST_path = "${S}/${MYMODULE}/PREINST"
    POSTINST_path = "${S}/${MYMODULE}/POSTINST"
    PREINST = open(PREINST_path, "r")
    POSTINST = open(POSTINST_path, "r")
    d.setVar("pkg_preinst", PREINST.read())
    d.setVar("pkg_postinst", POSTINST.read())
}
It sets the "pkg_preinst" and "pkg_postinst" keys in the global datastore 'd' to the contents of the PREINST and POSTINST files. Now it works! :)
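A slightly tidied sketch of the same idea (assuming do_package is a Python task, so the prepended body is Python; it expands the paths explicitly via d.getVar and closes the file handles):

python do_package_prepend() {
    # 'os' and the datastore 'd' are available inside BitBake Python functions
    preinst_path = os.path.join(d.getVar("S", True), d.getVar("MYMODULE", True), "PREINST")
    postinst_path = os.path.join(d.getVar("S", True), d.getVar("MYMODULE", True), "POSTINST")
    with open(preinst_path, "r") as preinst:
        d.setVar("pkg_preinst", preinst.read())
    with open(postinst_path, "r") as postinst:
        d.setVar("pkg_postinst", postinst.read())
}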

Issue importing CSV file with Logstash

I have a CSV file, and I want to import it in my Elastic Search.
I am on Windows 10, and I also have Kibana so I can browse the data once it is imported. I use Logstash to try to perform this import.
All of my services (Kibana, ES and Logstash) are running on my localhost.
I tried the following Logstash configuration file (my CSV file is at the correct path):
input {
  file {
    path => ["E:\Programmes\Logstash\logstash-2.2.0\data\*.csv"]
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => ["billing_entity","invoice","company","username","reference","line_number","recipient_number","zone","network","date","time","country","duration","cost","currency","call_origin","billing_type"]
    separator => ";"
  }
  #grok {
  #  match => { "call" => "%{WORD:billing_entity} %{WORD:invoice} %{WORD:company} %{WORD:username} %{WORD:reference} %{NUMBER:line_number} %{NUMBER:recipient_number} %{WORD:zone} %{WORD:network} %{DATE:date} %{TIME:time} %{WORD:country} %{WORD:duration} %{NUMBER:cost} %{WORD:currency} %{WORD:call_origin} %{WORD:billing_type}" }
  #}
}
output {
  elasticsearch {
    action => "index"
    index => "call_samples"
    #index => "call-%{+YYYY.MM.dd}"
    hosts => "localhost"
    workers => 1
  }
}
As you can see, I tried to use either the 'csv' or the 'grok' filter.
Then I launched Logstash in verbose mode with this configuration file:
logstash.bat -f ..\conf\logstash.conf -v > logfile.txt
EDIT: after each try, I delete the generated sincedb files to simulate changes. In any case, I noticed they are empty.
But in the logs I see nothing relevant:
message=>"Using mapping template from"
message=>"Attempting to install template"
message=>"New Elasticsearch output"
message=>"Registering file input"
message=>"No sincedb_path set,generating o....
message=>"Using mapping template from ...
message=>"Attempting to install template"
message=>"New Elasticsearch output"
message=>"Starting pipeline"
message=>"Pipeline started"
It seems like my file is simply ignored. I also tried several indexes, etc., but it never imports any data.
To check whether data is present, I query localhost:9200 or browse the Kibana "Index name or pattern" search bar with the index "call_samples".
Can anyone help me with this, please? I'm stuck at this point. Thanks
EDIT 2:
OK, I was being dumb on this one: I just wanted to redirect the log stream to a file when launching Logstash:
logstash.bat -f ..\conf\logstash.conf -v > logfile.txt
But it was probably preventing the input file from being imported, so I just removed the redirection to a file:
logstash.bat -f ..\conf\logstash.conf -v
Now my index was correctly created, but still no data was being imported.
It turned out to be an encoding issue, and even in verbose mode it never told me anything was failing; not even a little clue.
So I tested with a new test file encoded in UTF-8 and it worked well.
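If re-encoding the source file is not an option, the file input's codec can also be told which charset to expect; a sketch (the ISO-8859-1 value is only an assumption, the real encoding of the file has to be checked):

input {
  file {
    path => ["E:\Programmes\Logstash\logstash-2.2.0\data\*.csv"]
    start_position => "beginning"
    # hypothetical source encoding; adjust to the file's real charset
    codec => plain { charset => "ISO-8859-1" }
  }
}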