Combine files on commit in Mercurial - mercurial

I've got a project with 2 files I want to source-control using Mercurial:
An SCX-File, which is a binary (database) file
An SCT-File, which is a text file
My filter:
[encode]
**.scx = tempfile: sccxml INFILE OUTFILE 0
[decode]
**.scx = tempfile: sccxml INFILE OUTFILE 1
Problem
sccxml only receives the path to the SCX-File
The SCX-File cannot be converted to a text file without the corresponding SCT-File
Workarounds
Is it possible to combine the files before the filter runs?
Is it possible to pass both files' paths to the sccxml converter?
UPDATE:
No, I'm not using the Win32Text extension.
The SccXml executable needs both an SCT-File and an SCX-File as parameters to convert them to a text file (the text representations of both files get tar'ed into one file).
I want to have the binary files as text files in the repo, to get meaningful diffs. I am currently trying to achieve this using a precommit hook.
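A precommit hook would first need to find each SCX-File together with its SCT-File, so that both paths can be handed to the converter in a single call. The following is only a minimal sketch of that pairing step. It assumes the two files share a base name and directory and differ only in a lowercase extension, and find_form_pairs is a hypothetical helper name. How sccxml accepts two paths is not stated above, so the converter call itself is left out.

```python
from pathlib import Path


def find_form_pairs(root):
    """Return (scx, sct) path pairs found under root.

    Assumption: an SCX file and its SCT file share the same base name
    and live in the same directory, with lowercase extensions.
    SCX files without an SCT sibling are skipped.
    """
    pairs = []
    for scx in sorted(Path(root).rglob("*.scx")):
        sct = scx.with_suffix(".sct")
        if sct.exists():
            pairs.append((scx, sct))
    return pairs
```

Each returned pair could then be passed to whatever command line the converter actually expects.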

Related

file "(...).csv" not Stata file error in using merge command

I use Stata 12.
I want to add some country code identifiers from file df_all_cities.csv onto my working data.
However, this line of code:
merge 1:1 city country using "df_all_cities.csv", nogen keep(1 3)
Gives me the error:
. run "/var/folders/jg/k6r503pd64bf15kcf394w5mr0000gn/T//SD44694.000000"
file df_all_cities.csv not Stata format
r(610);
This is an attempted fix for my previous problem: the file was originally a .dta file that did not work with this version of Stata, so I used R to convert it to .csv, but that doesn't work either. I assume it's because the command itself, "using", doesn't work with csv files, but how would I write it instead?
Your intuition is right. The command merge cannot read a .csv file directly. (using is technically not a command here, it is a common syntax tag indicating a file path follows.)
You need to read the .csv file with the command insheet. You can use it like this.
* Preserve saves a snapshot of your data which is brought back at "restore"
preserve
* Read the csv file. clear can safely be used as data is preserved
insheet using "df_all_cities.csv", clear
* Create a tempfile where the data can be saved in .dta format
tempfile country_codes
save `country_codes'
* Bring back into working memory the snapshot saved at "preserve"
restore
* Merge your country codes from the tempfile to the data now back in working memory
merge 1:1 city country using `country_codes', nogen keep(1 3)
Note that insheet also takes using, and that insheet does accept .csv files.

File not found when appending a csv file

Stata version: 12.1
I get an error "file not found" using this code:
cd "$path_in"
insheet using "df_mcd_clean.csv", comma clear
append using "df_mcd15_clean.csv" // where the error happens
append using "df_ingram_liu1998_clean.csv"
append using "df_wccd_clean.csv"
I double checked that the file is indeed called that and located in the directory.
append is for appending .dta files. Therefore, if you ask to append foo.csv Stata assumes you are referring to foo.csv.dta, which it can't find.
The solutions include
Combine the .csv files outside Stata.
Read in each .csv file, save as .dta, then append.
The current version of the help for append says this:
append appends Stata-format datasets stored on disk to the end of the dataset in memory. If any filename is specified without an extension, .dta is assumed.
and that was also true in Stata 12. (Whether the wording was identical is something you can check.)
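The first option, combining the .csv files outside Stata, could be sketched in Python as below. This is an illustration under assumptions, not a replacement for append: combine_csv is a hypothetical helper name, and the sketch requires every file to have exactly the same header row, whereas append matches variables by name.

```python
import csv


def combine_csv(paths, out_path):
    """Concatenate CSV files that share one header row into out_path.

    Returns the number of data rows written. Raises ValueError if a
    file's header differs from the header of the first file.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError(f"{path}: header differs from first file")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The combined file can then be read once with insheet, as in the question.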

Opensmile: unreadable csv file while extracting prosody features from wav file

I am extracting prosody features from an audio file using the Windows version of openSMILE. It runs successfully and an output CSV is generated, but when I open the CSV, it shows rows that are not readable. I used this command to extract the prosody features:
SMILEXtract -C \opensmile-3.0-win-x64\config\prosody\prosodyShs.conf -I audio_sample_01.wav -O prosody_sample1.csv
The output CSV looks like this:
[image not available]
I even tried the sample wave file from the example audio folder in the openSMILE directory, and the output is the same (not readable). Can someone help me identify where the problem is and how I can fix it?
You need to enable the csvSink component in the configuration file to make it work. The file config\prosody\prosodyShs.conf that you are using does not have this component defined and always writes binary output.
You can verify that it is the standard binary output this way: omit the -O parameter from your command so that it becomes SMILEXtract -C \opensmile-3.0-win-x64\config\prosody\prosodyShs.conf -I audio_sample_01.wav and execute it. You will get an output.htk file that is exactly the same as prosody_sample1.csv.
How do you output CSV? Take a look at the example configuration in opensmile-3.0-win-x64\config\demo\demo1_energy.conf, where a csvSink component is defined.
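If you would rather not compare the two files by eye, a byte-for-byte comparison settles it. The sketch below uses Python's standard filecmp module. same_bytes is a hypothetical helper name, and the file names in the usage note are the ones from the commands above.

```python
import filecmp


def same_bytes(path_a, path_b):
    """Return True if both files have identical contents.

    shallow=False forces a comparison of the file contents instead of
    relying only on size and modification time.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Calling same_bytes("prosody_sample1.csv", "output.htk") in the directory that holds both files should return True if the .csv really is the binary HTK output under a different name.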
You can find more information in the official documentation:
Get started page of the openSMILE documentation
The section on configuration files
Documentation for cCsvSink
This is how I solved the issue. First I added the csvSink component to the list of component instances:
instance[csvSink].type = cCsvSink
Next I added the configuration parameters for this instance.
[csvSink:cCsvSink]
reader.dmLevel = energy
filename = \cm[outputfile(O){output.csv}:file name of the output CSV file]
delimChar = ;
append = 0
timestamp = 1
number = 1
printHeader = 1
\{../shared/standard_data_output_lldonly.conf.inc}
If you run this file now, it will throw errors, because reader.dmLevel = energy depends on waveframes. So the final changes would be:
[energy:cEnergy]
reader.dmLevel = waveframes
writer.dmLevel = energy
[int:cIntensity]
reader.dmLevel = waveframes
[framer:cFramer]
reader.dmLevel=wave
writer.dmLevel=waveframes
Further reference on how to configure opensmile configuration files can be found here

NiFi merge CSV files using MergeRecord

I have a stream of JSON records that I successfully convert into CSV records with this instruction. Now I want to merge these CSV records into one CSV file. Below is the flow:
At step 5 I end up with around 9K CSV records. How do I merge them into one CSV file using the MergeRecord processor?
My CSV header:
field1,field2,field3,field4,field5,field6,field7,field8,field9,field10,field11
Some of these fields may be null and vary between records.
After this, use UpdateAttribute and configure it so that the file is saved with a filename, and after that use PutFile to store it in a specific location.
I had a similar problem and solved it by using the RouteOnAttribute processor. Hope this helps someone.
Below is how I configure the processor using ${merge.count:equals(1)}

How to rename my hadoop result into a file with ".csv" extension

My intention is to rename the output of a Hadoop job to .csv files, because I need to visualize this CSV data in RapidMiner.
In "How can i output hadoop result in csv format" it is said that for this purpose I need to follow these three steps:
1. Submit the MapReduce Job
2. Extract the output from HDFS using shell commands
3. Merge them together, rename as ".csv" and place in a directory where the visualization tool can access the final file
If so, how can I achieve this?
UPDATE
myjob.sh:
bin/hadoop jar /var/root/ALA/ala_jar/clsperformance.jar ala.clsperf.ClsPerf /user/root/ala_xmlrpt/Amrita\ Vidyalayam\,\ Karwar_Class\ 1\ B_ENG.xml /user/root/ala_xmlrpt-outputshell4
bin/hadoop fs -get /user/root/ala_xmlrpt-outputshell4/part-r-00000 /Users/jobsubmit
cat /Users/jobsubmit/part-r-00000 /Users/jobsubmit/output.csv
showing:
The CSV file was empty and couldn’t be imported.
when I tried to open output.csv.
solution
cat /Users/jobsubmit/part-r-00000 > /Users/jobsubmit/output.csv
First, you need to retrieve the MapReduce result from HDFS:
hadoop dfs -copyToLocal path_to_result/part-r-* local_path
Then cat them into a single file:
cat local_path/part-r-* > result.csv
What comes next depends on your MapReduce result format. If it is already in CSV format, you are done. If not, you will probably have to use another tool such as sed or awk to transform it into CSV format.
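As an alternative to sed or awk, the merge and the conversion can be done in one pass. The Python sketch below assumes the part files contain tab-separated lines, which is the default for Hadoop's TextOutputFormat; a job configured with another separator needs a different delimiter argument. parts_to_csv is a hypothetical helper name.

```python
import csv
import glob


def parts_to_csv(part_pattern, out_path, delimiter="\t"):
    """Merge part files matching part_pattern into one CSV file.

    Each non-empty input line is split on delimiter and written as one
    CSV row. Returns the number of rows written.
    """
    rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        # sorted() keeps part-r-00000, part-r-00001, ... in order
        for part in sorted(glob.glob(part_pattern)):
            with open(part, newline="") as src:
                for line in src:
                    line = line.rstrip("\r\n")
                    if line:
                        writer.writerow(line.split(delimiter))
                        rows += 1
    return rows
```

For example, parts_to_csv("local_path/part-r-*", "result.csv") would play the role of the cat step above, with csv.writer taking care of quoting any values that contain commas.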