AWK: How to merge CSV files and eliminate rows that contain certain values? - csv

I have hundreds of CSV files. Each CSV file is similar to this:
| KEYWORD | NUMBER OF COMPS | AVGE M E (K) | GS/M | EST. A SE/M | C CORE |
|---------|-----------------|--------------|------|-------------|--------|
| Apples | 311 | 12 | N/A | <100 | 10 |
| Bananas | >1,200 | 737 | N/A | 490 | 88 |
| Oranges | 48 | 184 | N/A | N/A | 1 |
| Fruits | 161 | 94 | N/A | - | 6 |
(I have posted this in table format, to make it more readable, but the CSV data is at the bottom of this post).
All the CSV files have the same header row. Only the data is different.
I would like to do the following:
Merge all the CSV files together, but only have 1 header row.
Omit any rows where EST. A SE/M (Column 5) contains any of the following data: <100, N/A or -
Notes about the Data
Sometimes the some or even all cells in the CSV file are wrapped in quotation marks.
Other times they are not.
Sometimes the first column (keyword) may contain multiple words or accented characters.
My code so far
This code merges all the CSV files into 1 without only one heading
awk '(NR == 1) || (FNR > 1)' *.csv > ^0-output.csv
This works perfectly.
However, I am not sure how to delete the unwanted rows after the merge.
So far I have this:
awk '$5 !~ /(<100|N\/A|-)/' ^0-output.csv > ^0-output.csv
But when I use this code, it just produces a blank file.
Plus, I am not sure if there is a way to integrate it in the first line, so it does everything with a single command.
Notes
Here is how the data looks in CSV format
Sample1.csv
KEYWORD,NUMBER OF COMPS,AVGE M E (K),GS/M,EST. A SE/M,C CORE
Apples,311,12,N/A,<100,10
Bananas,">1,200",737,N/A,490,88
Oranges,48,184,N/A,N/A,1
Fruits,161,94,N/A,-,63
Sample2.csv
KEYWORD,NUMBER OF COMPS,AVGE M E (K),GS/M,EST. A SE/M,C CORE
Dino,588,67,N/A,888,234
Thunder,">1,200",211,N/A,<100,77
Ninja,95,37,N/A,-,878
Sample3.csv
KEYWORD,NUMBER OF COMPS,AVGE M E (K),GS/M,EST. A SE/M,C CORE
Blur,84,2454,N/A,-,234
Sample4.csv
"KEYWORD","NUMBER OF COMPS","AVGE M E (K)","GS/M","EST. A SE/M","C CORE"
"hedgehog rolls ròund",32,481,N/A,"878",13
"Clever Fox jumps Hîgh",233,83,N/A,"<100",12
"Bear à lot",122,35,N/A,"-",11
"kitten hîgh life","121","673","32","N/A","15"
Please note: The actual files that the finished script will be used on will have a variety of file names. They will NOT always follow the pattern of sample 1, sample 2 etc.
Expected Output
Expected output: (CSV format)
KEYWORD,NUMBER OF COMPS,AVGE M E (K),GS/M,EST. A SE/M,C CORE
Bananas,">1,200",737,N/A,490,88
Dino,588,67,N/A,888,234
"hedgehog rolls ròund",32,481,N/A,"878",13
(Note: It doesn't matter if the expected output keeps the wrapping quote marks as the final CSV file is opened in Apple Numbers)
Expected output: (Readable format)
| KEYWORD | NUMBER OF COMPS | AVGE M E (K) | GS/M | EST. A SE/M | C CORE |
|---------|-----------------|--------------|------|-------------|--------|
| Bananas | >1,200 | 737 | N/A | 490 | 88 |
| Dino | 588 | 67 | N/A | 888 | 234 |
| hedgehog rolls ròund | 588 | 67 | N/A | 888 | 234 |
Environment:
I am using Mac OS X 10.14.6. I am unable to install other versions of awk.

You may just add merge 2 conditions into one using && :
awk -F, 'NR==1 || (FNR>1 && $5 !~ /^(<100|N\/A|-)$/)' *.csv > output.csv
Here $5 !~ /^(<100|N\/A|-)$/) will skip a row if $5 is <100 or - or N/A. It is important to use regex anchors ^ and $ to avoid matching unwanted string such as 1000 or AB-123.
It seems you have a comma in double quotes also in file1.csv. In that case following gnu-awk command should work from you:
awk -v FPAT='"[^"]*"|[^,]*' '
NR == 1 || (FNR > 1 && $5 !~ /^(<100|N\/A|-)*$/)' *.csv > output.csv

EDIT: As per OP's comments there could be a comma in between " too, so to handle that its better to use FPAT, written and tested with GNU awk.
awk -v FPAT='[^,]*|"[^"]+"' '
{ sub(/\r$/,"") }
FNR==1{
if(NR==1){ print }
next
}
$5=="<100"||$5=="N/A"||$5=="-"{
next
}
1
' *.csv
Could you please try following, written and tested with GNU awk on shown samples only.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
if(NR==1){ print }
next
}
$5=="<100"||$5=="N/A"||$5=="-"{ next }
1
' *.csv
OR in case your values can contain something else also and you want to use regex to match the values which you want to neglect then try following.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
if(NR==1){ print }
next
}
$5~/<100/ || $5~/N\/A/ || $5~/-/{ next }
1
' *.csv
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this program from here.
FS=OFS="," ##Setting field separator as comma here.
}
FNR==1{ ##Checking condition if its firt line of current Input_file then do following.
if(NR==1){ print } ##If its very first line of very first Input_file then print that line.
next ##next will skip all further statements from here.
}
$5=="<100"||$5=="N/A"||$5=="-"{ next } ##Checking condition if 5th field contains either <100 OR N/A OR - then skip all further statements.
1 ##awk'sh way to print the current line.
' *.csv ##Passing all .csv files to awk program from here.

It looks to me like you're only interested in testing the 2nd-last field and neither that nor the last field can contain commas so just count field numbers from the end instead of from the beginning of each line and then you don't care whether earlier fields contain commas or not. Given that, this will work using any awk:
$ awk -F',' '(NR==1) || (FNR>1 && $(NF-1)!~/^"?(<100|N\/A|-)"?$/)' *.csv
KEYWORD,NUMBER OF COMPS,AVGE M E (K),GS/M,EST. A SE/M,C CORE
Bananas,">1,200",737,N/A,490,88
Dino,588,67,N/A,888,234
"hedgehog rolls ròund",32,481,N/A,"878",13

Related

How can I clean a TSV file having record or fields separators in one of its fields?

Given a TSV file with col2 that contains either a field or record separator (FS/RS) being respectively a tab or a carriage return which are escaped/surrounded by quotes.
$ printf '%b\n' 'col1\tcol2\tcol3' '1\t"A\tB"\t1234' '2\t"CD\nEF"\t567' | \cat -vet
col1^Icol2^Icol3$
1^I"A^IB"^I1234$
2^I"CD$
EF"^I567$
+------+---------+------+
| col1 | col2 | col3 |
+------+---------+------+
| 1 | "A B" | 1234 |
| 2 | "CD | 567 |
| | EF" | |
+------+---------+------+
Is there a way in sed/awk/perl or even (preferably) miller/mlr to transform those pesky characters into spaces in order to generate the following result:
+------+---------+------+
| col1 | col2 | col3 |
+------+---------+------+
| 1 | "A B" | 1234 |
| 2 | "CD EF" | 567 |
+------+---------+------+
I cannot get miller 6.2 to make the proper transformation (tried with DSL put/gsub) because it doesn't recognize the tab or CR/LF being part of the columns which breaks the field number:
$ printf '%b\n' 'col1\tcol2\tcol3' '1\t"A\tB"\t1234' '2\t"CD\nEF"\t567' | mlr --opprint --barred --itsv cat
mlr : mlr: CSV header/data length mismatch 3 != 4 at filename (stdin) line 2.
A good library cleanly handles things like embedded newlines and quoted separators (in fields)
In a Perl script with Text::CSV
use warnings;
use strict;
use Text::CSV;
my $file = shift // die "Usage: $0 filename\n";
my $csv = Text::CSV->new( { binary => 1, sep_char => "\t", auto_diag => 1 } );
open my $fh, '<', $file or die "Can't open $file: $!";
while (my $row = $csv->getline($fh)) {
s/\s+/ /g for #$row; # collapse multiple spaces, tabs, newlines
$csv->say(*STDOUT, $row);
}
Note the many other options for the constructor that can help handle various irregularities.
This can fit in a one-liner; its functional interface (with csv) is particularly well suited for that.
if you run
printf '%b\n' 'col1\tcol2\tcol3' '1\t"A\tB"\t1234' '2\t"CD\nEF"\t567' | \
mlr --c2t --fs "\t" clean-whitespace
col1 col2 col3
1 A B 1234
2 CD EF 567
I'm using mlr 6.2.
A way to do it in miller 5 is to use simply the put verb:
printf '%b\n' 'col1\tcol2\tcol3' '1\t"A\tB"\t1234' '2\t"CD\nEF"\t567' | \
mlr --tsv put -S 'for (k in $*) {$[k] = gsub($[k], "\n", " ")}' then clean-whitespace
perl -MText::CSV_XS=csv -e'
csv
in => *ARGV,
on_in => sub { s/\s+/ /g for #{$_[1]} },
sep_char => "\t";
'
Or s/[\t\n]/ /g if you prefer.
Can be placed all on one line.
Input is accepted from file named by argument or STDIN.
With GNU awk for multi-char RS, RT, and gensub():
$ awk -v RS='"([^"]|"")*"' '{ORS=gensub(/[\n\t]/," ","g",RT)} 1' file
col1 col2 col3
1 "A B" 1234
2 "CD EF" 567
The above just uses RS to isolate each "..." string and saves it in RT, then replaces every \n or \t in that string with a blank and saves the result in ORS, then prints the record.
you absolutely don't need gawk to get this done - here's one that works for mawk, gawk, or macos nawk :
INPUT
--before--
col1 col2 col3
1 "A B" 1234
2 "CD
EF" 567
CODE
{m,n,g}awk '
BEGIN {
1 __=substr((OFS=FS="\t\"")(FS)(ORS=_)\
(RS = "^$"),_+=_^=_<_,_)
}
END {
1 printbefore()
3 for (_^=_<_; _<=NF; _++) {
3 sub(/[\t-\r]+/, ($_~__)?" ":"&", $_)
}
1 print
}
1 function printbefore(_)
{
1 printf("\n\n--before--\n%s\n------"\
"------AFTER------\n\n", $+_)>("/dev/stderr")
}
OUTPUT
———AFTER (using mawk)------
col1 col2 col3
1 "A B" 1234
2 "CD EF" 567
strip out the part about printbefore() that's more for debugging purposes, then it's just
{m,n,g}awk '
BEGIN { __=substr((OFS=FS="\t\"") FS \
(ORS=_) (RS="^$"),_+=_^=_<_,_)
} END {
for(--_;_<=NF;_++) {
sub(/[\t-\r]+/, $_~__?" ":"&",$_) } print }'

jq create output in many separate files

given the following json:
[
{"_id":{"$oid":"6d2"},"jlo":"ΕΙ AJSB","dd":"d5f"},
{"_id":{"$oid":"c6d3"},"jlo":"ΕΙ ALKSB","dd":"5d9"},
{"_id":{"$oid":"b0cc6d4"},"jlo":"ΕΙ AGHTSB","dd":"1b1"},
{"_id":{"$oid":"6d2"},"jlo":"ΕPOWΙ AJSB","dd":"d5f"},
{"_id":{"$oid":"c6d3"},"jlo":"ΕGTΙ ALKSB","dd":"5d9"},
{"_id":{"$oid":"b0cc6d4"},"jlo":"ΕLKΙ AGHTSB","dd":"1b1"}
]
what i need to do is have as output for each discrete value of the ll element, the unique values of ta, in a separate file, named after a one to one representation where each dd code is substituted with a human readable representation:
d5f:departmentone
5d9:departmentalt
1b1:departshort
Desired output, in a per row basis, each unique value of jlo with the count of times it was found in each dd element so we get in the end something like this:
first file named departmentone.txt:
ΕΙ AJSB 1
ΕPOWΙ AJSB 1
second file named departmentalt.txt
ΕΙ ALKSB 1
ΕGTΙ ALKSB 1
third file named departshort.txt
ΕΙ AGHTSB 2
i have tried with map and reduce, group_by, sort_by, with really poor results
Only one invocation of jq is necessary. To allocate the output to the separate files, you can combine this one invocation with a single invocation to awk, or you could use a shell loop as illustrated below.
First, here's an illustration of how the shell pipeline would look:
jq -r --rawfile dd2name dd2name.tsv -f group.jq input.json |
while IFS=$'\t' read -r f v ; do echo "$v" >> "$f" ; done
This assumes that the mapping to filenames is in a TSV file named dd2name.tsv, and that the following jq program is in group.jq:
def dict:
split("\n") | map(select(length>0) | split("\t"))
| INDEX(.[0]) | map_values(.[1]);
($dd2name | dict) as $dict
| ($dict | keys_unsorted[]) as $dd
| map(select(.dd == $dd))
| group_by(.jlo)
| map("\($dict[$dd])\t\(.[0].jlo) \(length)")[]
As the name suggests, the dict function creates a dictionary giving the mapping of .dd values to the filenames. It assumes the availability of INDEX. If your jq does not have INDEX, then now would be an excellent time to upgrade your jq; otherwise, its def can easily be copied from builtin.jq (google: builtin.jq "def INDEX"), or you could replace the last line by: | reduce .[] as $p ({}; .[$p[0]] = $p[1]);
awk-based solution
The following invocation of awk can be used instead of the while ... done command above:
awk -F\\t 'fn && (fn!=$1) {close(fn)}; {fn=$1; print $2 >> fn}'
Season to taste
If the dd2name.tsv mapping file does not contain the ".txt" suffix, it can easily be added in any of a variety of ways, according to taste.
Note also that the proposed solutions above make some assumptions, notably that the .jlo values do not contain tabs, newlines, or NULs. If any of those assumptions is violated, then some tweaking will be required.
I'd do it in three passes, filtering the array with the desired dd and grouping by jlo, then extracting the jlo of the first (guaranteed) item of the array and its length :
map(select(.dd == "d5f")) | group_by(.jlo) | map("\(.[0].jlo) \(length)") | .[]
You can try it here.
Full bash run :
jq --arg dd d5f --raw-output 'map(select(.dd == $dd)) | group_by(.jlo) | map("\(.[0].jlo) \(length)") | .[]' yourJsonFile > departmentone.txt
jq --arg dd 5d9 --raw-output 'map(select(.dd == $dd)) | group_by(.jlo) | map("\(.[0].jlo) \(length)") | .[]' yourJsonFile > departmentalt.txt
jq --arg dd 1b1 --raw-output 'map(select(.dd == $dd)) | group_by(.jlo) | map("\(.[0].jlo) \(length)") | .[]' yourJsonFile > departmentshort.txt
Supposing you have a file named "mapping.txt" with the following content :
d5f:departmentone
5d9:departmentalt
1b1:departshort
You could extract those codes and labels to generate the files :
while IFS=: read -r code label; do
jq --arg dd $code --raw-output 'map(select(.dd == $dd)) | group_by(.jlo) | map("\(.[0].jlo) \(length)") | .[]' yourJsonFile > "$label".txt
done < mapping.txt

How to convert the following text to comma seperated list using awk - Need to skip headers and trailers

+---------------------------------+------------+------+----------+
| Name | NumCourses | Year | Semester |
+---------------------------------+------------+------+----------+
| ABDULHADI, ASHRAF M | 2 | 1990 | 3 |
| ACHANTA, BALA | 2 | 1995 | 3 |
| ACHANTA, BALA | 2 | 1996 | 3 |
+---------------------------------+------------+------+----------+
648 rows in set (0.02 sec)
--------------------------
Skip the first 3 lines and the last two lines. I would need an output like -
ABDULHADI, ASHRAF M, 2, 1990, 3
ACHANTA, BALA, 2, 1995, 3
ACHANTA, BALA, 2, 1996, 3
You can start with this awk and build on it as you need.
awk '
BEGIN {
FS = " *[|] *" # Set the Field Separator to this pattern
OFS = "," # Set the Output Field Separator to ,
}
NF { # Skip blank lines
$1 = $1 # Reconstruct your input line
gsub(/^,|,$/,"") # Remove leading and trailing ,
lines[++i] = $0 # Add line to array
}
END {
for(x=4;x<=i-2;x++) # Skip first three and last two lines
print lines[x] # Print line
}' file
ABDULHADI, ASHRAF M,2,1990,3
ACHANTA, BALA,2,1995,3
ACHANTA, BALA,2,1996,3
If your data does not have blank lines then you can remove NF and use NR as key instead of ++i
FS pattern above is zero or more spaces followed by pipe (placed in character class to consider it literal, since it is a meta character) followed by zero or more spaces.
Here is an awk
awk -F" *[|] *" 'FNR==NR {a=FNR;next} FNR>3 && FNR<a-2 {print $2,$3,$4,$5}' OFS=", " file{,}
ABDULHADI, ASHRAF M, 2, 1990, 3
ACHANTA, BALA, 2, 1995, 3
ACHANTA, BALA, 2, 1996, 3
Read the file two times, one to count the lines, one to get the correct output.
If your awk does not work with file{,}, change to file file to read it two times

AWK: Converting CSV with Headers to Summary Table

I have a need to document 100+ CSV files as far as the format of those files and including sample data. What I would like to do is take a CSV of the following format:
Name, Phone, State
Fred, 1234567, TX
John, 2345678, NC
and convert it to:
Field | Sample
--- | ----
Name | Fred
Phone | 1234567
State | TX
Is this possible with AWK? From my example below, you will see I am trying to format as a markdown table. I have it currently transposing the header row with
#!/usr/bin/awk -v RS='\r\n' -f
BEGIN { printf "| Field \t| Critical |\n"}
{
printf "|---\t|---\t|\n"
for (i=1; i<=NF; i++) {print "|", toupper($i), "| sample |"}
}
END {}
But I am not sure now how to use the first row of data, after the header to display the sample data?
awk is the right tool for data parsing. You can try something like:
awk '
BEGIN { FS=", "; OFS=" | " }
NR==1 {
for(tag = 1; tag <= NF; tag++) {
hdr[tag] = sprintf ("%-7s", $tag)
}
next
}
{
for(fld = 1; fld <= NF; fld++) {
data[NR,fld] = $fld
}
}
END {
print "Field | Sample\n------- | -------";
for(rec = 2; rec <= NR; rec++) {
for(line = 1; line <= NF; line++) {
print hdr[line], data[rec,line]
}
}
}' file
Output:
Field | Sample
------- | -------
Name | Fred
Phone | 1234567
State | TX
Name | John
Phone | 2345678
State | NC
Here is a more simple way to do it with awk
No need to store everything in a array then print at the end.
awk -F", " 'NR==1{split($0,a,FS);print "Field | Sample\n------- | -------";next} {for (i=1;i<=NF;i++) printf "%-8s| %s\n",a[i],$i}' file
Field | Sample
------- | -------
Name | Fred
Phone | 1234567
State | TX
Name | John
Phone | 2345678
State | NC
How it works:
awk -F", " ' # set field separator to ","
NR==1{ # if first line do:
split($0,a,FS) # split first line to an array named "a" to get the labels
print "Field | Sample" # print header
print "------- | -------" # print separator
next} # prevents nothing more run for first line
{ # for all lines except first do:
for (i=1;i<=NF;i++) # loop trough all element in line
printf "%-8s| %s\n",a[i],$i # print data for every element
}
' file

Code Golf: Musical Notes

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
The challenge
The shortest code by character count, that will output musical notation based on user input.
Input will be composed of a series of letters and numbers - letters will represent the name of the note and the number will represent the length of the note. A note is made of 4 vertical columns. The note's head will be a capital O, stem, if present will be 3 lines tall, made from the pipe character |, and the flag(s) will be made from backward slash \.
Valid note lengths are none, 1/4 of a note, 1/8 of a note, 1/16 of a note and 1/32 of a note.
| |\ |\ |\
| | |\ |\
| | | |\
O O O O O
1 1/4 1/8 1/16 1/32
Notes are places on the Staff, according to their note name:
----
D ----
C
B ----
A
G ----
F
E ----
All input can be assumed to be valid and without errors - Each note separated with a white space on a single line, with at least one valid note.
Test cases
Input:
B B/4 B/8 B/16 B/32 G/4 D/8 C/16 D B/16
Output:
|\
--------------------------|---|\--------
| |\ |\ |\ | |\ |\
------|---|---|\--|\-----O----|--O----|\
| | | |\ | O |
-O---O---O---O---O----|--------------O--
|
---------------------O------------------
----------------------------------------
Input:
E/4 F/8 G/16 A/32 E/4 F/8 G/16 A/32
Output:
--------------------------------
--------------|\--------------|\
|\ |\ |\ |\
------|\--|\--|\------|\--|\--|\
| | | O | | | O
--|---|--O--------|---|--O------
| O | O
-O---------------O--------------
Input:
C E/32 B/8 A/4 B F/32 B C/16
Output:
------------------------------|\
|\ |\
----------|---|---------------|-
O | | O
---------O----|--O----|\-O------
|\ O |\
------|\--------------|\--------
|\ O
-----O--------------------------
Code count includes input/output (i.e full program).
Golfscript (112 characters)
' '%:A;10,{):y;A{2/.0~|1=~:r;0=0=5\- 7%
4y#--:q' '' O'if-4q&!q*r*{16q/r<'|\\'
'| 'if}' 'if+{.32=y~&{;45}*}%}%n}%
Perl, 126 characters (115/122 with switches)
Perl in 239 226 218 216 183 180 178 172 157 142 136 133 129 128 126 chars
This 126 character solution in Perl is the result of a lengthy collaboration between myself and A. Rex.
#o=($/)x10;$/=$";map{m[/];$p=4+(5-ord)%7;
$_.=--$p?!($p&~3)*$'?16<$p*$'?" |\\":" | ":$/x4:" O ",
$|--&&y# #-#for#o}<>;print#o
A. Rex also proposes a solution to run with the perl -ap switch. With 111(!)
characters in this solution plus 4 strokes for the extra command-line switch,
this solution has a total score of 115.
$\="$:
"x5;$p=4+(5-ord)%7,s#..##,$\=~s#(.)\K$#--$p?
$_*!($p&~3)?"$1|".(16<$p*$_?"\\":$1).$1:$1x4:O.$1x3#gemfor#F
The first newline in this solution is significant.
Or 122 characters embedding the switches in the shebang line:
#!perl -ap
$\="$:
"x5;$p=4+(5-ord)%7,s#..##,$\=~s#(.)\K$#--$p?$_*!($p&~3)?"$1|".(16<$p*$_?
"\\":$1).$1:$1x4:O.$1x3#gemfor#F
(first two newlines are significant).
Half-notes can be supported with an additional 12 chars:
#o=($/)x10;$/=$";map{m[/];$p=4+(5-ord)%7;
$_.=--$p?!($p&~3)*$'?16<$p*$'?" |\\":" | ":$/x4:$'>2?" # ":" O ",
$|--&&y# #-#for#o}<>;print#o
LilyPond - 244 bytes
Technically speaking, this doesn't adhere to the output specification, as the output is a nicely engraved PDF rather than a poor ASCII text substitute, but I figured the problem was just crying out for a LilyPond solution. In fact, you can remove the "\autoBeamOff\cadenzaOn\stemUp" to make it look even more nicely formatted. You can also add "\midi{}" after the "\layout{}" to get a MIDI file to listen to.
o=#(open-file"o""w")p=#ly:string-substitute
#(format o"~(~a"(p"2'1""2"(p"4'1""4"(p"6'1""6"(p"8'1""8"(p"/""'"(p"C""c'"(p"D""d'"(p" ""/1"(p"
"" "(ly:gulp-file"M")))))))))))#(close-port o)\score{{\autoBeamOff\cadenzaOn\stemUp\include"o"}\layout{}}
Usage: lilypond thisfile.ly
Notes:
The input must be in a file named "M" in the same directory as the program.
The input file must end in a newline. (Or save 9 bytes by having it end in a space.)
The output is a PDF named "thisfile.pdf", where "thisfile.ly" is the name of the program.
I tested this with LilyPond 2.12.2; other versions might not work.
I haven't done much in LilyPond, so I'm not sure this is the best way to do this, since it has to convert the input to LilyPond format, write it to an auxiliary file, and then read it in. I currently can't get the built-in LilyPond parser/evaluator to work. :(
Now working on an ASCII-output solution.... :)
C89 (186 characters)
#define P,putchar(
N[99];*n=N;y;e=45;main(q){for(;scanf(" %c/%d",n,n+1)>0;n
+=2);for(;y<11;q=y-(75-*n++)%7 P+q-4?e:79)P*n&&q<4&q>0?
124:e)P*n++/4>>q&&q?92:e))*n||(e^=13,n=N,y++P+10))P+e);}
Half-note support (+7 characters)
#define P,putchar(
N[99];*n=N;y;e=45;main(q){for(;scanf(" %c/%d",n,n+1)>0;n
+=2);for(;y<11;q=y-(75-*n++)%7 P+q-4?e:v<4?79:64)P*n&&q<4&q>0?
124:e)P*n++/4>>q&&q?92:e))*n||(e^=13,n=N,y++P+10))P+e);}
Python 178 characters
The 167 was a false alarm, I forgot to suppress the stems on the whole notes.
R=raw_input().split()
for y in range(10):
r=""
for x in R:o=y-(5-ord(x[0]))%7;b=" -"[y&1]+"O\|";r+=b[0]+b[o==3]+b[-(-1<o<3and''<x[1:])]+b[2*(-1<o<":862".find(x[-1]))]
print r
Python 167 characters (Broken)
No room for the evil eye in this one, although there are 2 filler characters in there, so I added a smiley. This technique takes advantage of the uniqueness of the last character of the note lengths, so lucky for me that there are no 1/2 notes or 1/64 notes
R=raw_input().split()
for y in range(10):
r=""
for x in R:o=y-(5-ord(x[0]))%7;b=" -"[y&1]+"O\|";r+=b[0]+b[o==3]+b[-(-1<o<3)]+b[2*(-1<o<":862".find(x[-1]))]
print r
Python 186 characters <<o>>
Python uses the <<o>> evil eye operator to great effect here. The find() method returns -1 if the item is not found, so that is why D doesn't need to appear in the notes.
R=raw_input().split()
for y in range(10):
r=""
for x in R:o='CBAGFE'.find(x[0])+4;B=" -"[y%2];r+=B+(B,'O')[o==y]+(x[2:]and
y+4>o>y and"|"+(B,'\\')[int(x[2:])<<o>>6+y>0]or B*2)
print r
11 extra bytes gives a version with half notes
R=raw_input().split()
for y in range(10):
r=""
for x in R:t='CBAGFE'.find(x[0])+4;l=x[2:];B=" -"[y%2];r+=B+(B,'#O'[l
in'2'])[t==y]+(l and y+4>t>y and"|"+(B,'\\')[int(l)>>(6+y-t)>0]or B*2)
print r
$ echo B B/2 B/4 B/8 B/16 B/32 G/4 D/8 C/16 D B/16| python notes.py
|\
------------------------------|---|\--------
| | |\ |\ |\ | |\ |\
------|---|---|---|\--|\-----#----|--O----|\
| | | | |\ | # |
-O---O---#---#---#---#----|--------------#--
|
-------------------------#------------------
--------------------------------------------
159 Ruby chars
n=gets.split;9.downto(0){|p|m='- '[p%2,1];n.each{|t|r=(t[0]-62)%7;g=t[2..-1]
print m+(r==p ?'O'+m*2:p>=r&&g&&p<r+4?m+'|'+(g.to_i>1<<-p+r+5?'\\':m):m*3)}
puts}
Ruby 136
n=gets;10.times{|y|puts (b=' -'[y&1,1])+n.split.map{|t|r=y-(5-t[0])%7
(r==3?'O':b)+(t[1]&&0<=r&&r<3?'|'<<(r<t[2,2].to_i/8?92:b):b+b)}*b}
Ruby 139 (Tweet)
n=gets;10.times{|y|puts (b=' -'[y&1,1])+n.split.map{|t|r=y-(5-t[0])%7
(r==3?'O':b)+(t[1]&&0<=r&&r<3?'|'<<(r<141>>(t[-1]&7)&3?92:b):b+b)}*b}
Ruby 143
n=gets.split;10.times{|y|puts (b=' -'[y&1,1])+n.map{|t|r=y-(5-t[0])%7;m=t[-1]
(r==3?'O':b)+(m<65&&0<=r&&r<3?'|'<<(r<141>>(m&7)&3?92:b):b+b)}*b}
Ruby 148
Here is another way to calculate the flags,
where m=ord(last character), #flags=1+m&3-(1&m/4)
and another way #flags=141>>(m&7)&3, that saves one more byte
n=gets.split;10.times{|y|b=' -'[y&1,1];n.each{|t|r=y-(5-t[0])%7;m=t[-1]
print b+(r==3?'O':b)+(m<65&&0<=r&&r<3?'|'<<(r<141>>(m&7)&3?92:b):b+b)}
puts}
Ruby 181
First try is a transliteration of my Python solution
n=gets.split;10.times{|y|r="";n.each{|x|o=y-(5-x[0])%7
r+=(b=" -"[y&1,1]+"O\\|")[0,1]+b[o==3?1:0,1]+b[-1<o&&o<3&&x[-1]<64?3:0,1]+b[-1<o&&o<(":862".index(x[-1]).to_i)?2:0,1]}
puts r}
F#, 458 chars
Reasonably short, and still mostly readable:
let s=Array.init 10(fun _->new System.Text.StringBuilder())
System.Console.ReadLine().Split([|' '|])
|>Array.iter(fun n->
for i in 0..9 do s.[i].Append(if i%2=1 then"----"else" ")
let l=s.[0].Length
let i=68-int n.[0]+if n.[0]>'D'then 7 else 0
s.[i+3].[l-3]<-'O'
if n.Length>1 then
for j in i..i+2 do s.[j].[l-2]<-'|'
for j in i..i-1+(match n.[2]with|'4'->0|'8'->1|'1'->2|_->3)do s.[j].[l-1]<-'\\')
for x in s do printfn"%s"(x.ToString())
With brief commentary:
// create 10 stringbuilders that represent each line of output
let s=Array.init 10(fun _->new System.Text.StringBuilder())
System.Console.ReadLine().Split([|' '|])
// for each note on the input line
|>Array.iter(fun n->
// write the staff
for i in 0..9 do s.[i].Append(if i%2=1 then"----"else" ")
// write note (math so that 'i+3' is which stringbuilder should hold the 'O')
let l=s.[0].Length
let i=68-int n.[0]+if n.[0]>'D'then 7 else 0
s.[i+3].[l-3]<-'O'
// if partial note
if n.Length>1 then
// write the bar
for j in i..i+2 do s.[j].[l-2]<-'|'
// write the tails if necessary
for j in i..i-1+(match n.[2]with|'4'->0|'8'->1|'1'->2|_->3)do s.[j].[l-1]<-'\\')
// print output
for x in s do printfn"%s"(x.ToString())
C 196 characters <<o>>
Borrowing a few ideas off strager. Interesting features include the n+++1 "triple +" operator and the <<o>> "evil eye" operator
#define P,putchar
N[99];*n=N;y;b;main(o){for(;scanf(" %c/%d",n,n+1)>0;n+=2);for(;y<11;)
n=*n?n:(y++P(10),N)P(b=y&1?32:45)P((o=10-(*n+++1)%7-y)?b:79)P(0<o&o<4&&*n?'|':b)
P(*n++<<o>>6&&0<o&o<4?92:b);}
168 characters in Perl 5.10
My original solution was 276 characters, but lots and lots of tweaking reduced it by more than 100 characters!
$_=<>;
y#481E-GA-D62 #0-9#d;
s#.(/(.))?#$"x(7+$&).O.$"x($k=10).($1?"|":$")x3 .$"x(10-$2)."\\"x$2.$"x(9-$&)#ge;
s#(..)*?\K (.)#-$2#g;
print$/while--$k,s#.{$k}\K.#!print$&#ge
If you have a minor suggestion that improves this, please feel free to just edit my code.
Lua, 307 Characters
b,s,o="\\",io.read("*l"),io.write for i=1,10 do for n,l in
s:gmatch("(%a)/?(%d*)")do x=n:byte() w=(x<69 and 72 or 79)-x
l=tonumber(l)or 1 d=i%2>0 and" "or"-"o(d..(i==w and"O"or
d)..(l>3 and i<w and i+4>w and"|"or d)..(l>7 and i==w-3
and b or l>15 and i==w-2 and b or l>31 and i==w-1 and b or
d))end o"\n"end
C -- 293 characters
Still needs more compression, and it takes the args on the command line instead of reading them...
i,j,k,l;main(c,v)char **v;{char*t;l=4*(c-1)+2;t=malloc(10*l)+1;for(i=0;i<10;i
++){t[i*l-1]='\n';for(j=0;j<l;j++)t[i*l+j]=i&1?'-':' ';}t[10*l-1]=0;i=1;while
(--c){j='G'-**++v;if(j<3)j+=7;t[j*l+i++]='O';if(*++*v){t[--j*l+i]='|';t[--j*l
+i]='|';t[--j*l+i]='|';if(*++*v!='4'){t[j++*l+i+1]='\\';if(**v!='8'){t[j++*l+
i+1]='\\';if(**v!='1'){t[j++*l+i+1]='\\';}}}}i+=3;}puts(t);}
edit: fixed the E
edit: down to 293 characters, including the newlines...
#define X t[--j*l+i]='|'
#define Y t[j++*l+i+1]=92
i,j,k,l;main(c,v)char**v;{char*t;l=4*(c-1)+2;t=malloc(10*l)+1;for(i=10;i;)t[--i*
l-1]=10,memset(t+i*l,i&1?45:32,l-1);t[10*l-1]=0;for(i=1;--c;i+=3)j=71-**++v,j<3?
j+=7:0,t[j*l+i++]=79,*++*v?X,X,X,*++*v-52?Y,**v-56?Y,**v-49?Y:0:0:0:0;puts(t);}