I have a small script:
require "csv"
require "json"
puts "JSON file name (include extension) ->"
jsonfile = gets.chomp
json = JSON.parse(File.open(jsonfile).read)
#puts json.first.collect {|k,v| k}.join(',')
puts json.collect {|node| "#{node.collect{|k,v| v}.join(',')}\n"}.join
CSV.open("generated.csv", "wb") do |csv|
  csv << json.collect {|node| "#{node.collect{|k,v| v}.join(',')}\n"}.join
end
In the terminal, the output looks like this:
Missing: [User],{"error"=>[{"Something"=>"", "errno"=>"somthing", "de"=>"smoehting", "pe"=>"error", "errorMessage"=>"Missing "}], "data"=>nil}
Missing: [User],{"error"=>[{"Something"=>"", "errno"=>"somthing", "de"=>"smoehting", "pe"=>"error", "errorMessage"=>"Missing "}], "data"=>nil}
I need to write each row to a separate row in a CSV file. The above is what I'm trying in order to write it to CSV, but it does not work.
Your first mistake is that you call << only once. Each << appends one row, so you have to call << n times, where n is the number of rows.
Your second mistake is that you join the array elements and pass a single string as the argument to <<. Each argument to << should be an array of field values.
To summarize, to create a CSV file containing these two lines:
# my.csv
1,2,3
4,5,6
you should write:
CSV.open("my.csv", "wb") do |csv|
csv << [1, 2, 3]
csv << [4, 5, 6]
end
Similarly, to achieve your desired effect, try to rewrite your code as:
CSV.open("generated.csv", "wb") do |csv|
json.each do |node|
csv << node.collect { |k,v| v }
end
end
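The commented-out line in the question suggests a header row built from the JSON keys is also wanted. A minimal sketch of that, assuming json is an array of hashes that all share the same keys (node.values is equivalent to node.collect { |k, v| v }):
require "csv"
require "json"

jsonfile = gets.chomp                      # file name, as in the question
json = JSON.parse(File.read(jsonfile))

CSV.open("generated.csv", "wb") do |csv|
  csv << json.first.keys                   # header row from the first hash's keys
  json.each { |node| csv << node.values }  # one array of values per CSV row
end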
I produced a script to parse some BLAST files from different samples. Since I wanted to know which genes all the samples have in common, I created a list, and a dictionary to count them. I have also produced a JSON file from the dictionary. Now I want to remove the genes whose counts are less than 100 (the number of samples), either from the dictionary or from the JSON file, but I don't know how.
This is part of the code:
### to produce a dictionary with the genes and their repetitions
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1
print matches_counts  # check point
#if matches_counts[extracted_gene] == 100:
#    print extracted_gene

# to convert the dictionary into a txt file and format it with json
with open('my_gene_extraction_trial.txt', 'w') as file:
    json.dump(matches_counts, file, sort_keys=True, indent=2, separators=(',', ':'))
print 'Parsing has finished'
I have tried different ways to do this:
a) ignoring the else statement, but then it gives me an empty dict
b) trying to print only the entries whose value is 100, but nothing gets printed
c) reading the json documentation, but I can only see how to delete elements by object, not by value.
Can anyone help me with this, please? This is driving me mad!
This is what it should look like:
# matches (list) and matches_counts (dict) already defined
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else: matches_counts[extracted_gene] = 1
print matches_counts  # check point

# Create a copy of the dict of matches to remove items from
counts_100 = matches_counts.copy()
for extracted_gene in matches_counts:
    if matches_counts[extracted_gene] < 100:
        del counts_100[extracted_gene]
print counts_100
Let me know if you still get errors.
Complete Julia newbie here.
I'd like to do some processing on a CSV. Something along the lines of:
using CSV
in_file = CSV.Source('/dir/in.csv')
out_file = CSV.Sink('/dir/out.csv')
for line in CSV.eachline(in_file)
    replace!(line, "None", "")
    CSV.writeline(out_file, line)
end
This is in pseudocode, those aren't existing functions.
Idiomatically, should I iterate on 1:CSV.countlines(in_file)? Do a while and check something?
If all you want to do is replace a string in the line, you do not need any CSV parsing utilities. All you do is read the file line by line, replace, and write. So:
infile = "/path/to/input.csv"
outfile = "/path/to/output.csv"
out = open(outfile, "w+")
for line in readlines(infile)
    newline = replace(line, "a", "b")
    write(out, newline)
end
close(out)
This will replicate the pseudocode you have in your question.
If you need to parse and read the csv field by field, use the readcsv function in base.
data=readcsv(infile)
typeof(data) #Array{Any,2}
This will return the data in the file as a 2 dimensional array. You can process this data any way you want, and write it back using the writecsv function.
for i in 1:size(data, 1)  # iterate by rows
    data[i, 1] = "This is " * data[i, 1]  # Add text to first column
end
writecsv(outfile, data)
Documentation for these functions:
http://docs.julialang.org/en/release-0.5/stdlib/io-network/?highlight=readcsv#Base.readcsv
http://docs.julialang.org/en/release-0.5/stdlib/io-network/?highlight=readcsv#Base.writecsv
I am following this tutorial for exporting CSV:
http://railscasts.com/episodes/362-exporting-csv-and-excel
However, I am getting an "undefined method '<<' for CSV:Class" error.
The code snippet being highlighted in the error is:
def self.to_csv
  CSV.generate do |csv|
    CSV << column_names # this row is highlighted in the error
    all.each do |opportunity|
      CSV << product.attributes.value_at(*column_names)
    end
  end
end
My config/application:
require 'rails/all'
require 'csv'
Thanks!
Please note that I am only 2:30 into the video.
On the fifth line,
CSV << product.attributes.value_at(*column_names)
change it to
CSV << product.attributes.values_at(*column_names)
Now it'll work fine. Hope this helps.
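That fixes the typo, but the reported NoMethodError comes from calling << on the CSV class itself instead of on the block variable csv that CSV.generate yields, and product should be the record yielded by all.each. A minimal sketch of the method with both fixes applied, following the RailsCast pattern:
def self.to_csv
  CSV.generate do |csv|
    csv << column_names                                       # header row
    all.each do |opportunity|
      csv << opportunity.attributes.values_at(*column_names)  # one row per record
    end
  end
end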
Just wondering if these two functions are to be done using Nokogiri or via more basic Ruby commands.
require 'open-uri'
require 'nokogiri'
require "net/http"
require "uri"
doc = Nokogiri.parse(open("example.html"))

doc.xpath("//meta[@name='author' or @name='Author']/@content").each do |metaauth|
  puts "Author: #{metaauth}"
end

doc.xpath("//meta[@name='keywords' or @name='Keywords']/@content").each do |metakey|
  puts "Keywords: #{metakey}"
end
etc...
Question 1: I'm just trying to parse a directory of .html documents, get the information from the meta tags, and output the results to a text file if possible. I tried a simple *.html wildcard replacement, but that didn't seem to work (at least not with Nokogiri.parse(open()); maybe it works with ::HTML or ::XML).
Question 2: More importantly, is it possible to send all of that meta content to a text file instead of printing it with puts?
Also, forgive me if the code is overly complicated for the simple task being performed, but I'm a little new to Nokogiri / XPath / Ruby.
Thanks.
I have some similar code. Please refer to:
require 'csv'
require 'nokogiri'

module MyParser
  HTML_FILE_DIR = 'your html file dir'

  def self.run(options = {})
    file_list = Dir.entries(HTML_FILE_DIR).reject { |f| f =~ /^\./ }
    result = file_list.map do |file|
      html = File.read("#{HTML_FILE_DIR}/#{file}")
      doc = Nokogiri::HTML(html)
      parse_to_hash(doc)
    end
    write_csv(result)
  end

  def self.parse_to_hash(doc)
    array = []
    array << doc.css('your selector conditions').first.content
    # ... add your CSS or XPath selector code here
    array
  end

  def self.write_csv(result)
    ::CSV.open('your output file name', 'w') do |csv|
      result.each { |row| csv << row }
    end
  end
end
MyParser.run
You can output to a file like so:
File.open('results.txt', 'w') do |file|
  file.puts "output" # See http://ruby-doc.org/core-2.1.2/IO.html#method-i-puts
end
Alternatively, you could do something like:
authors = doc.xpath("//meta[@name='author' or @name='Author']/@content")
keywrds = doc.xpath("//meta[@name='keywords' or @name='Keywords']/@content")

results = authors.map { |x| "Author: #{x}" }.join("\n") + "\n" +
          keywrds.map { |x| "Keywords: #{x}" }.join("\n")

File.open('results.txt', 'w') { |f| f << results }
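For Question 1 (a whole directory of .html files rather than a single document), a minimal sketch that globs the directory and writes each file's authors and keywords to one text file; the directory path and output file name below are placeholders, not values from the original post:
require 'nokogiri'

html_dir = "/path/to/html/files"                 # placeholder directory
File.open("results.txt", "w") do |out|
  Dir.glob(File.join(html_dir, "*.html")).each do |path|
    doc = Nokogiri::HTML(File.read(path))
    out.puts "File: #{File.basename(path)}"
    doc.xpath("//meta[@name='author' or @name='Author']/@content").each do |metaauth|
      out.puts "Author: #{metaauth}"
    end
    doc.xpath("//meta[@name='keywords' or @name='Keywords']/@content").each do |metakey|
      out.puts "Keywords: #{metakey}"
    end
  end
end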
This might be a backwards way of doing this. I have a bit of code that reads a CSV file and prints the results into an HTML file. I would like that output to be an unordered list, if at all possible.
This is what I have now, and its output is not what I want:
require 'csv'

col_data = []
CSV.foreach("primary_NAICS_code.txt") { |row| col_data << row }

begin
  file = File.open("primary_NAICS_code_html.html", "w")
  col_data.each do |row|
    indentation, (text, *) = row.slice_before(String).to_a
    file.write(indentation.fill("<ul>").join(" ") + "<il>" + text + "</il></ul?\n")
  end
rescue IOError => e
  puts e
ensure
  file.close unless file == nil
end
Unordered lists are closed with </ul>, not </ul?; question marks don't make HTML happy.
List items are <li> tags, not <il>.
You need to keep track of your depth to know whether you need to open a new <ul> or can just add more items.
Try this:
require 'csv'

col_data = []
CSV.foreach("primary_NAICS_code.txt") { |row| col_data << row }

begin
  file = File.open("primary_NAICS_code_html.html", "w")
  file.write('<ul>')
  depth = 1
  col_data.each do |row|
    indentation, (text, *) = row.slice_before(String).to_a
    if indentation.length > depth
      file.write('<ul>')
    elsif indentation.length < depth
      file.write('</ul>')
    end
    file.write("<li>" + text + "</li>")
    depth = indentation.length
  end
  file.write('</ul>')
rescue IOError => e
  puts e
ensure
  file.close unless file == nil
end
It's not very pretty but it seems to work.
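If the indentation level can jump by more than one step between rows, a slightly more defensive sketch (assuming the same col_data row structure as above) opens and closes as many <ul> tags as the difference requires, and closes everything still open at the end:
File.open("primary_NAICS_code_html.html", "w") do |file|
  file.write("<ul>")
  depth = 1
  col_data.each do |row|
    indentation, (text, *) = row.slice_before(String).to_a
    level = indentation.length
    (level - depth).times { file.write("<ul>") } if level > depth
    (depth - level).times { file.write("</ul>") } if level < depth
    file.write("<li>#{text}</li>")
    depth = level
  end
  depth.times { file.write("</ul>") }  # close whatever is still open
end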