I want to convert my .tif files to .asc in R

I have downloaded bioclim variables from WorldClim, but they are in TIF format and I want to convert them to ASC format. What would be the R code for this purpose?
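A minimal sketch of one way to do this, assuming the raster package and that the downloaded .tif files sit in a single folder (the folder name below is hypothetical). writeRaster() with format = "ascii" writes ESRI ASCII grid (.asc) files:

library(raster)

# hypothetical folder holding the downloaded WorldClim .tif files
tif_files <- list.files("bioclim", pattern = "\\.tif$", full.names = TRUE)

for (f in tif_files) {
  r <- raster(f)  # read one bioclim layer
  writeRaster(r, sub("\\.tif$", ".asc", f), format = "ascii", overwrite = TRUE)
}

The newer terra package should work much the same way; its writeRaster() picks the ASCII grid driver from the .asc file extension.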

Related

How to see the compression used to create a parquet file with pyarrow?

If I have a parquet file I can do
pqfile = pq.ParquetFile("pathtofile.parquet")
pqfile.metadata
but exploring the pqfile object with dir, I can't find anything that would indicate the compression of the file. How can I get that info?
@0x26res has a good point in the comments that converting the metadata to a dict is easier than using dir.
Compression is stored at the column level: a parquet file consists of a number of row groups, and each row group has columns. So you would want something like:
import pyarrow as pa
import pyarrow.parquet as pq

# write a small example file (pyarrow compresses with Snappy by default)
table = pa.Table.from_pydict({'x': list(range(100000))})
pq.write_table(table, '/tmp/foo.parquet')

# compression is reported per column within each row group
pq.ParquetFile('/tmp/foo.parquet').metadata.row_group(0).column(0).compression
# 'SNAPPY'

Spark read multiple CSV files, one partition for each file

Suppose I have multiple CSV files in the same directory, and these files all share the same schema:
/tmp/data/myfile1.csv, /tmp/data/myfile2.csv, /tmp/data/myfile3.csv, /tmp/data/myfile4.csv
I would like to read these files into a Spark DataFrame or RDD, and I would like each file to be a partition of the DataFrame. How can I do this?
You have two options I can think of:
1) Use the Input File name
Instead of trying to control the partitioning directly, add the name of the input file to your DataFrame and use that for any grouping/aggregation operations you need to do. This is probably your best option, as it is more aligned with the parallel-processing intent of Spark, where you tell it what to do and let it figure out how. You can do this with code like the following:
SQL:
SELECT input_file_name() as fname FROM dataframe
Or Python:
from pyspark.sql.functions import input_file_name
newDf = df.withColumn("filename", input_file_name())
2) Gzip your CSV files
Gzip is not a splittable compression format. This means that when loading gzipped files, each file will be its own partition.

Batch convert shapefiles to CSV in R

I have a folder with shapefiles, to which I want to add three columns named "species", "x", and "y" (where x and y are the coordinates in WGS84), and then convert each to .csv format. Is that possible (I assume yes), and could anybody help with the R scripting?
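A minimal sketch, assuming the sf package, point geometries, and that the species name can be taken from each file name (both assumptions; the question doesn't say where "species" comes from):

library(sf)
library(tools)

shp_files <- list.files("shapefiles", pattern = "\\.shp$", full.names = TRUE)  # hypothetical folder

for (f in shp_files) {
  shp <- st_read(f, quiet = TRUE)
  shp <- st_transform(shp, 4326)   # reproject to WGS84 (EPSG:4326)
  coords <- st_coordinates(shp)    # X/Y matrix for point geometries
  out <- data.frame(
    species = file_path_sans_ext(basename(f)),  # assumed: species = file name
    x = coords[, "X"],
    y = coords[, "Y"]
  )
  write.csv(out, sub("\\.shp$", ".csv", f), row.names = FALSE)
}

For polygon or line shapefiles you would need to decide which point represents each feature, e.g. the centroid via st_centroid().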

ParseTweets function for more than one JSON file at the same time using R

I have 100 JSON files containing approximately 800,000 tweets. How can I parse all the files at once in R so that I can clean them?
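A minimal sketch, assuming parseTweets is the one from the streamR package and that all the files sit in one directory (both assumptions):

library(streamR)

json_files <- list.files("tweets", pattern = "\\.json$", full.names = TRUE)  # hypothetical folder

# parse each file into a data frame, then stack them into one
all_tweets <- do.call(rbind, lapply(json_files, parseTweets, verbose = FALSE))

With around 800,000 tweets, rbind-ing 100 data frames this way is workable; for much larger corpora, data.table::rbindlist() would be faster.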

Mahout CSV to SEQ for text vectorization

I have a large CSV file where each line consists of (id, description) in text format. I want to convert each line to a vector using "seq2sparse" and then later run "rowsimilarity" to generate a textual-similarity result.
The problem is that I need to convert the CSV file to a SequenceFile somehow so it works with "seq2sparse", and the existing "seqdirectory" tool takes a directory of text files rather than a CSV file. Is there any way to accomplish this?