Date is getting converted to a number when converting a data frame to HTML

While converting the data frame to HTML, Date is getting converted to a number.
library("xtable")
print(xtable(Data), type="html", file="Data.html",timestamp=date())
The first column of this data frame is in Date format, which is getting converted to a number.

You could try tableHTML which handles dates. As a quick example:
library(tableHTML)
Data <- data.frame(a = 1:10, b = as.Date('2017-01-01'))
mytable <- tableHTML(Data, rownames = FALSE)
mytable
And to write it in a file, you can use:
write_tableHTML(mytable, file = 'Data.html')


extract date from file name using SSIS expression

Hi,
I am using Microsoft SSIS.
I have an input CSV file called LatLong_WD_Locations_06-21-2021.
I want to extract the date from this file name in 20210621 format using an SSIS expression and store it in a variable called v_FileDate, which is of type Int32. The variable v_FileName is of type String.
I tried
(DT_I4) REPLACE(SUBSTRING( @[User::v_FileName],FINDSTRING( @[User::v_FileName] , "_", 2)+1,10),"-","")
but it's not working.
Can I have some help here, please?
TIA
Almost there. You specified 2 for the occurrence argument in the FINDSTRING expression, so it finds the _ before Locations in the file name, giving you a result of Locations_. Since that is not an integer, the cast throws an error.
Change the 2 to a 3:
(DT_I4) REPLACE(SUBSTRING( @[User::v_FileName],FINDSTRING( @[User::v_FileName] , "_", 3)+1,10),"-","")
The above still works if v_FileName has a file extension, but it does not give you the final yyyyMMdd format. See below.
You could also simplify and use the RIGHT expression. Get the rightmost 10 characters of the string and then replace:
(DT_I4) REPLACE(RIGHT(REPLACE(@[User::v_FileName], ".csv",""), 10),"-","")
I updated the above statement to account for v_FileName having an extension. It still does not give you the final yyyyMMdd format. See below.
The two expressions above will get the date out of v_FileName, but in MMddyyyy format. You now have to parse out each part of the date and put it back together using one of the above statements. The example below uses the one with RIGHT; the concatenation is wrapped in parentheses so that the cast applies to the whole result:
(DT_I4) (SUBSTRING(REPLACE(RIGHT(REPLACE(@[User::v_FileName], ".csv",""), 10),"-",""), 5,4)
+ SUBSTRING(REPLACE(RIGHT(REPLACE(@[User::v_FileName], ".csv",""), 10),"-",""), 1,2)
+ SUBSTRING(REPLACE(RIGHT(REPLACE(@[User::v_FileName], ".csv",""), 10),"-",""), 3,2))
If the date always occupies the last 10 characters of the file name, the solution is very simple. If that is not the case, let me know in a comment below and I will write a new expression.
Solution explained step by step:
Get or create the variable v_FileDate with the value LatLong_WD_Locations_06-21-2021.
As a check, create the variable DateString with the expression
RIGHT( @[User::v_FileDate], 4) +
LEFT (RIGHT( @[User::v_FileDate], 10), 2) +
LEFT (RIGHT( @[User::v_FileDate], 7), 2)
Create a final variable DateInt with the expression
(DT_I4) @[User::DateString]
Or you can do it with a single variable, which would be better:
(DT_I4) (
RIGHT( @[User::v_FileDate], 4) +
LEFT (RIGHT( @[User::v_FileDate], 10), 2) +
LEFT (RIGHT( @[User::v_FileDate], 7), 2)
)
Using a Script Task:
Pass in v_FileDate as ReadWrite.
Pass in v_FileName as ReadOnly.
Code:
// Get the file name from SSIS (Value is an object, so convert it to a string)
string fileName = Dts.Variables["v_FileName"].Value.ToString();
// Split on "_"
string[] pieces = fileName.Split('_');
// Get the last piece (indexes are 0-based, Length is the actual count)
string lastPiece = pieces[pieces.Length - 1];
// Convert to a date
DateTime d = DateTime.ParseExact(lastPiece, "MM-dd-yyyy", System.Globalization.CultureInfo.InvariantCulture);
// Format the date as yyyyMMdd and convert that string to an int
Dts.Variables["v_FileDate"].Value = int.Parse(d.ToString("yyyyMMdd"));
A workaround that is working for me and produces the yyyyMMdd format.
Thanks @keithL
(DT_I4)( REPLACE(RIGHT(@[User::v_FileName], 5),"-","")+REPLACE(SUBSTRING( @[User::v_FileName],FINDSTRING( @[User::v_FileName] , "_", 3)+1,2),"-","")+REPLACE(SUBSTRING( @[User::v_FileName],FINDSTRING( @[User::v_FileName] , "_", 3)+3,4),"-",""))
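To sanity-check the value the SSIS expressions should produce, the same rearrangement (take the text after the last underscore, read it as MM-dd-yyyy, write it back as yyyyMMdd) can be sketched outside SSIS. This is only an illustration in Python, not part of any SSIS package; the function name file_date_as_int is hypothetical, and the sketch assumes the date is always the last underscore-separated piece of the name.

```python
import os
from datetime import datetime


def file_date_as_int(file_name):
    """Return the trailing MM-dd-yyyy date of file_name as a yyyyMMdd integer."""
    # Drop an extension such as ".csv" if one is present.
    stem = os.path.splitext(file_name)[0]
    # Keep only the text after the last underscore, e.g. "06-21-2021".
    date_part = stem.rsplit("_", 1)[-1]
    # Parse as month-day-year, then re-emit as yyyyMMdd.
    parsed = datetime.strptime(date_part, "%m-%d-%Y")
    return int(parsed.strftime("%Y%m%d"))


print(file_date_as_int("LatLong_WD_Locations_06-21-2021"))
print(file_date_as_int("LatLong_WD_Locations_06-21-2021.csv"))
```

Parsing the piece as a real date, as the Script Task answer does, also rejects names whose last piece is not a valid date, which plain string slicing would not catch.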

Extract html table and create column in R

I'm trying to extract the table on the following URL https://www.cenace.gob.mx/DocsMEM/OpeMdo/OfertaCompVent/OferVenta/MDA/Termicas/OfeVtaTermicaHor%20BCS%20MDA%20Hor%202018-12-26%20v2019%2002%2024_01%2000%2001.html
So far I've been trying to use
url <- getURL("https://www.cenace.gob.mx/DocsMEM/OpeMdo/OfertaCompVent/OferVenta/MDA/Termicas/OfeVtaTermicaHor%20BCS%20MDA%20Hor%202018-12-26%20v2019%2002%2024_01%2000%2001.html",.opts = list(ssl.verifypeer = FALSE) )
parse <- xmlParse(url, isHTML = TRUE)
r <- readHTMLTable(parse)
But only null tables are returned. I have some knowledge of XML and, as far as I understand, I can scrape HTML web pages just like XML files. But I can't find any specific node that would let me do that.
Also, I would like to know how I can create a column with the Fecha value.
Thanks!

How can I get the timestamp in "yyyy/MM/dd hh:mm:ss.fff" format in VBA?

I want to get this output:
2018-09-02 00:00:00.000
I tried the below code:
.Cells(LRS + 1, 15).Value = Format(.Cells(LRS + 1, "A").Value, "yyyy-MM-dd hh:mm:ss.fff")
And I got:
2018-09-02 00:00:00.fff
The initial date in Excel has the following format yyyy/mm/dd, no time included. That's why the time part includes only zeros 00:00:00.000. The reason I want to include the time in the specific format is that I'm planning to import those dates into a SQL table with that format.
Is there any solution?
As you can see from the documentation, fff is not recognised as a formatting token in VBA.
Helpfully, you can actually import your data into SQL without formatting it to add the time. If you import it into a datetime field the SQL engine will automatically default the time part of the field to midnight on the date you give it.
I think you can just change your format string to yyyy-MM-dd by itself.
However if you really want to do it like this, then since there's no time specified then just hard-code 000 instead of fff. The rest of the time can be similarly hard-coded, since it never varies, so you end up with yyyy-MM-dd 00:00:00.000. But as I said, I think it's a bit pointless.
If you set the cell's number format to the desired format, you can then read the cell's displayed text (its Text property) rather than its underlying value.
Sub test()
    Dim s As String, s1 As String, s2 As String
    ' First, format the cell as "yyyy-mm-dd hh:mm:ss.000"
    Range("a2").NumberFormatLocal = "yyyy-mm-dd hh:mm:ss.000"
    ' In VBA, Format does not recognize the milliseconds part of "yyyy-mm-dd hh:mm:ss.000"
    Range("b2") = Format(Range("a2"), "yyyy-mm-dd hh:mm:ss.000")
    s1 = Format(Range("a2"), "yyyy-mm-dd hh:mm:ss.000")
    ' s1 = "2018-09-03 01:24:33.000"
    ' Since cell a2 is formatted as "yyyy-mm-dd hh:mm:ss.000", you can read the data as text
    Range("b2") = Range("a2").Text
    s2 = Range("a2").Text
    ' s2 = "2018-09-03 01:24:33.240"
End Sub

Python 2.7, MySQL, JSON, & DateTime - How can I remove the T between the Date and Time in the timestamp field?

I have a MySQL database with a few tables. I wrote a python script to dump some of the tables data into a JSON file. I am a bit confused with dumping the date and time stamp.
Here is the code sample, conversion.py:
import MySQLdb
import json
import collections
from datetime import date, datetime
#connect to database
conn = MySQLdb.connect(host= "localhost", user="root", passwd="root", db="testdb")
#Fetch rows
sql = "SELECT * from offices"
cursor = conn.cursor()
cursor.execute(sql)
rows = cursor.fetchall()
data = []
def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError ("Type %s not serializable" % type(obj))
for row in rows:
    d = collections.OrderedDict()
    d['officeCode'] = row[0]
    d['city'] = row[1]
    d['phone'] = row[2]
    d['eff_date'] = row[3]
    d['lastupdatedby'] = row[4]
    d['state'] = row[5]
    d['country'] = row[6]
    d['postalcode'] = row[7]
    d['territory'] = row[8]
    data.append(d)
with open('data.json', 'w') as outfile:
    json.dump(data, outfile, default=json_serial)
conn.close()
When I execute this code, a JSON file is created, which is fine. I have a problem with two fields: eff_date, which is a date type in the database, and lastupdatedby, which is a timestamp type in the database.
"eff_date": "2015-09-23"
"lastupdatedby": "2016-08-19T08:13:53"
So, in my JSON file, eff_date is created fine, but lastupdatedby gets a T between the date and the time, as shown above. In my actual database there is no T between the date and time. I would like to get rid of that T because I am planning to load this file into a different database and I don't think it will accept that format.
Any help will be much appreciated.
The T between the date and time comes from the ISO 8601 format.
That is the format returned by the datetime.isoformat function, which is called in the code here:
return obj.isoformat()
(That happens to be the format that JavaScript expects.)
If we want to return a string in a different format, we probably need to use a different function, e.g. strftime in place of isoformat.
If isoformat is working for the date objects, leave that alone. Just use strftime for datetime objects.
The format string "%Y-%m-%d %H:%M:%S" might suit your needs.
https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
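A minimal sketch of that suggestion, assuming the serializer from the question is kept and only the datetime branch changes. The sample dictionary is hypothetical and stands in for a row fetched from the database. Because datetime is a subclass of date, the datetime check has to come first.

```python
import json
from datetime import date, datetime


def json_serial(obj):
    """Serialize datetimes with a space separator and plain dates as ISO strings."""
    # datetime is a subclass of date, so it must be tested before date.
    if isinstance(obj, datetime):
        return obj.strftime("%Y-%m-%d %H:%M:%S")
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError("Type %s not serializable" % type(obj))


# Hypothetical sample values standing in for a database row.
sample = {
    "eff_date": date(2015, 9, 23),
    "lastupdatedby": datetime(2016, 8, 19, 8, 13, 53),
}
encoded = json.dumps(sample, default=json_serial, sort_keys=True)
print(encoded)
```

With this change the date field keeps its ISO form and only the timestamp loses the T.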

keyword search in string from mysql using python?

I am pulling from a MySQL database table using Python 3.4. I use the csv module to write the rows of data from the database in CSV format. Now I am trying to figure out how I can filter the rows of data by keywords that may show up in the fourth column (row[3]). I was thinking of using the re module as below, but I keep getting errors. Is it not possible to search for keywords in a field that is a string type and to keep only the rows that contain those keywords? Please help.
import re
import csv
userdate = input('What date do you want to look at?')
query = ("SELECT *FROM sometable WHERE timestamp LIKE %s", userdate)
keywords = 'apples', 'bananas', 'cocoa'
# Execute sql Query
cursor.execute(query)
result = cursor.fetchall()
#Reads a CSV file and return it as a list of rows
def read_csv_file(filename):
    """Reads a CSV file and return it as a list of rows."""
    for row in csv.reader(open(filename)):
        data.append(row)
    return data
f = open(path_in + data_file)
read_it = read_csv_file(path_in + data_file)
with open('file.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in data:
        match = re.search('keywords, read_it)
        if match:
            spamwriter.writerow(row)
I gave up on the regular expressions and used
for row in data:
    found_it = row.find(keywords)
    if found_it != -1:
        spamwriter.writerow(row)
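A minimal sketch of this kind of keyword filtering without regular expressions, assuming the fourth column holds a string and that a case-insensitive substring match is what is wanted. Note that str.find takes a single substring rather than a tuple, so the sketch tests each keyword in turn. The sample rows and the helper name contains_keyword are hypothetical; the rows stand in for the result of cursor.fetchall().

```python
import csv
import io

keywords = ("apples", "bananas", "cocoa")

# Hypothetical rows standing in for the result of cursor.fetchall().
rows = [
    (1, "2016-01-01", "north", "fresh apples and pears"),
    (2, "2016-01-01", "south", "nothing relevant"),
    (3, "2016-01-01", "east", "Cocoa powder"),
]


def contains_keyword(text, words):
    """Return True if any of the words occurs in text, ignoring case."""
    lowered = str(text).lower()
    return any(word in lowered for word in words)


# Keep only the rows whose fourth column mentions a keyword.
matching = [row for row in rows if contains_keyword(row[3], keywords)]

# Write to an in-memory buffer here; a real script would open a file
# in text mode with newline='' instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(matching)
print(buffer.getvalue())
```

A substring test also matches longer words such as "pineapples"; if whole-word matching matters, a regular expression with word boundaries would be the usual alternative.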