Pulling One Element From A CSV File - csv

I'm trying to write a function that will return the most recent 'closing' value in a csv file containing the data of a cryptocurrency. The csv file contains 6 columns and about 900 rows and I'm looking to only pull one element of the table.
However, I seem to have faced a fair bit of difficulty in pulling this off for some reason. The function below returns values from the column I want, however it seems to be pulling values from the very bottom of the document (whereas I want the most recent values).
Also, just a side note to explain what I was attempting to do with the 'count'. Since I'm expecting the value I want to be located on the second row, I wanted my for loop to only iterate through two lines of the file. However, as the result of the function went on to reveal to me, as it currently stands with the counter I'm returning two values from the function.
I understand there must be a much less convoluted way of getting the information I need so am open to any solution to the problem. Though, that being said, I'd be really interested to see where I went wrong here as I'm fairly new to Python.
Thanks a lot!
def csv_to_close(csv_file):
    with open(f"{csv_file}.csv", 'r') as csvfile:
        csv_file = csv.reader(csvfile)
        running = True
        count = 0
        while running == True:
            if count < 2:
                for column in csv_file:
                    close = column[4]
                    count += 1
            else:
                running = False
            print(close)
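
For comparison, here is a minimal sketch that reads only the first data row. It assumes, as the question does, that row 1 is a header, that the most recent entry sits on row 2, and that the closing value is in the fifth column (index 4). The name `latest_close` is hypothetical. As far as the original indentation can be inferred, the `for` loop there consumes the whole reader on its first pass, so `close` ends up holding the last row, and `print(close)` then runs once per `while` iteration, which would explain the two values.

```python
import csv


def latest_close(path, column=4):
    """Return the value in `column` from the first data row of a CSV file.

    Assumes the file has a header row followed by the most recent record.
    """
    with open(path, newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader)              # skip the header row (assumed present)
        first_row = next(reader)  # first data row, assumed to be the newest
        return first_row[column]
```

Calling `next()` on the reader pulls exactly one row at a time, so no counter is needed and the rest of the file is never read.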

Related

Dataframe is of type 'nonetype'. How should I alter this to allow merge function to operate?

I have pulled in data from a number of csv files, as well as a database. I wish to use a merge function to make a dataframe isolating the phone numbers that are contained in both dataframes (one originating from csv, the other originating from the database). However, the dataframe from the database displays as type 'nonetype', which rules out any operation such as merge. How can I change this to allow the operation?
The data comes in from the database as a list of tuples. I then convert this to a dataframe. However, as stated above, it displays as 'nonetype'. I'm assuming at the moment that I am confused about how dataframes handle data types.
#Grab Data
mycursor = mydb.cursor()
mycursor.execute("SELECT DISTINCT(Cell) FROM crm_data.ap_clients Order By Cell asc;")
apclients = mycursor.fetchall()
#Clean Phone Number Data
for index, row in data.iterrows():
    data['phone_number'][index] = data['phone_number'][index][-10:]
for index, row in data2.iterrows():
    data2['phone_number'][index] = data2['phone_number'][index][-10:]
for index, row in data3.iterrows():
    data3['phone_number'][index] = data3['phone_number'][index][-10:]
#make data frame from csv files
fbl = pd.concat([data,data2,data3], axis=0, sort=False)
#make data frame from apclients(database extraction)
apc = pd.DataFrame(apclients)
#perform merge finding all records in both frames
successfulleads= pd.merge(fbl, apc, left_on ='phone_number', right_on='0')
#type(apc) returns NoneType
The expected results are to find all records in both dataframes, along with a count so that I may compare the two sets. Any help is greatly appreciated from this great community :)
So it looks like I had a function to rename the column of the dataframe as shown below:
apc = apc.rename(columns={'0': 'phone_number'}, inplace=True)
for col in apc.columns:
    print(col)
The part of the code above that is responsible:
inplace=True
With inplace=True, rename() modifies the dataframe in place and returns None instead of a new dataframe. Assigning that return value back to apc is what replaced the dataframe with None.
Hope this helps whoever ends up in my position. A great thanks again to the community. :)
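
A small sketch of the behaviour described above, using made-up phone numbers in place of the database rows (the sample data and the variable names `rows`, `apc_fixed` and `successful_leads` are assumptions for illustration). Note also that a dataframe built from a list of tuples gets the integer column label 0, not the string '0', so the rename key here is the integer.

```python
import pandas as pd

# Hypothetical stand-in for the list of tuples returned by fetchall()
rows = [("5551234567",), ("5559876543",)]

# With inplace=True the dataframe is changed in place and None is returned,
# so assigning the result to a name leaves that name bound to None.
apc = pd.DataFrame(rows)
result = apc.rename(columns={0: "phone_number"}, inplace=True)
print(type(result))       # <class 'NoneType'>
print(list(apc.columns))  # ['phone_number']

# Without inplace, rename returns a new dataframe that can be assigned.
apc_fixed = pd.DataFrame(rows).rename(columns={0: "phone_number"})

# Hypothetical stand-in for the concatenated csv data
fbl = pd.DataFrame({"phone_number": ["5551234567", "5550000000"]})

# Inner merge keeps only the phone numbers present in both frames
successful_leads = pd.merge(fbl, apc_fixed, on="phone_number")
print(len(successful_leads))
```

Either form works on its own; the problem only appears when the two are combined, that is, when `inplace=True` is used and the return value is assigned.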

How to avoid empty rows after actual data by using PHPExcel

I am using PHPExcel for importing data to the mysql database.
My code is,
require APPPATH . 'phpexcel/PHPExcel/IOFactory.php';
$objPHPExcel = PHPExcel_IOFactory::load($_FILES['ifile']['tmp_name']);
$data = $objPHPExcel->getActiveSheet(0)->toArray(null, true, true, true);
My Excel sheet has 14 rows, but $objPHPExcel->setActiveSheetIndex(0)->getHighestRow() returns 1047856 rows. Because of this the processing time is too high, so $data returns an error and the server gets slow. How can I avoid this?
No, your Excel sheet has something in those rows: whether data or styling or print settings or whatever, they exist in the Excel file itself. However, there is a getHighestDataRow() method that looks at the actual content of cells rather than simply their existence in a file. It will still return cells that contain a NULL or an empty string, but is probably better for your use.
If getHighestDataRow() resolved your problem with the row count, then you should probably also consider using rangeToArray() rather than toArray().
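
Because the highest data row can still include cells that hold NULL or an empty string, a further trim of the loaded array may be useful. The idea is sketched below in Python purely as a language-neutral illustration; it is not PHPExcel code, and the function name `trim_trailing_empty_rows` is hypothetical.

```python
def trim_trailing_empty_rows(rows):
    """Drop rows at the end whose cells are all None or blank strings.

    Empty rows in the middle of the data are kept; only the tail is trimmed.
    """
    def is_empty(row):
        return all(cell is None or str(cell).strip() == "" for cell in row)

    end = len(rows)
    while end > 0 and is_empty(rows[end - 1]):
        end -= 1
    return rows[:end]


sheet = [["a", 1], [None, ""], ["b", 2], [None, None], ["", " "]]
print(trim_trailing_empty_rows(sheet))
```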

how to retrieve data from a column and produce that value into a new column

I'm super new to R and have a question on how to do something. I listed the things that I got to work so people have an idea of what is going on. The thing I'm having trouble with is in bold.
- I have a data spreadsheet with 2 columns of data (End and CTCF), with the CTCF column having more cells.
- I want to take one value from the "End" column and subtract that value from each individual value in the "CTCF" column (so I would have a bunch of differences, one from each calculation).
- I want to then compare those differences and find the minimum absolute value and the corresponding spot in the CTCF column.
- Then place that value into a new column adjacent to the corresponding End value.
I wrote a while loop (I know there is probably a WAY easier method) and got the calculation/comparison thing down. I was even able to output the location of the CTCF cell that contains my value of interest, see below:
data2<-read.csv("farah.csv")
head(data2)
periph_ctcfs<-list()
temp<-vector(mode="numeric", length = 356)
count<-1
for(i in 1:length(data2$CTCF)) while (count<357)
{
  End<-data2$End[count]
  periph_ctcfs<-(End-data2$CTCF)
  periph_ctcfs<-abs(periph_ctcfs)
  periph_ctcfs<-which.min(periph_ctcfs)
  print(periph_ctcfs)
  temp[]<-data2$CTCF[periph_ctcfs]
  count<-count + 1;
}
The problem comes when I try to produce the new "periph_ctcfs" column: when I insert the value into the "temp" vector, the last printed number gets placed in all the cells of "temp". It feels like each time the loop goes through, it is not inserting the retrieved value into "temp". Can anyone help? Thanks. I've included a link to a photo (below) so you can get a visual on the layout of the data. Sorry for being a n00b.
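
One likely cause is the assignment `temp[]<-`: in R, an empty index assigns to every element, so each pass overwrites the whole vector and only the last value survives. Storing by position (for example `temp[count]`) would keep one result per End value. The intended one-result-per-iteration logic is sketched below in Python; the function name `nearest_values` and the sample numbers are hypothetical.

```python
def nearest_values(ends, ctcfs):
    """For each value in `ends`, return the element of `ctcfs` closest to it.

    Ties go to the earliest element, as with R's which.min.
    """
    result = []
    for end in ends:
        closest = min(ctcfs, key=lambda c: abs(end - c))
        result.append(closest)  # one stored value per End, not a blanket overwrite
    return result


print(nearest_values([10, 50], [8, 30, 47, 90]))
```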

Access writing to wrong row number

4150
NRrows = RSNonResourceCosts.RecordCount ' Number of Rows in Non Resource Table
NRCols = RSNonResourceCosts.Fields.Count ' Number of Fields in NonResource Table
Dim CL(1 To 10) As Integer ' This is to count "filled rows" when spreadsheet is filled
Dim Header(1 To 10) As String
'-----------
'Find the Headers (Taken from Actual Table and not predefined as original)
For Each Recordsetfieldx In RSNonResourceCosts.Fields
    If C > 0 Then
        Header(C) = Recordsetfieldx.Name
    End If
    C = C + 1
Next Recordsetfieldx
4170
R = 0
'Write to worksheet
RSNonResourceCosts.MoveFirst
Do Until RSNonResourceCosts.EOF
    For C = 1 To NRCols - 1
        FieldName = RSNonResourceCosts.Fields(C).Value
        If RSNonResourceCosts.Fields(Header(C)).Value <> "" Then
            CL(C) = CL(C) + 1
            WKS.Cells(200 + R, C) = RSNonResourceCosts.Fields(Header(C)).Value
        End If
    Next C
    RSNonResourceCosts.MoveNext
    R = R + 1
Loop
I attach code. Have solved part of original by defining Recordset. User can add column to Table. First part of code determines the headers. Second part determines values and writes to worksheet. The new Rows are appearing first on the worksheet and in wrong column. I tried attaching worksheet but it looked awful. Any help would be appreciated.
Two things:
1) The order of your records is the order they are in the recordset. If you want them in a particular order, try sorting them (perhaps with an ORDER BY in the underlying SQL statement).
2) For the column issue: In the first bit of code, I don't see where C is initialized, but keep in mind the Headers and Fields both start with an index of 0, so if you set Header(1) = the first field's header (index 0), but then copy the data in the fields without shifting the index value, it will shift everything over by one column.
As an added note, you might want to consider what happens when you have more than 10 columns. Using fixed-length arrays means your code will break. You might want to read about using a dynamic array and ReDim.
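
The index shift described in point 2 can be sketched in Python. This mimics only the header loop from the question (a header is stored when C > 0, and C increments once per field); the field names and the function name `build_headers` are hypothetical.

```python
def build_headers(field_names, start):
    """Mimic the VBA header loop for a given starting value of C."""
    headers = {}
    c = start
    for name in field_names:
        if c > 0:
            headers[c] = name
        c += 1
    return headers


fields = ["ID", "Cost", "Vendor"]  # positions 0, 1, 2 in the recordset

# C starts at 0: the first field is skipped and Header(C) matches Fields(C)
print(build_headers(fields, 0))

# C starts at 1: Header(1) now names field 0, so everything is off by one
print(build_headers(fields, 1))
```

With a start of 0 the name stored at index 1 is the same field that sits at position 1, whereas with a start of 1 the two disagree, which is the one-column shift.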
I don't feel like I have completely grasped the entirety of the problem yet, but let me take a stab at it. From what I do understand, data is being written from your record set into Excel (good), but it is going into the 'wrong row' (question title) and the 'wrong column' (question text).
From what I see, I don't know the purpose of FieldName = RSNonResourceCosts.Fields(C).Value, but I want to make sure that you understand that RSNonResourceCosts.Fields(C).Value is not necessarily equivalent to RSNonResourceCosts.Fields(Header(C)).Value. More than that, you are likely missing at least one column altogether in your output, or at least skipping over it accidentally. rs.Fields(0).name is the first 'column' in a recordset, but it is completely ignored in your code. Perhaps this is intentional, maybe it is a key field or something useless to you, but it is important that you are making that distinction intentionally. But, since I don't see where your code populates the headers in your worksheet, I wonder if 'wrong column' means every record has been shifted a column and your last column is sitting empty. That, coupled with the dubious omission of C being initialized as 0 (not 1, or anything else) in your above code, makes me concerned that Header(3) could possibly be field(1), or field(4), or I don't know. That would certainly also confuse the columns in your output, or at least make dependence on FieldName frustrating.
Another thing, really a shot in the dark: NRrows. I have had issues before, depending on how I create my recordset, of not getting the correct record count the first time. And, if I base the population of a worksheet, array, etc., on the number of rows and the record's relative position in that number, my records get all sorts of wacky. Maybe you did this already, but since it isn't shown, I recommend a RSNonResourceCosts.movelast: RSNonResourceCosts.movefirst line before you define NRrows, just to be sure.
And last, if I am way off base here... then you really are going to have to show us the spreadsheet, even if it isn't your most beautiful work. We all know that if it were, you wouldn't be asking about it here... so set your pride aside, and be more specific as well as show us what the output looks like and how it should look.

InStrRev Not Giving Correct Results

I have an Access database where I'm importing book/journal publication data from JabRef in a CSV format.
When I import the data to Access one of the odd things that happens is that the page numbers are given two hyphens in between them, so the data in the "pages" column in Access would look something like "200--213"
I need to be able to count the number of pages that are referenced.
In order to do this I do the following in unbound text boxes on the form:
I find the length of the string in the "pages" column (I had to rename the field to pagesset because pages is a reserved name): PLen = Len([pagesset])
I find the number of characters that happen from the left up to the "--": LPageVar = InStr([pagesset],"--")
I find the number of characters that happen from the right up to the "--": RPageVar = InStrRev([pagesset],"--")
I find the actual page number on the left side of the "--": LVal = Left([pagesset],[LPageVar]-1)
I find the actual page number on the right side of the "--": RVal = Right([pagesset],[RPageVar]-1)
I calculate the number of pages that appear: Pgcnt = RVal - LVal
Everything seems to work... except when the "InStrRev" hits an item that increments the number by the 10 or 100 spot, like this: "7--11", "7--23", or "92--101" as opposed to this: "102--123" or "103--110" (which causes no issues). When it hits these shorter page ranges, the RPageVar is too low by 1.
For each of these items on the right, RVal seems to drop the first character... so for "7--11" last page is reported as 1 or for "7--23" it would report the last page as 3 or "92--101" the last page is reported as 01. This causes these particular page counts to be negative.
Does anyone have an idea as to why I'm getting this behavior?
InStrRev() searches from the end of the string, but the location it returns is relative to the beginning of the string, not the end. So,
s = "this is a test--1"
Debug.Print InStrRev(s,"--")
displays 15, and
Right("this is a test--1",15)
is obviously not going to isolate the "1" at the end of the string. That would be done with
Mid(s,InStrRev(s,"--")+2)
or, in the case where there is only one instance of "--" in the string
Mid(s,InStr(s,"--")+2)
would also work.
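
The same point can be checked with a Python sketch that emulates the 1-based VBA functions. The helpers `instr_rev`, `right` and `last_page` are hypothetical stand-ins written for illustration, not VBA or library functions.

```python
def instr_rev(s, sub):
    """1-based position of the last occurrence of sub, or 0 if absent.

    Like InStrRev, the position is counted from the start of the string.
    """
    return s.rfind(sub) + 1


def right(s, n):
    """Last n characters of s, like VBA Right."""
    return s[-n:] if n > 0 else ""


def last_page(s, sep="--"):
    """Everything after the last separator, like Mid(s, InStrRev(s, sep) + 2)."""
    pos = instr_rev(s, sep)
    return s[pos - 1 + len(sep):]


for pages in ("102--123", "7--11", "92--101"):
    wrong = right(pages, instr_rev(pages, "--") - 1)
    print(pages, "Right:", wrong, "Mid:", last_page(pages))
```

Using the position as a length for `Right` only happens to work when both page numbers have the same number of digits, as in "102--123".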
Here is a different approach which is less confusing for me; not sure if it will be less confusing for anyone else, though.
? PageCount("200--213")
14
? PageCount("7--11")
5
Function PageCount(ByVal pIn As String) As Long
    Dim astrPageRange() As String
    astrPageRange = Split(pIn, "--")
    PageCount = (Val(astrPageRange(1)) - Val(astrPageRange(0))) + 1
End Function