I'm trying to get all of a specific user's tweets.
I know there is a limit of 3600 retrievable tweets, so I'm wondering why this request doesn't return more:
https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&screen_name=mybringback&count=3600
Does anyone know how to fix this?
The API documentation specifies that the maximum number of statuses that this call will return is 200.
https://dev.twitter.com/docs/api/1/get/statuses/user_timeline
Specifies the number of tweets to try and retrieve, up to a maximum of 200. The value of count is best thought of as a limit to the number of tweets to return because suspended or deleted content is removed after the count has been applied. We include retweets in the count, even if include_rts is not supplied. It is recommended you always send include_rts=1 when using this API method.
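Because each call returns at most 200 statuses, getting more means calling the endpoint repeatedly and passing max_id so that each request starts just below the oldest tweet already seen. A minimal sketch of that loop, assuming a hypothetical fetch_page(count, max_id) helper that wraps the HTTP request and returns a list of tweet dicts, newest first (the helper and the fake data are illustrative, not part of any Twitter library):

```python
def collect_timeline(fetch_page, limit, page_size=200):
    """Page backwards through a timeline using max_id.

    fetch_page(count, max_id) is a hypothetical callable that returns a list
    of tweet dicts (newest first), each with an integer "id".
    """
    tweets = []
    max_id = None
    while len(tweets) < limit:
        page = fetch_page(count=page_size, max_id=max_id)
        if not page:
            break  # nothing older is available
        tweets.extend(page)
        # ask only for tweets strictly older than the last one received
        max_id = page[-1]["id"] - 1
    return tweets[:limit]


# Stand-in for the real HTTP call: 450 fake tweets with ids 450..1.
def fake_fetch(count, max_id=None):
    top = 450 if max_id is None else min(max_id, 450)
    return [{"id": i} for i in range(top, max(top - count, 0), -1)]


# 3600 is the figure mentioned in the question
print(len(collect_timeline(fake_fetch, limit=3600)))  # 450
```

In real use, fetch_page would issue the user_timeline request with count=200 and the given max_id, and would also need to handle rate limiting.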
Here's something I've used for a project that had to do just that:
import json
import subprocess  # Python 3 replacement for the Python 2-only "commands" module
import time

def get_followers(screen_name):
    followers_list = []
    # start cursor at -1
    next_cursor = -1
    print("Getting list of followers for user '%s' from Twitter API..." % screen_name)
    while next_cursor:
        cmd = 'twurl "/1.1/followers/ids.json?cursor=' + str(next_cursor) + \
              '&screen_name=' + screen_name + '"'
        (status, output) = subprocess.getstatusoutput(cmd)
        # convert json object to dictionary and ensure there are no errors
        try:
            data = json.loads(output)
            if data.get("errors"):
                # if we get an inactive account, write error message
                if data.get('errors')[0]['message'] in ("Sorry, that page does not exist",
                                                        "User has been suspended"):
                    print("Skipping account %s. It doesn't seem to exist" % screen_name)
                    break
                elif data.get('errors')[0]['message'] == "Rate limit exceeded":
                    print("\t*** Rate limit exceeded ... waiting 2 minutes ***")
                    time.sleep(120)
                    continue
                # otherwise, raise an exception with the error
                else:
                    raise Exception("The Twitter call returned errors: %s"
                                    % data.get('errors')[0]['message'])
            if data.get('ids'):
                print("\t\tFound %s followers for user '%s'" % (len(data['ids']), screen_name))
                followers_list += data['ids']
            if data.get('next_cursor'):
                next_cursor = data['next_cursor']
            else:
                break
        except ValueError:
            print("\t****No output - Retrying \t\t%s ****" % output)
    return followers_list

screen_name = 'AshwinBalamohan'
followers = get_followers(screen_name)
# both placeholders need a value, so pass a tuple
print("\n\nThe followers for user '%s' are:\n%s" % (screen_name, followers))
In order to get this to work, you'll need to install the Ruby gem 'Twurl', which is available here: https://github.com/marcel/twurl
I found Twurl easier to work with than the Python Twitter wrappers, so I opted to call it from Python. Let me know if you'd like me to walk you through installing Twurl and setting up the Twitter API keys.
I have seen many people run into this issue here. I am still struggling to check whether what my dictionary captures from a JSON response is None, and I keep getting the error below.
This code is supposed to poll an HTTP endpoint, looking for the value 'closed' in the 'status' key, until it finds it (or for at most 10 attempts). When a payment is made by QR code, the status changes from opened to closed.
status = (my_dict['elements'][0]['status'])
TypeError: 'NoneType' object is not subscriptable
Any clue what I am doing wrong and how I can fix it?
Also, if I run the part of the script that fetches the JSON on its own, it executes smoothly every time. Could anything in the code be affecting the request?
By the way, I started programming one week ago, so please excuse me if I mix up concepts or say something that lacks common sense.
I have tried writing the if check with "is not" instead of "!=", and with None instead of "".
def show_qr():
    reference_json = reference.replace(' ','%20') #replaces "space" with %20 for CURL assembly
    url = "https://api.mercadopago.com/merchant_orders?external_reference=" + reference_json #CURL URL concatenate
    headers = CaseInsensitiveDict()
    headers["Authorization"] = "Bearer MY_TOKEN"
    pygame.init()
    ventana = pygame.display.set_mode(window_resolution,pygame.FULLSCREEN) #screen settings
    producto = pygame.image.load("qrcode001.png") #Qr image load
    producto = pygame.transform.scale(producto, [640,480]) #Qr size
    trials = 0 #sets while loop variable start value
    status = "undefined" #defines "status" variable
    while status != "closed" and trials<10: #to repeat the loop until "status" value = "closed"
        ventana.blit(producto, (448,192)) #QR code position setting
        pygame.display.update() #
        response = requests.request("GET", url, headers=headers) #makes CURL GET
        lag = 0.5 #creates an incremental 0.5 seconds everytime return value is None
        sleep(lag) #
        json_data = (response.text) #Captures JSON response as text
        my_dict = json.loads(json_data) #creates a dictionary with JSON data
        if json_data != "": #Checks if json_data is None
            status = (my_dict['elements'][0]['status']) #If json_data is not none, asigns 'status' key to "status" variable
        else:
            lag = lag + 0.5 #increments lag
        trials = trials + 1 #increments loop variable
        sleep (5) #time to avoid being banned from server.
        print (trials)
From the error alone, it's not clear what the issue is. Any part of that expression can raise a TypeError if the value being subscripted is None. For example, my_dict['elements'][0]['status'] fails if my_dict is None, if my_dict['elements'] is None, or if my_dict['elements'][0] is None.
I would try inserting breakpoints to narrow down the cause. Another option is to wrap each part of the expression in its own try/except block, as below:
my_dict = None
try:
    elements = my_dict['elements']
except TypeError as e:
    print("It's possible that my_dict may be None.")
    print('Error:', e)
else:
    try:
        ele = elements[0]
    except TypeError as e:
        print("It's possible that elements may be None.")
        print('Error:', e)
    else:
        try:
            status = ele['status']
        except TypeError as e:
            print("It's possible that the first element may be None.")
            print('Error:', e)
        else:
            print('Got the status successfully:', status)
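Once you know which level is None, the polling loop can simply treat a missing status as "not ready yet" rather than crashing. A minimal sketch of that idea; extract_status is a hypothetical helper name, and the sample payloads are made up to illustrate the cases rather than taken from the real API:

```python
def extract_status(payload):
    """Return the first element's status, or None if any level is missing."""
    if not isinstance(payload, dict):
        return None
    elements = payload.get("elements")
    # covers None, an empty list, and anything that is not a list
    if not isinstance(elements, list) or not elements:
        return None
    first = elements[0]
    if not isinstance(first, dict):
        return None
    return first.get("status")


print(extract_status({"elements": [{"status": "closed"}]}))  # closed
print(extract_status({"elements": None}))                    # None
print(extract_status(None))                                  # None
```

In the loop from the question, the result could be compared with "closed" directly, and a None result would just lead to another attempt.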
I am currently doing a tweet search using the Twitter API. However, continuing the search from the last tweet id is not working for me.
Here is my code:
searchQuery = '#BLM' # this is what we're searching for
searchQuery = searchQuery + "-filter:retweets"
Geocode="39.8, -95.583068847656, 2500km"
maxTweets = 1000000 # Some arbitrary large number
tweetsPerQry = 100 # this is the max the API permits
fName = 'tweetsBLM.json' # We'll store the tweets in a json file.
sinceId = None
#max_id = -1 # initial search
max_id=1278836959926980609 # the last id of previous search
tweetCount = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, 'w') as f:
    while tweetCount < maxTweets:
        try:
            if (max_id <= 0):
                if (not sinceId):
                    new_tweets = api.search(q=searchQuery,lang="en", geocode=Geocode,
                                            count=tweetsPerQry)
                else:
                    new_tweets = api.search(q=searchQuery,lang="en",geocode=Geocode,
                                            count=tweetsPerQry,
                                            since_id=sinceId )
            else:
                if (not sinceId):
                    new_tweets = api.search(q=searchQuery, lang="en", geocode=Geocode,
                                            count=tweetsPerQry,
                                            max_id=str(max_id - 1) )
                else:
                    new_tweets = api.search(q=searchQuery, lang="en", geocode=Geocode,
                                            count=tweetsPerQry,
                                            max_id=str(max_id - 1),
                                            since_id=sinceId)
            if not new_tweets:
                print("No more tweets found")
                break
            for tweet in new_tweets:
                f.write(jsonpickle.encode(tweet._json, unpicklable=False) +
                        '\n')
            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            max_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            # Just exit if any error
            print("some error : " + str(e))
            print('exception raised, waiting 15 minutes')
            print('(until:', dt.datetime.now() + dt.timedelta(minutes=15), ')')
            time.sleep(15*60)
            break
print ("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))
This code works perfectly fine. I initially ran it and got about 40,000 tweets. Then I took the id of the last tweet from that initial search to go further back in time. However, I was disappointed to see that no more tweets were returned. I cannot believe that there are none. I must be going wrong somewhere, because #BLM has been very active in the last 2-3 months.
Any help is very welcome. Thank you
I may have found the answer. Using the Twitter search API, it is not possible to get older tweets (7 days old or more). Using max_id to get around this is not possible either.
The only way is to stream and keep collecting for more than 7 days.
Finally, there is also this package, which looks up older tweets:
https://pypi.org/project/GetOldTweets3/ (it is an extension of Jefferson Henrique's original work).
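One way to check whether a given max_id already falls outside that window is to decode the timestamp embedded in the tweet id. A minimal sketch, assuming the id is a Twitter snowflake id (milliseconds since the Twitter epoch stored in the bits above the lowest 22); the function names and the 7-day default are illustrative:

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657  # offset used by Twitter's snowflake ids


def tweet_time(tweet_id):
    """Return the UTC creation time encoded in a snowflake tweet id."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def within_search_window(tweet_id, now=None, days=7):
    """True if the tweet is newer than `days` days relative to `now`."""
    now = now or datetime.now(timezone.utc)
    return now - tweet_time(tweet_id) < timedelta(days=days)


last_id = 1278836959926980609  # the max_id used in the question
print(tweet_time(last_id))
print(within_search_window(last_id))
```

If within_search_window returns False for the id you are about to page back from, the search endpoint is unlikely to return anything older.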
I am trying to update one table from multiple processes via pymysql. Each process reads a CSV file split from a huge one, in order to speed up the import. But I get a "Lock wait timeout exceeded; try restarting transaction" exception when I run the script. After searching this site, I found one post that mentioned using the built-in LOAD DATA INFILE, but it gave no details. How can I do this with pymysql?
---------------------------first edit----------------------------------------
Here's the job method:
def importprogram(path, name):
    begin = time.time()
    print('begin to import program' + name + ' info.')
    # "c:\\sometest.csv"
    file = open(path, mode='rb')
    csvfile = csv.reader(codecs.iterdecode(file, 'utf-8'))
    connection = None
    try:
        connection = pymysql.connect(host='a host', user='someuser', password='somepsd', db='mydb',
                                     cursorclass=pymysql.cursors.DictCursor)
        count = 1
        with connection.cursor() as cursor:
            sql = '''update sometable set Acolumn='{guid}' where someid='{pid}';'''
            next(csvfile, None)
            for line in csvfile:
                try:
                    count = count + 1
                    if ''.join(line).strip():
                        command = sql.format(guid=line[2], pid=line[1])
                        cursor.execute(command)
                    if count % 1000 == 0:
                        print('program' + name + ' cursor execute', count)
                except csv.Error:
                    print('program csv.Error:', count)
                    continue
                except IndexError:
                    print('program IndexError:', count)
                    continue
                except StopIteration:
                    break
    except Exception as e:
        print('program' + name, str(e))
    finally:
        connection.commit()
        connection.close()
        file.close()
    print('program' + name + ' info done.time cost:', time.time()-begin)
And the multi-processing method:
import multiprocessing as mp

def multiproccess():
    pool = mp.Pool(3)
    results = []
    paths = ['C:\\testfile01.csv', 'C:\\testfile02.csv', 'C:\\testfile03.csv']
    name = 1
    for path in paths:
        results.append(pool.apply_async(importprogram, args=(path, str(name))))
        name = name + 1
    # build a list so the results are printed, not a generator object
    print([result.get() for result in results])
    pool.close()
    pool.join()
And the main method:
if __name__ == '__main__':
    multiproccess()
I am new to Python. What is wrong with the code, or with the approach itself? Should I use a single process to do all the reading and importing?
Your issue is that a statement is waiting longer than the server's lock wait timeout (innodb_lock_wait_timeout, 50 seconds by default) for a lock that another process's uncommitted transaction still holds, so MySQL aborts it.
In my experience, the simplest fix is to raise the wait timeout to something like 6000 seconds, combine everything into one CSV, and just leave the data to import. I would also recommend running the query directly in MySQL rather than from Python.
The way I usually import CSV data from Python to MySQL is through the INSERT ... VALUES ... method, and I only do so when some kind of manipulation of the data is required (i.e. inserting different rows into different tables).
I like your approach and understand your thinking, but in reality there is no need for multiple processes. The benefit of the INSERT ... VALUES ... method is that you are much less likely to run into timeout issues.
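If the updates do have to come from Python, a single process that sends parameterized statements in small committed batches keeps each transaction short, so locks are released quickly. A rough sketch under stated assumptions: update_in_batches and read_pairs are hypothetical helpers, the table and column names come from the question, and sqlite3 is used only as a stand-in DB-API driver so the example runs without a MySQL server (with pymysql the placeholder would stay "%s"):

```python
import csv
import io
import sqlite3  # stand-in DB-API driver; swap in a pymysql connection for MySQL


def update_in_batches(connection, rows, batch_size=1000, placeholder="%s"):
    """Apply (guid, pid) updates in small committed batches; return rows sent."""
    sql = "UPDATE sometable SET Acolumn = {0} WHERE someid = {0}".format(placeholder)
    cursor = connection.cursor()
    sent = 0
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cursor.executemany(sql, batch)
            connection.commit()  # short transactions release row locks early
            sent += len(batch)
            batch = []
    if batch:  # flush whatever is left over
        cursor.executemany(sql, batch)
        connection.commit()
        sent += len(batch)
    cursor.close()
    return sent


def read_pairs(csv_text):
    """Yield (guid, pid) from CSV text, skipping the header and short rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)
    for line in reader:
        if len(line) >= 3:
            yield (line[2], line[1])


# Demo against an in-memory SQLite table with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (someid TEXT PRIMARY KEY, Acolumn TEXT)")
conn.executemany("INSERT INTO sometable VALUES (?, NULL)", [("p1",), ("p2",)])
data = "n,pid,guid\n1,p1,g-1\n2,p2,g-2\n"
print(update_in_batches(conn, read_pairs(data), batch_size=1, placeholder="?"))
print(conn.execute("SELECT someid, Acolumn FROM sometable ORDER BY someid").fetchall())
```

Passing the values as parameters instead of formatting them into the SQL string also avoids quoting problems in the CSV data.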
I am trying to write a program that asks the user for several numbers, and I want it to ask again whenever the user enters something other than a number. I tried to define one function that I could use for all of the inputs, but every time I run the program, it crashes. Any help would be much appreciated, thank you.
My code:
def error():
    global m1
    global m2
    global w1
    global w2
    while True:
        try:
            int(m1 or m2 or w1 or w2)
        except ValueError:
            try:
                float(m1 or m2 or w1 or w2)
            except ValueError:
                m1 or m2 or w1 or w2=input("please input your response correctly...")
        break
m1=input("\nWhat was your first marking period percentage?")
error()
w1=input("\nWhat is the weighting of the first marking period? (in decimal)")
error()
m2=input("\nWhat was your second marking period percentage?")
error()
w2=input("\nWhat is the weighting of the second marking period? (in decimal)")
error()
def user_input(msg):
    inp = input(msg)
    try:
        return int(inp) if inp.isnumeric() else float(inp)
    except ValueError as e:
        return user_input("Please enter a numeric value")
m1=user_input("\nWhat was your first marking period percentage?")
w1=user_input("\nWhat is the weighting of the first marking period? (in decimal)")
m2=user_input("\nWhat was your second marking period percentage?")
w2=user_input("\nWhat is the weighting of the second marking period? (in decimal)")
You should write your function to get one number at a time. If an exception is raised somewhere, it should be handled where it occurs. Note how the get_number function shown below keeps asking for a number, and also shows the prompt specified by its caller. If you are not running Python 3.6 or higher, you will need to comment out the call to print in the main function, because it uses an f-string.
#! /usr/bin/env python3
def main():
    p1 = get_number('What is your 1st marking period percentage? ')
    w1 = get_number('What is the weighting of the 1st marking period? ')
    p2 = get_number('What is your 2nd marking period percentage? ')
    w2 = get_number('What is the weighting of the 2nd marking period? ')
    score = calculate_score((p1, p2), (w1, w2))
    print(f'Your score is {score:.2f}%.')

def get_number(prompt):
    while True:
        try:
            text = input(prompt)
        except EOFError:
            raise SystemExit()
        else:
            try:
                number = float(text)
            except ValueError:
                print('Please enter a number.')
            else:
                break
    return number

def calculate_score(percentages, weights):
    if len(percentages) != len(weights):
        raise ValueError('percentages and weights must have same length')
    return sum(p * w for p, w in zip(percentages, weights)) / sum(weights)

if __name__ == '__main__':
    main()
With the following code you can write a function that reports whether the input is an integer:
def input_type(text):
    # int() raises ValueError for anything that is not an integer literal
    try:
        int(text)
    except ValueError:
        print("not integer")
    else:
        print("integer")
a = input()
input_type(a)
OK, I am trying to create a function that reads a list of IDs from an external JSON file, which it is doing. It even puts the data into the database when the program loads. My issue is that I can't seem to match the list IDs in a comparison. Here is my current code:
def check(account):
    global ID_account
    import json, httplib
    if not hasattr(BigWorld, 'iddata'):
        UID_DB = account['databaseID']
        UID = ID_account
        try:
            conn = httplib.HTTPConnection('URL')
            conn.request('GET', '/ids.json')
            conn.sock.settimeout(2)
            resp = conn.getresponse()
            qresp = resp.read()
            BigWorld.iddata = json.loads(qresp)
            LOG_NOTE('[ABRO] Request of URL data successful.')
            conn.close()
        except:
            LOG_NOTE('[ABRO] Http request to URL problem. Loading local data.')
        if UID_DB is not None:
            list = BigWorld.iddata["ids"]
            #print (len(list) - 1)
            for n in range(0, (len(list) - 1)):
                #print UID_DB
                #print list[n]
                if UID_DB == list[n]:
                    #print '[ABRO] userid located:'
                    #print UID_DB
                    UID = UID_DB
        else:
            LOG_NOTE('[ABRO] userid not set.')
        if 'databaseID' in account and account['databaseID'] != UID:
            print '[ABRO] Account not active in database, game closing...... '
            BigWorld.quit()
My JSON file looks like this:
{
"ids":[
"1001583757",
"500687699",
"000000000"
]
}
When I run this with the commented-out prints enabled, it seems to execute perfectly fine up until it tries to do the match inside the for loop. Even when the prints show UID_DB and list[n] having the same values, it does not set my variable and it doesn't report any errors; it simply acts as if there was no match. Am I possibly missing a loop break? Here is the Python log, starting with the print of the list length:
INFO: 2
INFO: 1001583757
INFO: 1001583757
INFO: 1001583757
INFO: 500687699
INFO: [ABRO] Account not active, game closing......
As you can see from the log, it never prints the 'userid located' message, so it is not matching them. It just continues with the loop and uses the default ID I defined above the function. Any ideas would definitely help me out, as I've been poking and prodding this thing for 3 days now.
The answer was found by #VikasNehaOjha: the code was simply missing a conversion to make the types match before the comparison. I fixed this by adding:
list[n] = int(list[n])
That resolved my issue, and the comparison finally matched.
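The reason the prints looked identical is that an integer and a string with the same digits print the same but never compare equal. A small sketch of the mismatch and of converting the ids once when the JSON is loaded, assuming (as the fix above implies) that the account's databaseID is an integer; is_known and allowed_ids are illustrative names:

```python
import json

raw = '{"ids": ["1001583757", "500687699", "000000000"]}'

# Convert once at load time; a set also makes the membership test a single
# lookup instead of an index-based loop.
allowed_ids = {int(value) for value in json.loads(raw)["ids"]}


def is_known(database_id):
    """True if database_id (int or numeric string) is in the allowed set."""
    return int(database_id) in allowed_ids


print(1001583757 == "1001583757")  # False
print(is_known(1001583757))        # True
print(is_known(42))                # False
```

Using a membership test on the whole collection also avoids skipping the last entry, which a loop over range(0, len(list) - 1) would do.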