How can I answer my own HIT on Mechanical Turk using boto

Are there any methods in the boto library that would allow me to answer my own HIT programmatically? This would be very useful for automated testing. From my reading of the boto docs it does not seem possible.
I would like to do something like this:
First, post HITs to the Turk. Nothing out of the ordinary here; a good example of using boto to publish to Turk can be found here.
# establish a connection to Mechanical Turk
mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                      aws_secret_access_key=SECRET_KEY,
                      host=HOST)

# construct question forms (omitted for clarity; see tutorial above)

# publish the HIT to Mechanical Turk
mtc.create_hit(questions=question_form,
               max_assignments=1,
               title=title,
               description=description,
               keywords=keywords,
               duration=60*5,
               reward=0.05)
Here is what I don't know how to do. I want to answer my own HIT so that when I get the results, the answer fields will be populated. I know it can be done manually using the Worker Sandbox, but I want to incorporate this into unit tests, so it would be nice if it could be automated. I imagine it might look something like this:
answers = { 'question_1':'answer_1', 'question_2':'answer_2' }
mtc.answer_hit(hit_id=hit_id, answers=answers)
And finally, I want to get the results. This is pretty standard as well.
rs = mtc.get_reviewable_hits(page_size=100)
for hit in rs:
    assignments = mtc.get_assignments(hit.HITId)
    for assignment in assignments:
        for answer in assignment.answers[0]:
            # process answers
            pass

Nope. There is no Worker API for MTurk, so it's not possible to do this programmatically. It has been requested several times on the developer forum, but there seems to be no sign of activity from AWS toward creating one.
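If the goal is automated testing rather than a real Worker API, one workaround is to stub out the boto calls in your unit tests so the answer fields come back pre-populated. A minimal sketch, assuming the standard mock library; FakeAnswer and FakeAssignment are hypothetical stand-ins, not boto classes:
# Sketch: fake MTurkConnection.get_assignments in a unit test so the code
# under test sees populated answer fields without a real Worker answering.
from unittest import mock
from boto.mturk.connection import MTurkConnection

class FakeAnswer(object):
    def __init__(self, qid, text):
        self.qid = qid          # question identifier
        self.fields = [text]    # answer text, mirroring boto's parsed answers

class FakeAssignment(object):
    def __init__(self, answers):
        self.answers = [answers]  # matches the assignment.answers[0] usage above

fake = FakeAssignment([FakeAnswer('question_1', 'answer_1'),
                       FakeAnswer('question_2', 'answer_2')])

with mock.patch.object(MTurkConnection, 'get_assignments', return_value=[fake]):
    # run the result-processing code under test here
    pass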

Related

Pubmed API returns fewer results than web interface

I'm trying to access PubMed results via R using their API, but I consistently get fewer results than the same query yields in the web interface. By digging into the output, I noticed that the problem lies in a different query translation between the two access methods.
I am using the rentrez package, but the results I get are the same with other related R packages, so I guess it's related to the API itself.
Here's the code to reproduce the results:
install.packages('rentrez')
rentrez::entrez_search(db="pubmed", term = '((model OR models OR modeling OR network OR networks) AND (dissemination OR transmission OR spread OR diffusion) AND (nosocomial OR hospital OR "long-term-care" OR "long term care" OR "longterm care" OR "long-term care" OR "hospital acquired" OR "healtcare associated") AND (infection OR resistance OR resistant)) AND (2010[PDAT]:2020[PDAT])')$count
[1] 7157
The same query on https://pubmed.ncbi.nlm.nih.gov/ returns 9263 results.
Not sure if you still need this now; just in case someone else has the same problem:
I had the same issue as you did, and I found something that might be useful in a GitHub issue.
It seems that the API service needs to be updated to match the new web service, but it's been a year now and still no promising announcement has been made officially.
An alternative is provided by the easyPubMed author. Hope this is what you were looking for.
easyPubMed Issue
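One quick way to see the mismatch from R is to inspect the server-side query translation that rentrez returns on the esearch object, and compare it with the "Search details" box on the web interface; double-check the field name against your rentrez version:
# Inspect how the E-utilities API actually parsed the query; differences
# from the web interface's "Search details" explain the differing counts.
res <- rentrez::entrez_search(db = "pubmed", term = my_query)  # my_query = the long query above
res$count
res$QueryTranslation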

Only Output Rule Alerts to Suricata EVE

I have Suricata set up as a HIDS on a couple of lab instances, and I wrote some sample rules that alert on custom User-Agent headers and internal IPs that I can easily trigger, for the purpose of teaching someone how to use Suricata.
For an advanced use case, I want to output the EVE JSON file somewhere downstream for eventual data analytics and BI use cases.
For that purpose, I want to drop the "noise" from EVE, or have a way for fast.log to be output in JSON.
For instance, this is what I would consider "noise", since I just want to see triggered alerts:
,"event_type":"stats","stats":{"uptime":168,"capture":{"kernel_packets":313,"kernel_drops":0,"errors":0},"decoder":{"pkts":313,"bytes":68519,"invalid":0,"ipv4":305,"ipv6":0,"ethernet":313,"r$
{"timestamp":"2019-08-13T14:29:09.058698+0000","event_type":"stats","stats":{"uptime":176,"capture":{"kernel_packets":313,"kernel_drops":0,"errors":0},"decoder":{"pkts":313,"bytes":68519,"invalid":0,"ipv4":305,"ipv6":0,"ethernet":313,"r$
{"timestamp":"2019-08-13T14:29:17.059944+0000","event_type":"stats","stats":{"uptime":184,"capture":{"kernel_packets":313,"kernel_drops":0,"errors":0},"decoder":{"pkts":313,"bytes":68519,"invalid":0,"ipv4":305,"ipv6":0,"ethernet":313,"r$
I would only want to see stuff like this from fast.log
[**] [1:200002:6] ET USER_AGENTS Suspicious User Agent (BlackSun) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP}
So is there a way to get only the Alerts in EVE, or a way to transform Fast.log into JSON?
Found an answer for myself again.
On line 60 in the YAML, there is a value you can set to "no" for stats; that will eliminate probably 80% of the noise you have. You can go further and eliminate the metadata for DNS, TLS, TCP, HTTP, etc. to reduce your log file further if needed.
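For illustration, the relevant piece of suricata.yaml might look roughly like this; the exact line numbers and keys vary between Suricata versions, so treat this as a sketch:
outputs:
  - eve-log:
      enabled: yes
      filetype: regular
      filename: eve.json
      types:
        - alert        # keep only rule alerts in EVE
        # - http       # comment out the metadata event types you don't need
        # - dns
        # - tls
        # - stats      # dropping this removes the counters shown above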

STM32 StdPeriph library USART example

I downloaded the StdPeriph library and I want to get the USART example running on an STM32F4 Discovery. I chose the STM32F40_41xxx workspace, added the stm32f324x7i.c file, and compiled without any errors.
The issue is that I can't receive the expected message in my terminal (using Hercules); also, when I check RxBuffer, it is receiving some bytes, but not the ones I sent.
I have checked the baud rate, word length, and parity several times. Do you have any idea what I could be doing wrong?
USART conf:
USART_InitStructure.USART_BaudRate = 9600;
USART_InitStructure.USART_WordLength = USART_WordLength_8b;
USART_InitStructure.USART_StopBits = USART_StopBits_2;
USART_InitStructure.USART_Parity = USART_Parity_Odd;
USART_InitStructure.USART_HardwareFlowControl = USART_HardwareFlowControl_None;
USART_InitStructure.USART_Mode = USART_Mode_Rx | USART_Mode_Tx;
STM_EVAL_COMInit(COM1, &USART_InitStructure);
Thank you.
First of all, if you want to use high-level abstraction libraries, stop using the obsolete SPL and start using HAL. Install the Cube, generate the code, import it into your favorite IDE, and compile. It should work.
Your code does not show the whole picture: the USART clock may not be enabled, and the same goes for the GPIOs. The GPIOs may be configured the wrong way. Your system and peripheral clocks may have the wrong frequency. There are many more potential problems.
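For reference, a minimal SPL sketch of the initialization that is not shown in the posted snippet, assuming USART1 on PA9/PA10 (check which USART and pins COM1 actually maps to on your board):
/* Sketch only: enable the peripheral clocks and configure the pins,
 * assuming USART1 on PA9 (TX) / PA10 (RX). */
RCC_APB2PeriphClockCmd(RCC_APB2Periph_USART1, ENABLE);
RCC_AHB1PeriphClockCmd(RCC_AHB1Periph_GPIOA, ENABLE);

GPIO_PinAFConfig(GPIOA, GPIO_PinSource9, GPIO_AF_USART1);
GPIO_PinAFConfig(GPIOA, GPIO_PinSource10, GPIO_AF_USART1);

GPIO_InitTypeDef gpio;
gpio.GPIO_Pin = GPIO_Pin_9 | GPIO_Pin_10;
gpio.GPIO_Mode = GPIO_Mode_AF;      /* alternate function, not plain GPIO */
gpio.GPIO_OType = GPIO_OType_PP;
gpio.GPIO_PuPd = GPIO_PuPd_UP;
gpio.GPIO_Speed = GPIO_Speed_50MHz;
GPIO_Init(GPIOA, &gpio);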

How to use the Google api-client python library for Google Logging

I've been using the Google apiclient library in Python for various Google Cloud APIs - mostly for Google Compute - with great success.
I want to start using the library to create and control the Google Logging mechanism offered by the Google Cloud Platform.
However, this is a beta version, and I can't find any real documentation or examples of how to use the logging API.
All I was able to find are high-level descriptions such as:
https://developers.google.com/apis-explorer/#p/logging/v1beta3/
Can anyone provide a simple example of how to use apiclient for logging purposes?
For example, creating a new log entry...
Thanks for the help
Shahar
I found this page:
https://developers.google.com/api-client-library/python/guide/logging
Which states you can do the following to set the log level:
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
However, it doesn't seem to have any impact on the output, which is always INFO for me.
I also tried setting httplib2 to debuglevel 4:
import httplib2
httplib2.debuglevel = 4
Yet I don't see any HTTP headers in the log :/
I know this question is old, but it is getting some attention, so I guess it might be worth answering, in case someone else comes here.
Stackdriver Logging Client Libraries for Google Cloud Platform are not in beta anymore, as they hit General Availability some time ago. The link I shared contains the most relevant documentation for installing and using them.
After running the command pip install --upgrade google-cloud-logging, you will be able to authenticate with your GCP account, and use the Client Libraries.
Using them is as easy as importing the library with a command such as from google.cloud import logging, then instantiating a new client (either with the defaults, or passing the Project ID and credentials explicitly), and finally working with logs as you want.
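Since the original question asked about creating a new log entry, here is a minimal sketch of what that looks like with the Client Libraries (the log name my-test-log is just an example value):
# Minimal sketch: write a single text entry with the Client Libraries.
from google.cloud import logging

client = logging.Client()  # uses the default project/credentials unless passed explicitly
logger = client.logger('my-test-log')  # arbitrary example name
logger.log_text('Hello from the Python client library!', severity='INFO')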
You may also want to visit the official library documentation, where you will find all the details of how to use the library, which methods and classes are available, and how to do most of the things, with lots of self-explanatory examples, and even comparisons between the different alternatives on how to interact with Stackdriver Logging.
As a small example, let me also share a snippet that retrieves the five most recent log entries with severity WARNING or higher:
# Import the Google Cloud Python client library
from google.cloud import logging
from google.cloud.logging import DESCENDING
# Instantiate a client
logging_client = logging.Client(project=<PROJECT_ID>)
# Set the filter to apply to the logs; this one retrieves GAE logs from the default service with a severity of WARNING or higher
FILTER = 'resource.type:gae_app and resource.labels.module_id:default and severity>=WARNING'
# List the entries in DESCENDING order, applying the FILTER, and stop after the five most recent ones
for i, entry in enumerate(logging_client.list_entries(order_by=DESCENDING, filter_=FILTER)):  # API call
    print('{} - Severity: {}'.format(entry.timestamp, entry.severity))
    if i >= 4:
        break
Bear in mind that this is just a simple example, and that many things can be achieved using the Logging Client Library, so you should refer to the official documentation pages that I shared in order to get a more deep understanding of how everything works.
"However it doesn't seem to have any impact on the output which is always INFO for me."
Add a logging handler, e.g.:
# (assuming `logger` is the root logger obtained via logging.getLogger() above)
formatter = logging.Formatter('%(asctime)s %(process)d %(levelname)s: %(message)s')
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.DEBUG)
consoleHandler.setFormatter(formatter)
logger.addHandler(consoleHandler)

Get a warning if an expected scheduled report email hasn't arrived [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
I (like most tech admins, I guess) have quite a lot of status emails from scheduled services in my inbox. However, when a service fails, there's obviously no email sent. So I simply want a service that looks at my inbox and says: "Hey, this service did not send an email report yesterday - something's wrong!"
This should be a solved problem somewhere, I guess. Perhaps Gmail (or some other email provider) has a service of this kind; that would be great.
Wouldn't it be a better option to have a centralized monitoring solution like Nagios that you configure in such a way that it only sends out notifications when a service misses its heartbeat, reaches high watermarks, or runs out of fuel? And then, of course, a second monitoring solution that monitors the main monitoring solution...
http://www.nagios.org/documentation
I'm not aware of any service like the one you describe, but a manual routine might go like this:
Have a folder/tag structure like this:
Services\Hourly-[NumberOfServices] (or add a folder per service)
Services\Daily-[NumberOfServices]
Services\Weekly-[NumberOfServices]
Services\Monthly-[NumberOfServices]
Have rules for incoming mail to filter each specific service notification and move it to the right folder based on its expected timing.
Wake up every hour and check if there are unread messages in your Hourly folder. The number of unread messages should be the same as the NumberOfServices mentioned in the folder. Read/process them and make sure to mark them all as read. Any service that didn't e-mail gets spotted easily.
Wake up at 0:00 and check if there are unread messages in your Daily folder, etc.
Wake up at 0:00 on Saturday and check if there are unread messages in your Weekly folder, etc.
Wake up at 0:00 on the first of the month and check if there are unread messages in your Monthly folder, etc.
My advice would be to cut down the noise generated by the services.
If you still feel you need a service, I can only provide a very, very basic .NET implementation, roughly based on the above process, that works with Gmail...
This is also portable to PowerShell...
static void Main(string[] args)
{
    var resolver = new XmlUrlResolver
    {
        Credentials = new NetworkCredential("yourgoogleaccount", "yourpassword")
    };
    var settings = new XmlReaderSettings();
    settings.XmlResolver = resolver;
    var xr = XmlReader
        .Create("https://mail.google.com/mail/feed/atom/[name of your filter]"
        , settings);
    var navigator = new XPathDocument(xr).CreateNavigator();
    var ns = new XmlNamespaceManager(new NameTable());
    ns.AddNamespace("fd", "http://purl.org/atom/ns#");
    var fullcountNode = navigator.SelectSingleNode(
        "/fd:feed/fd:fullcount"
        , ns);
    Console.WriteLine(fullcountNode.Value);
    int fullcount = Int32.Parse(fullcountNode.Value);
    int expectCount = 10;
    if (expectCount > fullcount)
    {
        Console.WriteLine("*** NOT EVERY ONE REPORTED BACK");
    }
}
You mentioned Gmail, so you may be interested in googlecl, which gives you command-line controls for things like Google Calendar and Docs. Unfortunately they do not yet support Gmail, but if your long-term preference is to use a Gmail account as the hub of your status reports, then googlecl may be your best option.
In the short run, you can try out googlecl right now using the commands for Calendar, Blogger, or Docs, all of which are already supported. For example, these commands add events to Google Calendar:
google calendar add --cal server1 "I'm still alive at 13:45 today"
google calendar add "Server 1 is still alive at 2011-02-08 19:43"
...and these commands query the calendar:
google calendar list --fields title,when,where --cal "commitments"
google calendar list -q party --cal ".*"
Come to think of it, you may even find that Calendar, Blogger, or Docs are a more appropriate place than Gmail for tracking status updates. For example, a spreadsheet or calendar format should make it easier to generate a graphical representation of when a given service was up or down.
You still need to write a little program which uses googlecl to query the calendar (or blog, or docs, or whatever), but once you have simple command lines at your disposal, the rest should be pretty straightforward. Here's a link to further information about googlecl:
http://code.google.com/p/googlecl/
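As a starting point for that little program, here is a minimal sketch, assuming the google command from googlecl is on your PATH and the calendar names from the examples above; the date matching is deliberately naive:
# Hypothetical sketch: shell out to googlecl and warn when yesterday's
# "still alive" heartbeat event is missing from the server1 calendar.
import datetime
import subprocess

yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
output = subprocess.check_output(
    ['google', 'calendar', 'list', '-q', 'still alive',
     '--fields', 'title,when', '--cal', 'server1'])
if yesterday not in output.decode():
    print('WARNING: no heartbeat event found for %s' % yesterday)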
If you really want to use Gmail, and use it right now, they offer an IMAP interface. Using IMAP, you can perform numerous simple operations, such as determining if a message exists which contains a specified subject line. Here's one good place to learn about the details:
http://mail.google.com/support/bin/answer.py?hl=en&answer=75725
Here's a quick example that uses IMAP and Python to list the ten most-recent emails which have a given Gmail "Label":
import getpass, imaplib
# These gmail_* utilities are from https://github.com/drewbuschhorn/gmail_imap
import gmail_mailboxes, gmail_messages, gmail_message

# Update these next lines manually, or turn them into parms or somesuch.
gmail_account_name = "your_user_name@gmail.com" # Your full gmail address.
mailbox_name = "StatusReports" # Use Gmail "labels" to tag the relevant msgs.

class gmail_imap:
    def __init__ (self, username, password):
        self.imap_server = imaplib.IMAP4_SSL("imap.gmail.com", 993)
        self.username = username
        self.password = password
        self.loggedIn = False
        self.mailboxes = gmail_mailboxes.gmail_mailboxes(self)
        self.messages = gmail_messages.gmail_messages(self)

    def login (self):
        self.imap_server.login(self.username, self.password)
        self.loggedIn = True

    def logout (self):
        self.imap_server.close()
        self.imap_server.logout()
        self.loggedIn = False

# Right now this prints a summary of the most-recent ten (or so) messages
# which have been labelled in Gmail with the string found in mailbox_name.
# It won't work unless you've used Gmail settings to allow IMAP access.
if __name__ == '__main__':
    gmail = gmail_imap(gmail_account_name, getpass.getpass())
    gmail.messages.process(mailbox_name)
    for next in gmail.messages:
        message = gmail.messages.getMessage(next.uid)
        # This is a good point in the code to insert some kind of search
        # of gmail.messages. Instead of unconditionally printing every
        # entry (which is what the code below does), issue some sort of
        # warning if the expected email (message.From and message.Subject)
        # did not arrive within the expected time frame (message.date).
        print message.date, message.From, message.Subject
    gmail.logout()
As noted in the code comments, you could adapt it to issue some sort of warning if the most-recent messages in that mailbox do not contain an expected message. Then just run the Python program once per day (or whatever time period you require) to see if the expected email message was never received.
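For instance, the search mentioned in the code comments might look something like this minimal sketch, which would replace the print loop in __main__ above (the sender address is hypothetical, and the date handling is left out for brevity):
# Hypothetical warning check: flag when no message from the expected
# sender is present among the labelled messages.
expected_from = "reports@example.com"
seen = False
for next in gmail.messages:
    message = gmail.messages.getMessage(next.uid)
    if expected_from in message.From:
        seen = True
        break
if not seen:
    print("WARNING: no status report from %s" % expected_from)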