Prolog CSV flight data analysis

I'm working on an expert system for flight data analysis.
Each flight is a CSV file. I would like to:
save each flight file (aircraft number, engine number and flight date);
define limits for the parameters (engine and aircraft) in the knowledge base:
for example, T4 (the temperature before the turbine) shouldn't exceed 650 for more than 30 s;
save the report file as PDF or HTML.
The problem is that to analyse the file you have to loop over it row by row to detect anomalies.
So, how can I do this with Prolog? Do you have any suggestions?

You don't say what implementation you're using. I'm guessing SWI-Prolog.
You can read and write CSV files using csv_read_file/csv_write_file or csv//1,2:
http://www.swi-prolog.org/pldoc/doc_for?object=csv//1
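For reading, a minimal sketch (assuming each row of the flight CSV is just Time,Temp with no header row; your real files will have more columns) could turn the rows into the engine_temp/2 facts used below:

:- use_module(library(csv)).
:- dynamic engine_temp/2.

load_flight(File) :-
    csv_read_file(File, Rows, []),
    forall(member(row(Time, Temp), Rows),
           assertz(engine_temp(Time, Temp))).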
Based on the OP's feedback: suppose you have some facts:
engine_temp(Time, Temp).
You could just get a list of them with findall:
findall(Time-Temp, engine_temp(Time, Temp), List)
This binds List to a list of pairs of the form Time-Temp.
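For the row-by-row anomaly check (e.g. T4 above 650 for more than 30 s) you don't need an explicit loop: backtracking and list recursion do the iteration. A rough sketch, assuming Time is in seconds and the samples are stored in chronological order:

exceeds_limit(Limit, Window, Start, End) :-
    findall(Time-Temp, engine_temp(Time, Temp), Pairs),
    append(_, [Start-Temp0|Rest], Pairs),   % pick a candidate starting sample
    Temp0 > Limit,
    run_end(Rest, Limit, Start, End),       % follow the run of exceedances
    End - Start >= Window.

run_end([Time-Temp|Rest], Limit, _, End) :-
    Temp > Limit, !,
    run_end(Rest, Limit, Time, End).
run_end(_, _, Last, Last).

A query such as ?- exceeds_limit(650, 30, Start, End). then reports the start and end times of a violation (it enumerates one solution per candidate starting sample, so wrap it in once/1 if you only need a yes/no answer).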
To generate HTML, use the SWI-Prolog HTTP libraries:
:- use_module(library(http/html_write)).
:- use_module(library(http/http_dispatch)).

:- http_handler('/temps', temp_hdlr, []).

temp_hdlr(_Request) :-
    reply_html_page(title('engine temps'),
                    [ \temp_list ]).

temp_list -->
    { findall(Time-Temp, engine_temp(Time, Temp), List) },
    html(table(\list_body(List))).

list_body([]) --> [].
list_body([Time-Temp | Rest]) -->
    html(tr([td(Time), td(Temp)])),
    list_body(Rest).
Hope (a) that works, as I'm away from my dev machine, and (b) the HTML generation boilerplate doesn't look too scary.
Hope that helps.


Find all possible paths between two nodes of a graph using a graph database

I have a collection of nodes that make up a DAG (directed acyclic graph), guaranteed to have no loops. I want to store the nodes in a database and have the database execute a search that shows me all paths between two nodes.
For example, think of it as the git history of a complex project.
Each node can be described with a JSON object that has:
{'id':'id',
 'outbound':['id1','id2','id3']}
So if I had these nodes in the database:
{'id':'id0',
 'outbound':['id1','id2']}
{'id':'id1',
 'outbound':['id2','id3','id4','id5','id6']}
{'id':'id2',
 'outbound':['id2','id3']}
And if I wanted to know all of the paths connecting id0 and id3, I would want to get three lists:
id0 -> id1 -> id3
id0 -> id2 -> id3
id0 -> id1 -> id2 -> id3
I have thousands of these nodes today, and I will have tens of thousands of them tomorrow. However, there are many DAGs in the database, and the typical DAG only has 5-10 nodes, so this problem is tractable.
I believe that there is no way to do this efficiently in MySQL (right now all of the objects are stored in a table in a JSON column), but I believe that it is possible to do it efficiently in a graph database like Neo4j.
I've looked at the Neo4j documentation on path finding algorithms, and perhaps I'm confused, but the examples don't really look like working examples. I found a MySQL example which uses stored procedures, and it doesn't look like it parallelizes very well. I'm not even sure what Amazon Neptune is doing; I think that it is using Spark GraphX.
I'm sort of lost as to where to start on this.
It's perfectly doable with Neo4j.
Importing the JSON data:
[
{"id":"id0",
"outbound":["id1","id2"]
},
{"id":"id1",
"outbound":["id2","id3","id4","id5","id6"]
},
{"id":"id2",
"outbound":["id2","id3"]
}
]
CALL apoc.load.json("graph.json")
YIELD value
MERGE (n:Node {id: value.id})
WITH n, value.outbound AS outbound
UNWIND outbound AS o
MERGE (n2:Node {id: o})
MERGE (n)-[:Edge]->(n2)
Apparently the data you provided is not acyclic (id2 lists itself among its outbound nodes)...
Getting all paths between two nodes
As you are not mentioning shortest paths, but all paths, there is no specific algorithm required:
MATCH p=(:Node {id: "id0"})-[:Edge*]->(:Node {id: "id3"}) RETURN nodes(p)
"[{""id"":id0},{""id"":id1},{""id"":id3}]"
"[{""id"":id0},{""id"":id2},{""id"":id3}]"
"[{""id"":id0},{""id"":id1},{""id"":id2},{""id"":id3}]"
"[{""id"":id0},{""id"":id2},{""id"":id2},{""id"":id3}]"
"[{""id"":id0},{""id"":id1},{""id"":id2},{""id"":id2},{""id"":id3}]"
Comparison with MySQL
See how-much-faster-is-a-graph-database-really
The Graph Data Science library path-finding algorithms are designed to find the shortest weighted paths, and they use algorithms similar to Dijkstra's to do so. In your case, it seems that you are dealing with a directed unweighted graph, so you could use the native Cypher allShortestPaths function:
An example would be:
MATCH (n1:Node{id:"A"}),(n2:Node{id:"B"})
MATCH path=allShortestPaths((n1)-[*..10]->(n2))
RETURN [n in nodes(path) | n.id] as outbound_nodes_id
It is always useful to check the Cypher refcard to see what is available with Cypher in Neo4j.

SSIS: processing money fields with what look like signs over the last digit

I have a fixed-length flat file as input. The records look like this:
40000003858172870114823 0010087192017092762756014202METFORMIN HCL ER 500 MG 0000001200000300900000093E00000009E00000000{0000001{00000104{JOHN DOE 196907161423171289 2174558M2A2 000 xxxx YYYYY 100000000000 000020170915001 00010000300 000003zzzzzz 000{000000000{000000894{ aaaaaaaaaaaaaaa P2017092700000000{00000000{00000000{00000000{ 0000000{00000{ F89863 682004R0900001011B2017101109656 500 MG 2017010100000000{88044828665760
If you look just before the JOHN DOE, you will see a field that represents a money amount. It looks like 00000104{.
This looks like the type of field I used to process from a mainframe many years ago. How do I handle this in SSIS? If the { on the end is in fact a 0, then I want the field to be a string that reads 0000010.40.
I have other money fields, e.g. 00000159E. If my memory serves me correctly, that would be 00000015.95.
I can't find anything on how to do this transform.
Thanks,
Dick Rosenberg
Import the values as strings:
00000159E
00000104{
In a Derived Column, do your transforms with REPLACE:
REPLACE(REPLACE(col,"E","5"),"{","0")
In another Derived Column, cast to money and divide by 100:
(DT_CY)(drvCol) / 100
I think you will need to either use a Script Component source in the data flow, or use a Derived Column transformation or Script Component transformation. I'd recommend a Script Component either way as it sounds like your custom logic will be fairly complex.
I have written a few detailed answers about how to implement a Script component source:
SSIS import a Flat File to SQL with the first row as header and last row as a total
How can I load in a pipe (|) delimited text file that has columns that sometimes contain line breaks?
Essentially, you need to locate the string, "00000104{", for example, and then convert it into decimal/money form before adding it into the data flow (or during it if you're using a Derived Column Transformation).
This could also be done in a Script Component transformation, which would function in a similar way to the Derived Column transformation, only you'd perhaps have a bit more scope for complex logic. Also in a Script Component transformation (as opposed to a source), you'd already have all of your other fields in place from the Flat File Source.
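For reference, the decoding itself follows the classic signed-overpunch convention ('{' = +0, 'A'-'I' = +1 to +9, '}' = -0, 'J'-'R' = -1 to -9). A rough C# sketch of the sort of helper a Script Component could call (the method name and the two-implied-decimals assumption are mine, not from the question):

using System;

static class Overpunch
{
    // Decode a signed-overpunch money field such as "00000104{"; the last
    // character carries both the sign and the final digit, with two implied decimals.
    public static decimal Decode(string field)
    {
        char last = field[field.Length - 1];
        string body = field.Substring(0, field.Length - 1);
        int sign = 1;
        int lastDigit;
        if (last == '{') lastDigit = 0;                                                  // +0
        else if (last >= 'A' && last <= 'I') lastDigit = last - 'A' + 1;                 // +1..+9
        else if (last == '}') { sign = -1; lastDigit = 0; }                              // -0
        else if (last >= 'J' && last <= 'R') { sign = -1; lastDigit = last - 'J' + 1; }  // -1..-9
        else lastDigit = last - '0';                                                     // plain trailing digit
        long units = long.Parse(body + lastDigit);                                       // e.g. "000001040"
        return sign * units / 100m;                                                      // two implied decimals
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(Overpunch.Decode("00000104{")); // prints 10.4
        Console.WriteLine(Overpunch.Decode("00000159E")); // prints 15.95
    }
}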

Working on migration of SPL 3.0 to 4.2 (TEDA)

I am working on migrating 3.0 code to the new 4.2 framework. I am facing a few difficulties:
1. How do I do CDR-level deduplication in the new 4.2 framework? (Note: table deduplication is already done.)
2. Where should I implement PostDedupProcessor - context or chainsink custom? In either case, do I need to remove duplicate hashcodes from the list or just reject the tuples? I am also updating columns for a few tuples here.
3. My file is not moving into the archive. The temporary output file is generated, but it is empty and ends up outside the load directory. What could be the possible reasons? I have thoroughly checked the config parameters, and after adding logs it looks as if the correct output is being sent from the transformer custom, so I don't know where it gets stuck. I printed the TableRowGenerator stream in the logs (at the end of DataProcessor).
1. and 2.:
You need to select the type of deduplication; there is not a big difference between choosing table-level and CDR-level deduplication.
The ite.businessLogic.transformation.outputType parameter affects this. There is only one dedup; you cannot have both.
Select recordStream for CDR-level deduplication, and do the transformation to table-row format (e.g. if you want to use the TableFileWriter) in xxx.chainsink.custom::PostContextDataProcessor.
In xxx.chainsink.custom::PostContextDataProcessor you need to add custom code for duplicate handling: reject (discard) tuples, set special column values, or write them to different target tables.
3.:
Possible reasons could be:
Missing forwarding of window punctuations or the statistics tuple
An error in the BloomFilter configuration; you would see it easily because the PE is down and the error log gives hints about the wrong sha2 functions being used
To troubleshoot your ITE application, I recommend enabling the following debug sinks if checking the StreamsStudio live graph is not sufficient:
ite.businessLogic.transformation.debug=on
ite.businessLogic.group.debug=on
ite.businessLogic.sink.debug=on
Run a test with a single input file only and check the flow of your record and statistic tuples. The debug sinks also write punctuation markers to the debug files.

How to download IMF Data Through JSON in R

I recently took an interest in retrieving data in R through JSON. Specifically, I want to be able to access data through the IMF. I know virtually nothing about JSON so I will share what I [think I] know so far, and what I have accomplished.
I browsed their web page about JSON, which helped a little bit; it gave me the starting URL. Here is the web page: http://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service
I managed to download some lists (using the GET() and fromJSON() functions), which are really bulky. I know enough about the lists to tell that the call was successful, but I cannot for the life of me get actual data. So far, I have been trying to use the rawToChar() function on the "content" data, but I am virtually stuck there.
If anything, I managed to create data frames that contain the codes, which I presume would be used somewhere in the JSON link. Here is what I have.
library(httr)     # for GET()
library(jsonlite) # for fromJSON()
all.imf.data = fromJSON("http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow/")
str(all.imf.data)
#all.imf.data$Structure$Dataflows$Dataflow$Name[[2]] #for the catalogue of sources
catalogue1 = cbind(all.imf.data$Structure$Dataflows$Dataflow$KeyFamilyRef,
all.imf.data$Structure$Dataflows$Dataflow$Name[[2]])
catalogue1 = catalogue1[,-2] # catalogue of all the countries
data.structure = fromJSON("http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/IFS")
info1 = data.frame(data.structure$Structure$Concepts$ConceptScheme$Concept[,c(1,4)])
View(data.structure$Structure$CodeLists$CodeList$Description)
str(data.structure$Structure$CodeLists$CodeList$Code)
#Units
units = data.structure$Structure$CodeLists$CodeList$Code[[1]]
#Countries
countries = data.frame(data.structure$Structure$CodeLists$CodeList$Code[[3]])
countries = countries[,-length(countries)]
#Series Codes
codes = data.frame(data.structure$Structure$CodeLists$CodeList$Code[[4]])
codes = codes[,-length(codes)]
# all.imf.data # JSON from the starting point, provided on the website
# catalogue1 # data frame of all the data bases, International Financial Statistics, Government Financial Statistics, etc.
# codes # codes for the specific data sets (GDP, Current Account, etc).
# countries # data frame of all the countries and their ISO codes
# data.structure # large list, with starting URL and endpoint "IFS". Ideally, I want to find some data set somewhere within this data base.
"info1" # looks like parameters for retrieving the data (for instance, dates, units, etc).
# units # data frame that indicates the options for units
I would just like some advice on how to go about retrieving any data, something as simple as GDP (PPP) for a single year. I have been following an article on an R blog (which retrieved data from the EU's database), but I cannot replicate the procedure for the IMF. I feel like I am close to retrieving something useful but cannot quite get there. Given that I have data frames containing the names of the databases, the series, and the codes for the series, I think it is just a matter of figuring out how to construct the appropriate URL for getting the data, but I could be wrong.
The data frame codes contains, I presume, the codes for the data sets. Is there a way to request the data for, say, the US for BK_DB_BP6_USD, which is "Balance of Payments, Capital Account, Total, Debit, etc."? How should I go about doing this in R?
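For what it's worth, once you have a database id, a country code and a series code, the data call appears to go through the CompactData endpoint, with the dimensions joined as frequency.area.indicator. A sketch of what that might look like (the database id for BK_DB_BP6_USD - IFS vs. BOP - the frequency, and the exact nesting of the result are assumptions on my part, so adjust as needed):

library(httr)     # for GET()
library(jsonlite) # for fromJSON()
# Assumed URL pattern: .../CompactData/<database>/<freq>.<area>.<indicator>?startPeriod=...&endPeriod=...
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/BOP/Q.US.BK_DB_BP6_USD?startPeriod=2010&endPeriod=2015"
resp = GET(url)
parsed = fromJSON(rawToChar(resp$content))
# The observations usually sit under CompactData$DataSet$Series$Obs,
# with @TIME_PERIOD and @OBS_VALUE columns, but the nesting can vary by dataset.
obs = parsed$CompactData$DataSet$Series$Obs
str(obs)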

Generating truth tables for basic logic circuits

Let's say I have a text file that looks like this:
<number> <name> <type> <inputs...>
1 XOR1 XOR A B
2 SUM XOR 1 C
What would be the best approach to generate the truth table for this circuit?
That depends on what you have available, and how big your file is.
Perl is optimized for reading files and generating simple text output. It doesn't have a library of boolean operators, but they're easy enough to write. I'd use that if I just wanted text-in, text-out.
If I wanted to display the data online AND generate a results file, I'd use PHP to read the data and write the table to a CSV file that could either be opened in Excel, or posted online in an HTML table.
If your data is in a REALLY BIG data file, I'd use SQL.
If your data is in a really huge file that you want to be accessible to authorized users online, and you want THEM to be able to create truth tables, I'd use Oracle's APEX to create an easy interface for them to build their own truth tables and play around with the data without altering it.
If you're in an electrical engineering environment, use the tools designed for your problem -- Verilog or similar.
Whatcha got? Whatcha wanna do with it?
-- Ada
I prefer using C#. I already have the code to 'parse' the input text file. I just don't know where to start in terms of actually 'simulating' it. The output can simply be a text file with inputs and output values. – Don 12 mins ago
How many inputs and how many outputs are in the circuit you want to simulate?
The size of the simulation determines how it can most easily be run. If the circuit is small(ish), you can enter the inputs and circuit values into vector arrays, then cross them to get the output matrix.
Matlab is ideal for this, as it was written for processing arrays.
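If you'd rather stay in C#, which you say you have the parser in, the same idea is just a couple of nested loops: enumerate every combination of the primary inputs, then evaluate the gates in order. A rough sketch under the assumptions that the gates are listed in dependency order and each has exactly two inputs (the hard-coded gate list stands in for your parser's output):

using System;
using System.Collections.Generic;
using System.Linq;

class TruthTableDemo
{
    static bool Eval(string type, bool a, bool b)
    {
        switch (type)
        {
            case "AND": return a && b;
            case "OR":  return a || b;
            case "XOR": return a ^ b;
            default: throw new ArgumentException("unknown gate type: " + type);
        }
    }

    static void Main()
    {
        // (number, name, type, inputs) - what your parser would produce for the sample file
        var gates = new[]
        {
            (Id: "1", Name: "XOR1", Type: "XOR", Inputs: new[] { "A", "B" }),
            (Id: "2", Name: "SUM",  Type: "XOR", Inputs: new[] { "1", "C" })
        };
        var primaryInputs = new[] { "A", "B", "C" };

        Console.WriteLine("A B C | SUM");
        for (int bits = 0; bits < (1 << primaryInputs.Length); bits++)
        {
            var values = new Dictionary<string, bool>();
            for (int i = 0; i < primaryInputs.Length; i++)
                values[primaryInputs[i]] = ((bits >> i) & 1) == 1;

            // gates are in dependency order, so every input value is already known
            foreach (var g in gates)
                values[g.Id] = Eval(g.Type, values[g.Inputs[0]], values[g.Inputs[1]]);

            Console.WriteLine(string.Join(" ", primaryInputs.Select(p => values[p] ? 1 : 0))
                              + " | " + (values["2"] ? 1 : 0));  // "2" is the SUM gate
        }
    }
}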
Again: Whatcha got, and whatcha wanna do with it?
-- Ada