Map and Filter in Haskell - function

I have two lists of tuples which are as follows: [(String,Integer)] and [(Float,Integer)]. Each list has several tuples.
For every Integer in the second list (each one paired with a Float), I need to check whether it matches the Integer of a tuple in the first list and, if it does, return that tuple's String. The function needs to return all such results as a list of Strings, i.e. [String].
I have already defined a function which returns a list of Integers from the second list (for the comparison on the integers in the first list).
This should be solvable using higher-order functions. I've spent a considerable amount of time playing with map and filter but haven't found a solution!

You have a list of Integers from the second list. Let's call this ints.
Now you need to do two things--first, filter the (String, Integer) list so that it only contains pairs with corresponding integers in the ints list and secondly, turn this list into just a list of String.
These two steps correspond to the filter and map respectively.
First, you need a function to filter by. This function should take a (String, Integer) pair and return whether the integer is in the ints list. So it should have a type of:
check :: (String, Integer) -> Bool
Writing this should not be too difficult. Once you have it, you can just filter the first list by it.
Next, you need a function to transform a (String, Integer) pair into a String. This will have type:
extract :: (String, Integer) -> String
This should also be easy to write. (A standard function like this actually exists, but if you're just learning it's healthy to figure it out yourself.) You then need to map this function over the result of your previous filter.
I hope this gives you enough hints to get the solution yourself.
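For comparison, here is how the same two-step shape might look in Python rather than Haskell, which leaves the Haskell exercise intact. The names check and extract mirror the ones above; the sample data is made up, and ints is assumed to be the list of integers already taken from the second list.

```python
# Hypothetical sample data; `ints` stands for the integers taken from the second list.
pairs = [("a", 1), ("b", 2), ("c", 3)]
ints = [3, 1]

def check(pair):
    """Keep a (string, integer) pair only if its integer occurs in ints."""
    return pair[1] in ints

def extract(pair):
    """Turn a (string, integer) pair into just its string."""
    return pair[0]

# Step 1: filter by check. Step 2: map extract over what is left.
result = list(map(extract, filter(check, pairs)))
print(result)  # ['a', 'c']
```

The order of the result follows the first list, because filter and map both preserve order.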

One can see in this example how important it is to describe the problem accurately, not only to others but foremost to oneself.
You want the Strings from the first list, whose associated Integer does occur in the second list.
With such problems it is important to do the solutions in small steps. Most often one cannot write down a function that does it right away, yet this is what many beginners think they must do.
Start out by writing the type signature you need for your function:
findFirsts :: [(String, Integer)] -> [(Float, Integer)] -> [String]
Now, from the problem description, we can deduce that we essentially have two things to do:
1. Transform a list of (String, Integer) to a list of String.
2. Select the entries we want.
Hence, the basic skeleton of our function looks like:
findFirsts sis fis = map ... selected
  where
    selected = filter isWanted sis
    isWanted :: (String, Integer) -> Bool
    isWanted (_,i) = ....
You'll need the functions fst, elem and snd to fill out the empty spaces.
Side note: I personally would prefer to solve this with a list comprehension, which often results in more readable code (for me, anyway) than a combination of map and filter with nontrivial filter criteria.
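To give an idea of what the comprehension style looks like, here is a rough Python sketch (not the Haskell solution). The function name find_firsts and the sample data are assumptions made for illustration.

```python
def find_firsts(sis, fis):
    """Strings from sis whose integer also occurs as a second component in fis."""
    wanted = {i for (_, i) in fis}          # integers present in the second list
    return [s for (s, i) in sis if i in wanted]

print(find_firsts([("a", 1), ("b", 2), ("c", 3)], [(0.5, 3), (1.5, 1)]))  # ['a', 'c']
```

The selection criterion and the transformation sit side by side in one expression, which is the readability gain the side note refers to.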

Half of the problem is to get the string list if you have a single integer. There are various possibilities to do this, e.g. using filter and map. However you can combine both operations using a "fold":
findAll x axs = foldr extract [] axs where
  extract (a,y) runningList | x==y      = a:runningList
                            | otherwise = runningList
--usage:
findAll 2 [("a",2),("b",3),("c",2)]
--["a","c"]
For a fold you have a start value (here []) and an operation that combines the running values successively with all list elements, either starting from the left (foldl) or from the right (foldr). Here this operation is extract, and you use it to decide whether to add the string from the current element to the running list or not.
Having this part done, the other half is trivial: You need to get the integers from the (Float,Integer) list, call findAll for all of them, and combine the results.
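Both halves can be sketched in Python as follows. This is only an analogue under assumptions: functools.reduce over the reversed list stands in for a right fold, and find_for_all is a hypothetical name for the combining step.

```python
from functools import reduce

def find_all(x, axs):
    """Right fold over axs: prepend a when its integer equals x."""
    def extract(running, pair):
        a, y = pair
        return [a] + running if x == y else running
    # Folding over the reversed list with reduce mimics a fold from the right.
    return reduce(extract, reversed(axs), [])

def find_for_all(fis, sis):
    """Call find_all for every integer in the (float, integer) list and concatenate."""
    return [s for (_, i) in fis for s in find_all(i, sis)]

print(find_all(2, [("a", 2), ("b", 3), ("c", 2)]))  # ['a', 'c']
```

Because the fold starts from the right and prepends, the strings come out in their original order.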

Related

Convert List<dynamic> to List<String>

I am getting data from a server. runtimeType shows that the data has type List<dynamic>.
Currently I am using cast<String>() to get List<String>.
But is it the only (or the right) way?
var value = await http.get('http://127.0.0.1:5001/regions');
if (value.statusCode == 200) {
  return jsonDecode(value.body)['data'].cast<String>();
}
There are multiple ways, depending on how soon you want an error if the list contains a non-string, and how you're going to use the list.
list.cast<String>() creates a lazy wrapper around the original list. It checks on each read that the value is actually a String. If you plan to read often, all that type checking might be expensive, and if you want an early error if the last element of the list is not a string, it won't do that for you.
List<String>.from(list) creates a new list of String and copies each element from list into the new list, checking along the way that it's actually a String. This approach errs early if a value isn't actually a string. After creation, there are no further type checks. On the other hand, creating a new list costs extra memory.
[for (var s in list) s as String],
[... list.cast<String>()],
<String>[for (var s in list) s],
<String>[... list] are all other ways to create a new list of strings. The last two rely on implicit downcasts from dynamic; the first two use explicit casts.
I recommend using list literals where possible. Here, I'd probably go for the smallest version <String>[...list], if you want a new list. Otherwise .cast<String>() is fine.
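The trade-off between a lazy checking view and an eager checked copy can be illustrated outside Dart as well. The following Python sketch is only an analogy: CheckedView and checked_copy are hypothetical names and have nothing to do with the Dart API.

```python
class CheckedView:
    """Lazy wrapper: checks the element type on every read, like a cast view."""
    def __init__(self, items, expected_type):
        self._items = items
        self._type = expected_type

    def __getitem__(self, index):
        value = self._items[index]
        if not isinstance(value, self._type):
            raise TypeError(f"element {index} is not {self._type.__name__}")
        return value

def checked_copy(items, expected_type):
    """Eager copy: checks every element once, up front."""
    for index, value in enumerate(items):
        if not isinstance(value, expected_type):
            raise TypeError(f"element {index} is not {expected_type.__name__}")
    return list(items)

data = ["north", "south", 42]           # made-up payload with a bad last element
view = CheckedView(data, str)
print(view[0])                          # north (no error until the bad element is read)
try:
    checked_copy(data, str)
except TypeError as err:
    print("eager copy failed early:", err)
```

The view costs a check per read and only fails when the bad element is touched; the copy costs memory but fails immediately.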

F# Read File, Split string list, summarize data, Nonfloat decimal numbers

I'm new to F# and got this assignment to create a very simple bank representation.
I do not want any code answers directly related to the problem, but preferably links or tips on where to find solutions or how to work them out.
The issues are the following:
Reading lines of a file (a line looks like this: "126,145001,1500.00" and it's sequence_number, account_number, amount)
Split the line to use the data from the line
summarize the data (to return the bank account balance)
Not using floating point numbers representing the amount, due to rounding errors(?)
Doing all of these in one function.
I know how to read a file, in a function.
I also know how to split a string.
I know how to recursively add values from a list.
I do not know how to add values that are decimal without floating-point variables.
I do not know how to retrieve the string from a list in a function and split it.
I do not know how to do all of these things in one function taking in file name, account number, and account currency.
The function should return the balance after the transactions in the file have been processed.
My idea to solve this is to create a datatype that has the three variables sequence_number, account_number and amount, and then do the following:
Read the file,
Split the data and create an object of my custom type for each line in the file
Add and remove the values from the types and return the final balance.
If anyone could point me in the right direction for each or any problem I would be really thankful!
.NET contains a type called System.Decimal that is indeed more appropriate for storing financial figures than the typical floating point types. In F#, you can use the decimal function to convert a value of a different type (say a string) to a System.Decimal (which F# abbreviates as a type also named decimal): let d = decimal "1.23" You can also create these values directly by using the M suffix: let d' = 1.23M, but in your case that doesn't seem relevant.
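The rounding concern is easy to see with a small arithmetic example. The sketch below uses Python's decimal module purely as a stand-in for a decimal type such as System.Decimal; it is not F# code, and the variable names are made up.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly, so the sum drifts.
print(0.1 + 0.2 == 0.3)                                    # False
# Decimal values built from strings keep the digits exactly as written.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True

# Parsing the amount field of a line shaped like the one in the question.
seq, account, amount = "126,145001,1500.00".split(",")
balance = Decimal("0") + Decimal(amount)
print(balance)                                             # 1500.00
```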
Regarding your other questions, if you use System.IO.File.ReadLines, then you can get the individual lines of your file as a sequence. Then you can string together a bunch of operations on that sequence to achieve your desired result. For instance, you can take the sequence and use Seq.map <your splitting code here> to split each line (and convert to instances of your specific data type, if desired), and then use Seq.groupBy to group the transactions by account number, and then Seq.map again to apply your summarization logic to each group. Ask follow-up questions if any of this is unclear.

How to check if two neighbors (element) in a string/array are the same

Well basically, I'm having a problem, how to make a function in haskell to work like this:
to take the first element of a string, then take the second one and compare them, then the function should continue with taking the third element from the string and comparing the second and the third one.
If it would have to compare the first two then the next two it would be easy, but I just can't figure it out in this particular situation.
I need to achieve this step in order to write a function which if finds two neighbor elements which are the same, returns True and if there aren't any elements like that returns False.
Thanks for any help.
A higher-order way to accomplish this (i.e. no explicit recursion) is to use zipWith to perform a point-wise comparison of the elements in the list, starting with the first, against the elements of the list, starting from the second (using tail), and then using or to collapse the point-wise results into a single result. You don't even need to special case the empty list since zipWith is non-strict in its third argument if its second argument is the empty list.
EDIT: Solution
hasNeighbors as = or . zipWith (==) as $ tail as
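The same pairing trick carries over to other languages. As a rough Python analogue (has_neighbors is a name chosen here for illustration), zip pairs each element with its successor and any collapses the comparisons:

```python
def has_neighbors(xs):
    """True if any two adjacent elements of xs are equal."""
    # zip stops at the shorter input, so empty and one-element inputs yield no pairs.
    return any(a == b for a, b in zip(xs, xs[1:]))

print(has_neighbors("hello"))  # True  ('l' is followed by 'l')
print(has_neighbors("world"))  # False
print(has_neighbors(""))       # False
```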
You can make a recursive function that solves this problem. There are 3 situations you must handle:
If the function gets the empty list or a list of one element, then obviously it won't contain any neighbors, so you return False.
If the list starts with two items that are not equal, then it means that it doesn't start with a neighbor pair, so you should perform the check on all of the list except for the first element.
If the list starts with two items that are equal, you know that the list contains a neighbor pair, so you can return True.
Tell me if you want me to provide the code that does this, or if you don't want any more hints.
EDIT: Solution
hasNeighbors :: Eq a => [a] -> Bool
hasNeighbors (a : allExceptA@(b : _))
  | a == b    = True
  | otherwise = hasNeighbors allExceptA
hasNeighbors _ = False

Synonym dictionary implementation?

How should I approach this problem? I basically need to implement a dictionary of synonyms. It takes as input some "word/synonym" pairs and I have to be able to "query" it for the list of all synonyms of a word.
For example:
Dictionary myDic;
myDic.Add("car", "automobile");
myDic.Add("car", "autovehicle");
myDic.Add("car", "vehicle");
myDic.Add("bike", "vehicle");
myDic.ListOSyns("car") // should return {"automobile","autovehicle","vehicle" ± "car"}
// but "bike" should NOT be among the words returned
I'll code this in C++, but I'm interested in an overall idea of the implementation, so the question is not exactly language-specific.
PS: The main idea is to have some groups of words (synonyms). In the example above there would be two such groups:
{"automobile","autovehicle","vehicle", "car"}
{"bike", "vehicle"}
"vehicle" belongs to both, "bike" just to the second one, the others just to the first
I would implement it as a Graph + hash table / search tree
each keyword would be a Vertex, and each connection between 2 keywords would be an edge.
a hash table or a search tree will connect from each word to its node (and vice versa).
when a query is submitted - you find the node with your hash/tree and do BFS/DFS of the required depth. (meaning you cannot continue after a certain depth)
complexity: O(E(d)+V(d)) for searching the graph (d = depth; E(d) and V(d) = number of edges and vertices within that depth)
O(1) for creating an edge (not including the lookup of its nodes, which is covered below)
O(log n) / O(1) for finding a node (tree / hash table)
O(log n) / O(1) for adding a keyword to the tree / hash table, and O(1) to add a vertex
p.s. the designer should keep in mind whether a directed or an undirected graph is needed, as mentioned in the comments to the question.
hope that helps...
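A minimal sketch of the graph-plus-hash-table idea, in Python for brevity. It assumes an undirected graph and a depth limit of 1 by default; SynonymGraph and its method names are made up for illustration.

```python
from collections import defaultdict, deque

class SynonymGraph:
    def __init__(self):
        self.edges = defaultdict(set)   # hash table: word -> neighbouring words

    def add(self, word, synonym):
        # Undirected edge; store only one direction for a directed graph.
        self.edges[word].add(synonym)
        self.edges[synonym].add(word)

    def synonyms(self, word, depth=1):
        """Breadth-first search from word, stopping after `depth` edges."""
        seen = {word}
        queue = deque([(word, 0)])
        while queue:
            current, dist = queue.popleft()
            if dist == depth:
                continue                # do not expand past the depth limit
            for neighbour in self.edges.get(current, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, dist + 1))
        return seen - {word}

g = SynonymGraph()
for w, s in [("car", "automobile"), ("car", "autovehicle"),
             ("car", "vehicle"), ("bike", "vehicle")]:
    g.add(w, s)
print(sorted(g.synonyms("car")))  # ['automobile', 'autovehicle', 'vehicle']
```

With depth=1, "bike" is not reported for "car"; with depth=2 it would be reached through "vehicle", which is why the depth limit matters.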
With the clarification in the comments to the question, it's relatively simple since you're not storing groups of mutual synonyms, but rather separately defining the acceptable synonyms for each word. The obvious container is either:
std::map<std::string, std::set<std::string> >
or:
std::multimap<std::string, std::string>
if you're not worried about duplicates being inserted, like this:
myDic.Add("car", "automobile");
myDic.Add("car", "auto");
myDic.Add("car", "automobile");
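In the case of multimap, use the equal_range member function to extract the synonyms for each word, maybe like this:
#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

struct Dictionary {
    multimap<string, string> innermap;  // word -> one synonym per entry
    vector<string> ListOSyns(const string &key) const {
        typedef multimap<string, string>::const_iterator constit;
        pair<constit, constit> x = innermap.equal_range(key);
        vector<string> retval;
        for (constit it = x.first; it != x.second; ++it)
            retval.push_back(it->second);  // keep the synonym, not the whole (word, synonym) pair
        retval.push_back(key);
        return retval;
    }
};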
Finally, if you prefer a hashtable-like structure to a tree-like structure, then unordered_multimap might be available in your C++ implementation, and basically the same code works.
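For readers who just want the shape of the map-of-sets variant, here is a small Python sketch. A dict of sets plays the role of std::map<std::string, std::set<std::string> >; the class and method names are assumptions that loosely follow the question's example.

```python
from collections import defaultdict

class Dictionary:
    def __init__(self):
        self._syns = defaultdict(set)   # word -> set of its accepted synonyms

    def add(self, word, synonym):
        self._syns[word].add(synonym)   # a repeated pair is absorbed by the set

    def list_of_syns(self, word):
        return sorted(self._syns.get(word, set())) + [word]

my_dic = Dictionary()
for word, syn in [("car", "automobile"), ("car", "auto"), ("car", "automobile"),
                  ("bike", "vehicle")]:
    my_dic.add(word, syn)
print(my_dic.list_of_syns("car"))  # ['auto', 'automobile', 'car']
```

The duplicate ("car", "automobile") insertion has no effect here, which is the difference from the multimap variant.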

How do you return two values from a single method?

When you're in a situation where you need to return two things in a single method, what is the best approach?
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
The options I have come up with:
1. Use global variables to hold returns. I personally try and avoid globals where I can.
2. Pass in two empty variables as parameters then assign the variables inside the method, which now is a void. I don't like the idea of methods that have side effects.
3. Return a collection that contains two variables. This can lead to confusing code.
4. Build a container class to hold the double return. This is more self-documenting than a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
This is not entirely language-agnostic: in Lisp, you can actually return any number of values from a function, including (but not limited to) none, one, two, ...
(defun returns-two-values ()
  (values 1 2))
The same thing holds for Scheme and Dylan. In Python, I would actually use a tuple containing 2 values like
def returns_two_values():
    return (1, 2)
As others have pointed out, you can return multiple values using the out parameters in C#. In C++, you would use references.
void
returns_two_values(int& v1, int& v2)
{
v1 = 1; v2 = 2;
}
In C, your method would take pointers to locations, where your function should store the result values.
void
returns_two_values(int* v1, int* v2)
{
*v1 = 1; *v2 = 2;
}
For Java, I usually use either a dedicated class, or a pretty generic little helper (currently, there are two in my private "commons" library: Pair<F,S> and Triple<F,S,T>, both nothing more than simple immutable containers for 2 resp. 3 values)
I would create data transfer objects. If it is a group of information (first and last name) I would make a Name class and return that. #4 is the way to go. It seems like more work up front (which it is), but makes it up in clarity later.
If it is a list of records (rows in a database) I would return a Collection of some sort.
I would never use globals unless the app is trivial.
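As a rough illustration of the data transfer object idea, here is a Python sketch. The Name class follows the first-name/last-name example above; fetch_names and the sample rows are hypothetical stand-ins for a real database query.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Name:
    """Small data transfer object for two related columns."""
    first: str
    last: str

def fetch_names(rows):
    """Walk the result set once, returning one Name per row."""
    return [Name(first=r[0], last=r[1]) for r in rows]

# Made-up rows standing in for a database result set.
names = fetch_names([("Ada", "Lovelace"), ("Alan", "Turing")])
print(names[0].last)  # Lovelace
```

A single record comes back as one Name, and a list of records comes back as a collection of them, so both cases from the answer are covered by the same small class.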
Not my own thoughts (Uncle Bob's):
If there's cohesion between those two variables - I've heard him say, you're missing a class where those two are fields. (He said the same thing about functions with long parameter lists.)
On the other hand, if there is no cohesion, then the function does more than one thing.
I think the most preferred approach is to build a container (may it be a class or a struct - if you don't want to create a separate class for this, struct is the way to go) that will hold all the parameters to be returned.
In the C/C++ world it would actually be quite common to pass two variables by reference (an example, your no. 2).
I think it all depends on the scenario.
Thinking from a C# mentality:
1: I would avoid globals as a solution to this problem, as it is accepted as bad practice.
4: If the two return values are uniquely tied together in some way or form that it could exist as its own object, then you can return a single object that holds the two values. If this object is only being designed and used for this method's return type, then it likely isn't the best solution.
3: A collection is a great option if the returned values are the same type and can be thought of as a collection. However, if the specific example needs 2 items, and each item is its own thing (maybe one represents the beginning of something and the other represents the end) and the returned items are not used interchangeably, then this may not be the best option.
2: I like this option best if 4 and 3 do not make sense for your scenario. As stated in 3, suppose you want two objects that represent the beginning and end items of something. Then I would use parameters by reference (or out parameters, again, depending on how it's all being used). This way your parameters can explicitly define their purpose: MethodCall(ref object StartObject, ref object EndObject)
Personally I try to use languages that allow functions to return something more than a simple integer value.
First, you should distinguish what you want: an arbitrary-length return or fixed-length return.
If you want your method to return an arbitrary number of values, you should stick to collection returns, because collections (whatever your language is) are designed for exactly that task.
But sometimes you just need to return two values. How does returning two values--when you're sure it's always two values--differ from returning one value? No way it differs, I say! And modern languages, including Perl, Ruby, C++, Python, OCaml etc. allow functions to return tuples, either built in or as third-party syntactic sugar (yes, I'm talking about boost::tuple). It looks like this:
tuple<int, int, double> add_multiply_divide(int a, int b) {
return make_tuple(a+b, a*b, double(a)/double(b));
}
Specifying an "out parameter", in my opinion, is overused due to the limitations of older languages and the paradigms learned in those days. But there are still many cases when it's useful (if your method needs to modify an object passed as a parameter, that object not being the one the method belongs to).
The conclusion is that there's no generic answer--each situation has its own solution. But there is one common thing: it is not a violation of any paradigm for a function to return several items. That's a language limitation that was later somehow transferred to the human mind.
Python (like Lisp) also allows you to return any number of values from a function, including (but not limited to) none, one, two:
def quadcube(x):
    return x**2, x**3
a, b = quadcube(3)
Some languages make doing #3 native and easy. Example: Perl. "return ($a, $b);". Ditto Lisp.
Barring that, check if your language has a collection suited to the task, ala pair/tuple in C++
Barring that, create a pair/tuple class and/or collection and re-use it, especially if your language supports templating.
If your function has return value(s), it's presumably returning it/them for assignment to either a variable or an implied variable (to perform operations on, for instance.) Anything you can usefully express as a variable (or a testable value) should be fair game, and should dictate what you return.
Your example mentions a row or a set of rows from a SQL query. Then you reasonably should be ready to deal with those as objects or arrays, which suggests an appropriate answer to your question.
When you're in a situation where you need to return two things in a single method, what is the best approach?
It depends on WHY you are returning two things.
Basically, as everyone here seems to agree, #2 and #4 are the two best answers...
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
If the two pieces of data from the database are related, such as a customer's First Name and Last Name, I would indeed still consider this to be doing "one thing."
On the other hand, suppose you have come up with a strange SELECT statement that returns your company's gross sales total for a given date, and also reads the name of the customer that placed the first sale for today's date. Here you're doing two unrelated things!
If it's really true that performance of this strange SELECT statement is much better than doing two SELECT statements for the two different pieces of data, and both pieces of data really are needed on a frequent basis (so that the entire application would be slower if you didn't do it that way), then using this strange SELECT might be a good idea - but you better be prepared to demonstrate why your way really makes a difference in perceived response time.
The options I have come up with:
1 Use global variables to hold returns. I personally try and avoid globals where I can.
There are some situations where creating a global is the right thing to do. But "returning two things from a function" is not one of those situations. Doing it for this purpose is just a Bad Idea.
2 Pass in two empty variables as parameters then assign the variables inside the method, which now is a void.
Yes, that's usually the best idea. This is exactly why "by reference" (or "output", depending on which language you're using) parameters exist.
I don't like the idea of methods that have side effects.
Good theory, but you can take it too far. What would be the point of calling SaveCustomer() if that method didn't have a side-effect of saving the customer's data?
By Reference parameters are understood to be parameters that contain returned data.
3 Return a collection that contains two variables. This can lead to confusing code.
True. It wouldn't make sense, for instance, to return an array where element 0 was the first name and element 1 was the last name. This would be a Bad Idea.
4 Build a container class to hold the double return. This is more self-documenting than a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
Yes and no. As you say, I wouldn't want to create an object called FirstAndLastNames just to be used by one method. But if there was already an object which had basically this information, then it would make perfect sense to use it here.
If I was returning two of the exact same thing, a collection might be appropriate, but in general I would usually build a specialized class to hold exactly what I needed.
And if you are returning two things today from those two columns, tomorrow you might want a third. Maintaining a custom object is going to be a lot easier than any of the other options.
Use var/out parameters or pass variables by reference, not by value. In Delphi:
function ReturnTwoValues(out Param1: Integer): Integer;
begin
  Param1 := 10;
  Result := 20;
end;
If you use var instead of out, you can pre-initialize the parameter.
With databases, you could have an out parameter per column and the result of the function would be a boolean indicating if the record is retrieved correctly or not. (Although I would use a single record class to hold the column values.)
As much as it pains me to do it, I find the most readable way to return multiple values in PHP (which is what I work with, mostly) is using a (multi-dimensional) array, like this:
function doStuff($someThing)
{
// do stuff
$status = 1;
$message = 'it worked, good job';
return array('status' => $status, 'message' => $message);
}
Not pretty, but it works and it's not terribly difficult to figure out what's going on.
I generally use tuples. I mainly work in C# and it's very easy to design generic tuple constructs. I assume it would be very similar for most languages which have generics. As an aside, 1 is a terrible idea, and 3 only works when you are getting two returns that are the same type unless you work in a language where everything derives from the same basic type (i.e. object). 2 and 4 are also good choices. 2 doesn't introduce any side effects a priori, it's just unwieldy.
Use std::vector, QList, or some managed library container to hold however many X you want to return:
QList<X> getMultipleItems()
{
QList<X> returnValue;
for (int i = 0; i < countOfItems; ++i)
{
returnValue.push_back(<your data here>);
}
return returnValue;
}
For the situation you described, pulling two fields from a single table, the appropriate answer is #4 given that two properties (fields) of the same entity (table) will exhibit strong cohesion.
Your concern that "it might be confusing to create a class just for the purpose of a return" is probably not that realistic. If your application is non-trivial you are likely going to need to re-use that class/object elsewhere anyway.
You should also consider whether the design of your method is primarily returning a single value, and you are getting another value for reference along with it, or if you really have a single returnable thing like first name - last name.
For instance, you might have an inventory module that queries the number of widgets you have in inventory. The return value you want to give is the actual number of widgets. However, you may also want to record how often someone is querying inventory and return the number of queries so far. In that case it can be tempting to return both values together. However, remember that you have class variables available for storing data, so you can store an internal query count, and not return it every time, then use a second method call to retrieve the related value. Only group the two values together if they are truly related. If they are not, use separate methods to retrieve them separately.
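The inventory example might be sketched like this in Python. The class and method names are made up; the point is only that the query counter lives inside the object and has its own accessor.

```python
class Inventory:
    def __init__(self, widgets):
        self._widgets = widgets
        self._query_count = 0      # bookkeeping kept inside the object

    def widget_count(self):
        """Return the one value callers ask for; count the query as a side note."""
        self._query_count += 1
        return self._widgets

    def query_count(self):
        """Separate accessor for the unrelated statistic."""
        return self._query_count

inv = Inventory(widgets=12)
inv.widget_count()
inv.widget_count()
print(inv.widget_count(), inv.query_count())  # 12 3
```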
Haskell also allows multiple return values using built in tuples:
sumAndDifference :: Int -> Int -> (Int, Int)
sumAndDifference x y = (x + y, x - y)
> let (s, d) = sumAndDifference 3 5 in s * d
-16
Because Haskell is a pure language, options 1 and 2 are not available.
Even using a state monad, the return value contains (at least conceptually) a bag of all relevant state, including any changes the function just made. It's just a fancy convention for passing that state through a sequence of operations.
I will usually opt for approach #4, as I prefer the clarity of knowing that what the function produces or calculates is its return value (rather than byref parameters). It also lends itself to a rather "functional" style in program flow.
The disadvantage of option #4 with generic tuple classes is it isn't much better than returning a collection (the only gain is type safety).
public IList CalculateStuffCollection(int arg1, int arg2)
public Tuple<int, int> CalculateStuffTuple(int arg1, int arg2)
var resultCollection = CalculateStuffCollection(1,2);
var resultTuple = CalculateStuffTuple(1,2);
resultCollection[0] // Was it index 0 or 1 I wanted?
resultTuple.A // Was it A or B I wanted?
I would like a language that allowed me to return an immutable tuple of named variables (similar to a dictionary, but immutable, typesafe and statically checked). But, sadly, such an option isn't available to me in the world of VB.NET, it may be elsewhere.
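As one example of what "elsewhere" can look like, Python's typing.NamedTuple gives an immutable tuple with named fields. This is only a sketch: Stuff and calculate_stuff are hypothetical names, and the static checking comes from an external type checker rather than from the Python runtime.

```python
from typing import NamedTuple

class Stuff(NamedTuple):
    total: int
    product: int

def calculate_stuff(arg1: int, arg2: int) -> Stuff:
    return Stuff(total=arg1 + arg2, product=arg1 * arg2)

result = calculate_stuff(1, 2)
print(result.total, result.product)  # 3 2
total, product = result               # still unpacks like a plain tuple
```

The call site reads result.total instead of an index, so the "was it 0 or 1?" question does not come up.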
I dislike option #2 because it breaks that "functional" style and forces you back into a procedural world (when often I don't want to do that just to call a simple method like TryParse).
I have sometimes used continuation-passing style to work around this: the caller passes a function value as an argument, and the method returns the result of calling that function with the multiple values.
In languages without first-class functions, an object can be passed in place of the function value.
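A small Python sketch of the continuation-passing idea, with made-up names (divide_with_remainder, then, Collect). The function never returns two values itself; it hands them to whatever the caller supplied.

```python
def divide_with_remainder(a, b, then):
    """Instead of returning two values, hand them to the continuation `then`."""
    return then(a // b, a % b)

# The caller decides what to do with both results.
text = divide_with_remainder(17, 5, lambda q, r: f"{q} remainder {r}")
print(text)  # 3 remainder 2

class Collect:
    """Object standing in for a function value where functions are not first-class."""
    def __call__(self, q, r):
        return {"quotient": q, "remainder": r}

print(divide_with_remainder(17, 5, Collect()))  # {'quotient': 3, 'remainder': 2}
```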
My choice is #4. Define a reference parameter in your function. That parameter refers to a Value Object.
In PHP:
class TwoValuesVO {
public $expectedOne;
public $expectedTwo;
}
/* parameter $_vo refers to a TwoValuesVO instance */
function twoValues( & $_vo ) {
    $_vo->expectedOne = 1;
    $_vo->expectedTwo = 2;
}
In Java:
class TwoValuesVO {
public int expectedOne;
public int expectedTwo;
}
class TwoValuesTest {
void twoValues( TwoValuesVO vo ) {
vo.expectedOne = 1;
vo.expectedTwo = 2;
}
}