How to create a table with 3000 columns in Cassandra 3.0? - csv

Hi, this is my first project with Cassandra 3.0.
I have a CSV file with around 3000 columns and I need to import the CSV into Cassandra. How can I achieve this?
Example of the CSV header:
Unnamed row_nr PRD_ID X_01 X_02 X_03 X_04 X_05_1 X_05_10 X_05_11 X_05_12 X_05_13 X_05_14 X_05_15 X_05_16 X_05_17 X_05_18 X_05_19 X_05_2 X_05_20 X_05_21 X_05_22 X_05_23 X_05_24 X_05_25 X_05_26 X_05_27 X_05_28 X_05_29 X_05_3 X_05_30 X_05_31 X_05_32 X_05_33 X_05_34 X_05_35 X_05_36 X_05_37 X_05_38 X_05_39 X_05_4 X_05_40 X_05_41 X_05_42 X_05_43 X_05_44 X_05_45 X_05_46 X_05_47 X_05_48 X_05_49 X_05_5 X_05_50 X_05_51 X_05_52 X_05_53 X_05_54 X_05_55 X_05_56 X_05_57
I have googled and read questions on Stack Overflow, but they were not helpful. Please shed some light.
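One way to avoid typing 3000 column definitions by hand is to generate the CREATE TABLE statement from the CSV header itself. The sketch below is only a minimal illustration: the keyspace and table names, mapping every column to text, and using prd_id as the partition key are all assumptions you would adapt to your data. Once the table exists, cqlsh's COPY ... FROM 'file.csv' WITH HEADER = TRUE (or the DataStax Bulk Loader) can load the rows.

```python
def make_create_table(header, keyspace="mykeyspace", table="products", key="prd_id"):
    """Build a CREATE TABLE statement with one text column per CSV header field.
    Column names are lower-cased, matching Cassandra's case-folding of
    unquoted identifiers. keyspace/table/key are placeholder assumptions."""
    cols = []
    for name in header:
        cname = name.strip().lower()
        cols.append(f"{cname} text")  # assumes every field is text; adjust types as needed
    col_defs = ",\n  ".join(cols)
    return (f"CREATE TABLE {keyspace}.{table} (\n  {col_defs},\n"
            f"  PRIMARY KEY ({key})\n);")

# First few fields of the real 3000-column header, as a stand-in
header = ["PRD_ID", "X_01", "X_02", "X_03"]
print(make_create_table(header))
```

In practice you would read the header with csv.reader from the actual file instead of hard-coding it, and review whether a 3000-column static schema is really the right data model (pivoting the X_* columns into a clustering key is a common alternative).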

Related

QGIS CSV not geolocating correctly

I've been adding text layers without issue in 3.22.3-Białowieża, but files that I convert from Excel position in the wrong location.
I've added shapefiles and some text (CSV) layers, but the files that don't plot properly never work. I saw a post about restarting the program, but that doesn't fix the problem.
Here is a picture of the plotted positions: the red points are from the ESRI shapefile and the green ones from the CSV. Below is an example of the CSV data.
"EXPED_CD","DATE","LAT","LONG"
"68006","1968-01-25T19:00",44.433331,-57.116661
"68006","1968-01-25T20:00",44.433331,-56.796669
"68006","1968-01-25T21:00",44.413879,-56.495022
"68006","1968-01-25T22:00",44.394409,-56.193272
"68006","1968-01-26T00:00",44.37167,-55.575001
"68006","1968-01-26T01:00",44.333328,-55.25
"68006","1968-01-26T07:00",44.25,-55.333328
"68006","1968-01-26T08:00",44.099998,-55.549999
"68006","1968-01-26T09:00",43.933331,-55.73333
"68006","1968-01-26T10:00",43.75,-55.916672
"68006","1968-01-26T11:00",43.566669,-56.066669
"68006","1968-01-26T12:00",43.415001,-56.26667
"68006","1968-01-26T13:00",43.263329,-56.466671
"68006","1968-01-26T14:00",43.10833,-56.666672
"68006","1968-01-26T15:00",42.964581,-56.831669
"68006","1968-01-27T02:00",42.386669,-57.369999
"68006","1968-01-27T03:00",42.255001,-57.544998
"68006","1968-01-27T04:00",42.105,-57.68716
"68006","1968-01-27T05:00",41.943939,-57.843941
"68006","1968-01-27T06:00",41.757069,-58.034851
"68006","1968-01-27T07:00",41.570202,-58.225761
"68006","1968-01-27T08:00",41.383339,-58.416672
"68006","1968-01-27T09:00",41.22084,-58.5825
"68006","1968-01-27T10:00",41.058331,-58.748329
"68006","1968-01-27T11:00",40.903332,-58.900002
"68006","1968-01-27T12:00",40.700001,-59.099998
"68006","1968-01-28T01:00",40.145,-59.738331
"68006","1968-01-28T02:00",39.959999,-59.88834
"68006","1968-01-28T03:00",39.775002,-60.03833
"68006","1968-01-28T04:00",39.587502,-60.202499
"68006","1968-01-28T05:00",39.400002,-60.366661
"68006","1968-01-28T06:00",39.23333,-60.541618
"68006","1968-01-28T07:00",39.066669,-60.716671
"68006","1968-01-28T10:00",39.025002,-60.60833
"68006","1968-01-28T11:00",38.888329,-60.805
"68006","1968-01-28T12:00",38.73333,-60.963329
"68006","1968-01-28T13:00",38.575001,-61.131672
"68006","1968-01-29T07:00",38,-62.599998
"68006","1968-01-29T08:00",37.816669,-62.633339
"68006","1968-01-29T09:00",37.625,-62.681671
"68006","1968-01-29T10:00",37.433331,-62.73
"68006","1968-01-29T11:00",37.25,-62.783329
"68006","1968-01-29T12:00",37.049999,-62.833328
"68006","1968-01-29T13:00",36.818329,-62.898331
"68006","1968-01-29T14:00",36.616661,-62.96167
"68006","1968-01-30T03:00",35.833328,-63.41333
"68006","1968-01-30T04:00",35.647049,-63.549019
"68006","1968-01-30T05:00",35.460758,-63.6847
"68006","1968-01-30T06:00",35.274479,-63.820389
"68006","1968-01-30T07:00",35.088188,-63.956081
"68006","1968-01-30T08:00",34.901909,-64.091766
"68006","1968-01-30T10:00",34.529381,-64.363037
"68006","1968-01-30T11:00",34.343121,-64.498787
"68006","1968-01-30T12:00",34.14167,-64.599998
"68006","1968-01-30T13:00",33.970791,-64.699997
"68006","1968-01-30T14:00",33.799999,-64.800003
"68006","1968-01-30T17:00",33.616661,-64.916656
"68006","1968-01-30T18:00",33.395,-64.949997
"68006","1968-01-30T19:00",33.183331,-65.008171
"68006","1968-01-31T03:00",32.818329,-65.025002
"68006","1968-01-31T04:00",32.64167,-65.099998
"68006","1968-01-31T05:00",32.451981,-65.17775
"68006","1968-01-31T13:00",31.658331,-65.041656
"68006","1968-01-31T15:00",31.229071,-65.087463
"68006","1968-01-31T16:00",31.01453,-65.110397
"68006","1968-01-31T17:00",30.799999,-65.133331
"68006","1968-01-31T18:00",30.58333,-65.133331
"68006","1968-02-01T03:00",30.11167,-65.316673
"68006","1968-02-01T04:00",29.92333,-65.28833
"68006","1968-02-01T05:00",29.72665,-65.225517
"68006","1968-02-01T06:00",29.529961,-65.162712
"68006","1968-02-01T07:00",29.33333,-65.099998
"68006","1968-02-01T08:00",29.133329,-65.029114
"68006","1968-02-01T09:00",28.933331,-64.958344
"68006","1968-02-01T10:00",28.705,-64.925003
"68006","1968-02-01T11:00",28.450001,-64.966667
"68006","1968-02-01T12:00",28.21386,-64.911087
"68006","1968-02-01T13:00",27.97802,-64.866966
"68006","1968-02-01T20:00",27.799999,-64.883331
"68006","1968-02-01T21:00",27.6,-64.85183
"68006","1968-02-02T02:00",27.385,-64.946663
"68006","1968-02-02T03:00",27.15167,-64.910004
"68006","1968-02-02T04:00",26.988331,-64.85833
"68006","1968-02-02T05:00",26.816669,-64.883331
"68006","1968-02-03T05:00",24.51667,-64.633331
"68006","1968-02-03T06:00",24.299999,-64.550003
"68006","1968-02-03T07:00",24.116671,-64.466667
"68006","1968-02-03T08:00",23.9125,-64.402496
"68006","1968-02-03T09:00",23.70833,-64.338333
"68006","1968-02-03T10:00",23.495831,-64.290001
"68006","1968-02-03T11:00",23.283331,-64.241669
"68006","1968-02-03T12:00",23.08,-64.175003
"68006","1968-02-03T17:00",23.11515,-64.280296
"68006","1968-02-03T18:00",22.91667,-64.283333
"68006","1968-02-03T19:00",22.700001,-64.26667
"68006","1968-02-03T22:00",22.480869,-64.227539
"68006","1968-02-04T02:00",22.11833,-64.228333

Count coords dots in polygon

I would like to count how many points fall inside each of a set of prespecified polygons. I am loading the EU NUTS regions with
import json

import requests

nuts = 'https://raw.githubusercontent.com/eurostat/Nuts2json/master/2016/4258/60M/nutsrg_2.json'
geo_json_nuts = json.loads(requests.get(nuts).text)
and I have a list of tuples or a DataFrame, which contains data as follows:
Index lon lat
0 -178.1328187 -14.3087256
1 -176.2036596 -13.3469813
2 -176.1720255 -13.2789922
3 -151.3381037 -22.4532474
4 -151.0331577 -16.7159449
... ... ...
Now I would like to match the lon/lat pairs in the DataFrame to the feature properties id contained in geo_json_nuts. Meaning: if a lon/lat pair is contained in one of the polygons in geo_json_nuts, it should get that polygon's properties id, e.g. BE31 or AT32, etc.
Does anyone know how to handle this?
Thank you in advance!
Best regards
Alex
You can try to normalize your JSON file:

import json

import requests
from pandas import json_normalize

nuts = 'https://raw.githubusercontent.com/eurostat/Nuts2json/master/2016/4258/60M/nutsrg_2.json'
geo_json_nuts = json.loads(requests.get(nuts).text)
df = json_normalize(geo_json_nuts, 'features', ['coordinates'],
                    errors='ignore',
                    record_prefix='features_')

After this, you will have a DataFrame containing 'features_properties.id' for the id and 'features_geometry.coordinates' for the points.
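Normalizing only gets you the ids and coordinate arrays; the actual matching still needs a point-in-polygon test. If you'd rather not pull in geopandas/shapely (whose shape(...).contains(Point(lon, lat)) is the more robust route, since the real NUTS geometries are multipolygons that can have holes), the core test can be sketched with even-odd ray casting. The two rectangular "regions" below are made up stand-ins, not real NUTS shapes:

```python
def point_in_ring(lon, lat, ring):
    """Even-odd ray casting: cast a horizontal ray from (lon, lat) and
    count how many polygon edges it crosses; an odd count means inside."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the ray's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def region_of(lon, lat, regions):
    """Return the id of the first polygon containing the point, else None."""
    for rid, rings in regions.items():
        if point_in_ring(lon, lat, rings[0]):  # exterior ring only; holes ignored
            return rid
    return None

# Hypothetical two-region layout standing in for the NUTS polygons
regions = {
    "AT32": [[(13.0, 47.0), (14.0, 47.0), (14.0, 48.0), (13.0, 48.0)]],
    "BE31": [[(4.0, 50.0), (5.0, 50.0), (5.0, 51.0), (4.0, 51.0)]],
}
print(region_of(4.5, 50.5, regions))  # → BE31
```

Counting points per region is then just a loop over the DataFrame rows, incrementing a counter keyed by the returned id.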

Why is my stepwise regression producing a shorter output?

I'm trying to run a stepwise regression, but the output that I'm getting is shorter than the input data frame. Unfortunately I can't share my data, but any help would be much appreciated. Thank you in advance!
#training data
a3<-na.omit(train_occur)
sum(is.na(train_occur))
> 0
dim(a3)
>2228 10
full_log<-glm(formula = occurrence ~ . , family=binomial(link=logit), data= train_occur, control = list(maxit = 50))
back_log_occur<-step(full_log)
length(back_log_occur$fitted.values)
>66
#test data
dim(test_occur) # I took out the response variable, although it doesn't seem to matter whether or not it is there...
>243 9
pred_back_log_occur<-predict(object=back_log_occur,data=test_occur,type="response")
length(pred_back_log_occur)
> 66
I expected 2228 fitted values for the training and 243 predicted values for the test set.

Difficulty unpacking JSON tuple string

I figured out how to use rebar. I'm trying to use jsx (jiffy doesn't work properly on Windows) to parse JSON that I obtained from the openexchangerates.org API, but I can't figure out how to correctly use Erlang's extensive binary functionality to unpack the JSON tuple I get back. Using the following code snippet, I managed to get a tuple that has all the data I need:
-module(currency).
-export([start/0]).

start() ->
    URL = "http://openexchangerates.org",
    Endpoint = "/api/latest.json?app_id=<myprivateid>",
    X = string:concat(URL, Endpoint),
    % io:format("~p~n", [X]),
    inets:start(),
    {ok, Req} = httpc:request(X),
    Req.
Here is the obtained response:
9> currency:start().
{{"HTTP/1.1",200,"OK"},
[{"cache-control","public"},
{"connection","close"},
{"date","Fri, 15 Aug 2014 01:28:06 GMT"},
{"etag","\"d9ad180d4af1caaedab6e622ec0a8a70\""},
{"server","Apache"},
{"content-length","4370"},
{"content-type","application/json; charset=utf-8"},
{"last-modified","Fri, 15 Aug 2014 01:00:56 GMT"},
{"access-control-allow-origin","*"}],
"{\n \"disclaimer\": \"Exchange rates are provided for informational purposes only, and do not constitute financial advice of any kind. Although every attempt is made to ensure quality, NO guarantees are given whatsoever of accuracy, validity, availability, or fitness for any purpose - please use at your own risk. All usage is subject to your acceptance of the Terms and Conditions of Service, available at: https://openexchangerates.org/terms/\",\n \"license\": \"Data sourced from various providers with public-facing APIs; copyright may apply; resale is prohibited; no warranties given of any kind. Bitcoin data provided by http://coindesk.com. All usage is subject to your acceptance of the License Agreement available at: https://openexchangerates.org/license/\",\n \"timestamp\": 1408064456,\n \"base\": \"USD\",\n \"rates\": {\n \"AED\": 3.673128,\n \"AFN\": 56.479925,\n \"ALL\": 104.147599,\n \"AMD\": 413.859001,\n \"ANG\": 1.789,\n \"AOA\": 97.913074,\n \"ARS\": 8.274908,\n \"AUD\": 1.073302,\n \"AWG\": 1.79005,\n \"AZN\": 0.783933,\n \"BAM\": 1.46437,\n \"BBD\": 2,\n \"BDT\": 77.478631,\n \"BGN\": 1.464338,\n \"BHD\": 0.377041,\n \"BIF\": 1546.956667,\n \"BMD\": 1,\n \"BND\": 1.247024,\n \"BOB\": 6.91391,\n \"BRL\": 2.269422,\n \"BSD\": 1,\n \"BTC\": 0.0019571961,\n \"BTN\": 60.843812,\n \"BWP\": 8.833083,\n \"BYR\": 10385.016667,\n \"BZD\": 1.99597,\n \"CAD\": 1.0906,\n \"CDF\": 924.311667,\n \"CHF\": 0.906799,\n \"CLF\": 0.02399,\n \"CLP\": 577.521099,\n \"CNY\": 6.153677,\n \"COP\": 1880.690016,\n \"CRC\": 540.082202,\n \"CUP\": 1.000688,\n \"CVE\": 82.102201,\n \"CZK\": 20.81766,\n \"DJF\": 178.76812,\n \"DKK\": 5.579046,\n \"DOP\": 43.43789,\n \"DZD\": 79.8973,\n \"EEK\": 11.70595,\n \"EGP\": 7.151305,\n \"ERN\": 15.062575,\n \"ETB\": 19.83205,\n \"EUR\": 0.748385,\n \"FJD\": 1.85028,\n \"FKP\": 0.599315,\n \"GBP\": 0.599315,\n \"GEL\": 1.74167,\n \"GGP\": 0.599315,\n \"GHS\": 3.735499,\n \"GIP\": 0.599315,\n \"GMD\": 39.73668,\n \"GNF\": 6995.309935,\n 
\"GTQ\": 7.839405,\n \"GYD\": 205.351249,\n \"HKD\": 7.750863,\n \"HNL\": 21.04854,\n \"HRK\": 5.708371,\n \"HTG\": 44.66625,\n \"HUF\": 233.847801,\n \"IDR\": 11682.083333,\n \"ILS\": 3.471749,\n \"IMP\": 0.599315,\n \"INR\": 60.81923,\n \"IQD\": 1178.211753,\n \"IRR\": 26354,\n \"ISK\": 115.976,\n \"JEP\": 0.599315,\n \"JMD\": 112.604801,\n \"JOD\": 0.707578,\n \"JPY\": 102.501401,\n \"KES\": 88.106539,\n \"KGS\": 51.96,\n \"KHR\": 4056.578416,\n \"KMF\": 368.149,\n \"KPW\": 900,\n \"KRW\": 1021.166657,\n \"KWD\": 0.283537,\n \"KYD\": 0.826373,\n \"KZT\": 182.076001,\n \"LAK\": 8049.834935,\n \"LBP\": 1509.068333,\n \"LKR\": 130.184301,\n \"LRD\": 91.49085,\n \"LSL\": 10.56165,\n \"LTL\": 2.583284,\n \"LVL\": 0.521303,\n \"LYD\": 1.244127,\n \"MAD\": 8.372529,\n \"MDL\": 13.7178,\n \"MGA\": 2495.605,\n \"MKD\": 45.99967,\n \"MMK\": 972.1784,\n \"MNT\": 1884.666667,\n \"MOP\": 7.986251,\n \"MRO\": 292.0081,\n \"MTL\": 0.683602,\n \"MUR\": 30.61708,\n \"MVR\": 15.37833,\n \"MWK\": 392.9201,\n \"MXN\": 13.07888,\n \"MYR\": 3.175156,\n \"MZN\": 30.3522,\n \"NAD\": 10.56145,\n \"NGN\": 162.303701,\n \"NIO\": 26.07651,\n \"NOK\": 6.157432,\n \"NPR\": 97.66846,\n \"NZD\": 1.179688,\n \"OMR\": 0.38501,\n \"PAB\": 1,\n \"PEN\": 2.795018,\n \"PGK\": 2.464545,\n \"PHP\": 43.66429,\n \"PKR\": 99.5662,\n \"PLN\": 3.126223,\n \"PYG\": 4272.421673,\n \"QAR\": 3.641137,\n \"RON\": 3.320192,\n \"RSD\": 87.82784,\n \"RUB\": 36.00216,\n \"RWF\": 690.269,\n \"SAR\": 3.750523,\n \"SBD\": 7.269337,\n \"SCR\": 12.40801,\n \"SDG\": 5.699103,\n \"SEK\": 6.86018,\n \"SGD\": 1.246263,\n \"SHP\": 0.599315,\n \"SLL\": 4372.166667,\n \"SOS\": 841.5678,\n \"SRD\": 3.275,\n \"STD\": 18316.816667,\n \"SVC\": 8.745567,\n \"SYP\": 150.751249,\n \"SZL\": 10.56279,\n \"THB\": 31.86192,\n \"TJS\": 4.9856,\n \"TMT\": 2.8501,\n \"TND\": 1.719658,\n \"TOP\": 1.8861,\n \"TRY\": 2.15338,\n \"TTD\": 6.343484,\n \"TWD\": 30.00481,\n \"TZS\": 1661.865,\n \"UAH\": 13.02466,\n \"UGX\": 2614.28,\n \"USD\": 1,\n 
\"UYU\": 23.70693,\n \"UZS\": 2337.106637,\n \"VEF\": 6.295009,\n \"VND\": 21191.15,\n \"VUV\": 94.6,\n \"WST\": 2.301222,\n \"XAF\": 491.286739,\n \"XAG\": 0.05031657,\n \"XAU\": 0.00076203,\n \"XCD\": 2.70154,\n \"XDR\": 0.654135,\n \"XOF\": 491.394602,\n \"XPF\": 89.414091,\n \"YER\": 214.985901,\n \"ZAR\": 10.55678,\n \"ZMK\": 5253.075255,\n \"ZMW\": 6.169833,\n \"ZWL\": 322.355006\n }\n}"}
I don't understand why this code doesn't work:
X = "Arthur".
B = <<X>>.
jsx offers a lot of parsing functionality, but only if I have a binary as my representation of the JSON, and the JSON I'm getting from the currency API is a string inside a tuple... I'm a bit lost as to where to start. Unpacking a tuple using pattern matching is supposedly quite simple (I've done some Prolog programming and I can see that Erlang has similar behavior), but is there another, better, Erlang-appropriate way to grab the "rates" part of the JSON response?
Thank you! I'm working on a web app to learn Erlang, and this is a good first step. I have three Erlang books and I'm reading through them diligently, but I want as much practical exposure as early on as possible. I love this language and I want to get a solid grounding as fast as possible.
Thank you!
get_currencies() ->
    URL = "http://openexchangerates.org",
    Endpoint = "/api/latest.json?app_id=<myprivateid>",
    X = string:concat(URL, Endpoint),
    inets:start(),
    {ok, {_, _, R}} = httpc:request(X),
    E = jsx:decode(list_to_binary(R)),
    Base = proplists:get_value(<<"base">>, E),
    Sec = proplists:get_value(<<"timestamp">>, E),
    {Days, Time} = calendar:seconds_to_daystime(Sec),
    Date = calendar:gregorian_days_to_date(Days + 719528),
    Currencies = proplists:get_value(<<"rates">>, E),
    fun(C) ->
        V = proplists:get_value(C, Currencies),
        {change, Date, Time, Base, C, V}
    end.
and somewhere in your code:
GC = get_currencies(), %% you can store GC in an ets, a server state...
%% but don't forget to refresh it :o)
and use it later
{change,D,T,B,C,V} = GC(<<"ZWL">>),
%% should return {change,{2014,8,15},{2,0,12},<<"USD">>,<<"ZWL">>,322.355006}
[edit]
When I use an external application such as jsx (which itself uses rebar), I also use rebar and its dependency mechanism to create my own application; in my opinion it is the most convenient way. (In other cases, I also use rebar :o)
Then I build my application using the OTP behaviours (application, supervisor, gen_server...). It is a lot of modules to write, but some of them are very, very short (the application and supervisors) and they give the application structure (see What is OTP if you are not familiar with this).
In your case, my first idea is to have a gen_server that builds and stores the GC anonymous function in its state each time it gets a cast message such as update_currencies, and provides the answer each time it gets a call message such as {get_change, Cur} (and maybe refreshes GC if it is undefined or outdated).
You will also have to decide where errors will be managed. It may be nowhere: if the gen_server does nothing but answer this currency query, then when something goes wrong it will simply crash and be restarted by its supervisor. This code has many interfaces with the outside world and is therefore subject to numerous failures (no Internet access, a change in the structure of the site's answer, a bad currency request from the user...).
I figured out my problem.
First of all, I wasn't thinking about how simple it is to count how many elements there are in the tuple. Once I did, I realized there were only three.
So the line I needed was
{A, B, C} = Req.
After that, I only wanted to look at the last element (C, the JSON payload), which was a string.
I found out from another source (not to disregard what you told me, Kay!) that you need to use the list functions, since strings are just lists of integers within an ASCII range (I think), in this case list_to_binary.
Once I used this line:
D = list_to_binary(C),
and subsequently
E = jsx:decode(D),
I got this output:
[{<<"disclaimer">>,
<<"Exchange rates are provided for infor
attempt is made to ensure quality, NO gua
se - please use at your own risk. All usag
s://openexchangerates.org/terms/">>},
{<<"license">>,
<<"Data sourced from various providers w
any kind. Bitcoin data provided by http://
t: https://openexchangerates.org/license/"
{<<"timestamp">>,1408068012},
{<<"base">>,<<"USD">>},
{<<"rates">>,
[{<<"AED">>,3.673128},
{<<"AFN">>,56.479925},
{<<"ALL">>,104.1526},
{<<"AMD">>,413.859001},
{<<"ANG">>,1.789},
{<<"AOA">>,97.913949},
{<<"ARS">>,8.274608},
{<<"AUD">>,1.073236},
{<<"AWG">>,1.79005},
{<<"AZN">>,0.783933},
{<<"BAM">>,1.46437},
{<<"BBD">>,2},
{<<"BDT">>,77.478631},
{<<"BGN">>,1.464358},
{<<"BHD">>,0.377041},
{<<"BIF">>,1546.956667},
{<<"BMD">>,1},
{<<"BND">>,1.246774},
{<<"BOB">>,6.91391},
{<<"BRL">>,2.269462},
{<<"BSD">>,1},
{<<"BTC">>,0.0019393375},
{<<"BTN">>,60.843812},
{<<"BWP">>,8.833083},
{<<"BYR">>,10385.016667},
{<<"BZD">>,1.99597},
{<<"CAD">>,1.090486},
{<<"CDF">>,924.311667},
{<<"CHF">>,0.906833},
{<<"CLF">>,0.02399},
{<<"CLP">>,577.521099},
{<<"CNY">>,6.151637},
{<<"COP">>,1880.690016},
{<<"CRC">>,540.082202},
{<<"CUP">>,1.000688},
{<<"CVE">>,82.049699},
{<<"CZK">>,20.818},
{<<"DJF">>,179.084119},
{<<"DKK">>,5.579049},
{<<"DOP">>,43.43789},
{<<"DZD">>,79.8641},
{<<"EEK">>,11.7064},
{<<"EGP">>,7.150475},
{<<"ERN">>,15.062575},
{<<"ETB">>,19.83205},
{<<"EUR">>,0.748419},
{<<"FJD">>,1.850441},
{<<"FKP">>,0.599402},
{<<"GBP">>,0.599402},
{<<"GEL">>,1.74167},
{<<"GGP">>,0.599402},
{<<"GHS">>,3.735499},
{<<"GIP">>,0.599402},
{<<"GMD">>,39.73668},
{<<"GNF">>,6995.309935},
{<<"GTQ">>,7.839405},
{<<"GYD">>,205.351249},
{<<"HKD">>,7.750754},
{<<"HNL">>,21.04854},
{<<"HRK">>,5.708511},
{<<"HTG">>,44.66625},
{<<"HUF">>,233.8448},
{<<"IDR">>,11685.75},
{<<"ILS">>,3.471469},
{<<"IMP">>,0.599402},
{<<"INR">>,60.82523},
{<<"IQD">>,1178.211753},
{<<"IRR">>,26355.666667},
{<<"ISK">>,115.96},
{<<"JEP">>,0.599402},
{<<"JMD">>,112.604801},
{<<"JOD">>,0.707778},
{<<"JPY">>,102.495401},
{<<"KES">>,88.107639},
{<<"KGS">>,51.991},
{<<"KHR">>,4056.578416},
{<<"KMF">>,368.142141},
{<<"KPW">>,900},
{<<"KRW">>,1021.353328},
{<<"KWD">>,0.283537},
{<<"KYD">>,0.826373},
{<<"KZT">>,182.076001},
{<<"LAK">>,8049.834935},
{<<"LBP">>,1509.068333},
{<<"LKR">>,130.184301},
{<<"LRD">>,91.49085},
{<<"LSL">>,10.56165},
{<<"LTL">>,2.583364},
{<<"LVL">>,0.521328},
{<<"LYD">>,1.244147},
{<<"MAD">>,8.372619},
{<<"MDL">>,13.7178},
{<<"MGA">>,2495.605},
{<<"MKD">>,46.00037},
{<<"MMK">>,972.1784},
{<<"MNT">>,1885},
{<<"MOP">>,7.986291},
{<<"MRO">>,292.0081},
{<<"MTL">>,0.683738},
{<<"MUR">>,30.61748},
{<<"MVR">>,15.37833},
{<<"MWK">>,392.9201},
{<<"MXN">>,13.07883},
{<<"MYR">>,3.175406},
{<<"MZN">>,30.3272},
{<<"NAD">>,10.56145},
{<<"NGN">>,162.303701},
{<<"NIO">>,26.07651},
{<<"NOK">>,6.156902},
{<<"NPR">>,97.66846},
{<<"NZD">>,1.179692},
{<<"OMR">>,0.38501},
{<<"PAB">>,1},
{<<"PEN">>,2.795018},
{<<"PGK">>,2.464545},
{<<"PHP">>,43.68439},
{<<"PKR">>,99.5642},
{<<"PLN">>,3.126203},
{<<"PYG">>,4272.421673},
{<<"QAR">>,3.641297},
{<<"RON">>,3.319212},
{<<"RSD">>,87.8205},
{<<"RUB">>,36.00206},
{<<"RWF">>,690.088},
{<<"SAR">>,3.750583},
{<<"SBD">>,7.258136},
{<<"SCR">>,12.40829},
{<<"SDG">>,5.697837},
{<<"SEK">>,6.857347},
{<<"SGD">>,1.246447},
{<<"SHP">>,0.599402},
{<<"SLL">>,4360},
{<<"SOS">>,841.5678},
{<<"SRD">>,3.275},
{<<"STD">>,18322.733333},
{<<"SVC">>,8.745567},
{<<"SYP">>,150.793749},
{<<"SZL">>,10.56279},
{<<"THB">>,31.87122},
{<<"TJS">>,4.985575},
{<<"TMT">>,2.8501},
{<<"TND">>,1.719698},
{<<"TOP">>,1.889033},
{<<"TRY">>,2.15342},
{<<"TTD">>,6.343484},
{<<"TWD">>,29.99281},
{<<"TZS">>,1661.865},
{<<"UAH">>,13.02466},
{<<"UGX">>,2614.28},
{<<"USD">>,1},
{<<"UYU">>,23.70693},
{<<"UZS">>,2337.773304},
{<<"VEF">>,6.295009},
{<<"VND">>,21191.15},
{<<"VUV">>,94.4875},
{<<"WST">>,2.301222},
{<<"XAF">>,491.283058},
{<<"XAG">>,0.05031404},
{<<"XAU">>,7.6211e-4},
{<<"XCD">>,2.70154},
{<<"XDR">>,0.654135},
{<<"XOF">>,491.394602},
{<<"XPF">>,89.416091},
{<<"YER">>,214.985901},
{<<"ZAR">>,10.55649},
{<<"ZMK">>,5252.024745},
{<<"ZMW">>,6.169833},
{<<"ZWL">>,322.355006}]}]
ok
So this is closer to what I want, but how do I easily extract a specific currency, like ZWL for example?

Clustering similar values in a matrix

I have an interesting problem, and I'm sure there is an elegant algorithm with which to solve it, but I'm having trouble describing it succinctly, which would help in finding such an algorithm.
I have a symmetric matrix of comparison values e.g:
-104.2732 -180.3972 -130.6969 -160.8333 -141.5499 -139.2758 -144.7697 -114.0545 -117.6409 -140.1391
-180.3972 -93.05421 -171.618 -162.0157 -156.8562 -156.3221 -159.9527 -163.2649 -170.127 -153.2709
-130.6969 -171.618 -101.1591 -154.4978 -143.6272 -116.3477 -137.2391 -125.5645 -128.9505 -131.6046
-160.8333 -162.0157 -154.4978 -96.96312 -122.7894 -141.5103 -127.7861 -149.6883 -153.0445 -130.2555
-141.5499 -156.8562 -143.6272 -122.7894 -101.7487 -141.451 -123.9087 -138.7041 -139.2517 -125.3494
-139.2758 -156.3221 -116.3477 -141.5103 -141.451 -99.99486 -134.6553 -132.7735 -138.7249 -134.1319
-144.7697 -159.9527 -137.2391 -127.7861 -123.9087 -134.6553 -100.0683 -141.3492 -138.0292 -120.5331
-114.0545 -163.2649 -125.5645 -149.6883 -138.7041 -132.7735 -141.3492 -106.8555 -115.58 -139.3355
-117.6409 -170.127 -128.9505 -153.0445 -139.2517 -138.7249 -138.0292 -115.58 -104.9484 -140.4741
-140.1391 -153.2709 -131.6046 -130.2555 -125.3494 -134.1319 -120.5331 -139.3355 -140.4741 -101.3919
The diagonal always shows the maximum score (as it is a self-to-self comparison). However, I know that some of these values represent the same item. Taking a quick look at the matrix I can see (and have confirmed manually) that items 0, 7 & 8, as well as 2 & 5, and 3, 4, 6 & 9, each identify a single item.
Now what I'd like to do is find an elegant way to cluster these together so as to produce 4 clusters.
Does anyone know of such an algorithm? Any help would be much appreciated, as I seem so close to a solution but am tripping at this one last stumbling block :(
Cheers!
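One simple approach that fits this structure: pick a score threshold, link every pair that scores above it, and take connected components (single linkage) via union-find. One caveat, grounded in the matrix above: no single threshold reproduces the stated 4 clusters, because the cross-cluster score for items 2 and 7 (-125.56) is higher than the within-cluster score for items 3 and 9 (-130.26), so single linkage would merge {0, 7, 8} with {2, 5}. Average-linkage hierarchical clustering over the score matrix (e.g. scipy.cluster.hierarchy) copes with such outliers better. The toy matrix below is made up purely to demonstrate the mechanics:

```python
def cluster_by_threshold(scores, threshold):
    """Single-linkage clustering: treat any pair whose comparison score
    exceeds `threshold` as the same item, then return the connected
    components, computed with a small union-find."""
    n = len(scores)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):  # skip the diagonal (self-to-self scores)
            if scores[i][j] > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy 5x5 score matrix (made up): items 0,1 match and items 3,4 match
S = [
    [-100, -110, -150, -160, -155],
    [-110, -101, -152, -158, -157],
    [-150, -152, -102, -149, -151],
    [-160, -158, -149, -100, -112],
    [-155, -157, -151, -112, -103],
]
print(cluster_by_threshold(S, -130))  # → [[0, 1], [2], [3, 4]]
```

The threshold has to be chosen from the data (for instance, relative to the diagonal scores); when a clean gap between within-item and between-item scores doesn't exist, as in the real matrix above, move to a hierarchical method and cut the dendrogram at 4 clusters.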