scala - How to divide a nested array using an RDD in Spark
I am trying to divide a nested array using an RDD in Spark. For example, there is a text file that
contains 4 sentences, like this:
"he is good", "she is good", "i am good", "we are good"
I used the command
val arr = sc.textFile("filename").map(_.split(" "))
and got this:
Array[Array[String]] = Array(Array(he, is, good), Array(she, is, good), ... )
I want to use each of the array elements (i.e. Array(he, is, good)), but I don't know how to divide this. How can I divide this?
It is unclear what you mean by 'divided', but typically in functional programming languages, when you want to do something with each element of a collection (or 'iterable'), you can use the map function. map converts each element based on the function passed to it. For example, in a worksheet you can do this:
val sentences = Array(Array("he", "is", "good"), Array("she", "is", "very", "good"))

def yodaize(sentence: Array[String]): Array[String] = {
  val reversed = sentence.reverse
  // Side effect: print the reversed sentence.
  println("yoda says, '%s'".format(reversed.mkString(" ")))
  reversed
}

yodaize(Array("i", "am", "small"))
val yodasentences = sentences.map(yodaize)
The function yodaize does two things: it reverses the sentence passed to it and, as a side effect, prints out the reversed sentence. The worksheet output of the above is:
sentences: Array[Array[String]] = [[Ljava.lang.String;@faffecf
yodaize: yodaize[](val sentence: Array[String]) => Array[String]
yoda says, 'small am i'
res0: Array[String] = [Ljava.lang.String;@4bf1c779
yoda says, 'good is he'
yoda says, 'good very is she'
yodasentences: Array[Array[String]] = [[Ljava.lang.String;@40a19a85
It's hard to see directly here, but yodasentences is the original array with each sub-array reversed:
Array(Array("good", "is", "he"), Array("good", "very", "is", "she"))
With map you can pass in any function. It can directly convert the element or have a side effect. In this manner, functional languages can deal with each element without ever needing to 'divide' the collection. Note that other functions such as flatMap, foldLeft, and filter can be used to perform other sorts of transformations on a collection.