The Spark Streaming data cleanup mechanism
(I) DStream and RDD
As we know, Spark Streaming computation is built on Spark Core, and the core of Spark Core is the RDD, so Spark Streaming is necessarily tied to RDDs. However, Spark Streaming does not have users work with RDDs directly. Instead it abstracts a set of DStream concepts. A DStream contains RDDs, and you can think of the relationship as the decorator pattern in Java: a DStream is an enhancement of an RDD, but behaves much like one.
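DStream and RDD have several things in common.
(1) Both have similar transformation operations, such as map and reduceByKey, but DStream also has some of its own, such as window and mapWithState.
(2) Both have action-style operations, such as foreachRDD and count (on a DStream, foreachRDD is called an output operation, and count returns a new DStream rather than a single value).
The programming model is the same.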
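The decorator-style relationship described above can be sketched with a toy model. This is not Spark code: `ToyRDD`, `ToyDStream`, and `word_counts` are hypothetical names invented for illustration, and a plain Python list stands in for a distributed dataset. The sketch only shows how a stream can offer the same operators as a batch by applying them to one RDD per batch time.

```python
class ToyRDD:
    """Stand-in for an RDD: one batch of records (hypothetical, not Spark's class)."""

    def __init__(self, records):
        self.records = list(records)

    def map(self, fn):
        return ToyRDD(fn(r) for r in self.records)

    def reduce_by_key(self, fn):
        # Combine the values of records that share a key.
        acc = {}
        for key, value in self.records:
            acc[key] = fn(acc[key], value) if key in acc else value
        return ToyRDD(acc.items())

    def collect(self):
        return list(self.records)


class ToyDStream:
    """Stand-in for a DStream: one ToyRDD per batch time, same operators as ToyRDD."""

    def __init__(self, rdds_by_time):
        self.rdds_by_time = dict(rdds_by_time)

    def _lift(self, rdd_op):
        # Apply an RDD-level operation to every batch and wrap the results again.
        return ToyDStream({t: rdd_op(rdd) for t, rdd in self.rdds_by_time.items()})

    def map(self, fn):
        return self._lift(lambda rdd: rdd.map(fn))

    def reduce_by_key(self, fn):
        return self._lift(lambda rdd: rdd.reduce_by_key(fn))

    def foreach_rdd(self, fn):
        # Output operation: hand each batch's RDD to user code, in time order.
        for t in sorted(self.rdds_by_time):
            fn(t, self.rdds_by_time[t])


def word_counts(stream):
    """Word count written against the stream API; returns {batch_time: counts}."""
    counted = stream.map(lambda w: (w, 1)).reduce_by_key(lambda a, b: a + b)
    out = {}
    counted.foreach_rdd(lambda t, rdd: out.__setitem__(t, dict(rdd.collect())))
    return out
```

The user code in `word_counts` never touches a `ToyRDD` until the output step, which mirrors the point that the two abstractions share one programming model.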
(II) An introduction to DStream in Spark Streaming
DStream comes in several kinds of classes.
(1) Data source classes, such as InputDStream, and more specifically DirectKafkaInputDStream, etc.
(2) Transformation classes, typically MappedDStream and ShuffledDStream.
(3) Output classes, such as ForEachDStream.
From the above, data is handled by the DStream system from start (input) to finish (output), so users normally cannot create or manipulate RDDs directly. This gives DStream both the opportunity and the obligation to manage the RDD lifecycle.
In other words, Spark Streaming cleans up automatically.
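Because the stream owns every RDD it creates, it can also decide when to forget them. The following is a minimal sketch of that idea, not Spark's implementation: the class `ToyDStream`, its `remember` window, the `generated` cache, and the `clear_metadata` and `run_batches` helpers are all assumptions made for illustration, and time is modeled as plain integers.

```python
class ToyDStream:
    """Toy model of a stream that caches one RDD per batch and forgets old ones."""

    def __init__(self, compute, remember):
        self.compute = compute    # hypothetical function: batch_time -> list of records
        self.remember = remember  # how long (in time units) a batch stays cached
        self.generated = {}       # batch_time -> cached "RDD" (a plain list here)

    def get_or_compute(self, time):
        # Create the batch's RDD once, then serve it from the cache.
        if time not in self.generated:
            self.generated[time] = self.compute(time)
        return self.generated[time]

    def clear_metadata(self, time):
        # Drop every cached batch that has fallen out of the remember window.
        expired = [t for t in self.generated if t <= time - self.remember]
        for t in expired:
            del self.generated[t]
        return sorted(expired)


def run_batches(stream, times):
    """Generate each batch, then clean up, as a scheduler might after a batch completes."""
    dropped = []
    for t in times:
        stream.get_or_compute(t)
        dropped.extend(stream.clear_metadata(t))
    return dropped
```

With `remember=2`, running batches 1 through 5 leaves only batches 4 and 5 in the cache. The user never has to release anything by hand.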
(III) How RDDs are generated in Spark Streaming
The lifecycle of an RDD in Spark Streaming is roughly as follows.
(1) In the InputDStream, the received data is turned into an RDD. For example, DirectKafkaInputDStream produces a KafkaRDD.
(2) The data then passes through MappedDStream and other transformations. At this point the RDD's own map method is generally what gets called to do the transformation.
(3) In the output-class operation, the RDD is finally exposed, and only then can the user perform the corresponding storage, further computation, and other operations on it.
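The three steps above can be sketched as a chain of toy stages. The class names echo the Spark classes mentioned in the text, but everything here (`ToyInputDStream`, `ToyMappedDStream`, `ToyForEachDStream`, `run_pipeline`, and the `compute` and `generate_job` methods) is a hypothetical model written for illustration, with a Python list standing in for an RDD.

```python
class ToyInputDStream:
    """Source stage: turns the records received for a batch into an 'RDD' (a list here)."""

    def __init__(self, receive):
        self.receive = receive  # hypothetical function: batch_time -> iterable of records

    def compute(self, time):
        return list(self.receive(time))


class ToyMappedDStream:
    """Transformation stage: asks its parent for the batch's RDD and maps over it."""

    def __init__(self, parent, fn):
        self.parent = parent
        self.fn = fn

    def compute(self, time):
        return [self.fn(r) for r in self.parent.compute(time)]


class ToyForEachDStream:
    """Output stage: the only place where the batch's RDD is handed to user code."""

    def __init__(self, parent, user_fn):
        self.parent = parent
        self.user_fn = user_fn

    def generate_job(self, time):
        rdd = self.parent.compute(time)
        self.user_fn(rdd, time)


def run_pipeline(batches):
    """Wire input -> map -> output and run one job per batch time."""
    stored = []
    source = ToyInputDStream(lambda t: batches[t])
    upper = ToyMappedDStream(source, str.upper)
    sink = ToyForEachDStream(upper, lambda rdd, t: stored.append((t, rdd)))
    for t in sorted(batches):
        sink.generate_job(t)
    return stored
```

Each job starts at the output stage and pulls the batch back through its parents, so the RDD exists only inside the chain until `user_fn` receives it.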