Watching multiple directories with textFileStream in Spark Streaming
All it takes is creating two DStreams, one per directory.
// create DStream from text file
String logDir = "/tmp/logs";
String logDir2 = "/tmp/logs2";
JavaDStream<String> logData = jssc.textFileStream(logDir);
JavaDStream<String> logData2 = jssc.textFileStream(logDir2);

// output
logData.print();
logData2.print();

// start streaming
jssc.start();

// wait for end of job
jssc.awaitTermination();
Output
In the print() output, each batch interval (one second here) produced two result blocks, one per DStream, as shown below.
-------------------------------------------
Time: 1498053561000 ms
-------------------------------------------

-------------------------------------------
Time: 1498053561000 ms
-------------------------------------------

-------------------------------------------
Time: 1498053562000 ms
-------------------------------------------
2017-06-21T22:58:19+09:00 test.access1 {"message":"66.249.69.97 - - [24/Sep/2014:22:25:44 +0000] \"GET /071300/242153 HTTP/1.1\" 404 514 \"-\" \"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\""}
2017-06-21T22:58:19+09:00 test.access1 {"message":"71.19.157.174 - - [24/Sep/2014:22:26:12 +0000] \"GET /error HTTP/1.1\" 404 505 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36\""}
2017-06-21T22:58:19+09:00 test.access1 {"message":"71.19.157.174 - - [24/Sep/2014:22:26:12 +0000] \"GET /favicon.ico HTTP/1.1\" 200 1713 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36\""}
2017-06-21T22:58:19+09:00 test.access1 {"message":"71.19.157.174 - - [24/Sep/2014:22:26:37 +0000] \"GET / HTTP/1.1\" 200 18785 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36\""}
2017-06-21T22:58:19+09:00 test.access1 {"message":"71.19.157.174 - - [24/Sep/2014:22:26:37 +0000] \"GET /jobmineimg.php?q=m HTTP/1.1\" 200 222 \"http://www.holdenkarau.com/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36\""}

-------------------------------------------
Time: 1498053562000 ms
-------------------------------------------

-------------------------------------------
Time: 1498053571000 ms
-------------------------------------------

-------------------------------------------
Time: 1498053571000 ms
-------------------------------------------

-------------------------------------------
Time: 1498053572000 ms
-------------------------------------------
2017-06-21T22:58:29+09:00 test.access2 {"message":"71.19.157.174 - - [24/Sep/2014:22:26:12 +0000] \"GET /error78978 HTTP/1.1\" 404 505 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36\""}

-------------------------------------------
Time: 1498053572000 ms
-------------------------------------------
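The "Time: ... ms" headers are the batch time in epoch milliseconds, which makes them hard to compare with the timestamps inside the log lines. Below is a rough sketch of a helper that converts a header line into a readable time. The class name BatchTime, the method toIsoTime, and the Asia/Tokyo zone are my own assumptions for illustration (the zone is picked only because the log lines carry a +09:00 offset); none of this is part of Spark.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: reads the header lines printed by print().
class BatchTime {
    // Matches a header line such as "Time: 1498053561000 ms".
    private static final Pattern HEADER = Pattern.compile("^Time: (\\d+) ms$");

    /**
     * Converts a header line to an ISO-8601 date-time in the given zone.
     * Returns null when the line is not a batch-time header.
     */
    static String toIsoTime(String line, ZoneId zone) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        long epochMillis = Long.parseLong(m.group(1));
        return Instant.ofEpochMilli(epochMillis)
                .atZone(zone)
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        // Assumption: the zone is chosen to match the +09:00 offset seen in the log lines.
        ZoneId tokyo = ZoneId.of("Asia/Tokyo");
        // prints 2017-06-21T22:59:21+09:00
        System.out.println(toIsoTime("Time: 1498053561000 ms", tokyo));
        // prints 2017-06-21T22:59:32+09:00
        System.out.println(toIsoTime("Time: 1498053572000 ms", tokyo));
    }
}
```

Read this way, the two blocks that share a header clearly belong to the same one-second batch, one block for each DStream.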