scalding-io / programmingwithscalding
Programming MapReduce with Scalding
Home Page: http://scalding.io
License: Other
Hi,
My workflow is:
Job1:
A list of images => A Binary Sequence File
This is what I have so far:
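import java.net.URI
import com.twitter.scalding._
import org.apache.commons.io.IOUtils
import org.apache.hadoop.fs.{FileSystem, Path}

class TestImageJob(args: Args) extends Job(args) {
  implicitly[Mode] match {
    case Hdfs(_, configuration) => {
      // Each input line is expected to hold the path of one image file.
      TextLine(args("input"))
        .map('line -> 'image) { line: String =>
          val fs = FileSystem.get(URI.create(line), configuration)
          val in = fs.open(new Path(line))
          // Read the whole image into a byte array, then close the stream.
          try IOUtils.toByteArray(in) finally in.close()
        }
        .project('image)
        .write(SequenceFile(args("output")))
    }
  }
}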
But this doesn't seem to be working!
Job2:
The Binary Sequence File => Images
I have code for this written in Java with Hadoop:
Job1: http://pastebin.com/m1EuBaqj
Job2: http://pastebin.com/Y6gZZkmi
Thanks,
Anil
Hey,
I am trying to run the k-means algorithm from your project, but it seems the Mahout dependencies are not satisfied. I get the following error:
java.lang.NoClassDefFoundError: org/apache/commons/cli2/Option
So I added mahout-core and mahout-math3 to scald.rb. To resolve the error above, I now have to download the commons-cli2 jar separately, and I am stuck on how to include it, either in scald.rb or on the command line.
How did you do this?
I got the book on Kindle and tried running the code on 64-bit Windows in the latest IDE with Java 8. The POM has issues. The code runs if you change the Windows PATH and JAVA_HOME to point to Java 7; the same applies in the IDE. (Use Scala runtime version 2.10.5.)