Just a bunch of useful links
- Scala Design Patterns - great stuff, how you do (or don't) traditional Java / OOP patterns in Scala
- The Human Side of Scala - great post on styling Scala for readability
- Sneaking Scala Through the Back Door - how to promote Scala in an organization
- Effective Scala - Twitter's guide to writing good Scala code
- Between Zero & Hero - tips and tricks for the intermediate Scala developer
- Type of Types - an unfinished tutorial on the Scala type system
- Monads are not Metaphors - a great explanation of monads
- Important compiler flags
- Recursive Types - signatures like `class Foo[T <: Foo[T]]`, useful for inheritance and proper return types. Though if you hit this, there are probably better ways of solving the problem, for example via composition.
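
To make the Recursive Types entry above concrete, here is a minimal sketch of the `class Foo[T <: Foo[T]]` pattern. The names `Shape`, `Circle`, and `Square` are made up for illustration and do not come from the linked post; the point is only that a method written once in the base trait can return the concrete subtype.

```scala
// Hypothetical example types; only the T <: Shape[T] bound is the pattern itself.
trait Shape[T <: Shape[T]] { self: T =>
  def scale: Double
  def withScale(s: Double): T
  // Defined once here, but the return type is the concrete subtype T.
  def doubled: T = withScale(scale * 2)
}

final case class Circle(scale: Double) extends Shape[Circle] {
  def withScale(s: Double): Circle = copy(scale = s)
}

final case class Square(scale: Double) extends Shape[Square] {
  def withScale(s: Double): Square = copy(scale = s)
}

object RecursiveTypesDemo {
  def main(args: Array[String]): Unit = {
    // Static type is Circle, not Shape[_], so no cast is needed.
    val c: Circle = Circle(1.0).doubled
    println(c) // Circle(2.0)
  }
}
```

Without the recursive bound, `doubled` would have to return the base type and callers would lose the concrete type. A type class or plain composition usually gets the same result with less ceremony.
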
- Simple Binary Encoding - supposedly 20-50x faster than Google Protobuf !!
- Comparison of Cap'n Proto, SBE, FlatBuffers from the Cap'n Proto people
- Jawn - @d6's new fast JSON parser, parses to multiple ASTs including rojoma-json, spray-json, argonaut
- Extracting case class param names using Macros
- Fast-Serialization - a drop in replacement for Java Serialization but much faster
- CKite - Raft Scala implementation, Finagle, MapDB etc.
- Wake - A Java event-driven framework from Microsoft (!)
- Dirigiste - dynamic scalable / smarter Threadpools
- Scala-gopher - a #golang-style CSP / channels implementation for Scala. Other niceties: defer()
- Retry for futures. Also SafeFuture, CancellableFuture, etc. - very useful
- Execute Futures serially - in nonblocking fashion
- Scala.Rx - "Reactive variables" - smart variables that auto-update themselves when the values they depend on change
- Monifu - a nice set of wrappers around j.u.c.Atomic, as well as super-lightweight cancellable tasks and futures utilities. Accompanying blog post.
- Kamon - great looking Actor monitoring using bytecode weaving? No code change required.
- Actor Provisioning pattern - if you have a long, failure-prone initialization procedure for an actor, this trait splits out the work to, say, another actor and dispatcher
- Running an Akka cluster with Docker Containers
- Ask, Tell, and Per-Request Actors - why one company moved from Ask/Futures to per-request
- Dos and Don'ts deploying Akka in Production - an excellent read, full of advice even for non-Akka JVM apps
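
For the "Retry for futures" and "Execute Futures serially" entries above, here is a rough sketch of both ideas using only `scala.concurrent`. The helper names `retry` and `serially` are hypothetical and are not the API of the linked projects.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureSketches {
  // Retry: re-evaluate the by-name `op` until it succeeds or attempts run out.
  // No backoff or delay here; a real helper would probably want one.
  def retry[A](attempts: Int)(op: => Future[A])(implicit ec: ExecutionContext): Future[A] =
    op.recoverWith { case _ if attempts > 1 => retry(attempts - 1)(op) }

  // Serial execution: f(a) is only called once the previous future has
  // completed, and no thread blocks while waiting.
  def serially[A, B](inputs: Seq[A])(f: A => Future[B])(implicit ec: ExecutionContext): Future[Vector[B]] =
    inputs.foldLeft(Future.successful(Vector.empty[B])) { (acc, a) =>
      acc.flatMap(done => f(a).map(b => done :+ b))
    }
}

object FutureSketchesDemo {
  import FutureSketches._
  import ExecutionContext.Implicits.global

  def main(args: Array[String]): Unit = {
    val calls = new AtomicInteger(0)
    // Fails on the first two calls, succeeds on the third.
    val flaky = retry(3) {
      val n = calls.incrementAndGet()
      if (n < 3) Future.failed[Int](new RuntimeException(s"attempt $n failed"))
      else Future.successful(n)
    }
    println(Await.result(flaky, 5.seconds)) // 3

    val tens = serially(Seq(1, 2, 3))(n => Future(n * 10))
    println(Await.result(tens, 5.seconds)) // Vector(10, 20, 30)
  }
}
```

The `Await` calls are only there so the demo can print results. The helpers themselves never block; they only chain callbacks with `recoverWith` and `flatMap`.
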
- Asyncpools - Akka-based async connection pool for Slick. Akka 2.2 / Scala 2.10.
- Postgresql-Async - Netty-based async drivers for PostgreSQL and MySQL
- Cacheable - a clever memoization / caching library (with Guava, Redis, Memcached or EHCache backends) using Scala 2.10 macros to remember function parameters
- Great list of Big Data Projects
- Debasish G's list of streaming papers and algorithms - esp stuff on CountMinSketch and HyperLogLog
- Summingbird - For any dataset that can be aggregated using a monoid, promises to unify Storm, Hadoop, and in the future, Akka and Spark with a single DSL. Also has a neat library of monoids built in.
- Making Zookeeper Resilient, an excellent blog post from Pinterest
- Fast SQL Query Parser in Scala - based on the Scala-LMS project, compiles a query down to C!
- Probability Monad - super useful for stats or random data generation
- stringmetric - Approximate string matching and phonetic algorithms
- Factorie - a Scala library for Natural Language Processing
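
The Summingbird entry above hinges on "can be aggregated using a monoid", so here is a minimal sketch of what that means. The `Monoid` trait below is a hand-rolled stand-in written for this page (an assumption, not Summingbird's own API): a zero value plus an associative `plus`, which is what lets partial results from separate batches or streams be merged in any grouping.

```scala
// Hypothetical minimal type class, written for illustration only.
trait Monoid[A] {
  def zero: A
  def plus(x: A, y: A): A
}

object Monoid {
  implicit val longSum: Monoid[Long] = new Monoid[Long] {
    val zero: Long = 0L
    def plus(x: Long, y: Long): Long = x + y
  }

  // Maps form a monoid whenever their values do: merge by key, combine values.
  implicit def mapMonoid[K, V](implicit v: Monoid[V]): Monoid[Map[K, V]] =
    new Monoid[Map[K, V]] {
      val zero: Map[K, V] = Map.empty[K, V]
      def plus(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, value)) =>
          acc.updated(k, v.plus(acc.getOrElse(k, v.zero), value))
        }
    }

  def sum[A](xs: Iterable[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)(m.plus)
}

object MonoidDemo {
  def main(args: Array[String]): Unit = {
    // Word counts from two separate "batches" of input.
    val batch1 = Seq("a", "b", "a").map(w => Map(w -> 1L))
    val batch2 = Seq("b", "c").map(w => Map(w -> 1L))

    // Aggregate each batch on its own, then merge the partial results.
    val merged = Monoid.sum(Seq(Monoid.sum(batch1), Monoid.sum(batch2)))
    // Same answer as aggregating everything in one pass.
    val onePass = Monoid.sum(batch1 ++ batch2)

    assert(merged == onePass)
    assert(merged == Map("a" -> 2L, "b" -> 2L, "c" -> 1L))
  }
}
```

Because `plus` is associative, it does not matter whether the counts come from one big batch job or from many small streaming updates, which is the property a single DSL over both kinds of system relies on.
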
- Jaws - Spark SQL REST server, includes query cancellation, logs, load balancing. Based originally on my own spark-jobserver
- Supplemental Spark Projects - lots of other interesting projects, including IPython notebooks, dataframe stuff, stream + historical data processing, and more.
- Elastic Mesos - create Mesos clusters on AWS with ZK, HDFS
- GeoTrellis - distributed raster processing, adding Vector/geom support, Akka Cluster and Spark implementations!
- Spatial framework for Hadoop - PostGIS-like operators / UDFs for Hive. We want this for Spark!
- trails - parser combinators for graph traversal. Supports Tinker/Blueprints/Neo4j APIs.
- scala-graph - in-memory graph API based on scala collections. Work in progress.
- Breeze, Spire, and Saddle - Scala numeric libraries
- spire-ops - a set of macros for no-overhead implicit operator enrichment
- ScalaXY - collection of macros for performant for loops, extension methods etc
- Squants - The Scala API for Quantities, Units of Measure and Dimensional Analysis
- FastTuple - a dynamic (runtime-defined) C-style struct library, with support for off-heap storage. Would work really well for in-memory queries.
- and the excellent blog covers all of the on- and off-heap access and allocation patterns on the JVM very thoroughly.
- Unboxing, Runtime Specialization - a cool post on how to do really fast aggregations using unboxed integers
- product-collections - useful library for working with collections of tuples
- SuperFastHash - also see Murmur3
- Phantom - Scala DSL for Cassandra, supports CQL3 collections, CQL generation from data models, async API based on Datastax driver
- Athena - Asynchronous Cassandra client built on Akka-IO
- CCM - easily build local Cassandra clusters for testing!
- Stubbed Cassandra - super useful for testing C* apps
- Pithos - an S3-API-compatible object store for Cassandra
- Sirius - Akka-based in-memory fast key-value store for JVM objects, with Paxos consistency, persistence/txn logs, HA recovery
- Storehaus - Twitter's key-value wrapper around Redis, MySQL, and other stores. Has a neat merge() functionality for aggregation of values, lists, etc.
- MapDB - Not a database, but rather a database engine with tunable consistency / ACIDness; support for off-heap memory; fast performance; indexing and other features.
- HPaste - a nice Scala client for HBase
- OctopusDB paper - interesting idea of using a WAL of RDF triples as the primary storage, with secondary views of row or column orientation
- Scalaj-http - really simple REST API. Although, the latest Spray-client has been vastly simplified as well.
- REPL as a service - would be kick ass if integrated into Spark
- IScala - Scala backend for IPython. Looks promising. There is also Scala Notebook but it's more of a research project.
- Scaposer - i18n / .po file library
- Adding Reflection to Scala Macros - example of using reflection in an annotation macro to add automatic ByteBuffer serialization to case classes :)
- Scaldi - A lightweight dependency injection library, with Akka integration
- How to use Typesafe Config across multiple environments
- lamma.io - the easiest date generation library
- Pimpathon - a set of useful pimp-my-library extensions
- Scala-rainbow - super simple terminal color output, easier than Console.XXX
- Run Scala scripts with dependencies - ie you don't need a project file
- sbt-assembly 0.10.2 supports adding a shell script to your jar to make it executable! No more "java ...." to start your Scala program, and no more `ps ax | grep java | grep ....`
- Other useful SBT plugins - sbt-sonatype, sbt-pom-reader, sbt-sound, plugins page
- SCoverage - statement coverage tool, much more useful than line-based or branch-based tools. Has SBT plugin. Blog post on why it's an improvement.
- sbt-jmh - Plugin for running SBT projects with the JMH profiling tool
- SBT updates - Tool for discovering updated versions of SBT dependencies
- Thyme and Parsley - microbenchmarking and profiling tools, seems useful
- ScalaStyle - Scala style checker / linter
- scala-abide - an official linter from Typesafe
- utest - a small micro test framework
- lions share - a neat JVM heap and GC analysis tool, with charts and SBT integration.
SBuild seems like a promising replacement for SBT. Still Scala, but much, much simpler, more like a Scala version of Make. With Maven dependency and ScalaTest support.
- Quick dumping your JVM heap using GDB -- too bad it doesn't work on OSX.
- jHiccup -- "Hiccup" or GC pause analysis tool
- Bintray - friendlier alternative to Sonatype OSS / Maven central. Also see bintray-sbt plugin.
- Adaptive Radix Trees - cache friendly indexing for in-memory databases
- Quotient Cubes - semantic grouping and rollup algorithm for OLAP cubes. Ruby implementation.
- Top K queries and cubes
- Scalable In-memory Aggregation - column-oriented, in memory with bitmap indexing and memoization
- LearnDS - A set of IPython notebooks for learning data science
- Raft Visualization - great 5-min visualization of the distributed consensus protocol
I love Sublime and use it for everything, even Scala! Going to put my Sublime stuff in a separate page.
- Semver - Semantic versioning, how to deal with dev workflows and corner cases -- a must read
- Pragmatic RESTful API Design - really good stuff
- Blameless Post-Mortems - why they are crucial to good culture
- GitHub Flow - how github.com does continuous deploys, uses pull requests for an automated, process-free development workflow. Some gems include naming branches descriptively and using github.com to browse the work currently in progress by looking at active branches.
- Pull Requests and other good Github Practices
- JQ - JSON processor for the shell. Super useful with RESTful servers.
- Underscore-CLI - a Node-JS based command line JSON parser
- MacroPy - Scala-like macros, case classes, pattern matching, parser combos for Python (!!)
- Scala 2.11 vs Swift - Apple's new iOS language is often compared to Scala.
- Rust By Example - also the guide on their site is pretty good.
- Gherkin - a Lisp implemented in bash !!
- Nimrod - a neat, compile-straight-to-binary, static systems language with beautiful Python-like syntax, union types, generics, macros, first-class functions. What Go should have been.
- Bret Victor - A set of excellent essays and talks from a great visual designer