drujensen / fib

Performance Benchmark of top Github languages

clang cpp crystal csharp dart elixir fibonacci-benchmark go java-8 javascript julia node perl php python python3 r ruby rust swift

fib's Introduction

Recursive Fibonacci Benchmark using top languages on Github

Top 10: JavaScript, Python, Java, TypeScript, C#, PHP, C++, C, Shell, Ruby (reference)

Others: Go, Rust, Swift, Crystal, Pony, Ada, Pascal, Fortran, Kotlin, Clojure, Scala, Mono, R, Dart, Julia, D, Nim, Cython, Python3, PyPy, Ruby jit, OCaml, Lisp, Haskell, Erlang, Elixir, Escript, Scheme, Lua, Perl, Perl6, Bash, Emoji

The code recursively computes the Fibonacci number at the 47th position, with the result 2,971,215,073.

Fibonacci can be written many different ways. The goal of this project is to compare how each language handles the exact same code.

Here is the Ruby version:

def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

puts fib(47)

Here is the Crystal version:

def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

puts fib(47_u64)

To keep a level playing field, only common "release" flags are used in the compilation step. This allows for compiler optimizations like inlining and constant propagation, but removes anything considered dangerous, e.g. bypassing out-of-bounds checks.

All tests are run on:

  • AWS EC2 - m5.large 2 vCPU
  • Processor: Intel Xeon 3.1 GHz
  • Memory: 8 GiB
  • OS: Ubuntu 20.04
  • Docker Base Image: ubuntu:20.04

How to run them

You can run the tests using Docker: docker run -it drujensen/fib

By default, it will compile and run all languages 5 times. Totals are calculated by adding the average compile and run times.

To only run a subset of the languages, provide a list of extensions and optionally the count:

docker run -it drujensen/fib ./run.sh s,c,cpp,go,rs,swift 5

To run in the background using nohup:

nohup docker run drujensen/fib:latest > results.txt 2>&1 &

Results

Last benchmark was run on January 18, 2022

Natively compiled, statically typed

Language | Total, s | Compile | Compile Time, s | Run | Run Time, s | Ext
C | 6.865 | gcc -O3 -o fib fib.c | 0.058 | ./fib | 6.807 | c
Fortran | 6.893 | gfortran -O3 -o fib fib.f03 | 0.119 | ./fib | 6.774 | f03
C++ | 6.954 | g++ -O3 -o fib fib.cpp | 0.126 | ./fib | 6.828 | cpp
Cython | 7.114 | cython -3 --embed -o fib.pyx.c fib.pyx && gcc -O3 -o fib fib.pyx.c $(pkg-config --cflags --libs python) | 0.297 | ./fib | 6.817 | pyx
Nim | 7.337 | nim c -d:release fib.nim | 0.920 | ./fib | 6.416 | nim
Assembly | 7.401 | gcc -no-pie -O3 -o fib fib.s | 0.025 | ./fib | 7.376 | s
Ada | 7.645 | gnat make -O3 -gnatp -o fib fib.adb | 0.213 | ./fib | 7.433 | adb
Rust | 7.652 | rustc -C opt-level=3 fib.rs | 0.638 | ./fib | 7.014 | rs
Pascal | 8.845 | fpc -O3 -Si ./fib.pas | 0.078 | ./fib | 8.766 | pas
Pony | 9.117 | ponyc -s -b fib -p ./fib.pony | 1.038 | ./fib | 8.079 | pony
D | 9.481 | ldc2 -O3 -release -flto=full -of=fib fib.d | 0.612 | ./fib | 8.869 | d
Swift | 10.568 | swiftc -O -g fib.swift | 0.474 | ./fib | 10.094 | swift
OCaml | 10.729 | ocamlopt -O3 -o fib fib.ml | 0.294 | ./fib | 10.435 | ml
V | 10.752 | v -prod -o fib fib.v | 4.352 | ./fib | 6.399 | v
Go | 12.486 | go build fib.go | 0.600 | ./fib | 11.885 | go
Haskell | 12.544 | rm ./fib.o && ghc -O3 -o fib fib.hs | 0.001 | ./fib | 12.543 | hs
Crystal | 17.900 | crystal build --release fib.cr | 4.940 | ./fib | 12.960 | cr
Lisp | 17.996 | sbcl --load fib.lisp | 1.704 | ./fib | 16.293 | lisp
Dart Compiled | 30.942 | dart compile exe -o fib ./fib.dart | 5.058 | ./fib | 25.884 | dartc
Cobol | 4380.728 | cobc -x -O3 -o fib ./fib.cbl | 0.133 | ./fib | 4380.596 | cbl

VM compiled bytecode, statically typed

Language | Total, s | Compile | Compile Time, s | Run | Run Time, s | Ext
Scala | 9.047 | scalac Fib.scala | 0.942 | scala Fib | 8.104 | scala
Java | 10.391 | javac Fib.java | 0.894 | java Fib | 9.497 | java
Kotlin | 11.721 | kotlinc Fib.kt | 2.097 | java FibKt | 9.624 | kt
C# (Mono) | 16.107 | mcs Fib.cs | 0.491 | mono Fib.exe | 15.616 | mono
C# | 16.218 | dotnet build -c Release -o ./bin | 2.162 | dotnet ./bin/fib.dll | 14.056 | cs
Erlang | 28.868 | erlc +native +'{hipe,[o3]}' fib.erl | 0.500 | erl -noinput -noshell -s fib | 28.368 | erl

VM compiled before execution, mixed/dynamically typed

Language | Time, s | Run | Ext
Dart | 11.286 | dart fib.dart | dart
Julia | 11.318 | julia -O3 fib.jl | jl
Lua Jit | 16.382 | luajit fib.lua | luajit
Clojure | 23.448 | clojure -M fib.cljc | cljc
Elixir | 28.604 | ERL_COMPILER_OPTIONS='[native,{hipe, [o3]}]' elixir Fib.exs | exs
Node | 30.232 | node fib.js | js
Python3 (PyPy) | 37.827 | pypy3 fib.py | pypy
Ruby (jit) | 51.793 | ruby --jit fib.rb | rbjit

Interpreted, dynamically typed

Language | Time, s | Run | Ext
Scheme | 66.668 | guile fib.scm | scm
Php | 118.468 | php fib.php | php
Lua | 169.710 | lua fib.lua | lua
Ruby | 223.653 | ruby fib.rb | rb
Janet | 292.366 | janet ./fib.janet | janet
Python | 507.845 | python fib.py | py
Python3 | 527.247 | python3 fib.py | py3
Perl | 1085.181 | perl fib.pl | pl
Tcl | 1580.804 | tclsh fib.tcl | tcl
R | 2413.682 | R -f fib.r | r
Raku | 2841.391 | rakudo fib.raku | raku
Escript | 3448.282 | escript fib.es | es

Versions

All compilers are installed using apt or asdf on the Ubuntu 20.04 docker image.

language version
ada 9.3.0
assembly 9.3.0
bash 5.0.0
crystal 1.3.0
clojure 1.10.3.1040
cython 0.29.26
dart 2.15.1
dotnet-core 6.0.101
elixir 1.12.0
elm 0.19.1
erlang 24.2
fortran 9.3.0
g++ 9.3.0
gcc 9.3.0
golang 1.17.5
guile 3.0.7
haskell 9.2.1
janet 1.19.2
java openjdk-17
julia 1.7.1
K 3.6
kotlin 1.6.10
ldc2 1.2.4
lua 5.4.3
luaJIT 2.1.0
mono 6.8.0
nim 1.6.2
nodejs 17.3.0
ocaml 4.11.1
pascal 3.0.4
perl 5.34.0
php 8.1.1
pony 0.38.1
powershell v7.0.0
python 3.10.1
pypy 7.3.7
qb64 1.5
R 4.1.2
rakudo 2021.12
ruby 3.1.0
rust 1.57.0
sbcl 2.11.1
scala 3.1.0
swift 5.5.2
tcl 8.6.10
v 0.2.2

fib's People

Contributors

abhinickz, adrianv, agrison, ahungry, akira13641, anthony-michael-bennett, autinerd, benziahamed, bport, drujensen, faustinoaq, guicho271828, ikhoury, jruhland-nvidia, kevinchiu, mahdisml, masterduke17, obahareth, p7g, pichi, pierrebeaucamp, qdzo, shakna-israel, shrx, tomhennigan, utdemir, vlazar, xyproto, zardoz89, zhangkaizhao


fib's Issues

Interpreted OCaml ?

OCaml can run as an interpreted language (via the command ocaml fib.ml) and can be compiled for a VM via ocamlc (and executed via ocamlrun).

Based on my setup, which tends to be 50% slower than yours, the interpreted version would be the fastest: 136s on my machine (maybe 91s on yours?).

(grmbl: not a match for LuaJIT, which scores 19s on my setup).

Probably worth benching PHP 7.2

PHP isn't going to win this; the recursive algorithm is the least efficient way to do this on the platform. However, PHP 5.6 has been EOL for a few years now. PHP 7.0 is already EOL. Current stable is 7.2, old stable is 7.1.

On a very similar MBP (2015, 16GB RAM) using PHP 7.2.4 (cli) from Homebrew, the results are:

# time php fib.php
2971215073php fib.php  189.72s user 0.38s system 99% cpu 3:10.54 total

Again, no speed record, but 3:10 vs 9 min is a big difference.

FYI, with the iterative approach:

<?php

function fastfib($n)
{
    $a = 0;
    $b = 1;

    foreach(range(0, $n) as $i) {
        [$a, $b] = [$b, $a + $b];
    }

    return $a;
}

echo fastfib(46) . PHP_EOL;

PHP 7.2 takes 0.042s total.

The fastest languages are cheating

When compiling Nim to C++, this is the simplified result:

#include <stdint.h>
#include <stdio.h>

uint64_t fib(uint64_t n){
    if (n > 1){
        uint64_t a = fib(n - 1);
        uint64_t b = fib(n - 2);

        return a + (b);
    }
    return 1;
}

extern "C" void foo(){
    uint64_t result = fib(46);

    printf("%lu\n", result);
}

int main(){
    foo();
}

Nothing wrong here yet. But when compiled with g++, the disassembled program will look similar to this:

; Run like so:
; nasm -felf64 fib.asm && gcc -no-pie fib.o -o fib && time ./fib
global main
extern printf

section .text

fib:
        cmp     rdi, 1
        mov     eax, 1
        ja      L15
        ret
L15:
        push    r15
        push    r14
        mov     r14, -1
        push    r13
        push    r12
        lea     r12, [rdi-2]
        push    rbp
        push    rbx
        mov     rbp, rdi
        xor     ebx, ebx
        sub     rsp, 8
L3:
        cmp     rbp, 2
        jne     L9
        cmp     r12, 1
        ja      L7
        mov     r13d, 1
L5:
        lea     rax, [r13+1+rbx]
L1:
        add     rsp, 8
        pop     rbx
        pop     rbp
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        ret
L7:
        mov     rdi, r14
        sub     r12, 4
        mov     rbp, -2
        call    fib
        lea     rbx, [rax+1+rbx]
L9:
        lea     r15, [rbp-3]
        mov     rdi, r12
        call    fib
        mov     rdi, r15
        mov     r13, rax
        call    fib
        add     r13, rax
        cmp     r12, 1
        jbe     L5
        mov     rdi, r15
        sub     rbp, 4
        sub     r12, 4
        call    fib
        add     r13, rax
        add     rbx, r13
        cmp     rbp, 1
        ja      L3
        lea     rax, [rbx+1]
        jmp     L1

foo:
        push    r12
        push    rbp
        mov     edi, 40
        push    rbx
        call    fib
        mov     edi, 39
        mov     rdx, rax
        call    fib
        mov     edi, 38
        mov     rbp, rax
        call    fib
        mov     edi, 37
        lea     r11, [rax+rbp]
        mov     rcx, rax
        call    fib
        mov     edi, 36
        add     rcx, rax
        mov     rsi, rax
        lea     r10, [r11+rcx]
        call    fib
        add     rsi, rax
        mov     rbx, rax
        mov     edi, 35
        add     rcx, rsi
        lea     r9, [r10+rcx]
        call    fib
        add     rbx, rax
        mov     edi, 34
        mov     r12, rax
        add     rsi, rbx
        add     rcx, rsi
        lea     r8, [r9+rcx]
        call    fib
        add     rdx, rbp
        mov     edi, format
        add     rdx, r12
        add     rax, rdx
        add     rax, r11
        add     rax, rbx
        add     rax, r10
        add     rax, rsi
        pop     rbx
        add     rax, r9
        pop     rbp
        pop     r12
        add     rax, rcx
        lea     rsi, [rax+r8*2]
        xor     eax, eax
        jmp     printf
main:
        sub     rsp, 8
        call    foo
        xor     eax, eax
        add     rsp, 8
        ret

format: db  "%lu", 10, 0

Apparently, fib is only called for the parameters 34, 35, 36, 37, 38, 39, 40 and then the results are added up. This is not a fair comparison, because Nim is not doing the same computations as the other languages. I believe that Nim should be moved to the section Optimized. Otherwise, one would have to argue how much unrolling is ok and how much is too much, but there is no good answer for that.

Likewise, C and C++ partially unroll the fib function calls, although not as far. I have not checked the Fortran and Cython binaries, but they are probably doing the same thing.

It was suggested in another issue that the parameter n should be configurable to prevent optimization. However, that is insufficient, since it still allows the compiler to unroll the fib function for small values of n. The only solution seems to be to inspect the generated assembly code.

Here is a fair assembly implementation to compare against. If a program is faster, it is likely cheating in some way or another.

; Run like so:
; nasm -felf64 fib.asm && gcc -no-pie fib.o -o fib && time ./fib
global main
extern printf
section .text

fib:
    mov rax, 1
    dec rdi
    jle done
    push rdi
    call fib
    pop rdi
    push rax
    dec rdi
    call fib
    pop rdx
    add rax, rdx
done:
    ret

main:
    mov rdi, 46
    call fib

    mov rdi, format
    mov rsi, rax
    xor rax, rax
    call printf
    xor rax, rax
    ret

format: db "%lu", 10, 0

Using printf inside fib.c makes it unnecessarily slow

clib printf can be quite slow compared to other methods of writing a long to stdout. This may be unfair when comparing with other languages that use printing routines optimized for ints/longs (e.g. the C# version uses one). Please consider using ultoa and puts instead.
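
For illustration, a minimal sketch in C of what that could look like (ultoa is not part of standard C, so this hand-rolls the digit conversion; the function and buffer names are only illustrative):

#include <stdio.h>

/* Convert an unsigned long to decimal by hand and write it with puts,
   avoiding printf's format-string machinery. */
static void print_ulong(unsigned long n) {
    char buf[21];                      /* enough for a 64-bit value plus NUL */
    char *p = buf + sizeof(buf) - 1;
    *p = '\0';
    do {
        *--p = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    puts(p);                           /* puts appends the trailing newline */
}

int main(void) {
    print_ulong(2971215073UL);
    return 0;
}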

Nim version is out of date and the compiled code is too slow because of it

I decided to see how OCaml would do, and since my computer is old and slow I first got a baseline with fib.nim for Nim (the unoptimized one), and I got very different times:

╰─➤  time nim cpp -d:release fib.nim
...snip...
Hint: operation successful (12272 lines compiled; 2.119 sec total; 13.188MiB peakmem; Release Build) [SuccessX]
nim cpp -d:release fib.nim  2.22s user 0.14s system 107% cpu 2.187 total
╰─➤  time ./fib
2971215073
./fib  0.53s user 0.00s system 99% cpu 0.536 total

So... I'm thinking your nim compiler is out of date, I'm using:

╰─➤  nim --version
Nim Compiler Version 0.19.0 [Linux: amd64]
Compiled at 2018-09-28
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: f6c5c636bb1a1f4e1301ae0ba5a8afecef439132
active boot switches: -d:release

So then I tried crystal:

╰─➤  crystal --version
Crystal 0.26.1 [391785249] (2018-08-27)

LLVM: 4.0.0
Default target: x86_64-unknown-linux-gnu

╰─➤  time crystal build --release fib.cr
crystal build --release fib.cr  0.90s user 0.12s system 126% cpu 0.803 total

╰─➤  time ./fib
2971215073
./fib  7.87s user 0.00s system 99% cpu 7.876 total

╰─➤  calc 7.867/5.687
        ~1.38333040267276244065

So my computer appears to be roughly 38% slower than the one used in the readme, so based on that, ocaml:

╰─➤  cat fib.ml
let rec fib = function
| 0 -> 1
| 1 -> 1
| n -> fib(n-1) + fib(n-2)

let _ = print_int (fib 46)

╰─➤  ocamlopt.opt -version                 
4.08.0+dev0-2018-04-09

╰─➤  time ocamlopt.opt fib.ml -o fib # Compiling
ocamlopt.opt fib.ml -o fib  0.08s user 0.02s system 85% cpu 0.126 total

╰─➤  time ./fib # Running                       
2971215073./fib  11.58s user 0.00s system 99% cpu 11.580 total

╰─➤  calc '11.580/1.38333040267276244065'
        ~8.37110207194610397867

As usual, OCaml has one of the fastest native optimizing compilers around in terms of compile speed, and it looks like this non-memoized, non-statically-calculated (in other words, as generic as it gets) OCaml version ranks roughly between D and Swift without further compiler or code optimizations (there are a few command-line arguments that might make it faster). If curious, the code is cat'd above.

Ruby took ~40 seconds less on a much less performant CPU

I defined fib as described in the readme, but on a Raspberry Pi 4 (4 GB RAM, 4×1.5 GHz CPU); in my irb, it takes 245.66542729600042 seconds to run fib(46).

IRB:

def time(&block)
  t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield(block)
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - t)
end

def fib(n)
  return 1 if n <= 1
  fib(n - 1) + fib(n - 2)
end

time { fib(46) } # => 245.665...

If this ran in 245 seconds, shouldn't it take much less on an Intel i7?

dotnet run defaults to debug mode

The csharp (dotnet core) version should probably be run with dotnet run -c release. That's likely why it's so much slower than mcs/java.

https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-run?tabs=netcore21

It might also be better to use dotnet build -c release, and then run it with dotnet fib.dll (to be comparable to the java/mcs versions)... the fastest dotnet solution would likely be dotnet publish -c release -r osx-x64 and then running ./fib in the output directory.

https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-publish?tabs=netcore21

Much faster version for Julia, using the same algorithm but with proper coding.

The Julia solution hasn't been optimized the way some of the other languages' solutions have.

This code, created by Tamas_Papp, is much faster while using the same algorithm:
https://discourse.julialang.org/t/simple-recursive-fibonaci-example-how-to-make-it-faster/32369/14
Could you please update the fib.jl file with it?

function fibval(::Val{N}) where N
    if N ≤ 1
        1
    else
        fibval(Val{N - 1}()) + fibval(Val{N - 2}())
    end
end

Other solutions are also faster. For example, this one, created by DNF, uses an external package for memoization:

using Memoize  # external package providing the @memoize macro

@memoize function fibmem(n)
    if n <= 1 return 1 end
    return fibmem(n - 1) + fibmem(n - 2)
end

HELP - Why did I get 21 PRs in one day?

Can someone tell me why this project is all of a sudden very active? 21 PRs in one day. I did this for a Crystal presentation as a fun side project, but I don't know if I can maintain it at this pace.

Does someone have time to help work on this?

Pypy benchmark for python

It would be nice if there were a pypy benchmark here, too. Pypy aims to be faster than the normal CPython implementation by JIT compiling the code. In my tests, this benchmark is significantly faster than vanilla CPython:

$ time python fib.py  # CPython implementation
2971215073
_________________________
Executed in  630.05 secs
   usr time  627.57 secs
   sys time    0.21 secs

$ time pypy fib.py  # Pypy
2971215073
_________________________
Executed in   33.43 secs
   usr time   33.31 secs
   sys time    0.05 secs

As shown, pypy is almost 19x faster on my machine!

Python: Another version

Would it be possible to use a slightly different coding style?

def fib(n):
    return 1 if n <= 1 else fib(n - 1) + fib(n - 2)

It makes a noticeable difference on my machine without being much different from the original.

Use proper technique in the language

When we focus on solving problems efficiently, you can use iterators for the JavaScript version. It outperforms the recursive version by a factor of 1000. Many other languages support tail recursion. Ignoring such things means you either don't know the language, or it raises the question of what this benchmark is about: stack handling, GC, compiler performance/optimization?

Generator time: 2.951ms
Recursive time: 76817.010ms
function* fib(n) {
  const isInfinite = n === undefined
  let current = 0
  let next = 1

  while (isInfinite || n--) {
    yield current
    ;[current, next] = [next, current + next]
  }
}

Reference: https://medium.com/javascript-scene/7-surprising-things-i-learned-writing-a-fibonacci-generator-4886a5c87710
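
To make the tail-recursion point concrete, here is a rough sketch in C of an accumulator-passing fib (not the repo's code; a compiler such as GCC can turn the tail call into a loop at -O2):

#include <stdio.h>

/* Accumulator-passing fib: the recursive call is in tail position,
   so an optimizing compiler can rewrite it as a simple loop. */
static unsigned long fib_acc(unsigned long n, unsigned long prev, unsigned long curr) {
    if (n == 0)
        return prev;
    return fib_acc(n - 1, curr, prev + curr);
}

int main(void) {
    printf("%lu\n", fib_acc(47, 0, 1));  /* prints 2971215073 */
    return 0;
}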

Tcl Version not making use of coroutines

Since Tcl 8.6, Tcl has had the non-recursive engine which greatly helps performance in cases like this, where you need a generator. Here is my take on the implementation:

proc fib_gen {} {
  set seq0 0
  set seq1 1
  yield [info coroutine]
  yield 1
  while 1 {
    set seq [expr {$seq0+$seq1}]
    set seq0 $seq1
    set seq1 $seq
    yield $seq
  }
}

coroutine FIB fib_gen
for {set x 0} {$x < 46} {incr x} {
  FIB
}
puts [list [FIB]]

If you would prefer fib be expressed as a function, we can produce a proc that will create the generator and destroy it on close:

proc fib_gen {} {
  set seq0 0
  set seq1 1
  yield [info coroutine]
  yield 1
  while 1 {
    set seq [expr {$seq0+$seq1}]
    set seq0 $seq1
    set seq1 $seq
    yield $seq
  }
}

proc fib {N} {
  set coro  FIB#[incr ::fibcount]
  coroutine $coro fib_gen
  for {set x 0} {$x < $N} {incr x} {
    $coro
  }
  set result [$coro]
  rename $coro {}
  return $result
}

puts [fib 46]

Lua Optimised

lua Lua 5.3.5 Copyright (C) 1994-2018 Lua.org, PUC-Rio

The Lua Optimised benchmark was run using Lua 5.3.

However, I wrote it with LuaJIT in mind, which is actually somewhere between Lua 5.1 and Lua 5.2, and one of the fastest JIT compilers. Most real-world code written for Lua tends to be either written for, or compatible with, LuaJIT, because of the performance.

I would expect a bit of a time difference to result, probably favouring LuaJIT and bumping it a few rungs up the rankings.

Compiler flag for rustc "-C lto=fat" not necessary and slows down execution

Hi,
according to the official docs (https://doc.rust-lang.org/rustc/codegen-options/index.html) the compiler flag "lto=fat" is used for "attempts to perform optimizations across all crates within the dependency graph".
I expected Rust to be 'more comparable' with respect to speed compared to C and C++.
I did some testing and figured out that for this example (calculation of Fibonacci numbers) the execution is slowed down quite a bit by using this compiler flag.

I think it would be fair to delete the part "-C lto=fat" from "rustc -C opt-level=3 -C lto=fat fib.rs" in the "run.rb" file.

optimise for size when compiling rust

optimising for size with rust produces a significant difference, comparable to nim on my machine

$ rustc -O fib.rs && hyperfine ./fib
Benchmark #1: ./fib
  Time (mean ± σ):      6.878 s ±  0.036 s    [User: 6.862 s, System: 0.009 s]
  Range (min … max):    6.822 s …  6.948 s
$ rustc -C opt-level=s fib.rs && hyperfine ./fib
Benchmark #1: ./fib
  Time (mean ± σ):      5.334 s ±  0.035 s    [User: 5.320 s, System: 0.006 s]
  Range (min … max):    5.284 s …  5.386 s
$ nim cpp -d:release fib.nim && hyperfine ./fib
Hint: used config file '/<redacted>/.choosenim/toolchains/nim-0.19.0/config/nim.cfg' [Conf]
Hint: system [Processing]
Hint: fib [Processing]
Hint:  [Link]
Hint: operation successful (12272 lines compiled; 0.214 sec total; 13.191MiB peakmem; Release Build) [SuccessX]
Benchmark #1: ./fib
  Time (mean ± σ):      5.338 s ±  0.030 s    [User: 5.324 s, System: 0.007 s]
  Range (min … max):    5.286 s …  5.385 s

Change all examples to take a parameter

@OvermindDL1 says "I think all the tests should be adjusted to take the integer on the program parameters or an environment variable or something. That would both prevent such optimizations and would allow for easily CI'd consistency checks."
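
As a rough sketch of that idea in C (not the repo's harness; reading n with strtoul and defaulting to 47 are just illustrative choices):

#include <stdio.h>
#include <stdlib.h>

unsigned long fib(unsigned long n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

int main(int argc, char **argv) {
    /* Taking n from the command line keeps the compiler from folding
       fib(47) into a constant or unrolling it at build time. */
    unsigned long n = (argc > 1) ? strtoul(argv[1], NULL, 10) : 47;
    printf("%lu\n", fib(n));
    return 0;
}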

Try latest GNU g++ 8.1

And add constexpr:

#include <iostream>

using namespace std;

constexpr long fib(long x) {
    if (x <= 1) {
        return 1;
    } else {
        return fib(x - 1) + fib(x - 2);
    }
}

int main()
{
    cout << fib(46);
}

This includes startup times

This does not accurately measure the performance, as startup times are included.
Also, Python is compiled to bytecode before execution (which is also included in the startup time).

Swift: disable runtime safety checks

Other languages either don't have those safety checks or have them disabled, so Swift should also have them disabled. On my machine, it cut 33% off the execution time.

$ swiftc -O -g Fib.swift && hyperfine ./Fib
Benchmark #1: ./Fib

  Time (mean ± σ):      9.750 s ±  0.006 s    [User: 9.732 s, System: 0.008 s]
 
  Range (min … max):    9.741 s …  9.758 s

To

$ swiftc -O -Ounchecked -g Fib.swift && hyperfine ./Fib
Benchmark #1: ./Fib

  Time (mean ± σ):      6.407 s ±  0.006 s    [User: 6.393 s, System: 0.007 s]
 
  Range (min … max):    6.398 s …  6.419 s

Add AssemblyScript

Would it be possible to also add AssemblyScript to the benchmark? Thanks

fib.rs has some room for improvement

For starters, I've noticed most programs were compiled with optimization level 3 and some with LTO, while fib.rs is compiled with optimizations for size.

I compiled rosettacode's Rust entries for the Fibonacci Sequence and they were all much faster than fib.rs. I don't know what you consider a fair comparison, so I'll enter them all here rather than submit a pull request.

All programs were compiled with rustc -C opt-level=3 -C lto=fat
Performance was measured with hyperfine

Iterative

fn main() {
    let mut prev = 0;
    let mut curr = 1usize;

    while let Some(n) = curr.checked_add(prev) {
        prev = curr;
        curr = n;
        println!("{}", n);
    }
}
Benchmark: 
  Time (mean ± σ):       0.4 ms ±   0.3 ms    [User: 0.5 ms, System: 0.4 ms]
  Range (min … max):     0.2 ms …   3.9 ms    1278 runs

Recursive

use std::mem;
fn main() {
    fibonacci(0,1);
}
 
fn fibonacci(mut prev: usize, mut curr: usize) {
    mem::swap(&mut prev, &mut curr);
    if let Some(n) = curr.checked_add(prev) {
        println!("{}", n);
        fibonacci(prev, n);
    }
}
Benchmark: 
  Time (mean ± σ):       0.4 ms ±   0.4 ms    [User: 0.5 ms, System: 0.5 ms]
  Range (min … max):     0.2 ms …   8.8 ms    1255 runs

Recursive (with pattern matching)

fn fib(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        n => fib(n - 1) + fib(n - 2),
    }
}

fn main() {
    println!("{}",fib(47))
}
Benchmark: 
  Time (mean ± σ):     14.374 s ±  0.090 s    [User: 14.355 s, System: 0.002 s]
  Range (min … max):   14.232 s … 14.522 s    10 runs

Tail Recursive (with pattern matching)

fn fib_tail_recursive(nth: usize) -> usize {
    fn fib_tail_iter(n: usize, prev_fib: usize, fib: usize) -> usize {
        match n {      
            0 => prev_fib,
            n => fib_tail_iter(n - 1, fib, prev_fib + fib),
        }
    }
    fib_tail_iter(nth, 0, 1)
}

fn main() {
    println!("{}",fib_tail_recursive(47))
}
Benchmark: 
  Time (mean ± σ):       0.5 ms ±   0.4 ms    [User: 0.6 ms, System: 0.4 ms]
  Range (min … max):     0.2 ms …   2.7 ms    1209 runs

Analytic

fn main() {
    for num in fibonacci_gen(48) {
        println!("{}", num);
    }
}
 
fn fibonacci_gen(terms: i32) -> impl Iterator<Item=u64> {
    let sqrt_5 = 5.0f64.sqrt();
    let p  = (1.0 + sqrt_5) / 2.0;
    let q = 1.0/p;
    (1..terms).map(move |n| ((p.powi(n) + q.powi(n)) / sqrt_5 + 0.5) as u64)
}
Benchmark: 
  Time (mean ± σ):       0.4 ms ±   0.3 ms    [User: 0.5 ms, System: 0.4 ms]
  Range (min … max):     0.2 ms …   5.7 ms    1245 runs

Using an Iterator

use std::mem;
 
struct Fib {
    prev: usize,
    curr: usize,
} 
 
impl Fib {
    fn new() -> Self {
        Fib {prev: 0, curr: 1}
    }
}
 
impl Iterator for Fib {
    type Item = usize;
    fn next(&mut self) -> Option<Self::Item>{
        mem::swap(&mut self.curr, &mut self.prev);
        self.curr.checked_add(self.prev).map(|n| { 
            self.curr = n;
            n
        })
    }
}
 
fn main() {
    let mut x = Fib::new();
    for _n in 1..=46 {
        x.next();
    }
    println!("{}", x.curr);
}
Benchmark: 
  Time (mean ± σ):       0.4 ms ±   0.3 ms    [User: 0.5 ms, System: 0.5 ms]
  Range (min … max):     0.2 ms …   2.5 ms    1189 runs
For reference, the one used here as of Nov 14, 2019:
Benchmark: 
  Time (mean ± σ):      4.401 s ±  0.029 s    [User: 4.392 s, System: 0.002 s]
  Range (min … max):    4.381 s …  4.479 s    10 runs

Notes

  • Analytic prints everything that comes before and up to Fibonacci to the 46th position
  • Iterative & Recursive print the first 92 elements in the sequence

Suggestion: align fib() function to DSB boundary (64 byte)

It seems that some benchmarks for statically compiled languages (C, Rust, Fortran, etc.) depend heavily on where the instructions of the fib() function are placed. In modern processors, which have 64-byte DSB boundaries, small loops or recursive calls may fit in a single μop cache line, but it depends on the code alignment.

Here are my experiments for the C benchmark:

Memory address | Total execution time, s
0x1880 | 11.208
0x1890 | 11.323
0x18a0 | 13.320
0x18b0 | 10.769

This alignment issue causes different benchmark results on different platforms, compiler versions, compiler options (e.g. #28), or even with different linkers. If you stick with the fib() function benchmark, you should manually specify the function alignment to get consistent results.
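
With GCC, one way to pin the alignment down looks roughly like this (a sketch; the 64-byte value mirrors the DSB line size discussed above, and compiling with -falign-functions=64 is the flag-based alternative):

#include <stdio.h>

/* Ask GCC to place the first instruction of fib on a 64-byte boundary,
   so results do not shift with unrelated link-order or option changes. */
__attribute__((aligned(64)))
unsigned long fib(unsigned long n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

int main(void) {
    printf("%lu\n", fib(47));
    return 0;
}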

Not a fair assessment

The Go version is using memoization while all the other languages are actually doing the work, which is completely unfair, and all the other languages could easily be made to use the memoization technique.
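
For reference, memoization here just means caching already-computed values so each fib(n) is computed only once; a rough sketch in C (table size and names are illustrative, not the repo's code):

#include <stdio.h>

#define MAX_N 94                      /* fib(93) is the largest value that fits in 64 bits */
static unsigned long memo[MAX_N];     /* 0 means "not computed yet"; n <= 1 is handled directly */

unsigned long fib(unsigned long n) {
    if (n <= 1) return n;             /* assumes n < MAX_N for this sketch */
    if (memo[n] != 0) return memo[n]; /* reuse the cached value */
    memo[n] = fib(n - 1) + fib(n - 2);
    return memo[n];
}

int main(void) {
    printf("%lu\n", fib(47));
    return 0;
}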

Compile Nim with different target

After some quick testing it turns out that switching from the C target to C++ makes the Nim version run much faster. I propose changing the command to this:

nim cpp -d:release fib.nim

Elixir code issue

Assuming you want to keep the invalid non-TCO recursion of fib, then these issues remain:

  1. It's not being compiled with the compiler made for numerical work; the default compiler is built for IO work. You need to choose the correct compiler for the correct task.
  2. Scripts are being executed as they are compiled, not being compiled ahead of time, elixir is not designed to be used as a scripting language, it is designed for long-running systems.
  3. VM start-up time is also being measured; however, this will be ignored as it is also included for the other VM languages, which puts them not on par with the pre-compiled languages.
  4. And as a side-note, such a test as Fib doesn't test multi-core parallel work, which languages, for example, such as Ruby, Python, Crystal, and so forth do very poorly at. Consequently BEAM VM languages (like Elixir and Erlang) are not designed for numerical work at all and are designed for that work to be slaved out to another language.

To fix issue 1 you need to specify which JIT to use for the given module (file):

╰─➤  # Normal timing, yes my computer is anciently slow:
╰─➤  time elixir Fib.exs
2971215073
elixir Fib.exs  115.85s user 0.12s system 100% cpu 1:55.89 total

╰─➤  # Specify to use the numerical JIT (HiPE) for this single module (file):
╰─➤  time ERL_COMPILER_OPTIONS='[native,{hipe, [o3]}]' elixir Fib.exs
2971215073
ERL_COMPILER_OPTIONS='[native,{hipe, [o3]}]' elixir Fib.exs  27.66s user 0.14s system 100% cpu 27.584 total

╰─➤  calc 27.66/115.85
        ~0.23875701337936987484

So it takes roughly 24% of the time of the version that is not using the numerical JIT.

The second issue and onwards are less of a problem; the first is the big speed change, the rest are 'usage' issues.

However, just changing which JIT is used changes Elixir's result in the readme; following the same ~24% change:

╰─➤  calc 69.101*0.23875701337936987484
        ~16.49834838152783772132

Puts it ahead of even NodeJS (I didn't actually expect that...).

Bonus issue: C++ is using the streaming operator to output the result along with a flush, which means the program will block on that call until the flush is resolved and confirmed by the operating system. Most of the other languages are not doing this, so it artificially slows C++ down by an amount that depends on the terminal and OS being used. Also, modern C++ would generally use constexpr for such functions. However, passing the argument in to the program would prevent such optimizations anyway, and should be done for all languages, since hard-coding the call can enable unintended optimizations that would not occur in normal code flow.

Average time

You are running the calculation only once in every language. The result would be more reliable if it were averaged over multiple calculations, or just a sum of multiple calculations. A good example of this is optimised Nim vs. C++: in your benchmark Nim is twice as fast as C++, whereas after adding a 1,000,000-iteration loop, the resulting difference is that C++ is 0.01s faster (with output piped to /dev/null).
benchmark

Differences between C# and Java versions

  1. In C# you use the uint type, which is a 4-byte integer; in Java you use long, which is an 8-byte integer type.
  2. The Java version uses System.out.print, which does NOT print a newline at the end of the line; the C# version uses Console.WriteLine, which does.
  3. In the C# code, using [MethodImpl(MethodImplOptions.AggressiveInlining)] is questionable: first, because using this attribute is not considered best practice; second, because what it actually does depends on the runtime environment version.

Wrong Categorization For Elixir Language

Elixir is a dynamically typed language. Also, elixir and elixirc always compile and execute modules. Next, the resulting files are run on Bogdan/Björn's Erlang Abstract Machine (BEAM). The BEAM is a virtual machine (VM). Here's a good reference for a deep dive:

https://medium.com/@fxn/how-does-elixir-compile-execute-code-c1b36c9ec8cf

Well, thanks again for the repository because it's always good to see actual performance benchmarks for Crystal Language.

Why no-prepopulate-passes in Rust?

Currently Rust is compiled with the no-prepopulate-passes option.

This prevents LLVM from running optimization passes which the compilers of the other languages probably do run, so it seems an unfair disadvantage for Rust.

I would suggest to just do
rustc -O fib.rs

or to keep it in line with the options for C/C++ do:
rustc -C opt-level=3 fib.rs

(I did not see any performance difference between those two above options)

I am not submitting the PR because I think the change should come with the new performance numbers which I can not run as my hardware setup is different from yours.

Does this still work?

I tried to add WebAssembly, but I could not get the docker image to build.

On the first run, I got:

Step 7/109 : RUN curl -fsS https://dlang.org/install.sh | bash -s ldc
 ---> Running in eb55bdfecf13
main: line 178: USERPROFILE: unbound variable
The command '/bin/sh -c curl -fsS https://dlang.org/install.sh | bash -s ldc' returned a non-zero code: 1

On the second run, I uncommented most languages and got:

Step 13/25 : RUN DEBIAN_FRONTEND=noninteractive apt-get install -qq -y             cython             fp-compiler             gfortran             ocaml             perl6             sbcl             tcl             guile-2.2             luajit
 ---> Running in eb2c7c851b44
E: Unable to locate package cython
E: Unable to locate package fp-compiler                                                                                                                                                                      
E: Package 'gfortran' has no installation candidate                                                                                                                                                          
E: Unable to locate package ocaml                                                                                                                                                                            
E: Unable to locate package perl6                                                                                                                                                                            
E: Unable to locate package sbcl                                                                                                                                                                             
E: Unable to locate package tcl                                                                                                                                                                              
E: Unable to locate package guile-2.2                                                                                                                                                                        
E: Couldn't find any package by glob 'guile-2.2'                                                                                                                                                             
E: Couldn't find any package by regex 'guile-2.2'                                                                                                                                                            
E: Unable to locate package luajit                                                                                                                                                                           
The command '/bin/sh -c DEBIAN_FRONTEND=noninteractive apt-get install -qq -y             cython             fp-compiler             gfortran             ocaml             perl6             sbcl             tcl             guile-2.2             luajit' returned a non-zero code: 10

I then commented out all languages except ruby and node, but the result for my WebAssembly benchmark is 0.000 seconds when run.sh tries to execute it, with no error messages given. It already fails at the compile step with wat2wasm fib.wat. Both wat2wasm fib.wat and node fib_wasm.js run fine when executed by hand, i.e. not with run.sh.

Any advice?

Here is the code to install wat2wasm:

https://github.com/983/fib/blob/b850b849ed53d932adf6b7ed67c61b54ae9d6c77/Dockerfile#L106

Here is the code to compile and run the benchmark:

https://github.com/983/fib/blob/b850b849ed53d932adf6b7ed67c61b54ae9d6c77/run.sh#L52

Some miscellaneous notes:

  • Is this asdf-vm thing worth it? There are a lot of warnings about gpg keys which could not be checked. I also tried installing ruby and node from binaries and got much faster docker build times.
  • The PATH command can be split up by escaping the $ sign with a backslash, so more than one export can be declared without messing things up, and the whole thing becomes more modular.
