# Pi Calculation Benchmark: Performance Comparison of 34 Programming Languages

## Overview

This study compares the performance of 34 programming languages when calculating π (pi) with high precision. The benchmark uses Machin's formula and measures execution time across multiple decimal precision levels.

## Test Environment

**Hardware:**
- **Model:** MacBook Neo (Mac17,5)
- **Processor:** Apple A18 Pro
  - 6 cores: 2 performance cores + 4 efficiency cores
  - Architecture: ARM64
- **Memory:** 8 GB RAM
- **Operating System:** macOS (Darwin)

**Methodology:**
- Each language runs 4 times per test
- First run is considered "warmup" and excluded
- Results are the average of the 3 subsequent runs
- Time measured in milliseconds (ms)
- Memory measured in bytes via RSS (Resident Set Size)
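
The run-and-average procedure above can be sketched as follows. This is a minimal illustration, not the repository's harness: the function names `time_command_ms`, `average_after_warmup`, and `benchmark` are hypothetical, and the example command simply starts an empty Python interpreter.

```python
import subprocess
import sys
import time


def time_command_ms(cmd: list[str]) -> float:
    """Run `cmd` once and return its wall-clock time in milliseconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) * 1000.0


def average_after_warmup(timings: list[float], warmup: int = 1) -> float:
    """Drop the first `warmup` timings and average the rest."""
    measured = timings[warmup:]
    if not measured:
        raise ValueError("need at least one run after the warmup runs")
    return sum(measured) / len(measured)


def benchmark(cmd: list[str], runs: int = 4, warmup: int = 1) -> float:
    """Time `cmd` `runs` times; report the mean of the non-warmup runs."""
    timings = [time_command_ms(cmd) for _ in range(runs)]
    return average_after_warmup(timings, warmup)


if __name__ == "__main__":
    # Hypothetical target: an interpreter that starts and exits immediately.
    print(f"{benchmark([sys.executable, '-c', 'pass']):.1f} ms")
```

Note that each run is a separate process, so the excluded first run mostly warms file-system and loader caches rather than any in-process JIT state.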

## Method: Machin's Formula

All implementations use Machin's formula for π calculation:

```
π/4 = 4·arctan(1/5) - arctan(1/239)
```

Where arctan(x) is calculated using the Taylor series:

```
arctan(x) = x - x³/3 + x⁵/5 - x⁷/7 + ...
```

**Advantages of this method:**
1. Fast convergence (few terms required)
2. Simple implementation
3. High precision possible
4. Only integer arithmetic required
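
As an illustration of the integer-only approach, here is a minimal Python sketch. It is not taken from any of the benchmarked implementations: the names `arctan_inv` and `machin_pi_digits` and the choice of 10 guard digits are assumptions. All values are integers scaled by a power of ten, and the result is truncated rather than rounded.

```python
def arctan_inv(x: int, one: int) -> int:
    """Return arctan(1/x) scaled by `one`, via the alternating Taylor series."""
    power = one // x              # (1/x)^(2k+1) scaled by `one`, for k = 0
    total = power
    x_squared = x * x
    k = 1
    while power:
        power //= x_squared       # advance to the next odd power of 1/x
        term = power // (2 * k + 1)
        total += -term if k % 2 else term
        k += 1
    return total


def machin_pi_digits(decimals: int, guard: int = 10) -> str:
    """Return pi truncated to `decimals` decimal places as a string."""
    one = 10 ** (decimals + guard)    # fixed-point scale with guard digits
    pi_scaled = 4 * (4 * arctan_inv(5, one) - arctan_inv(239, one))
    digits = str(pi_scaled // 10 ** guard)
    return digits[0] + "." + digits[1:] if decimals else digits


print(machin_pi_digits(10))  # 3.1415926535
```

The guard digits absorb the truncation error of each integer division, so the digits that are kept are not affected by it.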

## Performance Reports

Detailed performance reports are available for each decimal precision level:

- **[1 Decimal](reports/1_decimals.md)** - Minimal precision
- **[2 Decimals](reports/2_decimals.md)** - Low precision
- **[5 Decimals](reports/5_decimals.md)** - Medium precision
- **[10 Decimals](reports/10_decimals.md)** - Standard precision
- **[100 Decimals](reports/100_decimals.md)** - High precision
- **[1000 Decimals](reports/1000_decimals.md)** - Very high precision
- **[2000 Decimals](reports/2000_decimals.md)** - Extreme precision
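
To give a feel for how the work grows across these precision levels, the sketch below counts how many terms of each arctan series are at least as large as 10^-d. The helper name `terms_needed` is hypothetical, and the count ignores guard digits, so treat it as an estimate rather than the exact loop count of any implementation.

```python
def terms_needed(decimals: int, x: int) -> int:
    """Count series terms of arctan(1/x) with magnitude >= 10**-decimals.

    Term k has magnitude 1 / ((2k+1) * x**(2k+1)), so it still matters
    while (2k+1) * x**(2k+1) <= 10**decimals.
    """
    threshold = 10 ** decimals
    k = 0
    while (2 * k + 1) * x ** (2 * k + 1) <= threshold:
        k += 1
    return k


for d in (1, 2, 5, 10, 100, 1000, 2000):
    print(d, terms_needed(d, 5), terms_needed(d, 239))
```

At 100 decimals this gives 70 terms for arctan(1/5) and 21 for arctan(1/239), which suggests that at low precision levels process startup is likely to outweigh the calculation itself.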

## Summary Results (100 Decimals)

### All Languages Performance

| Rank | Language | Time (ms) | Memory (bytes) | Type |
|------|-----------|-----------|----------------|------|
| 1 | Odin | 19 | 1,725,781 | Compiled |
| 2 | Assembly | 20 | 1,409,024 | Compiled |
| 3 | Nim | 20 | 1,572,864 | Compiled |
| 4 | Rust | 20 | 1,682,096 | Compiled |
| 5 | Lua | 20 | 2,086,229 | Interpreted |
| 6 | Objective-C | 20 | 6,045,696 | Compiled |
| 7 | Swift | 20 | 5,947,392 | Compiled |
| 8 | Zig | 22 | 2,981,888 | Compiled |
| 9 | C++ | 23 | 1,490,944 | Compiled |
| 10 | Go | 24 | 3,932,160 | Compiled |
| 11 | C | 25 | 1,671,168 | Compiled |
| 12 | D | 30 | 2,457,600 | Compiled |
| 13 | Bash | 30 | 2,058,922 | Interpreted |
| 14 | Fortran | 31 | 1,802,240 | Compiled |
| 15 | Crystal | 32 | 3,244,032 | Compiled |
| 16 | Dart | 34 | 14,488,917 | JIT |
| 17 | Haskell | 40 | 11,894,784 | Compiled |
| 18 | Java | 46 | 43,078,997 | JIT |
| 19 | Python | 47 | 9,693,866 | Interpreted |
| 20 | Perl | 47 | 12,528,298 | Interpreted |
| 21 | Brainfuck | 50 | 9,267,882 | Interpreted |
| 22 | Kotlin | 60 | 45,247,146 | JIT |
| 23 | C# | 66 | 41,473,365 | JIT |
| 24 | PHP | 68 | 26,487,466 | Interpreted |
| 25 | Ruby | 79 | 28,824,917 | Interpreted |
| 26 | JavaScript | 89 | 44,848,469 | JIT |
| 27 | Julia | 157 | 235,885,909 | JIT |
| 28 | R | 163 | 90,947,584 | Interpreted |
| 29 | Erlang | 176 | 77,359,786 | Interpreted |
| 30 | Scala | 344 | 55,470,762 | JIT |
| 31 | Elixir | 401 | 89,205,418 | Interpreted |
| 32 | TypeScript | 931 | 218,868,394 | JIT |

### Language Categories

**Compiled Languages (Native Code):**
- Fastest execution (19-40 ms at 100 decimals)
- Lowest memory usage (1.4-11.9 MB)
- Consistent performance across decimal levels

**JIT-Compiled Languages:**
- Moderate to slow execution (34-931 ms at 100 decimals)
- Highest memory usage (14.5-235.9 MB)
- Good performance after warmup

**Interpreted Languages:**
- Variable execution time (20-401 ms at 100 decimals)
- Widely varying memory usage (2.1-90.9 MB)
- Performance varies widely
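
The per-category ranges can be recomputed directly from the 100-decimal summary table. The sketch below copies the table rows as data and groups them by the Type column; the helper name `summarize` is hypothetical, and memory is converted to decimal megabytes (1 MB = 1,000,000 bytes).

```python
# (language, time in ms, memory in bytes, type), copied from the summary table
ROWS = [
    ("Odin", 19, 1_725_781, "Compiled"), ("Assembly", 20, 1_409_024, "Compiled"),
    ("Nim", 20, 1_572_864, "Compiled"), ("Rust", 20, 1_682_096, "Compiled"),
    ("Lua", 20, 2_086_229, "Interpreted"), ("Objective-C", 20, 6_045_696, "Compiled"),
    ("Swift", 20, 5_947_392, "Compiled"), ("Zig", 22, 2_981_888, "Compiled"),
    ("C++", 23, 1_490_944, "Compiled"), ("Go", 24, 3_932_160, "Compiled"),
    ("C", 25, 1_671_168, "Compiled"), ("D", 30, 2_457_600, "Compiled"),
    ("Bash", 30, 2_058_922, "Interpreted"), ("Fortran", 31, 1_802_240, "Compiled"),
    ("Crystal", 32, 3_244_032, "Compiled"), ("Dart", 34, 14_488_917, "JIT"),
    ("Haskell", 40, 11_894_784, "Compiled"), ("Java", 46, 43_078_997, "JIT"),
    ("Python", 47, 9_693_866, "Interpreted"), ("Perl", 47, 12_528_298, "Interpreted"),
    ("Brainfuck", 50, 9_267_882, "Interpreted"), ("Kotlin", 60, 45_247_146, "JIT"),
    ("C#", 66, 41_473_365, "JIT"), ("PHP", 68, 26_487_466, "Interpreted"),
    ("Ruby", 79, 28_824_917, "Interpreted"), ("JavaScript", 89, 44_848_469, "JIT"),
    ("Julia", 157, 235_885_909, "JIT"), ("R", 163, 90_947_584, "Interpreted"),
    ("Erlang", 176, 77_359_786, "Interpreted"), ("Scala", 344, 55_470_762, "JIT"),
    ("Elixir", 401, 89_205_418, "Interpreted"), ("TypeScript", 931, 218_868_394, "JIT"),
]


def summarize(rows):
    """Return {type: (min ms, max ms, min MB, max MB)} for each language type."""
    grouped = {}
    for _name, ms, mem_bytes, kind in rows:
        times, mems = grouped.setdefault(kind, ([], []))
        times.append(ms)
        mems.append(mem_bytes)
    return {
        kind: (min(t), max(t), round(min(m) / 1e6, 1), round(max(m) / 1e6, 1))
        for kind, (t, m) in grouped.items()
    }


for kind, (t_lo, t_hi, m_lo, m_hi) in summarize(ROWS).items():
    print(f"{kind}: {t_lo}-{t_hi} ms, {m_lo}-{m_hi} MB")
```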

## Key Findings

1. **Compiled languages dominate**: Assembly, Rust, Nim, Odin, C, C++ all execute in 19-25 ms
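2. **Memory efficiency varies**:
   - Most compiled languages: 1.4-6.0 MB (Assembly lowest at 1.4 MB; Haskell is the outlier at 11.9 MB)
   - Go with runtime: 3.9 MB
   - JVM languages (Java, Kotlin, Scala): 43-55 MB; C# on the CLR: 41.5 MB
   - Interpreted languages: 2.1-90.9 MB (Lua and Bash lowest at 2.1 MB; R highest at 90.9 MB)
   - Julia: 236 MB (JIT + scientific libraries); TypeScript: 219 MB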
3. **Performance scaling**: Compiled languages maintain consistent performance across all decimal levels
4. **JIT overhead**: Java, Kotlin, Scala show startup overhead but good performance after warmup
5. **Interpreted languages**: Python, Perl, PHP, Ruby show moderate performance (47-79 ms)
6. **Memory fix applied**: All languages now show correct memory values using `/usr/bin/time -l` on macOS
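
For the memory measurement in finding 6, the peak RSS has to be extracted from the text that `/usr/bin/time -l` writes to stderr. The sketch below shows one way to do that; the function name `parse_max_rss_bytes` is hypothetical, and the sample text is an illustrative excerpt built around the Assembly value from the table, not captured output. On macOS this figure is in bytes, whereas GNU `time -v` on Linux reports kilobytes.

```python
import re


def parse_max_rss_bytes(time_output: str) -> int:
    """Extract the 'maximum resident set size' value from `time -l` output."""
    match = re.search(
        r"^\s*(\d+)\s+maximum resident set size", time_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'maximum resident set size' line found")
    return int(match.group(1))


# Illustrative excerpt only; real output contains more lines.
sample = """        0.02 real         0.00 user         0.00 sys
             1409024  maximum resident set size
"""
print(parse_max_rss_bytes(sample))  # 1409024
```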

## Performance Analysis by Language Type

### Compiled Languages (Native Code)
- **Fastest execution**: 19-32 ms (Haskell, at 40 ms, is covered under Functional Languages)
- **Minimal memory**: 1.4-6.0 MB
- **Best performers**: Odin, Assembly, Nim, Rust, C++, C
- **Why fast**: Direct machine code, little or no runtime overhead, and in most cases no garbage collection

### JIT-Compiled Languages
- **Moderate execution**: 34-931 ms
- **Higher memory**: 14-236 MB
- **Best performers**: Dart (34 ms), Java (46 ms), Kotlin (60 ms)
- **Why moderate**: JIT compilation overhead, runtime initialization

### Interpreted Languages
- **Variable execution**: 20-401 ms
- **Variable memory**: 2.1-90.9 MB
- **Best performers**: Lua (20 ms), Python (47 ms), Perl (47 ms)
- **Why variable**: Interpretation overhead, dynamic typing

### Functional Languages
- **Mixed performance**: 40-401 ms
- **Higher memory**: 12-90 MB
- **Best performers**: Haskell (40 ms), Erlang (176 ms)
- **Why mixed**: Functional paradigms, immutability, pattern matching

## Detailed Language Analysis

### Top Performers (Rank 1-10)

#### 1. Odin (19 ms, 1.7 MB) - Fastest
**Why fastest:**
- **Modern systems language** - Designed for performance
- **No GC** - Manual memory management
- **Ahead-of-time compilation** - Compiles to native code through an LLVM backend, with no bytecode or VM at run time
- **Minimal runtime** - Small standard library
- **Optimized for speed** - Built for game development

**Implementation:**
- Uses Machin's formula with custom BigInt
- Direct compilation to machine code
- Very low overhead (1.7 MB)

#### 2. Assembly (20 ms, 1.4 MB) - Most Efficient
**Why efficient:**
- **Direct machine code** - No compiler overhead, optimal instructions
- **No runtime** - Only necessary code, no overhead
- **Optimal memory** - Minimal allocations, precise control
- **No abstractions** - Direct hardware access
- **Manual optimization** - Every instruction optimized by hand

**Implementation:**
- Uses Machin's formula with manual BigInt operations
- Direct register manipulation for arithmetic
- No function call overhead
- Minimal memory footprint (1.4 MB)

#### 3. Nim (20 ms, 1.6 MB) - Fast
**Why fast:**
- **Compiles to C** - Leverages C compiler optimizations
- **Minimal runtime** - Small standard library overhead
- **Efficient GC** - Optional garbage collector, can be disabled
- **Direct compilation** - No intermediate bytecode
- **Optimized for speed** - Designed for performance

**Implementation:**
- Compiles Nim code to C, then to machine code
- Uses Machin's formula with efficient BigInt
- Minimal runtime overhead (1.6 MB)

#### 4. Rust (20 ms, 1.7 MB) - Fast
**Why fast:**
- **Optimized BigInt library** - `num-bigint` crate with years of optimization
- **Mature compiler** - LLVM backend with aggressive optimizations
- **Zero-cost abstractions** - High-level code compiles to efficient machine code
- **No garbage collection** - Ownership-based memory management with compile-time safety guarantees
- **Optimized allocator** - Efficient memory allocation

**Implementation:**
```rust
use num_bigint::BigUint;  // Optimized library
// Uses Machin's formula with BigUint operations
// LLVM optimizes to near-assembly performance
```

#### 5. Lua (20 ms, 2.1 MB) - Fast for Interpreted
**Why fast:**
- **Lightweight VM** - Minimal interpreter overhead
- **Small runtime** - Designed for embedding
- **Efficient tables** - Optimized data structures
- **JIT available** - LuaJIT can compile to machine code
- **Simple design** - Minimal language complexity

**Implementation:**
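- Uses Lua's number type (double precision)
- Machin's formula with floating-point arithmetic, which carries only about 15-17 significant decimal digits, so this result is not directly comparable with the arbitrary-precision implementations at 100 decimals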
- Very small runtime (2.1 MB)
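
The precision limit of a double-based implementation can be seen with a short sketch. It uses Python floats, which are IEEE 754 doubles, and assumes Lua's floating-point numbers are doubles too, as in a default build. The function name `machin_pi_float` is hypothetical.

```python
import math
import sys


def machin_pi_float() -> float:
    """Evaluate Machin's formula in double precision."""
    return 4.0 * (4.0 * math.atan(1.0 / 5.0) - math.atan(1.0 / 239.0))


approx = machin_pi_float()
print(f"{approx:.20f}")            # digits beyond roughly the 16th are noise
print(abs(approx - math.pi))       # error on the order of machine epsilon
print(sys.float_info.dig)          # decimal digits a double reliably holds
```

However many decimals are requested, a double cannot hold more than this, so the timing mainly reflects interpreter startup and a handful of floating-point operations.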

#### 6. Objective-C (20 ms, 6.0 MB) - Moderate
**Why moderate:**
- **Runtime overhead** - Objective-C runtime, message passing
- **ARC** - Automatic Reference Counting overhead
- **Foundation framework** - Large standard library
- **Dynamic dispatch** - Method calls have overhead
- **Good optimization** - LLVM compiler optimizations

**Implementation:**
- Uses Foundation framework for BigInt
- Machin's formula with runtime overhead
- Larger memory footprint (6.0 MB)

#### 7. Swift (20 ms, 5.9 MB) - Moderate
**Why moderate:**
- **ARC overhead** - Automatic Reference Counting
- **Swift runtime** - Large standard library
- **Safety checks** - Bounds checking, overflow checks
- **Dynamic features** - Protocol witness tables
- **Good optimization** - LLVM compiler

**Implementation:**
- Uses Foundation for BigInt
- Machin's formula with ARC overhead
- Large runtime (5.9 MB)

#### 8. Zig (22 ms, 2.98 MB) - Slower than Rust
**Why slower:**
- **Custom BigInt** - Manual implementation vs optimized library
- **Newer compiler** - Younger toolchain than Rust's, although release builds also use LLVM
- **More memory** - Overhead in standard library
- **Less mature** - Younger language (2016 vs 2015)
- **Manual memory** - More allocations than needed

**Implementation:**
```zig
// Custom BigInt implementation
const BigInt = struct {
    digits: []u32,
    len: usize,
    // Manual add, sub, mul, div operations
};
```
- Uses Machin's formula with custom BigInt
- More operations per calculation
- Less optimized than Rust's `num-bigint`
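
To make the "manual operations" concrete, here is a language-neutral sketch in Python of the kind of limb arithmetic such a custom BigInt performs. It is not the Zig implementation: the helper names are hypothetical, limbs are stored least-significant first, and only addition and division by a small integer are shown, since those are typical of what Machin's formula needs.

```python
BASE = 2 ** 32  # each limb holds 32 bits, like a `u32` digit


def to_limbs(n: int) -> list[int]:
    """Split a non-negative integer into base-2**32 limbs, lowest first."""
    limbs = []
    while n:
        limbs.append(n % BASE)
        n //= BASE
    return limbs or [0]


def from_limbs(limbs: list[int]) -> int:
    """Recombine limbs into a Python integer (used here only for checking)."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


def add(a: list[int], b: list[int]) -> list[int]:
    """Schoolbook addition with carry propagation."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s % BASE)
        carry = s // BASE
    if carry:
        result.append(carry)
    return result


def div_small(a: list[int], d: int) -> tuple[list[int], int]:
    """Short division by a small positive divisor; returns (quotient, remainder)."""
    quotient, rem = [0] * len(a), 0
    for i in reversed(range(len(a))):      # from the most significant limb down
        cur = rem * BASE + a[i]
        quotient[i] = cur // d
        rem = cur % d
    while len(quotient) > 1 and quotient[-1] == 0:
        quotient.pop()                     # strip leading zero limbs
    return quotient, rem
```

In a fixed-width language, `cur` must fit in a 64-bit integer, which holds as long as the divisor is below 2^32; that covers 25, 239² = 57121, and the small odd denominators of the series.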

#### 9. C++ (23 ms, 1.5 MB) - Fast
**Why fast:**
- **Mature compiler** - LLVM/GCC with aggressive optimizations
- **Template metaprogramming** - Compile-time optimizations
- **STL optimizations** - Highly optimized standard library
- **Manual memory** - No garbage collection overhead
- **Inline functions** - Zero-overhead abstractions

**Implementation:**
- Uses custom BigInt or boost::multiprecision
- Machin's formula with optimized operations
- Minimal overhead (1.5 MB)

#### 10. Go (24 ms, 3.9 MB) - Moderate
**Why moderate:**
- **Runtime overhead** - Garbage collector, goroutines
- **Larger binary** - Runtime included in binary
- **Safety checks** - Bounds checking, GC overhead
- **Not as optimized** - Younger than C/C++
- **Good balance** - Performance + safety + concurrency

**Implementation:**
- Uses `math/big` package for BigInt
- Machin's formula with GC overhead
- Runtime included (3.9 MB)

### Middle Performers (Rank 11-20)

#### 11. C (25 ms, 1.7 MB) - Fast
**Why fast:**
- **Mature compiler** - Decades of optimization
- **Direct compilation** - No runtime overhead
- **Manual memory** - Precise control over allocations
- **Optimized libraries** - GMP library for BigInt
- **Low-level access** - Direct hardware control

**Implementation:**
- Uses GMP library for arbitrary precision
- Machin's formula with optimized arithmetic
- Minimal runtime (1.7 MB)

#### 12. D (30 ms, 2.5 MB) - Moderate
**Why moderate:**
- **GC overhead** - Garbage collector included
- **Runtime** - D runtime library
- **Good optimization** - LLVM/GCC backend
- **Multiple compilers** - DMD (reference), LDC (LLVM), and GDC (GCC) all produce native code
- **Balance** - Performance + safety

**Implementation:**
- Uses std.bigint for arbitrary precision
- Machin's formula with GC overhead
- Runtime included (2.5 MB)

#### 13. Bash (30 ms, 2.1 MB) - Moderate
**Why moderate:**
- **Shell overhead** - Process creation, pipes
- **External commands** - Uses `bc` for calculations
- **Interpretation** - Line-by-line execution
- **Process spawning** - Each command is a new process
- **Good for scripting** - Not designed for computation

**Implementation:**
- Uses `bc` command for arbitrary precision
- Machin's formula with shell overhead
- Process spawning overhead (2.1 MB)

#### 14. Fortran (31 ms, 1.8 MB) - Moderate
**Why moderate:**
- **Array operations** - Optimized for numerical computing
- **Mature compiler** - Decades of optimization
- **No GC** - Manual memory management
- **Scientific focus** - Designed for numerical work
- **Good optimization** - LLVM/GCC backend

**Implementation:**
- Uses intrinsic functions for precision
- Machin's formula with numerical optimizations
- Minimal overhead (1.8 MB)

#### 15. Crystal (32 ms, 3.2 MB) - Moderate
**Why moderate:**
- **Ruby-like syntax** - Compiles to machine code
- **GC overhead** - Garbage collector included
- **Runtime** - Small runtime library
- **Type inference** - Compile-time type checking
- **Good performance** - Near-C speed

**Implementation:**
- Uses built-in BigInt support
- Machin's formula with GC overhead
- Runtime included (3.2 MB)

#### 16. Dart (34 ms, 14.5 MB) - Moderate
**Why moderate:**
- **VM overhead** - Dart virtual machine
- **JIT compilation** - Compiles to machine code
- **GC overhead** - Garbage collector
- **Moderate runtime** - Smaller than JVM
- **Good optimization** - Designed for web/mobile

**Implementation:**
- Uses `dart:math` for precision
- Machin's formula with VM overhead
- Moderate runtime (14.5 MB)

#### 17. Haskell (40 ms, 11.9 MB) - Moderate
**Why moderate:**
- **Lazy evaluation** - Only computes what's needed
- **GC overhead** - Garbage collector
- **Functional overhead** - Immutability, pattern matching
- **Runtime** - GHC runtime system
- **Good optimization** - GHC's optimizer (an LLVM backend is optional)

**Implementation:**
- Uses `Integer` type for precision
- Machin's formula with lazy evaluation
- Moderate runtime (11.9 MB)

#### 18. Java (46 ms, 43.1 MB) - Moderate
**Why moderate:**
- **JVM startup** - Virtual machine initialization
- **JIT compilation** - Compiles bytecode to machine code
- **Large runtime** - JVM includes extensive libraries
- **GC overhead** - Garbage collector
- **Good optimization** - HotSpot JIT is mature

**Implementation:**
- Uses `BigInteger` class for precision
- Machin's formula with JVM overhead
- Large runtime (43.1 MB)

#### 19. Python (47 ms, 9.7 MB) - Moderate
**Why moderate:**
- **Interpretation** - Bytecode execution
- **Dynamic typing** - Runtime type checking
- **GIL** - Global Interpreter Lock
- **Large runtime** - Comprehensive standard library
- **Good optimization** - CPython is well-optimized

**Implementation:**
- Uses `decimal` module for precision
- Machin's formula with interpretation overhead
- Moderate runtime (9.7 MB)
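
A `decimal`-based version of the calculation might look like the sketch below. This is an assumption about the general shape, not the repository's code: the names `arctan_inv` and `machin_pi`, the 15 extra digits of working precision, and the cutoff `eps` are all illustrative choices.

```python
from decimal import Decimal, ROUND_DOWN, getcontext


def arctan_inv(x: int, eps: Decimal) -> Decimal:
    """Sum the Taylor series for arctan(1/x) until terms drop below `eps`."""
    power = Decimal(1) / x        # (1/x)^(2k+1), starting at k = 0
    total = power
    x_squared = x * x
    k = 1
    while True:
        power /= x_squared
        term = power / (2 * k + 1)
        if term < eps:
            break
        total += -term if k % 2 else term
        k += 1
    return total


def machin_pi(decimals: int) -> Decimal:
    """Return pi truncated to `decimals` places using the decimal module."""
    getcontext().prec = decimals + 15          # working precision with slack
    eps = Decimal(10) ** -(decimals + 10)      # smallest term worth adding
    pi = 4 * (4 * arctan_inv(5, eps) - arctan_inv(239, eps))
    return pi.quantize(Decimal(10) ** -decimals, rounding=ROUND_DOWN)


print(machin_pi(50))
```

Python's built-in `int` is also arbitrary precision, so a scaled-integer variant is possible as well.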

#### 20. Perl (47 ms, 12.5 MB) - Moderate
**Why moderate:**
- **Interpretation overhead** - Source is compiled to an internal op tree, which is then interpreted
- **Dynamic typing** - Type checking at runtime
- **Large runtime** - Comprehensive standard library
- **Regex engine** - Powerful, but adds to the runtime footprint
- **Mature** - Years of optimization

**Implementation:**
- Uses `Math::BigInt` module
- Machin's formula with interpretation overhead
- Large runtime (12.5 MB)

### Lower Performers (Rank 21-32)

#### 21. Brainfuck (50 ms, 9.3 MB) - Moderate
**Why moderate:**
- **Minimal language** - Only 8 instructions
- **Interpreter overhead** - Must interpret each instruction
- **Simple operations** - No complex operations
- **Small runtime** - Minimal interpreter
- **Educational** - Not designed for performance

**Implementation:**
- Uses custom BigInt implementation
- Machin's formula with interpretation overhead
- Moderate runtime (9.3 MB)

#### 22. Kotlin (60 ms, 45.2 MB) - Moderate
**Why moderate:**
- **JVM overhead** - Same as Java
- **Kotlin runtime** - Additional Kotlin libraries
- **JIT compilation** - Kotlin compiles to JVM bytecode, which the JVM then JIT-compiles to machine code
- **GC overhead** - Garbage collector
- **Good optimization** - Leverages JVM optimizations

**Implementation:**
- Uses `BigInteger` from Java stdlib
- Machin's formula with JVM + Kotlin overhead
- Large runtime (45.2 MB)

#### 23. C# (66 ms, 41.5 MB) - Moderate
**Why moderate:**
- **CLR overhead** - Common Language Runtime
- **JIT compilation** - Compiles IL to machine code
- **Large runtime** - .NET framework included
- **GC overhead** - Garbage collector
- **Good optimization** - Mature JIT compiler

**Implementation:**
- Uses `System.Numerics.BigInteger`
- Machin's formula with CLR overhead
- Large runtime (41.5 MB)

#### 24. PHP (68 ms, 26.5 MB) - Moderate
**Why moderate:**
- **Interpretation** - Bytecode execution
- **Dynamic typing** - Runtime type checking
- **Large runtime** - Web-focused standard library
- **GC overhead** - Garbage collector
- **Good for web** - Not designed for computation

**Implementation:**
- Uses `bcmath` extension for precision
- Machin's formula with interpretation overhead
- Large runtime (26.5 MB)

#### 25. Ruby (79 ms, 28.8 MB) - Slower
**Why slower:**
- **Interpretation** - Bytecode execution
- **Dynamic typing** - Runtime type checking
- **Large runtime** - Object-oriented overhead
- **GC overhead** - Garbage collector
- **Less optimized** - Slower than Python in this benchmark

**Implementation:**
- Uses `BigDecimal` library
- Machin's formula with interpretation overhead
- Large runtime (28.8 MB)

#### 26. JavaScript (89 ms, 44.8 MB) - Moderate
**Why moderate:**
- **JIT compilation** - V8 engine compiles to machine code
- **Dynamic typing** - Runtime type checking
- **Large runtime** - JavaScript engine overhead
- **GC overhead** - Garbage collector
- **Good optimization** - V8 is highly optimized

**Implementation:**
- Uses `BigInt` for arbitrary precision
- Machin's formula with JIT overhead
- Large runtime (44.8 MB)

#### 27. Julia (157 ms, 235.9 MB) - Moderate
**Why moderate:**
- **JIT compilation** - Compiles to machine code
- **Large runtime** - Scientific libraries included
- **Dynamic typing** - Runtime type checking
- **GC overhead** - Garbage collector
- **Designed for science** - Optimized for numerical work

**Implementation:**
- Uses `BigFloat` for precision
- Machin's formula with JIT overhead
- Very large runtime (235.9 MB)

#### 28. R (163 ms, 90.9 MB) - Moderate
**Why moderate:**
- **Interpretation** - R interpreter
- **Dynamic typing** - Runtime type checking
- **Large runtime** - Statistical libraries included
- **GC overhead** - Garbage collector
- **Designed for stats** - Not optimized for general computation

**Implementation:**
- Uses `Rmpfr` package for precision
- Machin's formula with interpretation overhead
- Large runtime (90.9 MB)

#### 29. Erlang (176 ms, 77.4 MB) - Moderate
**Why moderate:**
- **BEAM VM** - Erlang virtual machine
- **Functional overhead** - Immutability, pattern matching
- **Concurrency focus** - Designed for distributed systems
- **Large runtime** - BEAM VM included
- **Not optimized for math** - Designed for reliability

**Implementation:**
- Uses Erlang's built-in arbitrary-precision integers
- Machin's formula with BEAM overhead
- Large runtime (77.4 MB)

#### 30. Scala (344 ms, 55.5 MB) - Slow
**Why slow:**
- **JVM overhead** - Same as Java
- **Scala runtime** - Large standard library
- **Functional overhead** - Immutability, pattern matching
- **Complex compilation** - Advanced type system
- **Less optimized** - More overhead than Java

**Implementation:**
- Uses `BigInt` from Scala library
- Machin's formula with functional overhead
- Very large runtime (55.5 MB)

#### 31. Elixir (401 ms, 89.2 MB) - Slow
**Why slow:**
- **BEAM VM** - Erlang virtual machine
- **Functional overhead** - Immutability, pattern matching
- **Concurrency focus** - Designed for distributed systems
- **Large runtime** - BEAM VM + Elixir libraries
- **Not optimized for math** - Designed for reliability

**Implementation:**
- Uses `Integer` module for precision
- Machin's formula with BEAM overhead
- Very large runtime (89.2 MB)

#### 32. TypeScript (931 ms, 218.9 MB) - Slowest
**Why slow:**
- **TypeScript compiler** - Extra compilation step
- **JavaScript runtime** - V8 engine overhead
- **Large runtime** - TypeScript + JavaScript libraries
- **Dynamic typing at run time** - Static types are erased during compilation, so the emitted JavaScript is still dynamically typed
- **Not optimized** - Designed for web development

**Implementation:**
- Uses `BigInt` for precision
- Machin's formula with TypeScript + JS overhead
- Very large runtime (218.9 MB)

## Languages Tested

Grouped by the Type column of the summary table:

**Compiled (14):** Assembly, C, C++, Crystal, D, Fortran, Go, Haskell, Nim, Objective-C, Odin, Rust, Swift, Zig

**JIT-Compiled (8):** C#, Dart, Java, JavaScript, Julia, Kotlin, Scala, TypeScript

**Interpreted (10):** Bash, Brainfuck, Elixir, Erlang, Lua, Perl, PHP, Python, R, Ruby

**Not in the 100-decimal summary table (2):** Vimscript, Wolfram

## Repository Structure

```
.
├── README.md                    # This file
├── reports/                     # Detailed performance reports
│   ├── summary.md              # Overall summary
│   ├── 1_decimals.md           # 1 decimal precision
│   ├── 2_decimals.md           # 2 decimals precision
│   ├── 5_decimals.md           # 5 decimals precision
│   ├── 10_decimals.md          # 10 decimals precision
│   ├── 100_decimals.md         # 100 decimals precision
│   ├── 1000_decimals.md        # 1000 decimals precision
│   └── 2000_decimals.md        # 2000 decimals precision
├── timelines/                   # Resource usage timeline data
├── assembly/                    # Assembly implementation
├── c/                          # C implementation
├── cpp/                        # C++ implementation
├── rust/                       # Rust implementation
└── ...                         # Other language implementations
```

## Running the Benchmark

```bash
# Build all languages
./build.sh

# Run all tests
./run_all.sh

# Run specific language
cd c && ./build.sh && ./print_hej

# Run detailed profiling (breaks down execution time)
./profile_detailed.sh 100
```

## Detailed Profiling

The benchmark includes a detailed profiling system that breaks down execution time into components:

- **Startup Time**: Runtime initialization, library loading, JIT compilation
- **Calculation Time**: Algorithm execution, numerical operations
- **I/O Time**: Output formatting, result printing

See [PROFILING.md](PROFILING.md) for detailed documentation.
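
As a rough illustration of how calculation and I/O time can be separated inside a single process, consider the sketch below. It is not the profiling system documented in PROFILING.md: the function `profile_phases` and its dictionary keys are hypothetical. Startup time cannot be observed from inside the process, so it would have to be estimated externally, for example by subtracting these phases from the total wall-clock time.

```python
import io
import time


def profile_phases(compute, render):
    """Time a calculation phase and an output-formatting phase separately.

    `compute` takes no arguments and returns a result;
    `render` turns that result into the text to be printed.
    """
    t0 = time.perf_counter()
    result = compute()
    t1 = time.perf_counter()
    buffer = io.StringIO()          # stand-in for stdout
    buffer.write(render(result))
    t2 = time.perf_counter()
    return {
        "calculation_ms": (t1 - t0) * 1000.0,
        "io_ms": (t2 - t1) * 1000.0,
        "output": buffer.getvalue(),
    }


# Example with a placeholder workload instead of the pi calculation.
report = profile_phases(lambda: sum(range(100_000)), lambda r: f"{r}\n")
print(report["calculation_ms"], report["io_ms"])
```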

## License

MIT License - See LICENSE file for details.

---

*Generated from Pi Calculation Benchmark - Apple A18 Pro Performance Study*