# Guide - How it works ## Overview This project demonstrates a minimal **TLS client/server** that speaks a simple, line-based protocol and includes a **Proof-of-Work (WORK)** step solved by a fast **C++ helper**. You also get a thorough **pytest** suite (unit + integration) and **Sphinx** docs/doctests. --- ## How to run ![Demo](demo.gif) For complete instructions, see [Installation](https://github.com/elliottbache/tls_line_protocol/blob/master/README.md#installation) and [Execution and usage](https://github.com/elliottbache/tls_line_protocol/blob/master/README.md#execution--usage). --- ## High-level flow ```{mermaid} sequenceDiagram autonumber participant C as Client participant S as Server S->>C: HELLO\n C->>S: HELLOBACK\n S->>C: WORK \n C->>C: find suffix so that SHA256(token+suffix) has N zero trailing bits C->>S: \n S->>C: SOCIAL \n (and other info requests) C->>S: \n S->>C: DONE\n C->>S: OK\n ``` README shows a static snapshot; this Mermaid diagram is the source. Changes made to this diagram can be compiled in an online editor, exported as ```.svg```, and placed in ```docs/_static/``` to replace the current diagram in README.md. ## Key properties - **TLS** for transport security (optionally mutual auth). - **Deterministic protocol** (plain text, newline-terminated). - **Time-bounded WORK** (30mn cap) with multi-threaded C++ backend. - **Portable tests** that mock subprocess/SSL or use a throwaway TLS server. --- ## Architecture ### Components - **Python package** (`src/tlslp/`) - `tlslp.protocol` - `send_message(...)` / `receive_message(...)` – newline-delimited UTF-8 transport helpers. - `ProtocolError` / `TransportError` – consistent exceptions for protocol vs transport/TLS failures. - `_parse_positive_int(...)` – CLI helper for positive integer arguments. - `tlslp.server` - `prepare_server_socket(...)` – bind/listen and create a TLS server `SSLContext`. - `send_and_receive(...)` – one request/response with validation (checksum / WORK suffix). - `handle_one_session(...)` – run handshake + message loop for one client connection. - `main(...)` – CLI entrypoint. - `tlslp.client` - `prepare_client_socket(...)` – build TLS client context and return a wrapped socket. - `connect_to_server(...)` – connect with clear error reporting. - `decipher_message(...)` – validate/parse incoming line into tokens. - `run_work_binary(...)` / `handle_work_cpp(...)` – run external WORK solver and parse `RESULT:`. - `define_response(...)` – generate a response string for a command. - `_process_message_with_timeout(...)` – enforce per-command timeouts (WORK vs others). - `main(...)` – CLI entrypoint. - `tlslp.logging_utils` - `configure_logging(...)` – root logger setup (file + stderr). - **C++ WORK solver** (`cpp/`) - `work_challenge.cpp` + `work_core.*` – compiled to an executable that prints a `RESULT:` line. - Default runtime location is `src/tlslp/_bin/work_challenge` (override with `--work-binary`). ### Repository layout ```text ├── .dockerignore ├── .gitattributes ├── .github │   ├── ISSUE_TEMPLATE │   │   ├── bug_report.md │   │   └── feature_request.md │   ├── PULL_REQUEST_TEMPLATE.md │   ├── codeql │   │   └── codeql-config.yaml │   ├── dependabot.yaml │   └── workflows │   ├── ci.yaml │   ├── codeql.yaml │   └── pages.yaml ├── .gitignore ├── .pre-commit-config.yaml ├── .readthedocs.yaml ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── Makefile ├── README.md ├── cpp │   ├── CMakeLists.txt │   ├── work_challenge.cpp │   ├── work_core.cpp │   ├── work_core.h │   ├── work_core_internal.h │   └── tests │   └── work_core_test.cpp ├── docker │   ├── client.Dockerfile │   └── server.Dockerfile ├── docker-compose.yaml ├── docs │   ├── Doxyfile │   ├── Makefile │   ├── _static │   │   └── flow_diagram.svg │   ├── conf.py │   ├── cpp_api.rst │   ├── demo.cast │   ├── demo.gif │   ├── guide.md │   ├── index.rst │   ├── intro.md │   ├── make.bat │   ├── python_api.rst │   └── requirements.txt ├── pyproject.toml ├── scripts │   ├── compare_tutorial_logs.sh │   ├── install-cpp-deps.sh │   ├── install-python-deps.sh │   └── make-certs.sh ├── src │   └── tlslp │   ├── __init__.py │   ├── client.py │   ├── logging_utils.py │   ├── protocol.py │   └── server.py └── tests ├── _helpers.py ├── conftest.py ├── test_client.py ├── test_protocol.py ├── test_server.py └── test_tls.py ``` Generated with: ```bash tree -a -L 4 -I ".git|.venv|__pycache__|*.egg-info|.pytest_cache|.mypy_cache|.ruff_cache|build|_build|_codeql_build_dir|certificates" ``` --- ## Dataflow (client-side) 1. Establish a TLS connection (optionally insecure for localhost development). 2. Receive **exactly one line** (newline-delimited UTF-8) via `receive_message(...)`. 3. Parse into tokens with `decipher_message(...)` and validate the command name. 4. Generate a response, enforcing per-command timeouts: - Non-WORK commands use a short timeout (default 10s, configurable). - WORK can be long (default 30mn, configurable) and is solved by an external binary. 5. Send the response using `send_message(...)` (adds `\n` if missing). 6. Stop on `FAIL` from the server or after responding to `DONE`. The following should be taken into account: - Multiline messages are not supported. Each command sent by the server was meant to be answered with a single-line. Any more lines would fall outside the scope of the proper functioning of this program and should thus be treated as an exception. - Multiprocessing is used to take into account the imposed timeouts. All commands have a timeout of 10 seconds except the WORK challenge, which has a 30 minute timeout. --- ## Protocol - **Transport**: TLS. - **Encoding**: UTF-8 text. - **Framing**: newline-delimited (`\n`). `receive_message(...)` returns the line **without** the trailing newline. - **Size limits**: single line must not exceed `MAX_LINE_LENGTH` bytes. - **Hasher**: SHA256 is used by the challenge for checksums / WORK validity. ### Commands (server → client) - `HELLO\n` → client replies `HELLOBACK\n`. - `WORK \n` → client replies with a valid `\n`. - **Validity**: `SHA256(token + suffix)` ends with `` zero bits. - Info requests (examples below use `` as server-provided string): - `FULL_NAME \n` - `EMAIL1 \n` - `EMAIL2 \n` - `SOCIAL \n` - `BIRTHDATE \n` - `COUNTRY \n` - `ADDR_LINE1 \n` - `ADDR_LINE2 \n` - `FAIL \n` → client should stop. - `DONE\n` → client replies `OK\n` and closes. ### Client responses Client reply format: - `HELLOBACK\n` for `HELLO\n`. - `\n` for `WORK\n`. - ` \n` for info commands. - `OK\n` for `DONE\n`. ### Error handling - Invalid UTF-8, missing newline (typically surfaces as a timeout), overlong line, or unexpected tokenization → treat as a protocol/transport error and stop. - Timeouts are enforced **per operation**: - Client: WORK vs non-WORK handling is bounded (`--work-timeout` / `--other-timeout`). - Server: socket timeouts are set per request/response step. --- ## TLS Setup This repo supports two modes: ### Insecure mode (local dev) Enable with `--insecure`. - Client disables server certificate verification: - `verify_mode = ssl.CERT_NONE` - `check_hostname = False` - Server does not require a client certificate: - `verify_mode = ssl.CERT_NONE` Use this **only** for localhost development and tests where you explicitly want to skip verification. ### Secure mode (mTLS) Default mode (when `--insecure` is not set). - Client verifies the server certificate against the provided CA cert **and** presents a client cert: - `verify_mode = ssl.CERT_REQUIRED` - `check_hostname = True` - `cafile = ` - `load_cert_chain(client_cert, private_key)` - Server requires and verifies a client certificate: - `verify_mode = ssl.CERT_REQUIRED` - `load_verify_locations(cafile=)` Important: in secure mode, `--host` must match the server certificate’s hostname/SAN, because hostname verification is enabled. ### What’s tested (and why it matters) The test suite validates the key security properties: - **Server authentication**: a client trusts a CA and verifies the server certificate (including hostname via `server_hostname`). - **Client authentication (mTLS)**: the server requires a client certificate and rejects clients that don’t present one. - **Ephemeral certs in tests**: integration tests generate short-lived CA/server/client certs at runtime (via `trustme`), keeping the repo free of committed keys. - A guide for certificate generation is linked from the README installation section. --- ## Testing This repo uses **pytest** for unit tests. The goal is fast feedback locally while still validating the core TLS behaviors end-to-end. ### Run everything ```make test``` or ```pytest``` What this covers: - Protocol helpers (```send_message```, ```receive_message```) using ```socket.socketpair()``` (no network). - Client/server logic using faked sockets and faked SSL contexts. - WORK handling by mocking ```subprocess.run``` (no real C++ binary execution). ### Sphinx doctests Doctests are treated as **documentation checks**, not full integration coverage. When running doctests: - It is preferred to run them as part of the docs build (Sphinx doctest) or with pytest configured to include doctests. - Doctests are kept deterministic, they avoid: - needing real certificate files - binding to real ports - time-based output If a doctest needs TLS objects, a **mocked contexts** approach is used (e.g. FakeContext) rather than real files on disk. --- ## Security implementation This repo is a demo, but it’s still useful to treat it as if it exposed beyond localhost. In the future, extending this repo to a production-grade project will be easier this way. The goal is to fail closed: treat all network input as untrusted, validate aggressively, and stop on protocol violations. ### Security Notes - In production: - **Don’t** disable verification in production. Use CERT_REQUIRED and verify hostnames. - Keep keys/PEMs with restricted file permissions. - Treat **all server input** as untrusted; never eval or exec remote data. - The WORK solver uses only deterministic hashing—no code execution. ### Timeouts **Why**: Prevent hangs (slowloris-style reads, stuck handshakes, blocked subprocesses), and make failure modes predictable. **How** this is done: - Use **per-operation socket timeouts** rather than an unbounded read loop. - Server: each request/response step sets a timeout appropriate to the step (WORK vs non-WORK). - Client: command handling is bounded with a hard timeout (WORK vs non-WORK) enforced in `_process_message_with_timeout(...)`. - Bound the WORK solver execution time: - `subprocess.run(..., timeout=...)` inside `run_work_binary(...)`. Defaults are chosen for the protocol (short for non-WORK, long for WORK), and are configurable via CLI flags. ### Message size limits **Why**: Avoid memory growth / DoS by sending huge lines. **How** this is done: - Use a strict **line-based protocol**: UTF-8, **single line**, newline-terminated (\n). - Enforce a **maximum line length** (i.e. MAX_LINE_LENGTH = 1000). - Reject: - non-UTF-8 bytes - missing newline (if the peer never sends \n, you’ll time out) - overlong lines - unexpected extra tokens for commands that require a fixed arity Notes: If you read “until newline”, a “no newline” input will typically surface as a **timeout** (good), not a “no newline” error. This situation is tested in the pytest suite (src.test_protocol.test_receive_message_no_newline). ### Validating server commands **Why**: The server controls the client’s behavior. Tight validation prevents weird edge cases and reduces attack surface. **How** this is done: - Parse exactly one line, then: - split into tokens - require the command token to be in the allowlist (`DEFAULT_VALID_MESSAGES`) - Treat protocol violations as `ProtocolError` and network/TLS failures as `TransportError`. - Validate sensitive inputs before using them for hashing / subprocess calls: - `token` character set / length - `n_bits` is an integer in a reasonable range - On server side, validate: - WORK suffix correctness - checksum correctness for info replies ### Retry strategy **Why**: The client may be asked to try multiple ports. **How** this is done: - The client iterates the configured port list and attempts to connect until one succeeds. - There is no exponential backoff (kept intentionally simple for the protocol). - The client does not “retry” after protocol violations during an established session. ### Subprocess safety The WORK solver is intentionally an external executable. It is treated like untrusted input/output. **How** this is done: - Never use ```shell=True```. - Pass args as a list: ```["/abs/path/work_challenge", token, n_bits]```. - Resolve and vet the binary path: - ```Path(...).resolve(strict=True)``` (prevents PATH tricks) - refuse symlinks - in POSIX: - reject world-writable binary or directory (this is not done in Windows) - require executable bit - Scrub environment: - minimal env ```{"LC_ALL": "C"}``` - Bound resource usage: - ```timeout=...``` (you do this) - Parse output defensively: - parse exactly one RESULT: line - enforce suffix character set and max length ### Sanitizing paths **Why**: If you ever accept paths from configuration/CLI, you want deterministic, safe behavior. **How** this is done: - Convert to ```Path```, then: - ```resolve(strict=True)``` - ```is_file()``` - not ```is_symlink()``` - check permissions (e.g., not world writable) - Prefer an allowlist approach: - keep binaries/certs under known directories (```./build```, ```./certificates```) - reject anything outside that root (```path.is_relative_to(root)```) ### TLS hardening: versions, ciphers, and options **Why**: TLS defaults are usually okay, but should be documented. **How** this is done: - Require TLS 1.2+: - context.minimum_version = ssl.TLSVersion.TLSv1_2 - Use hostname verification when “secure mode” is on: - check_hostname = True - verify_mode = ssl.CERT_REQUIRED ### For future implementations: - Certificate pinning **Why**: Even with CA verification, pinning reduces risk if a CA is compromised or for tighter control. --- ## WORK Solver (C++) **Goal**: Find a ``suffix`` so that ``SHA256(token + suffix)`` ends with ``N`` binary ``0``s (i.e., ``bits = N`` zero bits). ### Strategy - SHA256 is **initialized** with ```token```, then a copy is **updated** with each new suffix. - **Counter-based suffix** (deterministic, minimal RNG use). - **Thread sharding**: each thread walks different counters (``base + tid``, ``step = total_threads``). - **Bit test**: Check full zero bytes; then mask remaining bits: - **Keyspace size**: - In binary, each digit represents 1 bit. e.g. for $n_{\text{bits}} = 20$, we have 20 digits. - The expected number of trials to find a digest whose binary representation has $n_{\text{bits}}$ trailing zeros is $2^{n_{\text{bits}}}$. For $n_{\text{bits}}=20$, we would have $2^{20} \approx 1 \times 10^6$ expected trials. - Therefore, the suffix length is defined so that keyspace $k ≥ 2^{n_{\text{bits}}}$. - With a string that is $s$ long and a $c$ large character set (e.g. $c=26$ for the lowercase alphabet), the number of unique strings that can be generated is $k = c^s$. For $c=64$ and $s=4$, we have $k=64^4 \approx 17 \times 10^6$. - This would be enough string characters for a $2^20 \approx 1 \times 10^6$ expected trials to find a binary that has 20 trailing zeros. ### Build The Python client expects an executable WORK solver. By default, it looks for it at: `src/tlslp/_bin/work_challenge` (override with `--work-binary`). #### Recommended ```bash make build-cpp ``` #### Manual build If you build manually (alternative to [Recommended](#recommended)) from cpp/, compile and then copy the binary into the package’s _bin/ directory (so the default path works). From root: ```bash mkdir -p build cd cpp g++ -O3 -std=c++17 work_challenge.cpp work_core.cpp -o ../build/work_challenge -lssl -lcrypto -pthread cp ../build/work_challenge ../src/tlslp/_bin/work_challenge ``` ### Run: From root: ```bash ./src/tlslp/_bin/work_challenge # stdout line: RESULT:\n ``` ### Benchmarking Tests were carried out for various difficulties on a standard laptop (Intel i5-1235U, 1300 MHz, 10 cores). The tool automatically created 12 threads. The calculation times for the average of multiple runs are shown in the following table. | **n_bits** | **16** | **20** | **24** | **28** | **32** | **36** | |-----------------------|-------|-------|-------|-------|-------|-------| | **Number of runs** | 100 | 100 | 100 | 100 | 100 | 100 | | **Total run time (s)** | 1 | 1 | 8 | 260 | 5319 | 24397 | | **Average run time (s)** | 0.01 | 0.01 | 0.08 | 2.60 | 53.19 | 243.97 | --- ## Future work - Create integration test asserting full transcript from main().