This repository has been archived by the owner on Jun 15, 2024. It is now read-only.

Multi-Node support #168

Open
flosell opened this issue Apr 9, 2017 · 0 comments

Narrative

AS A developer on a complex project
I WANT to run builds across multiple machines
SO that I can use all their capabilities in a single pipeline

Motivation

  • Complex projects might need more compute power than is available on a single machine
  • Platform-dependent projects (e.g. mobile development, embedded software) need different machines with specific capabilities for some parts of their build (e.g. compiling and testing binaries on Windows, OS X and Linux)

Possible Solutions

Quick: Additional simple steps to execute commands remotely

  • Implement a library to run shell commands on a remote machine and transmit the context (e.g. working directory)

  • Possible usage:

    (defn step-that-does-stuff-remotely [args ctx]
      (remote-ssh/bash ctx :osx (:cwd args) "./do-stuff.sh"))
  • Pseudo-Code implementation:

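    (ns lambdacd-remote-ssh.core)
    
    ;; select-host, scp-working-directory and execute-via-ssh are placeholders
    (defn bash [ctx node-label cwd & cmds]
      (let [host (select-host ctx node-label)]
        ;; copy the working directory to the remote host, then run the commands there
        (scp-working-directory ctx cwd host)
        (execute-via-ssh ctx host cwd cmds)))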
  • Can be implemented as a separate library right away (and by the community) within the existing design; no additional thought or architectural changes needed. However, it only solves the immediate problem of executing shell scripts. Other types of steps (the real power of using Clojure for build pipelines) could not profit from this or would need to be re-implemented in a remote fashion as well.
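
To make the pseudo-code above a bit more concrete, here is a minimal sketch that shells out to the `ssh` and `scp` command-line tools through `clojure.java.shell`. It is not an existing library: the namespace, the `[:config :nodes]` lookup in `ctx` and the helper names are assumptions made for illustration. It also assumes passwordless SSH access, that the parent of the working directory exists on the remote host, and that paths and commands need no shell quoting.

```clojure
(ns lambdacd-remote-ssh.sketch
  (:require [clojure.java.shell :as shell]
            [clojure.string :as string]))

(defn select-host
  "Looks up the host for a node label.
  Assumption: hosts are configured under [:config :nodes] in the ctx."
  [ctx node-label]
  (or (get-in ctx [:config :nodes node-label])
      (throw (ex-info "No host configured for node label"
                      {:node-label node-label}))))

(defn scp-command
  "Argument vector that copies cwd into its parent directory on the host,
  so the directory ends up at the same path remotely."
  [cwd host]
  ["scp" "-r" cwd (str host ":" (.getParent (java.io.File. cwd)))])

(defn ssh-command
  "Argument vector that runs all cmds in cwd on the host, stopping at the
  first command that fails. No shell quoting is applied."
  [host cwd cmds]
  ["ssh" host (str "cd " cwd " && " (string/join " && " cmds))])

(defn bash
  "Copies cwd to the node and runs cmds there. Returns a step result map
  with :status and :out."
  [ctx node-label cwd & cmds]
  (let [host (select-host ctx node-label)
        copy (apply shell/sh (scp-command cwd host))]
    (if-not (zero? (:exit copy))
      {:status :failure :out (:err copy)}
      (let [{:keys [exit out err]} (apply shell/sh (ssh-command host cwd cmds))]
        {:status (if (zero? exit) :success :failure)
         :out    (str out err)}))))
```

Keeping the command construction in pure functions makes that part easy to test without a remote machine; only `bash` touches the network.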

Comprehensive: Master/Agent LambdaCD with control-flow

  • Distribute LambdaCD code across multiple nodes and use control-flow steps to orchestrate which node executes a particular part of the pipeline
  • Possible usage:
    (def pipeline `(
      (node :master
        clone-and-build-artifact)
      (in-parallel
        (node :osx
          run-tests)
        (node :linux
          run-tests)
        (node :windows
          run-tests))))
  • Pseudo-Code implementation:
    (defn node [node-label & steps]
      (fn [args ctx]
        (tell-host-to-execute-steps (select-host ctx node-label) args ctx steps)))
  • Since LambdaCD steps can be arbitrary Clojure code with arbitrary dependencies, this solution requires a way to deploy the LambdaCD code onto multiple nodes and keep it in sync
  • Would allow all of LambdaCD to be used across nodes
  • A single process (e.g. the master) would control the main orchestration (but this could be as simple as "hand off to someone else")
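
The `node` pseudo-code above could be fleshed out roughly as follows. This is only a sketch: how the master actually talks to an agent is left open in this issue, so the transport is modelled as a plain function found under `[:config :transport]` in `ctx`, which is an assumption, as is the `[:config :nodes]` lookup. The default transport simply runs the steps in-process, which stands in for what an agent would do on its side.

```clojure
(ns lambdacd-multi-node.sketch)

(defn select-host
  "Assumption: node labels map to hosts under [:config :nodes] in the ctx."
  [ctx node-label]
  (get-in ctx [:config :nodes node-label]))

(defn run-steps-locally
  "Runs steps in order, feeding earlier results into later steps as args
  and stopping at the first step that does not succeed."
  [args ctx steps]
  (reduce (fn [acc step]
            (let [result (step (merge args acc) ctx)
                  merged (merge acc result)]
              (if (= :success (:status result))
                merged
                (reduced merged))))
          {:status :success}
          steps))

(defn node
  "Control-flow step: executes the given steps on the node with this label."
  [node-label & steps]
  (fn [args ctx]
    (let [host      (select-host ctx node-label)
          ;; hypothetical extension point; a real transport would serialize
          ;; the request and send it to the agent running on host
          transport (get-in ctx [:config :transport]
                            (fn [_host args ctx steps]
                              (run-steps-locally args ctx steps)))]
      (if host
        (transport host args ctx steps)
        {:status :failure
         :out    (str "No host configured for node " node-label)}))))
```

Treating the transport as data in `ctx` keeps `node` itself independent of whether agents are reached via SSH, HTTP or something else.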

Use snapshots and retriggering-features to hand off to different nodes

  • We already have retriggering to (re)start pipelines at arbitrary points. We could use this to "hand off" pipeline execution to another process somewhere else
  • Possible usage:
    (def pipeline `(
      clone-and-build-artifact
      (hand-off-to :osx)
      run-tests))
  • Since retriggering is already complex and doesn't cover all corner cases, this would probably only allow hand-offs at specific points.
  • It would allow multiple equal nodes, so we could, for example, seamlessly hand off a pipeline to a new LambdaCD instance when we deploy ourselves.
  • Since LambdaCD steps can be arbitrary Clojure code with arbitrary dependencies, this solution requires a way to deploy the LambdaCD code onto multiple nodes and keep it in sync
  • Would allow all of LambdaCD to be used across nodes
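
One way to picture "hand-offs at specific points" is to treat each `hand-off-to` form as a boundary that splits the top-level pipeline into segments, each owned by one node. The sketch below only does this splitting on the pipeline data; it does not touch LambdaCD's retriggering. The function names and the idea of a `start-node` label are assumptions for illustration.

```clojure
(ns lambdacd-hand-off.sketch)

(defn hand-off?
  "True for forms like (hand-off-to :osx). Compares by symbol name so that
  namespace-qualified symbols from a syntax-quoted pipeline also match."
  [form]
  (and (seq? form)
       (symbol? (first form))
       (= "hand-off-to" (name (first form)))))

(defn segments
  "Splits a top-level pipeline into a vector of {:node ... :steps [...]}
  maps. Steps before the first hand-off belong to start-node."
  [start-node pipeline]
  (reduce (fn [segs form]
            (if (hand-off? form)
              (conj segs {:node (second form) :steps []})
              (update-in segs [(dec (count segs)) :steps] conj form)))
          [{:node start-node :steps []}]
          pipeline))
```

For the usage example above, `(segments :master '(clone-and-build-artifact (hand-off-to :osx) run-tests))` yields one segment for `:master` containing `clone-and-build-artifact` and one for `:osx` containing `run-tests`.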

Options to distribute LambdaCD code across multiple nodes

Rely on external processes to distribute the code

Similar to how LambdaCD is currently rolled out and updated on a single machine, we would rely on others to roll out and update LambdaCD on multiple machines, e.g. by creating a separate build pipeline that rolls out LambdaCD (a meta-pipeline). This would also fit the current philosophy of treating LambdaCD as "just another piece of code".

Use some kind of reflection to copy the code from master to agents

The master could use some kind of reflection (in the simplest case, by locating its own uberjar) to access its own code and copy it onto the agents on demand (e.g. using scp).
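
As a rough sketch of the simplest case, the JVM can report where a class was loaded from via its protection domain, which for a process started from an uberjar is the jar itself. The namespace and function names below are made up for illustration, and the lookup returns nil when no code source is available (for example for classes defined at runtime in a REPL rather than loaded from a jar).

```clojure
(ns lambdacd-self-distribution.sketch
  (:require [clojure.java.io :as io]))

(defn code-location
  "Returns the path of the jar or directory the class was loaded from,
  or nil if the JVM does not know it."
  [^Class clazz]
  (some-> clazz
          .getProtectionDomain
          .getCodeSource
          .getLocation
          io/file
          .getPath))

(defn scp-uberjar-command
  "Argument vector for copying the jar to an agent. The target directory
  is assumed to exist on the agent."
  [jar-path host target-dir]
  ["scp" jar-path (str host ":" target-dir)])
```

A master could call `code-location` with a class that is known to live in its own uberjar (such as an AOT-compiled main class) and pass the result to `scp-uberjar-command` for each agent.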

Rely on nrepl to execute code for us remotely

We could run nREPL instances on the agents, then just call them to execute parts of the pipeline. This still requires the nREPL agents to have all dependencies (e.g. jgit) available, or a way to pull them, and to have the raw pipeline code available so it can be executed by a remote nREPL.
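
To illustrate what the master would have to send, here is a sketch that builds an nREPL `eval` message (a map with `:op "eval"` and the code as a string under `:code`) for a single step. Actually sending it would need an nREPL client library, which is left out here. The function name is hypothetical, and the sketch sidesteps a real problem: `args` must be printable data, and the step is called with an empty `ctx` because a real LambdaCD context cannot simply be printed and read back.

```clojure
(ns lambdacd-nrepl.sketch)

(defn eval-message
  "Builds an nREPL eval message that loads the step's namespace on the
  agent and calls the step with args and an empty ctx.
  step-sym must be a namespace-qualified symbol."
  [step-sym args]
  (let [step-ns (symbol (namespace step-sym))]
    {:op   "eval"
     :code (pr-str (list 'do
                         (list 'require (list 'quote step-ns))
                         (list step-sym args {})))}))
```

The `require` in the generated code is where the dependency problem shows up: it only works if the agent already has the pipeline namespace and everything it depends on available on its classpath.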
