nomad: core refactor and setup #48

@noahehall

C

  • complete consul ticket before continuing
  • the core stack now includes vault, haproxy and consul+envoy
    • bff, ui and postgres are now in a separate stack
    • this separation of platform (core) and web (ui, bff, db) concerns helps drive faster iteration
  • we need to refactor and integrate nomad for orchestration in validation
  • core goals
    • take the output from dev as input to validation
    • validation: execute services on prod-like infra
    • push artifacts to nexus for downstream envs

T

  • nomad review: it's been a while
    • nomad notes
    • nomad docs
  • refactor existing nomad logic with intelligence gained from consul ticket
    • directory hierarchy: i think it should be IaC now instead of a nomad dir
    • nomad.sh: incorporate new utils dir
    • docker env file: incorporate new .env.auto logic
  • save docker images as tar files so you can use artifact + load instead of running a registry
    • push this to the nexus ticket as that will determine which route we take
  • take another swing at nomad pack; it should reduce the amount of in-house tooling we have to create
    • stay away from levant, no matter how sweet it is
  • integrate nomad with core
  • review nomad resource utilization and update defaults (our estimates were way off)
  • update aws AMIs to include nomad binary, cni plugins, and post install files
  • think through the interoperability between envs and devise a more efficient management process
    • the initial nomad integration is tiresome; it shouldn't be this way
    • although it is mostly copy-paste, this highlights a need for automation and a better architecture
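
the tar-file task above (save images, then use artifact + load instead of a registry) could be sketched roughly as below. this is a minimal sketch, not the project's actual tooling: `DOCKER_BIN`, `image_to_tar_name`, `save_image` and `load_image` are hypothetical names, and the image reference in the usage note is made up. it relies only on `docker save -o` and `docker load -i`; on the nomad side, the docker driver's `load` option paired with an `artifact` block is assumed to consume the resulting tar.

```shell
#!/bin/sh
# sketch: export an image to a tar file and load it back without a registry.
# all function names and DOCKER_BIN are hypothetical, for illustration only.

DOCKER_BIN="${DOCKER_BIN:-docker}"

# turn an image reference such as "core/consul:1.0" into a single file name
# by replacing "/" and ":" with "_".
image_to_tar_name() {
  printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# save_image <image-ref> <out-dir>: write the image to <out-dir>/<name>.tar
# and print the path of the tar file.
save_image() {
  save_image_path="$2/$(image_to_tar_name "$1")"
  "$DOCKER_BIN" save -o "$save_image_path" "$1" || return 1
  printf '%s\n' "$save_image_path"
}

# load_image <tar-path>: load a previously saved tar into the local daemon.
load_image() {
  "$DOCKER_BIN" load -i "$1"
}
```

usage, assuming an image tagged `core/consul:1.0` exists locally: `tar_path="$(save_image core/consul:1.0 /tmp)"` followed by `load_image "$tar_path"` on the target host. whether this or a registry wins is still up to the nexus ticket.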

A



issue 1: perm
chown: /consul/data: Operation not permitted

we switched the container workdir from /consul to /opt/consul to align with the consul web docs
however, the consul docker hub docs use /consul, not /opt/consul
solution: follow the docker hub docs rather than dealing with nomad perm issues at this juncture
a longer-term solution is to resolve the nomad volume perm issues, which doesn't seem as straightforward

issue 2: perm
su-exec: setgroups(994): Operation not permitted

relates to issue 1
finding the root cause of the nomad perm issues will likely solve this and truly resolve issue 1
quick fix: remove `USER consul` from the image
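
to help narrow down the root cause of both issues, a small diagnostic like the one below could be run inside the failing container. it is only a sketch under an assumption the ticket has not confirmed: that the errors come from the process running as a non-root user or lacking the `CAP_CHOWN` (bit 0) and `CAP_SETGID` (bit 6) capabilities that `chown` and `setgroups` need. `has_cap` and `report_perm_context` are hypothetical helper names, and the script is linux-only because it reads /proc/self/status.

```shell
#!/bin/sh
# sketch: report the uid/gid and two effective capabilities of this process.
# helper names are hypothetical; bit numbers come from linux/capability.h.

# has_cap <hex CapEff mask> <bit>: print "yes" if the bit is set, else "no".
has_cap() {
  if [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; then
    echo yes
  else
    echo no
  fi
}

# print who we are running as and whether chown/setgroups can be expected to work.
report_perm_context() {
  mask="$(awk '/^CapEff:/ {print $2}' /proc/self/status)"
  echo "uid=$(id -u) gid=$(id -g)"
  echo "CAP_CHOWN=$(has_cap "$mask" 0)"
  echo "CAP_SETGID=$(has_cap "$mask" 6)"
}
```

calling `report_perm_context` from the image entrypoint, with and without `USER consul`, should show whether the user switch or the task's capability set is what differs between the working and failing runs.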
