Networking-Forums.com

Professional Discussions => Everything Else in the Data Center => Topic started by: Dieselboy on May 31, 2019, 12:11:53 AM

Title: nginx replacement - Traefik reverse proxy basic setup guide
Post by: Dieselboy on May 31, 2019, 12:11:53 AM
Summary

For a while we had been using nginx as a reverse proxy. Located in the DMZ, it accepted requests for URLs like jira.domain.com and otherstuff.domain.com via a single public IP, then forwarded them to the internal systems. We purchased a wildcard cert (AUD $500) from GoDaddy and used that for this setup. We had to keep the private key secure while also copying it and the associated certs to the right places, both to request the cert in the first place and to set up nginx to work with it. Once nginx was running it worked well, but maintenance was awkward: renewing certs, or adding extra config when internal systems changed (decommission and replace, for example).

To summarise the ongoing challenges:

- The wildcard cert cost around AUD $500 and renewing it was a manual job.
- The private key had to be kept secure while also being copied to every place that needed it.
- Any change to an internal system (a decommission and replace, for example) meant editing the nginx config by hand.

I came across a different tool a few weeks ago called Traefik (https://traefik.io/). I found that it has a lot of features that can overcome the challenges mentioned above. For example:

- It can request and renew certificates automatically via Let's Encrypt, so there is no wildcard cert to buy or manually renew.
- Configuration is dynamic: it watches a directory of config files (and Docker labels) and applies changes without a restart.
- It can terminate HTTPS and proxy to plain-HTTP backends, so the backend servers don't need certs installed on them.
- It has a built-in dashboard showing the configured frontends and backends.


Setup guide

I replaced the nginx proxy with Traefik. I'm using a static 1:1 configuration for this, while also making use of Traefik's dynamic ability (explained further on).

Traffic flow

A browser resolves the URL to an IP and is pointed at the Traefik proxy. The proxy matches a 'frontend' endpoint based on the URL, and this in turn maps to a 'backend', which is the server that will respond to the client browser. Simply:
Browser -> Traefik -> web server
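
Once everything below is up and running, a quick way to see the frontend matching in action is to test with curl and an explicit host-to-IP mapping (the hostname and IP here are placeholders; substitute your own):

# Force cacti.domain.com to resolve to the proxy's IP (192.0.2.10 is a placeholder)
# and check that Traefik answers and routes the request to the backend.
curl -kI --resolve cacti.domain.com:443:192.0.2.10 https://cacti.domain.com/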

Prerequisites for this guide

- A Linux host (I'm using RHEL 7) with Docker and docker-compose installed.
- Public DNS records for the hostnames you want to proxy (e.g. cacti.domain.com and dashboard.domain.com) pointing at the proxy's public IP.
- Ports 80 and 443 reachable on that host (Let's Encrypt's HTTP challenge comes in on port 80).
- A valid email address for the Let's Encrypt registration.

Step-by-step

docker network create web

mkdir -p /opt/traefik /opt/traefik/endpoints
touch /opt/traefik/docker-compose.yml
touch /opt/traefik/acme.json && chmod 600 /opt/traefik/acme.json
touch /opt/traefik/traefik.toml
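
Before going further it's worth a quick check that the network and files are in place (an optional check, not part of the original steps):

# Confirm the shared Docker network exists.
docker network inspect web --format '{{.Name}} {{.Driver}}'
# List the skeleton files; acme.json should stay at 0600, since Traefik checks the permissions on its ACME storage file.
ls -l /opt/traefik
stat -c '%a %n' /opt/traefik/acme.json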



Contents of /opt/traefik/docker-compose.yml:

version: '2'

services:
  proxy:
    image: traefik
    command: --configFile=/traefik.toml
    restart: unless-stopped
    networks:
      - web
    ports:
      - 80:80
      - 443:443
    labels:
      # Setting the below to true enables the Traefik dashboard. To run without a dashboard, set this to false
      - traefik.enable=true
      # The rule below sets the frontend rule for accessing the dashboard (basically the URL you put in your browser)
      - traefik.frontend.rule=Host:dashboard.domain.com
      - traefik.port=8080
      - traefik.docker.network=web
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/traefik/traefik.toml:/traefik.toml
      - /opt/traefik/acme.json:/acme.json
      - /opt/traefik/endpoints:/opt/traefik/endpoints  

networks:
  web:
    external: true
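
Before writing the traefik.toml it doesn't hurt to confirm the compose file parses cleanly (optional check):

cd /opt/traefik
# Render the resolved compose configuration; YAML or indentation mistakes show up here.
docker-compose config
# Pull the Traefik image ahead of time so the first 'up' is quicker.
docker-compose pull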




Contents of /opt/traefik/traefik.toml:

debug = false

logLevel = "ERROR"
#Uncomment below if you want to skip SSL certificate validation.
#Useful when Traefik connects to an untrusted SSL backend (such as one with a self-signed cert).
#This opens the risk of man-in-the-middle attacks, so make sure you understand that before enabling it.
#insecureSkipVerify = true

defaultEntryPoints = ["https","http"]

[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
  address = ":443"
  [entryPoints.https.tls]

[retry]

[docker]
endpoint = "unix:///var/run/docker.sock"
# Change the domain below
domain = "domain.com"
watch = true
exposedbydefault = false

[acme]
email = "valid@domain.com"
storage = "acme.json"
# Use the Let's Encrypt staging environment to test this before going live.
# To go live, comment out the caServer line. The live server rate-limits issuance, so it may refuse to issue certs if you hit it too often while testing.
caServer = "https://acme-staging-v02.api.letsencrypt.org/directory"
entryPoint = "https"
OnHostRule = true
# The HTTP challenge has Let's Encrypt connect back to Traefik over HTTP. There's also a DNS challenge, where Let's Encrypt checks the domain for a TXT record.
[acme.httpChallenge]
entryPoint = "http"

# Enable the web provider, which serves the dashboard / API on port 8080.
[web]
address = ":8080"

# This example uses the file provider with watch enabled. Traefik watches the folder and applies any new or changed configs automatically.
[file]
  directory = "/opt/traefik/endpoints/"
  watch = true
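
A simple way to confirm whether you're still getting staging certificates, or real ones after commenting out the caServer line, is to check the issuer of the cert the proxy serves (cacti.domain.com is just the example hostname used below):

# Show the issuer of the certificate presented for cacti.domain.com.
# The staging CA is clearly labelled as a fake/staging Let's Encrypt issuer.
echo | openssl s_client -connect cacti.domain.com:443 -servername cacti.domain.com 2>/dev/null | openssl x509 -noout -issuer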



Endpoint configuration

For a first-time set up it's best to start with no endpoints, or just one; you can expand later if you need to. This example uses a Cacti web server.

cacti.toml config file located in endpoints directory.
Backend = real web server
Frontend = what the client browser accesses


[backends]
  [backends.backend-cacti]
    [backends.backend-cacti.servers]
      [backends.backend-cacti.servers.server-cacti-ext]
        url = "http://cacti.internal.domain.com"
        weight = 0

[frontends]
  [frontends.frontend-cacti]
    backend = "backend-cacti"
    passHostHeader = true
# Uncomment the below if you want to enable HTTP basic auth. xxx is the username and yyyy is the password entry. I've not tested this.
#    basicAuth = [
#      "xxx:yyyyyyyyyyyy",
#    ]
    [frontends.frontend-cacti.routes]
      [frontends.frontend-cacti.routes.route-cacti-ext]
        rule = "Host:cacti.domain.com"



NOTE: I have created each frontend / backend pair in a separate .toml file to make them simple to locate and manage.


Finally, start the container from the /opt/traefik directory:

docker-compose -f docker-compose.yml up
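
Running it in the foreground like this is useful the first time because the ACME / Let's Encrypt messages print straight to the terminal. Once it's behaving you can run it detached and tail the logs instead (a suggestion, not part of the original steps):

cd /opt/traefik
# Start (or restart) the proxy in the background.
docker-compose up -d
# Follow the container logs; ACME errors will show up here. "proxy" is the service name from the compose file.
docker-compose logs -f proxy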

We have also successfully configured Traefik within Kubernetes using the DNS challenge with GoDaddy to issue a wildcard cert. With this method Traefik connects to GoDaddy automatically and creates the TXT record so that Let's Encrypt can validate *.domain.com. Traefik then downloads the valid SSL cert and applies it.
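
While a DNS challenge is in progress you can watch for the validation record yourself (domain.com is the example domain from this guide; the record name follows the standard ACME convention):

# Let's Encrypt looks for a TXT record at _acme-challenge.<domain> during the DNS-01 challenge.
dig +short TXT _acme-challenge.domain.com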

Conclusion

I was so impressed by this, and by how easy it was to set up, that I deployed another one as an internal proxy to accept requests from internal users and forward them to internal servers with HTTPS enabled. Without the proxy we either had to figure out how to install certs on all the systems, OR use HTTP only, OR use the default untrusted HTTPS cert if the system came with one.
For example, the Cacti backend is not HTTPS enabled. Traefik proxies HTTPS to HTTP, so I didn't need to figure out where or how to install a cert on Cacti, and my maintenance considerations for Cacti aren't complicated by enabling SSL.
Similarly, some other systems (such as JIRA) use a bundled Java with its own Java cert store, and renewing SSL certs there can be a problem. It needs an outage window, because you have to stop the server, update the cert store and start the server again; if you mess it up, the server won't start and the outage drags on. With Traefik running as a microservice I can use HTTP on JIRA and de-couple the SSL configuration from the server itself, which removes the need for downtime. In addition, the internal proxy uses an internal CA-issued cert (because it's not reachable from the outside).
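
As an illustration of that de-coupling, an internal JIRA endpoint would just be one more small file dropped into the watched endpoints directory, following the same pattern as cacti.toml above (the internal hostname and port here are hypothetical examples):

cat > /opt/traefik/endpoints/jira.toml <<'EOF'
[backends]
  [backends.backend-jira]
    [backends.backend-jira.servers]
      [backends.backend-jira.servers.server-jira-int]
        # JIRA's bundled Tomcat listening on plain HTTP (8080 is a common default; adjust to suit)
        url = "http://jira.internal.domain.com:8080"
        weight = 0

[frontends]
  [frontends.frontend-jira]
    backend = "backend-jira"
    passHostHeader = true
    [frontends.frontend-jira.routes]
      [frontends.frontend-jira.routes.route-jira-int]
        rule = "Host:jira.domain.com"
EOF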
I had also been considering forwarding all requests (internal and external) through the DMZ proxy. I tested it and it works, but it creates additional load on the firewall for traffic going from the inside into the DMZ; not that load is an issue anyway.

:mrgreen:

Title: Re: nginx replacement - Traefik reverse proxy basic setup guide
Post by: deanwebb on June 04, 2019, 05:27:51 PM
Thanks for the write-up, it's a great article!
Title: Re: nginx replacement - Traefik reverse proxy basic setup guide
Post by: wintermute000 on June 05, 2019, 04:17:15 AM
So do you literally have 1 container?
How do you do HA/failover/autoscaling? (guessing if it's compose, that's not part of the deal?)


Can you clarify what you mean by you've deployed it into K8 but you're using docker-compose?
Title: Re: nginx replacement - Traefik reverse proxy basic setup guide
Post by: Dieselboy on June 05, 2019, 09:22:22 PM
I have a few of these running. For both of the ones I own - yes, within Docker (one container each), using docker-compose. No HA, no scaling. One is using Let's Encrypt, the other is using a CA-signed cert from my internal CA.

We have another one running within a K8 env. Again it's a single instance, but it is using Let's Encrypt.

HA / scaling is something that will be looked into next. For now the cloud app is not using HA because it's in the dev stage. Traefik does support HA, but I've not tried configuring it yet.
Title: Re: nginx replacement - Traefik reverse proxy basic setup guide
Post by: wintermute000 on June 06, 2019, 05:41:53 AM
Yeah, I've messed with pre-canned ones with letsencrypt built in, pretty slick (the ones where you pass in the parameters at launch via your compose or CLI syntax).

The real frontier is getting it autoscaling and HA etc., and I'm wondering how that's achieved. You'd need LBs in front I'm guessing, but how to LB to the LBs lol, and then service discovery etc... should just get off my arse and learn K8 properly. Amongst the list of a million other things to learn LOL

Eff it, just do it in AWS and run up an ELB, no scaling no worries ha. (Read the other day that AWS actually run a 'shadow VPC' and scale out HA instances of netscalers sideways; that's why you don't get a static IP, only DNS, and that's also why they reserve 8 addys for the ELB...)
Title: Re: nginx replacement - Traefik reverse proxy basic setup guide
Post by: Dieselboy on June 06, 2019, 09:40:28 PM
Traefik is an LB. For HA you need a shared volume to store the cert.

For this K8 deployment, all the config is done in the yaml file. TBH it's equivalent to the example I used above, very simple.

There are better load balancers out there, but I like this one for its very small footprint, its automation / microservice aspect and its simple nature. I mean, I have the busiest one set up with 2GB memory and 4 vCPUs, and the actual usage during production hours is 25% memory consumption (512MB used) and <5% CPU. At the moment it's proxying to 5 different apps. And it's running RHEL 7, Docker and the one container for Traefik.

If you want (in my opinion) a great LB... check out Avinetworks' avinet load balancer. The minimum deployment needs 28GB RAM (it's pretty big!) but the level of features you get from using it is immense. For example, you can drill down into a single users session and see the urls that they are accessing and the response codes. OR you can search on an error code and drill into those sessions. OR you can select the "404" errors and drill into exactly what URLs caused those response codes, and which pages they came from / links they clicked on. Plus a ton more, I just used the HTTP codes as an example but you can search on a lot of other things that would show up in the web browser dev tools.
I think that as a provider, being able to be alerted to errors and bad web responses, and being able to act specifically to address those is a valuable resource. Ordinarily, I dont think it's easy to do that (unless I'm lacking some key dev knowledge). This LB also has the ability to scale on demand and scale the application on demand (scale out / in). https://avinetworks.com/why-avi/