Friday, June 28, 2019

Workaround: HTTP 502 "Incomplete response received from application" in Rails app on Phusion Passenger server

I've recently been fighting an intermittent issue in a Ruby on Rails app running on a Phusion Passenger web server on my local macOS development machine: HTTP requests to the app would sometimes return the error "Incomplete response received from application", wrapped in a simple h2 element, instead of the desired resource.

When this occurred, I'd also see a message like the following appear in the Phusion Passenger web server log:

App 6510 output: [ 2019-06-27 12:07:49.0318 6510/0x00007f83639eb6f8(HTTP helper worker) utils.rb ]: *** Exception Errno::EBADF (Bad file descriptor) (process 6510, thread 0x00007f83639eb6f8(HTTP helper worker))

Oddly, no one else working on this application saw these errors while running the app on their machines, even though as far as we could tell, we were all running pretty much the exact same environment.

The workaround

I found that I could get this issue to stop manifesting on my machine by appending the following parameter to my Passenger server start-up command:

--spawn-method direct

This resulted in the complete Passenger start command for the application looking like:

bundle exec passenger start --port 3000 --max-pool-size 4 --min-instances 4 --spawn-method direct


The key to coming up with this workaround was an article on the Phusion Passenger website: Spawn methods explained (for Ruby developers).

That article talks about how, in order to save memory and improve start-up time, Passenger uses a "spawn method" setting of "smart" for Ruby applications by default. That setting accomplishes that by having multiple server processes share certain objects in memory.

A caveat of the "smart" spawn mode noted by the article is unintentional file descriptor sharing. This caught my attention: research I'd done on the "EBADF (Bad file descriptor)" error noted that it can be caused when a file descriptor -- that is, a handle pointing to a resource like a local file or a network socket -- is closed, but then accessed again after being closed. If such a handle were being unintentionally shared between web server processes, it does seem entirely plausible that one process could close a handle, then the other process could try to use it for something.
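
To make the "closed, then used again" failure mode concrete, here is a minimal Python sketch. It is an illustration only, not code from Passenger or from the app, and the function name is made up. It runs in a single process, so it shows only how the operating system reports a write to a descriptor that has already been closed; Ruby surfaces the same OS error code as Errno::EBADF.

```python
import errno
import os


def write_after_close() -> int:
    """Close a descriptor, then try to use it; return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # the descriptor number is now invalid in this process
    try:
        os.write(write_fd, b"x")  # use-after-close
    except OSError as exc:
        return exc.errno
    finally:
        os.close(read_fd)
    return 0  # only reached if the write unexpectedly succeeds


if __name__ == "__main__":
    code = write_after_close()
    # Print the symbolic name and the OS description of the error code.
    print(errno.errorcode.get(code, code), os.strerror(code))
```

This does not identify which handle was at fault in my environment; it only shows the class of error that the log message points to.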

It seemed logical that configuring Passenger not to share objects between processes, by using the "direct" spawn mode setting instead of the "smart" setting, would avoid this. Sure enough, using the "direct" setting did immediately stop the "Incomplete response received from application" errors I was seeing in my local environment.  Additionally, running the app on my local machine (with myself as the only user), I didn't observe any meaningful degradation in app performance.
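
As a rough analogy for the difference between the two settings, the Python sketch below shows general Unix fork behavior, not Passenger's internals. A descriptor opened before a fork is shared by parent and child, including its read position, while descriptors opened after the fork are independent. The function name and the temporary-file setup are hypothetical, and the code assumes a Unix-like system where os.fork is available.

```python
import os
import tempfile


def parent_read_after_child(open_before_fork: bool) -> bytes:
    """The child reads 5 bytes; return the 5 bytes the parent reads next."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"helloworld")
        path = tmp.name
    try:
        # "Shared" case: open once, before forking, so both processes
        # inherit the same open file (and the same read offset).
        shared_fd = os.open(path, os.O_RDONLY) if open_before_fork else None
        pid = os.fork()
        if pid == 0:
            # Child process: read the first 5 bytes, then exit immediately.
            fd = shared_fd if open_before_fork else os.open(path, os.O_RDONLY)
            os.read(fd, 5)
            os._exit(0)
        os.waitpid(pid, 0)  # wait until the child has finished reading
        # "Independent" case: the parent opens its own descriptor.
        fd = shared_fd if open_before_fork else os.open(path, os.O_RDONLY)
        data = os.read(fd, 5)
        os.close(fd)
        return data
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("opened before fork:", parent_read_after_child(True))
    print("opened after fork: ", parent_read_after_child(False))
```

In the shared case the child's read moves the parent's position too, which is the kind of cross-process interference that opening resources separately in each process avoids.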

What I'm still unsure about at this point is exactly which handle was being incorrectly shared between processes, and why no one else working on this application was seeing the same error manifest in their environment. Thus, I'm considering this a "workaround" rather than a "solution" for the time being.  (And I'm not committing a change to have the app use the "direct" spawn mode in any environment other than my own.)

Other solutions attempted

Prior to finding the workaround, as all signs pointed to this being some kind of environmental issue with my local machine, I tried to "fix" my environment in quite a few different ways:
  • Use a different client browser
  • Restart the Passenger web server
  • Reboot 
  • Reinstall Ruby 2.6.3
  • Revert to an older Ruby version (2.5.3) using rvm (and uninstall 2.6.3)
  • Reinstall gems via bundle install --force
  • Clone a fresh copy of the Rails application code itself from the remote repository
  • Run the application on a different port (other than 3000)
  • Update the database (MySQL) to the latest patch version
  • Increase the limit of open files per process via ulimit -n 8192
  • Increase the database pool size in database.yml
None of those attempted fixes made a difference; the "Incomplete response received from application" response to HTTP requests and the "Errno::EBADF (Bad file descriptor)" message in the server log both continued to manifest.

I had initially suspected the Ruby 2.6.3 upgrade to have something to do with the problem, as I had first started noticing the issue around the time I made that upgrade locally. However, as downgrading to 2.5.3 didn't fix the issue -- and others working on this app were running 2.6.3 with no issues -- I ended up concluding that the 2.6.3 upgrade did not seem to be the direct cause of the issue.

I'd very much like to understand what is actually causing this error in my environment. In the meantime, though, I'm very happy to finally have a viable workaround, so I can get work done again!

Wednesday, February 20, 2019

Throttling Skype bandwidth on macOS

I'm a full-time remote worker, with all of my teammates located together in the office. To help maintain close communication with my team, I have a "telepresence" Skype call with the office that's always on, so my teammates and I can see and talk to one another.

We're currently using Microsoft's Skype as our videoconferencing solution, as it is the only one I've found that (1) works reasonably reliably; (2) supports auto-answer (so I can activate the call without someone in the office needing to manually answer it every day); and (3) is free.

Unfortunately, as with many Americans -- particularly in localities where only a single company provides high-speed land-line Internet -- my Internet service provider recently started imposing a bandwidth cap on my monthly Internet usage. My cap is 1000 GB per month. Between my job, and my family's normal Internet usage, we've been either coming close to, or exceeding, that cap on a regular basis.

As one part of a strategy to try and keep my home's Internet usage under the cap, I looked into how much bandwidth that always-on Skype call was using. The Network tab of Activity Monitor on my Mac showed that it was quite a bit: an average of around 350 KB per second (with peaks over 400 KB/sec).

Over the course of the typical 6 hours per day when my team (Pacific Time / GMT -8) and I (Eastern Time / GMT -5) are both online, that works out to about 7.5 GB per day, which in turn works out to about 150 GB over a 4-week period in which I'm working 5 days per week.
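
The arithmetic behind those figures can be checked with a few lines of Python. This is a sketch that assumes decimal units (1 GB = 1,000,000 KB) and a steady average rate; the function name is mine.

```python
def usage_gb(rate_kb_per_sec: float, hours_per_day: float, days: int = 1) -> float:
    """Data transferred, in decimal GB (assumes 1 GB = 1,000,000 KB)."""
    seconds = hours_per_day * 3600 * days
    return rate_kb_per_sec * seconds / 1_000_000


if __name__ == "__main__":
    # 350 KB/sec average, 6 hours per day of overlap with the office.
    print(f"per day:         {usage_gb(350, 6):.2f} GB")
    # 4 weeks x 5 workdays = 20 days.
    print(f"per 20 workdays: {usage_gb(350, 6, days=20):.1f} GB")
```

The exact results come out slightly above the rounded 7.5 GB and 150 GB quoted above.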

Unfortunately, macOS 10.14 Mojave does not provide an out-of-the-box way to limit the bandwidth used by a particular application.

Skype for Mac (version 8.39) itself also doesn't provide a quality slider or other rate limiter; it always appears to consume as much as it can.

Complicating matters, Skype 8 also doesn't specify or allow configuration of which specific TCP ports it uses, for possible QoS throttling at the router.  Per a page on the Skype support site, Skype might be using TCP ports anywhere in the range 1000 through 65000.

After a fair amount of research and dead ends, the working solution that I finally landed on was the free version of a software package named Murus, which provides a GUI wrapper around the Packet Filter firewall functionality that comes with macOS.

My Murus configuration is based on a very helpful post on the Murus forum by user "hany".  The variant of hany's instructions that I used is as follows:

  1. Install Murus.
  2. In the main Murus window, in the "Services Library" pane, add a new custom service named "Skype" (or whatever you like).
  3. In the Ports field, enter 10000:22465, and then on a new line, 22467:65000.  (I excluded port 22466 because per Slack Support, that port is used for Slack calls, and I wanted those to remain highest quality.)
  4. Drag the new service to the "Managed Inbound Services" pane.
  5. On your new Skype service, click the little speedometer-looking icon to bring up the Bandwidth dialog.
  6. Set your desired values for the Upload Bandwidth and Download Bandwidth. I found that values of 768 Kb/second resulted in very usable (if slightly blurry) Skype calls.
  7. Press Cmd-S (or select Firewall > Save Configuration from the top menu).
  8. Click the Play icon button in the top toolbar of the Murus window to start limiting bandwidth. (Press the Stop icon button later to turn the throttling off again.)
  9. Watch the bandwidth usage in Activity Monitor to confirm the change. I have found that I sometimes need to quit and restart Skype before the new limits take effect.
This is admittedly a non-ideal, somewhat brute-force solution, since it affects not only Skype, but any application running on the computer using TCP or UDP ports in the 10000+ range.
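
To see which ports the two ranges above actually cover, here is a small Python sketch. It assumes that Murus treats start:end as an inclusive range, as the underlying Packet Filter port syntax does; the names are hypothetical.

```python
# Ranges as entered in the Murus "Ports" field, written as (start, end).
# Assumption: both ends are inclusive, as with pf's "start:end" syntax.
THROTTLED_RANGES = [(10000, 22465), (22467, 65000)]


def is_throttled(port: int) -> bool:
    """Return True if traffic on this port falls under the bandwidth limit."""
    return any(start <= port <= end for start, end in THROTTLED_RANGES)


if __name__ == "__main__":
    for port in (443, 10000, 22466, 22467, 65000):
        print(port, "throttled" if is_throttled(port) else "not throttled")
```

Port 22466 (the Slack calls port) falls in the gap between the two ranges, and anything below 10000 is left alone.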

However, it has proven to be effective: With this Murus configuration running, Activity Monitor shows my incoming (download) Skype bandwidth dropped to around 80 KB/sec.

That works out to about 34.5 GB of usage over a typical 6-hour-day, 20-day work month -- a savings of some 115 GB per month compared with running unthrottled, or about 12% of my imposed 1000 GB/month cap.  Not incredible, but not terrible, either!
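
For anyone wanting to plug in their own numbers, the savings estimate can be sketched in Python as follows. It assumes decimal units (1 GB = 1,000,000 KB), steady average rates, and my 6-hour, 20-day schedule; the function names are mine.

```python
SECONDS_PER_HOUR = 3600
KB_PER_GB = 1_000_000  # decimal units (an assumption)


def monthly_gb(rate_kb_per_sec: float, hours_per_day: float = 6,
               workdays: int = 20) -> float:
    """Monthly usage in GB at a steady average rate."""
    return rate_kb_per_sec * hours_per_day * SECONDS_PER_HOUR * workdays / KB_PER_GB


def savings(before_kb_per_sec: float, after_kb_per_sec: float,
            cap_gb: float = 1000) -> tuple[float, float]:
    """Return (GB saved per month, savings as a percentage of the cap)."""
    saved = monthly_gb(before_kb_per_sec) - monthly_gb(after_kb_per_sec)
    return saved, 100 * saved / cap_gb


if __name__ == "__main__":
    saved_gb, percent_of_cap = savings(350, 80)
    print(f"saved {saved_gb:.1f} GB/month ({percent_of_cap:.1f}% of the cap)")
```

The unrounded result is a little higher than the 115 GB quoted above, because that figure starts from the rounded 150 GB unthrottled estimate.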

Thank you to the Murus team for the software; to "hany" for the post on the Murus forum that helped me with this solution... and to my local ISP monopoly for the opportunity to undertake this interesting learning experience!