
Under the Hood

This page describes how peruser works internally and how it handles requests.

The request cycle

The following example shows how a simple request travels from the client to the worker that processes it:

  • Client connects to HTTP port, sends the request
  • Multiplexer receives the connection, reads the request and determines which virtualhost it is for
  • Multiplexer checks if a worker is available for the request, then forwards the connection to it
  • Multiplexer returns to waiting for new connections
  • Worker receives the connection and handles the request
  • If KeepAlive is enabled, the worker listens on the connection for more requests:
    • If the client sends a request for another virtualhost (which the worker cannot handle) within the same connection, the worker sends the connection back to the multiplexer
  • Worker returns to waiting for new connections
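
The hand-back step for keep-alive connections can be illustrated with a small model. This is only a sketch of the flow described above, not peruser's actual code: the `Worker` class, the `multiplex` function and the host names are hypothetical, and a connection is represented as a plain list of (host, path) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical stand-in for a worker bound to one virtualhost."""
    vhost: str
    handled: list = field(default_factory=list)

    def serve(self, requests):
        """Handle requests arriving on one keep-alive connection.

        Returns the requests that must go back to the multiplexer
        (an empty list if the whole connection was served here).
        """
        for i, (host, path) in enumerate(requests):
            if host != self.vhost:
                # Request for another virtualhost: hand the connection back.
                return requests[i:]
            self.handled.append(path)
        return []


def multiplex(requests, workers):
    """Route a connection, re-dispatching whenever a worker hands it back.

    Returns the sequence of virtualhosts the connection was forwarded to.
    """
    trace = []
    pending = list(requests)
    while pending:
        host = pending[0][0]  # the multiplexer reads the request to find the vhost
        trace.append(host)
        pending = workers[host].serve(pending)
    return trace


workers = {
    "a.example": Worker("a.example"),
    "b.example": Worker("b.example"),
}
connection = [("a.example", "/1"), ("a.example", "/2"), ("b.example", "/3")]
print(multiplex(connection, workers))
```

In this model the first worker serves both requests for its own virtualhost and returns the connection only when the third request names a different host.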

Process types

Main process

The main process never handles connections from the outside. It only maintains the children and spawns new ones when required.

This process runs with root privileges, which are required to switch users after forking.

Multiplexer

Multiplexers listen on the public port (:80) and read each request to determine which virtualhost it is for. A multiplexer then passes the request to the server environment (worker pool) assigned to that virtualhost. Multiplexers run under the user/group defined by the "User" and "Group" directives.

If the request is made to an SSL-enabled IP/port, the virtualhost determination is skipped and the socket is passed directly to the worker pool of the first virtualhost defined for that IP/port. This also means that the multiplexer does no SSL handshaking. That is all done by the worker.
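
The pool selection described above can be sketched as a single function. All names here (`select_pool`, the dictionary keys, the pool names) are hypothetical and chosen for illustration. The fallback to the first virtualhost for an unknown host name is an assumption based on Apache's usual name-based virtualhost behaviour, not something stated on this page.

```python
def select_pool(listener, ssl_enabled, host_header, vhosts):
    """Pick the worker pool for a new connection.

    vhosts is an ordered list of dicts with the keys
    'addr' (ip:port), 'name' (server name) and 'pool' (worker pool name).
    """
    candidates = [v for v in vhosts if v["addr"] == listener]
    if not candidates:
        return None
    if ssl_enabled:
        # SSL listener: skip virtualhost determination, since the request
        # cannot be read before the handshake that the worker performs.
        return candidates[0]["pool"]
    for v in candidates:
        if v["name"] == host_header:
            return v["pool"]
    # Assumption: an unknown host falls back to the first virtualhost.
    return candidates[0]["pool"]


vhosts = [
    {"addr": "*:80", "name": "a.example", "pool": "pool_a"},
    {"addr": "*:80", "name": "b.example", "pool": "pool_b"},
    {"addr": "*:443", "name": "shop.example", "pool": "pool_shop"},
    {"addr": "*:443", "name": "mail.example", "pool": "pool_mail"},
]
print(select_pool("*:80", False, "b.example", vhosts))
print(select_pool("*:443", True, None, vhosts))
```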

Processor / worker

A worker is the process in which requests are finally executed. Workers run under the user/group defined in the <Processor> tag.

It receives connections from the multiplexer and processes them.

Internally, each <Processor> tag creates a PROCESSOR type worker in the child table that never gets cleaned up. This means that even if the server limit has been reached, a single worker can still handle requests for its virtualhosts.

Waiting for the processor

In Peruser 0.3.0 and earlier, the multiplexer sent the connection to the worker pool without checking whether any worker was free to receive it. The multiplexer then stayed blocked until a worker became free and received the connection. If the virtualhost has a major problem (e.g. the MySQL server is not responding), no worker may become available until the parent kills one of them when its ExpireTimeout is reached. Meanwhile, every new request to that virtualhost blocks another multiplexer, which gradually brings the server to a halt.

To fix this problem, multiplexers now check whether the pool has a free worker before passing the socket. If there is none, they wait for one to become free, and drop the request if none becomes available in time.

If the workers are busy when the multiplexer starts to pass the connection, it waits for them to finish. The maximum wait time in seconds is calculated by the formula (availability / 100) * ProcessorWaitTimeout, where availability is 100 by default and ProcessorWaitTimeout is a configuration directive (default 5).
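
As a worked example of the formula, the following sketch computes the wait budget for a few availability values. The function name `max_wait_seconds` is hypothetical; only the formula and the two defaults come from the text above.

```python
def max_wait_seconds(availability=100, processor_wait_timeout=5):
    """Maximum time in seconds a multiplexer waits for a free worker."""
    return (availability / 100) * processor_wait_timeout


# With the defaults the multiplexer waits up to the full ProcessorWaitTimeout.
print(max_wait_seconds())

# A pool whose availability has dropped to 50 gets half of that budget.
print(max_wait_seconds(availability=50))

# At availability 0 the multiplexer does not wait at all.
print(max_wait_seconds(availability=0))
```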

If the workers are still busy after the maximum wait time, the multiplexer reduces the availability of the worker pool by 10 and drops the request with error 503 (SERVICE UNAVAILABLE). However, if any of the workers becomes available, the availability is reset to 100 and the request is passed.
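
The effect of this rule is that a pool which keeps failing is given less and less waiting time, so multiplexers return to accepting connections sooner. The sketch below models that bookkeeping. The `WorkerPool` class and its method names are hypothetical, and clamping availability at 0 is an assumption, since this page does not say what happens after ten consecutive failures.

```python
from dataclasses import dataclass


@dataclass
class WorkerPool:
    """Hypothetical model of the per-pool availability bookkeeping."""
    availability: int = 100
    processor_wait_timeout: int = 5

    def max_wait(self):
        """Current wait budget in seconds."""
        return (self.availability / 100) * self.processor_wait_timeout

    def dispatch(self, worker_became_free):
        """Record the outcome of one wait and return 'passed' or 503."""
        if worker_became_free:
            self.availability = 100  # pool is healthy again
            return "passed"
        # Assumption: availability never drops below 0.
        self.availability = max(0, self.availability - 10)
        return 503


pool = WorkerPool()
for _ in range(3):
    # Three requests in a row time out while all workers are busy.
    print(pool.dispatch(worker_became_free=False), pool.availability, pool.max_wait())

# A worker frees up in time: the request is passed and the budget is restored.
print(pool.dispatch(worker_became_free=True), pool.availability, pool.max_wait())
```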