Skip to main content

ADR 022: Custom BaseApp panic handling

Changelog​

  • 2020 Apr 24: Initial Draft
  • 2021 Sep 14: Superseded by ADR-045

Status​

SUPERSEDED by ADR-045

Context​

The current implementation of BaseApp does not allow developers to write custom error handlers during panic recovery runTx() method. We think that this method can be more flexible and can give Cosmos SDK users more options for customizations without the need to rewrite whole BaseApp. Also there's one special case for sdk.ErrorOutOfGas error handling, that case might be handled in a "standard" way (middleware) alongside the others.

We propose middleware-solution, which could help developers implement the following cases:

  • add external logging (let's say sending reports to external services like Sentry);
  • call panic for specific error cases;

It will also make OutOfGas case and default case one of the middlewares. Default case wraps recovery object to an error and logs it (example middleware implementation).

Our project has a sidecar service running alongside the blockchain node (smart contracts virtual machine). It is essential that node sidecar connectivity stays stable for TXs processing. So when the communication breaks we need to crash the node and reboot it once the problem is solved. That behaviour makes node's state machine execution deterministic. As all keeper panics are caught by runTx's defer() handler, we have to adjust the BaseApp code in order to customize it.

Decision​

Design​

Overview​

Instead of hardcoding custom error handling into BaseApp we suggest using set of middlewares which can be customized externally and will allow developers use as many custom error handlers as they want. Implementation with tests can be found here.

Implementation details​

Recovery handler​

New RecoveryHandler type added. recoveryObj input argument is an object returned by the standard Go function recover() from the builtin package.

type RecoveryHandler func(recoveryObj interface{}) error

Handler should type assert (or other methods) an object to define if object should be handled. nil should be returned if input object can't be handled by that RecoveryHandler (not a handler's target type). Not nil error should be returned if input object was handled and middleware chain execution should be stopped.

An example:

func exampleErrHandler(recoveryObj interface{}) error {
err, ok := recoveryObj.(error)
if !ok { return nil }

if someSpecificError.Is(err) {
panic(customPanicMsg)
} else {
return nil
}
}

This example breaks the application execution, but it also might enrich the error's context like the OutOfGas handler.

Recovery middleware​

We also add a middleware type (decorator). That function type wraps RecoveryHandler and returns the next middleware in execution chain and handler's error. Type is used to separate actual recovery() object handling from middleware chain processing.

type recoveryMiddleware func(recoveryObj interface{}) (recoveryMiddleware, error)

func newRecoveryMiddleware(handler RecoveryHandler, next recoveryMiddleware) recoveryMiddleware {
return func(recoveryObj interface{}) (recoveryMiddleware, error) {
if err := handler(recoveryObj); err != nil {
return nil, err
}
return next, nil
}
}

Function receives a recoveryObj object and returns:

  • (next recoveryMiddleware, nil) if object wasn't handled (not a target type) by RecoveryHandler;
  • (nil, not nil error) if input object was handled and other middlewares in the chain should not be executed;
  • (nil, nil) in case of invalid behavior. Panic recovery might not have been properly handled; this can be avoided by always using a default as a rightmost middleware in the chain (always returns an error');

OutOfGas middleware example:

func newOutOfGasRecoveryMiddleware(gasWanted uint64, ctx sdk.Context, next recoveryMiddleware) recoveryMiddleware {
handler := func(recoveryObj interface{}) error {
err, ok := recoveryObj.(sdk.ErrorOutOfGas)
if !ok { return nil }

return errorsmod.Wrap(
sdkerrors.ErrOutOfGas, fmt.Sprintf(
"out of gas in location: %v; gasWanted: %d, gasUsed: %d", err.Descriptor, gasWanted, ctx.GasMeter().GasConsumed(),
),
)
}

return newRecoveryMiddleware(handler, next)
}

Default middleware example:

func newDefaultRecoveryMiddleware() recoveryMiddleware {
handler := func(recoveryObj interface{}) error {
return errorsmod.Wrap(
sdkerrors.ErrPanic, fmt.Sprintf("recovered: %v\nstack:\n%v", recoveryObj, string(debug.Stack())),
)
}

return newRecoveryMiddleware(handler, nil)
}
Recovery processing​

Basic chain of middlewares processing would look like:

func processRecovery(recoveryObj interface{}, middleware recoveryMiddleware) error {
if middleware == nil { return nil }

next, err := middleware(recoveryObj)
if err != nil { return err }
if next == nil { return nil }

return processRecovery(recoveryObj, next)
}

That way we can create a middleware chain which is executed from left to right, the rightmost middleware is a default handler which must return an error.

BaseApp changes​

The default middleware chain must exist in a BaseApp object. Baseapp modifications:

type BaseApp struct {
// ...
runTxRecoveryMiddleware recoveryMiddleware
}

func NewBaseApp(...) {
// ...
app.runTxRecoveryMiddleware = newDefaultRecoveryMiddleware()
}

func (app *BaseApp) runTx(...) {
// ...
defer func() {
if r := recover(); r != nil {
recoveryMW := newOutOfGasRecoveryMiddleware(gasWanted, ctx, app.runTxRecoveryMiddleware)
err, result = processRecovery(r, recoveryMW), nil
}

gInfo = sdk.GasInfo{GasWanted: gasWanted, GasUsed: ctx.GasMeter().GasConsumed()}
}()
// ...
}

Developers can add their custom RecoveryHandlers by providing AddRunTxRecoveryHandler as a BaseApp option parameter to the NewBaseapp constructor:

func (app *BaseApp) AddRunTxRecoveryHandler(handlers ...RecoveryHandler) {
for _, h := range handlers {
app.runTxRecoveryMiddleware = newRecoveryMiddleware(h, app.runTxRecoveryMiddleware)
}
}

This method would prepend handlers to an existing chain.

Consequences​

Positive​

  • Developers of Cosmos SDK based projects can add custom panic handlers to:
    • add error context for custom panic sources (panic inside of custom keepers);
    • emit panic(): passthrough recovery object to the Tendermint core;
    • other necessary handling;
  • Developers can use standard Cosmos SDK BaseApp implementation, rather that rewriting it in their projects;
  • Proposed solution doesn't break the current "standard" runTx() flow;

Negative​

  • Introduces changes to the execution model design.

Neutral​

  • OutOfGas error handler becomes one of the middlewares;
  • Default panic handler becomes one of the middlewares;

References​