Working with expensive notebooks
marimo provides tools to control when cells run. Use these tools to prevent expensive cells, which may call APIs or take a long time to run, from accidentally running.
Stop execution with mo.stop
Use mo.stop to stop a cell from executing if a condition is met:
# if condition is True, the cell will stop executing after mo.stop() returns
mo.stop(condition)
# this won't be called if condition is True
expensive_function_call()
Use mo.stop with mo.ui.run_button() to require a button press before an expensive cell runs.
Configure how marimo runs cells
Disabling cell autorun
If you habitually work with very expensive notebooks, you can disable automatic execution. With automatic execution disabled, running a cell marks its dependent cells as stale instead of running them automatically.
Disabling autorun on startup
marimo autoruns notebooks on startup, with marimo edit notebook.py behaving analogously to python notebook.py. This can also be disabled through the notebook settings.
Disable individual cells
marimo lets you temporarily disable cells from automatically running. This is helpful when you want to edit one part of a notebook without triggering execution of other parts. See the reactivity guide for more info.
Caching
marimo provides two caching utilities to help you manage expensive computations:
- In-memory caching with mo.cache
- Disk caching with mo.persistent_cache
Both utilities can be used as decorators or context managers.
In-memory caching
Use mo.cache to cache the return values of expensive functions, based on their arguments.
When a function compute_predictions decorated with @mo.cache is called with a value of problem_parameters it hasn't seen, it computes the predictions and stores them in an in-memory cache. The next time it is called with the same parameters, it returns the previously computed value from the cache instead of recomputing the predictions.
Comparison to functools.cache
mo.cache is like functools.cache but smarter. functools will sometimes evict values from the cache when it doesn't need to.
In particular, consider the case when a cell defining a @mo.cache-decorated function re-runs because one of its ancestors ran or a UI element's value changed. mo.cache analyzes the dataflow graph to determine whether the decorated function has changed; if it hasn't, its cache isn't invalidated. In contrast, a functools cache is always invalidated on re-run, because functools has no knowledge of the structure of marimo's dataflow graph.
Conversely, mo.cache knows to invalidate the cache if closed-over variables change, whereas functools.cache doesn't, yielding incorrect cache hits.
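The stale-hit behavior of functools.cache can be reproduced with the standard library alone. In the sketch below, the function name and the threshold variable are made up for illustration; threshold is a module-level variable the function reads, standing in for a closed-over variable.

```python
import functools

threshold = 10  # variable from the enclosing scope that the function reads


@functools.cache
def is_above_threshold(x):
    # functools.cache keys only on the argument x, so it cannot notice
    # that `threshold` has changed between calls.
    return x > threshold


first = is_above_threshold(15)   # computed and cached: 15 > 10
threshold = 20                   # the enclosing-scope variable changes
second = is_above_threshold(15)  # cache hit: the old answer is returned
fresh = is_above_threshold.__wrapped__(15)  # bypass the cache: 15 > 20

print(first, second, fresh)
```

Running this prints True True False: the second call returns the answer computed under the old threshold, which is the kind of incorrect cache hit described above.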
mo.cache is slightly slower than functools.cache, but in most applications the overhead is negligible. For performance-critical code, where the decorated function will be called in a tight loop, prefer functools.cache.
Disk caching
Use mo.persistent_cache to cache variables to disk. The next time you run your notebook, the cached variables will be loaded from disk instead of being recomputed, letting you pick up where you left off. Reserve this for expensive computations that you would like to persist across notebook restarts. Cached outputs are automatically saved to __marimo__/cache.
Example.
Roughly speaking, mo.persistent_cache registers a cache hit when the cell is not stale, meaning its code hasn't changed and neither have its ancestors. On a cache hit, the code block won't execute; instead, its variables are loaded into memory.
Lazy-load expensive UIs
Lazily render UI elements that are expensive to compute using marimo.lazy.
For example,
In this example, mo.ui.table(data) will not be rendered on the frontend until it is in the viewport.
For example, an element can be out of the viewport due to scroll, inside a tab that is not selected, or inside an accordion that is not open.
However, in this example, data is eagerly computed, while only the rendering of the table is lazy. It is possible to lazily compute the data as well: see the next example.
import marimo as mo

def expensive_component():
    import time
    time.sleep(1)  # simulate slow work
    # `db` is assumed to be a database client defined elsewhere
    data = db.query("SELECT * FROM data")
    return mo.ui.table(data)

accordion = mo.accordion({
    "Charts": mo.lazy(expensive_component)
})
In this example, we pass a function to mo.lazy instead of a component. This function will only be called when the user opens the accordion. In this way, expensive_component is lazily computed and we only query the database when the user needs to see the data. This can be useful when the data is expensive to compute and the user may not need to see it immediately.