ETS is an exceptional tool that I feel is greatly under-used in the Elixir world.
This talk by Claudio from Erlang Solutions highlights a very good pattern for achieving concurrent reads together with serialized writes using ETS.
```elixir
defmodule Cache do
  use GenServer

  @table :cache
  @name __MODULE__

  def start_link() do
    GenServer.start_link(@name, [], name: @name)
  end

  def init(_) do
    # :public and :read_concurrency are very important
    :ets.new(@table, [:named_table, :public, :read_concurrency])
    {:ok, []}
  end

  def get(key) do
    # Runs in the calling process, not in the GenServer
    case :ets.lookup(@table, key) do
      [{^key, val}] -> val
      [] -> nil
    end
  end

  def set(key, val) do
    GenServer.cast(__MODULE__, {:set, key, val})
  end

  def handle_cast({:set, key, val}, state) do
    :ets.insert(@table, {key, val})
    {:noreply, state}
  end
end
```
Since we have :public and :read_concurrency set in our table options, we can offload the reading of the ETS data to the calling process, leaving the Cache GenServer free to focus solely on inserts.
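To illustrate the split (a minimal sketch, not from the talk): writes are casts that queue up in the Cache mailbox, while reads go straight to the shared table from whatever process calls them. The `:sys.get_state/1` call here is only a demo trick to make sure the earlier cast has been processed before we read.

```elixir
{:ok, _pid} = Cache.start_link()

# Writes are casts: they land in the Cache mailbox and are applied in order.
Cache.set(:answer, 42)

# Synchronously round-trip through the GenServer so we know the cast
# above has been handled (demo only; not needed in real code).
:sys.get_state(Cache)

# Reads never touch the GenServer: this lookup runs in the calling
# process, directly against the :public table.
:ets.lookup(:cache, :answer)
# => [{:answer, 42}]
```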
However, you really need to understand your data for this to be effective: because reads and writes are decoupled, the pattern isn't suited to data that gets read, modified, and written back, as that opens the door to race conditions.
For example, let's say you get a flood of data that needs to be written to the cache, e.g. 10,000,000 structs. Each of these structs contains a tracking counter field of some sort.
Let's say that during these writes, a request comes in to increment the counter on :foo. So, something like this might happen:
```elixir
iex> foo = Cache.get(:foo)
iex> foo.counter
# => 10
iex> foo = %Struct{foo | counter: foo.counter + 1}
# => %Struct{counter: 11, ...}
iex> Cache.set(:foo, foo)
```
Nothing wrong with this in isolation - it all looks fine. The problem appears when you consider what is already sitting in the Cache mailbox.
Let's say that of those 10 million write requests, request #2,900,000 was to do the same thing: increment the :foo counter by one.
Since that first update may not have been applied by the time the second increment request reads the cache, Cache.get(:foo) will still return the old value of 10, when it really should be 11.
This means the cache receives two write requests for :foo, each with a counter value of 11, rather than one setting it to 11 and the next setting it to 12. One increment is silently lost.
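One way to make increments safe under this design (a sketch, not something prescribed by the talk) is to keep plain reads lock-free but route any read-modify-write through the GenServer. Because a call is queued behind every write already in the mailbox, it can never observe a stale counter. The increment/1 function below is a hypothetical addition to the Cache module above, assuming the same @table :cache and the process registered as __MODULE__:

```elixir
# Hypothetical additions to the Cache module from earlier:

def increment(key) do
  GenServer.call(__MODULE__, {:increment, key})
end

# Runs in the Cache process, strictly after every write already queued
# in its mailbox, so the read-modify-write is serialized.
def handle_call({:increment, key}, _from, state) do
  [{^key, val}] = :ets.lookup(@table, key)
  updated = %{val | counter: val.counter + 1}
  :ets.insert(@table, {key, updated})
  {:reply, updated, state}
end
```

For plain integer counters stored as tuple elements (rather than inside a struct), Erlang also offers :ets.update_counter/3, which performs the increment atomically and avoids the GenServer round-trip entirely.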