JoelJacobson 11 hours ago

> The problem with Postgres' NOTIFY is that all notifications go through a single queue!

> Even if you have 20 database connections making 20 transactions in parallel, all of them need to wait for their turn to lock the notification queue, add their notification, and unlock the queue again. This creates a bottleneck especially in high-throughput databases.

We're currently working hard on optimizing LISTEN/NOTIFY: https://www.postgresql.org/message-id/flat/6899c044-4a82-49b...

If you have an actual workload where you are currently experiencing performance or scalability problems, I would be interested in hearing from you, to better understand it. In some workloads, you might only listen on a single channel. For such single-channel workloads, the current implementation seems hard to tweak further, given the semantics and in-commit-order guarantees. However, for multi-channel workloads, we could do a lot better, which is what the linked patch is about.

The main problem with the current implementation for multi-channel workloads is that we currently signal and wake all listening backends (a backend is the PostgreSQL process your client is connected to), even if they are not interested in the specific channels being notified in the current commit. This means that if you have 100 open connections, each of whose client has done a LISTEN on a different channel, then when someone does a NOTIFY on one of those channels, all 100 backends will be signaled, instead of just the one backend that listens on that channel. For multi-channel workloads, this can mean an enormous extra cost from the context switching caused by the signaling.
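
To make the multi-channel pattern concrete, here is a minimal sketch (session counts and channel names are illustrative):

    -- Sessions 1..100: each connection listens on its own channel.
    LISTEN channel_1;   -- session 1
    LISTEN channel_2;   -- session 2
    -- ... up to channel_100 in session 100.

    -- Some session commits a NOTIFY on just one of those channels:
    NOTIFY channel_42, 'payload';
    -- or equivalently:
    SELECT pg_notify('channel_42', 'payload');

    -- Only session 42 is interested in this notification, but the current
    -- implementation signals and wakes all 100 listening backends on commit.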

I would greatly appreciate it if you could reply to this comment and share the workloads where you've had problems with LISTEN/NOTIFY: approximately how many listening backends you had, how many channels, and the mix of volume across those channels. Anything that could help us build more realistic simulations of such workloads would improve the benchmark tests we're working on. Thank you.

JoelJacobson 9 hours ago | parent | next [-]

Here is the Commitfest entry if you want to help with reviewing/development/testing of the patch: https://commitfest.postgresql.org/patch/6078/

barrell 3 hours ago | parent | prev | next [-]

I have listen/notify on most changes in my database. I'm not sure I've experienced any performance issues, though I can't say I've been putting things through their paces. IMHO listen/notify's simplicity outweighed the perf gains of a WAL-based approach.

I'm only sharing this should it be helpful:

  def up do
    whitelist = Enum.join(@user_columns ++ ["tick"], "', '")

    execute """
    CREATE OR REPLACE FUNCTION notify_phrasing() RETURNS trigger AS $$
    DECLARE
      notif jsonb;
      col_name text;
      col_value text;
      uuids jsonb := '{}'::jsonb;
      user_columns text[] := ARRAY['#{whitelist}'];
    BEGIN
      -- First, add all UUID columns
      FOR col_name IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_name = TG_TABLE_NAME AND data_type = 'uuid'
      LOOP
        EXECUTE format('SELECT ($1).%I::text', col_name)
        INTO col_value
        USING CASE WHEN TG_OP = 'DELETE' THEN OLD ELSE NEW END;

        IF col_value IS NOT NULL THEN
          uuids := uuids || jsonb_build_object(col_name, col_value);
        END IF;
      END LOOP;

      -- Then, add user columns if they exist in the table
      FOREACH col_name IN ARRAY user_columns
      LOOP
        IF EXISTS (
          SELECT 1
          FROM information_schema.columns
          WHERE table_name = TG_TABLE_NAME AND column_name = col_name
        ) THEN
          EXECUTE format('SELECT ($1).%I::text', col_name)
          INTO col_value
          USING CASE WHEN TG_OP = 'DELETE' THEN OLD ELSE NEW END;

          IF col_value IS NOT NULL THEN
            uuids := uuids || jsonb_build_object(col_name, col_value);
          END IF;
        END IF;
      END LOOP;

      notif := jsonb_build_object(
        'table', TG_TABLE_NAME,
        'event', TG_OP,
        'uuids', uuids
      );

      PERFORM pg_notify('phrasing', notif::text);
      RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;
    """

    # Create trigger for each table
    Enum.each(@tables, fn table ->
      execute """
      CREATE TRIGGER notify_phrasing__#{table}
      AFTER INSERT OR UPDATE OR DELETE ON #{table}
      FOR EACH ROW EXECUTE FUNCTION notify_phrasing();
      """
    end)
  end

I do react to most (70%) of my database changes in some way, shape, or form, and post them to a PubSub topic with the uuids. All of my dispatching can be done off of uuids.
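
For reference, the consuming side of this is just a LISTEN on that one channel; a hypothetical psql session (table and column names illustrative) would see:

    LISTEN phrasing;

    -- After a write to any of the triggered tables, the session receives a
    -- notification whose payload looks roughly like:
    --   {"table": "phrases", "event": "UPDATE",
    --    "uuids": {"id": "...", "user_id": "..."}}
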
tobyhinloopen 9 hours ago | parent | prev | next [-]

We use it like this:

    CREATE TRIGGER notify_events_trg AFTER INSERT ON xxx.events FOR EACH ROW EXECUTE PROCEDURE public.notify_events();

    CREATE FUNCTION public.notify_events() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
    BEGIN
      PERFORM pg_notify('events', row_to_json(NEW)::text);
      RETURN NEW;
    END;
    $$;

And then we have a bunch of triggers like this on many tables:

    CREATE TRIGGER create_category_event_trg AFTER INSERT OR DELETE OR UPDATE ON public.categories FOR EACH ROW EXECUTE PROCEDURE public.create_category_event();

    CREATE FUNCTION public.create_category_event() RETURNS trigger
        LANGUAGE plpgsql SECURITY DEFINER
        AS $$
    DECLARE
      category RECORD;
      payload JSONB;
    BEGIN
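      -- NEW is null for DELETE and OLD is null for INSERT; take whichever exists.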
      category := COALESCE(NEW, OLD);
      payload := jsonb_build_object('id', category.id);
      IF NEW IS NULL OR NEW.deleted_at IS NOT NULL THEN
        payload := jsonb_set(payload, '{deleted}', 'true');
      END IF;
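
      -- Funnel every table's events through xxx.events; its AFTER INSERT
      -- trigger (above) performs the actual pg_notify.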
      INSERT INTO xxx.events (channel, inserted_at, payload)
        VALUES ('category', NOW() AT TIME ZONE 'utc', payload);
      RETURN NULL;
    END;
    $$;

We found no notable performance issues. We have a single LISTEN in another application. We did some stress testing and found that it performs way better than we would ever need.
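
A minimal sketch of that single listening connection, assuming the triggers above (the UPDATE and its column values are illustrative):

    -- The one listening connection in the other application:
    LISTEN events;

    -- Any write that fires a per-table trigger, e.g.:
    UPDATE public.categories SET deleted_at = NOW() WHERE id = 42;

    -- ...inserts a row into xxx.events, whose AFTER INSERT trigger then
    -- notifies the listener with the events row as JSON, roughly:
    --   {"channel": "category", "inserted_at": "...",
    --    "payload": {"id": 42, "deleted": true}}
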
JoelJacobson 9 hours ago | parent [-]

Thanks for the report. For that use-case (a single application using a single connection with a LISTEN), it's expected that it should perform well, since there is only a single backend to context-switch to when each NOTIFY signals it.

oulipo2 8 hours ago | parent [-]

Just out of curiosity, could you try to frame in which contexts this would or would not work? For instance, if you have multiple backends with multiple connections? And if we start with such a "simple" solution and later need to scale to distributed backends, how should we do that?

JoelJacobson 7 hours ago | parent [-]

In the linked "Optimize LISTEN/NOTIFY" pgsql-hackers thread, I've shared a lot of benchmark results for different workloads, including results for how PostgreSQL currently works (this is "master" in the benchmark results), which can help you better understand what to expect for different workloads.

The work-around we used at Trustly (a company I co-founded) is a component named `allas`, created by a colleague of mine at the time, Marko Tikkaja, to solve our problems; it massively reduced the load on our servers. Marko has open-sourced and published this work here: https://github.com/johto/allas

Basically, `allas` opens a single connection to PostgreSQL, on which it LISTENs on all the channels it needs. Clients then connect to `allas` over the PostgreSQL protocol, so it essentially impersonates a PostgreSQL server; when a client does LISTEN on a channel via allas, allas issues the LISTEN for that channel on the real PostgreSQL server over its single connection. Since `allas` is implemented in Go, using Go's efficient goroutines for concurrency, it scales efficiently to lots and lots of client connections. I'm not a Go expert myself, but I understand Go is quite well suited for this type of application.
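
A hypothetical illustration of the fan-in (channel names made up):

    -- Client 1 connects to allas over the PostgreSQL protocol and runs:
    LISTEN device_alerts;
    -- Client 2, also connected to allas:
    LISTEN user_events;

    -- allas holds a single real connection to PostgreSQL, on which it has
    -- issued both LISTENs itself:
    LISTEN device_alerts;
    LISTEN user_events;

    -- So no matter how many clients subscribe via allas, PostgreSQL only
    -- ever signals allas's one listening backend; allas fans each
    -- notification out to the interested clients.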

This component is still in use at Trustly, and is battle-tested and production-grade.

That said, it would of course be much better to avoid the need for a separate component, and fix the scalability issues in core PostgreSQL, so that's what I'm currently working on.

phrotoma 3 hours ago | parent [-]

For folks like myself, who don't know much about DB internals but know a bit about Kubernetes, this sounds very akin to the K8s API watch cache.

https://danielmangum.com/posts/k8s-asa-watching-and-caching/

oulipo2 9 hours ago | parent | prev [-]

The post seems to say that NOTIFY is generally not a good idea, and then the comments here say that NOTIFY can actually work, but that it depends on some particular things (which are not easy for newcomers to Postgres to know). That makes it a bit complicated to know what the way to go is for a new database.

In my case I have an IoT setting, where my devices can change their "DesiredState", and I want to listen for this to push a message to MQTT... but there might also be other cases where I want to listen for messages elsewhere (e.g. do something when there is an alert on a device, or listen on some unrelated object, e.g. users, etc.).

I'm not clear right now on what the best setup for this would be, the tradeoffs, etc.

Imagine I have somewhere in the range of 100k to 10M devices, and that sometimes they are updated in bulk, changing their DesiredState 10k at a time. Would NOTIFY work in that case? Should I use the WAL/Debezium/etc. instead?

Can you try to "dumb down" in which cases we can use NOTIFY/LISTEN and in which cases it's best not to? You're saying something about single-channel/multi-channel/etc., but as a newcomer I'm not clear on what all these are.