User:LukasLalinsky/LiveLuceneUpdate

From MusicBrainz Wiki
Jump to navigationJump to search

Just an idea about live updating of Lucene indexes. This assumes we have up-to-date DB available, so it would either need to talk to the main DB server or use some instant replication system.

  1. Create table:
    CREATE TABLE pending_lucene_index (    id SERIAL PRIMARY KEY,    operation CHAR(1),    index VARCHAR(100),    entityid INT);
  2. Create a trigger on each table we are interested in:
    -- ------------------------------------------------------------------------------------- Artist-- -----------------------------------------------------------------------------------CREATE OR REPLACE FUNCTION li_ins_artist() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('I', 'artist', NEW.id);    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE OR REPLACE FUNCTION li_upd_artist() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('U', 'artist', OLD.id);    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE OR REPLACE FUNCTION li_del_artist() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('D', 'artist', OLD.id);    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE TRIGGER li_ins_artist AFTER INSERT ON artist     FOR EACH ROW EXECUTE PROCEDURE li_ins_artist();CREATE TRIGGER li_upd_artist AFTER UPDATE ON artist    FOR EACH ROW EXECUTE PROCEDURE li_upd_artist();CREATE TRIGGER li_del_artist AFTER DELETE ON artist    FOR EACH ROW EXECUTE PROCEDURE li_del_artist();-- ------------------------------------------------------------------------------------- Artist Alias-- -----------------------------------------------------------------------------------CREATE OR REPLACE FUNCTION li_ins_artistalias() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('I', 'artist', NEW.ref);    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE OR REPLACE FUNCTION li_upd_artistalias() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('U', 'artist', OLD.ref);    IF NEW.ref != OLD.ref    THEN        INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('U', 'artist', NEW.ref);    END    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE OR REPLACE FUNCTION li_del_artistalias() RETURNS TRIGGER AS 'BEGIN    INSERT INTO pending_lucene_index (operation, index, entityid) VALUES ('D', 'artist', OLD.ref);    NOTIFY pending_lucene_index;    return NULL;END;' LANGUAGE 'plpgsql';CREATE TRIGGER li_ins_artistalias AFTER INSERT ON artist alias    FOR EACH ROW EXECUTE PROCEDURE li_ins_artistalias();CREATE TRIGGER li_upd_artistalias AFTER UPDATE ON artistalias    FOR EACH ROW EXECUTE PROCEDURE li_upd_artistalias();CREATE TRIGGER li_del_artistalias AFTER DELETE ON artistalias    FOR EACH ROW EXECUTE PROCEDURE li_del_artistalias();
  3. Have an application running in background, that uses
    LISTEN pending_lucene_index
    and
    notify = dbh.getnotify()
    to get notified about DB changes and on every change updates the index by selecting from the pending_lucene_index and using updateDocument, addDocument and deleteDocuments. For adding and updating documents can be easily used the current code for building indexes from scratch.