I am creating some procedures I would like to call in a pipeline and I am doing some testing about joining queries in procedures but it is not working as expected.
The setResultTypeId procedure will receive the data from the pipeline and check if the data is already present into the table and do some extra computing on the records
delimiter //
CREATE OR REPLACE PROCEDURE setResultTypeId(result_name_query Query(result_name Text)) AS
DECLARE
result_tbl QUERY(id BIGINT(19), name Text) =
SELECT rt.id, rnq.NAME as name
FROM RESULT_TYPES rt
right outer join result_name_query rnq on rt.name = rnq.name
where isnull(id);
records ARRAY(RECORD(id BIGINT(19), name Text)) = collect(result_tbl);
BEGIN
echo select * from result_tbl;
FOR rec IN records LOOP
/* extra code will be set here to create missing records , using an insert is not enough.
INSERT IGNORE INTO RESULT_TYPES(NAME, CREATED_AT) VALUES(rec.name, now());
id = last_insert_id();
call GenerateGram(rec.name,id,1);
*/
END LOOP;
END //
The test procedure is to simulate the pipeline call
delimiter //
CREATE OR REPLACE PROCEDURE test() AS
DECLARE
result_name_query Query(result_name Text) = select name from temp_r_name;
BEGIN
call setResultTypeId(result_name_query);
END //
call test();
The echo returns no row and no error
id name
but when I am doing it manually
select name from temp_r_name
name
abcd
efgh
select id, name from RESULT_TYPES
id name
1004 abcd
1006 pIC50
SELECT rt.id, rnq.NAME as name
FROM RESULT_TYPES rt
right outer join temp_r_name rnq on rt.name = rnq.name
where isnull(id);
id name
efgh
Maybe I am doing something wrong, but I can’t find some insight from the documentation
I did more testing by changing the query. When I remove the right outer join and replace by a union + a group by, it is working but I have some doubts about the performance. Then I guess it is a bug or a feature that needs to be implemented.
delimiter //
CREATE OR REPLACE PROCEDURE setResultTypeId(result_name_query Query(result_name Text)) AS
DECLARE
result_tbl QUERY(name Text) =
select name from (
select max(id) as id , name from (
SELECT rt.ID as id , rt.NAME as name
FROM RESULT_TYPES rt
UNION
select null as id, rnq.result_name as name
from result_name_query rnq) group by name) where isnull(id);
records ARRAY(RECORD(name Text)) = collect(result_tbl);
BEGIN #echo select * from result_name_query;
echo select * from result_tbl;
FOR rec IN records LOOP
/* extra code will be set here to create missing records , using an insert is not enough.
INSERT IGNORE INTO RESULT_TYPES(NAME, CREATED_AT) VALUES(rec.name, now());
id = last_insert_id();
call GenerateGram(rec.name,id,1);
*/
END LOOP;
END //
It looks like you found a bug. I’ll file an internal ticket for it. The isnull() applied to the fields filled in as NULL by the right outer join is kind of a corner case. Consider a query rewrite, like a NOT EXISTS or NOT IN to get a similar result. Sounds like you already found a different one.
result_tbl QUERY(id BIGINT(19), name Text) =
SELECT rt.id, rnq.NAME as name
FROM RESULT_TYPES rt
right outer join result_name_query rnq on rt.name = rnq.name
where isnull(id);
was that rnq.NAME and rnq.name should probably be rnq.result_name
Ideally, MemSQL should had thrown an error here saying result_name_query has no field named name, but it didn’t and this is a bug we are fixing. You can proceed with your work by correcting the field name and I think the query then should work fine.