Hi
I’m trying to load data via a pipeline and I used to do this without a stored procedure, but now I need the use of the stored procedure. However, the pipeline now fails claiming I have invalid json. Previously it didn’t complain, but removed the characters not supported by the character set (not desired behavior, but I didn’t notice).
I’m getting the following error:
Invalid JSON value for column 'search_entry'
My sproc is defined as:
delimiter //
create or replace procedure search_loader_proc(batch query(client_id bigint, search_entry json collate utf8mb4_general_ci))
as
begin
-- At some point we'll do something here, but now it just fails immediately
end //
delimiter ;
The pipeline is defined as:
create pipeline search_loader_pl as
load data
s3 '{...}'
config '{...}'
batch_interval 5000
into procedure search_loader_proc
format json
(
`search_entry` <- `search` default null,
`client_id` <- `client_id`
);
And the file that fails contains:
{"client_id":188502,"search":{"id":6860607798000749872,"term":"718134926477854cn","timestamp":1550870738000}}
{"client_id":188502,"search":{"id":6680607797497681070,"term":"dlx640481151nl","timestamp":1550871640000}}
{"client_id":188502,"search":{"id":8660607795855117842,"term":"dlm641035354cn","timestamp":1550872279000}}
{"client_id":188502,"search":{"id":6660807796649241451,"term":"4en030179682jp","timestamp":1550872489000}}
{"client_id":188502,"search":{"id":6660807796704819018,"term":"😊cx002632257no","timestamp":1550872563000}}
{"client_id":188502,"search":{"id":6660607798304807551,"term":"jlf484089544cn","timestamp":1550872697000}}
{"client_id":188502,"search":{"id":6660607795206686679,"term":"xrs331252679lv","timestamp":1550869311000}}
The specific line that fails is:
{"client_id":188502,"search":{"id":6660807796704819018,"term":"😊cx002632257no","timestamp":1550872563000}}
So I’m assuming its the emoji, but I’ve defined the search_entry
parameter to be utf8mb4_general_ci
. Any guidance on how to overcome this?