We are creating a pipeline that uses an Avro schema to ingest data from Kafka into tables. Since Avro gives us the flexibility to evolve our schema, we are looking for a similar feature in pipelines.
But with ALTER PIPELINE we noticed that there is no option to update the schema; we can only set offsets or transforms. So the only way we can see to achieve this is:
1) Stop the pipeline.
2) Fetch the offsets from all partitions.
3) Create a new pipeline with the updated schema.
4) Alter the new pipeline to use the previous pipeline's offsets.
5) Drop the old pipeline.
6) Start the new pipeline.
This looks tedious, especially maintaining the offsets. Does MemSQL have any plans to support schema evolution, or is there a better way to achieve this use case?
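The six steps above can be sketched as a small Python helper that only builds the SQL statements, so they can be reviewed before being run. This is a rough sketch, not a tested procedure: the helper names are hypothetical, the CREATE PIPELINE text is supplied by the caller, and it assumes the per-partition offsets were read beforehand (for example from information_schema.PIPELINES_CURSORS) and that ALTER PIPELINE ... SET OFFSETS accepts a JSON map of partition id to offset. Whether the stored cursor offset is the last record read or the next one to read should be checked before relying on it.

```python
import json


def offsets_json(cursor_rows):
    """Turn (partition_id, offset) rows into a JSON partition -> offset map.

    Assumption: the rows were fetched while the old pipeline was stopped,
    e.g. from information_schema.PIPELINES_CURSORS.
    """
    return json.dumps({str(p): int(o) for p, o in cursor_rows}, sort_keys=True)


def replacement_statements(old_name, new_name, create_new_sql, cursor_rows):
    """Return the statements for steps 1 and 3-6, in order.

    Step 2 (fetching offsets) is represented by cursor_rows.
    create_new_sql is the caller's CREATE PIPELINE statement carrying
    the updated schema; it is passed through unchanged.
    """
    offsets = offsets_json(cursor_rows)
    return [
        f"STOP PIPELINE {old_name};",                           # step 1
        create_new_sql.rstrip().rstrip(";") + ";",              # step 3
        f"ALTER PIPELINE {new_name} SET OFFSETS '{offsets}';",  # step 4
        f"DROP PIPELINE {old_name};",                           # step 5
        f"START PIPELINE {new_name};",                          # step 6
    ]


if __name__ == "__main__":
    # Hypothetical pipeline names and offsets, for illustration only.
    for stmt in replacement_statements(
        "events_v1", "events_v2",
        "CREATE PIPELINE events_v2 AS ...",
        [(0, 100), (1, 250)],
    ):
        print(stmt)
```

Generating the statements rather than executing them keeps the offset bookkeeping in one place and leaves room to review them by hand before dropping the old pipeline.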
We have exactly the same issue. But the drop/replace solution doesn’t quite work for us, because we would have to replace the pipeline at exactly the point where the old schema version has been drained and before the new one starts.
The scheme we have settled on for now is to create a new topic and a new pipeline for each new version, and alter the table to match it. We let the old version drain in its own time, and then drop it. (Of course, all this assumes that schema changes are backward compatible.)
Agreed, there’s no good way to replace the pipeline at the exact point where the schema changes. For now, working with a new topic is in fact likely to be the best workaround. It’s possible to use a transform to rewrite records of older schemas as instances of the pipeline’s expected schema, replacing the pipeline with a new schema and transform as appropriate, but it’s not simple and carries a performance cost.
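To illustrate the rewriting a transform would have to do, here is a minimal sketch of upgrading an already-decoded record to the pipeline's expected schema. It covers only the mapping step: decoding and re-encoding Avro, and the way a transform receives and emits records, are left out. The function name and example schema are hypothetical, and the logic assumes the only changes are added fields with defaults and removed fields, which is the simple case of Avro schema resolution.

```python
def upgrade_record(record, expected_schema):
    """Rewrite a decoded record as an instance of the expected schema.

    Fields missing from the record take the schema default; fields the
    expected schema does not know about are dropped.
    """
    upgraded = {}
    for field in expected_schema["fields"]:
        name = field["name"]
        if name in record:
            upgraded[name] = record[name]
        elif "default" in field:
            upgraded[name] = field["default"]
        else:
            # No value and no default: the change is not backward compatible.
            raise ValueError(f"record has no value for required field {name!r}")
    return upgraded


if __name__ == "__main__":
    # Hypothetical schema: "email" was added in the newer version.
    schema_v2 = {
        "type": "record",
        "name": "User",
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "name", "type": "string"},
            {"name": "email", "type": ["null", "string"], "default": None},
        ],
    }
    old_record = {"id": 1, "name": "a", "legacy_flag": True}
    print(upgrade_record(old_record, schema_v2))
```

Running this per record is where the performance cost comes from, since every message passes through the extra step before it is loaded.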
We are in fact working on schema registry integration for exactly these reasons. It’ll be ready for release soon after 7.0, though I’m not qualified to give a more precise date.