I recently saw that utf8mb4 support is listed as an exception to MySQL compatibility. This means data containing emojis (which our data set does) returns an error on load.
Are there plans on the roadmap to support full utf8mb4 and, as a result, emojis?
I too really miss this feature.
Our workaround is to escape Unicode characters outside the supported range before inserting and reverse the process when fetching.
This is not on the near-term roadmap but we have heard about this and are considering it for a future release. Thanks for the feedback!
mpskovvang how exactly do you do the escaping now?
Thanks Hanson!
The one workaround we’ve found that works is to encode the data in Base64 then decode on read in MemSQL.
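A minimal sketch of that Base64 round trip in Python, assuming the encoding and decoding both happen in application code rather than in the database. The function names `to_storable` and `from_storable` are hypothetical:

```python
import base64


def to_storable(text: str) -> str:
    """Encode arbitrary Unicode text as ASCII-only Base64 before inserting it."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_storable(stored: str) -> str:
    """Reverse to_storable() after reading the column back."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")


original = "Great service \U0001F600"
stored = to_storable(original)

# The stored value is plain ASCII, so a 3-byte utf8 column can hold it.
assert stored.isascii()
assert from_storable(stored) == original
```

The trade-offs: Base64 grows the stored value by roughly a third, and the encoded column can no longer be searched or compared as text.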
Sorry, a bit late…
I only encode/decode the unsupported Unicode characters.
This actually works really well. The only real drawback is the byte size. I can even perform a FULLTEXT search for emojis as long as I encode the query string first.
My Unicode class:
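<?php

namespace App;

class Unicode
{
    // Escape characters outside the BMP (U+10000 and above, i.e. 4-byte UTF-8)
    // as JSON-style \uXXXX surrogate pairs.
    public static function encode($string)
    {
        return preg_replace_callback('/[\x{10000}-\x{10FFFF}]+/u', function ($match) {
            // json_encode() wraps the result in double quotes; strip them.
            return str_replace('"', '', json_encode($match[0]));
        }, $string);
    }

    // Turn runs of \uXXXX escapes back into the original characters.
    public static function decode($string)
    {
        return preg_replace_callback('/(\\\u[0-9a-f]{4})+/', function ($match) {
            return json_decode('"' . $match[0] . '"');
        }, $string);
    }
}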
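For anyone not on PHP, here is a rough Python sketch of the same escaping idea. It is not a line-for-line port: the names `escape_astral` and `unescape_astral` are hypothetical, and the decoder is an assumption of mine that only matches valid surrogate pairs, so an ordinary `\u00e9` appearing literally in the text is left alone.

```python
import re

# Characters above U+FFFF need 4 bytes in UTF-8, which is what gets rejected.
_ASTRAL = re.compile("[\U00010000-\U0010FFFF]")
# A high surrogate escape followed by a low surrogate escape, lowercase hex.
_ESCAPED_PAIR = re.compile(r"\\u(d[89ab][0-9a-f]{2})\\u(d[c-f][0-9a-f]{2})")


def escape_astral(text: str) -> str:
    """Replace each 4-byte character with a \\uXXXX\\uXXXX surrogate pair."""
    def repl(match):
        offset = ord(match.group(0)) - 0x10000
        high = 0xD800 + (offset >> 10)
        low = 0xDC00 + (offset & 0x3FF)
        return "\\u%04x\\u%04x" % (high, low)
    return _ASTRAL.sub(repl, text)


def unescape_astral(text: str) -> str:
    """Turn escaped surrogate pairs back into the original characters."""
    def repl(match):
        high = int(match.group(1), 16)
        low = int(match.group(2), 16)
        return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
    return _ESCAPED_PAIR.sub(repl, text)


comment = "Loved it \U0001F600"
stored = escape_astral(comment)          # safe to insert
assert unescape_astral(stored) == comment
```

To search for an emoji, run the search term through `escape_astral` first so it matches what is stored. One caveat that applies to this whole approach: text that already contains such an escaped pair literally will be changed on decode.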
Hi @nick-at and @mpskovvang
I am the PM currently looking at adding emoji (basically utf8mb4) support to MemSQL. Can you guys please email me so we can chat? Martin, I emailed you at your katoni.dk email, by the way.
My email is jliang (at) memsql (dot) com
Is there an ETA for utf8mb4 support?
The project is funded and we will have an estimate soon.
Hi Nikita, any update? We were about to start using MemSQL, but once we found it lacks such a primary feature, we decided to wait until you finish it. I hope you have a specific plan and deadline for when it will be done. As far as I can see, talk about this feature started a long time ago and no one has been able to give a final answer. Thanks.
utf8mb4 support is planned for a MemSQL release targeted for later this year. Work should start on it shortly.
-Adam
We are storing survey results in MemSQL, leveraging the pipeline feature to allow real-time reporting (which works quite nicely), but we have now also come across this problem because someone used an emoji in their response, which led to the comment being cut off.
Is there a release date for this feature? We need to discuss internally whether to write a workaround, but we wouldn’t do that if the fix is released in the next few weeks.
Thanks!
Christoph
Hi Christoph,
We were a bit delayed in starting on this feature. The work is still in progress (it’s mostly written now), but it didn’t make the cutoff for a release we plan to ship later this year. I can assure you it is coming, just not within the next few weeks.
-Adam
It will ship in the next major release of SingleStore (it didn’t make it into 7.3). We are considering whether we can backport it into 7.3 in a patch release early next year.
-Adam
Thanks @adam that would be great!
At the moment I remove emojis from the strings to avoid this problem…
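A minimal sketch of that kind of stripping in Python, assuming "emoji" here means any character above U+FFFF (the ones that need 4 bytes in UTF-8). The helper name `strip_non_bmp` is hypothetical:

```python
import re

# Everything above the Basic Multilingual Plane needs 4 bytes in UTF-8.
_NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def strip_non_bmp(text: str) -> str:
    """Drop every character that a 3-byte utf8 column cannot store."""
    return _NON_BMP.sub("", text)


cleaned = strip_non_bmp("Nice \U0001F600 day")
assert cleaned == "Nice  day"  # the emoji is gone, both spaces remain
```

Older symbols such as ❤ (U+2764) sit inside the BMP, so they are kept. Joiners and variation selectors that accompanied a removed emoji are also BMP characters and stay behind, so you may want extra cleanup depending on your data.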
Hi folks,
The feature will be in the next major release of SingleStore. It’s looking too complex to backport to 7.3 at this point, so 7.5 will likely be the first release in which it is available.
-Adam
Thanks for the update, Adam. Do you have an idea when 7.5 might be released, please?
Also, is there a public roadmap available somewhere too?