eZ Publish / Platform / EZP-29810

DOC: Upgrade DB requirements for utf8mb4

Details

    • Type: Improvement
    • Resolution: Done
    • Priority: High
    • n/a
    • None
    • Documentation
    • None

    Description

      Since we switched the MySQL/MariaDB charset and collation from utf8 to utf8mb4 in EZP-28186, we should update the requirements page to ensure it covers the change.

      We considered also updating the utf8mb4 upgrade advice in kernel 7.2, but it's good enough as is. It recommends utf8mb4_unicode_520_ci but says you may want another collation if you have other requirements, which is fine. It is not possible to choose a non-mb4 collation for the mb4 charset, so you can't really go wrong.
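
      The charset/collation pairing rule can be illustrated with a small check. This is only a sketch: the function name is hypothetical, and it relies on the MySQL/MariaDB naming convention that a collation name starts with its charset name followed by an underscore (the special "binary" charset is not handled).

```python
def collation_matches_charset(charset: str, collation: str) -> bool:
    """Return True if the collation name belongs to the given charset.

    Relies on the naming convention that collation names begin with
    the charset name plus an underscore, e.g. utf8mb4_unicode_520_ci.
    """
    return collation.lower().startswith(charset.lower() + "_")


# A utf8mb4 collation pairs with the utf8mb4 charset ...
print(collation_matches_charset("utf8mb4", "utf8mb4_unicode_520_ci"))  # True
# ... while a plain utf8 collation does not.
print(collation_matches_charset("utf8mb4", "utf8_unicode_ci"))  # False
```

      The server enforces this pairing itself; a check like this is mainly useful for catching a mistyped collation in configuration before it reaches the database.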

      Possible collations:

      • utf8mb4_general_* and utf8mb4_unicode_* collations were afaik introduced with the utf8mb4 charset in MySQL v5.5.3 and MariaDB 5.5.something (undocumented), and provide basic collation. The "general" collations are faster, while the "unicode" collations give more correct results, since they support expansions, where one character compares equal to a combination of characters (for example, ß equals ss).
      • utf8mb4_unicode_520_* collations were introduced in MySQL v5.6 and MariaDB 10, and fix some issues, such as different characters being treated as the same. They are based on Unicode Collation Algorithm (UCA) 5.2.0 weight keys. MySQL v5.6+ should be avoided for legacy since, according to the doc, it "technically works but executes several hundred times slower" on legacy content attribute sorting queries.
      • utf8mb4_0900_* collations are faster than the 520 collations but were introduced in MySQL v8.0, so they are afaik not supported yet.

      MySQL recommendation:

      • Use MySQL 5.5.x, release v5.5.3 or newer. If you do this, use a utf8mb4_unicode_* collation unless you know you have collation performance issues. In that case, use a utf8mb4_general_* collation.
      • If you use MySQL 5.6 or newer, use a utf8mb4_unicode_520_* collation unless you know you have collation performance issues. In that case, use a utf8mb4_general_* collation.

      MariaDB recommendation:

      • If you use MariaDB 5.5, use a utf8mb4_unicode_* collation unless you know you have collation performance issues. In that case, use a utf8mb4_general_* collation.
      • If you use MariaDB 10.x, use a utf8mb4_unicode_520_* collation unless you know you have collation performance issues. In that case, use a utf8mb4_general_* collation.
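
      The MySQL and MariaDB recommendations above can be summarised as a small decision function. This is a rough sketch, not part of any eZ tooling: the function name and parameters are hypothetical, versions are passed as tuples such as (5, 7, 20), and it assumes the 5.5 entries mean the utf8mb4_unicode_* family, since the charset is utf8mb4. The MariaDB lower bound of 5.5 is also an assumption, because the exact release is undocumented.

```python
def recommend_collation_family(vendor, version, performance_issues=False):
    """Suggest a utf8mb4 collation family for a server.

    vendor: "mysql" or "mariadb" (hypothetical parameter).
    version: tuple of ints with the full version, e.g. (5, 5, 62).
    performance_issues: True if collation performance is a known problem.
    """
    vendor = vendor.lower()
    if vendor == "mysql":
        if version < (5, 5, 3):
            raise ValueError("utf8mb4 needs MySQL 5.5.3 or newer")
        has_520 = version >= (5, 6)
    elif vendor == "mariadb":
        # Assumption: any 5.5 release is treated as having utf8mb4.
        if version < (5, 5):
            raise ValueError("utf8mb4 needs MariaDB 5.5 or newer")
        has_520 = version >= (10,)
    else:
        raise ValueError(f"unknown vendor: {vendor!r}")

    if performance_issues:
        # The faster, less precise family is the fallback in every case.
        return "utf8mb4_general_*"
    return "utf8mb4_unicode_520_*" if has_520 else "utf8mb4_unicode_*"


print(recommend_collation_family("mysql", (5, 7, 20)))
print(recommend_collation_family("mariadb", (5, 5, 60)))
print(recommend_collation_family("mariadb", (10, 3, 0), performance_issues=True))
```

      The three calls print utf8mb4_unicode_520_*, utf8mb4_unicode_* and utf8mb4_general_* respectively.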

      More info on utf8mb4: https://mathiasbynens.be/notes/mysql-utf8mb4

          People

            Assignee: Unassigned
            Reporter: Gunnstein Lye (gunnstein.lye@ibexa.co)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0m
                Logged: 7h