WRITELOOP

CONVERTING MYSQL TABLES TO UTF-8

2017 January 3

MySQL (convert character set to UTF-8):

  1. Below is a query to verify all fields on a table that are strings (can be used to validate if the changes were successfully applied): SELECT TABLE_SCHEMA, TABLE_NAME, CCSA.CHARACTER_SET_NAME AS DEFAULT_CHAR_SET, COLUMN_NAME, COLUMN_TYPE, C.CHARACTER_SET_NAME FROM information_schema.TABLES AS T JOIN information_schema.COLUMNS AS C USING (TABLE_SCHEMA, TABLE_NAME) JOIN information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA ON (T.TABLE_COLLATION = CCSA.COLLATION_NAME) WHERE TABLE_SCHEMA=SCHEMA() AND C.DATA_TYPE IN (’enum’, ‘varchar’, ‘char’, ’text’, ‘mediumtext’, ’longtext’) ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;
  2. How to convert the table: ALTER TABLE my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci; IMPORTANT: If eventually saving JSON through python on mysql text field types, make sure to use json_dumps turning off ensure_ascii and manually encoding the string. E.g.:

unicode_str = json.dumps(row_dict, ensure_ascii=False).encode(‘utf-8’)

TAGS