{"id":233023,"date":"2023-02-02T19:20:00","date_gmt":"2023-02-02T16:20:00","guid":{"rendered":"https:\/\/wordpress.mediadoma.com\/?p=233023"},"modified":"2022-11-10T18:48:11","modified_gmt":"2022-11-10T15:48:11","slug":"gb2312-voi-muude-mitte-ansi-maerkide-teisendamine-utf-8-kodeeringusse-nii-mysql-i-kui-ka-failide-maergistik","status":"publish","type":"post","link":"https:\/\/wordpress.mediadoma.com\/et\/gb2312-voi-muude-mitte-ansi-maerkide-teisendamine-utf-8-kodeeringusse-nii-mysql-i-kui-ka-failide-maergistik\/","title":{"rendered":"Converting GB2312 (or Other Non-ANSI Characters) to UTF-8 Encoding (Both MySQL and File Character Sets)"},"content":{"rendered":"\n<p>My first website, <a href=\"https:\/\/steakovercooked.com\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">steakovercooked.com<\/a>, was started in 2006 (9 years ago). At the time I knew very little about file encodings and character sets, and UTF-8 was not yet that popular for web pages. Today UTF-8 is everywhere; WordPress, for example, uses UTF-8 across the whole site, so you can display practically any language on the same site without problems.<\/p>\n<p><a href=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e5574685d46.jpg\" data-rel=\"lightbox\"><img decoding=\"async\" class=\"SDStudio-light-box-enable SDStudio-editor-tools-md-imp\" src=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e5574685d46.jpg\" alt=\"Converting GB2312 (or Other Non-ANSI Characters) to UTF-8 Encoding (Both MySQL and File Character Sets)\" ><\/a><\/p>\n<p>UTF-8-ascii-iso-8859-1<\/p>\n<p>The files (PHP, HTML, CSS and some other plain-text files) were mostly saved in ANSI code pages, with the Chinese characters stored in a multi-byte encoding. 
To display these characters (in the ANSI encoding) in a browser, you need to put the following inside the head tag of the HTML so that browsers know which encoding to use:<\/p>\n<pre><code>&lt;meta http-equiv=\"Content-Type\" content=\"text\/html; charset=gb2312\"&gt;<\/code><\/pre>\n<p>In HTML5 you can write this much more concisely:<\/p>\n<pre><code>&lt;meta charset=\"gb2312\"&gt;<\/code><\/pre>\n<p>As a result, most non-Chinese speakers will not see the characters unless they install the GB2312 language pack for their browser. It is also likely that some ordinary text editors will mangle the characters. In GB2312 one Chinese character takes two bytes, and a text editor will sometimes simply cut a character in half.<\/p>\n<h3>Convert the files (ANSI) to UTF-8<\/h3>\n<p>Before changing the meta header to:<\/p>\n<pre><code>&lt;meta charset=\"utf-8\"&gt;<\/code><\/pre>\n<p>you should convert the files themselves to UTF-8 encoding. There are many ways to do this. The simplest is to open each file in Notepad and save it with UTF-8 encoding.<\/p>\n<p><a href=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e55747b5a84.jpg\" data-rel=\"lightbox\"><img decoding=\"async\" class=\"SDStudio-light-box-enable SDStudio-editor-tools-md-imp\" src=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e55747b5a84.jpg\" alt=\"Converting GB2312 (or Other Non-ANSI Characters) to UTF-8 Encoding (Both MySQL and File Character Sets)\" ><\/a><\/p>\n<p>notepad-convert-to-utf-8<\/p>\n<p>If you have many files, you can do this with a Linux (<a href=\"https:\/\/helloacm.com\/site-news-vps-upgraded-again-to-handle-large-traffic\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">VPS server<\/a>) utility, <strong>iconv<\/strong>. 
The following script (saved under a file name such as toUTF) converts a single file to UTF-8.<\/p>\n<pre><code>#!\/bin\/bash\n# https:\/\/helloacm.com\n\nif [ \"$#\" -ne 1 ] || ! [ -r \"$1\" ]; then\n    echo \"Usage: $0 file1\"\n    exit 1\nfi\n\n# Skip files that file(1) already reports as UTF-8\nx=$(file -bi \"$1\" | grep 'utf' | wc -l)\nif [ \"$x\" -eq 1 ]; then\n  echo \"$1 already converted\"\nelse\n  echo \"converting $1 to UTF8\"\n  # Write to a temporary file first, then replace the original\n  iconv -f \"gb2312\" -t \"UTF-8\" \"$1\" -o \"$1.tmp\" &amp;&amp; mv \"$1.tmp\" \"$1\"\nfi<\/code><\/pre>\n<p>To avoid potential problems, we must not convert a file twice. The check <strong><code>file -bi \"$1\" | grep 'utf' | wc -l<\/code><\/strong> tests whether the file is already UTF-8 encoded. The command <strong>iconv -f &quot;gb2312&quot; -t &quot;UTF-8&quot;<\/strong> converts the file from gb2312 to UTF-8 (change the source encoding as needed); the output goes to a temporary file that then replaces the original, because letting iconv read and write the same file may truncate it with some iconv versions.<\/p>\n<p>Now we can loop over every file with the *.php extension in the current directory and all of its subdirectories:<\/p>\n<pre><code>find . -type f -name \"*.php\" | while IFS= read -r x; do\n   toUTF \"$x\"\ndone<\/code><\/pre>\n<h3>Convert the MySQL database to UTF-8<\/h3>\n<p>In my case, the whole of my old MySQL database used the default ANSI encoding (<strong>latin1_swedish_ci<\/strong> collation), so any GB2312 (multi-byte) characters in it come out garbled in modern browsers. For example, phpMyAdmin uses UTF-8, and the ANSI\/GB2312 characters are displayed corrupted in the browser.<\/p>\n<p>To convert this data to UTF-8, the simplest way is to export the table (phpMyAdmin is recommended) to an SQL file; make sure you export it using <strong>ISO 8859-1<\/strong> (full coverage of English). <strong>ISO 8859-1<\/strong> is often referred to as ANSI, and the GB2312 characters are carried through it as multi-byte strings. 
If you open the SQL file in Notepad you will still see the Chinese characters; you just need to save it with UTF-8 encoding.<\/p>\n<p><a href=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e55748eff91.jpg\" data-rel=\"lightbox\"><img decoding=\"async\" class=\"SDStudio-light-box-enable SDStudio-editor-tools-md-imp\" src=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-155185-61e55748eff91.jpg\" alt=\"Converting GB2312 (or Other Non-ANSI Characters) to UTF-8 Encoding (Both MySQL and File Character Sets)\" ><\/a><\/p>\n<p>phpmyadmin<\/p>\n<p>Oh, one more thing before saving as UTF-8: search the SQL file for &quot;latin1&quot; and replace it with &quot;utf8&quot; (MySQL spells the character set name without a hyphen). Then import the SQL file again using phpMyAdmin and you are done. All the data is preserved and converted to UTF-8, and the collation of the text columns (varchar, text, longtext and so on) changes to <strong>utf8_general_ci<\/strong>.<\/p>\n<h3>MySQL UTF-8 settings<\/h3>\n<p>In PHP (with the legacy mysql extension, which was removed in PHP 7) you can set the default character set:<\/p>\n<pre><code>mysql_query(\"SET NAMES 'utf8'\");\nmysql_query(\"SET CHARACTER SET utf8\");<\/code><\/pre>\n<p><strong>mysql_set_charset<\/strong> does much the same:<\/p>\n<pre><code>if (!function_exists('mysql_set_charset')) {\n  function mysql_set_charset($charset, $dbh)\n  {\n    return mysql_query(\"set names $charset\", $dbh);\n  }\n}\n\/\/ mysql_set_charset: sets the client character set\nmysql_set_charset(\"utf8\", $link); \/\/ (PHP 5 &gt;= 5.2.3)<\/code><\/pre>\n<p>You can also set the default character set when the MySQL server starts, which saves the overhead of calling the functions above. Edit the file <strong>\/etc\/mysql\/my.cnf<\/strong> and restart the MySQL server, e.g. <strong>sudo service mysql restart<\/strong>. 
Add the following to <strong>my.cnf<\/strong>:<\/p>\n<pre><code>[client]\ndefault-character-set=utf8\n\n[mysql]\ndefault-character-set=utf8\n\n[mysqld]\ncollation-server = utf8_unicode_ci\ninit-connect='SET NAMES utf8'\ncharacter-set-server = utf8<\/code><\/pre>\n<h3>Why UTF-8?<\/h3>\n<p>UTF-8 encodes the letters of the English alphabet in 1 byte (the same as ANSI) but uses 3 bytes for one Chinese character, whereas GB2312 uses 2 bytes. So if your pages contain a lot of Chinese characters, ANSI\/GB2312 saves space, while UTF-8 and ANSI take exactly the same space when the text contains only English letters.<\/p>\n<p>UTF-8 saves you trouble in the future. Once you have converted to UTF-8, you no longer need to worry about character sets or encodings. UTF-8 is also friendlier to international text, since most browsers know how to display it correctly. In my case, I had to convert the files to UTF-8 because my favourite text editors, PSPad and Sublime Text, do not know how to display ANSI\/GB2312 correctly.<\/p>\n<p><div id=\"PostUnique_PostSource\" style=\"padding-top: 50px\">Source: <a target=\"_blank\" rel=\"noopener nofollow\" href=\"\/\/helloacm.com\" class=\"external external_icon\">helloacm.com<\/a><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Converting GB2312 (or Other Non-ANSI Characters) to UTF-8 Encoding (Both MySQL and File Character Sets)<\/p>\n","protected":false},"author":1,"featured_media":224526,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_wp_rev_ctl_limit":""},"categories":[718,749,833,894,916,842,863],"tags":[1165],"class_list":["post-233023","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-arendaja","category-avatud-laehtekoodiga","category-juhend-algajatele","category-kood","category-muud","category-opetused","category-wordpress-4","tag-affiai-et"],"_links":{"self":[{"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/posts\/233023","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/comments?post=233023"}],"version-history":[{"count":0,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/posts\/233023\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/media\/224526"}],"wp:attachment":[{"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/media?parent=233023"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/categories?post=233023"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/et\/wp-json\/wp\/v2\/tags?post=233023"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}