{"id":266,"date":"2010-11-22T00:13:00","date_gmt":"2010-11-22T07:13:00","guid":{"rendered":"http:\/\/truelogic.org\/wordpress\/?p=266"},"modified":"2010-11-22T00:13:00","modified_gmt":"2010-11-22T07:13:00","slug":"handling-special-symbols-in-csv-file-in-php","status":"publish","type":"post","link":"https:\/\/truelogic.org\/wordpress\/2010\/11\/22\/handling-special-symbols-in-csv-file-in-php\/","title":{"rendered":"Handling special symbols in csv file in PHP"},"content":{"rendered":"<p>PHP provides the fgetcsv function to automatically import values from a csv file. However it does not consistently work with cases where the starting of a field value has an extended character like the pound sign or a foreign character. In such cases, fgetcsv simply omits that first character and gets the remaining data of the column.<\/p>\n<p>Eg.&#8221;<strong>Buyagift.co.uk&#8221;,&#8221;273&#8243;,&#8221;xmas1075&#8243;,&#8221;\u00a310 off when you spend over \u00a375 on all products excluding our amazing special offers&#8221;<\/strong><\/p>\n<p>will be imported as<\/p>\n<p><strong>&#8220;Buyagift.co.uk&#8221;,&#8221;273&#8243;,&#8221;xmas1075&#8243;,&#8221;10 off when you spend over \u00a375 on all products excluding our amazing special offers&#8221;<\/strong><\/p>\n<p>One officially recommended way of handling this is to set the locale before running fgetcsv . But somehow this does not always work. After trying various approaches we have come up with our own solution which works in all cases.<\/p>\n<p>We do our own preprocessing of the csv file before passing it to fgetcsv(). Our approach is to detect any character following a comma which is beyond the ascii printable range of codes from 32 to 126. If it detects any such character it precedes it with a slash. The complete\u00a0 data thus updated is written back to the file and then passed to fgetcsv().<\/p>\n<p>So <strong>&#8220;Buyagift.co.uk&#8221;,&#8221;273&#8243;,&#8221;xmas1075&#8243;,&#8221;\u00a310 off when you spend over \u00a375 on all products excluding our amazing special offers&#8221;<\/strong><\/p>\n<p>will become <strong>&#8220;Buyagift.co.uk&#8221;,&#8221;273&#8243;,&#8221;xmas1075&#8243;,&#8221;\u00a310 off when you spend over \u00a375 on  all products excluding our amazing special offers&#8221;<\/strong><\/p>\n<p>At a later stage the once the data array has been obtained from fgetcsv, we remove the preceding slash by running stripcslashes() to the value<strong>.<\/strong><\/p>\n<p>The code fragment is given below:<\/p>\n<div style=\"font-family:Courier;padding:2px;background-color:#4769d4;color:yellow;\">$f = fopen(&#8220;myfile.csv&#8221;, &#8220;r&#8221;);<br \/>\nif ($f) {<br \/>\n$data = fread($f, filesize(&#8220;myfile.csv&#8221;));<br \/>\nfclose($f);<br \/>\n$arr = explode(&#8220;n&#8221;, $data);<br \/>\n$newData = &#8220;&#8221;;<br \/>\nfor($i = 0; $i &lt; count($arr); $i++) {<br \/>\n$item = $arr[$i];<br \/>\nif (strlen($item) &gt; 2) {<br \/>\n$x = explode(&#8220;,&#8221;,$item);<br \/>\nfor($j = 0; $j &lt; count($x); $j++)<br \/>\n{<br \/>\nif (ord($x[$j]) &gt; 126)<br \/>\n{<br \/>\n$x[$j] = &#8220;\\&#8221;.$x[$j];<br \/>\n}<br \/>\n}<br \/>\n$arr[$i] = implode(&#8220;,&#8221;,$x);<br \/>\n$item = $arr[$i];<br \/>\n$newData .= $item . &#8220;rn&#8221;;<br \/>\n}<\/p>\n<p>}<br \/>\n}<\/p>\n<p>\/\/ rewrite processed data back to file<br \/>\n$f = fopen(&#8220;myfile.csv&#8221;, &#8220;w&#8221;);<br \/>\nif ($f) {<br \/>\nfwrite($f, $newData);<br \/>\nfclose($f);<br \/>\n}<br \/>\nelse<br \/>\necho(&#8220;Unable to create file:&#8221;);<\/p><\/div>\n<p><strong>Handling The Euro Symbol <span id=\"main\" style=\"visibility: visible;\"><span id=\"search\" style=\"visibility: visible;\">\u20ac<\/span><\/span><\/strong><\/p>\n<p><span style=\"visibility: visible;\"><span style=\"visibility: visible;\">The above approach does not work for the Euro symbol since it does not get correctly interpreted even if a file is open in utf-8 mode. For this, the solution to open the file and simply replace the euro symbol with its html entity equiv: &amp; html ; (remove the spaces in between).<\/span><\/span><\/p>\n<p><span style=\"visibility: visible;\"><span style=\"visibility: visible;\">Eg. <strong>$data = str_replace(&#8220;\u0080&#8221;,&#8221;&amp;euro;&#8221;,$data);<\/strong><br \/>\n<\/span><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"mh-excerpt\"><p>PHP provides the fgetcsv function to automatically import values from a csv file. However it does not consistently work with cases where the starting of <a class=\"mh-excerpt-more\" href=\"https:\/\/truelogic.org\/wordpress\/2010\/11\/22\/handling-special-symbols-in-csv-file-in-php\/\" title=\"Handling special symbols in csv file in PHP\">[&#8230;]<\/a><\/p>\n<\/div>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[36,85,114,137,200,232,240,257],"class_list":["post-266","post","type-post","status-publish","format-standard","hentry","category-apachephp","tag-amit","tag-csv","tag-file","tag-handle","tag-php","tag-sengupta","tag-special","tag-truelogic"],"_links":{"self":[{"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/posts\/266","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/comments?post=266"}],"version-history":[{"count":0,"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/posts\/266\/revisions"}],"wp:attachment":[{"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/media?parent=266"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/categories?post=266"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/truelogic.org\/wordpress\/wp-json\/wp\/v2\/tags?post=266"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}