csvgrep throws UTF-8 error, but on same data csvcut and csvlook don't
Created by: tlongers
I'm having a UTF-8 issue with csvgrep that I can't resolve by examining previous issues.
1. The csvkit command causing the issue
csvgrep -c 1 -m "Operación" test.csv
2. The input file
test.txt (test.csv renamed to .txt for GitHub)
Running file test.csv gives the following:
test.csv: UTF-8 Unicode text
3. The output text (including the traceback)
Message:
Your file is not "utf-8" encoded. Please specify the correct encoding with the -e flag. Use the -v flag to see the complete error.
Traceback:
name
Traceback (most recent call last):
File "/usr/local/bin/csvgrep", line 11, in <module>
sys.exit(launch_new_instance())
File "/usr/local/lib/python2.7/site-packages/csvkit/utilities/csvgrep.py", line 71, in launch_new_instance
utility.run()
File "/usr/local/lib/python2.7/site-packages/csvkit/cli.py", line 114, in run
self.main()
File "/usr/local/lib/python2.7/site-packages/csvkit/utilities/csvgrep.py", line 65, in main
for row in filter_reader:
File "/usr/local/lib/python2.7/site-packages/six.py", line 558, in next
return type(self).__next__(self)
File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 60, in __next__
if self.test_row(row):
File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 71, in test_row
result = test(value)
File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 122, in <lambda>
return lambda x: obj in x
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
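One reading of the traceback, offered as an assumption rather than a confirmed diagnosis: under Python 2, `obj in x` implicitly decodes a byte string with the ASCII codec when the other operand is unicode. The UTF-8 encoding of "Operación" has its first non-ASCII byte (0xc3) at index 7, which matches the position in the error. The sketch below reproduces that decoding step explicitly in Python 3; the variable names are illustrative and are not taken from csvkit.

```python
# Minimal sketch (Python 3) of the decoding step that appears to fail.
# Assumption: one operand of `obj in x` reaches the comparison as raw
# UTF-8 bytes and Python 2 tries to decode it as ASCII.

# "Operaci\u00f3n" is "Operación", written with an escape to keep this file ASCII.
text = "Operaci\u00f3n"
pattern_bytes = text.encode("utf-8")

# Decoding the bytes as ASCII fails on the first byte of the two-byte "ó".
error = None
try:
    pattern_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

print(error.start, hex(pattern_bytes[error.start]))

# Decoding as UTF-8 first makes the substring test work as expected.
decoded = pattern_bytes.decode("utf-8")
matches = decoded in text
print(matches)
```

If this reading is right, the file itself is valid UTF-8 and the failure comes from the comparison, which would explain why csvcut and csvlook handle the same data without error.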
4. Versions / locale etc
csvgrep -V
csvgrep 1.0.2
Python --version
Python 2.7.13
pip -V
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)
echo $LANG
en_GB.UTF-8
5. Operating system and version
OSX Sierra 10.12.5 (16F73)
6. Remarks
The same issue does not occur with csvcut or csvlook.
csvcut -c 1 test.csv | csvlook
Produces this output:
| name |
| --------- |
| Operación |