UnicodeDecodeError #64
Comments
Encodings in R are a really problematic area. The solution you are suggesting is probably not a good one as setting Can you check whether either of these scripts works? The first tests whether you can pass just the path, and the second explicitly converts the path to your encoding.
Both of the presented scripts give an error.
I ran the following script in R, saved once as Shift_JIS and once as UTF-8:

    x <- "日本語"
    print("end of script")

    $ rscript test.sjis.R
    [1] "end of script"
    $ rscript test.utf8.R
    Error: invalid multibyte character in parser at line 1
    Execution halted

I only assigned a multibyte string and never evaluated its value, yet I still got an error. This suggests the error comes from an encoding mismatch while the R interpreter is parsing the script. I therefore thought I needed to add an encoding option when loading the script file to solve the problem.
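To reproduce the two test files from Python, a minimal sketch could look like the following. The helper name `write_test_scripts` is hypothetical, and the sketch assumes that Python's `cp932` codec is an acceptable stand-in for the Shift_JIS encoding used in the test above.

```python
import tempfile
from pathlib import Path

# The same two-line R script used in the test above.
R_SOURCE = 'x <- "日本語"\nprint("end of script")\n'


def write_test_scripts(directory):
    """Write the R source once as cp932 (Shift_JIS) and once as UTF-8.

    Hypothetical helper for reproducing the test; it is not part of the plugin.
    """
    directory = Path(directory)
    paths = {}
    for name, encoding in (("test.sjis.R", "cp932"), ("test.utf8.R", "utf-8")):
        path = directory / name
        # write_bytes avoids any newline translation by the platform.
        path.write_bytes(R_SOURCE.encode(encoding))
        paths[name] = path
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, path in write_test_scripts(tmp).items():
            # Show the first bytes so the difference between the files is visible.
            print(name, path.read_bytes()[:12].hex(" "))
```

The two files contain the same text but different bytes, which is why the R parser can accept one and reject the other depending on the locale it assumes.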
It's all rather strange. I tested the solution you proposed and while it worked fine with non Could you please test your solution (the changes to utils.py) with a layer that would be named: I think there might be some R setting causing the problems, most likely the locale.
"ěščř" encoded in ISO 8859-2 is 0xEC 0xB9 0xE8 0xF8. Interpreted strictly as UTF-8, these bytes do not correspond to any characters, but as far as I can tell they do not include bytes that are unusable outright. It doesn't raise an error, but I'm not sure it works correctly. "日本語" encoded in Shift_JIS (cp932) is 0x93 0xFA 0x96 0x7B 0x8C 0xEA. In UTF-8, bytes of the form 0x8X and 0x9X are not allowed as the leading byte of a character, so I guess that is why it results in an error.
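The byte values quoted above can be checked with a short Python sketch. The helper names `hex_bytes` and `decodes_as_utf8` are made up for this illustration. Note that Python's strict UTF-8 decoder is used here only as a reference point; R's parser may treat the same bytes differently.

```python
def hex_bytes(text, encoding):
    """Return the encoded bytes of `text` as space-separated hex values."""
    return " ".join(f"0x{b:02X}" for b in text.encode(encoding))


def decodes_as_utf8(data):
    """Report whether `data` is valid UTF-8 under Python's strict decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for text, encoding in (("ěščř", "iso8859_2"), ("日本語", "cp932")):
        raw = text.encode(encoding)
        print(encoding, hex_bytes(text, encoding), "valid UTF-8:", decodes_as_utf8(raw))
```

Under Python's strict decoder neither byte sequence is valid UTF-8, so the fact that R does not report an error for the ISO 8859-2 name does not by itself show that the name is handled correctly.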
I would guess that it is related to the issue mentioned here: https://stackoverflow.com/questions/46946483/czech-encoding-in-r. Setting it in the .Rprofile file would make it permanent for the system. You can select one of the available code pages from here; unfortunately, the UTF ones are not available. I don't see a reasonable way to solve it.
It looks like I found a solution that might work without breaking anything. Could you try changing the It works only for
Hi,
An error occurs when selecting a layer that contains Japanese characters in the file path.
The probable cause is that the R script is generated in UTF-8, but R tries to interpret it as Shift_JIS or CP932 (the Japanese encodings).
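One way to avoid this kind of mismatch, sketched here only as an illustration and not as the change proposed in this issue, is to encode the generated script in the encoding the system locale reports. The helper `write_script_for_locale` is hypothetical, and the sketch assumes that R reads scripts in the same encoding that Python's `locale.getpreferredencoding` returns, which may not hold on every setup.

```python
import locale
from pathlib import Path


def write_script_for_locale(path, source, encoding=None):
    """Write an R script using the locale's encoding instead of UTF-8.

    Hypothetical helper. When `encoding` is None, the encoding reported by
    the system locale is used (assumed here to match what R expects).
    Raises UnicodeEncodeError if `source` contains characters that the
    chosen encoding cannot represent.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    Path(path).write_bytes(source.encode(encoding))
    return encoding
```

On a Japanese Windows locale the reported encoding would typically be cp932, so a path containing Japanese characters would be written in the bytes R expects. Characters outside that code page would still fail at encoding time.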
OS: Windows 10 (locale: Japanese)
Sample code
and log
If you fix it as follows, the error will not occur.