gbk2utf8's Introduction

Converts the encoding of files under a given path from GBK to UTF-8.

Join the chat at https://gitter.im/gbk2utf8/Lobby

how to run

for any platform

java -jar gbk2utf8-0.1.1-SNAPSHOT-all.jar $src-gbk-path $dst-utf8-path

for help, run java -jar gbk2utf8-0.1.1-SNAPSHOT-all.jar

NOTE

A precompiled package is available on the v0.1.1 Releases page, where you can download it directly.

for Mac platform (64bit)

$ gbk2utf8        # converting from gbk to utf8
Usage: Gbk2Utf8 <src-gbk-path> <dst-utf8-path> [include-extension (default 'java')]
$ gbk2utf8 convert
Usage: Gbk2Utf8 <src-gbk-path> <dst-utf8-path> [include-extension (default 'java')]
$ gbk2utf8 pkgrename # rename package
Usage: PackageRename <src-path> <dst-path> <original-package-prefix> <alternative-package-prefix>
Example: PackageRename ~/original ~/alternative com.example. io.downgoon
$ gbk2utf8 version
0.1.1

NOTE

gbk2utf8-0.1.1.pkg is installed into the /usr/local/gbk2utf8 directory, and a symlink /usr/local/bin/gbk2utf8 (pointing to /usr/local/gbk2utf8/bin/gbk2utf8) is created.

for Ubuntu platform (64bit)

  • download
wget  https://github.com/downgoon/gbk2utf8/releases/download/v0.1.1/gbk2utf8_0.1.1_armhf.deb
  • install
dpkg -i gbk2utf8_0.1.1_armhf.deb
  • run
$ gbk2utf8        # converting from gbk to utf8
Usage: Gbk2Utf8 <src-gbk-path> <dst-utf8-path> [include-extension (default 'java')]
$ gbk2utf8 convert
Usage: Gbk2Utf8 <src-gbk-path> <dst-utf8-path> [include-extension (default 'java')]
$ gbk2utf8 pkgrename   # package renaming
Usage: PackageRename <src-path> <dst-path> <original-package-prefix> <alternative-package-prefix>
$ gbk2utf8 version
0.1.1

how to build

git clone https://github.com/downgoon/gbk2utf8.git
cd gbk2utf8
mvn clean package
java -jar target/gbk2utf8-0.1.1-SNAPSHOT-all.jar $src-gbk-path $dst-utf8-path

Highlight

  • recursive conversion: all matching files under the specified path are converted, including those in subdirectories
  • encoding detection: files under the specified path that are not encoded in GBK are left untouched
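The "detect, then convert" idea can be sketched as follows. This is only an illustration, not the tool's own detector: the class and method names are hypothetical, and the heuristic (treat anything that is not well-formed UTF-8 as GBK) is an assumption that is much simpler than a probability-based detector. It can misjudge the rare GBK byte sequences that also happen to be valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the EncodingDetect class used by gbk2utf8.
final class GbkToUtf8Sketch {

    private static final Charset GBK = Charset.forName("GBK");

    /** Returns true if the bytes decode cleanly as UTF-8 (pure ASCII also qualifies). */
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Leaves valid UTF-8 untouched; otherwise assumes GBK and re-encodes as UTF-8. */
    static byte[] convertIfGbk(byte[] raw) {
        if (isValidUtf8(raw)) {
            return raw; // nothing to do for files that are already UTF-8 (or ASCII)
        }
        return new String(raw, GBK).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] gbk = "\u4e2d\u6587".getBytes(GBK); // two CJK characters encoded as GBK
        byte[] utf8 = convertIfGbk(gbk);
        System.out.println(gbk.length + " GBK bytes -> " + utf8.length + " UTF-8 bytes");
    }
}
```

Working on byte arrays keeps the check independent of file handling, so the same method could be applied to each file found while walking a directory tree.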

how to package

see https://github.com/downgoon/pkgmaker


package rename (a new tool)

java -cp ./gbk2utf8-0.1.1-SNAPSHOT-all.jar io.downgoon.tools.PackageRename <src-path> <dst-path> <original-package-prefix> <alternative-package-prefix>
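To illustrate what a package rename involves, here is a minimal sketch that rewrites both a source file's relative path and the package prefix inside its text. The class and method names are hypothetical and this is not the PackageRename implementation. It assumes prefixes are given without a trailing dot and uses plain substring replacement, which would also match a longer package name that merely starts with the same characters.

```java
// Hypothetical helper for illustration; not the io.downgoon.tools.PackageRename class.
final class PackageRenameSketch {

    /** Rewrites a relative source path, e.g. .../com/example/... to .../io/downgoon/... */
    static String renamePath(String relativePath, String fromPrefix, String toPrefix) {
        // Normalise Windows separators so the result looks the same on every platform.
        String normalised = relativePath.replace('\\', '/');
        String fromDir = fromPrefix.replace('.', '/');
        String toDir = toPrefix.replace('.', '/');
        return normalised.replace(fromDir, toDir);
    }

    /** Rewrites package and import statements by replacing the prefix in the file text. */
    static String renameSource(String source, String fromPrefix, String toPrefix) {
        return source.replace(fromPrefix, toPrefix); // naive: literal substring replacement
    }

    public static void main(String[] args) {
        System.out.println(renamePath("src/main/java/com/example/tools/Hello.java",
                "com.example", "io.downgoon"));
        System.out.println(renameSource("package com.example.tools;",
                "com.example", "io.downgoon"));
    }
}
```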


Use Cases


Feedback

If anything is unclear or you have a suggestion, please let us know by opening an Issue. The steps are:

  1. Find the Issues entry

  2. Create a new Issue

  3. Preferably, set an Issue type as well

gbk2utf8's People

Contributors

downgoon, gitter-badger

gbk2utf8's Issues

iconv on the Mac command line can convert encodings

  • Mac OS X comes with the iconv utility, which can convert text between encodings. Run the following command in Terminal to convert a GBK Chinese text file to UTF-8:

    iconv -f gbk -t utf-8 chinese-gbk.txt > chinese-utf8.txt

    Options: -f means --from-code=ENCODING; -t means --to-code=ENCODING

  • To list the encodings that iconv supports:

    iconv -l
    iconv --help
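For comparison, a rough Java counterpart of the iconv commands above might look like the sketch below. The class and method names are hypothetical. One difference worth noting: InputStreamReader silently substitutes malformed input with a replacement character, whereas iconv stops with an error by default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of gbk2utf8.
final class TranscodeSketch {

    /** Streams text from one charset to another, similar to `iconv -f from -t to`. */
    static void transcode(InputStream in, Charset from, OutputStream out, Charset to)
            throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, to);
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        writer.flush(); // push any buffered characters to the underlying stream
    }

    public static void main(String[] args) throws IOException {
        // Rough counterpart of `iconv -l`: check whether a charset is available.
        System.out.println("GBK supported: " + Charset.isSupported("GBK"));

        Charset gbk = Charset.forName("GBK");
        byte[] input = "\u4e2d\u6587".getBytes(gbk); // two CJK characters encoded as GBK
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(input), gbk, output, StandardCharsets.UTF_8);
        System.out.println(output.size() + " bytes of UTF-8 written");
    }
}
```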

convert gbk to utf-8

Recursively convert all files under the specified path from GBK encoding to UTF-8.

ArrayIndexOutOfBoundsException in EncodingDetect utf16_probability

In lines 736 to 739, the if statement:
if (rawtext.length > 1 && ((byte) 0xFE == rawtext[0] && (byte) 0xFF == rawtext[1]) || ((byte) 0xFF == rawtext[0] && (byte) 0xFE == rawtext[1]))
can throw an ArrayIndexOutOfBoundsException when rawtext.length <= 1. Because && binds tighter than ||, the length check guards only the first alternative, so evaluation still reaches (byte) 0xFF == rawtext[0] && (byte) 0xFE == rawtext[1].
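The precedence problem can be reproduced in isolation. The sketch below copies the quoted condition into one method and shows one possible fix in another, with the length guard wrapped around both alternatives. The class and method names are hypothetical, and this is a suggestion rather than the project's actual patch.

```java
// Hypothetical standalone reproduction; only the condition is taken from the issue.
final class Utf16BomCheck {

    /** Condition as quoted: parsed as (length > 1 && A) || B, so B is not guarded. */
    static boolean hasBomUnguarded(byte[] rawtext) {
        return rawtext.length > 1
                && ((byte) 0xFE == rawtext[0] && (byte) 0xFF == rawtext[1])
                || ((byte) 0xFF == rawtext[0] && (byte) 0xFE == rawtext[1]);
    }

    /** Possible fix: the extra parentheses make the length guard cover both alternatives. */
    static boolean hasBom(byte[] rawtext) {
        return rawtext.length > 1
                && (((byte) 0xFE == rawtext[0] && (byte) 0xFF == rawtext[1])
                 || ((byte) 0xFF == rawtext[0] && (byte) 0xFE == rawtext[1]));
    }

    public static void main(String[] args) {
        byte[] oneByte = {(byte) 0xFF};
        System.out.println(hasBom(oneByte)); // guarded version handles short input
        try {
            hasBomUnguarded(oneByte);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded version threw: " + e.getClass().getSimpleName());
        }
    }
}
```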

BUILD FAILURE due to test

mvn clean package fails because of a failing test. I had to skip the tests to build the jar file.

[INFO] Scanning for projects...
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building gbk2utf8 0.1.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ gbk2utf8 ---
[INFO] Deleting D:\Projects\github\gbk2utf8\target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ gbk2utf8 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\Projects\github\gbk2utf8\src\main\resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ gbk2utf8 ---
[INFO] Compiling 3 source files to D:\Projects\github\gbk2utf8\target\classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ gbk2utf8 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\Projects\github\gbk2utf8\src\test\resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ gbk2utf8 ---
[INFO] Compiling 2 source files to D:\Projects\github\gbk2utf8\target\test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ gbk2utf8 ---
[INFO] Surefire report directory: D:\Projects\github\gbk2utf8\target\surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running io.downgoon.tools.Gbk2Utf8Test
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.022 sec
Running io.downgoon.tools.PackageRenameTest
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec <<< FAILURE!
testReplacement(io.downgoon.tools.PackageRenameTest)  Time elapsed: 0.001 sec  <<< FAILURE!
org.junit.ComparisonFailure: expected:<src/main/java/[io/downgoon]/tools/Hello.java> but was:<src/main/java/[com/example]/tools/Hello.java>
	at org.junit.Assert.assertEquals(Assert.java:99)
	at org.junit.Assert.assertEquals(Assert.java:117)
	at io.downgoon.tools.PackageRenameTest.testReplacement(PackageRenameTest.java:29)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:59)
	at org.junit.internal.runners.MethodRoadie.runTestMethod(MethodRoadie.java:98)
	at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:79)
	at org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:87)
	at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:77)
	at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:42)
	at org.junit.internal.runners.JUnit4ClassRunner.invokeTestMethod(JUnit4ClassRunner.java:88)
	at org.junit.internal.runners.JUnit4ClassRunner.runMethods(JUnit4ClassRunner.java:51)
	at org.junit.internal.runners.JUnit4ClassRunner$1.run(JUnit4ClassRunner.java:44)
	at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:27)
	at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:37)
	at org.junit.internal.runners.JUnit4ClassRunner.run(JUnit4ClassRunner.java:42)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)


Results :

Failed tests:   testReplacement(io.downgoon.tools.PackageRenameTest): expected:<src/main/java/[io/downgoon]/tools/Hello.java> but was:<src/main/java/[com/example]/tools/Hello.java>

Tests run: 2, Failures: 1, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.042 s
[INFO] Finished at: 2017-04-28T15:51:19+08:00
[INFO] Final Memory: 24M/314M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project gbk2utf8: There are test failures.
[ERROR] 
[ERROR] Please refer to D:\Projects\github\gbk2utf8\target\surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

skip test

D:\Projects\github\gbk2utf8 (master)
$ mvn clean package -DskipTests
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building gbk2utf8 0.1.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ gbk2utf8 ---
[INFO] Deleting D:\Projects\github\gbk2utf8\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ gbk2utf8 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\Projects\github\gbk2utf8\src\main\resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ gbk2utf8 ---
[INFO] Compiling 3 source files to D:\Projects\github\gbk2utf8\target\classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ gbk2utf8 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\Projects\github\gbk2utf8\src\test\resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ gbk2utf8 ---
[INFO] Compiling 2 source files to D:\Projects\github\gbk2utf8\target\test-classes
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ gbk2utf8 ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ gbk2utf8 ---
[INFO] Building jar: D:\Projects\github\gbk2utf8\target\gbk2utf8-0.1.1-SNAPSHOT.jar
[INFO]
[INFO] --- maven-shade-plugin:2.0:shade (default) @ gbk2utf8 ---
[INFO] Including commons-io:commons-io:jar:2.5 in the shaded jar.
[INFO] Minimizing jar io.downgoon.tools:gbk2utf8:jar:0.1.1-SNAPSHOT
[INFO] Minimized 136 -> 44 (32%)
[INFO] Attaching shaded artifact.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.960 s
[INFO] Finished at: 2017-04-28T15:53:30+08:00
[INFO] Final Memory: 23M/308M
[INFO] ------------------------------------------------------------------------

not convenient enough in a production project

A production project contains several types of GBK-encoded files, such as .java, .properties, .xml and .sql, but gbk2utf8 handles only one extension per run (.java by default). So we have to run it once per extension and then copy the result back:

gbk2utf8 /Users/downgoon/Documents/workspace/examples/SomeProject /Users/downgoon/Documents/workspace/examples/SomeProjectUTF8 java

gbk2utf8 /Users/downgoon/Documents/workspace/examples/SomeProject /Users/downgoon/Documents/workspace/examples/SomeProjectUTF8 properties

gbk2utf8 /Users/downgoon/Documents/workspace/examples/SomeProject /Users/downgoon/Documents/workspace/examples/SomeProjectUTF8 xml

gbk2utf8 /Users/downgoon/Documents/workspace/examples/SomeProject /Users/downgoon/Documents/workspace/examples/SomeProjectUTF8 sql

cp -R SomeProjectUTF8/* SomeProject/
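One way the repeated invocations above could be collapsed into a single pass is to accept a comma-separated extension list and filter files against the whole set. The sketch below illustrates that idea only: the comma-separated argument format, the class name and the method names are assumptions, not an existing gbk2utf8 feature.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; gbk2utf8 itself takes a single extension.
final class ExtensionFilterSketch {

    /** Parses a comma-separated list such as "java,properties,xml,sql". */
    static Set<String> parseExtensions(String csv) {
        Set<String> result = new HashSet<>();
        for (String part : csv.split(",")) {
            String ext = part.trim().toLowerCase(Locale.ROOT);
            if (!ext.isEmpty()) {
                result.add(ext);
            }
        }
        return result;
    }

    /** Returns true if the file name ends with one of the given extensions (case-insensitive). */
    static boolean matches(String fileName, Set<String> extensions) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all, or a trailing dot
        }
        return extensions.contains(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    /** Walks root recursively and returns the regular files whose extension is in the set. */
    static List<Path> findFiles(Path root, Set<String> extensions) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> matches(p.getFileName().toString(), extensions))
                    .collect(Collectors.toList());
        }
    }
}
```

With such a filter, a single directory walk would cover all four file types, and each selected file could then be converted in turn.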
